May you network in interesting times

Part and parcel of learning about IPv6 has been establishing an internal infrastructure that can run it and that can be readily reconfigured to test common setups.  Stephen H and I have spent dozens of hours trying various modern, common, small-network configurations as part of the research v6 testbed.

In no particular order, here are some things we’ve learned:

 

China  – For years we used to say to people who claimed to be interested in network security, “Listen, most Internet-based attacks are coming from address blocks allocated to China, and tracerouting would seem to indicate that there are thousands and thousands and millions of break-in attempts every day.” And…they’d look at me as though I’d shown them a mouthful of black beetles whilst wearing an aluminum-lined baseball helmet, so eventually I just quit talking about it. 

Mind you, this started in the early 1990s.  As I’ve been watching network traffic, I’ve noticed it’s only gotten worse, so I’ve blocked basically all APNIC and Russian address blocks, which has significantly cut down on the random attack traffic I see.

Why should we care?  Because one big feature of v6 is that it [potentially] brings back the end-to-end architecture that was TCP/IP’s original primary design feature.  You won’t need NAT and everything will be hunky-dory, and/or peachy keen, whatever those mean.  Unfortunately, one good side effect of NAT is security-through-inaccessibility, as in you can’t attack what you can’t get to.  With e-to-e v6, though, your hosts are potentially once again available for attack.  Because v6 hasn’t been widely tested (by crackers), it’s a good bet that many implementations will have a lot of security problems in their first few years of deployment.

Windows IP Helper Service – IPv6 transitional assistance service (“Provides tunnel connectivity using IPv6 transition technologies (6to4, ISATAP, Port Proxy, and Teredo), and IP-HTTPS. If this service is stopped, the computer will not have the enhanced connectivity benefits that these technologies offer.”)  While debugging some other network issues, I discovered that at least on one machine, having this service running was generating a lot of network activity to weird random IP addresses on the Internet.  It’s off for the moment until I can confirm that this is normal behavior.

 

DNS – Domain controllers with multiple NICs that are running DNS will not necessarily return the expected IP address.  The default setting is to round-robin-return the various NIC addresses that register themselves with DNS.  Our first thought was, “turn off round robin,” but that didn’t work, and then “don’t allow DNS registration,” and that didn’t work.

Current status:  not solved

Current “solution” from Microsoft: don’t run DCs with multiple NICs.  Seriously, this is their advice.
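One way to watch the round-robin behavior from a client is to ask the resolver for every address it will hand back for a name and repeat the query a few times.  This is only a minimal sketch using Python's standard `socket.getaddrinfo`; the function name `resolve_all` and the host name in the usage comment are placeholders of mine, not anything from our testbed.

```python
import socket

def resolve_all(hostname):
    """Return every distinct address the resolver offers for hostname,
    in the order the resolver returned them."""
    seen = []
    for _family, _type, _proto, _canon, sockaddr in socket.getaddrinfo(hostname, None):
        addr = sockaddr[0]          # first element is the address for both v4 and v6
        if addr not in seen:
            seen.append(addr)
    return seen

# Usage (placeholder name, replace with a real multi-homed DC):
#   for attempt in range(5):
#       print(attempt, resolve_all("dc1.example.test"))
```

If the order of the list changes between attempts, you are looking at round robin; if an address from the "wrong" NIC shows up at all, that NIC registered itself in DNS.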

IP v6 – who cares and/or so what?

For reasons that escape me, many of my initial pokings at IP v6 devolve into references to things in the Hitchhiker’s Guide To The Galaxy (the novel, and not the actual guide, which doesn’t actually exist, mostly.)

IP v6 on Wikipedia: http://en.wikipedia.org/wiki/IPv6

There are several reasons above and beyond wanting to be in front of early-adopter customers:

  • Its improved address space is both larger and better organized, which makes managing information from/about large numbers of devices easier.
  • Improvements and additions to protocols, specifically I’m thinking about
    • Multicast (sending a single packet to multiple destinations)
    • Potential for improved performance, especially when transiting routers, because v6 routers never fragment a packet
    • Mobile routing as efficient as “regular” routing
    • Jumbograms – want 4 GiB in a packet?  OK.  No problem.

Lastly, at least for the moment, is that IP v6 seems to be surrounded by an SEP field.  Rather than waiting for someone else to do something about it, we’re going to grasp the nettle firmly and pull ourselves up by our bootstraps and eat our own dogfood.  Or something.  It’s going to have to happen, because v4 is already creaky.  To quote a total scumbag, we really need to “get thar fust with the most men.”

The Hitchhiker’s Guide to the Galaxy has this to say about IP v6’s address space,

“Bigger than the biggest thing ever and then some. Much bigger than that in fact, really amazingly immense, a totally stunning size, real ‘wow, that’s big’, time. Infinity is just so big that by comparison, bigness itself looks really titchy. Gigantic multiplied by colossal multiplied by staggeringly huge is the sort of concept we’re trying to get across here.”

In the interest of honesty, as slim as that may be, I should also point out that the HHG actually says that about infinity, and not IP v6, but it could have.  Perhaps this quote is a bit more accurate,

The car shot forward straight into the circle of light, and suddenly Arthur had a fairly clear idea of what infinity looked like. It wasn’t infinity in fact. Infinity itself looks flat and uninteresting. Looking up into the night sky is looking into infinity — distance is incomprehensible and therefore meaningless. The chamber into which the aircar emerged was anything but infinite, it was just very very big, so that it gave the impression of infinity far better than infinity itself.

That’s why it’s important.  Because there’s room for everything.  Everything, or nearly enough everything that anything left over isn’t worth bothering oneself about.  Currently, IP v4 is 32 bits of address space, which if it were evenly parceled out, would give about 2/3 of the people on the planet a single address.  Not nearly good enough, since the phone companies want everyone to have a phone, right?

IP v6 has 128 bits/16 bytes, which as described above, is beyond big.  “In a different perspective, this is 2^52 (about 4.5×10^15) addresses for every observable star in the known universe.” – Wikipedia article on IP v6.  When CygNet is ready to monitor every cell and bacterium in your body, we’ll be able to assign each of them their own IP address.

So, yes, rather large enough.  For now.
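A back-of-the-envelope check of those numbers, as a minimal sketch.  The population figure is my assumption (roughly the mid-2000s world population); the 2^52-per-star figure is the one from the Wikipedia quote, and the star count is simply what that figure implies.

```python
V4_ADDRESSES = 2 ** 32      # 32-bit address space
V6_ADDRESSES = 2 ** 128     # 128-bit address space

# Assumption: about 6.5 billion people, roughly the mid-2000s world population.
WORLD_POPULATION = 6_500_000_000
people_covered = V4_ADDRESSES / WORLD_POPULATION

# Figure from the quote: 2**52 addresses per observable star.
PER_STAR = 2 ** 52
implied_stars = V6_ADDRESSES // PER_STAR    # works out to 2**76

print(f"IPv4 addresses:               {V4_ADDRESSES:,}")
print(f"fraction of people covered:   {people_covered:.2f}")
print(f"IPv6 addresses:               {V6_ADDRESSES:.3e}")
print(f"addresses per star (2**52):   {PER_STAR:.2e}")
print(f"star count that implies:      {implied_stars:.1e}")
```

With that population assumption the v4 space covers about two thirds of the planet, one address each, which is where the 2/3 above comes from.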

Stupid Ideas, Part 2,347,292: IP v6 for device/point/tag identification

Remember how I said the IPv6 address space is big?  Here’s a diagram where the v4 address space is shown as a box 1.6 inches on a side.  Using the same scale, the box for the IPv6 address space is a square about the size of our solar system.  So…yeah…bigness incarnate.  I see your bigness and raise you 340 trillion trillion trillion addresses.
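The solar-system comparison is easy to sanity-check.  Area scales with the number of addresses, so the side of the square scales with the square root of the ratio: sqrt(2^128 / 2^32) = 2^48.  The sketch below assumes the 1.6 inch box from the diagram and uses the standard definition of the astronomical unit; the variable names are mine.

```python
V4_SIDE_INCHES = 1.6                 # side of the v4 box in the diagram
METRES_PER_INCH = 0.0254
METRES_PER_AU = 149_597_870_700      # astronomical unit, by definition

# Area ratio is 2**128 / 2**32 = 2**96, so the side ratio is its square root.
side_ratio = 2 ** 48

v6_side_inches = V4_SIDE_INCHES * side_ratio
v6_side_metres = v6_side_inches * METRES_PER_INCH
v6_side_au = v6_side_metres / METRES_PER_AU

# "340 trillion trillion trillion": a trillion cubed is 10**36.
trillion_cubed = 10 ** 36
leading_count = 2 ** 128 // trillion_cubed

print(f"v6 square side: {v6_side_au:.1f} AU")
print(f"2**128 is about {leading_count} trillion trillion trillion")
```

The side comes out in the mid-70s of AU.  Neptune orbits at about 30 AU from the Sun, so the square is a bit wider than Neptune's whole orbit, which is a fair reading of "about the size of our solar system."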

I used to be in the Object Oriented DataBase business, but am all better now, thank you for asking.  This is relevant because objects only work when you can uniquely identify entities.  Various databases have come up with different ways to do this:

  • unique table row identifiers
  • mapping in-memory objects via an address to/from storage
  • single-table databases with hundreds of columns
  • mapping in-memory pages to/from storage
  • and let’s not forget the Web.  A URI is specifically a way to refer to specific information, or in the case of a search engine, a way to gain an indirect reference to types of information, which typically are taxonomy free and referred to only by bags of keywords.

It occurred to me that we could easily, trivially in fact, represent every entity (device, facility, tag, point, definition, etc.) ever created, ever, in every copy of CygNet ever installed, anywhere, and all of these items created in the future, in an infinitesimal fraction of the IPv6 address space.  And yes, this would include meta-objects like collections of entities, and collections of collections.  Oh, and I should point out that you never have to delete an item if you don’t want to.  IPv6 is the Alice’s Restaurant of networking.  You don’t have to clean up for a very long time.

You do end up needing an ARS on steroids: an Entity Metadata Service (EMS), essentially a high-performance metadata server (the metadata being the location of an entity, as a URI) that helps you get and store items, passes out new PUIDs (Persistent GUIDs), and communicates with other EMSs.
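To make the idea concrete, here is a toy version of the part that hands out PUIDs and remembers where each entity lives.  Everything in it is hypothetical: the class name, the method names, the `cygnet://` URI, and the prefix (which comes from the 2001:db8::/32 documentation range).  It is a sketch of the shape of the thing, not a design.

```python
import ipaddress

class EntityMetadataService:
    """Toy EMS: allocates PUIDs from one IPv6 prefix and maps each
    PUID to the URI where the entity is stored."""

    def __init__(self, prefix):
        self.network = ipaddress.IPv6Network(prefix)
        self._next_offset = 1      # skip the all-zeros address of the prefix
        self._locations = {}       # PUID (IPv6Address) -> location URI

    def new_puid(self, location_uri):
        """Hand out the next unused address in the prefix as a PUID."""
        if self._next_offset >= self.network.num_addresses:
            raise RuntimeError("prefix exhausted")
        puid = self.network.network_address + self._next_offset
        self._next_offset += 1
        self._locations[puid] = location_uri
        return puid

    def locate(self, puid):
        """Return the stored URI for a PUID, or None if it is unknown."""
        return self._locations.get(ipaddress.IPv6Address(puid))

ems = EntityMetadataService("2001:db8:c0de::/48")
tank = ems.new_puid("cygnet://site-a/points/tank-level-1")   # hypothetical URI
print(tank, "->", ems.locate(tank))
```

Nothing is ever freed here, which is the point: a /48 alone holds 2^80 identifiers, so sequential allocation without reuse goes a very long way.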

EMSs would work in a fashion analogous to DNS, in that not every EMS would have a list of everything everywhere.  They would know who to ask, and I would eschew a hierarchical model in favor of a federated model where groups of EMSs agree to resolve certain stretches of the address space.  In addition to EMSs, we’d have a Storage Management Service (SMS) responsible for storing and retrieving the actual entities.
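The "know who to ask" part could look something like the sketch below: each EMS claims stretches of the address space, and a lookup picks the most specific claim that covers the PUID.  The most-specific-wins rule is my assumption for resolving overlapping claims, and the class, method, and EMS names are all made up for illustration.

```python
import ipaddress

class Federation:
    """Toy registry of which EMS has agreed to resolve which address range."""

    def __init__(self):
        self._claims = []    # list of (IPv6Network, ems_name)

    def claim(self, prefix, ems_name):
        self._claims.append((ipaddress.IPv6Network(prefix), ems_name))

    def who_to_ask(self, puid):
        """Return the EMS with the most specific claim covering puid, or None."""
        addr = ipaddress.IPv6Address(puid)
        matches = [(net, name) for net, name in self._claims if addr in net]
        if not matches:
            return None
        return max(matches, key=lambda claim: claim[0].prefixlen)[1]

fed = Federation()
fed.claim("2001:db8:a::/48", "ems-houston")
fed.claim("2001:db8:b::/48", "ems-calgary")
fed.claim("2001:db8:a:ff00::/56", "ems-houston-archive")

print(fed.who_to_ask("2001:db8:a::1"))
print(fed.who_to_ask("2001:db8:a:ff01::5"))
print(fed.who_to_ask("2001:db8:c::1"))
```

There is no root here.  Any EMS holding a copy of the claims list can answer "who do I ask?", which is the federated flavor rather than a DNS-style hierarchy.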

I hear you asking, “Ummm…have you been drinking?”

No, I don’t drink or do drugs.  Try again.

“Why? Why would you want to do this?  What could possibly justify this model?”

Answers:

Short Version:

Page numbers didn’t become common until several generations after the introduction of the printing press.  Page numbers were the addressing unit of the late Middle Ages.  They made business and reference science possible (along with Arabic numerals).  A flat, effectively infinite address space is the Table Of Contents and Index (both of these a side-effect of page numbering)  of the coming era of ubiquitous networked computing.

You need to be able to uniquely identify, instantiate, and store business entities, and the simpler it is to do this, the cheaper it is.

Long Version:

Every generation of digital technology was predicated on a “fact” that addressability had some upper limit:  30KW, 4Kb, 64Kb, 640Kb, 2GB, 4GB, etc., and this is only memory.  Similar curves exist for data and metadata.  Every generational transition involved an order of magnitude increase (at least) in the available addressable space accessible by computational entities.  Old technologies had to be thrown away, or maintained at great cost (by increasingly scarce and expensive developers willing to remain locked to old tech) to stay functional.

I’d like to avoid that problem.  Shifting to a common address space makes certain types of planning and debugging much easier, especially if you use logical subdivisions, e.g., CygNet image backups for Customer Foo at EOY 2015 occupy addresses FOO-BAR, forever.  Connect a support EMS with management authorization and you’re looking at customer data.
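A logical subdivision can be as simple as packing a few fields into the bits after the prefix.  The layout below (16 bits each for customer, category, and year under a /32) is purely an assumption for illustration, as are the function name and the numeric IDs; the prefix is from the documentation range.

```python
import ipaddress

BASE = ipaddress.IPv6Network("2001:db8::/32")   # placeholder prefix

def subdivision(customer_id, category, year):
    """Return the /80 block reserved for one customer/category/year.

    Assumed layout: 32-bit prefix | 16-bit customer | 16-bit category |
    16-bit year | 48 bits left over for the items themselves.
    """
    for field in (customer_id, category, year):
        if not 0 <= field < 2 ** 16:
            raise ValueError("each field must fit in 16 bits")
    value = int(BASE.network_address)
    value |= customer_id << 80
    value |= category << 64
    value |= year << 48
    return ipaddress.IPv6Network((value, 80))

# Hypothetical IDs: customer 0x00f0 ("Foo"), category 1 (image backups), year 2015.
block = subdivision(0x00F0, 0x0001, 2015)
print(block, "holds", block.num_addresses, "addresses")
```

Because the block is just a prefix, "everything for Foo in 2015" becomes a range check rather than a query.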

Page numbers weren’t invented, so far as I have been able to determine (yes, I’ve looked for this), for 80-100 years after the invention and common use of the printing press (in Europe, at least).  Indices took even longer.  Why?  Because it never occurred to anyone, except lunatics, that you might want to somehow categorize and access data in anything other than a linear fashion.  Religious texts seem to have been the first to adopt this model, to make it quicker and easier to find verses justifying whatever horrible thing you were thinking of doing. 

The history of ordered alphabets is similar.  Today, most developers and analysts literally have never thought of a universe in which everything exists via a single addressing mechanism.  It’s as inconceivable as the notion of page numbers was to Dark Age scribes.  But…it has to happen.  Why?

Ronald Coase is why.  His work on transaction costs leads me to conclude that corporations will, indeed must, adopt technologies that lower transaction costs, and unified addressing is part of that, just as page numbers and Arabic numbering systems made accounting and universal literacy possible.

I believe that this is probably part of what’s driving Microsoft’s most recent foray into functional programming, e.g. http://msdn.microsoft.com/en-us/magazine/cc164244.aspx – Data is just…data.  You use it, you store it, you request it.  The addressing mechanism shouldn’t get in the way, and right now we have in-memory structures, on-disk structures, and database structures.  The real-world limitation is, of course, speed of access, but the SMS would be responsible for keeping data fresh, cached, and backed-up to slow storage as required.

Yeah…I know, I’m still trying to convince myself that this is necessary, and I fully understand that it’s not currently necessary, but just as Gary Wescom foresaw an era with billions of computers networked together, I’m thinking ahead to an era where we have effectively uncountable computational entities that need to be managed.

Other services required:

MSMS – Metadata Schema Mapping Service – translates between CygNet v6 addresses and foreign address taxonomies designed to represent unique items.  It would almost certainly be a specialty implementation of the EMS.

Examples

  • Library Of Congress
  • ISBN
  • Database Mappings (location + database name + row Id + …)
  • Manufacturer-specific serial number datastores
  • Structured, site/domain-specific URIs
  • Social Security Numbers
  • Phone Numbers
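One possible shape for the mapping, sketched under heavy assumptions: give each foreign scheme a small numeric ID, normalize the identifier, and hash it into the low 64 bits under a dedicated prefix.  The prefix, the scheme registry, the normalization rule, and the function name are all hypothetical.  Hashing is also my substitution; it only goes one way, so a real MSMS would still need a table to translate back.

```python
import hashlib
import ipaddress

MSMS_PREFIX = ipaddress.IPv6Network("2001:db8:ffff::/48")   # placeholder prefix
SCHEME_IDS = {"isbn": 1, "lccn": 2, "phone": 3}             # hypothetical registry

def foreign_to_v6(scheme, identifier):
    """Map (scheme, identifier) to a stable address inside MSMS_PREFIX.

    Layout: 48-bit prefix | 16-bit scheme ID | 64 bits of SHA-256 digest.
    """
    scheme_id = SCHEME_IDS[scheme]
    normalized = identifier.replace("-", "").replace(" ", "").upper()
    digest = hashlib.sha256(f"{scheme}:{normalized}".encode("utf-8")).digest()
    low64 = int.from_bytes(digest[:8], "big")
    value = int(MSMS_PREFIX.network_address) | (scheme_id << 64) | low64
    return ipaddress.IPv6Address(value)

# An ISBN-shaped string, with and without hyphens, lands on the same address.
print(foreign_to_v6("isbn", "0-306-40615-2"))
print(foreign_to_v6("isbn", "0306406152"))
```

With 64 bits of digest, collisions are unlikely but not impossible, so the service would have to detect and handle them when it records a new mapping.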

You are in a maze of twisty little #include statements, all alike

While sussing out the IPv4->IPv6 changes needed in the current code base, I’ve run across the problem of not knowing enough about the entire product to be able to easily track the include dependencies.  I’ve been using a neat tool that tracks these down and graphs them directly in Visual Studio.  Probably one of the best $40 I’ve ever spent.

https://secure.profactor.co.uk/includemanager.php

In the tool diagram, all the drawing entities are “live”.  Square boxes are user header files and diamonds are system includes.  You can hover the mouse to see the full path, open the file, etc.  If you hover over a line connecting two files, it will display the file position (line number) where one includes the other.

The most general use of the tool during building is to see which other files will rebuild if one is touched.  There’s also a “build impact” line graph, which they describe thusly:

http://www.profactor.co.uk/forum/viewtopic.php?t=51

“The build impact is essentially an approximation of the cost of including a file, relative to the total cost of compiling a base source file. The estimation is based on token counts from the preprocessor, as a compiler-centric equivalent to the popular ‘lines of code’ metric. Its primary use is in discovering build bottlenecks and determining why some files take a long time to compile. “
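The "which other files will rebuild if one is touched" question is a reverse walk over the include graph.  The sketch below shows the idea only; it is not how the tool itself works, and the function name and the file names in the example graph are made up apart from Net.h.

```python
from collections import defaultdict, deque

def rebuild_set(includes, touched):
    """Return every file that directly or transitively includes `touched`.

    includes: dict mapping each file to the list of files it #includes.
    """
    # Invert the graph: header -> set of files that include it directly.
    included_by = defaultdict(set)
    for source, headers in includes.items():
        for header in headers:
            included_by[header].add(source)

    # Breadth-first walk up the inverted graph.
    affected = set()
    queue = deque([touched])
    while queue:
        current = queue.popleft()
        for parent in included_by[current]:
            if parent not in affected:
                affected.add(parent)
                queue.append(parent)
    affected.discard(touched)    # only matters if the includes form a cycle
    return affected

graph = {
    "Net.cpp": ["Net.h"],
    "Client.cpp": ["Net.h"],
    "Net.h": ["Socket.h", "<winsock2.h>"],
    "Socket.h": ["<winsock2.h>"],
    "Util.cpp": ["Util.h"],
}
print(sorted(rebuild_set(graph, "Socket.h")))
```

Touching Socket.h drags in Net.h and everything that includes Net.h, while Util.cpp is left alone.  That fan-out is what the graph view makes visible at a glance.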

 

CygNet NET.H analyses

Here’s a high-level view of the include graph for CygNetSourceSupportNetNet.h:

[Figure: high-level include graph]

Here’s a slightly more useful zoomed-in view:

[Figure: zoomed-in include graph]

RFC 4291 – IPv6 Addressing Architecture

http://www.ietf.org/rfc/rfc4291.txt

Interesting things to remember:

   IPv6 addresses are 128-bit identifiers for interfaces and sets of
   interfaces (where "interface" is as defined in Section 2 of [IPV6]).
   There are three types of addresses:

    Unicast:   An identifier for a single interface.  A packet sent to a
               unicast address is delivered to the interface identified
               by that address.
    Anycast:   An identifier for a set of interfaces (typically
               belonging to different nodes).  A packet sent to an
               anycast address is delivered to one of the interfaces
               identified by that address (the "nearest" one, according
               to the routing protocols' measure of distance).
    Multicast: An identifier for a set of interfaces (typically
               belonging to different nodes).  A packet sent to a
               multicast address is delivered to all interfaces
               identified by that address.

   There are no broadcast addresses in IPv6, their function being
   superseded by multicast addresses.
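A small sketch of what this means when you are holding an address in code, using Python's standard `ipaddress` module.  Multicast can be recognized from the address itself (the ff00::/8 range).  Anycast cannot, because RFC 4291 allocates anycast addresses from the unicast space, so the best a syntactic check can say is "unicast or anycast".  The function name is mine.

```python
import ipaddress

def address_kind(text):
    """Classify an IPv6 address as far as its bits alone allow."""
    addr = ipaddress.IPv6Address(text)
    if addr.is_multicast:            # anything in ff00::/8
        return "multicast"
    # Anycast addresses are syntactically indistinguishable from unicast.
    return "unicast or anycast"

print(address_kind("ff02::1"))       # all-nodes multicast, the closest thing to broadcast
print(address_kind("2001:db8::1"))   # documentation-range address
```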
---
2.5.5.1.  IPv4-Compatible IPv6 Address

   The "IPv4-Compatible IPv6 address" was defined to assist in the IPv6
   transition.  The format of the "IPv4-Compatible IPv6 address" is as
   follows:
   |                80 bits               | 16 |      32 bits        |
   +--------------------------------------+--------------------------+
   |0000..............................0000|0000|    IPv4 address     |
   +--------------------------------------+----+---------------------+

   Note: The IPv4 address used in the "IPv4-Compatible IPv6 address"
   must be a globally-unique IPv4 unicast address.

   The "IPv4-Compatible IPv6 address" is now deprecated because the
   current IPv6 transition mechanisms no longer use these addresses.
   New or updated implementations are not required to support this
   address type.

2.5.5.2.  IPv4-Mapped IPv6 Address

   A second type of IPv6 address that holds an embedded IPv4 address is
   defined.  This address type is used to represent the addresses of
   IPv4 nodes as IPv6 addresses.  The format of the "IPv4-mapped IPv6
   address" is as follows:
   |                80 bits               | 16 |      32 bits        |
   +--------------------------------------+--------------------------+
   |0000..............................0000|FFFF|    IPv4 address     |
   +--------------------------------------+----+---------------------+

   See [RFC4038] for background on the usage of the "IPv4-mapped IPv6
   address".
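Both layouts are easy to build by hand, which helps when checking what the code base does with v4 addresses.  A minimal sketch with Python's standard `ipaddress` module; the two helper names are mine, and 192.0.2.1 is a documentation address.

```python
import ipaddress

def to_ipv4_mapped(v4_text):
    """80 zero bits, 16 one bits (FFFF), then the 32-bit IPv4 address."""
    v4 = ipaddress.IPv4Address(v4_text)
    return ipaddress.IPv6Address((0xFFFF << 32) | int(v4))

def to_ipv4_compatible(v4_text):
    """96 zero bits, then the 32-bit IPv4 address (deprecated format)."""
    v4 = ipaddress.IPv4Address(v4_text)
    return ipaddress.IPv6Address(int(v4))

mapped = to_ipv4_mapped("192.0.2.1")
compat = to_ipv4_compatible("192.0.2.1")

print(hex(int(mapped)), mapped.ipv4_mapped)   # ipv4_mapped recovers the v4 address
print(hex(int(compat)), compat.ipv4_mapped)   # None: this is not the mapped form
```

The only difference between the two is the 16-bit field in front of the IPv4 address, 0000 versus FFFF, exactly as in the diagrams above.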