
RFC-recommended representation of IPv6 addresses and inet_ntop in C

Having read the RFC recommendations (RFC 5952) on how best to represent IPv6 addresses, I am trying to implement a function in C++ that converts an array of bytes into the appropriate textual representation, i.e. a std::string.

To check my code for correctness, I compare my results with what inet_ntop (#include <arpa/inet.h>) returns. Note that I am actually using the Windows equivalent, #include <ws2tcpip.h>.
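The comparison itself can be as small as the following sketch, which round-trips a textual address through inet_pton and inet_ntop. It is written against the POSIX headers (on Windows the include would be <ws2tcpip.h> instead), and the helper name system_canonical_form is made up for illustration.

```cpp
#include <arpa/inet.h>    // inet_pton, inet_ntop (POSIX)
#include <netinet/in.h>   // INET6_ADDRSTRLEN
#include <array>
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical helper: parse an IPv6 address with inet_pton and print it
// back with inet_ntop, so the result shows the library's preferred form.
// Returns an empty string if the input is not a valid IPv6 address.
std::string system_canonical_form(const std::string& text) {
    std::array<std::uint8_t, 16> bytes{};
    if (inet_pton(AF_INET6, text.c_str(), bytes.data()) != 1) {
        return {};
    }
    char buf[INET6_ADDRSTRLEN];
    if (inet_ntop(AF_INET6, bytes.data(), buf, sizeof buf) == nullptr) {
        return {};
    }
    return buf;
}
```

Calling system_canonical_form("0:0:0:0:0:ffff:0.0.1.1") then shows what the platform's inet_ntop prints for that address; the result depends on the library in use.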

In most cases, my function behaves the same way, and I fully understand the underlying rules (leave out leading zeros, compress the longest run of zero groups by replacing it with "::", and so on).
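As a minimal sketch of those rules (hexadecimal groups only, no mixed notation), assuming the address arrives as 16 bytes in network order; the name format_ipv6_hex is made up for illustration and this is not my full implementation:

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical helper: hexadecimal-only formatting in the style of RFC 5952.
std::string format_ipv6_hex(const std::array<std::uint8_t, 16>& bytes) {
    // Combine the bytes into eight 16-bit groups (network byte order).
    std::array<std::uint16_t, 8> groups{};
    for (int i = 0; i < 8; ++i) {
        groups[i] = static_cast<std::uint16_t>((bytes[2 * i] << 8) | bytes[2 * i + 1]);
    }

    // Find the longest run of zero groups. The first run wins on a tie,
    // and a single zero group is never compressed (RFC 5952, section 4.2).
    int best_start = -1;
    int best_len = 0;
    for (int i = 0; i < 8;) {
        if (groups[i] != 0) { ++i; continue; }
        int j = i;
        while (j < 8 && groups[j] == 0) ++j;
        if (j - i > best_len) { best_start = i; best_len = j - i; }
        i = j;
    }
    if (best_len < 2) best_start = -1;
    assert(best_start + best_len <= 8);

    std::string out;
    for (int i = 0; i < 8;) {
        if (i == best_start) {
            out += "::";          // replaces the whole zero run
            i += best_len;
            continue;
        }
        if (!out.empty() && out.back() != ':') out += ':';
        char buf[5];              // up to 4 hex digits plus terminator
        std::snprintf(buf, sizeof buf, "%x", static_cast<unsigned>(groups[i]));
        out += buf;               // lowercase, no leading zeros
        ++i;
    }
    return out;
}
```

Since this sketch never switches to dotted decimal, it prints every address in pure hexadecimal, including the IPv4-embedding ones.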

But here is the interesting part: as far as I understand, for some special addresses (IPv4-mapped, IPv4-compatible, and IPv4-translated IPv6 addresses, see RFC 2765), it is recommended to write the last 4 bytes in the dotted decimal notation that is common for IPv4 addresses, rather than in the usual hexadecimal notation.

For example, ::ffff:0:168.0.0.1, ::ffff:168.0.0.1 and ::168.0.0.1 are all valid IPv6 addresses in their recommended textual representations, and inet_ntop comes to that conclusion as well.

And now to my question: according to inet_ntop, the IPv6 address 0:0:0:0:0:ffff:0.0.1.1 is shortened to ::ffff:0:101, which uses the hexadecimal representation again. What is the reason for this behavior? I would think that, because we have a special address prefix here, the dotted decimal notation would be used regardless of the fact that the first two bytes of the last 4-byte block are zero, so that the address is written as ::ffff:0.0.1.1. Am I misunderstanding the RFC recommendation, or is inet_ntop not consistent in this regard? In all of my other test cases, inet_ntop conforms to the RFC.

I hope you can help me.

Edit: After some more testing, it seems that inet_ntop always chooses the hexadecimal notation over the dotted decimal one when the first two bytes of that last 4-byte block are zero, even though the IPv6 address is one of those special addresses with an embedded IPv4 address.

FWIW, on my Ubuntu (glibc 2.31), I get:

::ffff:0:168.0.0.1      -> ::ffff:0:a800:1      *
::ffff:168.0.0.1        -> ::ffff:168.0.0.1
::168.0.0.1             -> ::168.0.0.1
0:0:0:0:0:ffff:0.0.1.1  -> ::ffff:0.0.1.1       *

The two lines marked with * differ from your results: this implementation doesn't recognize the ::ffff:0 prefix, but it doesn't mind the embedded 0.0.1.1.

However, with the all-zeroes prefix, I get a similar result: if the top two bytes of the IPv4 address are zero, the output format changes:

::1.2.3.4           -> ::1.2.3.4
::0.2.3.4           -> ::0.2.3.4
::0.0.3.4           -> ::304
::0.0.0.4           -> ::4

That's probably because ::1 shouldn't turn into ::0.0.0.1, but it means a line has to be drawn somewhere for which IPv4 addresses are shown in mixed notation, at least with the all-zero prefix.
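To make that line concrete, here is a small sketch of a rule that reproduces all eight results listed above. It is only my guess derived from the observed output, not taken from the library source, and the function name prints_mixed_notation is hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical predicate: would the last 32 bits be printed as dotted decimal?
// `groups` holds the eight 16-bit groups of the address, in order.
bool prints_mixed_notation(const std::array<std::uint16_t, 8>& groups) {
    // Count the zero groups at the very start of the address.
    int leading_zeros = 0;
    while (leading_zeros < 8 && groups[leading_zeros] == 0) {
        ++leading_zeros;
    }

    // ::a.b.c.d where the upper half of the IPv4 part is nonzero:
    // exactly six leading zero groups. Seven or eight (::304, ::4, ::1, ::)
    // stay in hexadecimal.
    if (leading_zeros == 6) return true;

    // ::ffff:a.b.c.d: five zero groups followed by ffff, whatever the
    // IPv4 part is (so ::ffff:0.0.1.1 stays in mixed notation).
    if (leading_zeros == 5 && groups[5] == 0xffff) return true;

    return false;
}
```

Under this rule, ::ffff:0:168.0.0.1 has only four leading zero groups, so it falls through to plain hexadecimal, which matches the first marked line.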

So, could this just be a bug or quirk of the library you have? If it shows similar behaviour with both the all-zeroes prefix and other prefixes, perhaps they've just decided to go the easy way and treat all of them equally. As far as I can figure out, the whole 0/8 block is still reserved for "Local Identification" anyway, so I wonder if addresses like 0.0.x.x even come up embedded in IPv6. (I don't know, though.)

The RFC also specifies the mixed notation only as "RECOMMENDED", so we can't really say the implementation you have is wrong.


 