
Type of Character generated by UUID

  1. Does java.util.UUID generate special characters?
  2. What is the type of each character (e.g. uppercase, lowercase, digits) generated by UUID?

tl;dr

You asked:

Does java.util.UUID generate special characters?

No. A UUID is actually a 128-bit value, not text.

A UUID's textual representation is canonically a string of hex digits (0-9, a-f, A-F) plus hyphens.


You asked:

What is the type of each character (e.g. uppercase, lowercase, digits) generated by UUID?

As required by the UUID spec, any a-to-f characters in the hex string representing a UUID value must be in all lowercase. But violations abound.

UUID ≠ text

To clarify, a UUID is actually a 128-bit value, not text, not digits.

You could think of them as 128-bit unsigned integers. But they are not actually numbers, as certain bit positions have semantics, specific meanings. Which bits have which meanings varies by variant and by version of UUID.
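To make the "bits, not text" point concrete, here is a minimal sketch that reads the two 64-bit halves of a UUID and asks java.util.UUID for the version and variant encoded in them. The class name UuidBitsDemo and the helper toHex128 are made up for illustration; the sample value is the one used later in this answer.

```java
import java.util.UUID;

class UuidBitsDemo {
    // Hypothetical helper: renders the 128 bits as 32 zero-padded hex digits, no hyphens.
    static String toHex128(UUID uuid) {
        return String.format("%016x%016x",
                uuid.getMostSignificantBits(),
                uuid.getLeastSignificantBits());
    }

    public static void main(String[] args) {
        UUID uuid = UUID.fromString("550e8400-e29b-41d4-a716-446655440000");

        // The value itself is just two longs; text appears only when we format it.
        System.out.println("bits:    " + toHex128(uuid)); // 550e8400e29b41d4a716446655440000

        // Certain bit positions carry meaning rather than magnitude.
        System.out.println("version: " + uuid.version()); // 4
        System.out.println("variant: " + uuid.variant()); // 2
    }
}
```

The version and variant are read from fixed bit positions, which is why treating the whole value as one big number is misleading.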

Hex string

Humans don't do well reading and writing 128 bits as 128 1 and 0 characters. When a UUID needs to be written for human consumption, we use a base-16 hexadecimal (digits 0-9 and letters a-f) string. We use 32 hex characters grouped with 4 hyphens to represent those 128 bits in a total of 36 characters. For example:

550e8400-e29b-41d4-a716-446655440000
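The 32 + 4 = 36 arithmetic can be checked with a short sketch. The class name UuidLayoutDemo and the helper groupLengths are hypothetical names chosen for this example.

```java
import java.util.Arrays;

class UuidLayoutDemo {
    // Hypothetical helper: lengths of the hyphen-separated groups of a UUID string.
    static int[] groupLengths(String text) {
        String[] groups = text.split("-");
        int[] lengths = new int[groups.length];
        for (int i = 0; i < groups.length; i++) {
            lengths[i] = groups[i].length();
        }
        return lengths;
    }

    public static void main(String[] args) {
        String text = "550e8400-e29b-41d4-a716-446655440000";
        System.out.println(text.length());                        // 36
        System.out.println(Arrays.toString(groupLengths(text)));  // [8, 4, 4, 4, 12]
    }
}
```

The five groups add up to 8 + 4 + 4 + 4 + 12 = 32 hex digits, and the 4 hyphens bring the total to 36 characters.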

No "Special" Characters

As for "special characters" mentioned in the Question, you will only see these 23 possible characters in a hex-string representation of a UUID:

abcdefABCDEF1234567890-
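If you want to convince yourself, a rough sketch like the following checks strings against exactly those 23 characters. The class name UuidAlphabetDemo and the method onlyAllowed are illustrative names, not part of any library.

```java
import java.util.UUID;

class UuidAlphabetDemo {
    // The 23 characters listed above: 6 lowercase, 6 uppercase, 10 digits, 1 hyphen.
    static final String ALLOWED = "abcdefABCDEF1234567890-";

    // Hypothetical helper: true if every character of the text is one of the 23 allowed.
    static boolean onlyAllowed(String text) {
        for (char c : text.toCharArray()) {
            if (ALLOWED.indexOf(c) < 0) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Sample a batch of freshly generated UUIDs and check each string.
        for (int i = 0; i < 1000; i++) {
            String text = UUID.randomUUID().toString();
            if (!onlyAllowed(text)) {
                throw new IllegalStateException("Unexpected character in " + text);
            }
        }
        System.out.println("All sampled UUID strings use only the allowed characters.");
    }
}
```

This only checks which characters appear; it does not verify length or hyphen positions.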

Lowercase Required By Spec

The latest international spec, dated 2008-08, states:

6.5.4 Software generating the hexadecimal representation of a UUID shall not use upper case letters. NOTE – It is recommended that the hexadecimal representation used in all human-readable formats be restricted to lower-case letters. Software processing this representation is, however, required to accept both upper and lower case letters as specified in 6.5.2.

Violations Common

However, Microsoft, Apple, and others commonly violate the lowercase rule. At one point Microsoft released software that generated mixed case (using both upper- and lowercase), apparently an unintended feature.

So do as the spec says:

  • Use lowercase for output.
  • Tolerate either lowercase or uppercase for input.

The Java documentation for the UUID class's toString method documents in BNF that uppercase is allowed when generating a string, in contradiction to the UUID standard specification. However, the actual behavior of the class and its toString method in the Oracle implementation for Java 8 is correct: it uses lowercase for output but tolerates either uppercase or lowercase for input.

Input in either lower-/uppercase:

UUID uuidFromLowercase = UUID.fromString ( "897b7f44-1f31-4c95-80cb-bbb43e4dcf05" ); 
UUID uuidFromUppercase = UUID.fromString ( "897B7F44-1F31-4C95-80CB-BBB43E4DCF05" );

Output to lowercase only:

System.out.println ( "uuidFromLowercase.toString(): " + uuidFromLowercase );
System.out.println ( "uuidFromUppercase.toString(): " + uuidFromUppercase );

uuidFromLowercase.toString(): 897b7f44-1f31-4c95-80cb-bbb43e4dcf05

uuidFromUppercase.toString(): 897b7f44-1f31-4c95-80cb-bbb43e4dcf05

See this code run live in IdeOne.com.

Nil value

When the UUID is not yet known, you can use a special UUID consisting of all zeros.

00000000-0000-0000-0000-000000000000
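In Java, one way to obtain that all-zeros value is the two-long constructor of java.util.UUID. This is a small sketch; the class name NilUuidDemo, the constant NIL, and the helper isNil are names invented here, not part of the JDK.

```java
import java.util.UUID;

class NilUuidDemo {
    // Assumed convention for this sketch: a shared constant for the all-zeros UUID.
    static final UUID NIL = new UUID(0L, 0L);

    // Hypothetical helper: true if all 128 bits are zero.
    static boolean isNil(UUID uuid) {
        return uuid.getMostSignificantBits() == 0L
                && uuid.getLeastSignificantBits() == 0L;
    }

    public static void main(String[] args) {
        System.out.println(NIL); // 00000000-0000-0000-0000-000000000000
        System.out.println(isNil(UUID.fromString("00000000-0000-0000-0000-000000000000"))); // true
        System.out.println(isNil(UUID.randomUUID())); // false
    }
}
```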

Example Values

You can see some examples of UUID values by using any of the many web sites that generate values.

Or use a command-line tool. Nearly every operating system comes bundled with such a tool. On Mac OS X, launch Terminal.app and type uuidgen.

The Javadoc for java.util.UUID links to RFC 4122, which says:

 Each field is treated as an integer and has its value printed as a zero-filled hexadecimal digit string with the most significant digit first. The hexadecimal values "a" through "f" are output as lower case characters and are case insensitive on input.

So no, it will not generate special characters.

A UUID doesn't consist of characters, unless you ask it to be converted into a string. At that point, it will be turned into a string consisting of hex characters and hyphens, as described by the docs for UUID.toString().

(It's not documented whether the hex digits will be upper or lower case.)

According to Internet RFC 4122:

Each field is treated as an integer and has its value printed as a zero-filled hexadecimal digit string with the most significant digit first. The hexadecimal values "a" through "f" are output as lower case characters and are case insensitive on input.

If you respect the Internet standard, always use lowercase.

Though the BNF defines uppercase letters, that is for input, not output.

