
Substitute for MySQL's UUID Version 1 function?

Context

Web application, PHP 5, MySQL 5.0.91

The Problem

I recently switched from an auto-incremented integer to a UUID as the primary key for some of my tables. The UUIDs generated by MySQL's UUID() function are extremely similar to one another:

| uuid                                 |
----------------------------------------
| 1e5988da-afec-11e1-9877-5464f7aa6d24 |
| 408092aa-afad-11e1-9877-5464f7aa6d24 |
  ^------^   ^^
  1      8   11-12

As you can see, only the first 8 characters and the 11th and 12th differ. I understand that UUID Version 1 uses a timestamp and the hardware MAC address to generate the UUID. However, I am hesitant to use Version 1 because of these similarities (and the fact that the MAC address will never change, in my case). In addition, if the MAC address never changes, most of the UUID is useless and wastes space.
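To make the similarity concrete, here is a minimal sketch that splits a Version 1 UUID into the fields laid out in RFC 4122. The helper name parseUuidV1() is hypothetical (it is not part of PHP or MySQL), and the code assumes well-formed input in the usual 8-4-4-4-12 text form.

```php
<?php
// Split a Version 1 UUID string into its RFC 4122 fields.
// parseUuidV1() is a hypothetical helper name used for illustration only.
function parseUuidV1($uuid)
{
    list($timeLow, $timeMid, $timeHiAndVersion, $clockSeq, $node) =
        explode('-', strtolower($uuid));

    return array(
        // The first hex digit of the third group is the version number.
        'version'   => hexdec($timeHiAndVersion[0]),
        // 60-bit timestamp as hex, reassembled most significant part first.
        'timestamp' => substr($timeHiAndVersion, 1) . $timeMid . $timeLow,
        // Clock sequence (its top bits carry the variant).
        'clock_seq' => $clockSeq,
        // Node identifier, normally the MAC address of the generating host.
        'node'      => $node,
    );
}

$a = parseUuidV1('1e5988da-afec-11e1-9877-5464f7aa6d24');
$b = parseUuidV1('408092aa-afad-11e1-9877-5464f7aa6d24');
```

For the two sample rows, `$a['node']` and `$b['node']` are the same value and only the timestamp field differs, which is why the strings look nearly identical when both are generated on one server.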

My Custom UUID Function

As an experiment, I wrote a custom UUID-generator in PHP:

public static function GenerateUUID()
{
    return
    substr(sha1(Account::GetUsername() . Account::GetUserID()), 18, 8) . "-" .
    substr(md5(time()), rand() % 28, 4) . "-" . 
    substr(md5(date("Y")), rand() % 28, 4) . "-" . 
    substr(sha1(rand()), 20, 4) . "-" . 
    substr(sha1(rand() % PHP_INT_MAX), 17, 12);
}

A sample of the results:

| uuid                                 |
----------------------------------------
| 574d18c2-5080-bac9-5597-45435f363ea1 |
| 574d18c2-30d4-8b5b-4ffd-001744d3d287 |

Here, the first 8 characters are identical for the same user. This was intended, but not needed.

The Question

Is there a preferred/recommended way to generate a Version 4 or Version 5 UUID within a MySQL query?

If not, is it acceptable to generate a custom UUID within PHP (as above) that does not conform to a specification?

Restrictions

  • I am using a shared hosting plan with command-line access, but cannot modify the existing MySQL installation.
  • I would prefer avoiding third-party packages/libraries.

Notes

  • I do not and will not be performing merging, synchronization, or other operations that require a GUID that contains the MAC address. That is not the issue here.

Your concern that "most of the UUID is useless and is wasting space" is inherent to the size of the data type. You will never have as many rows in your database as the theoretical limit of 16 bytes allows.

In fact, a V1 UUID is a better fit than V4 if you use the UUID only as a table ID, because it uses the MAC address and a timestamp to prevent clashes. V4 has no such mechanism, although in practice you don't need to worry much about clashes either :) Use a V4 UUID instead of V1 if you need your UUID to be unpredictable.

Also, note that concatenating, for example, four 4-byte random values may not be the same as creating one 16-byte random value. As always with crypto and randomness: I would advise against implementing your own UUID V4 routine.
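Purely to illustrate the difference, the sketch below draws all 16 bytes in one call to a cryptographic source and then sets the version and variant bits that RFC 4122 requires. The function name generateUuidV4() is hypothetical, and the fallback assumes the OpenSSL extension is available on PHP 5.3+. A vetted library remains the safer choice.

```php
<?php
// Sketch of a Version 4 UUID built from ONE 16-byte random draw.
// generateUuidV4() is a hypothetical helper name, not a built-in.
function generateUuidV4()
{
    if (function_exists('random_bytes')) {
        $bytes = random_bytes(16);                 // PHP 7 and later
    } else {
        $bytes = openssl_random_pseudo_bytes(16);  // PHP 5.3+, needs ext/openssl
    }

    // Set the four version bits to 0100 (Version 4).
    $bytes[6] = chr((ord($bytes[6]) & 0x0f) | 0x40);
    // Set the two variant bits to 10 (RFC 4122 variant).
    $bytes[8] = chr((ord($bytes[8]) & 0x3f) | 0x80);

    // 32 hex digits split into eight 4-digit chunks, laid out as 8-4-4-4-12.
    return vsprintf('%s%s-%s-%s-%s-%s%s%s', str_split(bin2hex($bytes), 4));
}
```

Unlike the rand()-based function in the question, every random bit here comes from the same source, and none of the output depends on the user name or the current year.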

If installed on your machine, you can make use of the php-uuid package.

Example code (which can be used in your application as is) can be found here: http://rommelsantor.com/clog/2012/02/23/generate-uuid-in-php/

Use it like this:

$uuid = uuid_create(1);

Users who are able to install packages on their web server can install the required packages like this (shown here for Ubuntu):

apt-get install php5-dev uuid-dev
pecl install uuid

It's actually worth appreciating the "similar parts". They let you use the MAC address to answer "which of my servers generated this UUID?", which will be extremely helpful when migrating data between remote locations. You can even tell "this is my test data" from "this is my production data" this way.

PHP has a large number of UUID-generator libraries.

Here's one PECL/PEAR thing (I never used it):

http://pecl.php.net/package/uuid

From the CakePHP framework:

http://api.cakephp.org/class/string#method-Stringuuid (cake 2.x)
http://api13.cakephp.org/class/string#method-Stringuuid (cake 1.3)

Last generator option:

Consider using a Linux command-line uuid program, which has a -v flag to select the UUID version plus related options, and using that to feed your database. It's somewhat inefficient, but at least you won't have to write your own generator functions.

http://linux.die.net/man/1/uuid - man page

(package uuid for Debian)

I noticed that for the namespace versions, you'll be converting lots of "long human names" into UUIDs. As long as those names don't conflict, this can be very sweet. For example, for users registering with e-mail addresses, get the v5 UUID for that e-mail address and you'll always find that person! The same input produces the same UUID each time, and the UUID represents the unique relationship bob@bob.com has with example.com as a member.

uuid -v5 ns:URL "http://example.com/member/bob@bob.com/"
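If shelling out is not an option, the same name-based scheme can be sketched in PHP: hash the namespace bytes together with the name using SHA-1, keep the first 16 bytes, and set the version and variant bits as RFC 4122 describes. The function name generateUuidV5() is hypothetical, the namespace constant is the predefined URL namespace from RFC 4122, and it is an assumption that the command above uses that same namespace for ns:URL.

```php
<?php
// Sketch of a name-based (Version 5) UUID: SHA-1 over namespace bytes + name.
// generateUuidV5() is a hypothetical helper name, not a built-in.
function generateUuidV5($namespace, $name)
{
    // Namespace UUID converted to its 16 raw bytes.
    $nsBytes = pack('H*', str_replace('-', '', $namespace));

    // Keep the first 16 bytes (32 hex digits) of the SHA-1 digest.
    $hash = substr(sha1($nsBytes . $name), 0, 32);

    // Force the version nibble to 5 and the variant bits to 10.
    $timeHi   = (hexdec(substr($hash, 12, 4)) & 0x0fff) | 0x5000;
    $clockSeq = (hexdec(substr($hash, 16, 4)) & 0x3fff) | 0x8000;

    return sprintf(
        '%s-%s-%04x-%04x-%s',
        substr($hash, 0, 8),
        substr($hash, 8, 4),
        $timeHi,
        $clockSeq,
        substr($hash, 20, 12)
    );
}

// Predefined URL namespace from RFC 4122.
$urlNamespace = '6ba7b811-9dad-11d1-80b4-00c04fd430c8';
$memberId = generateUuidV5($urlNamespace, 'http://example.com/member/bob@bob.com/');
```

Because nothing random is involved, calling the function again with the same namespace and name returns the same `$memberId`, which is the lookup property described above.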

Commentary:

Also, are you storing the UUIDs as CHAR(36), as it appears? You might regret that once comparison operators kick in.

Postgres treats a UUID as a 128-bit value (and presumably does optimized binary operations), whereas MySQL's CHAR(36) solution takes 36 bytes = 288 bits in a single-byte character set, or up to 108 bytes = 864 bits in MySQL's 3-byte utf8, plus or minus a few bits/bytes for bookkeeping (and presumably uses much slower character-by-character string routines).

I've actually put a lot of consideration into the issues for MySQL plus UUID... and my conclusion was that you'd want to write up a stored function that converts the hex representation into the binary representation for storage, and that would make all "select" statements require a conversion back into hex representation... and who knows how efficient any of that will be... so finally just switch to Postgres. XD
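As a rough alternative to a stored function, the conversion can also be done in the application before the value reaches the database. The sketch below assumes the column would be declared BINARY(16), which is not something the original setup uses; uuidToBinary() and binaryToUuid() are hypothetical helper names.

```php
<?php
// Hypothetical helpers: convert between the 36-character text form
// of a UUID and its 16 raw bytes (suitable for a BINARY(16) column).
function uuidToBinary($uuid)
{
    // Drop the hyphens, then pack the 32 hex digits into 16 bytes.
    return pack('H*', str_replace('-', '', $uuid));
}

function binaryToUuid($bytes)
{
    // Expand to 32 hex digits and reinsert hyphens in the 8-4-4-4-12 layout.
    $hex = bin2hex($bytes);

    return substr($hex, 0, 8) . '-' . substr($hex, 8, 4) . '-'
         . substr($hex, 12, 4) . '-' . substr($hex, 16, 4) . '-'
         . substr($hex, 20, 12);
}

$stored   = uuidToBinary('1e5988da-afec-11e1-9877-5464f7aa6d24');
$restored = binaryToUuid($stored);
```

This keeps each key at 16 bytes instead of 36 or more, at the cost of one conversion on every read and write.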

If you do want to switch to Postgres, be very careful about installing it on your existing server(s) if they are production servers. That is, make a clone to test the migration process before actually doing the migration. I somehow managed to kill my system because of "installing this package will remove a large number of important other packages" (I don't know how the installer made those decisions).

Alternatively, go with Microsoft SQL for their GUID equivalent, if you're prepared to eventually pay them lots of money to operate a DB...

Doing UUID and MySQL just tends to be a bad idea at the moment.

The technical posts on this site are licensed under CC BY-SA 4.0. If you reprint them, please cite the site URL or the original address. For any questions, contact: yoyou2525@163.com.
