简体   繁体   中英

How do I generate the shortest database identifier names from a number?

Because I am writing software which generates SQL with a potentially large number of parameters and records that SQL onto disk, I have an uncommon requirement (perhaps more a curiosity): Generate the shortest possible unique parameter names.

Parameter names follow identifier naming rules which are usually:

  1. First character is alphabetic
  2. Subsequent characters can be alphanumeric or certain other characters, such as an underscore.
  3. Almost anything can be used if quoted (ignored -- a quoted identifier is at least three characters total, eg [_] )

The SQL generation code knows how many identifiers there are total, so names can be generated based on an integer.

This ended up being more difficult than I anticipated, and the solution less elegant. I hard-coded invalid values (starting with 0 ) because they are few and every attempt I made to derive them ended up complicated and slow. I would appreciate ideas on how to make this more elegant. I'll post on CodeReview as well.

Most databases support fewer than 2^16 parameters (a ridiculous number to actually use), but in handling numbers larger than 35027 (also ridiculous) the numbers worked out such that 1 million was a good forced stopping point.

public static String intToDatabaseIdentifier(int number)
{
    if(number < 0 || number > 1000000)
        throw new ArgumentOutOfRangeException("number");
    if(number > 25 && number <= 25 + 10) // Skip 0-9 (modified base 36)
        number += 10;
    if(number > 971 && number <= 971 + 360) // Skip 0a-09 (modified base 36)
        number += 360;
    if(number > 35027 && number <= 35027 + 12960) // Skip 0aa-099 (modified base 36)
        number += 12960;
    var stack = new Stack<char>();
    // Base 36 starting with letters rather than numbers
    const string characters = "abcdefghijklmnopqrstuvwxyz0123456789";
    while(number >= 0) {
        stack.Push(characters[number % 36]);
        number = number / 36 - 1;
    }
    return new String(stack.ToArray());
}

Results starting with 0:

a b c d e f g h i j k l m n o p q r s t u v w x y z
aa ab ac ad ae af ag ah ai aj aa ab ac ad ae af ag ah ai aj ak al am an ao
ap aq ar as at au av aw ax ay az a0 a1...

Code above produces collisions. Fixed code without collisions and magic numbers.

    public static String intToDatabaseIdentifier(int number)
    {
        const string abcFirst = "abcdefghijklmnopqrstuvwxyz";
        const string abcFull = "abcdefghijklmnopqrstuvwxyz0123456789";
        if (number < 0 || number > 1000000)
            throw new ArgumentOutOfRangeException("number");
        var stack = new Stack<char>();
        //Get first symbol. We will later reverse string. So last - will be first. 
        stack.Push(abcFirst[number % abcFirst.Length]);
        number = number / abcFirst.Length;
        //Collect remaining part
        while (number > 0)
        {
            int index = (number - 1) % abcFull.Length;
            stack.Push(abcFull[index]);
            number = (number - index) / abcFull.Length;
        }
        //Reversing to guarantee first non numeric.
        return new String(stack.Reverse().ToArray());
    }

Timur Mannapov's answer produces results similar to some of my other attempts (except his results don't have the problems noted in comments) in that the progression is not what one would expect, eg aa, ba, ca instead of aa, ab, ac : (call with String.Concat(ToParamName(i)) )

// Starts with aa, ba, ba... instead of a, b, c. Probably wouldn't be hard
// to fix but I abandoned this method because it's annoying to call using
// string.Concat(...)
public static IEnumerable<char> ToParamName(int number) {
    const string characters = "abcdefghijklmnopqrstuvwxyz0123456789";
    yield return characters[number % 26];
    number = number / 26;
    do {
        yield return characters[number % 36];
        number = number / 36 - 1;
    } while(number >= 0);
}


// Starts with a, b, c...aa, ba, ba but has collisions starting around 960
public static IEnumerable<char> ToParamName(int number) {
    const string characters = "abcdefghijklmnopqrstuvwxyz0123456789";
    yield return characters[number % 26];
    number = number / 26;
    while(number > 0) {
        yield return characters[number % 36];
        number = number / 36 - 1;
    }
}

I prefer having the results returned in a more natural order like a..z, aa, ab, ac...a9 (hey, I didn't claim I was being purely practical), but I forgot to mention that in the original post. Timur's answer covers all the original requirements, so I'll mark it correct.

I'll +1 an answer that produces results as described.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM