
Map variable-length string to int

I can't figure out how to write a function that takes the following input and produces the following output:

in (int) | out (char *)
0        | ""
1        | "a"
2        | "b"
3        | "c"
4        | "aa"
5        | "ab"
6        | "ac"
7        | "ba"
8        | "bb"
...

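It is not a simple conversion of the input to ternary, because "a" and "aa" are different strings (whereas 0 and 00 are the same number).

When only a and b are used, I found this correlation between the length of the string and the input: len = floor(log2(in + 1))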

in (int) | floor(log2(in + 1)) | out (char *)
0        | 0                   | ""
1        | 1                   | "a"
2        | 1                   | "b"
3        | 2                   | "aa"
4        | 2                   | "ab"
5        | 2                   | "ba"
6        | 2                   | "bb"
7        | 3                   | "aaa"
8        | 3                   | "aab"

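What is the general correlation between the output length and the input value if there are n different valid characters?

This is related to "Calc cell convertor in C", but it is clearly different. This code was quickly derived from the code there: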

#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* These declarations should be in a header */
extern char     *b3_row_encode(unsigned row, char *buffer);
extern unsigned  b3_row_decode(const char *buffer);

static char *b3_encode(unsigned row, char *buffer)
{
    unsigned div = row / 3;
    unsigned rem = row % 3;
    if (div > 0)
        buffer = b3_encode(div-1, buffer);
    *buffer++ = rem + 'a';
    *buffer = '\0';
    return buffer;
}

char *b3_row_encode(unsigned row, char *buffer)
{
    if (row == 0)
    {
        *buffer = '\0';
        return buffer;
    }
    return(b3_encode(row-1, buffer));
}

unsigned b3_row_decode(const char *code)
{
    unsigned char c;
    unsigned r = 0;
    while ((c = *code++) != '\0')
    {
        if (!isalpha(c))
            break;
        c = tolower(c);
        r = r * 3 + c - 'a' + 1;
    }
    return r;
}

#ifdef TEST

static const struct
{
    unsigned col;
    char     cell[10];
} tests[] =
{
    {    0,      "" },
    {    1,     "a" },
    {    2,     "b" },
    {    3,     "c" },
    {    4,    "aa" },
    {    5,    "ab" },
    {    6,    "ac" },
    {    7,    "ba" },
    {    8,    "bb" },
    {    9,    "bc" },
    {   10,    "ca" },
    {   11,    "cb" },
    {   12,    "cc" },
    {   13,   "aaa" },
    {   14,   "aab" },
    {   16,   "aba" },
    {   22,   "baa" },
    {  169, "abcba" },
};
enum { NUM_TESTS = sizeof(tests) / sizeof(tests[0]) };

int main(void)
{
    int pass = 0;

    for (int i = 0; i < NUM_TESTS; i++)
    {
        char buffer[32];
        b3_row_encode(tests[i].col, buffer);
        unsigned n = b3_row_decode(buffer);
        const char *pf = "FAIL";

        if (strcmp(tests[i].cell, buffer) == 0 && n == tests[i].col)
        {
            pf = "PASS";
            pass++;
        }
        printf("%s: Col %3u, Cell (wanted: %-8s vs actual: %-8s) Col = %3u\n",
               pf, tests[i].col, tests[i].cell, buffer, n);
    }

    if (pass == NUM_TESTS)
        printf("== PASS == %d tests OK\n", pass);
    else
        printf("!! FAIL !! %d out of %d failed\n", (NUM_TESTS - pass), NUM_TESTS);

    return (pass == NUM_TESTS) ? 0 : 1;
}

#endif /* TEST */

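The code includes a test program, a function to convert from string to integer, and a function to convert from integer to string. The tests run back-to-back conversions. The code treats the empty string as zero: b3_row_encode special-cases 0, and b3_row_decode returns 0 for an empty string.

Sample output: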

PASS: Col   0, Cell (wanted:          vs actual:         ) Col =   0
PASS: Col   1, Cell (wanted: a        vs actual: a       ) Col =   1
PASS: Col   2, Cell (wanted: b        vs actual: b       ) Col =   2
PASS: Col   3, Cell (wanted: c        vs actual: c       ) Col =   3
PASS: Col   4, Cell (wanted: aa       vs actual: aa      ) Col =   4
PASS: Col   5, Cell (wanted: ab       vs actual: ab      ) Col =   5
PASS: Col   6, Cell (wanted: ac       vs actual: ac      ) Col =   6
PASS: Col   7, Cell (wanted: ba       vs actual: ba      ) Col =   7
PASS: Col   8, Cell (wanted: bb       vs actual: bb      ) Col =   8
PASS: Col   9, Cell (wanted: bc       vs actual: bc      ) Col =   9
PASS: Col  10, Cell (wanted: ca       vs actual: ca      ) Col =  10
PASS: Col  11, Cell (wanted: cb       vs actual: cb      ) Col =  11
PASS: Col  12, Cell (wanted: cc       vs actual: cc      ) Col =  12
PASS: Col  13, Cell (wanted: aaa      vs actual: aaa     ) Col =  13
PASS: Col  14, Cell (wanted: aab      vs actual: aab     ) Col =  14
PASS: Col  16, Cell (wanted: aba      vs actual: aba     ) Col =  16
PASS: Col  22, Cell (wanted: baa      vs actual: baa     ) Col =  22
PASS: Col 169, Cell (wanted: abcba    vs actual: abcba   ) Col = 169
== PASS == 18 tests OK

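You're on the right track: each group of N-character strings is just the N-digit numbers in base M, where M is the number of symbols. So your sequence is the 0-digit ternaries (""), followed by the 1-digit ternaries ("a", "b", "c"), and so on.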
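
To make the grouping concrete: with M symbols there are M^N strings of length N, so the first index of the length-N group is 1 + M + ... + M^(N-1). Here is a minimal sketch of that count; the name first_index_of_length is mine, not from the question, and overflow is not checked.

```c
/* Hypothetical helper: index of the first string of length `len`
 * when the alphabet has `m` symbols.
 * Equals 1 + m + m^2 + ... + m^(len-1); overflow is not checked. */
unsigned first_index_of_length(unsigned m, unsigned len)
{
    unsigned first = 0;   /* number of strings shorter than k characters */
    unsigned block = 1;   /* m^k: number of strings of length k */
    for (unsigned k = 0; k < len; k++)
    {
        first += block;
        block *= m;
    }
    return first;
}
```

For three symbols this gives 0, 1, 4, 13 for lengths 0 to 3, which matches the first table ("" at 0, "a" at 1, "aa" at 4); for two symbols, length 3 starts at 7, which matches "aaa" in the second table.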

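The number of digits for a given rank n is floor(log3(2n+1)), and the first rank with d digits is (3**d-1)/2. So the 10000th string in the sequence has 9 digits; the first 9-digit string ("aaaaaaaaa") is number 9841. 10000-9841 is 159, which in base 3 is 12220; padded to 9 digits that is 000012220, and with 0, 1, 2 written as a, b, c the 10000th string is "aaaabccca".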
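
The same two steps (find the length, then write the offset within that length as a padded base-M number) can be sketched in C for any alphabet. This is only an illustration: encode_by_offset and its parameters are names I made up, and it finds the length by stepping through the groups with integer arithmetic instead of calling a floating-point log.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not from the answer): writes the n-th string over
 * `alphabet` into buf and returns its length, or -1 if buf is too small
 * or the alphabet is empty. A one-symbol alphabet works but takes n steps. */
int encode_by_offset(unsigned n, const char *alphabet, char *buf, size_t bufsize)
{
    unsigned long long m = strlen(alphabet);
    if (m == 0 || bufsize == 0)
        return -1;

    /* Step 1: find the length d and the index of the first d-character string. */
    unsigned long long first = 0;   /* index of the first string of length d */
    unsigned long long block = 1;   /* m^d: number of strings of length d */
    size_t d = 0;
    while (n - first >= block)
    {
        first += block;
        block *= m;
        d++;
    }
    if (d + 1 > bufsize)
        return -1;

    /* Step 2: write the offset within the group as a d-digit base-m number,
     * filling from the right so leading positions get alphabet[0]. */
    unsigned long long offset = n - first;
    buf[d] = '\0';
    for (size_t i = d; i > 0; i--)
    {
        buf[i - 1] = alphabet[offset % m];
        offset /= m;
    }
    return (int)d;
}
```

Called with n = 10000 and the alphabet "abc", the loop stops at d = 9 with first = 9841, and the offset 159 is written out as "aaaabccca", the same result as the hand calculation.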

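This simple code works for both of your example cases, and it should keep working if you increase the number of characters you're using. Here's a simple version in C#: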

string Convert(int input)
{
    char[] chars = { 'a', 'b', 'c' };

    string s = string.Empty;
    while (input > 0)
    {
        int digit = (input - 1) % chars.Length;
        s = s.Insert(0, chars[digit].ToString());

        input = (input-1) / chars.Length;
    }

    return s;
}

In C it's a bit more complicated:

#include <stdlib.h>
#include <string.h>

/* Returns a malloc'd string that the caller must free(),
 * or NULL if the allocation fails. */
char *Convert(int input)
{
    const char *chars = "abc";
    char result[50] = "";
    int numChars = (int)strlen(chars);
    int place = 0;

    // Generate the result string digit by digit from the least significant digit
    // The string generated by this is in reverse
    while (input > 0)
    {
        int digit = (input - 1) % numChars;

        result[place] = chars[digit];

        input = (input - 1) / numChars;
        place++;
    }

    // Fix the result string by reversing it; place is now the string length
    char *reversedResult = malloc((size_t)place + 1);   // +1 for the '\0'
    if (reversedResult == NULL)
        return NULL;
    for (int i = 0; i < place; i++)
    {
        reversedResult[i] = result[place - 1 - i];
    }
    reversedResult[place] = '\0';

    return reversedResult;
}
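
If you would rather not allocate or reverse, one possible variant fills a scratch buffer from the end and copies the result into a buffer supplied by the caller. This is a sketch under my own assumptions: convert_into and its parameters are hypothetical names, and it requires at least two symbols so the length is bounded by the number of bits in an int.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical variant: writes the string for `input` into buf and returns buf,
 * or NULL if buf is too small or the alphabet has fewer than two symbols.
 * Zero or negative input yields the empty string. */
char *convert_into(int input, const char *chars, char *buf, size_t bufsize)
{
    char tmp[sizeof(int) * 8 + 1];   /* with >= 2 symbols, at most one char per bit */
    size_t pos = sizeof(tmp) - 1;
    int numChars = (int)strlen(chars);

    if (numChars < 2 || bufsize == 0)
        return NULL;

    tmp[pos] = '\0';
    /* Fill tmp from the end so the digits come out in the right order. */
    while (input > 0)
    {
        tmp[--pos] = chars[(input - 1) % numChars];
        input = (input - 1) / numChars;
    }

    size_t len = sizeof(tmp) - 1 - pos;
    if (len + 1 > bufsize)
        return NULL;
    memcpy(buf, tmp + pos, len + 1);   /* copies the terminating '\0' too */
    return buf;
}
```

With char buf[16], convert_into(5, "abc", buf, sizeof buf) leaves "ab" in buf, and nothing needs to be freed afterwards.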

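Isn't it simply a = 1, b = 2, c = 3, each multiplied by 3^n, where n is the position in the string? That seems like a slightly odd definition of ternary. A blank string is equivalent to the empty string no matter how long it is, because anything other than a, b, and c contributes 0 to the output value.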

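Also, the two tables in your question appear to conflict. In the first table, 5 -> "ab", while in the second table 5 -> "ba". Is this intentional? If so, the function is not one-to-one, and your question is even more ambiguous.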
