Map variable-length string to int
I can't figure out how to write a function that takes the following input and produces the following output:
in (int) | out (char *)
0 | ""
1 | "a"
2 | "b"
3 | "c"
4 | "aa"
5 | "ab"
6 | "ac"
7 | "ba"
8 | "bb"
...
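It isn't simply a conversion of the input to ternary, because there is a difference between "a" and "aa" (whereas there is no difference between 0 and 00).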
I did find a correlation between the length of the string and the input when you only use a and b (len = floor(log2(in + 1))):
in (int) | floor(log2(in + 1)) | out (char *)
0 | 0 | ""
1 | 1 | "a"
2 | 1 | "b"
3 | 2 | "aa"
4 | 2 | "ab"
5 | 2 | "ba"
6 | 2 | "bb"
7 | 3 | "aaa"
8 | 3 | "aab"
What is the general correlation between the length of the output and the input value if there are n different valid characters?
This is related to "Calc cell convertor in C", but is distinctly different. This code was quickly derived from that code:
#include <ctype.h>
#include <stdio.h>
#include <string.h>
/* These declarations should be in a header */
extern char *b3_row_encode(unsigned row, char *buffer);
extern unsigned b3_row_decode(const char *buffer);

static char *b3_encode(unsigned row, char *buffer)
{
    unsigned div = row / 3;
    unsigned rem = row % 3;
    if (div > 0)
        buffer = b3_encode(div-1, buffer);
    *buffer++ = rem + 'a';
    *buffer = '\0';
    return buffer;
}

char *b3_row_encode(unsigned row, char *buffer)
{
    if (row == 0)
    {
        *buffer = '\0';
        return buffer;
    }
    return(b3_encode(row-1, buffer));
}

unsigned b3_row_decode(const char *code)
{
    unsigned char c;
    unsigned r = 0;
    while ((c = *code++) != '\0')
    {
        if (!isalpha(c))
            break;
        c = tolower(c);
        r = r * 3 + c - 'a' + 1;
    }
    return r;
}

#ifdef TEST
static const struct
{
    unsigned col;
    char cell[10];
} tests[] =
{
    {   0, ""      },
    {   1, "a"     },
    {   2, "b"     },
    {   3, "c"     },
    {   4, "aa"    },
    {   5, "ab"    },
    {   6, "ac"    },
    {   7, "ba"    },
    {   8, "bb"    },
    {   9, "bc"    },
    {  10, "ca"    },
    {  11, "cb"    },
    {  12, "cc"    },
    {  13, "aaa"   },
    {  14, "aab"   },
    {  16, "aba"   },
    {  22, "baa"   },
    { 169, "abcba" },
};
enum { NUM_TESTS = sizeof(tests) / sizeof(tests[0]) };

int main(void)
{
    int pass = 0;
    for (int i = 0; i < NUM_TESTS; i++)
    {
        char buffer[32];
        b3_row_encode(tests[i].col, buffer);
        unsigned n = b3_row_decode(buffer);
        const char *pf = "FAIL";
        if (strcmp(tests[i].cell, buffer) == 0 && n == tests[i].col)
        {
            pf = "PASS";
            pass++;
        }
        printf("%s: Col %3u, Cell (wanted: %-8s vs actual: %-8s) Col = %3u\n",
               pf, tests[i].col, tests[i].cell, buffer, n);
    }
    if (pass == NUM_TESTS)
        printf("== PASS == %d tests OK\n", pass);
    else
        printf("!! FAIL !! %d out of %d failed\n", (NUM_TESTS - pass), NUM_TESTS);
    return (pass == NUM_TESTS) ? 0 : 1;
}
#endif /* TEST */
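The code includes a test program, a function to convert from string to integer, and a function to convert from integer to string. The tests run back-to-back conversions (encode, then decode the result). As shown, the code treats the empty string as zero: b3_row_encode special-cases row == 0 and b3_row_decode returns 0 for an empty string, which the first test case confirms.
Sample output: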
PASS: Col   0, Cell (wanted:          vs actual:         ) Col =   0
PASS: Col   1, Cell (wanted: a        vs actual: a       ) Col =   1
PASS: Col   2, Cell (wanted: b        vs actual: b       ) Col =   2
PASS: Col   3, Cell (wanted: c        vs actual: c       ) Col =   3
PASS: Col   4, Cell (wanted: aa       vs actual: aa      ) Col =   4
PASS: Col   5, Cell (wanted: ab       vs actual: ab      ) Col =   5
PASS: Col   6, Cell (wanted: ac       vs actual: ac      ) Col =   6
PASS: Col   7, Cell (wanted: ba       vs actual: ba      ) Col =   7
PASS: Col   8, Cell (wanted: bb       vs actual: bb      ) Col =   8
PASS: Col   9, Cell (wanted: bc       vs actual: bc      ) Col =   9
PASS: Col  10, Cell (wanted: ca       vs actual: ca      ) Col =  10
PASS: Col  11, Cell (wanted: cb       vs actual: cb      ) Col =  11
PASS: Col  12, Cell (wanted: cc       vs actual: cc      ) Col =  12
PASS: Col  13, Cell (wanted: aaa      vs actual: aaa     ) Col =  13
PASS: Col  14, Cell (wanted: aab      vs actual: aab     ) Col =  14
PASS: Col  16, Cell (wanted: aba      vs actual: aba     ) Col =  16
PASS: Col  22, Cell (wanted: baa      vs actual: baa     ) Col =  22
PASS: Col 169, Cell (wanted: abcba    vs actual: abcba   ) Col = 169
== PASS == 18 tests OK
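You're on the right track: each group of N-character strings is just the N-digit numbers in base M, where M is the number of symbols. So your sequence is the 0-digit ternaries (""), followed by the 1-digit ternaries ("a", "b", "c"), and so on.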
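The number of digits for a given rank n is floor(log3(2n+1)), and the first rank with d digits is (3**d-1)/2. So the 10000th string in the sequence has 9 digits; the first 9-digit string ("aaaaaaaaa") is number 9841. 10000-9841 is 159, which in base 3 is 12220, so the 10000th string is "aaaabccca".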
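The same counting argument extends to an alphabet of M symbols: there are M**d strings of length d, so the length for rank n is floor(logM((M-1)n+1)) and the first rank of length d is (M**d-1)/(M-1). The sketch below computes both with integer arithmetic rather than floating-point logarithms. The names seq_length and first_rank are hypothetical helpers, not part of any answer above, and first_rank does no overflow checking.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper: length of the string for rank n when the
 * alphabet has `base` symbols (base >= 2). Equivalent to
 * floor(log_base((base - 1) * n + 1)), without floating point. */
unsigned seq_length(unsigned long n, unsigned base)
{
    assert(base >= 2);
    unsigned len = 0;
    unsigned long block = 1;        /* base^len = number of strings of length len */
    while (n >= block)
    {
        n -= block;                 /* skip every string of the current length */
        len++;
        if (block > ULONG_MAX / base)
            break;                  /* next power would overflow, so it exceeds n */
        block *= base;
    }
    return len;
}

/* Hypothetical helper: first rank whose string has `len` characters,
 * i.e. (base^len - 1) / (base - 1). No overflow checks. */
unsigned long first_rank(unsigned len, unsigned base)
{
    assert(base >= 2);
    unsigned long rank = 0;
    unsigned long block = 1;
    for (unsigned i = 0; i < len; i++)
    {
        rank += block;              /* add the count of strings of length i */
        block *= base;
    }
    return rank;
}
```

With base 3 this reproduces the figures in the worked example: seq_length(10000, 3) is 9 and first_rank(9, 3) is 9841. With base 2 it matches the floor(log2(in + 1)) column from the question.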
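This simple code works for both of your example cases, and it should keep working if you increase the number of characters you are using. Here's a simple version in C#: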
string Convert(int input)
{
    char[] chars = { 'a', 'b', 'c' };
    string s = string.Empty;
    while (input > 0)
    {
        int digit = (input - 1) % chars.Length;
        s = s.Insert(0, chars[digit].ToString());
        input = (input - 1) / chars.Length;
    }
    return s;
}
In C it's a bit more complicated:
#include <stdlib.h>
#include <string.h>

/* Returns a newly allocated string; the caller must free() it. */
char *Convert(int input)
{
    const char *chars = "abc";
    char result[50] = "";
    int numChars = (int)strlen(chars);
    int place = 0;
    // Generate the result string digit by digit from the least significant digit
    // The string generated by this is in reverse
    while (input > 0)
    {
        int digit = (input - 1) % numChars;
        result[place] = chars[digit];
        input = (input - 1) / numChars;
        place++;
    }
    // Fix the result string by reversing it;
    // allocate one extra byte for the terminating '\0'
    char *reversedResult = malloc((size_t)place + 1);
    if (reversedResult == NULL)
        return NULL;
    for (int i = 0; i < place; i++)
    {
        reversedResult[i] = result[place - 1 - i];
    }
    reversedResult[place] = '\0';
    return reversedResult;
}
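To illustrate the claim that the same loop works for a larger set of characters, here is a minimal sketch that takes the alphabet as a parameter and writes into a caller-supplied buffer instead of allocating. The function name convert_n and its signature are assumptions made for this example; it assumes an alphabet of at least two symbols.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical generalisation of Convert(): writes the string for `input`
 * into `out` using the symbols in `alphabet` (at least 2 of them).
 * Returns `out`, or NULL if `outsize` is too small. */
char *convert_n(unsigned input, const char *alphabet, char *out, size_t outsize)
{
    unsigned base = (unsigned)strlen(alphabet);
    assert(base >= 2);
    if (outsize == 0)
        return NULL;

    size_t len = 0;
    while (input > 0)
    {
        if (len + 1 >= outsize)
            return NULL;                    /* no room for this digit plus '\0' */
        out[len++] = alphabet[(input - 1) % base];
        input = (input - 1) / base;
    }
    out[len] = '\0';

    /* Digits were produced least significant first; reverse in place. */
    size_t i = 0;
    size_t j = len;
    while (i + 1 < j)
    {
        --j;
        char t = out[i];
        out[i] = out[j];
        out[j] = t;
        ++i;
    }
    return out;
}
```

Called with "abc" this reproduces the first table in the question (5 gives "ab"), and with "ab" it reproduces the second (5 gives "ba"). Letting the caller own the buffer avoids the need to free() the result.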
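Isn't it simply a = 1, b = 2, c = 3, each multiplied by 3^n, where n is the position in the string? That seems like a slightly odd definition of ternary. Empty strings are equivalent no matter how long they are, since anything other than a, b and c contributes 0 to the output value.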
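Also, the two tables in your question seem to conflict. In the first table, 5 -> "ab", while in the second table 5 -> "ba". Is that deliberate? If so, the function is not one-to-one and your question is even more ambiguous.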
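The positional weighting suggested in this answer can be sketched as follows, reading n as the distance from the right-hand end of the string. The function name weighted_value is hypothetical, and treating every character outside a, b, c as a zero digit is this answer's interpretation rather than something the question specifies.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical decoder: 'a' = 1, 'b' = 2, 'c' = 3, each weighted by
 * 3^(distance from the right-hand end); any other character counts as 0. */
unsigned weighted_value(const char *s)
{
    assert(s != NULL);
    unsigned value = 0;
    for (; *s != '\0'; s++)
    {
        unsigned digit = (*s >= 'a' && *s <= 'c') ? (unsigned)(*s - 'a') + 1 : 0;
        value = value * 3 + digit;   /* Horner's rule: earlier digits move up one place */
    }
    return value;
}
```

For "abcba" this evaluates 1*81 + 2*27 + 3*9 + 2*3 + 1 = 169, which agrees with the corresponding entry in the test table of the first answer.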