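Equal strings produce different hash indexes
I have a program that replicates an in-memory file system (not finished yet). It has to read its commands from a file, and they are pretty self-explanatory: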
create /foo
create /foo/bar
create /foo/baz
create /foo/baz/qux
write /foo/bar "test"
read /foo/bar
read /foo/baz/qux
read /foo/baz/quux
create /foo/bar
create /dir
create /bar
create /dir/bar
find bar
delete /foo/bar
find wat
find foo
read /foo/bar
create /foo/bar
read /foo/bar
delete_r /foo
exit
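Then I have a function that, given a string, splits it and stores the folder names in an array of strings. command holds the command string, while the fullPath string is supplied by another function that builds a new string from the previously created array of strings. This is the command struct: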
typedef struct _command {
    unsigned char command[10];
    unsigned char path[255][255];
    unsigned char* fullPath;
    int pathLevels;
} command;
This is the node struct, which actually implements the tree structure:
typedef struct _node {
    int isRoot;
    int isDir;
    char* message;
    int childNumber;
    struct _node* childNodes[1024];
    unsigned char fullPath[MAX_LEN_PATH];
    unsigned char resName[255];
} node;
And the function that processes the string:
command* createCommandMul(unsigned char* str) {
    unsigned char* c = str;
    command* commandPointer = (command*) malloc(sizeof(command));
    //commandPointer->path[0][0] = '/';
    //commandPointer->path[0][1] = '\0';
    int commandIndex = 0;
    int pathLevel = 0;
    int pathIndex = 0;
    /* Command part */
    while(*c != ' ' && commandIndex < 10) {
        commandPointer->command[commandIndex] = *c++;
        commandIndex++;
    }
    while(commandIndex<10) {
        commandPointer->command[commandIndex] = '\0';
        commandIndex++;
    }
    while(*c == ' ' || *c == '/') c++;
    /* Path part */
    while(*c != '\0') {
        if (*c == '/') {
            commandPointer->path[pathLevel][pathIndex] = '\0';
            pathLevel++;
            pathIndex = 0;
            c++;
        } else {
            commandPointer->path[pathLevel][pathIndex] = *c++;
            pathIndex++;
        }
    }
    commandPointer->path[pathLevel][pathIndex] = '\0';
    commandPointer->pathLevels = pathLevel;
    return commandPointer;
}
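I have a createDir function that checks whether the node* passed to it is a directory or the root (picture the whole thing as a tree); if it is, it creates the node.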
int createDir(node* fatherOfChildToCreate, unsigned char* fullPath, command* currentCommand) {
    if ((fatherOfChildToCreate->isRoot == 1 || fatherOfChildToCreate->isDir == 1) && fatherOfChildToCreate->childNumber < 1024) {
        node* dirToCreate = (node*) malloc(sizeof(node));
        command* comando = (command*) currentCommand;
        dirToCreate->isDir = 1;
        dirToCreate->isRoot = 0;
        dirToCreate->message = NULL;
        dirToCreate->childNumber = 0;
        strcmp(dirToCreate->fullPath, fullPath);
        for (int i = 0; i < 1024; i++) dirToCreate->childNodes[i] = NULL;
        int index = (int) hashCalc(comando->path[comando->pathLevels]);
        printf("Hash di %s = %d", comando->path[comando->pathLevels], index);
        fatherOfChildToCreate->childNodes[index] = dirToCreate;
        fatherOfChildToCreate->childNumber += 1;
        return 1;
    } else return 0;
}
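Note that the purpose of this createDir function is to create a direct subdirectory of node* fatherOfChildToCreate. So the first command in the text file uses this function to create /foo, because its only parent directory is the root, which is created in main(). The second command uses the function below to search for the /foo directory, and since that is the parent of /foo/bar, the resulting pointer is passed to createDir, which creates the child node inside the /foo directory.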
node* linearSearchUpper(node* rootNode, unsigned char* upperPath, command* currentCommand) {
    command* comandoSearch = (command*) currentCommand;
    node* curr = (node*) rootNode;
    int counter = comandoSearch->pathLevels;
    int index;
    unsigned char* upperName = comandoSearch->path[comandoSearch->pathLevels - 1];
    for (int i = 0; i < counter; i++) {
        index = (int) hashCalc(comandoSearch->path[i]);
        printf("Hash di %s = %d", comandoSearch->path[i], index);
        if (curr->childNodes[index] == NULL) return NULL;
        else curr = curr->childNodes[index];
    }
    if (strcmp(upperPath, curr->fullPath) == 1) return curr;
}
In all of these I use the following hash function, both to search for the parent directory and to insert a new element into the node->childNodes[] array.
unsigned long hashCalc(unsigned char* str) {
    unsigned long hash = 5381;
    int c;
    while (c = *str++)
        hash = ((hash << 5) + hash) + c; /* hash * 33 + c */
    return hash % 1024;
}
Now I'll paste main(), which is the last function to look at.
int main() {
    node* rootNode = (node*) createRoot();
    command* comando = (command*) malloc(sizeof(command));
    unsigned char* upPath = NULL;
    unsigned char* allPath = NULL;
    unsigned char* line = NULL;
    FILE* fp;
    size_t len = 0;
    ssize_t read;
    fp = fopen("/Users/mattiarighetti/Downloads/semplice.txt", "r");
    if (fp == NULL)
        exit(EXIT_FAILURE);
    while ((read = getline(&line, &len, fp)) != -1) {
        if (*line == 'f') {
            //comandoFind = createCommandFind(line);
        } if (*line == 'w') {
            //comandoWrite = createCommandWrite(line);
        } if (*line == 'c') {
            comando = createCommandMul(line);
            upPath = upperPath(comando);
            allPath = fullPath(comando);
            if (comando->pathLevels == 0) {
                if (createDir(rootNode, allPath, comando) == 1) printf("ok\n\n");
                else printf("no\n\n");
            } else {
                node* upperNode = (node*) linearSearchUpper(rootNode, upPath, comando);
                if (upperNode == NULL) {
                    printf("no\n\n");
                }
                else {
                    if (createDir(upperNode, allPath, comando) == 1) printf("ok\n\n");
                    else printf("no\n\n");
                }
            }
        }
    }
    fclose(fp);
    if (line)
        free(line);
    return 0;
}
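So what this does is read the file line by line, create and fill the command struct, and then build upPath, which is the parent (the one to be found), and fullPath. The problem is that the program uses createDir for the first line of the text file, which is fine, but for some strange reason, with comando->path[i] reading foo, the hash function gives me 179, which is not correct. The input continues, and for the second line it uses linearSearchUpper() to search for the parent folder /foo. It is given comando->path[i], which is also foo, but this time hashCalc gives me 905, which should be the correct answer. In the end, linearSearchUpper cannot find the /foo folder because nothing is stored at index 905. This happens every time I use the create or create_dir command on a folder that is a direct child of the root: /foo, /dir and /bar all give me a strange hash index.
Do you have any idea why this could happen?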
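I haven't tried to understand your whole program, but the strings you get different hash values for really are different: one of them retains the newline at the end of the line, probably left there by the line-reading call (the posted main() uses getline, which, like fgets, keeps the trailing newline).
The ASCII newline character has the numeric value 10, so: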
hash("foo") == 905;
hash("foo\n") == (33 * hash("foo") + '\n') % 1024
== (33 * 905 + 10) % 1024
== 179
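The arithmetic above can be checked with a short C sketch. It re-implements the hash from the question under the assumed name hash_calc (taking a const char* for convenience), and check_foo_hashes is a hypothetical helper added only for this illustration:

#include <assert.h>

/* Same algorithm as hashCalc in the question (djb2, reduced modulo 1024). */
unsigned long hash_calc(const char *str) {
    unsigned long hash = 5381;
    unsigned char c;
    while ((c = (unsigned char)*str++) != '\0')
        hash = ((hash << 5) + hash) + c; /* hash * 33 + c */
    return hash % 1024;
}

/* Hypothetical helper: aborts if the values quoted above do not hold. */
void check_foo_hashes(void) {
    assert(hash_calc("foo") == 905);
    assert(hash_calc("foo\n") == 179);
    /* One extra character means one extra step: h' = (33 * h + c) mod 1024. */
    assert(hash_calc("foo\n") == (33 * hash_calc("foo") + '\n') % 1024);
}

Calling check_foo_hashes() from a main() should return silently, which matches the two indexes reported in the question.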
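The solution is to strip trailing whitespace from the string you read (whether it comes from fgets or getline), or to use better tokenization that guarantees your tokens have no leading or trailing whitespace.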
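Both options could look roughly like the sketch below. strip_newline and split_path are hypothetical helpers, not part of the posted program, and split_path assumes that path components never contain whitespace:

#include <string.h>

/* Hypothetical helper: cut the line at the first '\r' or '\n', in place. */
void strip_newline(char *line) {
    line[strcspn(line, "\r\n")] = '\0';
}

/*
 * Hypothetical helper: split a path such as "/foo/bar\n" into components.
 * Slashes and whitespace are treated as delimiters, so no token can carry
 * a leading or trailing newline. The input buffer is modified by strtok.
 * Returns the number of components stored in parts.
 */
int split_path(char *path, char *parts[], int max_parts) {
    int n = 0;
    for (char *tok = strtok(path, "/ \t\r\n");
         tok != NULL && n < max_parts;
         tok = strtok(NULL, "/ \t\r\n"))
        parts[n++] = tok;
    return n;
}

With the posted code, calling something like strip_newline on each line right after it is read, before createCommandMul sees it, should be enough to make both lookups hash the same string.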