
How to tokenize a string using strsep()

I have a kernel module in which I'm trying to split a string using strsep(). The decrypt variable below contains the string I'm trying to split.

unsigned char decrypt[KEY_SIZE];
printk(KERN_DEBUG "%s\n", decrypt);

Output:

N = D0C2ACDCF780B1E4846054BDA700F18D567247FE8BC5BA4FBCAB814E619DA63A20F65A58EE89FC0824DC9367C5725BDDC596065F1C8868E99C896F3A0CF7D7F0A785E668F2568F19BAFB8FF3BA5CDF487544EFE71010BEDB4EE16EDC3AF0A71391AD3194B42D3FD40B4E0DE12A22D8188AF03FF4E36D37BA1DA1F3C57188E60DA38C25329E48805FC7FF524761A6F010E737B927D8F67383274F8E600167A52A042E1DCA3335150C090803F9D96F6E63BEBFB153516E863F5B4CB02104077834FC5EC31A47451783931D643CE736DD1BAB40C5523858BB067FB9E490DCB5FDBBB03B9D68A8998C1347E237C477AA14B0997A84708CED05A9E24C7072B838F753
E = 010001
D = 21AFE07431CE47269083F8F8B7ABCBCEDA6DCB975457BE6662942C64091586FEE755C9A3832EAA0868665DB507A41A15F1EDF12E44ECF03D0E61111D457D730FA700D0FB0B6C13607C0F5F1DDDEB61AE9019E53A9C998F2AD5924430EEA3E9DA1B0E5F2B575DDBE86C4096B5C87661F7A7E7F7F21D0701509BBA881B4AE463F6F18C7F04AB742319E2D7319EECA136EEB0CF7B2BFA87E3A0E69FBC0E5FDC7EE6271EB2CA09DDBF7C8B57D951762708D76890E62858C1D5FC5B7E40D50913CE7797BD80F6A398FB92703FBDD33FBCB129B86E54F13EC14DA68BE139634DD1E9C01F01751
...
...

I'm using the following code to extract the values. My goal is to get the value of N, E, and D in each case. When I call this module, my machine freezes. However, when I use gdb to debug the loop, it works.

As @John Bollinger asked, I have the following lines to make the string null-terminated before using strsep().

size_t length = strlen(decrypt);
int N = length - 2361; // 2361 is the original size
decrypt[length - N] = '\0';

Code:

char *s3 = decrypt;
int k = 0;
int size = 0;
char *test;

while (s3 != NULL) {
    test = strsep(&s3, " ");
    test = strsep(&s3, " ");
    test = strsep(&s3, "\n");

    switch (k) {
      case 0:
        size = strlen(test);
        printk(KERN_DEBUG "token id %d: size %d, token is %s\n", k, size, test);
        break;

      case 1:
        size = strlen(test);
        printk(KERN_DEBUG "token id %d: size %d, token is %s\n", k, size, test);
        break;
        
      case 2:
        size = strlen(test);
        printk(KERN_DEBUG "token id %d: size %d, token is %s\n", k, size, test);
        break;
  
      ........
      ........
    }    
    k = k + 1;
} 

Can someone please tell me, what am I doing incorrectly here? Or is there another thread-safe function available for splitting a string? Thanks in advance.

Kernel version: Linux 4.15.0-142-generic

Can someone please tell me, what am I doing incorrectly here?

The one thing I see that you are clearly doing wrong is not validating your results before using them. If the input happens not to be formatted exactly as you expect, then test can easily be null on entry to the switch, where strlen(test) would dereference it, and the code does not appear to anticipate that possibility.

Of course, "not formatted exactly as you expect" might also mean having the wrong expectations. For example, one way to trigger the aforementioned null value of test would be for the last line of the input to end with a newline. In that case, s3 is left pointing at an empty string rather than being set to null, so the loop condition still holds and strsep does not notice that it has reached the end of the string until the next cycle.
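A minimal sketch of a loop that validates each strsep() result before using it. It is written as userspace C so it can be compiled and tried outside the kernel; parse_values and its parameters are hypothetical names, not part of the original module, and the "NAME = VALUE" line format is assumed from the output shown in the question.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* exposes strsep() in glibc's <string.h> */
#endif
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: parses lines of the form "NAME = VALUE\n" in place.
 * Stores up to max pointers to the VALUE fields in values[] and returns the
 * number stored, or -1 if a line is malformed. buf must be NUL-terminated
 * and writable, because strsep() overwrites delimiters with '\0'. */
static int parse_values(char *buf, const char **values, int max)
{
    char *cursor = buf;
    int count = 0;

    /* Stop on NULL, or on the empty remainder left by a trailing newline. */
    while (cursor != NULL && *cursor != '\0' && count < max) {
        char *name = strsep(&cursor, " ");   /* e.g. "N" */
        char *eq = strsep(&cursor, " ");     /* "=" */
        char *value = strsep(&cursor, "\n"); /* the hex digits */

        /* strsep() returns NULL once the input has been exhausted. */
        if (name == NULL || eq == NULL || value == NULL)
            return -1;
        if (strcmp(eq, "=") != 0)
            return -1;

        values[count++] = value;
    }
    return count;
}
```

In the module itself, the same checks would sit between the strsep() calls and the switch, with the error path returning instead of calling strlen() on a null pointer.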

Also, this declaration is suspicious:

unsigned char decrypt[KEY_SIZE];

If the data can be up to KEY_SIZE in length, then that leaves no room for a string terminator. You must have a string terminator if you are processing the data as a string via strsep(), or printing it (or a tail of it) as a string via printk. If the data don't naturally have a terminator, then you need to make sure to add one, and you must leave space for it.
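One way to reserve that space, as a sketch: size the buffer KEY_SIZE + 1 and write the terminator at the known data length rather than at a position found with strlen(). The helper name copy_terminated and the KEY_SIZE value of 8 are placeholders for illustration; the real KEY_SIZE and the way the decrypted length is obtained are not shown in the question.

```c
#include <stddef.h>
#include <string.h>

#define KEY_SIZE 8 /* placeholder; the real value is not shown in the question */

/* Hypothetical helper: copies at most KEY_SIZE raw bytes into a buffer that
 * reserves one extra byte for the terminator, then terminates it explicitly.
 * Returns the number of data bytes copied. */
static size_t copy_terminated(char dst[KEY_SIZE + 1],
                              const unsigned char *src, size_t src_len)
{
    size_t n = src_len < KEY_SIZE ? src_len : KEY_SIZE;

    memcpy(dst, src, n);
    dst[n] = '\0'; /* always in bounds, because n <= KEY_SIZE */
    return n;
}
```

The point of the design is that the terminator index comes from a length you already know, so nothing ever has to scan an unterminated buffer to find its end.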

Additionally, not wrong per se, but wasteful: calling strlen() on every token except the last. Immediately after the strsep() call that produced it, you can get the length of each of the other tokens by simple pointer difference: s3 - test - 1.
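The pointer-difference idea can be sketched as a small wrapper, again in userspace C. next_token is a hypothetical name; it falls back to strlen() only for the last token, where strsep() sets the cursor to null and no difference can be taken.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* exposes strsep() in glibc's <string.h> */
#endif
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: splits off the next token and reports its length
 * without rescanning it whenever a delimiter was found. */
static char *next_token(char **cursor, const char *delims, size_t *len)
{
    char *token = strsep(cursor, delims);

    if (token == NULL) {
        *len = 0; /* input already exhausted */
    } else if (*cursor != NULL) {
        /* *cursor points one past the delimiter that was overwritten. */
        *len = (size_t)(*cursor - token) - 1;
    } else {
        /* Last token: no delimiter was found, so measure it directly. */
        *len = strlen(token);
    }
    return token;
}
```

For a 512-character hex value like the ones in the question, this avoids a second pass over the token on every iteration.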
