C strlen using pointers

I have seen the standard implementation of strlen using pointers as:

int strlen(char * s) {
  char *p = s;
  while (*p!='\0')
    p++;
  return p-s;
}

I get how this works, but when I tried to do it in three more ways (I'm learning pointer arithmetic right now), I wanted to know what's wrong with them.

  1. This is somewhat similar to what the book does. Is this wrong?

     int strlen(char * s) { char *p = s; while (*p) p++; return p-s; }
  2. I thought this would be wrong if I passed an empty string, but it still gave me 0, which is kind of confusing since p is pre-incremented. (And now it's returning 5.)

     int strlen(char * s) { char *p = s; while (*++p) ; return p-s; }
  3. Figured this one out: it does the post-increment and returns the length plus 1.

     int strlen(char * s) { char *p = s; while (*p++) ; return p-s; }

1) Looks fine to me. I personally prefer the explicit comparison against '\0' so that it's clear you didn't mean to (for example) compare p to the NULL pointer in situations where that isn't clear from context.

2) For an empty string, s points at a single '\0' byte, the same as char p[1] = {0}. Pre-incrementing skips that byte and looks at the byte immediately after the \0, so you're reading memory that isn't part of the string. That is undefined behavior. If the string lives in a local array, you are peeking at whatever uninitialized data happens to sit next to it on the stack, so the result depends on leftover bytes and can change from one run to the next. For a non-empty string the first test lands on s[1], so the loop still stops at the terminator. Here be dragons!

3) I don't think there's a question there :) As you see, it always returns one more than the correct answer.

Addendum: You can also write this using a for loop, if you prefer that style:

size_t strlen(char * s) {
    char *p = s;
    for (; *p != '\0'; p++) {}
    return p - s;
}

Or, more error-prone because the trailing semicolon is easy to miss:

size_t strlen(char * s) {
    char *p = s;
    for (; *p != '\0'; p++);
    return p - s;
}

Also, strlen can't return a negative number, so you should use an unsigned return type. size_t is even better.
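
A minimal sketch of that advice follows. The name my_strlen is made up for this example (a function of your own called strlen clashes with the one in the standard library), and the const qualifier is an addition that signals the string is not modified.

```c
#include <stddef.h>

/* my_strlen is a hypothetical name chosen to avoid clashing with strlen. */
size_t my_strlen(const char *s) {
    const char *p = s;
    while (*p != '\0')      /* stop on the terminating nul byte */
        p++;
    /* p - s has the signed type ptrdiff_t; it is never negative here,
       so converting it to size_t is safe. */
    return (size_t)(p - s);
}
```

Called as my_strlen("hello"), this returns 5, and my_strlen("") returns 0.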

Version 1 is fine: while (*p != '\0') is equivalent to while (*p != 0), which is equivalent to while (*p).

In the original code and version 1, the pointer p is advanced if and only if *p is not 0 (in other words, you're not at the end of the string).

Versions 2 and 3 advance p regardless of whether *p is 0 or not. *p++ evaluates to the character p points to, and as a side effect advances p. *++p evaluates to the character following the one p points to, and as a side effect advances p. Therefore version 3 always leaves p one past the terminator, which is why its result is one too high. Version 2 never tests the first character: for a non-empty string it still stops on the terminator, but for an empty string it steps over the terminator and keeps reading, which is why that value is off.
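
To make the difference concrete, here is a sketch of the three loops side by side. The function names are invented for this example, and the last function shows one possible repair for version 3 (subtracting 1), which is an assumption about what you want rather than the only fix.

```c
#include <stddef.h>

/* Version 1: test first, advance only while the test passes. */
size_t len_v1(const char *s) {
    const char *p = s;
    while (*p) p++;
    return (size_t)(p - s);
}

/* Version 2: advance first, then test. s[0] is never examined, so
   calling this with an empty string reads past the terminator
   (undefined behavior). Only call it with non-empty strings. */
size_t len_v2(const char *s) {
    const char *p = s;
    while (*++p) ;
    return (size_t)(p - s);
}

/* Version 3: test, then advance even when the test fails, so p
   ends up one past the terminator. */
size_t len_v3(const char *s) {
    const char *p = s;
    while (*p++) ;
    return (size_t)(p - s);
}

/* Version 3 repaired by compensating for the extra increment. */
size_t len_v3_fixed(const char *s) {
    const char *p = s;
    while (*p++) ;
    return (size_t)(p - s - 1);
}
```

For "hello", len_v1 and len_v2 return 5, len_v3 returns 6, and len_v3_fixed returns 5. For "", len_v1 and len_v3_fixed return 0 and len_v3 returns 1.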

One issue you will run into when you compare the performance of strlen replacement functions is that they tend to be slower than the actual strlen on long strings. Why? Library implementations of strlen typically process more than one byte per iteration when searching for the end of the string. How can you implement a more efficient replacement?

It's not that difficult. The basic approach is to look at 4 bytes per iteration and adjust the return value based on where within those 4 bytes the nul byte is found. You could do something like the following (using array indexing):

size_t strsz_idx (const char *s) {
    size_t len = 0;
    for(;;) {
        if (s[0] == 0) return len;
        if (s[1] == 0) return len + 1;
        if (s[2] == 0) return len + 2;
        if (s[3] == 0) return len + 3;
        s += 4, len += 4;
    }
}
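
One way to convince yourself that the early returns cover every position is to compare the function against the library strlen for a range of lengths. The sketch below repeats strsz_idx so that it compiles on its own. The helper count_mismatches and the choice of lengths 0 through 16 are assumptions made for this example.

```c
#include <stddef.h>
#include <string.h>

/* Copy of strsz_idx from above so this block is self-contained. */
size_t strsz_idx(const char *s) {
    size_t len = 0;
    for (;;) {
        if (s[0] == 0) return len;
        if (s[1] == 0) return len + 1;
        if (s[2] == 0) return len + 2;
        if (s[3] == 0) return len + 3;
        s += 4, len += 4;
    }
}

/* Hypothetical helper: builds strings of length 0..16 and counts how
   often strsz_idx disagrees with the library strlen. */
int count_mismatches(void) {
    char buf[17];
    int bad = 0;
    for (size_t n = 0; n <= 16; n++) {
        memset(buf, 'x', n);   /* n non-nul characters */
        buf[n] = '\0';         /* terminator right after them */
        if (strsz_idx(buf) != strlen(buf))
            bad++;
    }
    return bad;
}
```

Because each s[i] is read only after s[i-1] was found to be non-zero, this version never reads beyond the terminator.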

You can do much the same thing using pointers and masks. Note that this version assumes a little-endian machine and that up to 3 bytes after the terminator are readable:

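#include <stddef.h>
#include <stdint.h>
#include <string.h>

size_t strsz (const char *s) {
    size_t len = 0;
    for(;;) {
        uint32_t x;
        /* memcpy avoids the misaligned, aliasing-violating *(unsigned*)s load */
        memcpy(&x, s, sizeof x);   /* may read up to 3 bytes past the nul */
        /* little-endian: the lowest byte of x is s[0] */
        if((x & 0xffu) == 0) return len;
        if((x & 0xff00u) == 0) return len + 1;
        if((x & 0xff0000u) == 0) return len + 2;
        if((x & 0xff000000u) == 0) return len + 3;
        s += 4, len += 4;
    }
}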

Either way, handling 4 bytes per iteration can bring you much closer to the performance of strlen itself, though how close depends on your compiler and C library, so measure it.
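
As a further step, the four mask tests can be preceded by a single check that asks whether the word contains any zero byte at all, using a widely known bit trick. The sketch below works on a 32-bit value only and does not load from a string. The names has_zero_byte and first_zero_byte_le are invented for this example, and nothing here is claimed to be what a particular C library does.

```c
#include <stdint.h>

/* Returns 1 if any of the four bytes of x is zero, else 0.
   Subtracting 1 from each byte borrows into its top bit only when the
   byte was 0 (or had its top bit set already, which "& ~x" filters out). */
int has_zero_byte(uint32_t x) {
    return ((x - 0x01010101u) & ~x & 0x80808080u) != 0;
}

/* Offset (0..3) of the first zero byte, assuming x was loaded from
   memory on a little-endian machine; returns 4 if there is none. */
int first_zero_byte_le(uint32_t x) {
    if ((x & 0xffu) == 0)       return 0;
    if ((x & 0xff00u) == 0)     return 1;
    if ((x & 0xff0000u) == 0)   return 2;
    if ((x & 0xff000000u) == 0) return 3;
    return 4;
}
```

A word-at-a-time loop could call has_zero_byte once per word and fall back to first_zero_byte_le only when it reports a hit, so the common case costs one test instead of four.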
