Strange behavior when parsing a string using scanf
I'm encountering rather strange behavior when performing a sscanf. I'm currently working in C on a Windows 7 machine.
I have the following:
if( sscanf( str, "%1[a-zA-Z]%31[a-zA-Z+.-]%n", &scheme[ 0 ], &scheme[ 1 ], &num_chars ) >= 1 )
{
return( num_chars );
}
The str variable is a large input string, potentially longer than 32 characters. The scheme variable is passed in as an argument to the wrapping function; it's a 32-character array.
I can easily do this with a couple of scanfs or two separate variables. I was just curious as to why this doesn't work as is.
Edit:
At the time I executed this and the error occurred, str contained "tel-net" (I was testing the '-'), and the resulting scheme string had basically no usable characters.
Solution:
I figured out what the problem was; it was actually not a scanf issue at all.
This is how I declared the scheme variable:
IOP_uri_scheme_type * scheme_str;
IOP_uri_scheme_type was declared as follows:
typedef char IOP_uri_scheme_type[ IOP_URI_MAX_SCHEME_SZ ]; // Size = 32
The problem was the indexing. Because scheme is a pointer to a 32-byte array, scheme[ 1 ] refers to the next whole array, so &scheme[ 1 ] was jumping the entire block (all 32 bytes) rather than a single character like I was expecting. So technically the scanf was written correctly to begin with (minus the %n thing).
One possible way I can solve this is by casting scheme to (char *) first, by directly manipulating the pointer value, by dereferencing it before indexing, or by just not using a pointer, which I don't need anyway.
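To make the indexing difference concrete, here is a minimal sketch. The typedef and the size of 32 come from the declarations above; the two helper functions are hypothetical names introduced only for illustration, and each returns the distance in bytes between the addresses produced by the two styles of indexing.

```c
#include <stddef.h>

#define IOP_URI_MAX_SCHEME_SZ 32
typedef char IOP_uri_scheme_type[ IOP_URI_MAX_SCHEME_SZ ];

/* Hypothetical helper: indexes the pointer itself, as the original call did.
   scheme[ 1 ] is the *next array*, so the addresses are a whole array apart. */
ptrdiff_t offset_via_pointer_index( IOP_uri_scheme_type * scheme )
{
    return (char *)&scheme[ 1 ] - (char *)&scheme[ 0 ];
}

/* Hypothetical helper: dereferences first, then indexes the char array.
   (*scheme)[ 1 ] is the second character, so the addresses are one byte apart. */
ptrdiff_t offset_via_deref( IOP_uri_scheme_type * scheme )
{
    return &(*scheme)[ 1 ] - &(*scheme)[ 0 ];
}
```

With a 32-byte array, the first helper yields 32 and the second yields 1, so passing &(*scheme)[ 0 ] and &(*scheme)[ 1 ] to sscanf gives the adjacent-character layout the format string expects. Separately, note that a 31-character field written starting at index 1 of a 32-byte buffer leaves no room for the terminating null; a width of 30 would fit.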
Thanks for everyone's help.
It appears that you are trying to use regular expressions inside sscanf. As far as I know, sscanf does not have any support for regular expressions.
Here is a test program I made for this case (with the buffer size reduced for readability):
#include <stdio.h>
int main()
{
char str[] = "tel-net";
char scheme[13] = { 0 };
int num_chars;
int result = sscanf( str, "%1[a-zA-Z]%11[a-zA-Z+.-]%n",
&scheme[ 0 ], &scheme[ 1 ], &num_chars );
printf("result = %d\n", result);
printf("scheme = '%s'\n", scheme);
printf("scheme = ");
for (int ii = 0; ii < sizeof scheme; ++ii)
printf("%02x ", (unsigned char)scheme[ii]);
printf("\n");
if ( result == 2 )
printf("num_chars = %d\n", num_chars);
return 0;
}
On my system, the output is:
result = 2
scheme = 'tel-net'
scheme = 74 65 6c 2d 6e 65 74 00 00 00 00 00 00
num_chars = 7
Can you post your output?
Note that your program has a bug: the %n will not be processed if the second [ conversion fails, so num_chars is left unset in that case. You can only return num_chars if the return value is exactly 2.
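As a rough sketch of that check, the function below only trusts num_chars when both conversions succeeded. The name scan_scheme and the -1 failure value are assumptions made for illustration; the 13-byte buffer and the format string mirror the test program above.

```c
#include <stdio.h>

/* Hypothetical helper: returns the number of characters consumed,
   or -1 if the input did not match both conversions. */
int scan_scheme( const char * str, char scheme[ 13 ] )
{
    int num_chars = -1;  /* stays -1 if %n is never reached */
    int result = sscanf( str, "%1[a-zA-Z]%11[a-zA-Z+.-]%n",
                         &scheme[ 0 ], &scheme[ 1 ], &num_chars );

    /* %n is only processed after both [ conversions have matched */
    if ( result == 2 )
        return num_chars;
    return -1;
}
```

One consequence of this design is that a one-letter input such as "a" is rejected, because the second conversion has nothing to match. If single-letter schemes should be accepted, the result == 1 case needs separate handling.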
Regarding the "regular expressions": according to the C standard, it is implementation-defined what happens when a hyphen appears between two other characters inside a [ ] specifier, as in a-z. (A hyphen that is the last character of the list, like the one after the dot here, is an ordinary character.) Your compiler (plus C library etc.) may or may not support the range usage you are trying. Check your implementation's documentation of scanf to see what it says about this case.
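If relying on range support is a concern, one alternative is to avoid scansets altogether. The sketch below is not what the question used: it swaps in strspn with every accepted character spelled out. The name copy_scheme, the LETTERS macro, and the choice to return 0 on failure are all assumptions made for illustration.

```c
#include <stddef.h>
#include <string.h>

#define LETTERS "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"

/* Hypothetical helper: copies the leading scheme-like prefix of str into out
   (at most out_size - 1 characters) and returns its length, or 0 if str does
   not start with a letter. out is left untouched on failure. */
size_t copy_scheme( const char * str, char * out, size_t out_size )
{
    size_t len;

    /* the first character must be a letter */
    if ( out_size == 0 || strspn( str, LETTERS ) == 0 )
        return 0;

    /* following characters may be letters, '+', '.' or '-' */
    len = strspn( str, LETTERS "+.-" );
    if ( len > out_size - 1 )
        len = out_size - 1;  /* truncate, like a scanf field width would */

    memcpy( out, str, len );
    out[ len ] = '\0';
    return len;
}
```

Unlike the two-conversion sscanf call, this version accepts a one-letter prefix, and the '-' passed to strspn is always a literal character.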
NB. I originally posted an answer saying it was undefined behavior to read into overlapping objects. However, I think that is actually false, and it is fine because the arguments are processed in order (and the standard does not say that it is undefined).