Reading an Identifier into a C Program using scanf()

I need my C program to read in an identifier using the scanf() function.

An identifier in this case is a letter or an underscore (_) followed by zero or more letters, digits, or underscores.

The regular expression would be:

    [a-zA-Z_][a-zA-Z0-9_]*

These are examples of correct identifiers:

    _identifier1
    variable21

These are examples of incorrect identifiers:

    12var
    %foobar

Does anybody know how this would be done using scanf() in C?

scanf() doesn't support regular expressions. There's no regular expression support at all in the standard C library. You'll have to read a string and then parse it "manually".

For example:

#include <stdio.h>
#include <ctype.h>

int isIdentifier(const char* s)
{
  const char* p = s;

  /* The <ctype.h> functions need a value representable as unsigned char. */
  if (!(*p == '_' || isalpha((unsigned char)*p)))
  {
    return 0;
  }

  for (p++; *p != '\0'; p++)
  {
    if (!(*p == '_' || isalnum((unsigned char)*p)))
    {
      return 0;
    }
  }

  return 1;
}

int main(void)
{
  const char* const testData[] =
  {
    "a",
    "a_",
    "_a",
    "3",
    "3a",
    "3_",
    "_3"
  };
  size_t i;  /* matches the unsigned type of the sizeof expression below */

  for (i = 0; i < sizeof(testData) / sizeof(testData[0]); i++)
  {
    printf("\"%s\" is %san identifier\n",
           testData[i],
           isIdentifier(testData[i]) ? "" : "not ");
  }

  return 0;
}

Output:

"a" is an identifier
"a_" is an identifier
"_a" is an identifier
"3" is not an identifier
"3a" is not an identifier
"3_" is not an identifier
"_3" is an identifier

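The example above checks hard-coded strings. A minimal sketch of how the same check could be combined with an actual read is shown below. The names `read_identifier`, `is_identifier` and `ID_MAX`, and the 63-character limit, are assumptions made for illustration and are not part of the original answer.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>

#define ID_MAX 63  /* hypothetical limit; must match the width in the format string */

/* Same check as isIdentifier() above. */
int is_identifier(const char *s)
{
    if (!(*s == '_' || isalpha((unsigned char)*s)))
        return 0;
    for (s++; *s != '\0'; s++)
        if (!(*s == '_' || isalnum((unsigned char)*s)))
            return 0;
    return 1;
}

/*
 * Reads one whitespace-delimited token from `in` into `out`,
 * which must hold at least ID_MAX + 1 bytes.
 * Returns 1 if the token is an identifier, 0 if it is not,
 * and EOF if no token could be read.
 */
int read_identifier(FILE *in, char *out)
{
    assert(in != NULL && out != NULL);
    /* The field width (63) keeps fscanf() from overflowing `out`. */
    if (fscanf(in, "%63s", out) != 1)
        return EOF;
    return is_identifier(out);
}
```

Calling `read_identifier(stdin, buf)` reads from standard input just as scanf() would. Note that a token longer than 63 characters is split, and the remainder is picked up by the next read.
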
scanf's format specifiers are limited. They cannot express the pattern of your identifier. I believe you can only perform custom validation on the string after reading it, something like:

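#include <stdio.h>
#include <string.h>

/* Assumes an ASCII-compatible character set, where letters are contiguous. */
int in_range(char ch, char begin, char end)
{
    return ch >= begin && ch <= end;
}

int valid_start_char(char ch)
{
    return in_range(ch, 'a', 'z') ||
        in_range(ch, 'A', 'Z') ||
        ('_' == ch);
}

int valid_char(char ch)
{
    return valid_start_char(ch) || in_range(ch, '0', '9');
}

int main(void)
{
    char buff[255];
    size_t i, len;
    int valid;

    /* The field width keeps scanf() from writing past the end of buff. */
    if (scanf("%254s", buff) != 1)
        return 1;

    len = strlen(buff);

    /* A successful %s stores at least one character, so buff[0] is valid. */
    valid = valid_start_char(buff[0]);

    for (i = 1; i < len; ++i)
        valid = valid && valid_char(buff[i]);

    if (valid)
        printf("Valid Identifier\n");
    else
        printf("Invalid Identifier\n");

    return 0;
}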

(I haven't tested this but it should illustrate the idea)
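As a side note, scanf's scanset conversion `%[...]` can get part of the way there, although it is not a regular expression. The sketch below is one possible approach and is not taken from the answers here. The function name `scan_identifier`, the macros and the 63-character limit are assumptions. The character sets are spelled out in full because the meaning of ranges such as `a-z` inside a scanset is implementation-defined.

```c
#include <assert.h>
#include <stdio.h>

#define ID_START "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz_"
#define ID_REST  ID_START "0123456789"

/*
 * Tries to read an identifier of at most 63 characters from `in`
 * into `out`, which must hold at least 64 bytes.
 * Returns 1 on success and 0 otherwise.
 * Reading stops at the first character that cannot continue the
 * identifier, and that character stays in the stream.
 */
int scan_identifier(FILE *in, char *out)
{
    int n;

    assert(in != NULL && out != NULL);
    /* " "       skips leading whitespace;
       %1[...]   accepts exactly one start character;
       %62[...]  accepts up to 62 further characters. */
    n = fscanf(in, " %1[" ID_START "]%62[" ID_REST "]", out, out + 1);
    return n >= 1;
}
```

This only recognizes a valid prefix: for the input `var%x` it stores `var` and leaves `%x` unread, so the caller still has to check what follows. On a failed match such as `12var`, the offending input stays in the stream and has to be discarded before trying again.
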

If you're comfortable with regular expressions, why not just use a regex library? On a POSIX-compliant operating system, one is available through <regex.h>.
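A minimal sketch of that suggestion, assuming a POSIX system that provides `regcomp()`, `regexec()` and `regfree()`. The function name `matches_identifier` is made up for this example, and the anchors `^` and `$` are added so that the whole string has to match. The ranges in the pattern behave as expected in the default "C" locale.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/*
 * Returns 1 if `s` is an identifier, 0 if it is not,
 * and -1 if the pattern could not be compiled.
 */
int matches_identifier(const char *s)
{
    regex_t re;
    int rc;

    assert(s != NULL);
    /* REG_NOSUB: only a yes/no answer is needed, no match offsets. */
    if (regcomp(&re, "^[a-zA-Z_][a-zA-Z0-9_]*$", REG_EXTENDED | REG_NOSUB) != 0)
        return -1;

    rc = regexec(&re, s, 0, NULL, 0);  /* 0 means the string matched */
    regfree(&re);
    return rc == 0;
}
```

The string would still be read with scanf() first (with a field width, for example `%63s`) and then passed to this function. Compiling the pattern on every call keeps the sketch short; a program that checks many strings would compile it once and reuse it.
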
