
Reading an Identifier into a C Program using scanf()

I need my C program to read in an identifier using the scanf() function.

An identifier in this case is a letter or a _ character followed by one or more alphanumeric characters including the _ character.

The regular expression would be


These are examples of correct identifiers:


These are examples of incorrect identifiers


Does anybody know how this would be done using scanf() in C?

scanf() doesn't support regular expressions. There's no regular expression support at all in the standard C library. You'll have to read a string and then parse it "manually".

For example:

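#include <stdio.h>
#include <ctype.h>

int isIdentifier(const char* s)
{
  const char* p = s;

  /* The first character must be a letter or an underscore. */
  if (!(*p == '_' || isalpha((unsigned char)*p)))
    return 0;

  /* The remaining characters may also be digits. */
  for (p++; *p != '\0'; p++)
    if (!(*p == '_' || isalnum((unsigned char)*p)))
      return 0;

  return 1;
}

int main(void)
{
  const char* const testData[] =
  {
    "a", "a_", "_a", "3", "3a", "3_", "_3"
  };
  size_t i;

  for (i = 0; i < sizeof(testData) / sizeof(testData[0]); i++)
    printf("\"%s\" is %san identifier\n",
           testData[i],
           isIdentifier(testData[i]) ? "" : "not ");

  return 0;
}

Output:
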
"a" is an identifier
"a_" is an identifier
"_a" is an identifier
"3" is not an identifier
"3a" is not an identifier
"3_" is not an identifier
"_3" is an identifier

scanf's format specifiers are fairly limited. They cannot be used to describe the pattern of your identifier. I believe you can only perform custom validation on the string after reading it, something like:

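#include <stdio.h>
#include <string.h>

int in_range(char ch, char begin, char end)
{
    return ch >= begin && ch <= end;
}

/* A letter or an underscore may start an identifier. */
int valid_start_char(char ch)
{
    return in_range(ch, 'a', 'z') ||
        in_range(ch, 'A', 'Z') ||
        ('_' == ch);
}

/* Later characters may also be digits. */
int valid_char(char ch)
{
    return valid_start_char(ch) || in_range(ch, '0', '9');
}

int main(void)
{
    char buff[255];
    int i, len = 0, valid = 0;

    /* Limit the field width so the read cannot overflow buff. */
    if (scanf("%254s", buff) == 1)
        len = (int)strlen(buff);

    if (len > 0)
        valid = valid_start_char(buff[0]);

    for (i = 1; i < len; ++i)
        valid = valid && valid_char(buff[i]);

    if (valid)
        printf("Valid Identifier\n");
    else
        printf("Invalid Identifier\n");

    return 0;
}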

(I haven't tested this but it should illustrate the idea)
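As a side note, the scanf() family does have scansets (`%[...]`). They are not regular expressions, but they may be enough for this particular pattern. The following is only a sketch under a few assumptions: `scan_identifier` is a hypothetical helper name, the 63-character limit is arbitrary, and the `A-Z` style ranges inside a scanset are implementation-defined in the C standard (they behave as expected on common C libraries). It uses sscanf() on a string so it is easy to try out; the same format string should work with scanf() on standard input.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not taken from the answers above.
 * Tries to parse one identifier at the start of `input`, after optional
 * leading whitespace. On success it copies the identifier into `out`
 * (which must hold at least 64 bytes) and returns 1; otherwise returns 0.
 * Assumption: scanset ranges such as A-Z are honored by the C library. */
int scan_identifier(const char *input, char *out)
{
    char head[2] = "";   /* first character: letter or underscore */
    char tail[63] = "";  /* remaining letters, digits, underscores */
    int matched = sscanf(input, " %1[A-Za-z_]%62[A-Za-z0-9_]", head, tail);

    if (matched < 1)     /* 0 or EOF: the first character did not match */
        return 0;

    strcpy(out, head);
    strcat(out, tail);   /* tail is still "" when only one character matched */
    return 1;
}
```

Note that this matches a prefix: for the input `a-b` it stores `a` and leaves `-b` unread, so a caller that needs the whole token to be an identifier would still have to check the next character. Like the first answer, it accepts a single-character identifier such as `a`.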

If you're comfortable with regular expressions, why not just use a regex library? On a POSIX-compliant operating system one should be available through <regex.h>.
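A minimal sketch of the regex-library route, assuming a POSIX system that provides `regcomp()`, `regexec()` and `regfree()` in `<regex.h>`. The helper name `matches_identifier` is hypothetical, and the pattern is an assumption on my part: it treats an identifier as a letter or underscore followed by zero or more letters, digits or underscores, which agrees with the sample output in the first answer.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if the whole string `s` is an identifier,
 * 0 if it is not, and -1 if the pattern could not be compiled. */
int matches_identifier(const char *s)
{
    regex_t re;
    int rc;

    /* Assumed pattern; anchored so the entire string must match.
     * REG_NOSUB: only a yes/no answer is needed, no match offsets. */
    if (regcomp(&re, "^[A-Za-z_][A-Za-z0-9_]*$", REG_EXTENDED | REG_NOSUB) != 0)
        return -1;

    rc = regexec(&re, s, 0, NULL, 0);
    regfree(&re);

    return rc == 0;
}
```

The string would still be read first, for example with a width-limited `%s` in scanf(), and then passed to the helper. Compiling the pattern on every call is wasteful; a real program would probably compile it once and reuse it.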
