
Can I determine if an argument is a string literal?

Is it possible to determine, at compile time or run time, whether an argument passed to a macro or function is a string literal?

For example,

#define is_string_literal(X)
...
...   

is_string_literal("hello") == true;
const char * p = "hello";
is_string_literal(p) == false;

or

bool is_string_literal(const char * s);

is_string_literal("hello") == true;
const char * p = "hello";
is_string_literal(p) == false;

Thanks.

YES! (Thanks to James McNellis and GMan for corrections. Updated to correctly handle concatenated literals like "Hello, " "World!" which get stringized before concatenation.)

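#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

#define is_literal_(x) is_literal_f(#x, sizeof(#x) - 1)
#define is_literal(x) is_literal_(x)

/* s is the stringized argument, l its length without the terminator. */
bool is_literal_f(const char *s, size_t l)
{
    const char *e = s + l;
    while(s != e)
      {
        if(s[0] == 'L') s++;                  /* optional wide prefix */
        if(s[0] != '"') return false;         /* each piece opens with a quote */
        s++;
        while(s != e && *s != '"')            /* scan to the closing quote */
          {
            if(*s == '\\' && s + 1 != e) s++; /* skip an escaped character */
            s++;
          }
        if(s == e) return false;              /* no closing quote */
        s++;
        while(isspace((unsigned char)*s)) s++; /* gap before the next piece */
      }
    return l != 0;
}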

This will stringify the argument before passing it to the function, so if the argument was a string literal, the argument passed to our function will be surrounded with quote characters.
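
To make the stringification step concrete, here is a minimal C sketch. The STRINGIZE and SPELLING_IS names are hypothetical helpers introduced only for illustration; they mirror the two-level is_literal_/is_literal expansion above and compare the stringized spelling of an argument with an expected text.

```c
#include <string.h>

/* Two-level expansion, mirroring is_literal_/is_literal:
   the outer macro expands its argument, the inner one stringizes it. */
#define STRINGIZE_(x) #x
#define STRINGIZE(x)  STRINGIZE_(x)

/* Hypothetical helper: true when the stringized spelling of x
   equals the given text. x is never evaluated, only spelled out. */
#define SPELLING_IS(x, text) (strcmp(STRINGIZE(x), (text)) == 0)
```

With these definitions, STRINGIZE("hello") yields the seven characters "hello" with both quotes included, while STRINGIZE(p) yields just p. Two adjacent literals keep their own quotes, separated by a single space, which is why the function has to walk over several quoted pieces.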

If you consider this a string literal:

const char *p = "string";
// should is_literal(p) be true or false?

I cannot help you. You might be able to use some implementation-defined (or *shudder* undefined) behavior to test whether or not a string is stored in read-only memory, but on some (probably older) systems p could be modified.

For those who question the use of such a function, consider:

enum string_type { LITERAL, ARRAY, POINTER };

void string_func(/*const? */char *c, enum string_type t);

Rather than explicitly specifying the second argument to string_func on every call, is_literal allows us to wrap it with a macro:

#define string_func(s) \
    (string_func)((s), is_literal(s) ? LITERAL : \
        (void *)(s) == (void *)&(s) ? ARRAY : POINTER)

I can't imagine why it would make a difference, except in plain C, where literals aren't const and for some reason you don't want to (or can't) write the function as taking a const char * instead of a char *. But there are all kinds of reasons to want to do something. Someday you, too, may feel the need to resort to a horrible hack.
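
The ARRAY versus POINTER branch of the wrapper rests on an address comparison that is easy to miss. A small sketch of that comparison on its own, assuming ordinary local variables (SAME_AS_OWN_ADDRESS and the two functions are illustrative names, not part of the answer):

```c
#include <stdbool.h>

/* An array and its first element share an address, so for an array
   (void *)(s) equals (void *)&(s). For a pointer variable, &(s) is the
   address of the pointer object itself, which normally differs from
   the address stored in it. */
#define SAME_AS_OWN_ADDRESS(s) ((const void *)(s) == (const void *)&(s))

bool array_case(void)
{
    char buf[8] = "abc";
    return SAME_AS_OWN_ADDRESS(buf);   /* array: addresses coincide */
}

bool pointer_case(void)
{
    char buf[8] = "abc";
    char *p = buf;
    return SAME_AS_OWN_ADDRESS(p);     /* pointer: p holds buf's address, not its own */
}
```

A pointer deliberately made to point at itself would defeat the check, so it is a heuristic rather than a guarantee.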

To know at compile time (as mentioned in the question), use the following technique. It determines whether a given argument is a string literal: if the argument is an array or pointer such as const char x[], *p; the compiler reports an error.

#define is_string_literal(X) _is_string_literal("" X)
bool _is_string_literal (const char *str) { return true; } // practically not needed
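
The same concatenation trick can do useful work instead of calling a dummy function. The following is a sketch under that assumption; LITERAL_LENGTH and hello_length are hypothetical names used only for illustration.

```c
#include <stddef.h>

/* Adjacent string literals are concatenated by the compiler, so
   "" X is well-formed when X is a string literal. With a pointer or
   array name, "" X is a syntax error and compilation stops. */
#define LITERAL_LENGTH(X) (sizeof("" X) - 1)

size_t hello_length(void)
{
    return LITERAL_LENGTH("hello");   /* expands to sizeof("" "hello") - 1 */
    /* LITERAL_LENGTH(p) with const char *p would not compile. */
}
```

Because sizeof is applied to the concatenated literal, the length is computed by the compiler and no run-time scan is needed.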

[Note: My previous answer was downvoted by experts and has not been accepted or upvoted since I edited it, so I am posting another answer with the same content.]

No. A string literal is just an array of char (in C) or const char (in C++).

You can't distinguish between a string literal and some other array of char like this one (in C++):

const char x[] = "Hello, World!";

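Clang's and gcc's __builtin_constant_p(expr) returns 1 if expr is a string literal. Note that it also returns 1 for other compile-time constants such as integer constants, and returns 0 when the compiler cannot prove that expr is constant. Tested with clang 3.5+ and gcc 4.6+.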

Since clang pre-defines __GNUC__, you can simply

#ifdef __GNUC__
  ... use __builtin_constant_p(expr)
#else
  ... use fallback
#endif

See https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html
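
A minimal sketch of the guarded use, assuming __GNUC__ is the macro to test (GCC and Clang both define it). LOOKS_CONSTANT and the fallback value 0 are my own choices for illustration, not something the answer specifies.

```c
#if defined(__GNUC__)            /* defined by GCC and by Clang */
#  define LOOKS_CONSTANT(expr) __builtin_constant_p(expr)
#else
#  define LOOKS_CONSTANT(expr) 0 /* assumed fallback: report "not known" */
#endif

/* A string literal is reported as a compile-time constant. */
int literal_is_constant(void) { return LOOKS_CONSTANT("hello"); }

/* So is an integer constant: the builtin answers "is this constant?",
   not "is this a string literal?". */
int integer_is_constant(void) { return LOOKS_CONSTANT(42); }
```

With GCC or Clang both functions return 1, so the builtin is best combined with a type check if integers must be excluded.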

Yes; in C++20:

#include <type_traits>
#define IS_SL(x) ([&]<class T = char>() { \
    return std::is_same_v<decltype(x), T const (&)[sizeof(x)]> and \
    requires { std::type_identity_t<T[sizeof(x)]>{x}; }; }())

That is, a string literal is something with type reference to const array of char that can be used to initialize an array of char of the same size.

Test cases:

#include <source_location>
int main(int argc, char* argv[]) {
    static_assert(IS_SL("hello"));
    static_assert(IS_SL("hello" "world"));
    static_assert(IS_SL(R"(hello)"));
    static_assert(not IS_SL(0));
    static_assert(not IS_SL(argc));
    static_assert(not IS_SL(argv[0]));
    char const s[] = "hello";
    static_assert(not IS_SL(s));
    constexpr char const cs[] = "hello";
    static_assert(not IS_SL(cs));
    constexpr char const* sp = "hello";
    static_assert(not IS_SL(sp));
    static_assert(IS_SL(__FILE__));
    static_assert(not IS_SL(std::source_location::current().file_name()));
}

Demo: https://godbolt.org/z/xhd8xqY13

Try this one:

#define is_string_literal(s) \
  (memcmp(#s, "\"", 1) == 0)

According to the C/C++ naming rules, a variable name must start with '_' or a letter, so the stringized form of a variable name can never begin with a double quote.
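
Here is a sketch of how this first-character test behaves on a few inputs. FIRST_CHAR_IS_QUOTE and GREETING are hypothetical names, and the extra macro level is an addition so that an argument which is itself a macro gets expanded before it is stringized.

```c
#include <string.h>

/* Same test as above, with one extra level of expansion so that a
   macro argument is replaced by its definition before stringizing. */
#define FIRST_CHAR_IS_QUOTE_(s) (memcmp(#s, "\"", 1) == 0)
#define FIRST_CHAR_IS_QUOTE(s)  FIRST_CHAR_IS_QUOTE_(s)

#define GREETING "hi"   /* a macro that expands to a string literal */
```

The test accepts "hello" and GREETING and rejects an identifier such as name. Since only the first character is examined, it also accepts an expression like "abc"[0], and it rejects a prefixed literal such as L"wide", whose spelling starts with L.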

Maybe not what the OP wants, but I use

#define literal(a) "" a
#define strcpy(a, b ) _strcpy(a, literal(b))

This makes for

char buf[32];
char buf2[32];
const char *p = "hello,";

strcpy(buf, "hello"); // legal
strcpy(buf, p);  // illegal
strcpy(buf, buf2); // illegal

Here is a simple portable approach for both C and C++, completely evaluated at compile time, that tests for simple string literals, i.e. single string literals without backslashes:

#define STR(s)  #s
#define is_string_literal(s)  (sizeof(STR(#s)) == sizeof(#s) + 4 \
                               && #s[0] == '"' && #s[sizeof(#s)-2] == '"')

It simply tests that the string conversion starts and ends with a " and that applying the string conversion twice increases the length of the string by exactly 4, which happens only when the initial and trailing " are the sole characters that need escaping.
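
The length arithmetic can be checked on two inputs. This is a sketch; ONCE_SIZE, TWICE_SIZE and the four functions are hypothetical names that expose the two sizeof values the macro compares.

```c
#include <stddef.h>

#define STR(s) #s

/* Size of the stringized argument, terminator included. */
#define ONCE_SIZE(s)  sizeof(#s)
/* Size after stringizing that result a second time. */
#define TWICE_SIZE(s) sizeof(STR(#s))

/* "abc" is spelled with 5 characters, so ONCE_SIZE is 6. The second
   pass escapes the two quotes and adds two new delimiting quotes:
   4 extra characters, so TWICE_SIZE is 10. */
size_t once_plain(void)  { return ONCE_SIZE("abc"); }
size_t twice_plain(void) { return TWICE_SIZE("abc"); }

/* "abc\n" is spelled with 7 characters, one of them a backslash.
   The second pass must escape that backslash as well, so the size
   grows by 5 (from 8 to 13) and the test rejects the literal. */
size_t once_escaped(void)  { return ONCE_SIZE("abc\n"); }
size_t twice_escaped(void) { return TWICE_SIZE("abc\n"); }
```

Any extra quote or backslash inside the argument adds at least one more escape on the second pass, which is what limits the test to simple literals.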

Here is a test program:

#include <stdio.h>

#define STR(s)  #s
#define is_simple_string(s)  (sizeof(STR(#s)) == sizeof(#s) + 4 \
                              && #s[0] == '"' && #s[sizeof(#s)-2] == '"')

int main() {
#define TEST(s)  printf("%16s -> %d\n", #s, is_simple_string(s))
    char buf[4] = "abc";
    const char cbuf[4] = "def";
    char *p = buf;
    const char *cp = cbuf;

#define S "abc"
    TEST(1);
    TEST('x');
    TEST(1.0);
    TEST(1LL);
    TEST(main);
    TEST(main());
    TEST(S);
    TEST("");
    TEST("abc");
    TEST("abc\n");
    TEST("abc\"def");
    TEST("abc" "");
    TEST("abc"[0]);
    TEST("abc"+1);
    TEST("abc"-*"def");
    TEST(""+*"");
    TEST("a"+*"");
    TEST("ab"+*"");
    TEST("abc"+*"");
    TEST(1+"abc");
    TEST(buf);
    TEST(buf + 1);
    TEST(cbuf);
    TEST(cbuf + 1);
    TEST(p);
    TEST(cp);
    TEST(p + 1);
    TEST(cp + 1);
    TEST(&p);
    TEST(&cp);
    TEST(&buf);
    TEST(&cbuf);

    return *cp - *p - 3;
}

It outputs 1 only for TEST(S), TEST("") and TEST("abc").

I had a similar question: I wanted to say that

MY_MACRO("compile-time string")

was legal, and that

char buffer[200]="a string";
MY_MACRO(buffer)

was legal, but not to allow

MY_MACRO(szArbitraryDynamicString)

I used GCC's __builtin_types_compatible_p and MSVC's _countof, which seem to work correctly at the expense of rejecting short string literals.
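
The answer does not show its code. One way to get the "accept arrays, reject pointers" behavior with GCC's builtin is sketched below; it is a guess at the general approach, not the author's macro, and it does not reproduce the short-literal restriction mentioned above. IS_ARRAY and the fallback value are assumptions.

```c
#if defined(__GNUC__)
/* An array type is not compatible with the pointer type it decays to,
   whereas a pointer p has the same type as &p[0]. */
#  define IS_ARRAY(x) \
     (!__builtin_types_compatible_p(__typeof__(x), __typeof__(&(x)[0])))
#else
#  define IS_ARRAY(x) 1   /* assumed fallback: no check available */
#endif

int literal_is_array(void) { return IS_ARRAY("compile-time string"); }

int buffer_is_array(void)
{
    char buffer[200] = "a string";
    (void)buffer;                      /* only its type is inspected */
    return IS_ARRAY(buffer);
}

int pointer_is_array(void)
{
    const char *szArbitraryDynamicString = "dynamic";
    (void)szArbitraryDynamicString;    /* only its type is inspected */
    return IS_ARRAY(szArbitraryDynamicString);
}
```

A MY_MACRO built on this could turn a zero result into a compile error, for example through a negative array size, so that pointer arguments are rejected at build time.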

Because a string literal in C++ can carry different prefixes, there is no point in checking for the opening quote: https://en.cppreference.com/w/cpp/language/string_literal

It is better to check for the closing quote:

  • C++11
  • msvc2015u3,gcc5.4,clang3.8.0

#include <type_traits>

#define UTILITY_CONST_EXPR_VALUE(exp) ::utility::const_expr_value<decltype(exp), exp>::value

// hint: operator* applies to character literals, but not to double-quoted literals
#define UTILITY_LITERAL_CHAR_(c_str, char_type) UTILITY_CONST_EXPR_VALUE(((void)(c_str * 0), ::utility::literal_char_caster<typename ::utility::remove_cvref<char_type>::type>::cast_from(c_str, L ## c_str, u ## c_str, U ## c_str)))
#define UTILITY_LITERAL_CHAR(c_str, char_type) UTILITY_LITERAL_CHAR_(c_str, char_type)

#define UTILITY_IS_LITERAL_STRING(c_str) UTILITY_CONST_EXPR_VALUE((sizeof(#c_str) > 1) ? #c_str [sizeof(#c_str) - 2] == UTILITY_LITERAL_CHAR('\"', decltype(c_str[0])) : false)
#define UTILITY_IS_LITERAL_STRING_A(c_str) UTILITY_CONST_EXPR_VALUE((sizeof(#c_str) > 1) ? #c_str [sizeof(#c_str) - 2] == '\"' : false)
#define UTILITY_IS_LITERAL_STRING_WITH_PREFIX(c_str, prefix) UTILITY_CONST_EXPR_VALUE((sizeof(#c_str) > 1) ? #c_str [sizeof(#c_str) - 2] == prefix ## '\"' : false)

namespace utility {

    template <typename T, T v>
    struct const_expr_value
    {
        static constexpr const T value = v;
    };

    // remove_reference + remove_cv
    template <typename T>
    struct remove_cvref
    {
        using type = typename std::remove_cv<typename std::remove_reference<T>::type>::type;
    };

    //// literal_char_caster, literal_string_caster

    // template class to replace partial function specialization and avoid overload over different return types
    template <typename CharT>
    struct literal_char_caster;

    template <>
    struct literal_char_caster<char>
    {
        static inline constexpr char
            cast_from(char ach, wchar_t wch, char16_t char16ch, char32_t char32ch)
        {
            return ach;
        }
    };

    template <>
    struct literal_char_caster<wchar_t>
    {
        static inline constexpr wchar_t
            cast_from(char ach, wchar_t wch, char16_t char16ch, char32_t char32ch)
        {
            return wch;
        }
    };

    template <>
    struct literal_char_caster<char16_t>
    {
        static inline constexpr char16_t
            cast_from(char ach, wchar_t wch, char16_t char16ch, char32_t char32ch)
        {
            return char16ch;
        }
    };

    template <>
    struct literal_char_caster<char32_t>
    {
        static inline constexpr char32_t
            cast_from(char ach, wchar_t wch, char16_t char16ch, char32_t char32ch)
        {
            return char32ch;
        }
    };

}

const char * a = "123";
const char b[] = "345";

int main()
{
    static_assert(UTILITY_IS_LITERAL_STRING_A(a) == 0, "Aa");
    static_assert(UTILITY_IS_LITERAL_STRING(a) == 0, "a");
    static_assert(UTILITY_IS_LITERAL_STRING_A(b) == 0, "Ab");
    static_assert(UTILITY_IS_LITERAL_STRING(b) == 0, "b");
    static_assert(UTILITY_IS_LITERAL_STRING_A("123") == 1, "A123");
    static_assert(UTILITY_IS_LITERAL_STRING_WITH_PREFIX(L"123", L) == 1, "L123");
    static_assert(UTILITY_IS_LITERAL_STRING_WITH_PREFIX(u"123", u) == 1, "u123");
    static_assert(UTILITY_IS_LITERAL_STRING_WITH_PREFIX(U"123", U) == 1, "U123");
    static_assert(UTILITY_IS_LITERAL_STRING("123") == 1, "123");
    static_assert(UTILITY_IS_LITERAL_STRING(L"123") == 1, "L123");
    static_assert(UTILITY_IS_LITERAL_STRING(u"123") == 1, "u123");
    static_assert(UTILITY_IS_LITERAL_STRING(U"123") == 1, "U123");
}

https://godbolt.org/z/UXIRY6

The UTILITY_CONST_EXPR_VALUE macro is needed to force the compiler to evaluate the expression at compile time only.

This response may be a bit late, but hopefully it will help others with the same question. My solution was the following:

#define _s(x) #x
#define IS_LITERAL(expr) (_s(expr)[0] == '"')

The advantage of this solution is that the code can be evaluated at compile time, which allows useful optimizations elsewhere. The macro also compiles with any compiler; no compiler-specific features are used.
