Typesafe varargs in C with gcc

Many times I want a function to receive a variable number of arguments, terminated by NULL, for instance:

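#include <stdarg.h>
#include <stddef.h>

#define push(stack, ...) _push(stack, __VA_ARGS__, NULL)
void _push(stack_t stack, char *s, ...) {
    va_list args;
    va_start(args, s);
    /* Push s itself first, then each further argument up to the NULL. */
    for (; s != NULL; s = va_arg(args, char *))
        push_single(stack, s);
    va_end(args);
}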

Can I instruct gcc or clang to warn if _push receives arguments that are not char*? Something similar to __attribute__((format)), but for multiple arguments of the same pointer type.

I know you're thinking of using __attribute__((sentinel)) somehow, but this is a red herring.
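
To make the limitation concrete, here is a minimal sketch of what the sentinel attribute does check. The function total_len is a hypothetical helper invented for this illustration. GCC and Clang use the attribute only to warn when a call does not end in a null pointer; the types of the other arguments are still unchecked.

```c
#include <stdarg.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: sums the lengths of a NULL-terminated list of
   strings. The attribute asks for a warning when the final argument of
   a call is not a null pointer. It says nothing about argument types. */
__attribute__((sentinel))
size_t total_len(char *first, ...)
{
    size_t total = 0;
    va_list args;

    va_start(args, first);
    for (char *s = first; s != NULL; s = va_arg(args, char *))
        total += strlen(s);
    va_end(args);
    return total;
}

/* total_len("foo", "ab", (char *)NULL);   fine
   total_len("foo", "ab");                 warning: missing sentinel
   total_len("foo", 42, (char *)NULL);     no type diagnostic, and
                                           undefined behavior at run time */
```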

What you want is to do something like this:

#define push(s, args...) ({                   \
  char *_args[] = {args};                     \
  _push(s,_args,sizeof(_args)/sizeof(char*)); \
})

which wraps:

void _push(stack_t s, char *args[], int argn);

which you can write exactly the way you would hope to write it!

Then you can call:

push(stack, "foo", "bar", "baz");
push(stack, "quux");
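
A fuller sketch of the same idea follows, under some assumptions. The str_stack type, STACK_MAX and push_all are hypothetical stand-ins for the question's stack_t and _push (POSIX <signal.h> already defines a type named stack_t, so the sketch avoids that name). It also swaps the GNU statement expression for a C99 compound literal, which is a variation on the macro above rather than the same code.

```c
#include <assert.h>
#include <stddef.h>

#define STACK_MAX 16                 /* hypothetical capacity for the sketch */

typedef struct {
    char  *items[STACK_MAX];
    size_t count;
} str_stack;

/* Worker: receives an ordinary, fully typed array plus its length. */
static void push_all(str_stack *s, char *args[], size_t argn)
{
    for (size_t i = 0; i < argn; i++) {
        assert(s->count < STACK_MAX);   /* sketch only: no real error handling */
        s->items[s->count++] = args[i];
    }
}

/* Every argument becomes an initializer of a char* array, so the compiler
   checks each one like any other initialization. sizeof does not evaluate
   its operand, so the arguments are evaluated only once. */
#define push(s, ...)                                        \
    push_all((s), (char *[]){__VA_ARGS__},                  \
             sizeof((char *[]){__VA_ARGS__}) / sizeof(char *))
```

With this in place, a call such as push(&st, "foo", 42) draws an int-to-pointer conversion diagnostic at compile time, because 42 is used to initialize a char* element.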

I can only think of something like this:

#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

typedef struct tArg
{
  const char* Str;
  struct tArg* Next;
} tArg;

tArg* Arg(const char* str, tArg* nextArg)
{
  tArg* p = malloc(sizeof(tArg));
  if (p != NULL)
  {
    p->Str = str;
    p->Next = nextArg;
  }
  else
  {
    while (nextArg != NULL)
    {
      p = nextArg->Next;
      free(nextArg);
      nextArg = p;
    }
  }
  return p;
}

void PrintR(tArg* arg)
{
  while (arg != NULL)
  {
    tArg* p;
    printf("%s", arg->Str);
    p = arg->Next;
    free(arg);
    arg = p;
  }
}

void (*(*(*(*(*(*(*Print8
  (const char* Str))
  (const char*))
  (const char*))
  (const char*))
  (const char*))
  (const char*))
  (const char*))
  (const char*)
{
  printf("%s", Str);
  // There's probably a UB here:
  return (void(*(*(*(*(*(*(*)
    (const char*))
    (const char*))
    (const char*))
    (const char*))
    (const char*))
    (const char*))
    (const char*))&Print8;
}

int main(void)
{
  PrintR(Arg("HELLO", Arg(" ", Arg("WORLD", Arg("!", Arg("\n", NULL))))));
//  PrintR(Arg(1, NULL));        // warning/error
//  PrintR(Arg(&main, NULL));    // warning/error
//  PrintR(Arg(0, NULL));        // no warning/error
//  PrintR(Arg((void*)1, NULL)); // no warning/error

  Print8("hello")(" ")("world")("!")("\n");
// Same warning/error compilation behavior as with PrintR()
  return 0;
}

The problem with C variadics is that they were bolted on after the fact rather than designed into the language. The main problem is that the variadic parameters are anonymous: they have no handles, no identifiers. This leads to the unwieldy VA macros (va_start, va_arg, va_end), which exist to generate references to parameters without names. It also means those macros must be told where the variadic list starts and what type each parameter is expected to have.

All this information really ought to be encoded in proper syntax in the language itself.

For example, one could extend existing C syntax with formal parameters after the ellipsis, like so:

void foo ( ... int counter, float arglist );

By convention, the first parameter could hold the argument count and the second the argument list. Within the function body, the list could be treated syntactically as an array.

With such a convention, the variadic parameters would no longer be anonymous. Within the function body, the counter can be referenced like any other parameter, and the list elements can be referenced as if they were elements of an array parameter, like so:

void foo ( ... int counter, float arglist ) {
  unsigned i;
  for (i=0; i<counter; i++) {
    printf("list[%i] = %f\n", i, arglist[i]);
  }
}

With such a feature built into the language itself, every reference to arglist[i] would then be translated to the respective address on the stack frame. There would be no need to do this via macros.

Furthermore, the argument count would automatically be inserted by the compiler, further reducing the opportunity for error.

A call to

foo(1.23, 4.56, 7.89);

would be compiled as if it had been written

foo(3, 1.23, 4.56, 7.89);
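
The count insertion described here can be approximated in existing C99 with a macro, though only as an emulation and not as the proposed language feature. In the sketch below, sum and sum_impl are hypothetical names chosen for illustration. The macro builds a float array from the arguments and derives the count from its size, so the caller never writes the count.

```c
#include <stddef.h>

/* Hypothetical worker: the count arrives as an ordinary first argument,
   and the list as an ordinary, typed array. */
static float sum_impl(size_t counter, const float arglist[])
{
    float total = 0.0f;

    for (size_t i = 0; i < counter; i++)
        total += arglist[i];
    return total;
}

/* sum(a, b, c) expands to sum_impl(3, (float[]){a, b, c}).
   The element count is computed by the compiler from the array size. */
#define sum(...)                                                 \
    sum_impl(sizeof((float[]){__VA_ARGS__}) / sizeof(float),     \
             (float[]){__VA_ARGS__})
```

Because each argument initializes a float element, passing a pointer or a string is rejected at compile time. Unlike the proposed syntax, this still relies on the preprocessor and requires at least one argument.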

Within the function body, any access to an element beyond the number of arguments actually passed could be checked at runtime and trigger a runtime fault, thereby greatly enhancing safety.

Last but not least, all the variadic parameters are typed and can be type checked at compile time, just as non-variadic parameters are.

In some use cases it would of course be desirable to have alternating types, such as when writing a function to store keys and values in a collection. This could be accommodated simply by allowing more formal parameters after the ellipsis, like so:

void store ( collection dict, ... int counter, key_t key, val_t value );

This function could then be called as

store(dict, key1, val1, key2, val2, key3, val3);

but would be compiled as if it had been written

store(dict, 3, key1, val1, key2, val2, key3, val3);

The types of the actual parameters would be checked at compile time against the corresponding variadic formal parameters.

Within the body of the function, the counter would again be referenced by its identifier, and keys and values would be referenced as if they were arrays:

key[i] refers to the key of the i-th key/value pair
value[i] refers to the value of the i-th key/value pair

and these references would be compiled to their respective addresses on the stack frame.
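
For comparison, alternating key/value arguments can be emulated in C99 by grouping each pair into a struct. This is only a sketch: kv_pair, dict, DICT_MAX and store_impl are hypothetical names, and the caller has to write braces around each pair, which the proposed syntax would not require.

```c
#include <stddef.h>

#define DICT_MAX 8                     /* hypothetical capacity for the sketch */

typedef struct { const char *key; int value; } kv_pair;
typedef struct { kv_pair entries[DICT_MAX]; size_t count; } dict;

/* Worker: receives the pair count and a typed array of pairs.
   Pairs that do not fit are silently dropped in this sketch. */
static void store_impl(dict *d, size_t counter, const kv_pair pairs[])
{
    for (size_t i = 0; i < counter && d->count < DICT_MAX; i++)
        d->entries[d->count++] = pairs[i];
}

/* store(&d, {"one", 1}, {"two", 2}) expands to
   store_impl(&d, 2, (kv_pair[]){{"one", 1}, {"two", 2}}).
   The braces are split at commas by the preprocessor and rejoined
   by __VA_ARGS__, so the initializer list arrives intact. */
#define store(d, ...)                                                   \
    store_impl((d), sizeof((kv_pair[]){__VA_ARGS__}) / sizeof(kv_pair), \
               (kv_pair[]){__VA_ARGS__})
```

Each key and value is type checked as a struct member initializer, so swapping them, as in {1, "one"}, produces a compile-time diagnostic.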

None of this is really difficult to do, nor has it ever been. However, C's design philosophy simply isn't conducive to such features.

Unless an adventurous C compiler implementor (or C preprocessor implementor) takes the lead in implementing this or a similar scheme, it is unlikely we will ever see anything of this kind in C.

The trouble is that folks who are interested in type safety and willing to put in the work to build their own compilers usually come to the conclusion that the C language is beyond salvage, and that one may as well start over with a better designed language.

I have been there myself: I eventually decided to abandon the attempt, implemented one of Wirth's languages, and added type-safe variadics to that instead. I have since run into other people who told me about their own aborted attempts. Proper type-safe variadics in C seem poised to remain elusive.
