Parsing Command Line Arguments for Shell
I am writing a shell for a summer project, and I am trying to parse the command line. For example, if I call
ls -l
I need to parse the -l part so that I can pass it in the argument vector used by execv. I know I am parsing it correctly, but for some reason the directory is not being found. Am I perhaps missing something? Below is my code.
Although the strtok standard library function can be useful, you need to be aware of the shortcomings of its interface, which is basically a trap for the unwary. In this program, you seem to have stumbled over both of the most common problems with the strtok interface. Please reread man strtok carefully in conjunction with this answer in order to avoid falling into these problems in the future. Also, do not use strtok as an example of good interface design. Instead, use it as a model for what to avoid:
strtok operates on a string pointer which it keeps in a static variable. Whenever you call strtok with a non-NULL first argument, it first resets the value of this static variable to that string. At the end of each call to strtok, it sets its static variable to the address at which the next scan should start, which is just after the token it just found.
There is only one instance of the static variable in the whole program, so you can't interleave strtok scans on two different strings. Worse, you can't call a function which itself calls strtok inside a strtok scan of a string, because the call inside the function will reset the strtok state.
That means you have to be careful whenever you have more than one strtok scan in a program. In your case, after the initialization of the badly-named variable env:
token = strtok(env, ":");
you use strtok to divide your input command into pieces in the badly-named variable argv:
argv = strtok(buf_copy, " ");
so when you later want to find the next component of env:
token = strtok(NULL, ":");
strtok's state no longer points into env; instead it points into buf_copy (and, with your particular input, at a point in buf_copy where no more tokens will be found).
The first argument to strtok is a char*, not a const char*.
In general, if a library function has a string argument, the argument should be declared as const char* unless the function intends to modify the string. Or, to put it another way, a const char* declaration is a promise that no attempt will be made to modify the argument, and if the promise is not made, it's probably for a good reason.
And, indeed, if you read strtok's documentation, you will see that it explicitly modifies its input string by overwriting some delimiter characters with a NUL character. This has the effect of permanently dividing the original string into separate tokens. Sometimes that's fine, but it can get you into a lot of trouble if you want to refer to the string's original value again in the future. Often you will find yourself making a copy of the original string in order to call strtok on it. (That's often a symptom of bad program design, or a signal that strtok wasn't really the right tool to use for parsing.)
In this particular program, the trap is that getenv() does not return a copy of the environment variable's value. It returns a pointer directly into the environment variable table. Although the return type of getenv is char*, which might lead you to believe that modifying the value is ok, the C standard clearly tells you not to:
The string pointed to shall not be modified by the program
Unfortunately, this prohibition is not present in the Linux manpage for getenv, but that manpage does note that getenv gives you a pointer into the environment table. If you do modify the string returned by getenv, it is highly likely (though not guaranteed) that a subsequent call to getenv for the same environment variable will retrieve the modified value.
And that's precisely what you do: since you let strtok loose on the string returned by getenv("PATH"), a subsequent call to getenv("PATH") will see a value truncated at the first colon.