
What does the following SED pattern exactly do?

I am working on a CGI script, and the developer who worked on it before me used a sed pattern:

COMMAND=`echo "$QUERY_STRING" | sed -n 's/^.*com_tex=\([^&]*\).*$/\1/p' | sed "s/%20/ /g"`

Here com_tex is the name of the text box in the HTML form.

This line takes a value from the HTML text box and assigns it to a shell variable. The sed pattern is apparently (I am not sure) needed to extract the value without the other accompanying data.

I should also mention the problem that prompted this question. The same pattern is used for a text area where I enter a command, and I need it retrieved exactly as entered. However, it comes out jumbled. For example, if I enter the following command in the text box:

/usr/bin/free -m >> /home/admin/memlog.txt

The value that gets stored in the variable is:

%2Fusr%2Fbin%2Ffree+-m+%3E%3E+%2Fhome%2Fadmin%2Fmemlog.txt

Clearly / is being replaced by %2F, a space by +, and the > sign by %3E.

But I cannot figure out where this is specified in the pattern above. Can someone explain how the pattern works, or tell me what pattern I should use instead so that I get the command I entered rather than this output?

 sed -n

The -n switch means "don't print": sed will not print each line automatically.

's/

s is for substitution and / is the delimiter, so the command has the form
s/thing to match/substitution/optional flags

^.*com_tex=

^ means the start of the line.
.* means match zero or more of any character.
So together they match the longest possible string from the start of the line up to com_tex=
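
One consequence of "longest" is worth seeing in action. This is only a small sketch with a made-up input in which the field appears twice; a real form would not normally submit it that way.

```shell
# Hypothetical input in which the field appears twice.
QUERY_STRING='com_tex=first&com_tex=second'

# The greedy ^.* runs as far right as it can, so the last com_tex= wins.
LAST=$(echo "$QUERY_STRING" | sed -n 's/^.*com_tex=\([^&]*\).*$/\1/p')
echo "$LAST"   # prints: second
```

The same greediness means a field whose name merely ends in com_tex would also be picked up, since nothing in the pattern requires com_tex= to start right after an &.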

\(\)

This is a capture group: whatever is matched inside these escaped parentheses is saved and can be used later.

[^&]*

[^] When the caret is the first character inside square brackets, it means match any character except those listed in the brackets.
* As before, this means zero or more matches.

Combined with the capture group, this means: capture every character up to, but not including, the next &.

 .*$

The same as the first part, except that $ means the end of the line, so this matches everything up to the end.

/\1/p' 

After the second / comes the substitution. \1 is the capture group from before, so this replaces everything matched in the first part (the whole line) with the contents of the capture group. p means print; it must be stated explicitly because the -n switch was used, and it ensures that only lines where the substitution succeeded are printed.
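
Putting the pieces of the first sed together, here is a minimal sketch run against a made-up query string. The user and submit fields are assumptions for illustration; only com_tex comes from the question.

```shell
# Hypothetical query string; only the com_tex field name is from the question.
QUERY_STRING='user=admin&com_tex=ls%20-l&submit=Run'

# ^.*com_tex= eats everything up to the value, \([^&]*\) captures the value,
# .*$ eats the rest, and \1 puts back only the captured part.
VALUE=$(echo "$QUERY_STRING" | sed -n 's/^.*com_tex=\([^&]*\).*$/\1/p')
echo "$VALUE"   # prints: ls%20-l
```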

|

The pipe: it feeds the output of the first sed into the second.

s/%20/ /g

Substitute a space for %20; g means global, so do it for every match on the line.

HTH :)

This is not performed by any of the patterns. My best guess is that this escaping is performed by the shell or whatever fetches the HTML.

I will try to explain the patterns a little at a time.
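
For what it's worth, %2F, + and %3E are what a browser produces when it submits a form as application/x-www-form-urlencoded, so the value has to be decoded after it is extracted. Below is a rough sketch of one way to do that with awk. The helper name urldecode is my own invention and not part of the original script, and non-ASCII bytes should be treated as untested.

```shell
# urldecode is a hypothetical helper name, not part of the original script.
urldecode() {
    printf '%s\n' "$1" | awk '
    {
        gsub(/[+]/, " ")            # form encoding writes a space as +
        digits = "0123456789ABCDEF"
        out = ""
        # Consume one %XX escape at a time so decoded text is never rescanned.
        while (match($0, /%[0-9A-Fa-f][0-9A-Fa-f]/)) {
            hi = index(digits, toupper(substr($0, RSTART + 1, 1))) - 1
            lo = index(digits, toupper(substr($0, RSTART + 2, 1))) - 1
            out = out substr($0, 1, RSTART - 1) sprintf("%c", hi * 16 + lo)
            $0 = substr($0, RSTART + RLENGTH)
        }
        print out $0
    }'
}

DECODED=$(urldecode '%2Fusr%2Fbin%2Ffree+-m+%3E%3E+%2Fhome%2Fadmin%2Fmemlog.txt')
echo "$DECODED"   # prints: /usr/bin/free -m >> /home/admin/memlog.txt
```

The + signs are converted before the %XX escapes so that an encoded plus (%2B) still comes out as a literal +.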

sed -n

-n specifies that sed should not print the text being matched (here, the query string) after applying the commands.
The command that follows is of the form 's/regexp/replacement/flags'.

^.*com_tex=\([^&]*\).*$

^ matches the beginning of the line
.* matches zero or more of any character
com_tex= matches those characters literally
\([^&]*\) '\(' specifies the beginning of a group that can later be backreferenced via its index. '[^&]*' matches zero or more characters that are not '&'. '\)' specifies the end of the group.
.* See above
$ matches the end of the line

\1

The replacement above is a backreference to the first (and only) group in the regexp, i.e. '[^&]*'. So the entire line is replaced with the characters that immediately follow 'com_tex=', up to the first '&'.

The p flag specifies that if a substitution took place, the line resulting from the substitution should be printed.
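
To see why -n and p are used together, here is a small sketch with a made-up query string that has no com_tex field at all.

```shell
# Hypothetical input that lacks the com_tex field.
INPUT='user=admin&submit=Run'

# With -n and p, nothing is printed when the substitution does not happen.
NO_FIELD=$(echo "$INPUT" | sed -n 's/^.*com_tex=\([^&]*\).*$/\1/p')
echo "[$NO_FIELD]"    # prints: []

# Without them, sed prints the unmodified line, which would end up in the variable.
WITHOUT_N=$(echo "$INPUT" | sed 's/^.*com_tex=\([^&]*\).*$/\1/')
echo "[$WITHOUT_N]"   # prints: [user=admin&submit=Run]
```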

sed "s/%20/ /g"

The above is much simpler: it replaces all occurrences (not just the first) of '%20' with a space ' '.
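
Running the whole pipeline on the value from the question shows why it comes through still encoded. The query string below is an assumption built around that value; the submit field is made up.

```shell
# The encoded value from the question, wrapped in a hypothetical query string.
QUERY_STRING='com_tex=%2Fusr%2Fbin%2Ffree+-m+%3E%3E+%2Fhome%2Fadmin%2Fmemlog.txt&submit=Run'

COMMAND=$(echo "$QUERY_STRING" | sed -n 's/^.*com_tex=\([^&]*\).*$/\1/p' | sed "s/%20/ /g")
echo "$COMMAND"
# prints: %2Fusr%2Fbin%2Ffree+-m+%3E%3E+%2Fhome%2Fadmin%2Fmemlog.txt
```

The second sed only knows about %20, and this value contains none, so the +, %2F and %3E sequences pass through untouched.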
