
sed - Remove all of a line except matching pattern

I am trying to parse hashtags out of a file. For instance:

Some text here #Foo Some other text here....

I would like the output to be:

#Foo

The text before and after the # can vary, and I want to apply this to multiple lines of the file. Every line will have a # in it, since I already grep'd the file for hashtags.

Basically, I'm trying to create a list of the hashtags contained in a file. If there is also a way to remove duplicate tags from the resulting output, that would be a bonus.

With GNU grep:

grep -o '#[^ ]*' file
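
The question also asks about dropping duplicate tags. One possible way, as a minimal sketch, is to pipe the grep output through sort -u. The sample file contents and the variable names (tmpfile, unique_tags) below are made up for illustration, and the pattern uses [:blank:] instead of a literal space so that tabs also end a tag:

```shell
# Create a small sample file (hypothetical contents, for illustration only)
tmpfile=$(mktemp)
printf '%s\n' \
  'Some text here #Foo Some other text here' \
  'More text #Bar and again #Foo' \
  'Last line #Baz end' > "$tmpfile"

# grep -o prints each match on its own line;
# sort -u sorts the tags and keeps one copy of each
unique_tags=$(grep -o '#[^[:blank:]]*' "$tmpfile" | LC_ALL=C sort -u)
printf '%s\n' "$unique_tags"

rm -f "$tmpfile"
```

Note that sort -u reorders the tags. If the original order of first appearance matters, a different deduplication step would be needed.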

With sed:

sed -E 's/^[^#]*(#[^[:blank:]]*).*/\1/'
  • ^[^#]* matches the portion of the line before the first #

  • (#[^[:blank:]]*) matches the # followed by any number of non-space/tab characters, and puts the match in capture group 1

  • .* matches the rest of the line

  • In the replacement, capture group \1 is used, so the whole line is replaced by the hashtag alone

Example:

% sed -E 's/^[^#]*(#[^[:blank:]]*).*/\1/' <<<'Some text here #Foo Some other text here'
#Foo
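
One difference between the two approaches is worth keeping in mind: the sed substitution keeps only the first hashtag on each line, while grep -o prints every match. The sketch below compares them on a made-up line containing two tags (the variable names line, first_only, and all_tags are illustrative, not from the original answer):

```shell
# A sample line with two hashtags (hypothetical input)
line='text #Foo more #Bar end'

# sed: the leading [^#]* stops at the first #, and the trailing .* swallows
# everything after the first tag, so only #Foo survives
first_only=$(printf '%s\n' "$line" | sed -E 's/^[^#]*(#[^[:blank:]]*).*/\1/')

# grep -o: prints each match on its own line, so both tags appear
all_tags=$(printf '%s\n' "$line" | grep -o '#[^[:blank:]]*')

printf 'sed:\n%s\n' "$first_only"
printf 'grep -o:\n%s\n' "$all_tags"
```

If lines can carry more than one hashtag, the grep -o form is probably the better fit for building a complete list.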
