sed - Remove all of a line except matching pattern
I am trying to parse out hashtags from a file. For instance:
Some text here #Foo Some other text here....
I would like the output to be:
#Foo
The text before and after the # can change, and I'm trying to apply this to multiple lines of the file. Every line will have a # in it, as I already grep'd the file for hashtags.
Basically, I'm trying to create a list of the hashtags that are contained in a file. If there is also a way to remove duplicated tags from the resulting output, that would be a bonus.
With GNU grep:
grep -o '#[^ ]*' file
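The question also asks about dropping duplicate tags. One possible way, sketched here under the assumption that the order of the output does not matter, is to pipe the matches through sort -u. The temporary file and its two sample lines are made up for illustration.

```shell
# Hypothetical sample input; the contents are invented for this sketch.
tmpfile=$(mktemp)
printf '%s\n' \
  'Some text here #Foo Some other text here' \
  'More text #Bar and again #Foo' > "$tmpfile"

# grep -o prints every match on its own line, so a line with two
# hashtags yields two output lines; sort -u then removes duplicates.
tags=$(grep -o '#[^ ]*' "$tmpfile" | sort -u)
printf '%s\n' "$tags"

rm -f "$tmpfile"
```

With this sample input the output should be #Bar followed by #Foo, each listed once.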
With sed:
sed -E 's/^[^#]*(#[^[:blank:]]*).*/\1/'
^[^#]* matches the portion before the first #
(#[^[:blank:]]*) matches the # followed by any number of non-space/tab characters, and puts the match in captured group 1
.* matches the rest
In the replacement, the captured group \1 is used, so the whole line is replaced by the hashtag alone.
Example:
% sed -E 's/^[^#]*(#[^[:blank:]]*).*/\1/' <<<'Some text here #Foo Some other text here'
#Foo
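One thing worth checking is how the sed expression behaves on a line with more than one hashtag. Because the pattern consumes the whole line and keeps only group 1, it should emit just the first tag per line. The sample lines below are made up for illustration, and printf is used instead of a here-string only so the sketch also runs in plain sh.

```shell
# Hypothetical sample lines; the second one contains two hashtags.
first_tags=$(printf '%s\n' \
  'Some text here #Foo Some other text here' \
  'Another line #Bar then #Baz' \
  'Yet more text #Foo' |
  sed -E 's/^[^#]*(#[^[:blank:]]*).*/\1/')

# One tag per input line: #Baz is dropped because only the first
# hashtag on each line is captured.
printf '%s\n' "$first_tags"
```

If every hashtag on a line is needed, the grep -o form is probably the simpler choice, since it reports each match separately.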