sed or awk to capture part of url

I am not very experienced with regular expressions or sed/awk scripting.

I have URLs similar to the following torrent URL:

http://torcache.net/torrent/D7249CD9AF321C8578B3A7007ABBDD63B0475EEB.torrent?title=[kickass.to]against.the.ropes.by.carly.fall.epub.torrent

I would like a sed or awk script to extract the text after title=, i.e. from the example above just get:

[kickass.to]against.the.ropes.by.carly.fall.epub.torrent

A simple approach with awk: use = as the field separator:

awk -F"=" '{print $2}'

Thus:

echo "http://torcache.net/torrent/D7249CD9AF321C8578B3A7007ABBDD63B0475EEB.torrent?title=[kickass.to]against.the.ropes.by.carly.fall.epub.torrent" | awk -F"=" '{print $2}'
[kickass.to]against.the.ropes.by.carly.fall.epub.torrent
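
One caveat with -F"=": if the title itself contains an =, $2 stops at it. A minimal sketch, assuming a made-up URL (example.com and the title a=b.torrent are illustrative, and extract_title is a hypothetical helper name), that splits on the whole string title= instead:

```shell
# Hypothetical helper: print whatever follows the first "title=" in a URL.
# A multi-character -F value is treated by awk as a regular expression.
extract_title() {
  printf '%s\n' "$1" | awk -F'title=' '{print $2}'
}

# Made-up URL whose title contains an "=" (an assumption for illustration).
url='http://example.com/file.torrent?title=a=b.torrent'

printf '%s\n' "$url" | awk -F'=' '{print $2}'   # prints: a
extract_title "$url"                            # prints: a=b.torrent
```

For the URL in the question both variants give the same result, since its title contains no =.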

Just remove everything up to and including title= with sed 's/.*title=//':

$ echo "http://torcache.net/torrent/D7249CD9AF321C8578B3A7007ABBDD63B0475EEB.torrent?title=[kickass.to]against.the.ropes.by.carly.fall.epub.torrent" | sed 's/.*title=//'
[kickass.to]against.the.ropes.by.carly.fall.epub.torrent
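
If other query parameters may follow the title, the substitution above would keep them. A possible variant, sketched under the assumption that parameters are separated by & (the example.com URL and the helper name title_param are made up for illustration), captures only the title value:

```shell
# Hypothetical helper: print only the value of the title parameter.
# -n plus the p flag prints the line only when the substitution matched.
title_param() {
  printf '%s\n' "$1" | sed -n 's/.*[?&]title=\([^&]*\).*/\1/p'
}

# Made-up URL with a trailing parameter (an assumption for illustration).
title_param 'http://example.com/x.torrent?title=foo.torrent&id=7'   # prints: foo.torrent
```

Lines without a title parameter produce no output at all, which may or may not be what you want.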

Let's say:

s='http://torcache.net/torrent/D7249CD9AF321C8578B3A7007ABBDD63B0475EEB.torrent?title=[kickass.to]against.the.ropes.by.carly.fall.epub.torrent'

Pure Bash solution:

echo "${s/*title=}"
[kickass.to]against.the.ropes.by.carly.fall.epub.torrent
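
The ${s/pattern} form is a Bash extension. If the script might run under a plain POSIX sh, prefix removal with ## should give the same result here. A small sketch (the variable name title is just an illustrative choice):

```shell
s='http://torcache.net/torrent/D7249CD9AF321C8578B3A7007ABBDD63B0475EEB.torrent?title=[kickass.to]against.the.ropes.by.carly.fall.epub.torrent'

# ${s##pattern} removes the longest prefix of $s matching the glob pattern,
# so everything up to and including the last "title=" is dropped.
title=${s##*title=}
printf '%s\n' "$title"
# prints: [kickass.to]against.the.ropes.by.carly.fall.epub.torrent
```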

Or using grep -P (Perl-compatible regular expressions, available in GNU grep):

echo "$s"|grep -oP 'title=\K.*'
[kickass.to]against.the.ropes.by.carly.fall.epub.torrent

Using sed (in your example there is no need to mention title in the regexp; since .* is greedy, this deletes everything up to the last =):

 sed 's/.*=//'

Another solution uses cut, another standard Unix tool:

 cut -d= -f2
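
Note that -f2 returns only the second =-separated field. A hedged variant, assuming the title may itself contain an = (the example.com URL and the helper name rest_after_eq are made up for illustration), uses -f2- to keep everything after the first =:

```shell
# Hypothetical helper: print everything after the first "=" on the line.
# -f2- selects field 2 through the last field, rejoined with the delimiter.
rest_after_eq() {
  printf '%s\n' "$1" | cut -d= -f2-
}

# Made-up URL whose title contains an "=" (an assumption for illustration).
rest_after_eq 'http://example.com/file.torrent?title=a=b.torrent'   # prints: a=b.torrent
```

This only helps when title is the first parameter in the URL; otherwise the earlier parameters end up in the output too.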
