Match a URL pattern in a file using sed, awk, or grep

Question

I am trying to use grep to extract a list of URLs beginning with http and ending with jpg.

grep -o 'picturesite.com/wp-content/uploads/.......' filename

The code above is how far I've gotten. I then need to pass these file names to curl.

title : "Family Vacation", jpg:"http://picturesite.com/wp-content/uploads/2014/01/mypicture.jpg", owner : "PhotoTaker"

Answer 1

You can capture URL patterns by doing:

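grep -o 'http[^"]*\.jpg' file

Here [^"]* keeps the match from running past the closing quote, and \. matches a literal dot rather than any character.

$ grep -o 'http[^"]*\.jpg' <<EOF
> title : "Family Vacation", jpg:"http://picturesite.com/wp-content/uploads/2014/01/mypicture.jpg", owner : "PhotoTaker"
> EOF
http://picturesite.com/wp-content/uploads/2014/01/mypicture.jpg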

curl does not read URLs from standard input by default, so a straightforward option is to store the extracted URLs in a file, then read that file one line at a time and pass each line to curl as an argument.
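
The store-then-loop approach could look something like the sketch below. The function names extract_jpg_urls and fetch_all and the scratch file urls.txt are hypothetical, and the grep pattern assumes each URL sits inside double quotes, as in the sample line from the question.

```shell
# extract_jpg_urls FILE
# Print every http...jpg URL found in FILE, one per line.
extract_jpg_urls() {
    grep -o 'http[^"]*\.jpg' "$1"
}

# fetch_all LISTFILE [COMMAND...]
# Run COMMAND once per line of LISTFILE, with the URL as its last argument.
# COMMAND defaults to "curl -O", which saves each URL under its remote name.
fetch_all() {
    list=$1
    shift
    [ "$#" -gt 0 ] || set -- curl -O
    while IFS= read -r url; do
        "$@" "$url"
    done < "$list"
}

# Usage (urls.txt is a hypothetical scratch file):
#   extract_jpg_urls filename > urls.txt
#   fetch_all urls.txt echo     # dry run: only prints each URL
#   fetch_all urls.txt          # downloads with curl -O
```

Passing echo as the command is a cheap way to check the extracted list before any download starts.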

Answer 2

sed -nE 's/http[^"[:space:]]*\.(jpg|gif|other|ext)/\
curl $CURLOPTS & >$OUT\
/p' <"$infile" | grep '^curl ' | sh -n

The above command searches $infile for any string beginning with "http", followed by any run of characters other than whitespace or double quotes, and ending in a dot plus one of the "|"-separated file extensions contained in the parentheses. With -E the alternation is written "|" and the parentheses are left unescaped; "\|" would match a literal bar.

Once it has found such a string, sed substitutes it for the "&" in the curl command line on the second line and puts that command on a line of its own. grep then discards whatever text surrounded the match, and the remaining curl commands are piped to sh. Because the sed script is single-quoted, $CURLOPTS and $OUT are expanded by that final sh when it runs the commands, so they need to be set in its environment.

Remember, sed is the stream editor, not just a stream searcher, so it can very capably pre-process input for other commands to make them do what you want.

Note: sh is currently passed -n (noexec), so it reads the generated commands and checks their syntax but does not run them. To see the commands themselves, leave off the final "| sh -n". Once you have run it a few times and are satisfied that it does the right thing, remove the -n so that the commands actually execute.

Note 2: If a line may contain more than one URL, add sed's 'g' flag (turning the final /p into /gp) so that every match on the line is substituted, not only the first.
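
As a sketch of the 'g' variant, the function below prints one curl command per URL without running anything. The name gen_curl_cmds is hypothetical, the extension list is cut down to jpg and gif for illustration, and the grep '^curl ' stage is there to drop the leftover text around each match. $CURLOPTS and $OUT are emitted as literal text for a later shell to expand.

```shell
# gen_curl_cmds FILE
# Print one curl command line for every http...(jpg|gif) URL in FILE.
# The sed replacement wraps each match in newlines so that every generated
# command starts its own line; grep then keeps only those lines.
gen_curl_cmds() {
    sed -nE 's/http[^"[:space:]]*\.(jpg|gif)/\
curl $CURLOPTS & >$OUT\
/gp' "$1" | grep '^curl '
}

# Usage: inspect the output first, then pipe it to "sh -n" for a syntax
# check, and only afterwards to plain "sh".
#   gen_curl_cmds filename
#   gen_curl_cmds filename | sh -n
```

For an input line such as a:"http://x.com/1.jpg", b:"http://x.com/2.gif" this prints two lines, one curl command for each URL.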

Answer 1
0 2014-03-04 03:37:04

Answer 2
0 Accepted 2014-03-04 06:09:33
