
Download a web page using wget and define a new filename

I need to write a bash script using wget that downloads a web page whose URL is passed as an argument. The script should save the downloaded page in a new file, file.html, and then also extract all the tags of the web page into a second file, keeping only the content of the web page.

This is the beginning of my script:

#!/bin/bash
$page = "https://fr.wikipedia.org/wiki/Page_web"
wget -r  -np '$page' file.html

I am stuck on the second part.

This will work:

page="https://fr.wikipedia.org/wiki/Page_web"
wget -O file.html -r -np "$page"
  1. Variable assignment: var_name=value (no spaces allowed around =).
  2. Bash is not PHP: $var=val is not correct, var=val is.
  3. Use double quotes to allow variable expansion ("$page"); inside single quotes, $page is kept as literal text.
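
The three points above can be checked with a short sketch. The variable names single and double are only illustrative and do not come from the original script.

```bash
#!/bin/bash
# Assignment: no spaces around =, and no leading $ on the name.
page="https://fr.wikipedia.org/wiki/Page_web"

# Illustrative names (not from the original post):
single='$page'   # single quotes: the literal five characters $page
double="$page"   # double quotes: the value stored in page

echo "$single"
echo "$double"
```

With single quotes, as in the script from the question, wget would receive the literal text $page rather than the URL.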

From the wget manual:

 -O file
 --output-document=file
     The documents will not be written to the appropriate files, but all will be concatenated together and written to file.
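
For the second part of the question, here is a minimal sketch under several assumptions. It reads the request as "list the tags in one file and keep the tag-free content in another". The function names and the output files tags.txt and content.txt are hypothetical. The tag handling is a crude regular expression that assumes no tag spans several lines, so a real HTML parser would be more robust. The -r and -np options are dropped because only one page is wanted and, per the excerpt above, -O would concatenate every recursively fetched document into the same file.

```bash
#!/bin/bash
# Hypothetical helper names; none of them come from the original post.

# Download a single page ($1) into the given file ($2).
download_page() {
    wget -q -O "$2" "$1"
}

# Print every tag found in the file, one per line.
# Assumption: a tag never spans more than one line.
list_tags() {
    grep -o '<[^>]*>' "$1"
}

# Print the file with everything between < and > removed.
# Crude: script and style bodies are kept as plain text.
strip_tags() {
    sed -e 's/<[^>]*>//g' "$1"
}

# Only run the download when a URL is passed as the first argument.
if [ "$#" -ge 1 ]; then
    download_page "$1" file.html || exit 1
    list_tags file.html > tags.txt
    strip_tags file.html > content.txt
fi
```

If the sketch were saved as page.sh (a hypothetical name), it could be called as: bash page.sh "https://fr.wikipedia.org/wiki/Page_web"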


 