I followed the tutorial on the Nutch Wiki, "SetupNutchAndTor" ( https://wiki.apache.org/nutch/SetupNutchAndTor ),
and set up nutch-site.xml as follows:
<property>
  <name>http.proxy.host</name>
  <value>127.0.0.1</value>
  <description>The proxy hostname. If empty, no proxy is used.</description>
</property>
<property>
  <name>http.proxy.port</name>
  <value>8118</value>
  <description>The proxy port.</description>
</property>
However, nothing is crawled from the .onion link and nothing is indexed into Solr. Does anyone know what the problem is?
Anything in the logs?
FYI, with StormCrawler you can use a SOCKS proxy directly, thanks to this commit.
You'd need to use OkHttp for the protocol implementation and configure it like this:
http.protocol.implementation: "com.digitalpebble.stormcrawler.protocol.okhttp.HttpProtocol"
https.protocol.implementation: "com.digitalpebble.stormcrawler.protocol.okhttp.HttpProtocol"
http.proxy.host: localhost
http.proxy.port: 9050
http.proxy.type: "SOCKS"
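Both setups assume something is actually listening on the configured proxy port: 8118 for the HTTP proxy in the nutch-site.xml from the question, 9050 for the SOCKS proxy in the StormCrawler config. Before digging through crawler logs, it may be worth ruling that out. The following is only a minimal sketch in Python; the helper name port_is_open is made up for illustration, and the host and ports are simply the values from the configs above. It only checks that a TCP connection can be opened, not that the proxy can actually reach .onion sites.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be opened.

    Hypothetical helper for illustration; it does not speak HTTP or SOCKS,
    it only checks that something accepts connections on the port.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False


if __name__ == "__main__":
    # Ports taken from the configs above (assumed to be on the local machine).
    checks = (
        ("HTTP proxy from nutch-site.xml", 8118),
        ("SOCKS proxy from the StormCrawler config", 9050),
    )
    for name, port in checks:
        state = "listening" if port_is_open("127.0.0.1", port) else "not reachable"
        print(f"{name} on 127.0.0.1:{port}: {state}")
```

If either port shows as not reachable, the crawler has no proxy to go through, which would explain an empty crawl regardless of the crawler-side settings.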