
How to use Tor socks5 in R getURL

I want to use Tor with the getURL function in R. Tor is working (checked in Firefox), with SOCKS5 on port 9050. But when I set this in R, I get the following error:

html <- getURL("http://www.google.com", followlocation = T, .encoding="UTF-8", .opts = list(proxy = "127.0.0.1:9050", timeout=15))

Error in curlPerform(curl = curl, .opts = opts, .encoding = .encoding) : '\n\nTor is not an HTTP Proxy\n\n\n

Tor is not an HTTP Proxy

\n

\nIt appears you have configured your web browser to use Tor as an HTTP proxy.\nThis is not correct: Tor is a SOCKS proxy, not an HTTP proxy.\nPlease configure your client accordingly.

I've tried replacing proxy with socks and socks5, but it didn't work.

There are curl bindings for R, which you can use to reach the Tor SOCKS5 proxy server.

The call from the shell (which you can translate to the R binding) is:

curl --socks5-hostname 127.0.0.1:9050 google.com

With this option, Tor also handles the DNS lookups for A records.

RCurl defaults to an HTTP proxy, but Tor provides a SOCKS proxy. Tor is clever enough to recognize that the proxy client (RCurl) is trying to use it as an HTTP proxy, hence the HTML error message returned by Tor.

To get RCurl, and curl, to use a SOCKS proxy, you can put a protocol prefix on the proxy address. There are two protocol prefixes for SOCKS5: "socks5" and "socks5h" (see the Curl manual). The latter lets the SOCKS server handle DNS queries, which is the preferred method when using Tor (in fact, Tor will warn you if you let the proxy client resolve the hostname).
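To make the difference between the two prefixes concrete, here is a minimal sketch. The helper name tor_proxy_url and its arguments are hypothetical (they are not part of RCurl); it only builds the string that would be passed as the proxy option, assuming Tor listens on 127.0.0.1:9050 as in the question.

```r
# Hypothetical helper (not part of RCurl): build the proxy string for a
# local Tor SOCKS port. remote_dns = TRUE selects the "socks5h" prefix,
# so the hostname is resolved by Tor rather than by the client.
tor_proxy_url <- function(host = "127.0.0.1", port = 9050, remote_dns = TRUE) {
  scheme <- if (remote_dns) "socks5h" else "socks5"
  sprintf("%s://%s:%d", scheme, host, as.integer(port))
}

proxy_remote_dns <- tor_proxy_url()                   # DNS resolved through Tor
proxy_local_dns  <- tor_proxy_url(remote_dns = FALSE) # DNS resolved by the client
```

For Tor, the first form is the one you would normally hand to the proxy option.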

Here is a pure R solution which will use Tor for DNS queries.

library(RCurl)
options(RCurlOptions = list(proxy = "socks5h://127.0.0.1:9050"))
my.handle <- getCurlHandle()
html <- getURL(url='https://www.torproject.org', curl=my.handle)

If you want to specify additional parameters, this is where to put them:

library(RCurl)
options(RCurlOptions = list(proxy = "socks5h://127.0.0.1:9050",
                            useragent = "Mozilla",
                            followlocation = TRUE,
                            referer = "",
                            cookiejar = "my.cookies.txt"
                            )
        )
my.handle <- getCurlHandle()
html <- getURL(url='https://www.torproject.org', curl=my.handle)
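If you would rather not change the global RCurlOptions, the same settings can probably be passed per request through .opts, as the call in the question does. The sketch below is one way to do that; the names tor_opts and fetch_via_tor are hypothetical, and the address and timeout are taken from the question.

```r
# Hypothetical helper: per-request option list for going through Tor.
# The "socks5h" prefix asks curl to let Tor resolve the hostname.
tor_opts <- function(proxy = "socks5h://127.0.0.1:9050", timeout = 15) {
  list(proxy = proxy, followlocation = TRUE, timeout = timeout)
}

# Hypothetical wrapper around RCurl::getURL using those options.
fetch_via_tor <- function(url) {
  RCurl::getURL(url, .opts = tor_opts(), .encoding = "UTF-8")
}

# Needs the RCurl package and a running Tor client, so it is not run here:
# html <- fetch_via_tor("http://www.google.com")
```

This is the call from the question with only the proxy string changed, so the global options of the R session stay untouched.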

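Hi Naparst, I would really appreciate knowing how to get your suggested solution to work. The option should be something like: opts <- list(socks5.hostname = "127.0.0.1:9050") (this does not work, because socks5.hostname is not an option)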

Under Mac OS X, install Tor Bundle for Mac and Privoxy, and then update the proxy settings in the system preferences.

'System preferences' --> 'Wi-Fi' --> 'Advanced' --> 'Proxies' --> set 'Web Proxy (HTTP)' Web Proxy Server 127.0.0.1:8118

'System preferences' --> 'Wi-Fi' --> 'Advanced' --> 'Proxies' --> set 'Secure Web Proxy (HTTPS)' Secure Web Proxy Server 127.0.0.1:8118 --> 'OK' --> 'Apply'

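library(RCurl)  # the package name is case-sensitive
curl <- getCurlHandle()
# Port 9150 is the SOCKS port of the Tor Browser bundle; proxytype = 5 selects SOCKS5
curlSetOpt(proxy = '127.0.0.1:9150', proxytype = 5, curl = curl)
html <- getURL(url = 'check.torproject.org', curl = curl)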
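Once the page from the Tor check service has been fetched, you may want to test the result in code rather than read the HTML by eye. The sketch below is one possible check. The helper name is_using_tor is hypothetical, and it assumes the success page contains the word "Congratulations", which is an assumption about the page wording and could change.

```r
# Hypothetical helper: TRUE when the fetched check page contains the
# success message. The "Congratulations" wording is an assumption about
# what the Tor check page returns.
is_using_tor <- function(html) {
  grepl("Congratulations", html, fixed = TRUE)
}

# Offline illustration with made-up page snippets (not real responses):
sample_ok  <- "<h1>Congratulations. This browser is configured to use Tor.</h1>"
sample_bad <- "<h1>Sorry. You are not using Tor.</h1>"
ok_result  <- is_using_tor(sample_ok)
bad_result <- is_using_tor(sample_bad)
```

With a real request you would pass the html value returned by getURL to the helper.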
