How do I transfer data using a web server/TCP sockets in Ruby?
I have a data scraper in Ruby that retrieves article data.
Another dev on my team needs my scraper to spin up a web server that he can make requests to, so that he can import the data into a Node application he's built.
Being a junior, I do not understand the following:
a) Is there a convention in Rails that tells me where to place my scraper.rb file?
b) Once that file is properly placed, how would I get the server to accept connections and serve the scraped data?
c) What (functionally) is the relationship between ports, sockets, and routing?
I understand this may be a "rookie question", but I honestly don't know.
Can someone please BREAK THIS DOWN?
I have already:
i) Set up a server.rb file that listens on localhost:2000, but I'm not sure how to create a proper route or connection that lets someone use Postman to hit a valid route and get my data.
require 'socket'
require 'mechanize'
require 'awesome_print'
port = ENV.fetch("PORT",2000).to_i
server = TCPServer.new(port)
puts "Listening on port #{port}..."
puts "Current Time : #{Time.now}"
loop do
  client = server.accept
  client.puts "= Running Web Server ="
  general_sites = [
    "https://www.lovebscott.com/",
    "https://bleacherreport.com/",
    "https://balleralert.com/",
    "https://peopleofcolorintech.com/",
    "https://afrotech.com/",
    "https://bossip.com/",
    "https://www.itsonsitetv.com/",
    "https://theshaderoom.com/",
    "https://shadowandact.com/",
    "https://hollywoodunlocked.com/",
    "https://www.essence.com/",
    "http://karencivil.com/",
    "https://www.revolt.tv/"
  ]
  holder = []
  agent = Mechanize.new
  general_sites.each do |site|
    page = agent.get(site)
    newRet = page.search('a')
    newRet.each do |e|
      data = e.attr('href').to_s
      if data.length > 50
        holder.push(data)
      end
    end
    pp holder.length.to_s + " [ posts total] ==> Now Scraping --> " + site
  end
  client.write(holder)
  client.close
end
In Rails you don't spin up a web server manually; that is done for you by rackup, unicorn, puma, or any other compatible application server.
Rails itself never "talks" to HTTP clients directly. It is just a specific application that exposes a Rack-compatible API: basically an object that responds to call(hash) and returns [integer, hash, enumerable_of_strings]. The app server reads the request data from Unix/TCP sockets and calls your application.
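To make the Rack contract and the role of the socket concrete, here is a minimal sketch in plain Ruby (no Rails and no rack gem). The names SCRAPED_LINKS, APP, and handle_connection, and the /links path, are hypothetical, and the HTTP parsing is deliberately naive; real app servers such as puma do this far more carefully.

```ruby
require 'socket'
require 'json'

# Hypothetical stand-in for whatever the scraper produced.
SCRAPED_LINKS = ['https://example.com/some-long-article-url'].freeze

REASONS = { 200 => 'OK', 404 => 'Not Found' }.freeze

# A Rack-style app: responds to call(env) and returns
# [status, headers, enumerable_of_strings]. "Routing" is just a
# decision based on the HTTP verb and the request path.
APP = lambda do |env|
  if env['REQUEST_METHOD'] == 'GET' && env['PATH_INFO'] == '/links'
    [200, { 'content-type' => 'application/json' }, [JSON.generate(SCRAPED_LINKS)]]
  else
    [404, { 'content-type' => 'text/plain' }, ["Not Found\n"]]
  end
end

# Toy "app server" step: read one HTTP request from a connected socket,
# turn it into an env hash, call the app, and write an HTTP response.
def handle_connection(client, app)
  verb, target, = client.gets.to_s.split(' ')   # e.g. "GET /links HTTP/1.1"
  # Skip the request headers; they end with an empty line.
  while (line = client.gets) && line != "\r\n"
  end
  path = target.to_s.split('?', 2).first

  status, headers, body = app.call('REQUEST_METHOD' => verb, 'PATH_INFO' => path)
  payload = +''
  body.each { |chunk| payload << chunk }

  client.write "HTTP/1.1 #{status} #{REASONS.fetch(status, 'Unknown')}\r\n"
  headers.each { |name, value| client.write "#{name}: #{value}\r\n" }
  client.write "content-length: #{payload.bytesize}\r\nconnection: close\r\n\r\n"
  client.write payload
ensure
  client.close
end
```

To try it, something like server = TCPServer.new(2000) followed by loop { handle_connection(server.accept, APP) } should let Postman or curl fetch http://localhost:2000/links. The port is where the operating system delivers incoming connections, the socket is the open connection you read from and write to, and routing is what the application does with the verb and path once the request has been parsed.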
If you want to expose your scraper to an external consumer (provided it's fast enough), you can create a controller with a method that accepts some data, runs the scraper, and finally renders the scraping results back in some structured way. Then, in the router, you connect a URL to your controller method.
# config/routes.rb
post 'scrape/me', to: 'my_controller#scrape'

# app/controllers/my_controller.rb
class MyController < ApplicationController
  def scrape
    site = params[:site]
    results = MyScraper.run(site)
    render json: results
  end
end
Then a simple POST yourserver/scrape/me?site=www.example.com will return your data.
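The controller above calls MyScraper.run, which is not defined in this answer. A rough sketch of what it could look like follows, assuming it lives somewhere Rails autoloads, for example app/services/my_scraper.rb (a common choice, not a rule Rails mandates). The fetcher: keyword, the scheme normalization, and the returned hash shape are assumptions made for illustration; the "longer than 50 characters" filter comes from the question's code.

```ruby
# app/services/my_scraper.rb (hypothetical location)
class MyScraper
  MIN_LENGTH = 50

  # Default fetcher: returns every href found on the page.
  # Requires the mechanize gem; loaded lazily so the class can be
  # exercised with a stub fetcher when the gem is not installed.
  DEFAULT_FETCHER = lambda do |url|
    require 'mechanize'
    Mechanize.new.get(url).search('a').map { |a| a.attr('href') }
  end

  # Returns a hash that `render json:` can serialize directly.
  def self.run(site, fetcher: DEFAULT_FETCHER)
    # Assumption: callers may pass "www.example.com" without a scheme.
    url = site.to_s.start_with?('http://', 'https://') ? site.to_s : "https://#{site}"

    links = fetcher.call(url)
                   .map(&:to_s)                              # nil hrefs become ""
                   .select { |href| href.length > MIN_LENGTH } # same filter as the question
                   .uniq                                     # added here: drop repeated links

    { site: url, links: links }
  end
end
```

Passing the fetcher in keeps the filtering logic separate from the network call, so you can check it with canned data, e.g. MyScraper.run('www.example.com', fetcher: ->(_url) { ['/short'] }).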