Insert every value of a pandas DataFrame into a MySQL database
I have a Python script that scrapes domain names. I parse the JSON response because it renders HTML code. I use pandas to read the HTML and get the body, which is the HTML content. When I print it, I get the correct values. Now that I have it, I want to save every result in a MySQL database. How could I achieve that?
Here is my script:
import json
import time

import mysql.connector
import pandas as pd
from selenium import webdriver

driver = webdriver.Chrome()  # browser setup (assuming Chrome)

mydb = mysql.connector.connect(
    host="localhost",
    user="root",
    passwd="",
    database='domainscrape'
)
mycursor = mydb.cursor()
print(mydb)

pageNumber = 0
while True:
    driver.implicitly_wait(3)
    driver.get('https://reversewhois.domaintools.com/?ajax=mReverseWhois&call=ajaxGetPreviewPage&q=%5B%5B%5B%22whois%22%2C%222%22%2C%22VerifiedID%40SG-Mandatory%22%5D%5D%5D&o='+str(pageNumber))
    time.sleep(3)
    pre = driver.find_element_by_tag_name("pre").text
    data = json.loads(pre)
    if data['body']:
        table = data['body']
        tables = pd.read_html(table, skiprows=1)
        df = tables[-1]
        print(df.to_string(index=False))
        pageNumber += 1
    else:
        break
I got a result like this:
0vh-cl0ud.sg 2017-10-12 KEY-SYSTEMS GMBH
0vh-cloud.sg 2017-10-12 KEY-SYSTEMS GMBH
0vhcloud.sg 2017-10-12 KEY-SYSTEMS GMB
I tried saving it to a CSV file and got good results:
df.to_csv('Domains.csv', mode='a', sep=',',index=False)
but I don't want to import the CSV into MySQL. I just want to insert the rows directly into an existing MySQL table.
How could I format it so that 0vh-cl0ud.sg is the domain, 2017-10-12 is the date, and KEY-SYSTEMS GMBH is the company? I did not include the header, since it prints the header on every iteration and I don't want that.
It should be something like this:
mycursor = mydb.cursor()
mycursor.execute("INSERT INTO table_name(domain, date, company) VALUES ('0vh-cl0ud.sg', '2017-10-12', 'KEY-SYSTEMS GMBH')")
mydb.commit()
This piece should be put in the loop after the data is scraped. Please go through the aforementioned links in the comments for a better understanding of the process.
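To avoid building the SQL string by hand for each row, a minimal sketch of how the loop could insert every row of the scraped DataFrame with a parameterized query is below. The table name `domains` and the column names `domain`, `date`, `company` are assumptions here; adjust them to your actual schema.

```python
import pandas as pd

def dataframe_to_rows(df):
    """Turn the 3-column scraped DataFrame into (domain, date, company) tuples."""
    df = df.copy()
    df.columns = ["domain", "date", "company"]  # assumed column order
    return list(df.itertuples(index=False, name=None))

def insert_rows(mydb, rows):
    """Bulk-insert the rows; assumes a table named domains(domain, date, company)."""
    # %s placeholders let the connector handle quoting/escaping
    sql = "INSERT INTO domains (domain, date, company) VALUES (%s, %s, %s)"
    cur = mydb.cursor()
    cur.executemany(sql, rows)
    mydb.commit()  # mysql.connector does not autocommit by default
```

In the scraping loop, right after `df = tables[-1]`, you would call `insert_rows(mydb, dataframe_to_rows(df))` so each page of results is committed as it is scraped.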