Second Python code: bs4 and saving data to a text file
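I have managed to write my second piece of Python code. I am trying to extract data from a table on a web page, www.ksmnet.org. I need the data from the second column of the table, which is today's date, and I have extracted it successfully. However, I need to save each value as a text file named after the first column and containing the data from the second column. For example, if Fajr is 05:00, I need a text file saved as Fajr.txt that contains 05:00.
I know that some of the times do not contain the ":" symbol, and I need to convert them. For example, a value such as 06.00 needs to become 06:00.
Here is my code: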
# import libraries
import json
import urllib.request
#import soupsieve
from bs4 import BeautifulSoup
import requests
url = 'https://www.ksmnet.org/'
request = urllib.request.Request(url)
response = urllib.request.urlopen(request)
html = response.read()
soup = BeautifulSoup(html.decode("utf-8"), "html.parser")
path = '/srv/docker/homeassistant/prayer/'
table = soup.find('div', id={'prayer': 'listing sortable'})
package = '' ; version = ''
for i in table.select('tr'):
    data = i.select('td')
    if data:
        package = data[0].text.strip()
        version = ' '.join(data[1].text.strip().split())
        print(version)
Can anyone help?
Thanks
Here is code that writes the values to text files.
import urllib.request
from bs4 import BeautifulSoup

url = 'https://www.ksmnet.org/'
request = urllib.request.Request(url)
response = urllib.request.urlopen(request)
html = response.read()
soup = BeautifulSoup(html.decode("utf-8"), "html.parser")
path = '/srv/docker/homeassistant/prayer/'  # kept from the question; not used below
table = soup.find('div', id={'prayer': 'listing sortable'})
for i in table.select('tr'):
    data = i.select('td')
    if data:
        # First column: the label; [:-1] below drops its last character
        package = data[0].text.strip()
        # Second column: today's time, with '.' replaced by ':'
        version = data[1].text.strip().replace('.', ':')
        # One file per row (e.g. Fajr.txt), written to the current working directory
        with open(package[:-1] + ".txt", "w") as file:
            file.write(version)
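Note that the code above writes into the current working directory, while the `path` variable from the question is never used. If the files are meant to land in that directory, something like the following could be used. This is only a sketch: `write_time_file` is a hypothetical helper name, and it assumes the label may end with a trailing colon (for example "Fajr:"), which it strips instead of slicing off the last character.

```python
from pathlib import Path


def write_time_file(directory, label, value):
    """Write value to <directory>/<label>.txt and return the file path.

    Hypothetical helper, not part of the original answer.
    """
    folder = Path(directory)
    folder.mkdir(parents=True, exist_ok=True)  # create the folder if it is missing
    name = label.strip().rstrip(':')           # assumption: labels look like "Fajr:"
    target = folder / f"{name}.txt"
    target.write_text(value, encoding="utf-8")
    return target
```

Inside the loop it would be called as `write_time_file(path, package, version)` in place of the `open(...)` and `write(...)` lines.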
Alternatively, you can use pandas.
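import pandas as pd

url = 'https://www.ksmnet.org/'
# read_html returns a list of DataFrames, one per table found; take the first
df = pd.read_html(url)[0]
# 'Date' holds the labels; '06/01' is a date column header and will differ on other days
for pkg, version in zip(df['Date'], df['06/01']):
    version = version.replace('.', ':').strip()
    # [:-1] drops the last character of the label before building the file name
    with open(pkg[:-1] + ".txt", "w") as file:
        file.write(version)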
Update: added checks on the hour and minute digits.
import urllib.request
from bs4 import BeautifulSoup

url = 'https://www.ksmnet.org/'
request = urllib.request.Request(url)
response = urllib.request.urlopen(request)
html = response.read()
soup = BeautifulSoup(html.decode("utf-8"), "html.parser")
path = '/srv/docker/homeassistant/prayer/'  # kept from the question; not used below
table = soup.find('div', id={'prayer': 'listing sortable'})
for i in table.select('tr'):
    data = i.select('td')
    if data:
        package = data[0].text.strip()
        version = data[1].text.strip().replace('.', ':')
        # Check hours: pad a single-digit hour with a leading zero
        checkHours = version.split(':')[0]
        if len(checkHours) < 2:
            version = "0" + checkHours + ':' + version.split(':')[-1]
        # Check minutes: pad a single-digit minute with a leading zero
        checkMinute = version.split(':')[-1]
        if len(checkMinute) < 2:
            version = version.split(':')[0] + ":" + "0" + checkMinute
        print(version)
        # One file per row, written to the current working directory
        with open(package[:-1] + ".txt", "w") as file:
            file.write(version)
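The separator replacement and the two digit checks above can also be folded into one small function, which makes the conversion easy to try out without fetching the page. This is a minimal sketch, not part of the original answer: `normalize_time` is a hypothetical name, and it assumes each value contains exactly one "." or ":" between hours and minutes.

```python
def normalize_time(raw):
    """Turn values such as '6.5' or '06.00' into zero-padded 'HH:MM'.

    Assumes a single '.' or ':' separates hours from minutes.
    """
    hours, _, minutes = raw.strip().replace('.', ':').partition(':')
    # zfill pads with leading zeros, matching the hour and minute checks above
    return f"{hours.zfill(2)}:{minutes.zfill(2)}"


print(normalize_time("06.00"))  # 06:00
print(normalize_time("5:30"))   # 05:30
```

In the loop, `version = normalize_time(data[1].text)` would then replace the `replace(...)` call and both checks.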