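Extracting values that a webpage returns in dictionary format using Python
I want to scrape the name, phone number, and email from a webpage, but it seems all of the details are inside a dictionary. Could someone please correct me? I am confused about how to extract these values into specific columns. Here is the code: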
import requests
from bs4 import BeautifulSoup
from csv import writer
url ='https://mainapi.dadeschools.net/api/v1/employees?limit=10&skip=0&sortDesc=false'
R = requests.get(url)
soup = BeautifulSoup(R.text, 'html.parser')
print(soup)
with open('school.csv', 'a', encoding='utf8', newline='') as f:
    thewriter = writer(f)
    header = ['Name', 'Location', 'Phone Number', 'Email']
    thewriter.writerow(header)
    thewriter.writerow(soup)
You do not need BeautifulSoup here. As mentioned, simply request the API and write the JSON to CSV via csv.DictWriter.
import requests, csv
url ='https://mainapi.dadeschools.net/api/v1/employees?limit=10&skip=0&sortDesc=false'
data = requests.get(url).json()['items']
data
with open('my.csv', 'w', newline='') as output_file:
    dict_writer = csv.DictWriter(output_file, data[0].keys())
    dict_writer.writeheader()
    dict_writer.writerows(data)
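If you only want some of the fields as specific columns, as the question asks, something like the following might work. This is only a sketch: the key names ("name", "location", "phone", "email") and the sample records are assumptions made up for illustration, so check the real keys with data[0].keys() first. It writes to an in-memory buffer so it runs without calling the API.

```python
import csv
import io

# Hypothetical sample shaped like the API's 'items' list.
# The real key names may differ; inspect data[0].keys() to find them.
sample_items = [
    {"name": "Jane Doe", "location": "Example School",
     "phone": "305-555-0100", "email": "jane@example.org", "id": 1},
    {"name": "John Roe", "location": "Example Office",
     "phone": "305-555-0101", "email": "john@example.org", "id": 2},
]

# Map the assumed JSON keys to the CSV headers used in the question.
COLUMNS = {
    "name": "Name",
    "location": "Location",
    "phone": "Phone Number",
    "email": "Email",
}


def write_selected(items, out_file, columns=COLUMNS):
    """Write only the mapped keys of each dict as one CSV row."""
    dict_writer = csv.DictWriter(out_file, fieldnames=list(columns.values()))
    dict_writer.writeheader()
    for item in items:
        # item.get(...) leaves a cell empty when a key is missing.
        dict_writer.writerow(
            {header: item.get(key, "") for key, header in columns.items()}
        )


buffer = io.StringIO()
write_selected(sample_items, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

To write a real file, pass a handle opened with open('my.csv', 'w', newline='') instead of the buffer, and pass the 'items' list from the API response instead of sample_items.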
As Barry the Platipus mentioned, going with pandas is another option, and there are several ways to do it:
import pandas as pd
pd.json_normalize(
    pd.read_json('https://mainapi.dadeschools.net/api/v1/employees?limit=10&skip=0&sortDesc=false')['items']
).to_csv('my.csv', index=False)
or
pd.DataFrame(
    pd.read_json('https://mainapi.dadeschools.net/api/v1/employees?limit=10&skip=0&sortDesc=false')['items']
    .values.tolist()
).to_csv('my.csv', index=False)
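With pandas you can also keep just the columns you care about and rename them before exporting. A minimal sketch, assuming the records carry keys called "name", "phone", and "email" (hypothetical names and sample data, not confirmed against the API):

```python
import pandas as pd

# Hypothetical records standing in for the API's 'items' list.
sample_items = [
    {"name": "Jane Doe", "phone": "305-555-0100",
     "email": "jane@example.org", "id": 1},
    {"name": "John Roe", "phone": "305-555-0101",
     "email": "john@example.org", "id": 2},
]

# Flatten the list of dicts into a DataFrame, one row per record.
df = pd.json_normalize(sample_items)

# Keep only the wanted columns and give them readable headers.
selected = df[["name", "phone", "email"]].rename(
    columns={"name": "Name", "phone": "Phone Number", "email": "Email"}
)

# With no path argument, to_csv returns the CSV as a string.
csv_text = selected.to_csv(index=False)
print(csv_text)
```

Calling selected.to_csv('my.csv', index=False) writes the same content to a file instead.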