
Want to extract values from a webpage that returns them in dictionary format, using Python

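I want to scrape the name, phone number, and email from a web page, but the whole payload seems to be a dictionary. Could someone please correct me? I am confused about how to extract these values into specific columns. Here is the code: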

import requests
from bs4 import BeautifulSoup
from csv import writer

url ='https://mainapi.dadeschools.net/api/v1/employees?limit=10&skip=0&sortDesc=false'
R = requests.get(url)

soup = BeautifulSoup(R.text, 'html.parser')
print(soup)
with open('school.csv', 'a', encoding='utf8', newline='') as f:
    thewriter = writer(f)
    header = ['Name', 'Location', 'Phone Number', 'Email']
    thewriter.writerow(header)
    thewriter.writerow(soup)

You do not need BeautifulSoup here, as mentioned: the URL is an API endpoint that returns JSON, not HTML. Simply request the API and transform the JSON to CSV via csv.DictWriter.

Example

import csv
import requests

url = 'https://mainapi.dadeschools.net/api/v1/employees?limit=10&skip=0&sortDesc=false'
# The response is JSON; the employee records are the list under the 'items' key.
data = requests.get(url).json()['items']

# Use the keys of the first record as the CSV header row.
with open('my.csv', 'w', encoding='utf8', newline='') as output_file:
    dict_writer = csv.DictWriter(output_file, data[0].keys())
    dict_writer.writeheader()
    dict_writer.writerows(data)
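
The question asks for specific columns (name, location, phone number, email) rather than every key in the record. Here is a minimal sketch of that with csv.DictWriter. The key names name, location, phone, and email are hypothetical, so check the actual JSON for the real ones. The inline sample records stand in for the API response so the snippet runs without a network call:

```python
import csv
import io

# Hypothetical records standing in for requests.get(url).json()['items'].
# The real API's field names may differ; inspect one record first.
items = [
    {"id": 1, "name": "Ada Smith", "location": "North Campus",
     "phone": "305-555-0101", "email": "ada@example.org"},
    {"id": 2, "name": "Ben Jones", "location": "South Campus",
     "phone": "305-555-0102"},  # this record has no 'email' key
]

# Assumed mapping: key in the JSON record -> header text in the CSV.
columns = {
    "name": "Name",
    "location": "Location",
    "phone": "Phone Number",
    "email": "Email",
}


def items_to_csv(records, columns, out):
    """Write only the chosen keys of each record to the file-like object `out`."""
    writer = csv.DictWriter(
        out,
        fieldnames=list(columns),
        extrasaction="ignore",  # skip keys that are not listed (e.g. 'id')
        restval="",             # leave the cell empty when a key is missing
    )
    writer.writerow(columns)    # header row using the display names
    writer.writerows(records)


buffer = io.StringIO(newline="")
items_to_csv(items, columns, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

To write to disk instead, pass a file opened with open('school.csv', 'w', encoding='utf8', newline='') as the out argument.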

EDIT

As Barry the Platipus mentioned, there are also one or more ways to do this with pandas:

import pandas as pd

pd.json_normalize(
    pd.read_json('https://mainapi.dadeschools.net/api/v1/employees?limit=10&skip=0&sortDesc=false')['items']
).to_csv('my.csv', index=False)

or

pd.DataFrame(
    pd.read_json('https://mainapi.dadeschools.net/api/v1/employees?limit=10&skip=0&sortDesc=false')['items']\
    .values.tolist()
).to_csv('my.csv', index=False)
