
How to format web scraping output as table in a CSV?

I am extracting some data from a website that, in theory, presents it as a table.

import requests
from bs4 import BeautifulSoup

cookies = {
    'SISWEB-PUBLIC': 'ORA_WWV-RMvAbLGLSxXJOqOTipG30k1M',
    '_ga': 'GA1.3.825042167.1579292801',
    '_pk_id.11.6e3e': '31091343e8e5c6a9.1579292805.14.1605535420.1584973016.',
    '_pk_ref.11.6e3e': '%5B%22%22%2C%22%22%2C1605535420%2C%22https%3A%2F%2Fwww.google.com%2F%22%5D',
    '_gid': 'GA1.3.532866579.1610911359',
    '_gat_gtag_UA_139253076_4': '1',
}

headers = {
    'Connection': 'keep-alive',
    'Accept': 'text/html, */*; q=0.01',
    'X-Requested-With': 'XMLHttpRequest',
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.141 Safari/537.36',
    'Content-Type': 'application/x-www-form-urlencoded; charset=UTF-8',
    'Origin': 'https://sisweb.tesouro.gov.br',
    'Sec-Fetch-Site': 'same-origin',
    'Sec-Fetch-Mode': 'cors',
    'Sec-Fetch-Dest': 'empty',
    'Referer': 'https://sisweb.tesouro.gov.br/apex/f?p=2691:2&minimal=full&font=opensans',
    'Accept-Language': 'en-US,en;q=0.9,pt-BR;q=0.8,pt;q=0.7',
}

data = {
  'p_json': '{"salt":"284140192213841769741724635899547408701","pageItems":{"itemsToSubmit":[{"n":"P2_TIPO_LEILAO","v":"1"},{"n":"P2_TIPO_TITULO","v":"1"},{"n":"P2_PESQUISAR","v":"S","ck":"PCb5bs5LDIDvee0z7u0Uj6YkpPyJBARj2dYQ4WkxnaxN599CNVbrf6gulSAHSU5lQmuIPDpNOaTQUQaUXgpU5Q"},{"n":"P2_DATA_INICIAL","v":"14/01/2021"},{"n":"P2_DATA_FINAL","v":"18/01/2021"}],"protected":"U3PMYyQfm1IU1I_Cn_7v3g","rowVersion":""}}',
  'p_flow_id': '2691',
  'p_flow_step_id': '2',
  'p_instance': '16388465980453',
  'p_page_submission_id': '284140192213841769741724635899547408701',
  'p_request': 'PESQUISAR',
  'p_reload_on_submit': 'A'
}

response = requests.post('https://sisweb.tesouro.gov.br/apex/wwv_flow.accept', headers=headers, cookies=cookies, data=data)

I would like to know how to get the output (the response) into a csv file formatted as a table, or into something that lets me view this output as a table. Thanks!

Since the data in your example is in JSON format, it is easy to convert it to a Python dictionary with the json.load() function from the json package (or json.loads() if you have a string rather than a file).

import json

with open(filename) as json_file:  # filename: path to your JSON file
    data = json.load(json_file)

You can then access the data as a standard Python dictionary.
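As a minimal sketch of that dictionary access, the snippet below parses a shortened copy of the 'p_json' string from the question with json.loads(). The string is trimmed for illustration only, and the names `payload`, `items` and `values_by_name` are made up here; the response the server sends back may be structured quite differently.

```python
import json

# Shortened copy of the 'p_json' value from the question (illustration only).
p_json = (
    '{"salt":"284140192213841769741724635899547408701",'
    '"pageItems":{"itemsToSubmit":['
    '{"n":"P2_TIPO_LEILAO","v":"1"},'
    '{"n":"P2_TIPO_TITULO","v":"1"},'
    '{"n":"P2_DATA_INICIAL","v":"14/01/2021"},'
    '{"n":"P2_DATA_FINAL","v":"18/01/2021"}]}}'
)

payload = json.loads(p_json)  # JSON text -> Python dict

# Nested values are reached with ordinary dict and list indexing.
items = payload["pageItems"]["itemsToSubmit"]

# Flatten the list of {"n": name, "v": value} pairs into one dict.
values_by_name = {item["n"]: item["v"] for item in items}

print(values_by_name["P2_DATA_INICIAL"])  # 14/01/2021
```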

Alternatively, convert the scraped data into a dictionary yourself; you can then write it directly to a csv file.
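One possible way to do that conversion is sketched below, assuming the response body contains an ordinary HTML table (the question does not show the response, so this is a guess). To stay free of extra packages the sketch uses the standard library's html.parser instead of the BeautifulSoup import from the question. The class `TableParser`, the sample markup and its column names are hypothetical.

```python
import csv
import io
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collect the text of every <th>/<td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


# Hypothetical stand-in for response.text from the question.
sample_html = """
<table>
  <tr><th>col_a</th><th>col_b</th></tr>
  <tr><td>1</td><td>x</td></tr>
  <tr><td>2</td><td>y</td></tr>
</table>
"""

parser = TableParser()
parser.feed(sample_html)

# Treat the first row as the header and turn the rest into dictionaries.
header, *body = parser.rows
records = [dict(zip(header, row)) for row in body]

# StringIO stands in for open('output.csv', 'w', newline='').
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=header)
writer.writeheader()
writer.writerows(records)

print(buffer.getvalue())
```

With the question's code, `parser.feed(response.text)` would take the place of the sample markup, provided the response really is HTML with a table in it.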

You can then use the csv module to write the data you are interested in to a csv file. If your data is in a dictionary, it is recommended to use the csv.DictWriter() API, which maps each dictionary key to the csv column of the same name in the header.

Usage example:

import csv

with open('names.csv', 'w', newline='') as csvfile:
    fieldnames = ['first_name', 'last_name']
    writer = csv.DictWriter(csvfile, fieldnames=fieldnames)

    writer.writeheader()
    writer.writerow({'first_name': 'Baked', 'last_name': 'Beans'})
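Scraped records do not always share the same keys. The sketch below shows one way to handle that with csv.DictWriter: collect the column names from all records first, then let `restval` fill the gaps. The sample records and the extra `age` column are made up for illustration.

```python
import csv
import io

# Made-up records with differing keys (illustration only).
records = [
    {"first_name": "Baked", "last_name": "Beans"},
    {"first_name": "Lovely"},
    {"first_name": "Wonderful", "last_name": "Spam", "age": "3"},
]

# Build the header as the union of all keys, in first-seen order.
fieldnames = []
for record in records:
    for key in record:
        if key not in fieldnames:
            fieldnames.append(key)

# StringIO stands in for open('names.csv', 'w', newline='').
buffer = io.StringIO()

# restval is written for any key a record does not have.
writer = csv.DictWriter(buffer, fieldnames=fieldnames, restval="")
writer.writeheader()
writer.writerows(records)

print(buffer.getvalue())
```

If the header were fixed in advance instead, passing `extrasaction='ignore'` would make DictWriter skip unexpected keys rather than raise a ValueError.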

I hope this is the hint you were looking for.

