
Separate Python web scraped data in different columns

I tried to scrape data using an API and write the results to a CSV file. But when I open the CSV file, all the data ends up in a single column (A). Instead, I want the data split across separate columns (A and B, and later C, D, E, F, etc. when I add more information). How can I do that?

import requests
import pandas as pd
from io import StringIO  # pandas.compat.StringIO is no longer available in newer pandas releases
import numpy as np
import datetime as dt
from dateutil.relativedelta import relativedelta
import csv

csv_file = open('/Users/katewang/Desktop/Test/scrape.csv', 'w')
csv_writer = csv.writer(csv_file)

def get_EOD_data(api_token='5cb671b0b4a790.35526238', session = None, tickers = 'AAPL', start_date = dt.datetime(2018,1,1), end_date = dt.datetime(2018,12,31)):
    symbols = tickers
    if session is None:
        session = requests.Session()

    url = 'https://eodhistoricaldata.com/api/eod/%s.US' % symbols
    params = {"api_token": api_token, "from": start_date, "to": end_date}
    r = session.get(url, params = params)
    if r.status_code == requests.codes.ok:
        cols=[0,5]
        df = pd.read_csv(StringIO(r.text), skipfooter = 1, parse_dates = [0], engine = 'python', na_values=['nan'], index_col = 0, usecols = cols)

        df.fillna(method = 'ffill', inplace = True)
        df.fillna(method = 'bfill', inplace = True)
        return df

def main():
    df_data = get_EOD_data()
    csv_writer.writerow([df_data])

if __name__ == '__main__':
    main()

csv_file.close()

I expect to see two separate columns.

You're seeing only one column because, of the two selected columns (0 and 5), you set column 0 as the index when creating the DataFrame. This leaves only column 5 as an actual column.
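To make the effect of index_col concrete, here is a minimal sketch that parses a small in-memory CSV twice, once with index_col = 0 and once without. The header names and the values in SAMPLE are made up for illustration and are only assumed to resemble the API response; no request is sent.

```python
from io import StringIO

import pandas as pd

# Hypothetical stand-in for r.text; header names and values are invented.
SAMPLE = (
    "Date,Open,High,Low,Close,Adjusted_close,Volume\n"
    "2018-01-02,1.0,2.0,0.5,1.5,1.4,100\n"
    "2018-01-03,1.5,2.5,1.0,2.0,1.9,200\n"
)

# Same column selection as in the question: columns 0 and 5.
with_index = pd.read_csv(StringIO(SAMPLE), parse_dates=[0], index_col=0, usecols=[0, 5])
without_index = pd.read_csv(StringIO(SAMPLE), parse_dates=[0], usecols=[0, 5])

# With index_col=0 the dates become the index, so one data column remains.
print(list(with_index.columns))
# Without index_col both selected columns are ordinary columns.
print(list(without_index.columns))
```

Calling with_index.reset_index() would also turn the date index back into a regular column.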

You can check this for yourself by removing index_col = 0 from the line:

df = pd.read_csv(StringIO(r.text), skipfooter = 1, parse_dates = [0], engine = 'python', na_values=['nan'], index_col = 0, usecols = cols)
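Separately from the index question, the way the file is written also matters. The sketch below is a suggestion that goes beyond the answer above. It compares csv_writer.writerow([df_data]), which stores the printed DataFrame as a single field, with DataFrame.to_csv, which writes the index and each column as separate fields. The sample data is invented, and in-memory buffers stand in for the real file path.

```python
import csv
from io import StringIO

import pandas as pd

# Invented sample in place of the API response.
SAMPLE = "Date,Adjusted_close\n2018-01-02,1.4\n2018-01-03,1.9\n"
df = pd.read_csv(StringIO(SAMPLE), parse_dates=[0], index_col=0)

# As in the question: the list holds one item, so the row has one field.
one_cell = StringIO()
csv.writer(one_cell).writerow([df])
rows_one_cell = list(csv.reader(StringIO(one_cell.getvalue())))

# to_csv emits a header row plus one row per record, index included.
buffer = StringIO()
df.to_csv(buffer)
rows = list(csv.reader(StringIO(buffer.getvalue())))

print(len(rows_one_cell[0]))   # number of fields in the single written row
print([len(r) for r in rows])  # number of fields per row from to_csv
```

In the question's script, that would mean replacing the csv_writer call with df_data.to_csv('/Users/katewang/Desktop/Test/scrape.csv'), assuming pandas handles the file itself.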

