简体   繁体   中英

How to get CSV data (as obtained from GoogleSheets - Google Python API) into a table-format (dataframe)?

I have written a piece of code that gets data from a Google-Sheets doc through a path defined by IFTTT, DialogFlow and Hiroku, this happens through this snippet:

# Finding a workbook by name and opening the first sheet
sheet = client.open("IFTTT_Webhooks").sheet1

# Extract and print all of the values
list_of_answers = sheet.get_all_records()

#put the values from the list_of_answers into a csv and store locallyy.
with open('user_answers.csv', 'w') as myfile:
    wr = csv.writer(myfile, quoting=csv.QUOTE_ALL)
    wr.writerow(list_of_answers)

If I open 'user_answers.csv' the csv looks as such:

[{'Date/Time': 'February 19, 2020 at 09:49PM', 'Intent': 'poll_completed', 'Statement 1': 'yes', 'Statement 2': 'Sometimes', 'Statement 3': 'sometimes', 'Statement 4': 'yes'}, {'Date/Time': 'February 19, 2020 at 09:50PM', 'Intent': 'poll_completed', 'Statement 1': 'yes', 'Statement 2': 'never', 'Statement 3': 'never', 'Statement 4': 'no'}, {'Date/Time': 'February 19, 2020 at 10:48PM', 'Intent': 'poll_completed', 'Statement 1': 'yes', 'Statement 2': 'often', 'Statement 3': 'sometimes', 'Statement 4': 'yes'}, {'Date/Time': 'February 20, 2020 at 12:14AM', 'Intent': 'poll_completed', 'Statement 1': 'now', 'Statement 2': 'often', 'Statement 3': 'sometimes', 'Statement 4': 'yes'}]

How can I get this data in a table-format with the columns as such and their respective values:

Date/time,                    Statement 1, Statement 2, Statement 3, Statement 4
February 19, 2020 at 09:49PM.  yes         Sometimes    Sometimes.   no

It's ok if the csv format stays as is. But I want to be able to work with data more effectively in another file. So I want to create I guess a dataframe that considers this new (requested) table-format. Any help is appreciated.

You have list of dictionares so you should use DictWriter()

  • first I get keys from first row to have headers
  • next I create writer which can work with dictionares
  • finally I write headers and all data

Code:

import csv

data = [{'Date/Time': 'February 19, 2020 at 09:49PM', 'Intent': 'poll_completed', 'Statement 1': 'yes', 'Statement 2': 'Sometimes', 'Statement 3': 'sometimes', 'Statement 4': 'yes'}, {'Date/Time': 'February 19, 2020 at 09:50PM', 'Intent': 'poll_completed', 'Statement 1': 'yes', 'Statement 2': 'never', 'Statement 3': 'never', 'Statement 4': 'no'}, {'Date/Time': 'February 19, 2020 at 10:48PM', 'Intent': 'poll_completed', 'Statement 1': 'yes', 'Statement 2': 'often', 'Statement 3': 'sometimes', 'Statement 4': 'yes'}, {'Date/Time': 'February 20, 2020 at 12:14AM', 'Intent': 'poll_completed', 'Statement 1': 'now', 'Statement 2': 'often', 'Statement 3': 'sometimes', 'Statement 4': 'yes'}]

headers = data[0].keys()

with open('user_answers.csv', 'w') as f:
    csv_writer = csv.DictWriter(f, headers)
    csv_writer.writeheader()
    csv_writer.writerows(data)

You can also use pandas.DataFrame directly

import pandas as pd

data = [{'Date/Time': 'February 19, 2020 at 09:49PM', 'Intent': 'poll_completed', 'Statement 1': 'yes', 'Statement 2': 'Sometimes', 'Statement 3': 'sometimes', 'Statement 4': 'yes'}, {'Date/Time': 'February 19, 2020 at 09:50PM', 'Intent': 'poll_completed', 'Statement 1': 'yes', 'Statement 2': 'never', 'Statement 3': 'never', 'Statement 4': 'no'}, {'Date/Time': 'February 19, 2020 at 10:48PM', 'Intent': 'poll_completed', 'Statement 1': 'yes', 'Statement 2': 'often', 'Statement 3': 'sometimes', 'Statement 4': 'yes'}, {'Date/Time': 'February 20, 2020 at 12:14AM', 'Intent': 'poll_completed', 'Statement 1': 'now', 'Statement 2': 'often', 'Statement 3': 'sometimes', 'Statement 4': 'yes'}]

df = pd.DataFrame(data)
df.to_csv('user_answers.csv', index=False)

print(df)

Ive not pulled from google but maybe change the code to something like this so it pulls directly into a dataframe?

import gspread
from oauth2client.service_account import ServiceAccountCredentials
import pandas as pd

scope = ['https://spreadsheets.google.com/feeds',
         'https://www.googleapis.com/auth/drive']

credentials = ServiceAccountCredentials.from_json_keyfile_name(
         'ForMedium-8f02213e02aa.json', scope) # Your json file here

gc = gspread.authorize(credentials)

wks = gc.open("NYC subway data").sheet1

data = wks.get_all_values()
headers = data.pop(0)

df = pd.DataFrame(data, columns=headers)

list of answers is a list of dictionaries. That means that you should better use a DictWriter here.

...
#put the values from the list_of_answers into a csv and store locallyy.
with open('user_answers.csv', 'w', newline='') as myfile:
    wr = csv.DictWriter(myfile, quoting=csv.QUOTE_ALL, fieldnames = 
                        ['Date/time', 'Statement 1', 'Statement 2', 'Statement 3', 'Statement 4'])
    wr.writeheader()
    wr.writerow(list_of_answers)

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM