Python Pandas groupby, row value to column headers
I have a DataFrame that I want to transpose:
import pandas as pd
sid= '13HKQ0Ue1_YCP-pKUxFuqdiqgmW_AZeR7P3VsUwrCnZo' # spreadsheet id
gid = 0 # sheet unique id (0 equals sheet0)
url = 'https://docs.google.com/spreadsheets/d/{}/export?gid={}&format=csv'.format(sid,gid)
df = pd.read_csv(url)
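What I want is to use StoreName and CATegory as the column headers, with the weights and prices listed for each category.
Expected output:
I have tried loops and pandas but could not figure it out.
I think it can be done with df.groupby, but the object it returns is not a DataFrame.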
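For reference, a minimal sketch of why `df.groupby` does not hand back a DataFrame directly: it returns a lazy grouped object, and a DataFrame only appears once an aggregation is applied. The toy rows below are illustrative stand-ins for the spreadsheet (only the column names come from the post), and taking the mean is an assumption made for the sake of the example.

```python
import pandas as pd

# Toy data standing in for the spreadsheet. The column names follow the post;
# the store names, items, and prices are illustrative only.
df = pd.DataFrame({
    "StoreNAME": ["Store A", "Store A", "Store B"],
    "CAT": ["Flower", "Flower", "Edible"],
    "NAME": ["item 1", "item 2", "item 3"],
    "GRAM": [9.0, 8.0, None],
    "OUNCE": [69.0, 125.0, 30.0],
})

# groupby returns a grouped object, not a DataFrame.
grouped = df.groupby(["StoreNAME", "CAT"])
print(type(grouped).__name__)

# Applying an aggregation (here the mean, chosen only as an example)
# turns the grouped object back into a DataFrame indexed by the group keys.
mean_prices = grouped[["GRAM", "OUNCE"]].mean()
print(mean_prices)
```

Whether averaging is appropriate depends on the data; if every item has to be kept, an aggregation is the wrong tool and a reshape is a better fit.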
I get all of this from the JSON output of an API:
import pandas as pd
import json, requests
from cytoolz.dicttoolz import merge
page = requests.get(mainurl) # mainurl is the API endpoint (not shown in this post)
dict_dta = json.loads(page.text) # load in Python DICT
list_columns = ['id', 'name', 'category_name', 'ounce', 'gram', 'two_grams', 'quarter', 'eighth','half_ounce','unit','half_gram'] # get the unformatted output
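# pd.json_normalize replaces the deprecated pd.io.json.json_normalize (pandas >= 1.0);
# drop(columns=...) replaces the positional axis argument, which pandas 2.0 no longer accepts
df = pd.json_normalize(dict_dta, ['categories', ['items']]).pipe(lambda x: x.drop(columns='prices').join(x.prices.apply(lambda y: pd.Series(merge(y)))))[list_columns]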
df.to_csv('name')
I have tried many approaches. If anyone could point me in the right direction, it would be very helpful.
Is this heading in the right direction?
import pandas as pd
sid= '13HKQ0Ue1_YCP-pKUxFuqdiqgmW_AZeR7P3VsUwrCnZo' # spreadsheet id
gid = 0 # sheet unique id (0 equals sheet0)
url = 'https://docs.google.com/spreadsheets/d/{}/export?gid={}&format=csv'.format(sid,gid)
df = pd.read_csv(url)
for idx, dfx in df.groupby(df.CAT):
    if idx != 'Flower':
        continue
    df_test = dfx.drop(['CAT','NAME'], axis=1)
    df_test = df_test.rename(columns={'StoreNAME':idx}).set_index(idx).T
df_test
Returns:
Flower     Pueblo West Organics - Adult Use  Pueblo West Organics - Adult Use  \
UNIT                                    NaN                               NaN
HALFOUNCE                              15.0                              50.0
EIGHTH                                  NaN                              25.0
TWOGRAMS                                NaN                               NaN
QUARTER                                 NaN                              40.0
OUNCE                                  30.0                              69.0
GRAM                                    NaN                               9.0

Flower     Pueblo West Organics - Adult Use  Three Rivers Dispensary - REC  \
UNIT                                    NaN                            NaN
HALFOUNCE                              50.0                           75.0
EIGHTH                                 25.0                           20.0
TWOGRAMS                                NaN                            NaN
QUARTER                                40.0                           45.0
OUNCE                                  69.0                          125.0
GRAM                                    9.0                            8.0

Flower     Three Rivers Dispensary - REC
UNIT                                 NaN
HALFOUNCE                           75.0
EIGHTH                              20.0
TWOGRAMS                             NaN
QUARTER                             40.0
OUNCE                              125.0
GRAM                                 8.0
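One possible way to get store and category into the column headers without looping over groups is to move the identifying columns into the index and transpose. This is only a sketch under assumptions: the toy rows are illustrative (only the column names StoreNAME, CAT, NAME and the weight columns come from the post), and keeping NAME as a third header level is a choice made here so that stores with several items per category do not collide.

```python
import pandas as pd

# Toy data standing in for the spreadsheet. The column names follow the post;
# the store names, items, and prices are illustrative only.
df = pd.DataFrame({
    "StoreNAME": ["Store A", "Store A", "Store B"],
    "CAT": ["Flower", "Flower", "Edible"],
    "NAME": ["item 1", "item 2", "item 3"],
    "GRAM": [9.0, 8.0, None],
    "OUNCE": [69.0, 125.0, 30.0],
})

weight_cols = ["GRAM", "OUNCE"]

# Move the identifying columns into the index, keep only the weight columns,
# then transpose: weights become rows, (StoreNAME, CAT, NAME) become headers.
wide = df.set_index(["StoreNAME", "CAT", "NAME"])[weight_cols].T
print(wide)

# Select a single category from the column headers, e.g. 'Flower'.
flower = wide.xs("Flower", axis=1, level="CAT")
print(flower)
```

Because nothing is aggregated, every item keeps its own column; if one column per store and category is wanted instead, the rows would first have to be combined in some way (for example by a mean or a minimum), which is a separate decision.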