Need help counting strings under column headers of a Pandas DataFrame with timelines
I want to count how often each string occurs under different column headers in a Pandas DataFrame, using a Pandas pivot table, so that I can analyze monthly trends. The data looks like this:
name  age  city        Date                  country   hight  MessageList  gender
Tom   10   NewYork     1/1/2021 08:35:58Z    US        NaN    X List       Male
Mark  5    London      5/1/2021 08:35:58Z    UK        NaN    X List       Male
Pam   7    London      3/6/2021 08:35:58Z    UK        NaN    Y List       Female
Tom   18   California  4/6/2021 08:35:58Z    US        163    Y List       Male
Lena  23   NewYork     12/12/2020 08:35:58Z  US        NaN    Y List       Female
Ben   17   Colombo     11/12/2020 08:35:58Z  Srilanka  NaN    X List       Male
Lena  23   Paris       8/1/2020 08:35:58Z    France    NaN    Y List       Female
Ben   51   Colombo     7/1/2020 08:35:58Z    Srilanka  NaN    Z List       Male
Tom   18   Paris       1/1/2021 08:35:58Z    France    NaN    Z List       Male
Mark  5    Paris       5/1/2021 08:35:58Z    Japan     NaN    Z List       Male
Tom   18   London      3/6/2021 08:35:58Z    UK        NaN    X List       Male
Tom   18   Paris       4/6/2021 08:35:58Z    France    163    Z List       Male
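import pandas as pd

# Count rows per (name, city). The aggregation is passed by name ('count'),
# not as a called function, and it counts a column that is not a grouping key.
table = pd.pivot_table(df, values='Date', index=['name', 'city'],
                       aggfunc='count')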
I am new to Pandas and am struggling to get the string counts broken down by month.
I am looking for output like this:
2020 2021
Name Nov Dec Jan Feb
Tom
Paris 3 1 2 3
Colombo 2 3 3
London 4 1 4 2
Mark
Colombo 1 3 1
London 3 3 2 2
Pam
California 3 1 1
NewYork 1 4 2
Len
London 1 2 2 1
Use crosstab with monthly periods created by Series.dt.to_period. The resulting columns are a PeriodIndex, so you can build a MultiIndex in the output from PeriodIndex.year and PeriodIndex.strftime:
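# Parse the Date strings (read month-first) into datetimes
df['Date'] = pd.to_datetime(df['Date'])
# Count rows per (name, city) for each monthly period
table = pd.crosstab([df['name'], df['city']], df['Date'].dt.to_period('M'))
# Replace the period columns with a (year, month abbreviation) MultiIndex
table.columns = [table.columns.year, table.columns.strftime('%b')]
print(table)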
Date            2020            2021
Date             Jul Aug Nov Dec Jan Mar Apr May
name city
Ben  Colombo       1   0   1   0   0   0   0   0
Lena NewYork       0   0   0   1   0   0   0   0
     Paris         0   1   0   0   0   0   0   0
Mark London        0   0   0   0   0   0   0   1
     Paris         0   0   0   0   0   0   0   1
Pam  London        0   0   0   0   0   1   0   0
Tom  California    0   0   0   0   0   0   1   0
     London        0   0   0   0   0   1   0   0
     NewYork       0   0   0   0   1   0   0   0
     Paris         0   0   0   0   1   0   1   0
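Since the question asks for a pivot, here is a minimal sketch of the same count with pivot_table. It makes a few assumptions that are not in the original post: it uses only a six-row subset of the sample data, it reads the dates month-first, the helper column `month` and the variable `all_months` are illustrative names, and the reindex step (adding months that have no rows, such as Feb 2021) is an optional extra for a continuous trend.

```python
import pandas as pd

# A subset of the question's sample rows; only the columns used below.
df = pd.DataFrame({
    "name": ["Tom", "Mark", "Pam", "Tom", "Lena", "Ben"],
    "city": ["NewYork", "London", "London", "California", "NewYork", "Colombo"],
    "Date": ["1/1/2021 08:35:58Z", "5/1/2021 08:35:58Z", "3/6/2021 08:35:58Z",
             "4/6/2021 08:35:58Z", "12/12/2020 08:35:58Z", "11/12/2020 08:35:58Z"],
})

# Parse the timestamps as UTC, then drop the timezone before converting to periods.
dates = pd.to_datetime(df["Date"], utc=True).dt.tz_localize(None)
df["month"] = dates.dt.to_period("M")  # hypothetical helper column

# Count rows per (name, city) and month; empty combinations become 0.
table = pd.pivot_table(df, index=["name", "city"], columns="month",
                       values="Date", aggfunc="count", fill_value=0).astype(int)

# Optional: add all-zero columns for months between the first and last
# observed month that have no rows, so the trend has no gaps.
all_months = pd.period_range(table.columns.min(), table.columns.max(), freq="M")
table = table.reindex(columns=all_months, fill_value=0)

# Split the period columns into a (year, month abbreviation) MultiIndex.
table.columns = [table.columns.year, table.columns.strftime("%b")]
print(table)
```

For a plain count, crosstab is the shorter route; pivot_table becomes more useful when you want a different aggregation (for example a sum or mean of another column) in the same layout.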