Count unique occurrences per day from two or more columns in pandas

Question

I would like to count the unique occurrences of names per day across two columns:

import pandas as pd

df = pd.DataFrame({
    'ColA': ['john wick', 'bloody mary', 'peter pan', 'jeff bridges', 'billy boy'],
    'ColB': ['bloody mary', 'jeff bridges', 'billy boy', 'billy boy', 'john wick'],
    'date': ['2000-01-01', '2000-01-01', '2000-01-03', '2000-01-03', '2000-01-03'],
})
# Parse the date strings and use them as the index, then drop the string column
datetime_series = pd.to_datetime(df['date'])
datetime_index = pd.DatetimeIndex(datetime_series.values)
df2 = df.set_index(datetime_index)
df2.drop('date', axis=1, inplace=True)
df2
Out[746]: 
                    ColA          ColB
2000-01-01  john wick     bloody mary 
2000-01-01  bloody mary   jeff bridges
2000-01-03  peter pan     billy boy   
2000-01-03  jeff bridges  billy boy   
2000-01-03  billy boy     john wick

The result should be a Series, or something similar, like the following:

           unique occurrences of names
2000-01-01             3
2000-01-03             4

Answer 1

Use DataFrame.stack to reshape both columns into a single Series, count the distinct values per date with SeriesGroupBy.nunique, and finish with Series.to_frame:

df3 = df2.stack().groupby(level=0).nunique().to_frame(name='unique occurrences of names')
print(df3)
            unique occurrences of names
2000-01-01                            3
2000-01-03                            4
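
To see why this works, the chain can be split into steps. The sketch below rebuilds df2 from the question so that it runs on its own; the intermediate names stacked and counts are introduced here for illustration only and are not part of the original answer.

```python
import pandas as pd

# Same data as df2 in the question, built directly with a DatetimeIndex
df2 = pd.DataFrame(
    {
        "ColA": ["john wick", "bloody mary", "peter pan", "jeff bridges", "billy boy"],
        "ColB": ["bloody mary", "jeff bridges", "billy boy", "billy boy", "john wick"],
    },
    index=pd.DatetimeIndex(
        ["2000-01-01", "2000-01-01", "2000-01-03", "2000-01-03", "2000-01-03"]
    ),
)

# stack() moves the column labels into an inner index level, so every name
# from ColA and ColB ends up in one long Series keyed by (date, column)
stacked = df2.stack()

# Level 0 of that index is the date; nunique() counts distinct names per date
counts = stacked.groupby(level=0).nunique()
print(counts)
```

Because both columns are pooled before counting, a name that appears in ColA and ColB on the same day is counted once.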

Alternatively, use DataFrame.melt:

df3 = (df2.reset_index()
          .melt('index')
          .groupby('index')['value']
          .nunique()
          .to_frame(name='unique occurrences of names'))
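
The melt variant relies on reset_index() naming the new column 'index', which is what pandas does when the index has no name. A possible variation, assuming you are free to name the index first (the name 'day' is hypothetical and chosen only for this sketch), makes that dependency explicit:

```python
import pandas as pd

# Same data as df2 in the question, built directly with a DatetimeIndex
df2 = pd.DataFrame(
    {
        "ColA": ["john wick", "bloody mary", "peter pan", "jeff bridges", "billy boy"],
        "ColB": ["bloody mary", "jeff bridges", "billy boy", "billy boy", "john wick"],
    },
    index=pd.DatetimeIndex(
        ["2000-01-01", "2000-01-01", "2000-01-03", "2000-01-03", "2000-01-03"]
    ),
)

df3 = (
    df2.rename_axis("day")       # hypothetical index name, assumed for this sketch
    .reset_index()               # the index becomes a regular column called "day"
    .melt(id_vars="day")         # ColA/ColB values are gathered into a "value" column
    .groupby("day")["value"]
    .nunique()                   # distinct names per day
    .to_frame(name="unique occurrences of names")
)
print(df3)
```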
