Pivot Pandas dataframe to see if a condition is met

Question

I have the following DataFrame that represents whether a User was present in a given week of a given year:

    User    Year    Week
0   John    2020    1
1   John    2020    2
2   Steve   2020    1
3   Fred    2020    3
4   George  2020    2   
5   George  2020    3
    ...     ...     ...
200 John    2021    2
201 John    2021    4
202 Steve   2021    2
203 Fred    2021    2
204 George  2021    1   
205 George  2021    4

I want to get a DataFrame with one row per User, where each column represents whether that user was present in a certain week of a certain year. Each column should be either boolean or an integer with possible values 0 or 1.

It would look something like this:

        2020_1  2020_2  2020_3  ... 2021_1  2021_2  2021_3  2021_4
John         1       1       0  ...      0       1       0       1
Steve        1       0       0  ...      0       1       0       0
Fred         0       0       1  ...      0       1       0       0
George       0       1       1  ...      1       0       0       1

Is there any way to do this without iterating through the DataFrame?

Thanks.

Answer 1

Build a combined Year_Week key and pass it to pd.crosstab:

pd.crosstab(df['User'],
            df[['Year','Week']].astype(str).apply('_'.join, axis=1)
           )

Output:

col_0   2020_1  2020_2  2020_3  2021_1  2021_2  2021_4
User                                                  
Fred         0       0       1       0       1       0
George       0       1       1       1       0       1
John         1       1       0       0       1       1
Steve        1       0       0       0       1       0
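
Note that pd.crosstab only creates columns for Year/Week combinations that actually occur in the data, which is why 2021_3 from the desired output is missing above. A minimal sketch of one way to add the all-zero columns back, assuming you know the full set of weeks you want (the list all_columns below, weeks 1 to 4 for both years, is a hypothetical choice for illustration, and the sample rows are only the ones visible in the question):

```python
import pandas as pd

# Only the rows shown in the question; the elided rows are not reproduced.
df = pd.DataFrame({
    "User": ["John", "John", "Steve", "Fred", "George", "George",
             "John", "John", "Steve", "Fred", "George", "George"],
    "Year": [2020] * 6 + [2021] * 6,
    "Week": [1, 2, 1, 3, 2, 3, 2, 4, 2, 2, 1, 4],
})

# Combined key such as "2020_1", built without touching df itself.
year_week = df["Year"].astype(str) + "_" + df["Week"].astype(str)
table = pd.crosstab(df["User"], year_week)

# Assumption: the columns we want are weeks 1-4 of 2020 and 2021.
all_columns = [f"{year}_{week}" for year in (2020, 2021) for week in range(1, 5)]

# reindex adds the missing columns and fills them with 0.
presence = table.reindex(columns=all_columns, fill_value=0)
print(presence)
```

If the set of weeks is not known up front, it would have to be derived from the data or a calendar instead.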

Answer 2

Here's one way you can do this:

import pandas as pd
df = pd.DataFrame({
    "User" : ["John","John","Steve","Fred","George","George"],
    "Year" : [2020,2020,2020,2020,2020,2020],
    "Week": [1,2,1,3,2,3]})

# add a helper column for year_week
df["year_week"] = df["Year"].map(str) + "_" + df["Week"].map(str)

# group by User and year_week, then unstack and fill NaN with 0
df.groupby(["User","year_week"]).size().unstack(fill_value = 0)

Results in:

| User   |   2020_1 |   2020_2 |   2020_3 |
|:-------|---------:|---------:|---------:|
| Fred   |        0 |        0 |        1 |
| George |        0 |        1 |        1 |
| John   |        1 |        1 |        0 |
| Steve  |        1 |        0 |        0 |
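
One caveat: size() counts rows, so if the same User/Year/Week combination appears more than once, the cell will hold a count greater than 1 rather than a 0/1 flag. A small sketch of how the counts could be turned into flags, assuming duplicates are possible in your data (the duplicated John row below is made up for illustration):

```python
import pandas as pd

# Hypothetical data: John is listed twice for week 1 of 2020.
df = pd.DataFrame({
    "User": ["John", "John", "John", "Steve"],
    "Year": [2020, 2020, 2020, 2020],
    "Week": [1, 1, 2, 1],
})
df["year_week"] = df["Year"].astype(str) + "_" + df["Week"].astype(str)

# Row counts per user and year_week; missing combinations become 0.
counts = df.groupby(["User", "year_week"]).size().unstack(fill_value=0)

# Any count above zero means "present".
as_bool = counts.gt(0)          # boolean columns
flags = as_bool.astype(int)     # integer columns with values 0 or 1
print(flags)
```

Alternatively, calling df.drop_duplicates() on the three key columns before grouping should give the same 0/1 result.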

Answer 3

pd.crosstab(df.User, df['Year'].astype(str)+"_"+df['Week'].astype(str))
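
With any of these string keys, the columns come out in lexicographic order, so once week numbers reach two digits "2020_10" sorts before "2020_2". A sketch of one possible workaround, assuming zero-padded labels such as "2020_02" are acceptable (the sample rows here are invented to include a two-digit week):

```python
import pandas as pd

# Hypothetical data with a two-digit week number.
df = pd.DataFrame({
    "User": ["John", "John", "Steve"],
    "Year": [2020, 2020, 2020],
    "Week": [2, 10, 1],
})

# Plain concatenation: columns sort as text.
plain_key = df["Year"].astype(str) + "_" + df["Week"].astype(str)
plain = pd.crosstab(df["User"], plain_key)

# Zero-pad the week to two digits so text order matches week order.
padded_key = df["Year"].astype(str) + "_" + df["Week"].astype(str).str.zfill(2)
padded = pd.crosstab(df["User"], padded_key)

print(list(plain.columns))
print(list(padded.columns))
```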

3 answers

Answer 1: 3, accepted, 2021-01-27 03:10:43
Answer 2: 3, 2021-01-27 03:17:48
Answer 3: 2, 2021-01-27 03:18:20
