[英]Split Pandas column of lists into multiple columns based on value
I have a Pandas Dataframe that has a list column.我有一个 Pandas Dataframe 有一个列表列。 I'd like to split this list column into multiple column based on the value.我想根据值将此列表列拆分为多个列。 Returning yes_value or no_value for each record based on the column name.根据列名为每条记录返回 yes_value 或 no_value。
Example input:输入示例:
id | values
---|----------
1 | [A,B,C,D]
2 | [D,E,F]
3 | [A,D]
4 | [K]
Expected output:预计 output:
id | values | A | B | C | D | E | F | K |
---|----------|-------|-------|-------|-------|-------|-------|-------|
1 | [A,B,C,D]| yes_A | yes_B | yes_C | yes_D | no_E | no_F | no_K |
2 | [D,E,F] | no_A | no_B | no_C | yes_D | yes_E | yes_F | no_K |
3 | [A,D] | yes_A | no_B | no_C | yes_D | no_E | no_F | no_K |
4 | [K] | no_A | no_B | no_C | no_D | no_E | no_F | yes_K |
You can use a crosstab
to reshape:您可以使用crosstab
来重塑:
df2 = df.explode('values')
df3 = pd.crosstab(df2['id'], df2['values']).replace({0: 'no_', 1: 'yes_'})
out = df.merge(df3.add(df3.columns), left_on='id', right_index=True)
Or str.get_dummies
:或者str.get_dummies
:
df2 = df['values'].agg('|'.join).str.get_dummies().replace({0: 'no_', 1: 'yes_'})
out = df.join(df2.add(df2.columns))
output: output:
id values A B C D E F K
0 1 [A, B, C, D] yes_A yes_B yes_C yes_D no_E no_F no_K
1 2 [D, E, F] no_A no_B no_C yes_D yes_E yes_F no_K
2 3 [A, D] yes_A no_B no_C yes_D no_E no_F no_K
3 4 [K] no_A no_B no_C no_D no_E no_F yes_K
Code:代码:
#Input
df = pd.DataFrame({'id':[1,2,3,4], 'Values':[['A','B','C','D'], ['D','E','F'], ['A','D'], ['K']]})
#STEP 1 merging the Values column all lists into one and findout the unique values using set
#so the output will be {'A', 'B', 'C', 'D', 'E', 'F', 'K'} and then looping on it as below
for i in sorted(set(sum(df['Values'].tolist(),[]))):
#STEP 2 Creating new column and check if column in list or not
df[i] = df['Values'].apply(lambda x: f'yes_{i}' if i in x else f'no_{i}')
df
Output: Output:
id Values A B C D E F K
0 1 [A, B, C, D] yes_A yes_B yes_C yes_D no_E no_F no_K
1 2 [D, E, F] no_A no_B no_C yes_D yes_E yes_F no_K
2 3 [A, D] yes_A no_B no_C yes_D no_E no_F no_K
3 4 [K] no_A no_B no_C no_D no_E no_F yes_K
声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.