Append number of times a string occurs in Pandas dataframe to another column

I'd like to create an extra column on this dataframe:

Index                  Value
0                22,88,22,24
1                      24,24
2                      22,24
3    11,22,24,12,24,24,22,24
4                         22

So that the number of times a value occurs is stored in a new column:

Index                  Value     22 Count
0                22,88,22,24            2
1                      24,24            0
2                      22,24            1
3    11,22,24,12,24,24,22,24            2
4                         22            1

I'd like to repeat this process for a number of different values within the Value column.

My minimal Python knowledge is telling me something like:

df['22 Count'] = df['Value'].count('22')

I've tried this and a few other versions, but I must be missing something.

If you want to count only one value, use str.count:

df['22 Count'] = df['Value'].str.count('22')
print (df)
                         Value  22 Count
Index                                   
0                  22,88,22,24         2
1                        24,24         0
2                        22,24         1
3      11,22,24,12,24,24,22,24         2
4                           22         1
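
One caveat worth checking: str.count treats its argument as a pattern and matches it anywhere in the string, so '22' would also be found inside tokens such as '122' or '222'. The sample data in the question does not contain such tokens, so the output above is unaffected. A minimal sketch, using made-up data, of how word boundaries restrict the match to whole tokens:

```python
import pandas as pd

# Hypothetical data (not from the question): '122' and '222' contain '22' as a substring
s = pd.Series(['22,122,22', '222', '22'])

# Plain pattern: matches '22' anywhere, including inside longer tokens
substring_counts = s.str.count('22')

# Word boundaries: only counts '22' when it is a whole comma-separated token
exact_counts = s.str.count(r'\b22\b')

print(substring_counts.tolist())
print(exact_counts.tolist())
```

The word-boundary form assumes the tokens consist only of letters, digits, or underscores; for other token shapes, splitting on the comma first is the safer route.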

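To count every value at once, use:

from collections import Counter
import pandas as pd

# Count the comma-separated tokens in each row; values missing from a row become NaN, so fill them with 0
df1 = df['Value'].apply(lambda x: pd.Series(Counter(x.split(',')))).fillna(0).astype(int)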

Or:

df1 = pd.DataFrame([Counter(x.split(',')) for x in df['Value']]).fillna(0).astype(int)

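Or:

from sklearn.feature_extraction.text import CountVectorizer

# Note: the default token_pattern only keeps tokens of two or more characters
countvec = CountVectorizer()
counts = countvec.fit_transform(df['Value'].str.replace(',', ' '))
# get_feature_names() was removed in scikit-learn 1.2; get_feature_names_out() replaces it
df1 = pd.DataFrame(counts.toarray(), columns=countvec.get_feature_names_out())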

print (df1)
   11  12  22  24  88
0   0   0   2   1   1
1   0   0   0   2   0
2   0   0   1   1   0
3   1   1   2   4   0
4   0   0   1   0   0
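
To see why the Counter-based versions need fillna(0), it may help to look at the intermediate result. This is only an illustrative sketch on the first two rows of the sample data; the names `values`, `counters`, and `raw` are introduced here and are not part of the answer:

```python
from collections import Counter
import pandas as pd

# First two rows of the sample data from the question
values = ['22,88,22,24', '24,24']

# One Counter per row: maps each token to how often it occurs in that row
counters = [Counter(v.split(',')) for v in values]

# Tokens that a row does not contain have no key, so pandas fills them with NaN
raw = pd.DataFrame(counters)

# Replace the NaN gaps with 0 and restore integer dtype
df1 = raw.fillna(0).astype(int)

print(raw)
print(df1)
```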

Finally, to add the counts to the original DataFrame:

df = df.join(df1.add_suffix(' Count'))
print (df)
                         Value  11 Count  12 Count  22 Count  24 Count  \
Index                                                                    
0                  22,88,22,24         0         0         2         1   
1                        24,24         0         0         0         2   
2                        22,24         0         0         1         1   
3      11,22,24,12,24,24,22,24         1         1         2         4   
4                           22         0         0         1         0   

       88 Count  
Index            
0             1  
1             0  
2             0  
3             0  
4             0  

Isolated count

You are close, but your syntax attempts to treat a Series as if it were a list. Instead, split each string into a list and then call the list's count method on it:

from operator import methodcaller

df['22_Count'] = df['Value'].str.split(',').apply(methodcaller('count', '22'))

print(df)

   Index                    Value  22_Count
0      0              22,88,22,24         2
1      1                    24,24         0
2      2                    22,24         1
3      3  11,22,24,12,24,24,22,24         2
4      4                       22         1

Multiple counts

Use the methods shown by @jezrael.
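
If only a handful of chosen values are needed rather than every value, the isolated count can also be repeated in a loop. This is a sketch rather than part of either answer; `targets` and `tokens` are hypothetical names, and the column naming follows the question's '22 Count' style:

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({'Value': ['22,88,22,24', '24,24', '22,24',
                             '11,22,24,12,24,24,22,24', '22']})

# Hypothetical selection of values to count
targets = ['22', '24']

# Split once, then reuse the lists for every target
tokens = df['Value'].str.split(',')

for t in targets:
    # list.count matches whole tokens only, so '22' never matches '122'
    df[f'{t} Count'] = tokens.apply(lambda items, t=t: items.count(t))

print(df)
```

Splitting once up front avoids repeating the string split for each target value.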


