Count occurrences of column values in another dataframe's column

Question

I have two dataframes and I want to count the occurrences of each "classifier" value in "fullname". My problem is that my script counts a word like "carrepair" for only one classifier, and I would like it to count for both classifiers. I would also like to add one random coordinate that matches the classifier.

First dataframe:

Second dataframe:

Result so far:

Desired result:

My script so far:

import pandas as pd

fl = pd.read_excel(r'fullname.xlsx')
clas = pd.read_excel(r'classifier.xlsx')

fl.fullname = fl.fullname.str.lower()
clas.classifier = clas.classifier.str.lower()

pat = '({})'.format('|'.join(clas['classifier'].unique()))

fl['fullname'] = fl['fullname'].str.extract(pat, expand = False)

clas['count_of_classifier'] = clas['classifier'].map(fl['fullname'].value_counts())
print(clas)

Thanks!

Answer 1

You could try this:

import pandas as pd

fl = pd.read_excel(r'fullname.xlsx')
clas = pd.read_excel(r'classifier.xlsx')
fl.fullname = fl.fullname.str.lower()
clas.classifier = clas.classifier.str.lower()

# For each classifier (e.g. 'repair', 'car'), add a column to 'fl' that holds
# the classifier in every row whose 'fullname' contains it
for value in clas["classifier"].values:
    fl.loc[fl["fullname"].str.contains(value, case=False), value] = value

# Count values and create a new dataframe
new_clas = pd.DataFrame(
    {
        "classifier": [col for col in clas["classifier"].values],
        "count": [fl[col].count() for col in clas["classifier"].values],
    }
)

# Merge 'fl' and 'new_clas'
new_clas = pd.merge(
    left=new_clas, right=fl, how="left", left_on="classifier", right_on="fullname"
).reset_index(drop=True)

# Keep only expected columns
new_clas = new_clas.reindex(columns=["classifier", "count", "coordinate"])

print(new_clas)
# Outputs
classifier    count    coordinate
repair        3        52.520008, 13.404954
car           3        54.520008, 15.404954
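
For reference, here is a minimal self-contained sketch of the same idea that also shows why the original script missed "carrepair": `str.extract` keeps only the first match per row, while a per-classifier `str.contains` check lets one row count for several classifiers. The sample rows and coordinates below are hypothetical, since the original tables are not shown, and picking the coordinate with `sample` is an assumption about what "random coordinate" should mean.

```python
import pandas as pd

# Hypothetical sample data; the original tables are not shown in the post.
fl = pd.DataFrame(
    {
        "fullname": ["CarRepair Shop", "car wash", "bike repair", "Repair", "Car"],
        "coordinate": [
            "52.520008, 13.404954",
            "54.520008, 15.404954",
            "50.110924, 8.682127",
            "48.135124, 11.581981",
            "53.551086, 9.993682",
        ],
    }
)
clas = pd.DataFrame({"classifier": ["Repair", "Car"]})

# str.extract returns only the first match in each string,
# whereas str.findall returns every non-overlapping match.
demo = pd.Series(["carrepair"])
first_only = demo.str.extract("(repair|car)", expand=False).tolist()
all_matches = demo.str.findall("repair|car").tolist()

names = fl["fullname"].str.lower()
rows = []
for value in clas["classifier"].str.lower().unique():
    # Literal substring test, so "carrepair" counts for both "car" and "repair".
    matches = fl[names.str.contains(value, regex=False)]
    # Take one matching row at random; pass random_state to sample() for
    # reproducible picks. Classifiers with no match get None.
    coordinate = matches["coordinate"].sample(n=1).iloc[0] if len(matches) else None
    rows.append({"classifier": value, "count": len(matches), "coordinate": coordinate})

result = pd.DataFrame(rows)
print(result)
```

Unlike the merge on `classifier` and `fullname`, this version does not require a row whose `fullname` equals the classifier exactly, and it always returns one row per classifier.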

Solution 1
1 accepted, 2021-05-10 17:51:52