Nested for loops with pandas dataframe
I am looping through a dataframe column of headlines (sp500news) and comparing against a dataframe of company names (co_names_df). I am trying to update the frequency each time a company name appears in a headline.
My current code is below and is not updating the frequency columns. Is there a cleaner, faster implementation, maybe without the for loops?
for title in sp500news['title']:
    for string in title:
        for co_name in co_names_df['Name']:
            if string == co_name:
                co_names_index = co_names_df.loc[co_names_df['Name']=='string'].index
                co_names_df['Frequency'][co_names_index] += 1
co_names_df sample
Name Frequency
0 3M 0
1 A.O. Smith 0
2 Abbott 0
3 AbbVie 0
4 Accenture 0
5 Activision 0
6 Acuity Brands 0
7 Adobe Systems 0
...
sp500news['title'] sample
title
0 Italy will not dismantle Montis labour reform minister
1 Exclusive US agency FinCEN rejected veterans in bid to hire lawyers
4 Xis campaign to draw people back to graying rural China faces uphill battle
6 Romney begins to win over conservatives
8 Oregon mall shooting survivor in serious condition
9 Polands PGNiG to sign another deal for LNG supplies from US CEO
You can probably speed this up; you're using dataframes where other structures would work better. Here's what I would try.
from collections import Counter
counts = Counter()
# checking membership in a set is very fast (O(1))
company_names = set(co_names_df["Name"])
for title in sp500news['title']:
    for word in title:  # did you mean title.split(" ")? or is title a list of strings?
        if word in company_names:
            counts.update([word])
counts is then a dictionary {company_name: count}. You can just do a quick loop over the elements to update the counts in your dataframe.
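As a rough sketch of that last step, the loop below writes the counts back into the dataframe. The sample rows are made up for illustration, and it assumes each title is a plain string that should be split on whitespace, which the question does not confirm.

```python
from collections import Counter

import pandas as pd

# Hypothetical sample data standing in for the real dataframes.
co_names_df = pd.DataFrame(
    {"Name": ["3M", "Abbott", "Adobe Systems"], "Frequency": [0, 0, 0]}
)
sp500news = pd.DataFrame(
    {
        "title": [
            "3M raises outlook as Abbott slips",
            "Abbott wins approval for new device",
            "Romney begins to win over conservatives",
        ]
    }
)

company_names = set(co_names_df["Name"])

counts = Counter()
for title in sp500news["title"]:
    # Assumption: titles are plain strings, so split them into words first.
    counts.update(word for word in title.split() if word in company_names)

# Write the counts back. Using .loc with a boolean mask assigns in place
# and avoids the chained indexing used in the question.
for name, n in counts.items():
    co_names_df.loc[co_names_df["Name"] == name, "Frequency"] += n
```

Note that matching word by word only finds single-word names; a multi-word name such as "Adobe Systems" would never equal a single token, so it would need substring or phrase matching instead.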