
iPython: how do I count the number of times a string appears in a cell?

I have a data frame with the columns Movie Title and Cast that looks like this:

[Image of the CSV]

Column 1 has the name of the movie, whilst Column 2 lists the full cast of the film. The cast has been taken from the site TMDB.

Column 2 has the pattern: 'cast_id': {cast_id_number}, 'character': {character_name}, 'credit_id': {credit_number}, 'gender': {gender_identifier}, etc.

I am writing a project for school looking at the gender split in different films. I therefore want to create columns that count the number of male and female actors in each film, e.g.:

Movie Title | Cast | No. of Males | No. of Females
Toy Story   | .... | 3            | 7

However, I'm not sure how to go about doing this. I've tried using str.count, but it returns 0 for every row, even when I can see that a cell contains 'gender': 2 or 'gender': 1.

I'm assuming it may need a loop with a counter that reads the string in each row and adds 1 every time it encounters 'gender': 2, but I have no idea how to implement this.

You will need to iterate over each cast member for each movie and determine how many cast members are female/male. Something like this should work:

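def gender_ct(data, gender=1):
    # Assumes each Cast cell is already a list of dicts, not a string.
    # Count the cast members whose 'gender' field equals the requested code.
    return sum(1 for x in data if x['gender'] == gender)

# Gender code 1 is counted as female and 2 as male.
df['No. of Females'] = df['Cast'].apply(gender_ct, gender=1)
df['No. of Males'] = df['Cast'].apply(gender_ct, gender=2)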
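
If the Cast column was read from a CSV, each cell is probably a string rather than a list of dicts. In that case iterating over it yields single characters and the function above raises an error. A minimal sketch, assuming each cell is a non-empty Python-literal string (single-quoted keys, as in the pattern shown in the question), parses the cell with ast.literal_eval before counting. The helper parse_cast and the sample rows are hypothetical and only illustrate the idea; the gender codes follow the answer above (1 for female, 2 for male).

```python
import ast

import pandas as pd


def parse_cast(cell):
    """Return the cast as a list of dicts, parsing it first if it is a string."""
    if isinstance(cell, str):
        # Assumption: the cell is a Python-literal string such as
        # "[{'cast_id': 1, 'gender': 2}, ...]".
        return ast.literal_eval(cell)
    return cell


def gender_ct(cell, gender=1):
    """Count cast members whose 'gender' field equals `gender`."""
    return sum(1 for member in parse_cast(cell) if member.get("gender") == gender)


# Hypothetical sample rows, not real TMDB data.
df = pd.DataFrame({
    "Movie Title": ["Film A", "Film B"],
    "Cast": [
        "[{'cast_id': 1, 'character': 'X', 'gender': 2}, "
        "{'cast_id': 2, 'character': 'Y', 'gender': 1}, "
        "{'cast_id': 3, 'character': 'Z', 'gender': 2}]",
        "[{'cast_id': 4, 'character': 'W', 'gender': 1}]",
    ],
})

df["No. of Females"] = df["Cast"].apply(gender_ct, gender=1)
df["No. of Males"] = df["Cast"].apply(gender_ct, gender=2)
print(df[["Movie Title", "No. of Females", "No. of Males"]])
```

If the cells turn out to be JSON (double-quoted keys) instead, json.loads would be the analogous parsing call. Empty or missing cells would need separate handling, which this sketch leaves out.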
