
How can I iterate through each row of a pandas dataframe, then conditionally set a new value in that row?

I am working on a school project, so please no exact answers. I have a pandas dataframe with numerators and denominators for ratings of dog images out of 10. When there are multiple dogs in the image, the rating is out of (number of dogs) * 10. I am trying to adjust it so that, for example, if there are 5 dogs and the rating is 40/50, then the new numerator/denominator is 8/10. Here is an example of my code. I am aware that the syntax does not work in line 3, but I believe it accurately represents what I am trying to accomplish. twitter_archive is the dataframe.

twitter_archive['new_denom'] = 10
twitter_archive['new_numer'] = 0
for numer, denom in twitter_archive['rating_numerator','rating_denominator']:
    if (denom > 10) & (denom % 10 == 0):
        num_denom = denom / 10
        new_numer = numer / num_denom
        twitter_archive['new_numer'] = new_numer

So basically I am checking whether the denominator is above 10 and, if it is, whether it is divisible by 10. If it is, I find out how many times 10 goes into it, and then divide the numerator by that value to get a new numerator. I think my logic for that works fine, but the issue I have is grabbing that row and then adding the new value to the new column I created, in that row.

Edit: added df head

   tweet_id      timestamp                  text                                                rating_numerator  rating_denominator  name      doggo  floofer  pupper  puppo  avg_numerator  avg_denom  avg_numer
0  8.924206e+17  2017-08-01 16:23:56+00:00  This is Phineas. He's a mystical boy. Only eve...   13.0              10.0                phineas   None   None     None    None   0.0            10         0
1  8.921774e+17  2017-08-01 00:17:27+00:00  This is Tilly. She's just checking pup on you....  13.0              10.0                tilly     None   None     None    None   0.0            10         0
2  8.918152e+17  2017-07-31 00:18:03+00:00  This is Archie. He is a rare Norwegian Pouncin...   12.0              10.0                archie    None   None     None    None   0.0            10         0
3  8.916896e+17  2017-07-30 15:58:51+00:00  This is Darla. She commenced a snooze mid meal...   13.0              10.0                darla     None   None     None    None   0.0            10         0
4  8.913276e+17  2017-07-29 16:00:24+00:00  This is Franklin. He would like you to stop ca...   12.0              10.0                franklin  None   None     None    None   0.0            10         0

Copy/paste of the head below:

{'tweet_id': {0: 8.924206435553362e+17,
  1: 8.921774213063434e+17,
  2: 8.918151813780849e+17,
  3: 8.916895572798587e+17,
  4: 8.913275589266883e+17},
 'timestamp': {0: Timestamp('2017-08-01 16:23:56+0000', tz='UTC'),
  1: Timestamp('2017-08-01 00:17:27+0000', tz='UTC'),
  2: Timestamp('2017-07-31 00:18:03+0000', tz='UTC'),
  3: Timestamp('2017-07-30 15:58:51+0000', tz='UTC'),
  4: Timestamp('2017-07-29 16:00:24+0000', tz='UTC')},
 'text': {0: "This is Phineas. He's a mystical boy. Only ever appears in the hole of a donut. 13/10 ",
  1: "This is Tilly. She's just checking pup on you. Hopes you're doing ok. If not, she's available for pats, snugs, boops, the whole bit. 13/10 ",
  2: 'This is Archie. He is a rare Norwegian Pouncing Corgo. Lives in the tall grass. You never know when one may strike. 12/10 ',
  3: 'This is Darla. She commenced a snooze mid meal. 13/10 happens to the best of us ',
  4: 'This is Franklin. He would like you to stop calling him "cute." He is a very fierce shark and should be respected as such. 12/10 #BarkWeek '},
 'rating_numerator': {0: 13.0, 1: 13.0, 2: 12.0, 3: 13.0, 4: 12.0},
 'rating_denominator': {0: 10.0, 1: 10.0, 2: 10.0, 3: 10.0, 4: 10.0},
 'name': {0: 'phineas', 1: 'tilly', 2: 'archie', 3: 'darla', 4: 'franklin'},
 'doggo': {0: 'None', 1: 'None', 2: 'None', 3: 'None', 4: 'None'},
 'floofer': {0: 'None', 1: 'None', 2: 'None', 3: 'None', 4: 'None'},
 'pupper': {0: 'None', 1: 'None', 2: 'None', 3: 'None', 4: 'None'},
 'puppo': {0: 'None', 1: 'None', 2: 'None', 3: 'None', 4: 'None'}}

If you want to use a for loop to get row values, you can use the iterrows() function.

for idx, row in twitter_archive.iterrows():
    denom = row['rating_denominator']
    numer = row['rating_numerator']
    # Collect the computed values in a list, then assign the list as a new column
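
To write a value back into the row being visited, one option is to address the cell by its index label with .at. The following is a minimal sketch on made-up data that borrows the question's column names; the starting value of new_numer (a copy of the original numerator) is an assumption for illustration, not something from the original post.

```python
import pandas as pd

# Toy data with the question's column names; the values are invented.
df = pd.DataFrame({
    "rating_numerator": [13.0, 40.0, 7.0],
    "rating_denominator": [10.0, 50.0, 11.0],
})

# Assumption: rows that need no adjustment keep their original numerator.
df["new_numer"] = df["rating_numerator"]

for idx, row in df.iterrows():
    denom = row["rating_denominator"]
    # Plain Python "and" is used here because denom is a single number, not a Series.
    if denom > 10 and denom % 10 == 0:
        # .at sets exactly one cell, addressed by row label and column name.
        df.at[idx, "new_numer"] = row["rating_numerator"] / (denom / 10)
```

Assigning to row["new_numer"] inside the loop would typically change only the temporary row object that iterrows() hands out, which is why the sketch writes through df.at instead.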

A faster way to iterate over a df is itertuples():

for row in twitter_archive.itertuples():
    # Each row is a namedtuple and row[0] is the index, so access columns by name
    denom = row.rating_denominator
    numer = row.rating_numerator
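
The tuples produced by itertuples() are read-only, so a common pattern is to collect one result per row in a list and assign the list as a column afterwards. Below is a small sketch of that pattern on invented data; the column name scale and the fallback value of 1.0 are hypothetical choices made for this example.

```python
import pandas as pd

# Toy data with the question's column names; the values are invented.
df = pd.DataFrame({
    "rating_numerator": [13.0, 40.0, 7.0],
    "rating_denominator": [10.0, 50.0, 11.0],
})

scale = []
for row in df.itertuples(index=False):
    denom = row.rating_denominator
    # A denominator above 10 that is a multiple of 10 is treated as a group rating.
    is_group = denom > 10 and denom % 10 == 0
    scale.append(denom / 10 if is_group else 1.0)

# The list has one entry per row, in row order, so it can become a column directly.
df["scale"] = scale
```

Because the list is built in iteration order, it lines up with the dataframe's rows regardless of what the index labels are.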

But I think the best way to create a new column from existing ones is to use the pandas apply function.

import pandas as pd

df = pd.DataFrame(data={'a' : [1,2], 'b': [3,5]})
df['c'] = df.apply(lambda x: 'sum_is_odd' if (x['a'] + x['b']) % 2 == 1 else 'sum_is_even', axis=1)

In this case, 'c' is a new column whose value is calculated from the 'a' and 'b' columns. With axis=1, the lambda receives one row at a time; for the two rows above, 'c' holds 'sum_is_even' and 'sum_is_odd'.
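
As an alternative that the answer above does not cover, conditional updates like this can often be written with a boolean mask and .loc, avoiding a Python-level loop entirely. This is only a sketch on invented data; the new_numer and new_denom names follow the question, and leaving non-matching rows unchanged is an assumption.

```python
import pandas as pd

# Toy data with the question's column names; the values are invented.
df = pd.DataFrame({
    "rating_numerator": [13.0, 40.0, 7.0],
    "rating_denominator": [10.0, 50.0, 11.0],
})

# On whole columns the element-wise "&" operator is required (not "and"),
# and each comparison needs its own parentheses.
mask = (df["rating_denominator"] > 10) & (df["rating_denominator"] % 10 == 0)

# Assumption: rows outside the mask keep their original rating.
df["new_numer"] = df["rating_numerator"]
df["new_denom"] = df["rating_denominator"]

# .loc[mask, column] updates only the rows where the mask is True.
df.loc[mask, "new_numer"] = df["rating_numerator"] / (df["rating_denominator"] / 10)
df.loc[mask, "new_denom"] = 10.0
```

The right-hand side is computed for every row, but pandas aligns it on the index and only the masked rows are overwritten.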
