How to iterate over each row of a pandas dataframe, then conditionally set a new value in that row?

Question

I am working on a school project, so please no exact answers. I have a pandas dataframe that has numerators and denominators rating images of dogs out of 10. When there are multiple dogs in the image, the rating is out of number of dogs * 10. I am trying to adjust it so that, for example, if there are 5 dogs and the rating is 40/50, then the new numerator/denominator is 8/10. Here is an example of my code. I am aware that the syntax does not work in line 3, but I believe it accurately represents what I am trying to accomplish. twitter_archive is the dataframe.

twitter_archive['new_denom'] = 10
twitter_archive['new_numer'] = 0
for numer, denom in twitter_archive['rating_numerator','rating_denominator']:
    if (denom > 10) & (denom % 10 == 0):
        num_denom = denom / 10
        new_numer = numer / num_denom
        twitter_archive['new_numer'] = new_numer

So basically I am checking whether the denominator is above 10 and, if it is, whether it is divisible by 10. If it is, I find out how many times 10 goes into it and divide the numerator by that value to get a new numerator. I think my logic for that works fine, but the issue I have is grabbing that row and then adding that new value to the new column I created, in that row. Edit: added df head.

	tweet_id	timestamp	text	rating_numerator	rating_denominator	name	doggo	floofer	pupper	puppo	avg_numerator	avg_denom
0	8.924206e+17	2017-08-01 16:23:56+00:00	This is Phineas. He's a mystical boy. Only eve...	13.0	10.0	phineas	None	None	None	None	0.0	10
1	8.921774e+17	2017-08-01 00:17:27+00:00	This is Tilly. She's just checking pup on you....	13.0	10.0	tilly	None	None	None	None	0.0	10
2	8.918152e+17	2017-07-31 00:18:03+00:00	This is Archie. He is a rare Norwegian Pouncin...	12.0	10.0	archie	None	None	None	None	0.0	10
3	8.916896e+17	2017-07-30 15:58:51+00:00	This is Darla. She commenced a snooze mid meal...	13.0	10.0	darla	None	None	None	None	0.0	10
4	8.913276e+17	2017-07-29 16:00:24+00:00	This is Franklin. He would like you to stop ca...	12.0	10.0	franklin	None	None	None	None	0.0	10

copy/paste head below:

{'tweet_id': {0: 8.924206435553362e+17,
  1: 8.921774213063434e+17,
  2: 8.918151813780849e+17,
  3: 8.916895572798587e+17,
  4: 8.913275589266883e+17},
 'timestamp': {0: Timestamp('2017-08-01 16:23:56+0000', tz='UTC'),
  1: Timestamp('2017-08-01 00:17:27+0000', tz='UTC'),
  2: Timestamp('2017-07-31 00:18:03+0000', tz='UTC'),
  3: Timestamp('2017-07-30 15:58:51+0000', tz='UTC'),
  4: Timestamp('2017-07-29 16:00:24+0000', tz='UTC')},
 'text': {0: "This is Phineas. He's a mystical boy. Only ever appears in the hole of a donut. 13/10 ",
  1: "This is Tilly. She's just checking pup on you. Hopes you're doing ok. If not, she's available for pats, snugs, boops, the whole bit. 13/10 ",
  2: 'This is Archie. He is a rare Norwegian Pouncing Corgo. Lives in the tall grass. You never know when one may strike. 12/10 ',
  3: 'This is Darla. She commenced a snooze mid meal. 13/10 happens to the best of us ',
  4: 'This is Franklin. He would like you to stop calling him "cute." He is a very fierce shark and should be respected as such. 12/10 #BarkWeek '},
 'rating_numerator': {0: 13.0, 1: 13.0, 2: 12.0, 3: 13.0, 4: 12.0},
 'rating_denominator': {0: 10.0, 1: 10.0, 2: 10.0, 3: 10.0, 4: 10.0},
 'name': {0: 'phineas', 1: 'tilly', 2: 'archie', 3: 'darla', 4: 'franklin'},
 'doggo': {0: 'None', 1: 'None', 2: 'None', 3: 'None', 4: 'None'},
 'floofer': {0: 'None', 1: 'None', 2: 'None', 3: 'None', 4: 'None'},
 'pupper': {0: 'None', 1: 'None', 2: 'None', 3: 'None', 4: 'None'},
 'puppo': {0: 'None', 1: 'None', 2: 'None', 3: 'None', 4: 'None'}}

Answer 1

If you want to use a for loop to get row values, you can use the iterrows() method.

for idx, row in twitter_archive.iterrows():
    denom = row['rating_denominator']
    numer = row['rating_numerator']
    # You can add values in list and concat it with df
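
To write a value back into the row being visited, one option is to combine the index label that iterrows() yields with DataFrame.at. The following is only a minimal sketch on a small made-up frame: the toy data and the name df are assumptions for illustration, while the column names are taken from the question.

```python
import pandas as pd

# Toy data (hypothetical), using the column names from the question.
df = pd.DataFrame({
    'rating_numerator': [13.0, 40.0, 7.0],
    'rating_denominator': [10.0, 50.0, 11.0],
})

# Start from the original values so untouched rows keep their rating.
df['new_numer'] = df['rating_numerator']
df['new_denom'] = df['rating_denominator']

for idx, row in df.iterrows():
    denom = row['rating_denominator']
    # Only rescale denominators that are a multiple of 10 and above 10.
    if denom > 10 and denom % 10 == 0:
        # row is a copy, so write through the frame using the index label.
        df.at[idx, 'new_numer'] = row['rating_numerator'] / (denom / 10)
        df.at[idx, 'new_denom'] = 10.0
```

Assigning to row itself would not change the dataframe, because iterrows() hands back a copy of each row; that is why the write goes through df.at[idx, ...].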

A faster way to iterate over a df is itertuples():

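for row in twitter_archive.itertuples():
    # row[0] is the index, so read columns by name rather than by position
    denom = row.rating_denominator
    numer = row.rating_numerator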

But I think the best way to create a new column from existing ones is to use the pandas apply function.

import pandas as pd

df = pd.DataFrame(data={'a' : [1,2], 'b': [3,5]})
df['c'] = df.apply(lambda x: 'sum_is_odd' if (x['a'] + x['b']) % 2 == 1 else 'sum_is_even', axis=1)

In this case, 'c' is a new column whose value is calculated from the 'a' and 'b' columns.
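
The same pattern could be carried over to the ratings in the question by passing a named function to apply with axis=1. This is a rough sketch rather than a definitive solution: the toy data, the name df and the helper rescale_numerator are all hypothetical, and only the column names come from the question.

```python
import pandas as pd

# Toy data (hypothetical), using the column names from the question.
df = pd.DataFrame({
    'rating_numerator': [13.0, 40.0, 7.0],
    'rating_denominator': [10.0, 50.0, 11.0],
})

def rescale_numerator(row):
    """Hypothetical helper: rescale the numerator to a rating out of 10."""
    denom = row['rating_denominator']
    # Rescale only when the denominator is a multiple of 10 and above 10.
    if denom > 10 and denom % 10 == 0:
        return row['rating_numerator'] / (denom / 10)
    # Otherwise keep the original numerator unchanged.
    return row['rating_numerator']

# axis=1 passes each row to the function; the results form the new column.
df['new_numer'] = df.apply(rescale_numerator, axis=1)
```

For larger frames, a boolean mask combined with df.loc usually avoids the per-row Python call altogether.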
