How can I iterate through each row of a pandas dataframe, then conditionally set a new value in that row?
I am working on a school project, so please no exact answers. I have a pandas dataframe with numerators and denominators for ratings of dog images out of 10. When there are multiple dogs in the image, the rating is out of (number of dogs) * 10. I am trying to adjust it so that, for example, if there are 5 dogs and the rating is 40/50, then the new numerator/denominator is 8/10.
Here is an example of my code. I am aware that the syntax does not work in line 3, but I believe it accurately represents what I am trying to accomplish. twitter_archive is the dataframe.
twitter_archive['new_denom'] = 10
twitter_archive['new_numer'] = 0
for numer, denom in twitter_archive['rating_numerator','rating_denominator']:
    if (denom > 10) & (denom % 10 == 0):
        num_denom = denom / 10
        new_numer = numer / num_denom
        twitter_archive['new_numer'] = new_numer
So basically I am checking whether the denominator is above 10 and, if it is, whether it is divisible by 10. If so, I find out how many times 10 goes into it, and then divide the numerator by that value to get a new numerator. I think my logic for that works fine, but the issue I have is grabbing that row and then adding the new value to the new column I created, in that row.
edit: added df head
| | tweet_id | timestamp | text | rating_numerator | rating_denominator | name | doggo | floofer | pupper | puppo | avg_numerator | avg_denom | avg_numer |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 8.924206e+17 | 2017-08-01 16:23:56+00:00 | This is Phineas. | 13.0 | 10.0 | phineas | None | None | None | None | 0.0 | 10 | 0 |
| 1 | 8.921774e+17 | 2017-08-01 00:17:27+00:00 | This is Tilly. | 13.0 | 10.0 | tilly | None | None | None | None | 0.0 | 10 | 0 |
| 2 | 8.918152e+17 | 2017-07-31 00:18:03+00:00 | This is Archie. | 12.0 | 10.0 | archie | None | None | None | None | 0.0 | 10 | 0 |
| 3 | 8.916896e+17 | 2017-07-30 15:58:51+00:00 | This is Darla. | 13.0 | 10.0 | darla | None | None | None | None | 0.0 | 10 | 0 |
| 4 | 8.913276e+17 | 2017-07-29 16:00:24+00:00 | This is Franklin. | 12.0 | 10.0 | franklin | None | None | None | None | 0.0 | 10 | 0 |
copy/paste head below:
{'tweet_id': {0: 8.924206435553362e+17,
1: 8.921774213063434e+17,
2: 8.918151813780849e+17,
3: 8.916895572798587e+17,
4: 8.913275589266883e+17},
'timestamp': {0: Timestamp('2017-08-01 16:23:56+0000', tz='UTC'),
1: Timestamp('2017-08-01 00:17:27+0000', tz='UTC'),
2: Timestamp('2017-07-31 00:18:03+0000', tz='UTC'),
3: Timestamp('2017-07-30 15:58:51+0000', tz='UTC'),
4: Timestamp('2017-07-29 16:00:24+0000', tz='UTC')},
'text': {0: "This is Phineas. He's a mystical boy. Only ever appears in the hole of a donut. 13/10 ",
1: "This is Tilly. She's just checking pup on you. Hopes you're doing ok. If not, she's available for pats, snugs, boops, the whole bit. 13/10 ",
2: 'This is Archie. He is a rare Norwegian Pouncing Corgo. Lives in the tall grass. You never know when one may strike. 12/10 ',
3: 'This is Darla. She commenced a snooze mid meal. 13/10 happens to the best of us ',
4: 'This is Franklin. He would like you to stop calling him "cute." He is a very fierce shark and should be respected as such. 12/10 #BarkWeek '},
'rating_numerator': {0: 13.0, 1: 13.0, 2: 12.0, 3: 13.0, 4: 12.0},
'rating_denominator': {0: 10.0, 1: 10.0, 2: 10.0, 3: 10.0, 4: 10.0},
'name': {0: 'phineas', 1: 'tilly', 2: 'archie', 3: 'darla', 4: 'franklin'},
'doggo': {0: 'None', 1: 'None', 2: 'None', 3: 'None', 4: 'None'},
'floofer': {0: 'None', 1: 'None', 2: 'None', 3: 'None', 4: 'None'},
'pupper': {0: 'None', 1: 'None', 2: 'None', 3: 'None', 4: 'None'},
'puppo': {0: 'None', 1: 'None', 2: 'None', 3: 'None', 4: 'None'}}
If you want to use a for loop to get row values, you can use the iterrows() function.
for idx, row in twitter_archive.iterrows():
    denom = row['rating_denominator']
    numer = row['rating_numerator']
    # You can collect the values in a list and concat it with the df
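To write a value back into a specific row while looping, one option is to assign through `DataFrame.at`, using the index label that `iterrows()` yields. This is only a minimal sketch on a made-up frame: the columns `a`, `b`, and `c` and the condition are placeholders, not the columns or logic from the question.

```python
import pandas as pd

# Toy data; the column names are illustrative only.
df = pd.DataFrame({'a': [1, 2, 3], 'b': [10, 20, 30]})
df['c'] = 0  # default value for every row

for idx, row in df.iterrows():
    # Overwrite the default only when the condition holds for this row.
    if row['b'] > 10:
        # Assign into the DataFrame by index label and column name.
        df.at[idx, 'c'] = row['a'] * 2
```

The assignment goes through `df`, not through `row`, because `iterrows()` can hand back a copy of each row, so changes made to `row` may not reach the DataFrame.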
A faster way to iterate over a df is itertuples():
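for row in twitter_archive.itertuples():
    # Position 0 is the index, so read the columns by name rather than by position
    denom = row.rating_denominator
    numer = row.rating_numerator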
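With `itertuples()`, each row is a namedtuple, so columns whose names are valid Python identifiers can be read as attributes, and the index label is available as `row.Index`. A small sketch of building a new column this way, again on a made-up frame with placeholder columns `a`, `b`, and `c`:

```python
import pandas as pd

# Toy data; the column names are illustrative only.
df = pd.DataFrame({'a': [1, 2, 3], 'b': [10, 20, 30]})

new_values = []
for row in df.itertuples():
    # Columns are exposed as attributes on the namedtuple.
    new_values.append(row.a * 2 if row.b > 10 else 0)

# One value was appended per row, in order, so the list lines up with the index.
df['c'] = new_values
```

Collecting the results in a list and assigning the list once avoids writing into the DataFrame on every iteration.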
But I think the best way to create a new column from existing ones is to use the pandas apply function.
import pandas as pd
df = pd.DataFrame(data={'a' : [1,2], 'b': [3,5]})
df['c'] = df.apply(lambda x: 'sum_is_odd' if (x['a'] + x['b']) % 2 == 1 else 'sum_is_even', axis=1)
In this case, 'c' is a new column whose value is calculated from the 'a' and 'b' columns.
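For conditions that can be expressed on whole columns, a boolean mask with `.loc` is another option. This is a different technique from the `apply` call above, shown here only as a sketch on a made-up frame; the columns `a`, `b`, and `c` and the condition are placeholders.

```python
import pandas as pd

# Toy data; the column names are illustrative only.
df = pd.DataFrame({'a': [1, 2, 3], 'b': [10, 20, 30]})
df['c'] = 0  # default value for every row

# Boolean Series: True for the rows that satisfy the condition.
mask = df['b'] > 10

# Update column 'c' only in the rows where the mask is True.
df.loc[mask, 'c'] = df.loc[mask, 'a'] * 2
```

Rows where the mask is False keep the default, so no explicit loop is needed.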