Calculating new rows in a Pandas Dataframe on two different columns
I'm a beginner at Python and I have a dataframe with the columns Country, avgTemp and year. What I want to do is calculate new rows for each country, where year is increased by 20 and avgTemp is multiplied by a variable called tempChange. I don't want to remove the existing values, though; I just want to append the new ones.
This is how the dataframe looks:
Preferably, I would also like to create a loop that runs the code a certain number of times. Super grateful for any help!
If you need to copy the values from the dataframe as an example, here they are:
          Country    avgTemp  year
0     Afghanistan  14.481583  2012
1          Africa  24.725917  2012
2         Albania  13.768250  2012
3         Algeria  23.954833  2012
4  American Samoa  27.201417  2012
243 rows × 3 columns
If you want to repeat the rows, I'd create a new dataframe, perform the operations on that new dataframe (add 20 years, multiply the temperature by a constant or an array, etc.) and then use concat() to append it to the original dataframe:
import pandas as pd

tempChange = 1.15
data = {'Country': ['Afghanistan', 'Africa', 'Albania', 'Algeria', 'American Samoa'],
        'avgTemp': [14, 24, 13, 23, 27],
        'Year': [2012, 2012, 2012, 2012, 2012]}
df = pd.DataFrame(data)

# Work on a copy so the original rows stay untouched
df_2 = df.copy()
df_2['avgTemp'] = df['avgTemp'] * tempChange
df_2['Year'] = df['Year'] + 20

# Pass ignore_index=True to concat if you do not want the index values repeated
df = pd.concat([df, df_2])
print(df)
Output:
          Country  avgTemp  Year
0     Afghanistan    14.00  2012
1          Africa    24.00  2012
2         Albania    13.00  2012
3         Algeria    23.00  2012
4  American Samoa    27.00  2012
0     Afghanistan    16.10  2032
1          Africa    27.60  2032
2         Albania    14.95  2032
3         Algeria    26.45  2032
4  American Samoa    31.05  2032
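The question also asks for a loop that repeats this step a certain number of times. A minimal sketch extending the copy-and-concat approach above might look like the following. It assumes that each pass should project from the most recent year's rows, so the temperature compounds by tempChange every 20 years; that reading is an assumption, and n_steps is a hypothetical name that does not appear in the question.

```python
import pandas as pd

tempChange = 1.15  # multiplier from the example above
n_steps = 3        # hypothetical: number of 20-year steps to project

df = pd.DataFrame({
    'Country': ['Afghanistan', 'Africa', 'Albania'],
    'avgTemp': [14.0, 24.0, 13.0],
    'Year': [2012, 2012, 2012],
})

frames = [df]  # collect every generation and concatenate once at the end
latest = df    # rows belonging to the most recent year
for _ in range(n_steps):
    # Build the next generation from the latest one only (assumption:
    # the change compounds), so older rows are not projected again.
    latest = latest.assign(
        avgTemp=latest['avgTemp'] * tempChange,
        Year=latest['Year'] + 20,
    )
    frames.append(latest)

projected = pd.concat(frames, ignore_index=True)
print(projected)
```

Collecting the frames in a list and calling concat() once avoids re-copying the growing dataframe on every pass. If the multiplier should instead always be applied to the original 2012 values, the loop body would need to start from df rather than latest.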
Where df is the name of your dataframe:
df['tempChange'] = df['year']+ 20 * df['avgTemp']
This will add a new column to your df using the logic above. I'm not sure I understood your logic correctly, so the math may need some work.
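Since the math here is flagged as uncertain, it may help to see how the expression is actually evaluated. The sketch below uses a single row from the question's sample data and only illustrates operator precedence. Neither result is necessarily what the question asks for, and the names as_written and with_parens are made up for the illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'Country': ['Afghanistan'],
    'avgTemp': [14.481583],
    'year': [2012],
})

# Multiplication binds tighter than addition, so the expression as written
# is evaluated as year + (20 * avgTemp).
as_written = df['year'] + 20 * df['avgTemp']

# With explicit parentheses, 20 is added to the year before multiplying.
with_parens = (df['year'] + 20) * df['avgTemp']

print(as_written.iloc[0], with_parens.iloc[0])
```

Either way the result is a single combined column, whereas the question describes two separate changes: year plus 20, and avgTemp times tempChange.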
I believe that what you're looking for is:
dfName['newYear'] = dfName.apply(lambda x: x['year'] + 20,axis=1)
dfName['tempDiff'] = dfName.apply(lambda x: x['avgTemp']*tempChange,axis=1)
This is how you apply a function to each row.
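For simple arithmetic like this, the same two columns can usually be computed on whole columns without apply. The sketch below compares both forms on two rows from the question's sample data. The value of tempChange is borrowed from the first answer, and vec_year and vec_temp are names introduced only for this comparison.

```python
import pandas as pd

tempChange = 1.15  # assumed value, borrowed from the first answer
dfName = pd.DataFrame({
    'Country': ['Afghanistan', 'Africa'],
    'avgTemp': [14.481583, 24.725917],
    'year': [2012, 2012],
})

# Row-wise version, as in the answer above: the lambda runs once per row.
dfName['newYear'] = dfName.apply(lambda x: x['year'] + 20, axis=1)
dfName['tempDiff'] = dfName.apply(lambda x: x['avgTemp'] * tempChange, axis=1)

# Column-wise version: the arithmetic is applied to each whole column at once.
vec_year = dfName['year'] + 20
vec_temp = dfName['avgTemp'] * tempChange

print((dfName['newYear'] == vec_year).all())
print(((dfName['tempDiff'] - vec_temp).abs() < 1e-12).all())
```

Note that both forms add new columns to the existing rows. To append new rows instead, as the question asks, the results would still need to go into a copy that is concatenated onto the original.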