Python: how to pass 3 columns in a data frame as 3 separate arguments to a function and iterate through the column values
I have the following pandas data frame, stored in the variable WSI_Hourly, with the columns listed below:
Date Rain (in)
1/5 2
1/6 0
1/7 7
1/8 10
1/9 13
1/10 11
1/11 1
I am trying to write a function that creates a new column specifying the range bucket that each "Rain" value falls into. Please see the desired output table:
Date Rain Rain_Range
1/5 2 0-5 inches
1/6 0 0-5 inches
1/7 7 6-10 inches
1/8 10 6-10 inches
1/9 13 11-15 inches
1/10 11 11-15 inches
1/11 1 0-5 inches
Below is my function:
def precip(df, min_value, max_value, desc):
    if min_value < max_value:
        for i, m in df.iterrows():
            if (m['Rain'] >= min_value) and (m['Rain'] <= max_value):
                # df.set_value was removed in pandas 1.0; df.at is its replacement
                df.at[i, 'Rain_Range'] = desc
precip(WSI_Hourly, min_value, max_value, desc)
Because I want to set the 'Rain_Range' values dynamically, I want to pass the following data frame to the function, with its columns supplying the min_value, max_value, and desc arguments.
Please see the data frame below:
min_value max_value desc
0 5 0-5 inches
6 10 6-10 inches
11 15 11-15 inches
My question is: how do I pass the min_value, max_value, and desc columns of the data frame above into my function as arguments to get my desired output table?
Any help on this is greatly appreciated.
If I understand what you are looking for, you want the zip function.
def f(x, y, z):
    for a, b, c in zip(x, y, z):
        print(a, b, c)
x = [1, 2, 3, 4]
y = [10, 20, 30, 40]
z = [100, 200, 300, 400]
f(x,y,z)
The data is passed as columns, and the zip function iterates over all three columns simultaneously, returning an iterator of tuples, which you can unpack into the loop variables of the for loop.
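Applied to the question's data, the same zip pattern might look like the sketch below. The frame name ranges is an assumption made for illustration (the question does not name the lookup frame), and precip is rewritten here with a boolean mask instead of a row loop, which is a swapped-in technique rather than the asker's original code.

```python
import pandas as pd

# Sample data copied from the question.
WSI_Hourly = pd.DataFrame({
    "Date": ["1/5", "1/6", "1/7", "1/8", "1/9", "1/10", "1/11"],
    "Rain": [2, 0, 7, 10, 13, 11, 1],
})

# Hypothetical name for the lookup frame holding the range definitions.
ranges = pd.DataFrame({
    "min_value": [0, 6, 11],
    "max_value": [5, 10, 15],
    "desc": ["0-5 inches", "6-10 inches", "11-15 inches"],
})

def precip(df, min_value, max_value, desc):
    """Label rows whose Rain value lies in [min_value, max_value]."""
    if min_value < max_value:
        # Series.between is inclusive on both ends by default.
        in_range = df["Rain"].between(min_value, max_value)
        df.loc[in_range, "Rain_Range"] = desc

# zip walks the three columns in lockstep, one row of `ranges` per step.
for lo, hi, label in zip(ranges["min_value"], ranges["max_value"], ranges["desc"]):
    precip(WSI_Hourly, lo, hi, label)

rain_ranges = WSI_Hourly["Rain_Range"].tolist()
print(WSI_Hourly)
```

Each pass through the loop makes one call to precip, so the function signature from the question stays unchanged.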
As @jeremycg suggested in the comment, use pd.cut():
pd.cut(df["Rain"],
       [-0.001, 5, 10, 15],  # Bin boundaries
       labels=["0-5 inches", "6-10 inches", "11-15 inches"]  # Bin labels
       )
# Result:
# 0 0-5 inches
# 1 0-5 inches
# 2 6-10 inches
# 3 6-10 inches
# 4 11-15 inches
# 5 11-15 inches
# 6 0-5 inches
# Name: Rain, dtype: category
# Categories (3, object): [0-5 inches < 6-10 inches < 11-15 inches]
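A note on the -0.001 edge: pd.cut builds right-closed intervals by default, so with edges [0, 5, 10, 15] the first bin would be (0, 5] and a reading of exactly 0 would fall outside it. A minimal sketch of an alternative, assuming the same sample values as in the question, is to pass include_lowest=True so the first interval also includes its left edge:

```python
import pandas as pd

# Rain values from the question.
rain = pd.Series([2, 0, 7, 10, 13, 11, 1], name="Rain")
labels = ["0-5 inches", "6-10 inches", "11-15 inches"]

# include_lowest=True makes the first bin include its left edge, so 0 is kept.
binned = pd.cut(rain, bins=[0, 5, 10, 15], labels=labels, include_lowest=True)

result = binned.tolist()
print(binned)
```

Because the bins are contiguous, a fractional value such as 5.5 would land in "6-10 inches" here, whereas the min/max comparison in the question would leave it unlabeled.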
You can skip your function and use pd.cut instead.
Some data:
from io import StringIO
import pandas as pd
dat=StringIO('''Date Rain(in)
1/5 2
1/6 0
1/7 7
1/8 10
1/9 13
1/10 11
1/11 1 ''')
cuts = StringIO('''min_value max_value desc
0 5 0-5inches
6 10 6-10inches
11 15 11-15inches''')
# sep=r'\s+' splits on runs of whitespace (delim_whitespace is deprecated in recent pandas)
df = pd.read_csv(dat, sep=r'\s+')
cuts = pd.read_csv(cuts, sep=r'\s+')
Now we 'cut' using the pd.cut function, with bins and labels taken from your 'cuts' data frame:
df['Rain_Range'] = pd.cut(df['Rain(in)'],
                          bins=pd.concat([cuts.min_value[:1] - 1, cuts.max_value]),
                          labels=cuts.desc)
which gives:
Date Rain(in) Rain_Range
1/5 2 0-5inches
1/6 0 0-5inches
1/7 7 6-10inches
1/8 10 6-10inches
1/9 13 11-15inches
1/10 11 11-15inches
1/11 1 0-5inches
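If the same binning is needed for more than one frame, the step above could be wrapped in a small helper. This is only a sketch: the function name add_rain_range and its parameters are hypothetical, and it assumes the lookup frame is sorted by min_value with contiguous ranges, as in the example.

```python
import pandas as pd

def add_rain_range(df, cuts, value_col="Rain", out_col="Rain_Range"):
    """Return a copy of df with a label column derived from the cuts frame."""
    # Bin edges: one below the first minimum, then every maximum.
    edges = [cuts["min_value"].iloc[0] - 1] + cuts["max_value"].tolist()
    out = df.copy()
    out[out_col] = pd.cut(out[value_col], bins=edges, labels=cuts["desc"].tolist())
    return out

cuts = pd.DataFrame({
    "min_value": [0, 6, 11],
    "max_value": [5, 10, 15],
    "desc": ["0-5 inches", "6-10 inches", "11-15 inches"],
})
# The last value (20) is deliberately outside every range.
df = pd.DataFrame({"Rain": [2, 0, 7, 10, 13, 11, 1, 20]})

labeled = add_rain_range(df, cuts)
print(labeled)
```

Values that fall outside every bin, such as the 20 in this sample, come back as NaN rather than raising an error, so it may be worth checking for missing labels afterwards.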