
Python: how to pass three data frame columns as three separate function arguments and iterate through the column values

I have the following pandas data frame, stored in the variable WSI_Hourly, with the columns listed below:

Date    Rain (in)            
1/5     2           
1/6     0          
1/7     7   
1/8     10    
1/9     13   
1/10    11   
1/11    1   

I am trying to write a function that creates a new column specifying which range bucket each "Rain" value falls into, where the buckets are set dynamically. Please see the desired output below:

Date   Rain     Rain_Range    
1/5    2        0-5 inches
1/6    0        0-5 inches
1/7    7        6-10 inches
1/8    10       6-10 inches
1/9    13       11-15 inches
1/10   11       11-15 inches
1/11   1        0-5 inches 

Below is my function:

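def precip(df, min_value, max_value, desc):
    if min_value < max_value:
        for i, m in df.iterrows():
            # Plain Python comparison; & is meant for element-wise Series logic
            if min_value <= m['Rain'] <= max_value:
                # DataFrame.set_value was removed in pandas 1.0; .at replaces it
                df.at[i, 'Rain_Range'] = desc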

precip(WSI_Hourly, min_value, max_value, desc)

Because I want to set the 'Rain_Range' values dynamically, I want to feed the rows of the following data frame to the function as its min_value, max_value, and desc arguments.

Please see data frame table below:

min_value   max_value   desc      
0           5           0-5 inches   
6           10          6-10 inches    
11          15          11-15 inches

My question is: how do I pass the min_value, max_value, and desc columns of the data frame above into my function as arguments to get my desired output?

*Any help on this is greatly appreciated

If I understand what you are looking for, you want the zip function.

def f(x,y,z):
    for a,b,c in zip(x,y,z):
        print(a,b,c)

x = [1, 2, 3, 4]
y = [10, 20, 30, 40]
z = [100, 200, 300, 400]

f(x,y,z)

Each argument is passed as a whole column of data. The zip function iterates over all three columns in parallel and yields one tuple per row, which you can unpack into the loop variables of the for loop.
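Applied to the data frames in the question, the same idea could look like the sketch below. It assumes the rain column is named "Rain" (as in the question's function) and that the lookup table is stored in a variable called ranges, a name chosen here for illustration. The precip function keeps the question's signature but uses a vectorized mask instead of iterrows, which is a substitution and not the asker's original code.

```python
import pandas as pd

# Sample data from the question; the column name "Rain" is an assumption
WSI_Hourly = pd.DataFrame({
    "Date": ["1/5", "1/6", "1/7", "1/8", "1/9", "1/10", "1/11"],
    "Rain": [2, 0, 7, 10, 13, 11, 1],
})

# Lookup table of buckets; "ranges" is a hypothetical variable name
ranges = pd.DataFrame({
    "min_value": [0, 6, 11],
    "max_value": [5, 10, 15],
    "desc": ["0-5 inches", "6-10 inches", "11-15 inches"],
})


def precip(df, min_value, max_value, desc):
    """Label every row whose Rain value lies in [min_value, max_value]."""
    if min_value < max_value:
        in_range = df["Rain"].between(min_value, max_value)  # inclusive on both ends
        df.loc[in_range, "Rain_Range"] = desc


# Start with an empty label column, then call precip once per lookup row
WSI_Hourly["Rain_Range"] = None
for lo, hi, label in zip(ranges["min_value"], ranges["max_value"], ranges["desc"]):
    precip(WSI_Hourly, lo, hi, label)

print(WSI_Hourly)
```

Each pass through the loop hands one row of the lookup table to precip as three scalar arguments, so the function itself does not need to know about the lookup table.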

As @jeremycg suggested in the comments, use pd.cut():

pd.cut(df["Rain"], 
       [-0.001, 5, 10, 15], # Bin boundaries
       labels=["0-5 inches", "6-10 inches", "11-15 inches"] # Bin labels
      )

# Result:
# 0      0-5 inches
# 1      0-5 inches
# 2     6-10 inches
# 3     6-10 inches
# 4    11-15 inches
# 5    11-15 inches
# 6      0-5 inches
# Name: Rain, dtype: category
# Categories (3, object): [0-5 inches < 6-10 inches < 11-15 inches]
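The first bin edge is -0.001 rather than 0 because pd.cut builds right-closed intervals by default, so a left edge of exactly 0 would exclude rainfall of 0. The sketch below, using the question's sample values in a standalone Series, compares that trick with the include_lowest=True option, which should give the same labels here.

```python
import pandas as pd

rain = pd.Series([2, 0, 7, 10, 13, 11, 1], name="Rain")
labels = ["0-5 inches", "6-10 inches", "11-15 inches"]

# Left edge nudged below zero so that 0 lands in the first bin
shifted = pd.cut(rain, [-0.001, 5, 10, 15], labels=labels)

# Same edges starting at 0, with the lowest edge made inclusive
lowest = pd.cut(rain, [0, 5, 10, 15], labels=labels, include_lowest=True)

# Without either adjustment, the value 0 falls outside every bin and becomes NaN
strict = pd.cut(rain, [0, 5, 10, 15], labels=labels)

print(pd.DataFrame({"Rain": rain, "shifted": shifted,
                    "lowest": lowest, "strict": strict}))
```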

You can skip your function entirely and use pd.cut instead.

Some data:

from io import StringIO

import pandas as pd

dat=StringIO('''Date    Rain(in)            
1/5     2           
1/6     0          
1/7     7   
1/8     10    
1/9     13   
1/10    11   
1/11    1   ''')

cuts = StringIO('''min_value   max_value   desc      
0           5           0-5inches   
6           10          6-10inches    
11          15          11-15inches''')
# sep=r'\s+' splits on runs of whitespace (delim_whitespace is deprecated in recent pandas)
df = pd.read_csv(dat, sep=r'\s+')
cuts = pd.read_csv(cuts, sep=r'\s+')

Now we 'cut' using the pd.cut function, using bins and labels from your 'cuts' data frame:

df['Rain_Range'] = pd.cut(df['Rain(in)'],
        bins = pd.concat([cuts.min_value[:1]-1, cuts.max_value]),
        labels = cuts.desc)

which gives:

Date    Rain(in)    Rain_Range
1/5     2           0-5inches
1/6     0           0-5inches
1/7     7           6-10inches
1/8     10          6-10inches
1/9     13          11-15inches
1/10    11          11-15inches
1/11    1           0-5inches
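One caveat: the bins passed to pd.cut above are contiguous, so a fractional value such as 5.5 would be labelled as 6-10 inches even though the lookup table leaves a gap between 5 and 6. If the min_value and max_value bounds should be honoured exactly, one possible alternative is to build closed intervals with pd.IntervalIndex and look each value up. This is a sketch under that assumption, with a made-up sample Series and labels written with spaces; it is not part of the original answer.

```python
import pandas as pd

cuts = pd.DataFrame({
    "min_value": [0, 6, 11],
    "max_value": [5, 10, 15],
    "desc": ["0-5 inches", "6-10 inches", "11-15 inches"],
})

# Hypothetical sample values, including 5.5, which sits in the gap between buckets
rain = pd.Series([2.0, 0.0, 5.5, 7.0, 13.0], name="Rain")

# One interval per lookup row, closed on both ends: [0, 5], [6, 10], [11, 15]
intervals = pd.IntervalIndex.from_arrays(
    cuts["min_value"].to_numpy(dtype=float),
    cuts["max_value"].to_numpy(dtype=float),
    closed="both",
)

# Position of the interval containing each value, or -1 if none contains it
pos = intervals.get_indexer(rain)

# Map positions to labels, then blank out the rows that matched no interval
labels = cuts["desc"].to_numpy()
rain_range = pd.Series(labels[pos], index=rain.index).where(pos >= 0)

print(pd.DataFrame({"Rain": rain, "Rain_Range": rain_range}))
```

Values that fall in a gap come back as missing instead of being absorbed into the next bucket, which may or may not be what you want for your data.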
