Loop through certain columns of a dataframe

I am trying to define a function with a for loop that iterates over the Weight column and returns a list of the names of patients whose weight is >= 150. I'm honestly just confused about how I should go about this. Any help will be much appreciated.

df:
   Patient  Weight  LDL
0      Rob     200  100
1      Bob     150  150
2     John     184  102
3     Phil     120  200
4  Jessica     100  143

import pandas as pd

# List of tuples
Patients = [('Rob', 200, 100),
            ('Bob', 150, 150),
            ('John', 184, 102),
            ('Phil', 120, 200),
            ('Jessica', 100, 143)]
# Create a DataFrame object
df = pd.DataFrame(Patients, columns=['Patient', 'Weight', 'LDL'],
                  index=['0', '1', '2', '3', '4'])

df

def greater_150(df, outcome = 'Weight'):
    new_list = []
    patient = df['Patient']
    for column in df[['Patient', 'Weight']]:
        if outcome <= 150:
           new_list.append(patient)
    return new_list

Ideally, the output I would want:

[ Rob, Bob, John]

Instead, I get a TypeError:

'<=' not supported between instances of 'str' and 'int'

Here's a simple approach that avoids iteration, which is usually preferable when working with pandas.

df[df["Weight"] >= 150].Patient

This returns the following pandas Series:

0     Rob
1     Bob
2    John
Name: Patient, dtype: object

If you want, you can turn this into a list with df[df["Weight"] >= 150].Patient.tolist(), which yields ['Rob', 'Bob', 'John'].
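To see what the one-liner is doing, it can help to split it into its two steps. The following is a minimal sketch that rebuilds the sample data from the question with a default integer index; the names `mask` and `heavy_patients` are only illustrative.

```python
import pandas as pd

# Sample data from the question (default integer index assumed here).
df = pd.DataFrame(
    [("Rob", 200, 100), ("Bob", 150, 150), ("John", 184, 102),
     ("Phil", 120, 200), ("Jessica", 100, 143)],
    columns=["Patient", "Weight", "LDL"],
)

# Step 1: comparing a column to a scalar gives a boolean Series, one value per row.
mask = df["Weight"] >= 150

# Step 2: indexing the DataFrame with that Series keeps only the rows marked True.
heavy_patients = df[mask]["Patient"].tolist()

print(mask.tolist())
print(heavy_patients)
```

The comparison runs over the whole column at once, so no explicit loop is needed.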

Generally, avoid iteration, as Ben's answer points out. But if you want to learn how to do it, here is your function modified to iterate through the rows (not the columns):

def greater_150(df, outcome='Weight'):
    new_list = []
    for index, data in df.iterrows():
        if data[outcome] >= 150:
            new_list.append(data["Patient"])
    return new_list
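For context on why the original attempt raised the TypeError: iterating over a DataFrame directly yields its column labels rather than its rows, and `outcome` was still the string 'Weight' when it was compared to 150. Here is a small sketch of the difference, using a cut-down version of the sample data (the variable names are illustrative):

```python
import pandas as pd

df = pd.DataFrame(
    [("Rob", 200), ("Bob", 150), ("Phil", 120)],
    columns=["Patient", "Weight"],
)

# Iterating over a DataFrame yields its column labels (strings here), not rows.
labels = [column for column in df[["Patient", "Weight"]]]
print(labels)

# iterrows() yields (index, row) pairs; each row is a Series keyed by column label.
rows = [(row["Patient"], row["Weight"]) for _, row in df.iterrows()]
print(rows)

# Comparing the label itself to a number is what fails in the question's loop.
error_message = None
try:
    "Weight" <= 150
except TypeError as exc:
    error_message = str(exc)
print(error_message)
```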

Try the following:

def greater_150(df):
    return df.loc[df["Weight"] >= 150].Patient.tolist()
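As a possible variation, `.loc` also accepts the row mask and the column label together, and the column and cutoff can be passed in as arguments. This is only a sketch: the function name `patients_at_or_above` and its `column` and `threshold` parameters are hypothetical and not part of the original answers.

```python
import pandas as pd


def patients_at_or_above(df, column="Weight", threshold=150):
    """Return the names of patients whose `column` value is >= `threshold`."""
    # .loc takes a boolean row mask and a column label in a single call.
    return df.loc[df[column] >= threshold, "Patient"].tolist()


# Sample data from the question (default integer index assumed here).
df = pd.DataFrame(
    [("Rob", 200, 100), ("Bob", 150, 150), ("John", 184, 102),
     ("Phil", 120, 200), ("Jessica", 100, 143)],
    columns=["Patient", "Weight", "LDL"],
)

by_weight = patients_at_or_above(df)
by_ldl = patients_at_or_above(df, column="LDL", threshold=143)
print(by_weight)
print(by_ldl)
```

Selecting rows and the column in one `.loc` call avoids building an intermediate DataFrame before picking out the Patient column.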
