
How do I filter a pandas DataFrame on a specific column criterion while also checking whether a value is greater than or equal to 10,000?

Hello, I am doing my assignment and have encountered a question that I can't answer. The question is to create another DataFrame, df_urban, consisting of all columns of the original dataset but comprising only applicants with Urban status in their Property_Area attribute (excluding Rural and Semiurban) and an ApplicantIncome of at least S$10,000. Reset the row index and display the last 10 rows of this DataFrame.

[Image: picture of the question]

My code, however, does not meet either criterion: an ApplicantIncome of at least 10,000 and a Property_Area of Urban only.

df_urban = df
df_urban.iloc[-10:[11]]

I was wondering what the solution to the question is.

[Image: data picture]

You can use the '&' operator to filter the data on multiple column conditions:

df_urban = df[(df[col]==<condition>) & (df[col] >= <condition>)]
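As a minimal sketch of that pattern, the example below builds a small in-memory DataFrame and combines two boolean conditions with '&'. The sample values are made up for illustration; only the column names ApplicantIncome and Property_Area come from the question.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "ApplicantIncome": [4583, 12000, 10000, 15000, 2588],
    "Property_Area": ["Rural", "Urban", "Urban", "Semiurban", "Urban"],
})

# Each comparison yields a boolean Series with one entry per row.
is_urban = df["Property_Area"] == "Urban"
high_income = df["ApplicantIncome"] >= 10000

# & combines the two Series element-wise. When the comparisons are written
# inline, each one must be wrapped in parentheses, because & binds more
# tightly than == and >= in Python.
df_urban = df[is_urban & high_income]
print(df_urban)
```

Naming the two masks separately is optional; it just makes each condition easy to inspect on its own before combining them.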

The following simple code snippet is a proof of principle: it extracts a subset of the primary data frame to produce a data frame containing only "Urban" locations.

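import pandas as pd

# Load the sample data; the file is read as tab-delimited
df = pd.read_csv('Applicants.csv', delimiter='\t')

print(df)

# Keep only the rows whose Property_Area is 'Urban'
df_urban = df[df['Property_Area'] == 'Urban']

print(df_urban)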

Using a simple hand-built CSV file, here is a sample of the output.

       ApplicantIncome  CoapplicantIncome  LoanAmount  Loan_Term  Credit_History Property_Area
0             4583               1508      128000        360               1         Rural
1             1222                  0       55000        360               1         Rural
2             8285                  0       64000        360               1         Urban
3             3988               1144       75000        360               1         Rural
4             2588                  0       84700        360               1         Urban
5             5248                  0       48550        360               1         Rural
6             7488                  0      111000        360               1     SemiUrban
7             3252               1112       14550        360               1         Rural
8             1668                  0       67500        360               1         Urban
   ApplicantIncome  CoapplicantIncome  LoanAmount  Loan_Term  Credit_History Property_Area
2             8285                  0       64000        360               1         Urban
4             2588                  0       84700        360               1         Urban
8             1668                  0       67500        360               1         Urban

Hope that helps.

Regards.

See below. I leave it to you to work out how to reset the index. You might want to look at .tail() to display the last rows.

df_urban = df[(df['ApplicantIncome'] >= 10000) & (df['Property_Area'] == 'Urban')]
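For reference, here is one way the remaining steps might be sketched, assuming the same column names as in the question and using made-up sample rows. It uses >= so that an income of exactly 10,000 counts as "at least 10,000", then calls reset_index(drop=True) and tail(10), both standard pandas methods.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "ApplicantIncome": [4583, 12000, 10000, 15000, 2588, 23000],
    "Property_Area": ["Rural", "Urban", "Urban", "Semiurban", "Urban", "Urban"],
})

# Urban applicants with an income of at least 10,000.
mask = (df["ApplicantIncome"] >= 10000) & (df["Property_Area"] == "Urban")

# drop=True discards the old row labels instead of keeping them as a column.
df_urban = df[mask].reset_index(drop=True)

# tail(10) returns the last 10 rows, or all rows if there are fewer than 10.
last_rows = df_urban.tail(10)
print(last_rows)
```

Without drop=True, reset_index would add the previous row labels as a new column named index, which the assignment does not ask for.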
