How to categorize and group the unique values in a dataframe?

I am using this dataset from Kaggle: https://www.kaggle.com/kwadwoofosu/predict-test-scores-of-students

Sample of the data I am working with:

[Image: sample of the data]

I am building an input form in Streamlit based on predictions made on this dataset. When a school name is selected, I want to automatically select the matching school setting and school type and, if possible, show only the classrooms available in that school.

Suppose the selected school is ANKYI. My application should then set school_setting to Urban and school_type to Non-public, and show me only the classrooms available in that school.

How can I achieve this categorization of the dataframe using Python?

For each column in a pandas dataframe, you can use the .unique() method to return an array of its unique values.

So, for your data, you could do:

school_types = list(df[df['school']=='ANKYI']['school_type'].unique())

To break this apart: we take our dataframe (whatever you call it) and filter it to just the rows where 'school' is equal to 'ANKYI'. Within those rows, we look only at the column called 'school_type', and that filtered column is what we return the unique values from. The return value of .unique() is an array-type object, so we can turn it into a list if we want to.
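The same filter-then-unique pattern can be extended to all three fields the question asks about. The following is only a minimal sketch: the sample rows are made up for illustration, the column names school, school_setting, school_type and classroom are assumed to match the Kaggle dataset, and school_options is a hypothetical helper name, not something from pandas or Streamlit.

```python
import pandas as pd

# Made-up sample rows for illustration only; the column names are
# assumed to match the Kaggle dataset.
df = pd.DataFrame({
    "school": ["ANKYI", "ANKYI", "ANKYI", "CCAAW", "CCAAW"],
    "school_setting": ["Urban", "Urban", "Urban", "Suburban", "Suburban"],
    "school_type": ["Non-public", "Non-public", "Non-public", "Public", "Public"],
    "classroom": ["6OL", "6OL", "ZNS", "2B1", "EPS"],
})


def school_options(df, school):
    """Return the setting, type and classrooms recorded for one school.

    Hypothetical helper: assumes every school has exactly one
    school_setting and one school_type.
    """
    rows = df[df["school"] == school]  # keep only this school's rows
    if rows.empty:
        raise KeyError(f"no rows for school {school!r}")
    return {
        # First value is enough under the one-setting/one-type assumption.
        "school_setting": rows["school_setting"].iloc[0],
        "school_type": rows["school_type"].iloc[0],
        # unique() keeps values in order of first appearance.
        "classrooms": rows["classroom"].unique().tolist(),
    }


options = school_options(df, "ANKYI")
print(options)
# {'school_setting': 'Urban', 'school_type': 'Non-public', 'classrooms': ['6OL', 'ZNS']}
```

In a Streamlit form, the classrooms list could then be passed as the options of a second st.selectbox, with the setting and type shown as fixed values for the chosen school.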
