
How to separate a data frame based on a column's range of values with pandas?

This is a bit of a weird question, but I have been importing property data from an API as a JSON file in Python. I then use pandas to convert the JSON into a dataframe.

I am having trouble manipulating the data within the dataframe. My current data is formatted like this table.

[image: enter image description here]

Each property is assigned a name, a property id, and an address, and there is a record for every unit within a property. Ideally, I would like to create multiple dataframes separated by property id, so that it would look like this.

[image: enter image description here]

My only problem is that, due to some organizational issues, there are about 100 different property ids, and none of them are in order. Each one is a random number from 1 to 1000.

Is there a way to automatically separate the dataframes by property id, using some sort of unique identifier combined with a for loop?

I don't really know how to approach this scenario. Thanks!

Try this:

list_of_dataframes = [x for _, x in df.groupby(df['Property Id'].ne(df['Property Id'].shift(1)).cumsum())]

Now list_of_dataframes is a list of dataframes, where each dataframe contains a run of rows in which the Property Id was consecutively the same. So the Property Id values 1 1 1 9 9 9 1 1 1 would return 3 dataframes: one containing the first three 1's, the second containing the three 9's, and the last containing the last three 1's.
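To see why that one-liner splits on consecutive runs, here is a minimal sketch that breaks it into steps. The sample data and the "Unit" column are made up for illustration; only the 'Property Id' column name comes from the question.

```python
import pandas as pd

# Hypothetical sample data mirroring the 1 1 1 9 9 9 1 1 1 example.
df = pd.DataFrame({
    "Property Id": [1, 1, 1, 9, 9, 9, 1, 1, 1],
    "Unit": list("ABCDEFGHI"),  # made-up column, just to have a payload
})

# True wherever the id differs from the previous row.
# The first row compares against NaN, so it is always True.
changed = df["Property Id"].ne(df["Property Id"].shift(1))

# A running count of the changes labels each consecutive run.
run_id = changed.cumsum()

# Grouping by the run label yields one dataframe per run.
list_of_dataframes = [group for _, group in df.groupby(run_id)]

print(run_id.tolist())          # [1, 1, 1, 2, 2, 2, 3, 3, 3]
print(len(list_of_dataframes))  # 3
```

The run label only changes when the id changes, so the two separate blocks of 1's end up in different dataframes.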

If you don't want the groups to be based on consecutive order (i.e., you want 1 1 1 9 9 9 1 1 1 to become two dataframes, the first containing all six 1's and the second containing the three 9's), you can do this:

list_of_dataframes = [x for _, x in df.groupby(df['Property Id'])]
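Since the ids in the question are scattered and unordered, it may be handier to keep the id next to each dataframe rather than use a plain list. A small sketch of that variation, assuming made-up ids and a made-up "Unit" column (frames_by_id is a name chosen here, not something from the question):

```python
import pandas as pd

# Hypothetical sample data with unordered, non-consecutive ids.
df = pd.DataFrame({
    "Property Id": [412, 37, 412, 905, 37],
    "Unit": ["1A", "2B", "1B", "3C", "2A"],  # made-up column
})

# groupby yields (key, sub-dataframe) pairs, so the id can be kept as a dict key.
frames_by_id = {prop_id: group for prop_id, group in df.groupby("Property Id")}

# Loop over every property without knowing the ids in advance.
for prop_id, frame in frames_by_id.items():
    print(prop_id, len(frame))

# Or look up a single property directly.
print(frames_by_id[412])
```

This way a for loop over the dict covers all of the roughly 100 properties, and any single one can be pulled out by its id.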


