
How can I subset a data frame based on a list of unique values in a column of that same data frame?

I have a simple DataFrame that looks like the one below. I want to select all of the rows where LOC is New York, store that subset in a variable, and use it to append the New York rows to an email to the Contact person, which I have created using win32. Then I want to move on to Boston and do the same thing, and so on. I cannot figure out how to extract the rows for each LOC without explicitly naming the values. I want this to be dynamic, because the LOC values change.

             Contact          LOC     ...     Add_Move  First Name
0   mike@osjloc1.com     New York     ...          Add         Joe
1   mike@osjloc1.com     New York     ...         Move        Stan
2   mike@osjloc1.com     New York     ...          Add        Rick
3   mike@osjloc1.com     New York     ...          Add        Mike
4   jeff@osjloc2.com       Boston     ...          Add       Sonya
5   jeff@osjloc2.com       Boston     ...         Move        Matt
6   jeff@osjloc2.com       Boston     ...         Move       Randy
7   jeff@osjloc2.com       Boston     ...          Add         Sue
8    dave@osjloc.com  Los Angeles     ...          Add        Jill
9    dave@osjloc.com  Los Angeles     ...         Move       Steve
10   dave@osjloc.com  Los Angeles     ...          Add        Bill

Use Boolean indexing. You can filter the rows of a DataFrame with a mask built from a column's values: https://www.geeksforgeeks.org/boolean-indexing-in-pandas/

First, get all the unique locations in the DataFrame.

locations = set(df.loc[:,"LOC"])

locations is now a set such as {"New York", "Boston", ...}. Because it is a set, the order of the locations is not guaranteed.

for location in locations:
    variable = df[df["LOC"]==location]

The for loop iterates over the set of unique values. To filter the data on a column value, we create a Boolean mask with a comparison operator such as == or != and use it to index the DataFrame. Note that variable is overwritten on every iteration, so use each subset inside the loop body.
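To keep every subset rather than only the last one, the loop can store each filtered DataFrame in a dictionary keyed by location. The following is a minimal sketch, not the answerer's code: the sample rows, the name `subsets`, and the use of `Series.unique()` in place of `set()` (to keep the locations in order of first appearance) are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical sample data mirroring the columns shown in the question.
df = pd.DataFrame({
    "Contact": ["mike@osjloc1.com", "mike@osjloc1.com",
                "jeff@osjloc2.com", "dave@osjloc.com"],
    "LOC": ["New York", "New York", "Boston", "Los Angeles"],
    "Add_Move": ["Add", "Move", "Add", "Move"],
    "First Name": ["Joe", "Stan", "Sonya", "Steve"],
})

# unique() returns the distinct LOC values in order of first appearance,
# so nothing has to be named explicitly.
subsets = {}
for location in df["LOC"].unique():
    mask = df["LOC"] == location      # Boolean Series, True where LOC matches
    subsets[location] = df[mask]      # rows for this location only

print(subsets["New York"])
```

Each value in `subsets` is an ordinary DataFrame, so it can be passed on to whatever builds the email for that location.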

You can use pandas groupby.

groups = yourdataframe.groupby('LOC')

groups contains the DataFrame subsets split according to the 'LOC' column. If you iterate over it, each iteration yields a tuple of length 2: at index 0, a string corresponding to the value of 'LOC'; at index 1, the corresponding subset (still a DataFrame).

for locname, subset in groups:
    # do whatever you want with the subset
    pass
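As a small illustration of what iteration yields, the sketch below collects the group names and looks up one subset directly with `get_group`. The sample rows and the variable names are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical sample data mirroring the columns shown in the question.
df = pd.DataFrame({
    "Contact": ["mike@osjloc1.com", "jeff@osjloc2.com",
                "mike@osjloc1.com", "dave@osjloc.com"],
    "LOC": ["New York", "Boston", "New York", "Los Angeles"],
    "First Name": ["Joe", "Sonya", "Stan", "Jill"],
})

groups = df.groupby("LOC")

# Each iteration yields (group name, subset DataFrame).
# By default groupby sorts the group names.
names = [locname for locname, subset in groups]
print(names)

# A single subset can also be fetched by name without looping.
new_york = groups.get_group("New York")
print(new_york)
```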

I am not sure exactly what you need to do with each subset, but, for example, to print the list of emails you could do:

for locname, subset in groups:
    print(subset['Contact'])
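For the email use case in the question, one possible approach is to group by both Contact and LOC and build one message body per group. This is only a sketch under assumptions: the sample rows, the helper `build_body`, and the `messages` dictionary are hypothetical, and the step that actually sends the mail through win32 is deliberately left out.

```python
import pandas as pd

# Hypothetical sample data mirroring the columns shown in the question.
df = pd.DataFrame({
    "Contact": ["mike@osjloc1.com", "mike@osjloc1.com",
                "jeff@osjloc2.com", "dave@osjloc.com"],
    "LOC": ["New York", "New York", "Boston", "Los Angeles"],
    "Add_Move": ["Add", "Move", "Add", "Move"],
    "First Name": ["Joe", "Stan", "Sonya", "Steve"],
})


def build_body(location, rows):
    """Hypothetical helper: render one location's rows as plain text."""
    return f"Changes for {location}:\n" + rows.to_string(index=False)


# Grouping by two columns yields a (contact, location) tuple as the key.
messages = {}
for (contact, location), subset in df.groupby(["Contact", "LOC"]):
    body = build_body(location, subset[["First Name", "Add_Move"]])
    messages[(contact, location)] = body

for (contact, location), body in messages.items():
    print(f"To: {contact}\n{body}\n")
```

Keying on both columns keeps the messages separate if one contact is ever responsible for more than one location.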


