Pandas groupby on a list of lists with at least one element in common

I am analyzing a CSV file that maps names to lists of their mobile numbers.

Now I wish to group this dataset by 'phone_number', so that rows end up in the same group when at least one of the numbers in their lists matches.

For example, if Dr. ABC has phone_number=['1234','3456','7890'] in one sample and Dr. ABC has phone_number=['7676','1234','8765'] in another sample, these rows should be aggregated together because '1234' is common to both. Rows without any match should also be retained.

The required output is a list of rx_id values after grouping over phone_number. Can this be done using pandas groupby(), or with some other trick? Thanks for the help!

IIUC, you can use explode and duplicated:

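import numpy as np
import pandas as pd

df = pd.DataFrame({"doctor_name": ["Dr. ABC", "Dr. ABC", "Dr. Who", "Dr. Strange"],
                   "phone_number": [['1234', '3456', '7890'], ['7676', '1234', '8765'], np.nan, ["8697059406"]]})

# one row per phone number; the original row index is repeated
df = df.explode("phone_number")

# number of exploded rows per doctor
s = df["doctor_name"].value_counts()

# keep repeated numbers, plus doctors that have only a single row
# add .groupby("doctor_name").agg(list) if you want them back into a list
print(df[df.duplicated("phone_number") | df["doctor_name"].isin(s[s.eq(1)].index)])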

   doctor_name phone_number
1      Dr. ABC         1234
2      Dr. Who          NaN
3  Dr. Strange   8697059406
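
The output above flags the repeated number but does not assign a group label to each original row. If the goal is to group rows whose lists share at least one number (including rows linked only through a chain of other rows), one possible approach is a small union-find pass over the lists. The sketch below is an illustration, not part of the original answer; the function name group_by_shared_element is hypothetical, and it assumes missing values appear as non-list objects such as NaN.

```python
def group_by_shared_element(lists):
    """Return one group label per row. Rows whose lists share at least one
    element, directly or through a chain of other rows, get the same label."""
    parent = list(range(len(lists)))

    def find(i):
        # Walk up to the root of i's group, shortening the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    first_row_with = {}  # element -> index of the first row that contained it
    for row, items in enumerate(lists):
        if not isinstance(items, (list, tuple, set)):
            continue  # missing value (e.g. NaN): the row keeps its own group
        for item in items:
            if item in first_row_with:
                # Merge this row's group with the group that already has the item.
                a, b = find(row), find(first_row_with[item])
                parent[max(a, b)] = min(a, b)
            else:
                first_row_with[item] = row

    # Relabel the roots as 0, 1, 2, ... in order of first appearance.
    labels = {}
    return [labels.setdefault(find(row), len(labels)) for row in range(len(lists))]


# Same phone lists as in the example above (before explode).
phone_numbers = [['1234', '3456', '7890'], ['7676', '1234', '8765'],
                 float("nan"), ["8697059406"]]
print(group_by_shared_element(phone_numbers))  # [0, 0, 1, 2]
```

With a DataFrame that still holds the lists (that is, before calling explode), the labels could be attached with df["group"] = group_by_shared_element(df["phone_number"].tolist()), and, assuming an rx_id column as described in the question, collected with df.groupby("group")["rx_id"].agg(list). Rows without any match keep their own single-row group, so they are retained.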
