
Find the index of a dataframe in a list of tuples and add a column with its corresponding index in the list of tuples to the dataframe

I'm trying to write something in Python that looks up each index label of a dataframe in a list of tuples. If the acronym (which is the index of the dataframe) is in the tuple at index 0 of the list, then a 0 should be added, in a new column named cluster, to the row that has that acronym as its index. I'm attaching the list of tuples and the dataframe as tables. Please let me know! Thank you!

List of tuples:

tuple index.   Values.
0              MA, PA, CA
1              NY, FL

Dataframe:

DF Index.   Cluster
NY          1
MA          0

Here two dataframes are created and their indexes are set from the columns 'tuple index.' and 'DF Index.'. The 'Cluster' column is initially populated with np.nan. A list comprehension then checks, for each label in the 'DF Index.' index, which rows of the df['Values.'] column contain that label. The resulting boolean mask is used to select from df.index, and the first matching index value is extracted. The df1['Cluster'] column is then populated with these values.

import numpy as np
import pandas

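# Lookup table: one row per tuple, indexed by the tuple's position in the list
df = pandas.DataFrame({'tuple index.': (0, 1), 'Values.': ['MA,PA,CA', 'NY,FL']})
df = df.set_index('tuple index.')
# Target dataframe, indexed by acronym, with an empty 'Cluster' column
df1 = pandas.DataFrame({'DF Index.': ('NY', 'MA'), 'Cluster': [np.nan, np.nan]})
df1 = df1.set_index('DF Index.')

# For each acronym in df1, take the index of the first df row whose string contains it.
# Iterate over df1.index (not over the length of df) so the two frames may differ in size.
q = [df.index[df['Values.'].str.find(label) != -1][0] for label in df1.index]
df1['Cluster'] = q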

print(df1)

Output

           Cluster
DF Index.         
NY               1
MA               0

To make this clearer, I will break the search down into its parts.

a = df['Values.'].str.find(df1.index[0])  # position of the match in each row, or -1 where there is none
b = (a != -1)  # boolean Series: True for the rows that contain the label
c = df.index[b]  # apply the mask to df.index to keep the matching index values
d = c[0]  # extract the first matching index value
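Two caveats about the substring search: c[0] raises an IndexError when a label appears in no row, and str.find also matches a label that is merely part of a longer string. If the data really is a Python list of tuples, as the question describes, one possible alternative is to invert the list into a dictionary and map the dataframe index through it. This is only a sketch under assumptions: the name `clusters`, the name `acronym_to_cluster`, and the extra 'TX' row (added to show the no-match case) are made up for illustration and are not part of the original post.

```python
import pandas as pd

# Assumed shape of the question's data: a tuple's position in the list is its cluster number.
clusters = [("MA", "PA", "CA"), ("NY", "FL")]

# Dataframe indexed by acronym; "TX" is a hypothetical row that appears in no tuple.
df1 = pd.DataFrame(index=pd.Index(["NY", "MA", "TX"], name="DF Index."))

# Invert the list into {acronym: position of the tuple that contains it}.
acronym_to_cluster = {
    acronym: position
    for position, members in enumerate(clusters)
    for acronym in members
}

# Index.map with a dict leaves NaN for labels that are missing from the dict.
df1["Cluster"] = df1.index.map(acronym_to_cluster)

print(df1)
```

Because of the missing value for 'TX', the 'Cluster' column comes out as floats here; with every label matched it would hold plain integers.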
