Check list values and assign them to a dictionary key in Python

I have a list of words as below.

mylist = ['cat', 'yellow', 'car', 'red', 'green', 'jeep', 'rat','lorry']

I also have a list of lists, one inner list per essay in the dataset, holding an indicator for each word in 'mylist' as in the example below (i.e., if a 'mylist' word appears in the essay I set it to 1, otherwise 0).

[[0,1,0,0,0,1,0,1], [1,0,0,0,0,1,0,0]]

In other words,

[0,1,0,0,0,1,0,1] says that this essay contains only the words 'yellow', 'jeep' and 'lorry'.
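
As a minimal sketch of that reading, the indicator list can be decoded back into words by pairing it with 'mylist' position by position. The names `row` and `present` are only illustrative and do not appear in the question.

```python
mylist = ['cat', 'yellow', 'car', 'red', 'green', 'jeep', 'rat', 'lorry']
row = [0, 1, 0, 0, 0, 1, 0, 1]  # indicators for one essay

# Keep each word whose indicator at the same position is 1.
present = [word for word, flag in zip(mylist, row) if flag == 1]
print(present)  # ['yellow', 'jeep', 'lorry']
```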

Now I have a dictionary of categories as below.

mydictionary = {'colour': ['red', 'yellow', 'green'], 'animal': ['rat','cat'], 
'vehicle': ['car', 'jeep']}

Now, using the keys of 'mydictionary', I want to transform the list of lists as follows (that is, if one or more of a category's 'mylist' words is 1, I mark that key as 1, otherwise 0).

[[1,0,1], [0,1,1]]

In other words,

[1,0,1] says that:
1 - one or more '1's for elements in 'colour'
0 - no '1's for elements in 'animal'
1 - one or more '1's for elements in 'vehicle'

So my output should be a list of lists as mentioned above -> [[1,0,1], [0,1,1]]

I am new to pandas, so I am interested in knowing whether this can be done using pandas DataFrames.

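Setup

import numpy as np
import pandas as pd

# a: the words; b: one boolean row per essay (True where the word is present)
a = np.array(['cat', 'yellow', 'car', 'red', 'green', 'jeep', 'rat','lorry'])
b = np.array([[0,1,0,0,0,1,0,1], [1,0,0,0,0,1,0,0]], dtype=bool)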

mydictionary = {
    'colour': ['red', 'yellow', 'green'],
    'animal': ['rat','cat'], 
    'vehicle': ['car', 'jeep']
}

Solution
Some minor additional setup
I just needed to get an array of sets in the correct order.

o = ['colour', 'animal', 'vehicle']
s = pd.Series(mydictionary).apply(set).loc[o]

s

colour     {green, red, yellow}
animal               {cat, rat}
vehicle             {jeep, car}
dtype: object

Use set intersection with numpy broadcasting

(s.values & [[set(a[l])] for l in b]).astype(bool).astype(int)

array([[1, 0, 1],
       [0, 1, 1]])

Additional Explanation

If I'm to use numpy broadcasting, I already have a series whose values are the category sets

s.values

[{'green', 'red', 'yellow'} {'cat', 'rat'} {'jeep', 'car'}]

Then I need a 2-D array holding the other sets, one set of present words per essay

[[set(a[l])] for l in b]

[[{'jeep', 'lorry', 'yellow'}], [{'cat', 'jeep'}]]

When I broadcast the & operation, each essay's set is intersected with each category set

s.values & [[set(a[l])] for l in b]

[[{'yellow'}  set()    {'jeep'}]
 [set()       {'cat'}  {'jeep'}]]

Conveniently, empty sets evaluate to False and non-empty sets to True in a bool context. Follow that with a conversion to int and we have our solution.

(s.values & [[set(a[l])] for l in b]).astype(bool).astype(int)

array([[1, 0, 1],
       [0, 1, 1]])
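
For comparison, here is a sketch of a different technique that reaches the same result without object arrays of sets: build a 0/1 word-by-category membership matrix and multiply the indicator rows by it. This is not the method used above, and the names `words`, `flags`, `order` and `membership` are only illustrative.

```python
import numpy as np

words = np.array(['cat', 'yellow', 'car', 'red', 'green', 'jeep', 'rat', 'lorry'])
flags = np.array([[0, 1, 0, 0, 0, 1, 0, 1],
                  [1, 0, 0, 0, 0, 1, 0, 0]])
mydictionary = {
    'colour': ['red', 'yellow', 'green'],
    'animal': ['rat', 'cat'],
    'vehicle': ['car', 'jeep'],
}
order = ['colour', 'animal', 'vehicle']  # desired order of output columns

# membership[i, j] is 1 when words[i] belongs to category order[j].
membership = np.array([np.isin(words, mydictionary[k]) for k in order]).T.astype(int)

# flags @ membership counts, per essay and category, how many present words match;
# any count above zero becomes 1.
result = (flags @ membership > 0).astype(int)
print(result.tolist())  # [[1, 0, 1], [0, 1, 1]]
```

Words that belong to no category, such as 'lorry' here, simply get an all-zero row in `membership` and never contribute.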

I think you need:

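import pandas as pd

# sample data (note: slightly modified from the question, 'lorry' is listed under 'animal')
mylist = ['cat', 'yellow', 'car', 'red', 'green', 'jeep', 'rat','lorry']
a = [[1,1,0,0,0,1,0,1], [1,0,0,0,0,1,0,0]]
mydictionary = {'colour': ['red', 'yellow', 'green'], 'animal': ['rat','cat', 'lorry'], 
'vehicle': ['car', 'jeep']}
#order of output categories
cols = ['colour','animal','vehicle']

df = pd.DataFrame(a, columns=mylist)
# invert the dictionary: word -> category
d = {k: oldk for oldk, oldv in mydictionary.items() for k in oldv}
# rename word columns to their category, then take the max per category;
# transposing avoids groupby(axis=1), which is deprecated in recent pandas versions
df = df.rename(columns=d).T.groupby(level=0).max().T.reindex(columns=cols)
print(df)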
   colour  animal  vehicle
0       1       1        1
1       0       1        1

L = df.values.tolist()
print (L)
[[1, 1, 1], [0, 1, 1]]
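
The key step in the code above is inverting the category dictionary so that each word points to its category, which is what lets `rename` relabel the word columns. A small sketch of just that step, using this answer's dictionary; the name `word_to_category` is illustrative and corresponds to `d` above.

```python
mydictionary = {
    'colour': ['red', 'yellow', 'green'],
    'animal': ['rat', 'cat', 'lorry'],
    'vehicle': ['car', 'jeep'],
}

# Map every word to the category that lists it.
word_to_category = {
    word: category
    for category, words in mydictionary.items()
    for word in words
}
print(word_to_category['yellow'])  # colour
print(word_to_category['jeep'])    # vehicle
```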

Here is another approach without pandas:

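# mylist and mydictionary are as defined in the question
list_of_list = [[0,1,0,0,0,1,0,1], [1,0,0,0,0,1,0,0]]

for i, row in enumerate(list_of_list):
    # temp_list holds the words present in this row, e.g. ['yellow', 'jeep', 'lorry']
    temp_list = [mylist[j] for j in range(len(row)) if row[j] == 1]

    # one flag per category, in dictionary order: 1 if any present word belongs to it
    flags = [int(any(item in words for item in temp_list))
             for words in mydictionary.values()]

    # now overwrite the row in the list of lists
    list_of_list[i] = flags

print(list_of_list)  # [[1, 0, 1], [0, 1, 1]]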

I didn't run the code, so there might be some minor bugs, but I hope you get the idea.
