Fill dictionary with value from the same row, but different column
Lately I've been trying to map some values, so I'm trying to create a dictionary to do so. The odd thing is that my DataFrame has a column made of lists, and DataFrames are always a bit awkward with lists. The DataFrame has the following structure:
rules procedure
['10','11','12'] 1
['13','14'] 2
['20','21','22','24'] 3
So I want to create a dictionary that maps '10' to 1, '14' to 2, and so on. I tried the following:
dicc=dict()
for j in df['rules']:
    for i,k in zip(j,df.procedure):
        dicc[i]=k
But that isn't working. It probably has something to do with indexes. What am I missing?
Edit: I'm trying to create a dictionary that maps the values '10', '11', '12' to 1; '13', '14' to 2; and '20', '21', '22', '24' to 3, so that if I type dicc['10'] I get 1, and if I type dicc['22'] I get 3. Obviously, the actual DataFrame is much bigger and I can't do it manually.
You can do it like this:
import pandas as pd
data = [[['10', '11', '12'], 1],
[['13', '14'], 2],
[['20', '21', '22', '24'], 3]]
df = pd.DataFrame(data=data, columns=['rules', 'procedure'])
d = {r : p for rs, p in df[['rules', 'procedure']].values for r in rs}
print(d)
Output
{'20': 3, '10': 1, '11': 1, '24': 3, '14': 2, '22': 3, '13': 2, '12': 1, '21': 3}
Notes:
{r : p for rs, p in df[['rules', 'procedure']].values for r in rs} is a dictionary comprehension, the dictionary counterpart of a list comprehension.
df[['rules', 'procedure']].values is equivalent to zip(df.rules, df.procedure): each iteration yields a (list, int) pair, so the variable rs is a list and p is an integer.
Finally, the second for loop iterates over the values of rs.
UPDATE
As suggested by @piRSquared, you can use zip:
d = {r : p for rs, p in zip(df.rules, df.procedure) for r in rs}
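As for what the loop in the question was missing, here is a minimal sketch. It assumes plain Python lists standing in for df['rules'] and df['procedure'] (the names rules, procedure, wrong and right are only illustrative), so it runs without pandas. The original loop zips each list of rules against the whole procedure column, which pairs elements by position instead of by row.

```python
# Plain lists stand in for df['rules'] and df['procedure'] (an assumption
# made so the sketch runs without pandas).
rules = [['10', '11', '12'], ['13', '14'], ['20', '21', '22', '24']]
procedure = [1, 2, 3]

# Loop from the question: zip(j, procedure) walks each list of rules
# alongside the *whole* procedure column, so items are paired by position.
wrong = {}
for j in rules:
    for i, k in zip(j, procedure):
        wrong[i] = k

# Corrected loop: pair each list with the procedure of its own row first,
# then assign that procedure to every rule in the list.
right = {}
for rs, p in zip(rules, procedure):
    for r in rs:
        right[r] = p

print(wrong)   # '11' maps to 2, and '24' is missing because zip stops early
print(right)   # every rule maps to the procedure of its own row
```

The corrected loop is the long-hand form of the dictionary comprehension above, so either can be used.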
cytoolz
from cytoolz.dicttoolz import merge
merge(*map(dict.fromkeys, df.rules, df.procedure))
{'10': 1,
'11': 1,
'12': 1,
'13': 2,
'14': 2,
'20': 3,
'21': 3,
'22': 3,
'24': 3}
I updated my post to mimic how @jpp passed multiple iterables to map. @jpp's answer is very good. Though I'd advocate for upvoting all useful answers, I wish I could upvote their answer again (-:
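To see what map(dict.fromkeys, ...) does with two iterables, here is a rough sketch using only the standard library. Plain lists stand in for the two DataFrame columns, and functools.reduce is used as a stand-in for cytoolz's merge; both are assumptions made for illustration, not what the answer itself uses.

```python
from functools import reduce

# Plain lists standing in for df.rules and df.procedure (assumption).
rules = [['10', '11', '12'], ['13', '14'], ['20', '21', '22', '24']]
procedure = [1, 2, 3]

# With two iterables, map calls dict.fromkeys(rs, p) once per row,
# producing one small dictionary per row.
per_row = list(map(dict.fromkeys, rules, procedure))
print(per_row[1])

# Standard-library stand-in for merge: fold the per-row dictionaries
# into one, left to right (later rows win on duplicate keys).
merged = reduce(lambda acc, d: {**acc, **d}, per_row, {})
print(merged)
```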
Using collections.ChainMap:
from collections import ChainMap
res = dict(ChainMap(*map(dict.fromkeys, df['rules'], df['procedure'])))
print(res)
{'10': 1, '11': 1, '12': 1, '13': 2, '14': 2,
'20': 3, '21': 3, '22': 3, '24': 3}
For many uses, the final dict conversion is not necessary:
"A ChainMap class is provided for quickly linking a number of mappings so they can be treated as a single unit. It is often much faster than creating a new dictionary and running multiple update() calls."
See also: What is the purpose of collections.ChainMap?
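A small sketch of using the ChainMap directly, without the final dict conversion. Plain lists stand in for df['rules'] and df['procedure'] here (an assumption, so the snippet needs only the standard library), and the names cm and dup are illustrative.

```python
from collections import ChainMap

# Plain lists standing in for df['rules'] and df['procedure'] (assumption).
rules = [['10', '11', '12'], ['13', '14'], ['20', '21', '22', '24']]
procedure = [1, 2, 3]

# One small dictionary per row, linked together without copying.
cm = ChainMap(*map(dict.fromkeys, rules, procedure))

# Lookups work like on a regular dictionary.
print(cm['10'])
print(cm['22'])
print(cm.get('99'))   # missing keys behave as usual with .get
print(len(cm))

# If a rule appeared in two rows, the FIRST mapping in the chain wins,
# whereas a dictionary comprehension would keep the LAST assignment.
dup = ChainMap({'10': 1}, {'10': 2})
print(dup['10'])
```

Each lookup searches the underlying mappings in order, so with one mapping per row a flat dict may be the better choice for a large DataFrame with many lookups.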
You can flatten the list:
dict(zip(sum(df.rules.tolist(),[]),df.procedure.repeat(df.rules.str.len())))
Out[60]:
{'10': 1,
'11': 1,
'12': 1,
'13': 2,
'14': 2,
'20': 3,
'21': 3,
'22': 3,
'24': 3}
Using itertools.chain and DataFrame.itertuples:
from itertools import chain

dict(
    chain.from_iterable(
        ((rule, row.procedure) for rule in row.rules) for row in df.itertuples()
    )
)