From pandas to dictionary so that the value in column one will be the key and the corresponding values in column two will all be in a list

I have a very large pandas DataFrame like the following:

        t   gid
0   2010.0  67290
1   2020.0  92780
2   2040.0  92780
3   2060.0  92780
4   2090.0  92780
5   2110.0  92780
6   2140.0  92780
7   2190.0  92780
8   2010.0  69110
9   2010.0  78420
10  2020.0  78420
11  2020.0  78420
12  2030.0  78420
13  2040.0  78420

and I want to convert it to a dictionary such that:

gid_to_t[gid] == list of all t's for that gid,

for example: gid_to_t[92780] == [2020,2040,2060,2090,2110...]

I know I can do the following:

gid_to_t = {}
for i,gid in enumerate(list(sps.gid)):
    gid_to_t[gid] = list(sps[sps.gid==gid].t)

but it takes too long, and I would be happy to find a faster way.

Thanks

EDIT

I've timed the methods suggested in the comments. This is the data: https://drive.google.com/open?id=1d3zUkc543hm8CZ_ZyzAzdbmQUE_G55bU

import pandas as pd
df1 = pd.read_pickle('stack.pkl')

%timeit -n 2 df1.groupby('gid')['t'].apply(list).to_dict()
2 loops, best of 3: 4.76 s per loop
%timeit -n 2 df1.groupby('gid')['t'].apply(lambda x: x.tolist()).to_dict()
2 loops, best of 3: 4.21 s per loop
%timeit -n 2 df1.groupby('gid', sort=False)['t'].apply(list).to_dict()
2 loops, best of 3: 4.84 s per loop
%timeit -n 2 {name: group.tolist() for name, group in df1.groupby('gid')['t']}
2 loops, best of 3: 4 s per loop
%timeit -n 2 {name: group.tolist() for name, group in df1.groupby('gid', sort=False)['t']}
2 loops, best of 3: 3.96 s per loop
%timeit -n 2 {name: group['t'].tolist() for name, group in df1.groupby('gid', sort=False)}
2 loops, best of 3: 7.16 s per loop

Try creating the dictionary with to_dict from a Series of lists built by groupby:

#if necessary convert column to int
df.t = df.t.astype(int)
d = df.groupby('gid')['t'].apply(list).to_dict()
print (d)
{92780: [2020, 2040, 2060, 2090, 2110, 2140, 2190], 
 67290: [2010], 
 78420: [2010, 2020, 2020, 2030, 2040], 
 69110: [2010]}

print (d[78420])
[2010, 2020, 2020, 2030, 2040]
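
To try this without the original data file, the sample rows from the question can be rebuilt by hand. The sketch below is only an illustration: the DataFrame is typed in from the table in the question, and the variable name gid_to_t is borrowed from the question rather than from this answer.

```python
import pandas as pd

# Sample rows copied from the table in the question.
df = pd.DataFrame({
    "t": [2010.0, 2020.0, 2040.0, 2060.0, 2090.0, 2110.0, 2140.0, 2190.0,
          2010.0, 2010.0, 2020.0, 2020.0, 2030.0, 2040.0],
    "gid": [67290, 92780, 92780, 92780, 92780, 92780, 92780, 92780,
            69110, 78420, 78420, 78420, 78420, 78420],
})

# Convert t to int so the lists hold 2010 rather than 2010.0.
df["t"] = df["t"].astype(int)

# Group the t values by gid, turn each group into a list,
# then turn the resulting Series (index = gid) into a dict.
gid_to_t = df.groupby("gid")["t"].apply(list).to_dict()

print(gid_to_t[78420])
```

Duplicate t values within a gid (such as the two 2020 rows for 78420) are kept, and each list follows the row order of the DataFrame.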

If performance is important, try adding the sort=False parameter to groupby (each line below is an alternative that builds the same dictionary):

d = df.groupby('gid', sort=False)['t'].apply(list).to_dict()
d = {name: group.tolist() for name, group in df.groupby('gid', sort=False)['t']}
d = {name: group['t'].tolist() for name, group in df.groupby('gid', sort=False)}
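
Besides skipping the sort, sort=False changes the order of the keys: groups come out in order of first appearance instead of sorted order. The timings in the question's EDIT show little difference between the two on the asker's data, so it is worth measuring on your own. A small sketch on a hand-made frame (the data and names here are illustrative, not the asker's file):

```python
import pandas as pd

# Tiny illustrative frame; gid 92780 appears before 69110.
df = pd.DataFrame({
    "t": [2010, 2020, 2040, 2010, 2010, 2020],
    "gid": [67290, 92780, 92780, 69110, 78420, 78420],
})

# Default (sort=True): keys come out in ascending gid order.
d_sorted = df.groupby("gid")["t"].apply(list).to_dict()

# sort=False: keys come out in order of first appearance in the frame.
d_unsorted = df.groupby("gid", sort=False)["t"].apply(list).to_dict()

print(list(d_sorted))    # [67290, 69110, 78420, 92780]
print(list(d_unsorted))  # [67290, 92780, 69110, 78420]
```

Both dictionaries hold the same key/list pairs, so lookups behave identically; only iteration order differs.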

One more answer that doesn't use apply:

d = {name: group.tolist() for name, group in df.groupby('gid')['t']}

{67290: [2010.0],
 69110: [2010.0],
 78420: [2010.0, 2020.0, 2020.0, 2030.0, 2040.0],
 92780: [2020.0, 2040.0, 2060.0, 2090.0, 2110.0, 2140.0, 2190.0]}
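
If groupby itself is the bottleneck, another option to try is a plain Python pass over the two columns with collections.defaultdict. This is not one of the methods timed in the question, so whether it is faster on the asker's data is an assumption to verify. A minimal sketch on the sample rows:

```python
from collections import defaultdict

import pandas as pd

# Sample rows copied from the table in the question.
df = pd.DataFrame({
    "t": [2010.0, 2020.0, 2040.0, 2060.0, 2090.0, 2110.0, 2140.0, 2190.0,
          2010.0, 2010.0, 2020.0, 2020.0, 2030.0, 2040.0],
    "gid": [67290, 92780, 92780, 92780, 92780, 92780, 92780, 92780,
            69110, 78420, 78420, 78420, 78420, 78420],
})

# Walk the rows once, appending each t to the list for its gid.
# tolist() converts the columns to plain Python values first.
grouped = defaultdict(list)
for gid, t in zip(df["gid"].tolist(), df["t"].tolist()):
    grouped[gid].append(t)

# Convert to a regular dict so missing keys raise KeyError as usual.
gid_to_t = dict(grouped)

print(gid_to_t[78420])
```

This makes a single pass over the data, unlike the loop in the question, which rescans the whole DataFrame for every row.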
