How do I extract elements from a list in a pandas dataframe column?
I have the following lists:
dates = ['12/29/2020', '12/25/2020', '12/22/2020']
numbers = [ [1, 31, 35], [17, 23, 36], [29, 53, 56] ]
I used them to make a DataFrame:
df = pd.DataFrame(
{
'date':dates,
'nums': numbers
}
)
This gives me a DataFrame with two columns. I want to break out the elements of each list into 3 columns (one for each number in the list) to end up with the following DataFrame:
date num1 num2 num3
0 '12/29/2020' 1 31 35
1 '12/25/2020' 17 23 36
2 '12/22/2020' 29 53 56
How can I do this?
Create a new DataFrame from the nums column by converting it to a list first, and then concat it with the date column:
pd.concat([df.date, pd.DataFrame(df.nums.to_list()).add_prefix('num')], axis=1)
date num0 num1 num2
0 12/29/2020 1 31 35
1 12/25/2020 17 23 36
2 12/22/2020 29 53 56
Create a new DataFrame and join it back:
>>> df[['date']].join(pd.DataFrame(df['nums'].tolist()).rename(lambda x: f'num{x + 1}', axis=1))
date num1 num2 num3
0 12/29/2020 1 31 35
1 12/25/2020 17 23 36
2 12/22/2020 29 53 56
>>>
Or just use add_prefix (the column names then start at num0):
>>> df[['date']].join(pd.DataFrame(df['nums'].tolist()).add_prefix('num'))
date num0 num1 num2
0 12/29/2020 1 31 35
1 12/25/2020 17 23 36
2 12/22/2020 29 53 56
>>>
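One caveat with the join approach: pd.DataFrame(df['nums'].tolist()) gets a fresh 0..n-1 index, and join aligns on index labels. A minimal sketch, assuming a DataFrame whose index is not the default one (the labels 10, 20, 30 below are made up for illustration), passes index=df.index so the rows still line up:

```python
import pandas as pd

dates = ['12/29/2020', '12/25/2020', '12/22/2020']
numbers = [[1, 31, 35], [17, 23, 36], [29, 53, 56]]

# Hypothetical non-default index, only to show why alignment matters.
df = pd.DataFrame({'date': dates, 'nums': numbers}, index=[10, 20, 30])

# Reuse df.index so the expanded rows carry the same labels as the originals.
expanded = pd.DataFrame(df['nums'].tolist(), index=df.index)

# Name the new columns num1, num2, ... based on how many were produced.
expanded.columns = [f'num{i}' for i in range(1, expanded.shape[1] + 1)]

result = df[['date']].join(expanded)
print(result)
```

Without index=df.index, the join in this sketch would find no matching labels and fill the num columns with NaN.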
The other answers cover the case where you need to fix an already existing DataFrame, but if you have the opportunity, it is much easier to fix your data before creating the DataFrame:
In [1]: import pandas as pd
In [2]: dates = ['12/29/2020', '12/25/2020', '12/22/2020']
In [3]: numbers = [[1, 31, 35], [17, 23, 36], [29, 53, 56]]
In [4]: nums = {f"num{i}": n for i, n in enumerate(zip(*numbers), 1)}
In [5]: df = pd.DataFrame({"dates": dates, **nums})
In [6]: df
Out[6]:
dates num1 num2 num3
0 12/29/2020 1 31 35
1 12/25/2020 17 23 36
2 12/22/2020 29 53 56
Or, another way:
In [7]: data = [[date, *nums] for date, nums in zip(dates, numbers)]
In [8]: pd.DataFrame(data, columns=["dates", "num1", "num2", "num3"])
Out[8]:
dates num1 num2 num3
0 12/29/2020 1 31 35
1 12/25/2020 17 23 36
2 12/22/2020 29 53 56
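The column names above are hard-coded for three numbers per row. As a rough sketch, assuming every inner list has the same length as the first one, the names could instead be derived from the data (the variable names width and columns are my own, not from the answer):

```python
import pandas as pd

dates = ['12/29/2020', '12/25/2020', '12/22/2020']
numbers = [[1, 31, 35], [17, 23, 36], [29, 53, 56]]

# Assumption: all inner lists are as long as the first one.
width = len(numbers[0])

# Build "dates", "num1", ..., "num<width>" instead of typing them out.
columns = ['dates', *(f'num{i}' for i in range(1, width + 1))]

# One flat row per date: the date followed by its numbers.
data = [[date, *nums] for date, nums in zip(dates, numbers)]
df = pd.DataFrame(data, columns=columns)
print(df)
```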
You can use the DataFrame constructor like this:
# The number of columns is the length of one inner list, not the number of rows
pd.DataFrame(numbers,
             index=dates,
             columns=[f'num{i+1}' for i in range(len(numbers[0]))])\
  .rename_axis('dates').reset_index()
Output:
dates num1 num2 num3
0 12/29/2020 1 31 35
1 12/25/2020 17 23 36
2 12/22/2020 29 53 56