python pandas flatten a dataframe to a list

I have a df like so:

import pandas
a=[['1/2/2014', 'a', '6', 'z1'], 
   ['1/2/2014', 'a', '3', 'z1'], 
   ['1/3/2014', 'c', '1', 'x3'],
   ]
df = pandas.DataFrame.from_records(a[1:],columns=a[0])

I want to flatten the df so it is one continuous list like so:

['1/2/2014', 'a', '6', 'z1', '1/2/2014', 'a', '3', 'z1','1/3/2014', 'c', '1', 'x3']

I can loop through the rows and extend to a list, but is there a much easier way to do it?

You can use .flatten() on the DataFrame converted to a NumPy array:

df.to_numpy().flatten()

and you can also add .tolist() if you want the result to be a Python list.
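
Putting the pieces together, here is a minimal sketch. It assumes the frame is built with explicit column labels so that all three rows stay in the data; the labels `date`, `key`, `qty`, and `code` are hypothetical and do not come from the question.

```python
import pandas as pd

rows = [
    ['1/2/2014', 'a', '6', 'z1'],
    ['1/2/2014', 'a', '3', 'z1'],
    ['1/3/2014', 'c', '1', 'x3'],
]
# Hypothetical column labels, chosen only for illustration.
df = pd.DataFrame(rows, columns=['date', 'key', 'qty', 'code'])

# to_numpy() gives a 2-D array; flatten() walks it row by row (C order);
# tolist() turns the 1-D array into a plain Python list.
flat = df.to_numpy().flatten().tolist()
print(flat)
```

The result is ordered row by row, which matches the list asked for in the question.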

Edit

In previous versions of Pandas, the values attribute was used instead of the .to_numpy() method, as mentioned in the comments below.

Maybe use stack?

df.stack().values
array(['1/2/2014', 'a', '3', 'z1', '1/3/2014', 'c', '1', 'x3'], dtype=object)

(Edit: Incidentally, the DF in the Q uses the first row as labels, which is why they're not in the output here.)
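
To make the point about labels concrete, the sketch below rebuilds the frame exactly as the question does and then prepends the column labels to the stacked values. Whether the header row belongs in the result is an assumption about what the asker wants; the variable names `data_only` and `with_header` are illustrative.

```python
import pandas as pd

a = [
    ['1/2/2014', 'a', '6', 'z1'],
    ['1/2/2014', 'a', '3', 'z1'],
    ['1/3/2014', 'c', '1', 'x3'],
]
# As in the question: the first row becomes the column labels.
df = pd.DataFrame.from_records(a[1:], columns=a[0])

# stack() yields one value per (row, column) pair, row by row.
data_only = df.stack().tolist()

# Prepend the labels to recover the row that was consumed as the header.
with_header = df.columns.tolist() + data_only
print(data_only)
print(with_header)
```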

You can try with numpy:

import numpy as np
np.reshape(df.values, (1,df.shape[0]*df.shape[1]))

You can use the reshape method:

df.values.reshape(-1)
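
The two reshape calls above do not return the same shape, which matters if a flat list is the goal. The following sketch compares them on a small frame; the column labels are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical column labels, used only for illustration.
df = pd.DataFrame(
    [['1/2/2014', 'a', '3', 'z1'],
     ['1/3/2014', 'c', '1', 'x3']],
    columns=['date', 'key', 'qty', 'code'],
)
values = df.to_numpy()

# Shape (1, N): still two-dimensional, so tolist() gives a nested list.
two_d = np.reshape(values, (1, df.shape[0] * df.shape[1]))

# Shape (N,): one-dimensional, so tolist() gives a flat list.
one_d = values.reshape(-1)

print(two_d.shape, one_d.shape)
print(two_d.tolist())
print(one_d.tolist())
```

If the `(1, N)` form is used, taking the first element of the nested result (for example `two_d.tolist()[0]`) gives the flat list.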

The previously mentioned df.values.flatten().tolist() and df.to_numpy().flatten().tolist() are concise and effective, but I spent a very long time trying to learn how to 'do the work myself' via a list comprehension, without resorting to built-in functions.

For anyone else who is interested, try:

[ row for col in df for row in df[col] ]

It turns out that this solution for flattening a df via a list comprehension (which I haven't found elsewhere on SO) is just a small modification of the solution for flattening nested lists (which can be found all over SO):

[ val for sublst in lst for val in sublst ]
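
One thing to watch: iterating over a DataFrame yields its column labels, so the comprehension over `df` walks the data column by column rather than row by row. The sketch below contrasts the two orders; the column labels are hypothetical, and `itertuples(index=False, name=None)` is one possible way to iterate over rows as plain tuples.

```python
import pandas as pd

# Hypothetical column labels, used only for illustration.
df = pd.DataFrame(
    [['1/2/2014', 'a', '6', 'z1'],
     ['1/3/2014', 'c', '1', 'x3']],
    columns=['date', 'key', 'qty', 'code'],
)

# Column-major: all of 'date', then all of 'key', and so on.
by_column = [val for col in df for val in df[col]]

# Row-major: each row in turn, as the question asks for.
by_row = [val for row in df.itertuples(index=False, name=None) for val in row]

print(by_column)
print(by_row)
```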
