Convert Nx1 pandas dataframe with single 1xM array-containing column to M columns in Pandas dataframe
This is the dataframe I currently have. It is Nx1, with each cell containing a numpy array.
print(df)
                age
0  [35, 34, 55, 56]
1  [25, 34, 35, 66]
2  [45, 35, 53, 16]
.
.
.
N  [45, 35, 53, 16]
I would like to somehow unravel each cell so that every value in the array gets its own column.
# do conversion
print(df)
   age1  age2  age3  age4
0    35    34    55    56
1    25    34    35    66
2    45    35    53    16
.
.
.
N    45    35    53    16
You can reconstruct the dataframe from the lists, and customize the column names with:
df = pd.DataFrame(df.age.values.tolist())
df.columns += 1
df = df.add_prefix('age')
print(df)
   age1  age2  age3  age4
0    35    34    55    56
1    25    34    35    66
...
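For reference, a minimal self-contained sketch of the same idea, assuming every cell holds a NumPy array of the same length. The sample values come from the question; the name `wide` is only illustrative.

```python
import numpy as np
import pandas as pd

# Sample Nx1 frame; each cell holds a NumPy array (values from the question).
df = pd.DataFrame({"age": [np.array([35, 34, 55, 56]),
                           np.array([25, 34, 35, 66]),
                           np.array([45, 35, 53, 16])]})

# Turn the column into a list of rows and let the constructor build M columns.
wide = pd.DataFrame(df["age"].tolist())

# Default column labels are 0..M-1; relabel them as age1..ageM.
wide.columns = [f"age{i + 1}" for i in wide.columns]

print(wide)
```

Building the labels with a list comprehension is just one option; `df.columns += 1` followed by `add_prefix('age')` gives the same names.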
Here is another alternative:
import pandas as pd
df = pd.DataFrame({'age':[[35,34,55,54],[1,2,3,4],[5,6,7,8],[9,10,11,12]]})
df['age_aux'] = df['age'].astype(str).str.split(',')
for i in range(4):
    # strip the brackets and the space left over from the split, then cast back to int
    df['age_'+str(i)] = df['age_aux'].str.get(i).str.strip('[] ').astype(int)
df = df.drop(columns=['age','age_aux'])
print(df)
Output:
   age_0  age_1  age_2  age_3
0     35     34     55     54
1      1      2      3      4
2      5      6      7      8
3      9     10     11     12
You can create the DataFrame with the constructor for better performance, and change the column names with rename and f-strings:
df1 = (pd.DataFrame(df.age.values.tolist(), index=df.index)
         .rename(columns=lambda x: f'age{x+1}'))
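Passing `index=df.index` matters when the original frame does not use the default 0..N-1 index. A small sketch under that assumption; the string index labels here are hypothetical and not from the question.

```python
import pandas as pd

# Hypothetical frame with a non-default index to show what index=df.index preserves.
df = pd.DataFrame({"age": [[35, 34, 55, 56], [25, 34, 35, 66]]},
                  index=["a", "b"])

# Rebuild from the lists, keep the original row labels, rename 0..M-1 to age1..ageM.
df1 = (pd.DataFrame(df["age"].tolist(), index=df.index)
         .rename(columns=lambda x: f"age{x + 1}"))

print(df1)
```

Without `index=df.index` the new frame would get a fresh 0..N-1 index, which can misalign rows if it is later joined back to the original.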
Another variation is to apply pd.Series to the column and then set the column names:
df = pd.DataFrame({"age": [[1, 2, 3, 4], [2, 3, 4, 5]]})
df = df["age"].apply(pd.Series)
df.columns = ["age1", "age2", "age3", "age4"]
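If the array length M is not known ahead of time, the names can be derived from the result instead of typed out. A sketch of that variation; `wide` is an illustrative name.

```python
import pandas as pd

df = pd.DataFrame({"age": [[1, 2, 3, 4], [2, 3, 4, 5]]})

# Expand each list into its own row; columns come out labelled 0..M-1.
wide = df["age"].apply(pd.Series)

# Derive the names from the actual width rather than hard-coding them.
wide.columns = [f"age{i + 1}" for i in range(wide.shape[1])]

print(wide)
```

Note that `apply(pd.Series)` constructs one Series per row, so on large frames it tends to be slower than rebuilding the frame from `tolist()`.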