Pandas DataFrame: create new columns and fill them with values from the first column

Question

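I have a pandas DataFrame df with a single column, col. I want to loop over the values in col and add new columns filled with those values. For example, the first row holds a list of 3 elements, ['text1','text2','text3']; I want to add 3 columns and fill them with 'text1', 'text2' and 'text3'.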

import pandas as pd

df=pd.DataFrame({'col':[['text1','text2','text3'],['mext1','mext2'],['cext1']]})
df

    col
0   [text1, text2, text3]
1   [mext1, mext2]
2   [cext1]

I want this:

    col                     col_1     col_2     col_3
0   [text1, text2, text3]   text1     text2     text3
1   [mext1, mext2]          mext1     mext2     NaN
2   [cext1]                 cext1     NaN       NaN

Any help would be greatly appreciated.

Answer 1

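You can build a new DataFrame by converting the values in the single column to a list with tolist(). Each element of the inner lists then becomes its own column, and shorter lists are padded with None.

You can then concatenate it column-wise with the original DataFrame (axis=1).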

df_expand = pd.DataFrame(df['col'].tolist(), df.index)
df_expand.columns = df_expand.columns + 1
pd.concat([df['col'], df_expand.add_prefix('col_')], axis=1)

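To have None shown as NaN, append .replace({None: np.nan}) to the end of the last statement. This needs import numpy as np; the older np.NaN spelling was removed in NumPy 2.0.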
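The steps above can be wrapped into a small self-contained sketch. The helper name expand_list_column and its parameters are hypothetical, introduced only for illustration; it assumes every cell in the column holds a list. Rather than converting the padding values, it checks them with isna(), which treats None and NaN alike.

```python
import pandas as pd


def expand_list_column(df, column, prefix):
    """Hypothetical helper: spread a column of lists into numbered columns."""
    # One row per list; shorter lists are padded with missing values.
    expanded = pd.DataFrame(df[column].tolist(), index=df.index)
    # Shift the default 0-based column labels so numbering starts at 1.
    expanded.columns = expanded.columns + 1
    # Put the original column and the prefixed new columns side by side.
    return pd.concat([df[column], expanded.add_prefix(prefix)], axis=1)


df = pd.DataFrame({'col': [['text1', 'text2', 'text3'],
                           ['mext1', 'mext2'],
                           ['cext1']]})
result = expand_list_column(df, 'col', 'col_')

# isna() flags the padded cells whether they are stored as None or NaN.
missing_in_col_3 = result['col_3'].isna().tolist()
print(result)
print(missing_in_col_3)
```

Keeping the original index on the expanded frame is what lets concat line the rows up, so this should also hold for a DataFrame whose index is not the default 0..n-1.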

Answer 2

Another solution uses the DataFrame constructor, followed by rename on the columns and add_prefix:

print (pd.DataFrame(df.col.values.tolist(), index=df.col)
         .rename(columns = lambda x: x+1)
         .add_prefix('col_')
         .reset_index())

                     col  col_1  col_2  col_3
0  [text1, text2, text3]  text1  text2  text3
1         [mext1, mext2]  mext1  mext2   None
2                [cext1]  cext1   None   None

A solution where the max length of the lists in the col column is found with str.len:

import numpy as np
cols = df.col.str.len().max() + 1
print (cols)
4
print (pd.DataFrame(df.col.values.tolist(), index=df.col,columns = np.arange(1, cols))
         .add_prefix('col_')
         .reset_index())
                     col  col_1  col_2  col_3
0  [text1, text2, text3]  text1  text2  text3
1         [mext1, mext2]  mext1  mext2   None
2                [cext1]  cext1   None   None
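As a rough sketch of the str.len idea, the snippet below measures each list first and then builds the column labels from the longest one. It is a variation rather than the answer's exact code: it passes ready-made string labels to the constructor and attaches the new columns with df.join instead of using index=df.col and reset_index. The names lengths, width and labels are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'col': [['text1', 'text2', 'text3'],
                           ['mext1', 'mext2'],
                           ['cext1']]})

# .str.len() also works on list values and returns each list's length.
lengths = df['col'].str.len()
width = int(lengths.max())

# Labels col_1 .. col_<width>, one per position in the longest list.
labels = [f'col_{i}' for i in range(1, width + 1)]

# Shorter lists are padded with missing values up to `width` columns.
expanded = pd.DataFrame(df['col'].tolist(), index=df.index, columns=labels)

# Attach the new columns to the original frame by index.
result = df.join(expanded)
print(result)
```

Computing the width up front makes the number of new columns explicit, which can be handy if the labels are needed before the expanded frame is built.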

Answer 1: 3, 2017-01-03 09:44:39
Answer 2: 3, 2017-01-03 10:13:08
