Python Pandas DataFrame: creating new columns from the values in a column

Question

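I have searched several books and websites but cannot find anything that quite matches what I am trying to do. I would like to build itemized lists from a DataFrame and then reshape the data like this: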

      A     B                A     B     C     D  
0     1     aa          0    1     aa  
1     2     bb          1    2     bb  
2     3     bb          2    3     bb    aa  
3     3     aa     --\  3    4     aa    bb    dd  
4     4     aa     --/  4    5     cc  
5     4     bb  
6     4     dd  
7     5     cc

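I have tried groupby, stack, unstack and so on, but nothing I have attempted produces the desired result. In case it is not obvious, I am very new to Python. A solution would be great, but an understanding of the process I need to follow would be perfect.

Thanks in advance.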

Answer 1

With pandas, you can query for all the rows that match a given value, for example A = 4.
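As a minimal sketch of such a query (the sample data is copied from the question so the snippet stands alone, and the names `rows_for_4` and `values_for_4` are only illustrative), a boolean mask selects the rows where A equals 4:

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({'A': [1, 2, 3, 3, 4, 4, 4, 5],
                   'B': ['aa', 'bb', 'bb', 'aa', 'aa', 'bb', 'dd', 'cc']})

# Boolean mask: True for every row whose A equals 4
rows_for_4 = df[df['A'] == 4]

# The matching B values as a plain Python list
values_for_4 = rows_for_4['B'].tolist()
print(values_for_4)  # ['aa', 'bb', 'dd']
```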

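A crude but workable approach is to iterate over the distinct key values, collect all the matching results for each one into a list, and then convert that into a new DataFrame.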

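A sketch of the idea (df is the DataFrame from the question):

import pandas as pd

rows = []
for value in df['A'].unique():
    # all B values on the rows where A equals this value
    matches = df.loc[df['A'] == value, 'B'].tolist()
    rows.append([value] + matches)

# rows of unequal length are padded with missing values
df = pd.DataFrame(rows)
# or something of the sort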

Hope this helps.

Update from the comments:

import pandas as pd

animal_list = []

for animal in ['cat', 'dog', ...]:  # remaining values elided in the original
    newdf = df[df['A'] == animal]   # rows whose A matches this value

    body = [animal]
    for item in newdf['B']:
        body.append(item)

    animal_list.append(body)

df = pd.DataFrame(animal_list)
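The same collect-then-rebuild loop can probably be written more compactly with `groupby`. The following is a sketch rather than the answerer's code: it assumes the question's sample data, and the names `grouped` and `wide` are made up for illustration.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({'A': [1, 2, 3, 3, 4, 4, 4, 5],
                   'B': ['aa', 'bb', 'bb', 'aa', 'aa', 'bb', 'dd', 'cc']})

# One list of B values per distinct A, kept in order of appearance
grouped = df.groupby('A')['B'].apply(list)

# Prepend each key to its list; shorter rows are padded with missing values
wide = pd.DataFrame([[key] + values for key, values in grouped.items()])
print(wide)
```

The columns come out numbered 0, 1, 2, ... and would still need renaming to match the layout asked for in the question.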

Answer 2

A quick and dirty method that works for strings. Customize the column naming as needed.

import pandas as pd

data = {'A': [1, 2, 3, 3, 4, 4, 4, 5],
        'B': ['aa', 'bb', 'bb', 'aa', 'aa', 'bb', 'dd', 'cc']}
df = pd.DataFrame(data)

# size of the largest group; this helps with creating lists of the same size
maxlen = df.A.value_counts().max()

newdata = {}
for n, gdf in df.groupby('A'):
    # pad with empty strings so every list has length maxlen
    newdata[n] = list(gdf.B.values) + [''] * (maxlen - len(gdf.B))

# recreate DF with Col 'A' as index; experiment with other orientations
newdf = pd.DataFrame.from_dict(newdata, orient='index')

# customize this section
newdf.columns = list('BCD')  # assumes maxlen == 3, as in this data
newdf['A'] = newdf.index
newdf.index = range(len(newdf))
newdf = newdf[list('ABCD')]  # to set the desired order

print(newdf)

The result is:

   A   B   C   D
0  1  aa        
1  2  bb        
2  3  bb  aa    
3  4  aa  bb  dd
4  5  cc
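As an alternative that is not part of the original answer, the reshape can also be sketched without an explicit loop by numbering the rows inside each group and pivoting on that number. This assumes the question's sample data; the helper column `pos` and the lettered column names are illustrative choices.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({'A': [1, 2, 3, 3, 4, 4, 4, 5],
                   'B': ['aa', 'bb', 'bb', 'aa', 'aa', 'bb', 'dd', 'cc']})

# Number each row within its A group: 0 for the first B, 1 for the second, ...
position = df.groupby('A').cumcount()

# Spread B across columns, one column per position within the group
wide = (df.assign(pos=position)
          .pivot(index='A', columns='pos', values='B')
          .fillna(''))

# Rename the positional columns to B, C, D, ... and restore A as a column
wide.columns = [chr(ord('B') + int(i)) for i in wide.columns]
wide = wide.reset_index()
print(wide)
```

Unlike the fixed `list('BCD')` naming, the number of lettered columns here follows the size of the largest group.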


Answer 1: 0 2015-02-05 15:33:32

Answer 2: 0 2015-02-06 18:29:53
