
Iterating over conditions from columns and DataFrame to list conversion (pandas)

I have a DataFrame like this:

Item   Quantity  Price     Photo1     Photo2    Photo3    Photo4

A        2         30      A1.jpg      A2.jpg 
B        4         10      B1.jpg      B2.jpg    B3.jpg    B4.jpg
C        5         15      C1.jpg

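This is my previous question about getting the data from the DataFrame into this format:

How to split datas from columns and add to a list from a dataframe, also repeat the list elements for a single row? (pandas)

I first created a list: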

df1 = df.reindex(['Item','Quantity','Price','Photo1','Photo2','Photo3','Photo4','I','Q','P','PH'], axis=1)
df1['I'] = df1['I'].fillna('I')
df1['Q'] = df1['Q'].fillna('Q')
df1['P'] = df1['P'].fillna('P')
df1['PH'] = df1['PH'].fillna('PH')
vals = [['I','Item'],['Q','Quantity'],['P','Price']]

Based on the first question, I tried:

photo_df = df1.fillna('').filter(like='Photo')


vals = [y for x in photo_df.to_numpy() 
         for y in vals[:3] + [['PH',z] for z in x[x!='']] ]

The list it returns is:

vals = [['I','Item'],['Q','Quantity'],['P','Price'],['PH','A1.jpg'],['PH','A2.jpg'],
        ['I','Item'],['Q','Quantity'],['P','Price'],['PH','B1.jpg'],['PH','B2.jpg'],['PH','B3.jpg'],['PH','B4.jpg'],
        ['I','Item'],['Q','Quantity'],['P','Price'],['PH','C1.jpg']]

I want the list to be:

vals = [['I','Item'],['Q','Quantity'],['P','Price'],['PH','Photo1'],['PH','Photo2'],
        ['I','Item'],['Q','Quantity'],['P','Price'],['PH','Photo1'],['PH','Photo2'],['PH','Photo3'],['PH','Photo4'],
        ['I','Item'],['Q','Quantity'],['P','Price'],['PH','Photo1']]
   

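I want the list to contain the header names rather than the data, but the rows should still be iterated as in the question: How to split datas from columns and add to a list from a dataframe, also repeat the list elements for a single row? (pandas)

You can make a small change where photo_df is created, like this: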

import numpy as np

photo_df = df1.filter(like='Photo')
photo_df = photo_df.transform(lambda x: np.where(x.isnull(), x, x.name))
photo_df = photo_df.fillna('')

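The second line simply replaces each non-null value with the name of its column; null values are left as they are and then turned into empty strings by fillna(''). The list comprehension from the question can then be used unchanged.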

Output:

[['I', 'Item'], ['Q', 'Quantity'], ['P', 'Price'], ['PH', 'Photo1'], ['PH', 'Photo2'], 
['I', 'Item'], ['Q', 'Quantity'], ['P', 'Price'], ['PH', 'Photo1'], ['PH', 'Photo2'], 
['PH', 'Photo3'], ['PH', 'Photo4'], ['I', 'Item'], ['Q', 'Quantity'], ['P', 'Price'], ['PH', 'Photo1']]
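
As a rough illustration of that replacement step, here is a minimal sketch that uses Series.where in place of np.where. It is an alternative spelling, not the code from the answer above. The sample frame is rebuilt by hand from the question, on the assumption that the empty photo cells are NaN, and the name labelled is made up for this example.

```python
import numpy as np
import pandas as pd

# Photo columns from the question; missing photos are assumed to be NaN.
photo_df = pd.DataFrame({
    "Photo1": ["A1.jpg", "B1.jpg", "C1.jpg"],
    "Photo2": ["A2.jpg", "B2.jpg", np.nan],
    "Photo3": [np.nan, "B3.jpg", np.nan],
    "Photo4": [np.nan, "B4.jpg", np.nan],
})

# Series.where keeps a value where the condition is True and substitutes
# the second argument elsewhere, so NaN cells stay NaN and every filled
# cell becomes the name of its column.
labelled = photo_df.apply(lambda s: s.where(s.isna(), s.name))

# Turn the remaining NaN cells into empty strings, as in the answer.
labelled = labelled.fillna("")

print(labelled.to_numpy().tolist())
```

Each row then holds column names where a photo exists and empty strings elsewhere, which is the shape the question's list comprehension expects.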

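The idea is to filter the column names rather than the values in the list comprehension: change x[x!=''] to photo_df.columns[x!='']. Here photo_df is the frame from the question, df1.fillna('').filter(like='Photo').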

vals = [y for x in photo_df.to_numpy() 
          for y in vals[:3] + [['PH',z] 
          for z in photo_df.columns[x!='']]]
print (vals)
[['I', 'Item'], ['Q', 'Quantity'], ['P', 'Price'], ['PH', 'Photo1'], ['PH', 'Photo2'], 
 ['I', 'Item'], ['Q', 'Quantity'], ['P', 'Price'], ['PH', 'Photo1'], ['PH', 'Photo2'], ['PH', 'Photo3'], ['PH', 'Photo4'], 
 ['I', 'Item'], ['Q', 'Quantity'], ['P', 'Price'], ['PH', 'Photo1']]
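
For reference, the column-name approach can be put together end to end roughly as follows. This is only a sketch under stated assumptions: the DataFrame is typed in from the table in the question with NaN for the missing photos, the helper columns I, Q, P and PH are left out because the comprehension does not read them, and the name base is introduced here so that vals is not reused for two different things.

```python
import numpy as np
import pandas as pd

# Sample data from the question; empty photo cells are assumed to be NaN.
df = pd.DataFrame({
    "Item": ["A", "B", "C"],
    "Quantity": [2, 4, 5],
    "Price": [30, 10, 15],
    "Photo1": ["A1.jpg", "B1.jpg", "C1.jpg"],
    "Photo2": ["A2.jpg", "B2.jpg", np.nan],
    "Photo3": [np.nan, "B3.jpg", np.nan],
    "Photo4": [np.nan, "B4.jpg", np.nan],
})

# Pairs repeated for every row (hypothetical name, not from the question).
base = [["I", "Item"], ["Q", "Quantity"], ["P", "Price"]]

# Keep only the photo columns and mark missing photos with ''.
photo_df = df.filter(like="Photo").fillna("")

# For each row, emit the fixed pairs, then one ['PH', column name] pair
# for every photo column whose cell is not empty in that row.
vals = [
    pair
    for row in photo_df.to_numpy()
    for pair in base + [["PH", col] for col in photo_df.columns[row != ""]]
]

print(vals)
```

The boolean mask row != '' is computed per row and applied to photo_df.columns, so the number of PH entries follows the number of photos in each row (2, 4 and 1 for the sample data).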

