How to transpose list elements from multiple columns into rows in pandas DataFrame?
How to split data from columns of a DataFrame, add it to a list, and repeat the list elements for every row? (Pandas)
I have a DataFrame:
Product  Photo 1  Photo 2  Photo 3  Photo 4  Price
Shirt    a.jpg    b.jpg    c.jpg    d.jpg    100
Pant     e.jpg                               245
Coat     f.jpg    g.jpg                      433
and a list:
values = [['A','B','C'],['D','E','F','G'],['H','I','J','K']]
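The photo values from the DataFrame should be added to this list, starting at index [2], and the whole list should be repeated once for every row of the DataFrame. The photo columns should also be split into separate entries, as in the output format below. The list elements should cycle for each row.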
Expected output:
values = [['A','B','C'],['D','E','F','G'],['PHOTO','a.jpg'],['PHOTO','b.jpg'],
['PHOTO','c.jpg'],['PHOTO','d.jpg'],['H','I','J','K'],
['A','B','C'],['D','E','F','G'],
['PHOTO','e.jpg'],['H','I','J','K'], ['A','B','C'],['D','E','F','G'],
['PHOTO','f.jpg'], ['PHOTO','g.jpg'], ['H','I','J','K']
]
Then I want to convert this list into a DataFrame. What I tried:
L = [df.loc[:, x].set_axis(range(len(x)), axis=1) for x in values]
df = pd.concat(L).sort_index(kind='mergesort').fillna('').reset_index(drop=True)
df = df.fillna('')
Output of this code for the example above:
A B C
D E F G
H I J K
#the data frame repeats till the number of rows in the previous df.
Use a nested list comprehension that adds the PHOTO entries, together with the values list:
values = [['A','B','C'],['D','E','F','G'],['H','I','J','K']]
df1 = df.fillna('').filter(like='Photo')
print (df1)
  Photo 1 Photo 2 Photo 3 Photo 4
0   a.jpg   b.jpg   c.jpg   d.jpg
1   e.jpg
2   f.jpg   g.jpg
out = [y for x in df1.to_numpy()
for y in values[:2] + [['PHOTO', z] for z in x[x!='']] + values[2:]]
print (out)
[['A', 'B', 'C'], ['D', 'E', 'F', 'G'], ['PHOTO', 'a.jpg'], ['PHOTO', 'b.jpg'], ['PHOTO', 'c.jpg'], ['PHOTO', 'd.jpg'], ['H', 'I', 'J', 'K'],
['A', 'B', 'C'], ['D', 'E', 'F', 'G'], ['PHOTO', 'e.jpg'], ['H', 'I', 'J', 'K'],
['A', 'B', 'C'], ['D', 'E', 'F', 'G'], ['PHOTO', 'f.jpg'], ['PHOTO', 'g.jpg'], ['H', 'I', 'J', 'K']]
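The question also asks for the resulting list to be turned back into a DataFrame. A minimal end-to-end sketch follows, assuming the empty cells are None and that the shorter sublists should be padded with empty strings. The sample frame is rebuilt from the question, and the name `result` is introduced here only for illustration.

```python
import pandas as pd

# Sample data rebuilt from the question; empty cells are modelled as None (assumption).
df = pd.DataFrame({
    'Product': ['Shirt', 'Pant', 'Coat'],
    'Photo 1': ['a.jpg', 'e.jpg', 'f.jpg'],
    'Photo 2': ['b.jpg', None, 'g.jpg'],
    'Photo 3': ['c.jpg', None, None],
    'Photo 4': ['d.jpg', None, None],
    'Price': [100, 245, 433],
})
values = [['A', 'B', 'C'], ['D', 'E', 'F', 'G'], ['H', 'I', 'J', 'K']]

# Keep only the photo columns, with missing cells replaced by ''.
df1 = df.fillna('').filter(like='Photo')

# For every row: first two sublists, one ['PHOTO', name] per non-empty cell, then the rest.
out = [y for x in df1.to_numpy()
         for y in values[:2] + [['PHOTO', z] for z in x[x != '']] + values[2:]]

# Sublists have different lengths, so the constructor pads with missing values;
# fillna('') turns that padding into empty strings.
result = pd.DataFrame(out).fillna('')
print(result)
```

Each sublist becomes one row, so the frame has as many columns as the longest sublist.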
You could try something like this:
rows = [
['PHOTO'] + r.strip().split()
for r in df.filter(regex = 'Photo').to_string(header = False, index = False).split('\n')
]
values = values[:2] + rows + values[2:]
Output
values
[['A', 'B', 'C'], ['D', 'E', 'F', 'G'], ['PHOTO', 'a.jpg', 'b.jpg', 'c.jpg', 'd.jpg'], ['PHOTO', 'e.jpg'], ['PHOTO', 'f.jpg', 'g.jpg'], ['H', 'I', 'J', 'K']]
If the empty cells are None, then you have to do this first:
df = df.fillna('')
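To see why the fill matters for the `to_string` approach, here is a small sketch comparing both cases. The helper `photo_rows` and the sample frame `photos` are hypothetical names made up for this illustration. Note also that splitting on whitespace assumes file names contain no spaces.

```python
import pandas as pd

# Photo columns only, with None standing in for the empty cells (assumption).
photos = pd.DataFrame({
    'Photo 1': ['a.jpg', 'e.jpg', 'f.jpg'],
    'Photo 2': ['b.jpg', None, 'g.jpg'],
    'Photo 3': ['c.jpg', None, None],
    'Photo 4': ['d.jpg', None, None],
})

def photo_rows(frame):
    """Render the frame as text and split every line into whitespace-separated tokens."""
    text = frame.to_string(header=False, index=False)
    return [['PHOTO'] + line.split() for line in text.split('\n')]

# Without the fill, missing cells are printed as text and end up as extra tokens.
raw_rows = photo_rows(photos)

# With the fill, missing cells are blank and disappear when the line is split.
clean_rows = photo_rows(photos.fillna(''))

print(raw_rows)
print(clean_rows)
```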
Update after the question was edited:
jpgs = df.filter(regex = 'Photo').stack()
rows = [["PHOTO", jpg] for jpg in jpgs[jpgs != ''].unique()]
values = values[:2] + rows + values[2:]
Output
values
[['A', 'B', 'C'], ['D', 'E', 'F', 'G'], ['PHOTO', 'a.jpg'], ['PHOTO', 'b.jpg'], ['PHOTO', 'c.jpg'], ['PHOTO', 'd.jpg'], ['PHOTO', 'e.jpg'], ['PHOTO', 'f.jpg'], ['PHOTO', 'g.jpg'], ['H', 'I', 'J', 'K']]
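The update above produces a single block with all photos between the fixed sublists. If the block should instead repeat once per DataFrame row, as in the expected output of the question, the stacked Series can be grouped by its original row index. This is only a sketch under that assumption. The sample frame is rebuilt from the question, and `out` is a name chosen here.

```python
import pandas as pd

# Sample data rebuilt from the question; empty cells are modelled as None (assumption).
df = pd.DataFrame({
    'Product': ['Shirt', 'Pant', 'Coat'],
    'Photo 1': ['a.jpg', 'e.jpg', 'f.jpg'],
    'Photo 2': ['b.jpg', None, 'g.jpg'],
    'Photo 3': ['c.jpg', None, None],
    'Photo 4': ['d.jpg', None, None],
    'Price': [100, 245, 433],
})
values = [['A', 'B', 'C'], ['D', 'E', 'F', 'G'], ['H', 'I', 'J', 'K']]

# One long Series indexed by (row, column); drop the empty cells.
jpgs = df.filter(regex='Photo').fillna('').stack()
jpgs = jpgs[jpgs != '']

# Level 0 of the index is the original row, so each group holds one row's photos.
out = []
for _, group in jpgs.groupby(level=0):
    out += values[:2] + [['PHOTO', jpg] for jpg in group] + values[2:]

print(out)
```

A row without any photo has no entries left after filtering, so this variant skips it entirely instead of emitting a block without PHOTO entries.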