Take list as column values in pandas dataframe
I have a dataframe as below:
Card_x  Country  Age    Code      Card_y
S       INDIA    Adult  Garments  S,E,D,G,M,A
S       INDIA    Adult  Grocery   D,S,G,A,M,E
I have a list as below:
lis1 = ['S', 'D', 'G', 'E', 'M', 'A']
Now I want my dataframe to be as below:
Explanation: group by Card_x, Country and Age, and use the lis1 values as "Card_y".
Card_x  Country  Age    Card_y
S       INDIA    Adult  S,D,G,E,M,A
Can anyone help?
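Note: the logic for calculating lis1 is below:
lis1 = []
for i in range(len(df)):
    # split each Card_y string into an ordered list of cards
    l = df.Card_y.iloc[i].split(',')
    lis1.append(l)
# order the cards by their average position across all rows
lis1 = sorted(lis1[0], key=lambda elem: sum(sublist.index(elem) for sublist in lis1) / len(lis1))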
Basically, lis1 takes the rank of each Card_y value under each "Code", averages those ranks, and re-ranks the values from the lowest average to the highest.
E.g.: S is ranked 1st for Code Garments and 2nd for Code Grocery, so its average is (1+2)/2 = 1.5.
D is ranked 3rd for Code Garments and 1st for Code Grocery, so its average is (3+1)/2 = 2.
Sorting by the average, lowest first, gives the ranked list: S,D,G,E,M,A.
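The average-rank rule described above can be checked by hand in plain Python. This is only a minimal sketch using the two Card_y strings from the example; the names rows, rankings, average_rank and averages are made up for illustration and are not part of the original code.

```python
# Card_y strings for the two example rows (Code = Garments, Grocery)
rows = ["S,E,D,G,M,A", "D,S,G,A,M,E"]
rankings = [row.split(",") for row in rows]


def average_rank(card):
    """Mean 1-based position of `card` across all rankings."""
    return sum(r.index(card) + 1 for r in rankings) / len(rankings)


# Average rank per card, e.g. S -> (1 + 2) / 2 = 1.5
averages = {card: average_rank(card) for card in rankings[0]}

# sorted() is stable, so cards with equal averages keep their first-row order
lis1 = sorted(rankings[0], key=average_rank)

print(averages)
print(lis1)
```

M and A both average 5.0 here, so their relative order depends only on how ties are broken; in this sketch the order of the first row wins.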
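Try:
# Split each Card_y string into one row per card, keeping its 1-based rank
df_out = df.groupby(['Card_x', 'Country', 'Age'])['Card_y'].apply(
    lambda x: x.str.split(',', expand=True)
               .rename(columns=lambda c: c + 1)
               .stack()
               .reset_index(level=1))
# Average the rank of each card within a group and sort by that average
df_out = (df_out.groupby(['Card_x', 'Country', 'Age', 0])['level_1']
                .mean().sort_values().reset_index(level=-1))
# Join the cards back into one comma-separated string per group
df_out.groupby(['Card_x', 'Country', 'Age'])[0].agg(','.join).rename('Card_y').reset_index()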
Output:
  Card_x Country    Age       Card_y
0      S   INDIA  Adult  S,D,G,E,A,M
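Note that A and M tie at an average rank of 5, so the output above (A before M) and the expected output in the question (M before A) both follow the average-rank rule. If the first row's order should win on ties, one possible alternative is to reuse the question's sorted() logic inside a groupby aggregation. This is a sketch, not the answer's method; rank_cards is a hypothetical helper name, and the dataframe is rebuilt from the sample data in the question.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "Card_x": ["S", "S"],
    "Country": ["INDIA", "INDIA"],
    "Age": ["Adult", "Adult"],
    "Code": ["Garments", "Grocery"],
    "Card_y": ["S,E,D,G,M,A", "D,S,G,A,M,E"],
})


def rank_cards(card_strings):
    """Order the cards of one group by their mean position across its rows."""
    rows = [s.split(",") for s in card_strings]
    # sorted() is stable: ties keep the order of the group's first row
    ordered = sorted(rows[0], key=lambda c: sum(r.index(c) for r in rows) / len(rows))
    return ",".join(ordered)


result = (
    df.groupby(["Card_x", "Country", "Age"])["Card_y"]
      .agg(rank_cards)
      .reset_index()
)
print(result)
```

This assumes every row in a group contains the same set of cards; a card missing from any row would make list.index raise a ValueError.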