I have a data frame named variants_gene_list (shown as a screenshot in the original post).
I want to create a data frame, which contains all data in the "locus", "AF_afr" and "AF_nfe" columns as different arrays for each unique gene.
I have tried the following code:
variants_gene_list = data.groupby('gene').apply(lambda x: [list(x['locus']), list(x['AF_afr']), list(x['AF_nfe'])]).apply(pd.Series)
I got this data frame (currently, I have only one gene; the result was shown as a screenshot in the original post).
Question -
Answer to the first question:
variants_gene_list[X].iloc[Y][Z]
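Replace X with the label of the column you want, Y with the integer position of the row, and Z with the index of the element inside the list. Note that iloc is positional; to select a row by its label (for example the gene name), use .loc[Y] instead.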
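A minimal sketch of this lookup, assuming a frame shaped like the one the question's code produces (columns 0, 1, 2 holding lists, one row per gene). The gene names and values here are hypothetical and only serve to illustrate the indexing:

```python
import pandas as pd

# Hypothetical stand-in for the grouped result: one row per gene,
# columns 0, 1, 2 hold the locus, AF_afr and AF_nfe lists.
variants_gene_list = pd.DataFrame(
    {
        0: [["1:100", "1:200"], ["2:300"]],
        1: [[0.10, 0.20], [0.30]],
        2: [[0.01, 0.02], [0.03]],
    },
    index=pd.Index(["GENE_A", "GENE_B"], name="gene"),
)

# Column 1 (AF_afr), first row by position, second element of the list.
by_position = variants_gene_list[1].iloc[0][1]

# The same element, selecting the row by its label instead.
by_label = variants_gene_list.loc["GENE_A", 1][1]

print(by_position, by_label)
```

Using .loc with the gene name is usually easier to read once there is more than one gene in the frame.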
Answer to the second question:
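As far as I know, the groupby().apply(...).apply(pd.Series) approach from the question cannot keep the original column names, because the lists are expanded into columns numbered 0, 1, 2. The simplest fix is to rename the columns afterwards: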
variants_gene_list = variants_gene_list.rename(columns={0: 'locus', 1: 'AF_afr', 2:'AF_nfe'})
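As an alternative to renaming afterwards, one option that should keep the column names is to select the columns on the grouped object and aggregate each with list. This is a sketch rather than the asker's method, and the sample rows below are hypothetical:

```python
import pandas as pd

# Hypothetical sample input; the real `data` frame comes from the question.
data = pd.DataFrame(
    {
        "gene": ["GENE_A", "GENE_A", "GENE_B"],
        "locus": ["1:100", "1:200", "2:300"],
        "AF_afr": [0.10, 0.20, 0.30],
        "AF_nfe": [0.01, 0.02, 0.03],
    }
)

# Collect each selected column into one list per gene.
# The resulting columns keep the names locus, AF_afr and AF_nfe.
variants_gene_list = data.groupby("gene")[["locus", "AF_afr", "AF_nfe"]].agg(list)

print(variants_gene_list)
```

With this layout a single list can be read by name, for example variants_gene_list.loc["GENE_A", "AF_afr"].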