pandas data frame groupby column names

Question

I have a data frame named variants_gene_list.

I want to create a data frame, which contains all data in the "locus", "AF_afr" and "AF_nfe" columns as different arrays for each unique gene.

I tried the following code:

variants_gene_list = data.groupby('gene').apply(lambda x: [list(x['locus']), list(x['AF_afr']), list(x['AF_nfe'])]).apply(pd.Series)

The result is a new data frame (currently I have only one gene).

Questions:

How do I access the locus / AF_afr lists in the new data frame I created?
The new data frame has no column names. What am I missing? Thanks

Answer 1

Answer to the first question:

variants_gene_list[X].iloc[Y][Z]

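Replace X with the column label (0, 1 or 2 until the columns are renamed), Y with the integer position of the row, and Z with the index of the element within the list. Because .iloc is position-based, use .loc[Y] instead if you want to select the row by its gene name.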
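As a rough illustration of the access pattern, here is a minimal sketch. The sample values and gene names are made up, since the original data is not shown, and the three columns are selected before apply (an adjustment to the question's code, intended to avoid a deprecation warning in recent pandas versions).

```python
import pandas as pd

# Hypothetical sample data; the real values are not shown in the question.
data = pd.DataFrame({
    "gene": ["A", "A", "B"],
    "locus": ["chr1:100", "chr1:200", "chr2:300"],
    "AF_afr": [0.1, 0.2, 0.3],
    "AF_nfe": [0.4, 0.5, 0.6],
})

# Same idea as the question's code: one row per gene, columns labelled 0, 1, 2.
variants_gene_list = (
    data.groupby("gene")[["locus", "AF_afr", "AF_nfe"]]
    .apply(lambda x: [list(x["locus"]), list(x["AF_afr"]), list(x["AF_nfe"])])
    .apply(pd.Series)
)

# Column 0 holds the locus lists; .iloc[0] picks the first row by position.
first_locus_list = variants_gene_list[0].iloc[0]
second_locus = variants_gene_list[0].iloc[0][1]

# .loc selects the row by its label (the gene name) instead of its position.
afr_for_gene_a = variants_gene_list.loc["A", 1]
```

Here .iloc[0] and .loc["A"] point at the same row only because gene "A" happens to come first in the grouped result.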

Answer to the second question:

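The column names are lost because the lambda returns a plain Python list for each gene, and apply(pd.Series) then labels the resulting columns 0, 1 and 2. With this approach, the simplest fix is to rename the columns afterwards: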

variants_gene_list = variants_gene_list.rename(columns={0: 'locus', 1: 'AF_afr', 2:'AF_nfe'})
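As a possible alternative to renaming, selecting the three columns and aggregating each group with agg(list) should keep the original column names. This is a sketch with made-up sample data, not the asker's actual data frame; the name by_gene is chosen here for illustration.

```python
import pandas as pd

# Hypothetical sample data; the real values are not shown in the question.
data = pd.DataFrame({
    "gene": ["A", "A", "B"],
    "locus": ["chr1:100", "chr1:200", "chr2:300"],
    "AF_afr": [0.1, 0.2, 0.3],
    "AF_nfe": [0.4, 0.5, 0.6],
})

# Collect each column into one list per gene; the column names are preserved.
by_gene = data.groupby("gene")[["locus", "AF_afr", "AF_nfe"]].agg(list)

# Access a list by gene name and column name.
locus_for_gene_a = by_gene.loc["A", "locus"]
```

With named columns, the lookup reads by_gene.loc[gene, column] rather than relying on the positional labels 0, 1 and 2.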
