I have an application that generates data frames with varying numbers of columns, where each cell contains two values separated by "|".
gene_1 gene_2 ...
ashb|ESNT00011 wsefsf|ENST0008
adecasd|ENST0001 uibib|ENST0008
How can I iterate over the columns and split each one into two columns, named gene_1_name and gene_1_ID for gene_1 and so on for the others?
gene_1_name gene_1_ID gene_2_name gene_2_ID ...
ashb ESNT00011 wsefsf ENST0008
adecasd ENST0001 uibib ENST0008
Use stack and unstack:
result = (
    df.stack().str.split('|', expand=True)  # one row per (row, gene) pair, split into two parts
    .rename(columns={0: 'name', 1: 'ID'})   # label the two parts
    .unstack()                              # move the gene labels back into the columns
)
# Merge the two levels
result.columns = [f'{gene}_{col}' for col, gene in result.columns]
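For reference, here is a self-contained sketch of the same approach run on the sample data from the question. The construction of df and the final reordering step are additions for illustration, not part of the answer above. The reorder is included because unstack groups the resulting columns by part (all names first, then all IDs) rather than by gene, which differs from the layout requested in the question.

```python
import pandas as pd

# Sample data copied from the question (assumed to be plain strings).
df = pd.DataFrame({
    "gene_1": ["ashb|ESNT00011", "adecasd|ENST0001"],
    "gene_2": ["wsefsf|ENST0008", "uibib|ENST0008"],
})

result = (
    df.stack()                              # Series indexed by (row, gene)
    .str.split("|", expand=True)            # two columns: 0 and 1
    .rename(columns={0: "name", 1: "ID"})   # label the two parts
    .unstack()                              # columns become (part, gene) pairs
)

# Flatten the two column levels into names such as gene_1_name.
result.columns = [f"{gene}_{part}" for part, gene in result.columns]

# Optional: put each gene's name and ID side by side, in the original column order.
ordered = [f"{gene}_{part}" for gene in df.columns for part in ("name", "ID")]
result = result[ordered]

print(result)
```

Because the list of columns is taken from df.columns, the same code should work unchanged for frames with any number of gene columns, provided every cell contains exactly one "|".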