I'd like to rebuild my dataframe by combining two columns into one column, for example,
>>> df.set_index('df1')
        0  1  2  3  4  5
df1
GroupA  A  D  G  J  M  P
GroupB  B  E  H  K  N  Q
GroupC  C  F  I  L  O  R
# This is my dataframe.
Then I'd like to see my result like below.
>>> print(result)
df1      0   1   2
GroupA  AD  GJ  MP
GroupB  BE  HK  NQ
GroupC  CF  IL  OR
# That is, column 0 is combined with column 1, column 2 with column 3, column 4 with column 5, and so on.
I only know that I could use concat() to combine columns and apply(lambda xxx...) to set up a suitable function.
Can anyone give me a hint, or does anyone know how to do this with pandas in Python? Thanks,
What you're asking for is a bit unusual, but basically we can iterate over the columns in steps of 2, select each pair of columns from the df, and call sum with axis=1 on that selection, which concatenates the str values. One tricky point is that your column names are numbers, and when selecting with square brackets like this the column name ends up being treated as a str, which means that col+1 won't work; this is why I cast it to an int:
In [32]:
dfnew = pd.DataFrame()
for col in df.columns[::2]:
    c = int(col)  # make sure c+1 is integer arithmetic
    dfnew[col] = df[[c,c+1]].sum(axis=1)
dfnew
Out[32]:
         0   2   4
df1
GroupA  AD  GJ  MP
GroupB  BE  HK  NQ
GroupC  CF  IL  OR
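For anyone who wants to try this without the original data, here is a minimal self-contained sketch. It rebuilds the example frame from the values in the question and, as an assumption on my part, joins each pair with `+` on the two Series rather than with sum; it also renumbers the new columns 0, 1, 2 so they match the output the question asks for. The names `pairs` and `result` are just illustrative, and the sketch assumes an even number of columns.

```python
import pandas as pd

# Rebuild the example frame from the question (values copied from the post).
df = pd.DataFrame(
    [list("ADGJMP"), list("BEHKNQ"), list("CFILOR")],
    index=pd.Index(["GroupA", "GroupB", "GroupC"], name="df1"),
)

# Join each even-positioned column with its right-hand neighbour.
# i // 2 renumbers the new columns 0, 1, 2 instead of 0, 2, 4.
# Assumes an even number of columns; otherwise i + 1 runs off the end.
pairs = {
    i // 2: df.iloc[:, i] + df.iloc[:, i + 1]
    for i in range(0, df.shape[1], 2)
}
result = pd.DataFrame(pairs)
print(result)
```

Using iloc keeps the selection positional, so it does not matter whether the column labels are ints or strs.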
EDIT
A more generic approach uses the number of columns to generate integer positions, indexes into the columns array with those positions to get the column names, and then uses the names to perform the selection. This will work for your df and also where the df has str column names:
In [26]:
dfnew = pd.DataFrame()
# step through the column positions two at a time
for i in range(0, len(df.columns), 2):
    col_1 = df.columns[i]
    col_2 = df.columns[i+1]
    dfnew[col_1] = df[[col_1,col_2]].sum(axis=1)
dfnew
Out[26]:
         0   2   4
df1
GroupA  AD  GJ  MP
GroupB  BE  HK  NQ
GroupC  CF  IL  OR
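The same positional idea can probably be written without an explicit loop. The sketch below is not part of the approach above, just one possible variant: it slices the even-positioned and odd-positioned columns with iloc and adds the underlying object arrays element by element. The str column labels `a` to `f` are hypothetical, chosen only to show that the labels do not need to be numbers, and an even number of columns is assumed.

```python
import pandas as pd

# Same values as in the question, but with hypothetical str column labels.
df = pd.DataFrame(
    [list("ADGJMP"), list("BEHKNQ"), list("CFILOR")],
    index=pd.Index(["GroupA", "GroupB", "GroupC"], name="df1"),
    columns=list("abcdef"),
)

left = df.iloc[:, ::2]    # columns at positions 0, 2, 4
right = df.iloc[:, 1::2]  # columns at positions 1, 3, 5

# Adding two object arrays applies Python's + to each pair of cells,
# which concatenates the strings. Requires left and right to have the
# same shape, i.e. an even number of columns in df.
combined = pd.DataFrame(
    left.to_numpy(dtype=object) + right.to_numpy(dtype=object),
    index=df.index,
    columns=left.columns,
)
print(combined)
```

As in the loop version, each new column keeps the label of the left-hand column of its pair.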