Appending a column to a DataFrame using pandas in Python

I'm trying some operations on an Excel file using pandas. I want to extract some columns from an Excel file, add another column to those extracted columns, and write all of the columns to a new Excel file. To do this I have to append the new column to the old columns.

Here is my code:

import pandas as pd

# Read the Excel file
# Work.xlsx is the input file

ex_file = 'Work.xlsx'
data = pd.read_excel(ex_file, 'Data')

# Create a subset of columns by extracting columns D, I, J, AU from the file
data_subset_columns = pd.read_excel(ex_file, 'Data', parse_cols="D,I,J,AU")

# Compute the new column 'Percentage'
# 'Num Labels' and 'Num Tracks' are two different columns in the given file

data['Percentage'] = data['Num Labels'] / data['Num Tracks']
data1 = data['Percentage']
print(data1)

# Here I'm trying to append data['Percentage'] to data_subset_columns
Final_data = data_subset_columns.append(data1)
print(Final_data)
Final_data.to_excel('111.xlsx')

No error is shown, but Final_data does not contain the expected result: the 'Percentage' column is not appended to the extracted columns.

There is no need to explicitly append columns in pandas. When you compute a new column by assigning it to a DataFrame, it becomes part of that DataFrame, so it is included when you export to Excel.

Try this, assuming 'Num Labels' and 'Num Tracks' are among the columns "D,I,J,AU" (otherwise add their column letters to the list):

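import pandas as pd

ex_file = 'Work.xlsx'  # input workbook, as in the question
# Newer pandas versions name this argument usecols instead of parse_cols
data_subset = pd.read_excel(ex_file, 'Data', parse_cols="D,I,J,AU")
# Assigning to a new column name adds that column to the DataFrame
data_subset['Percentage'] = data_subset['Num Labels'] / data_subset['Num Tracks']
data_subset.to_excel('111.xlsx')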
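
The assignment step can be checked without an Excel file. The following is a minimal sketch that assumes a small in-memory DataFrame with made-up values standing in for the 'Num Labels' and 'Num Tracks' columns of the workbook:

```python
import pandas as pd

# Hypothetical sample values; the real data would come from Work.xlsx
data_subset = pd.DataFrame({"Num Labels": [1, 3], "Num Tracks": [4, 6]})

# Assigning to a new column name adds the column in place
data_subset["Percentage"] = data_subset["Num Labels"] / data_subset["Num Tracks"]

print(list(data_subset.columns))
print(data_subset)
```

After the assignment, 'Percentage' appears as a third column next to the two original ones, and that is what to_excel would write out.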

The append method of a DataFrame adds rows, not columns. It only introduces new columns as a side effect, when the appended rows contain columns that the original DataFrame does not have.

DataFrame.append(other, ignore_index=False, verify_integrity=False)

Append rows of other to the end of this frame, returning a new object. Columns not in this frame are added as new columns.
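
This probably explains the result in the question: a Series passed to append is treated as one extra row, not as a column. The sketch below illustrates that with made-up data. DataFrame.append was deprecated in pandas 1.4 and removed in 2.0, so it uses pd.concat along the default axis=0 as a stand-in for the same row-wise stacking; the names `subset`, `percentage` and `as_row` are only illustrative.

```python
import pandas as pd

# Hypothetical stand-ins for data_subset_columns and data['Percentage']
subset = pd.DataFrame({"Num Labels": [1, 3], "Num Tracks": [4, 6]})
percentage = pd.Series([0.25, 0.5], name="Percentage")

# Row-wise stacking: the Series becomes a single row labelled 'Percentage',
# and its index labels (0 and 1) turn into extra columns
as_row = pd.concat([subset, percentage.to_frame().T])

print(as_row.shape)
print(as_row)
```

The result has one more row and two more columns than `subset`, and no column named 'Percentage', which matches the "data not getting appended" symptom.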

I think you are looking for something like pd.concat.

Combine DataFrame objects horizontally along the x axis by passing in axis=1.

>>> df1 = pd.DataFrame([['a', 1], ['b', 2]],
...                    columns=['letter', 'number'])
>>> df4 = pd.DataFrame([['bird', 'polly'], ['monkey', 'george']],
...                    columns=['animal', 'name'])
>>> pd.concat([df1, df4], axis=1)
  letter  number  animal    name
0      a       1    bird   polly
1      b       2  monkey  george
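
Applied to the question, the same call can attach the computed Series as a column. This is a sketch with made-up data, assuming the Series and the DataFrame share the same row index (as they would when both are read from the same sheet); 'Track ID' is a hypothetical column name.

```python
import pandas as pd

# Hypothetical stand-ins for data_subset_columns and data['Percentage']
subset = pd.DataFrame({"Track ID": ["t1", "t2"], "Num Tracks": [4, 6]})
percentage = pd.Series([0.25, 0.5], name="Percentage")

# axis=1 places the Series next to the existing columns, aligned on the index
final_data = pd.concat([subset, percentage], axis=1)

print(final_data)
```

The Series name becomes the new column name. Because concat aligns on index labels, rows whose labels do not match on both sides would be filled with NaN.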
