I have a DataFrame with 45 columns and 1000 rows. I need to create a single Excel sheet containing the top 2 values of each column and their percentages (for example, if col1 contains the value 'python' 500 times out of 1000 rows, its percentage should be 50).
I used:
writer = pd.ExcelWriter('abc.xlsx')
df = pd.read_sql('select * from table limit 1000', <db connection string>)
column_list = df.columns.tolist()
df.fillna("NULL", inplace=True)
for obj in column_list:
    df1 = pd.DataFrame(df[obj].value_counts().nlargest(2)).to_excel(writer, sheet_name=obj)
writer.save()
This writes the output to separate tabs of the same Excel document. I need everything in a single sheet, in the format below:
Column Name   Value   Percentage
col1          abc     50
col1          def     30
col2          123     40
col2          456     30
....
Please also let me know of any other functions that could produce this output.
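The first thing that jumps out at me is that you are changing the sheet name on every iteration by passing sheet_name=obj, which is why each column ends up on its own tab.
Getting rid of that might help, but on its own it is probably not enough: every to_excel call writes at the same default position, so each result would likely overwrite the previous one unless you also pass a different startrow each time.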
If not, I would suggest concatenating the results into one large DataFrame and then writing that DataFrame to Excel.
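import pandas as pd

df_master = None
for obj in column_list:
    # Use a separate name so the source DataFrame df is not overwritten
    counts = pd.DataFrame(df[obj].value_counts().nlargest(2))
    if df_master is None:
        df_master = counts
    else:
        df_master = pd.concat([df_master, counts])
df_master.to_excel("abc.xlsx")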
Here's more information on stacking/concatenating DataFrames in pandas: https://pandas.pydata.org/pandas-docs/stable/user_guide/merging.html
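The concatenation idea can also be pushed a bit further to produce the exact Column Name / Value / Percentage layout from the question. The following is only a minimal sketch under some assumptions: the helper name top_values_summary and its parameters n and null_label are made up for illustration, percentages are computed over all rows of the column (missing values included and shown as "NULL", mirroring the fillna call in the question), and ties between equally frequent values are not handled specially.

```python
import pandas as pd


def top_values_summary(df, n=2, null_label="NULL"):
    """Return one row per (column, value) for the n most frequent values of each column.

    Hypothetical helper for illustration; not part of pandas.
    """
    frames = []
    for col in df.columns:
        # normalize=True gives fractions of the row count; dropna=False keeps missing values
        pct = df[col].value_counts(normalize=True, dropna=False).head(n) * 100
        # Show missing values under a readable label instead of NaN/None
        values = [null_label if pd.isna(v) else v for v in pct.index]
        frames.append(pd.DataFrame({
            "Column Name": col,
            "Value": values,
            "Percentage": pct.to_numpy().round(2),
        }))
    # Stack the per-column results into one long table with a fresh 0..N-1 index
    return pd.concat(frames, ignore_index=True)
```

The result is a single DataFrame, so it can be written to one sheet with a single call such as top_values_summary(df).to_excel("abc.xlsx", index=False), assuming an Excel engine like openpyxl is installed.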