I have a dataframe that returns data for each OfficeLocation.
How can I split the dataframe by OfficeLocation
and write each piece into a separate Excel spreadsheet?
import pandas
import pyodbc
server = 'MyServer'
db = 'MyDB'
myparams = ['2019-01-01','2019-02-28', None] # None substitutes NULL in sql
connection_string = pyodbc.connect('DRIVER={SQL Server};server='+server+';DATABASE='+ db+';Trusted_Connection=yes;')
df = pandas.read_sql_query('EXEC PythonTest_Align_RSrptAccountCurrentMunich @EffectiveDateFrom=?,@EffectiveDateTo=?,@ProducerLocationID=?', connection_string, params = myparams)
# sort the dataframe
df.sort_values(by=['OfficeLocation'], axis=0,inplace=True)
# set the index to be this and do not drop
df.set_index(keys=['OfficeLocation'],drop=False,inplace=True)
# get a list of unique offices
office = df['OfficeLocation'].unique().tolist()
# now we can perform a lookup on a 'view' of the dataframe
SanDiego = df.loc['San Diego']
print(SanDiego)
# how can I iterate through each office and create excel file for each office
df.loc['San Diego'].to_excel((r'\\user\name\Python\SanDIego_Office.xlsx'))
So I need three Excel spreadsheets with data: SanDiego.xlsx, Vista.xlsx and SanBernardino.xlsx.
You can use groupby:
for location, d in df.groupby('OfficeLocation'):
    # raw f-string, so the backslashes in the path are not read as escapes such as \n
    d.to_excel(rf'\\user\name\Python\{location}.xlsx')
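A slightly fuller sketch of the same groupby idea, assuming a frame with an OfficeLocation column and a default index. The helper names split_by_office and save_offices and the sample rows are hypothetical, not part of the question, and writing .xlsx files assumes an Excel engine such as openpyxl is installed.

```python
from pathlib import Path

import pandas as pd


def split_by_office(df, column="OfficeLocation"):
    """Return {office name: sub-frame}, one entry per distinct office."""
    return {str(name): group for name, group in df.groupby(column)}


def save_offices(df, out_dir, column="OfficeLocation"):
    """Write one workbook per office and return the paths written."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for name, group in split_by_office(df, column).items():
        # Drop spaces so 'San Diego' becomes 'SanDiego.xlsx', as in the question.
        path = out_dir / f"{name.replace(' ', '')}.xlsx"
        group.to_excel(path, index=False)  # needs an Excel engine, e.g. openpyxl
        paths.append(path)
    return paths


# Hypothetical sample rows standing in for the stored-procedure result.
sample = pd.DataFrame({
    "OfficeLocation": ["San Diego", "Vista", "San Bernardino", "Vista"],
    "GrossPremium": [100.0, 250.0, 75.0, 30.0],
})
```

Calling save_offices(sample, "output") would then write SanBernardino.xlsx, SanDiego.xlsx and Vista.xlsx under a hypothetical output folder. Note that the question's set_index(..., drop=False) leaves OfficeLocation as both the index name and a column, which recent pandas versions may reject as ambiguous in groupby; grouping before that step, or calling df.reset_index(drop=True) first, should avoid this.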
How about something as simple as this?
for loc in df["OfficeLocation"].unique():
    save_df = df[df["OfficeLocation"] == loc]
    save_df.to_excel(loc + ".xlsx")
EDIT
I've generated 50,000 rows of data similar to yours.
+---------------+--------------------+----------------+---------------+----------------+-----------------+------------+--------------+
| Policy Number | ProducerLocationId | OfficeLOcation | EffectiveDate | ExpirationDate | TransactionType | BondAmount | GrossPremium |
+---------------+--------------------+----------------+---------------+----------------+-----------------+------------+--------------+
| 7563299 | 8160 | Aldora | 31/10/2018 | 28/01/2019 | Cancelled | -61081 | -2372.303665 |
| 6754151 | 3122 | Aucilla | 04/05/2019 | 15/06/2019 | New Business | -80151 | -4135.443318 |
| 3121128 | 3230 | Aulander | 11/10/2018 | 29/12/2018 | New Business | -67563 | -28394.83428 |
| 911463 | 4041 | Aullville | 30/11/2018 | 20/02/2019 | New Business | -47918 | -17840.05749 |
| 5068380 | 3794 | Ava | 10/01/2019 | 28/03/2019 | Cancelled | -41094 | -30523.0655 |
| 2174424 | 1263 | Alcan Border | 18/04/2019 | 10/07/2019 | Cancelled | -73661 | -5979.278874 |
| 475464 | 9250 | Audubon | 15/01/2019 | 17/02/2019 | New Business | -85217 | -64988.83987 |
| 2076075 | 7405 | Alderton | 20/08/2019 | 26/09/2019 | New Business | -32335 | -11144.63342 |
| 3645387 | 9357 | Austwell | 22/10/2018 | 19/12/2018 | Cancelled | -5065 | -5013.982643 |
| 3316361 | 1335 | Aurora | 29/09/2018 | 24/12/2018 | New Business | -13939 | -6333.580641 |
| 1404387 | 2656 | Auburn Hills | 04/07/2019 | 19/09/2019 | Cancelled | -12049 | -385.3522259 |
| 6908433 | 1288 | Alcester | 30/10/2018 | 18/01/2019 | Cancelled | -56902 | -27341.06181 |
| 9908879 | 6012 | Alexandria | 20/06/2019 | 21/08/2019 | Cancelled | -76226 | -12671.06376 |
| 7850879 | 4606 | Avery | 10/11/2018 | 21/01/2019 | Cancelled | -54297 | -40619.42718 |
| 8437707 | 4149 | Auxvasse | 22/09/2019 | 28/10/2019 | Cancelled | -59584 | -19800.71077 |
| 4260681 | 1889 | Auburndale | 06/07/2019 | 22/08/2019 | New Business | -55035 | -18271.5442 |
| 7234116 | 2636 | Alexander | 14/07/2019 | 31/08/2019 | New Business | -59319 | -15711.2827 |
| 3721467 | 3765 | Alexander City | 16/10/2018 | 23/12/2018 | Cancelled | -98431 | -26743.07459 |
| 6859964 | 7035 | Alburtis | 04/11/2018 | 26/12/2018 | New Business | -36917 | -11339.9049 |
| 2994719 | 6997 | Aleneva | 09/02/2019 | 13/04/2019 | New Business | -55739 | -46323.01608 |
| 7542794 | 8968 | Aullville | 25/09/2018 | 09/11/2018 | Cancelled | -44488 | -4554.278674 |
| 1340649 | 7003 | Augusta | 30/11/2018 | 17/02/2019 | New Business | -78405 | -71910.93325 |
| 8078558 | 7185 | Alderpoint | 10/06/2019 | 22/07/2019 | New Business | -37928 | -29289.29545 |
| 8198811 | 8963 | Alden | 05/07/2019 | 15/08/2019 | Cancelled | -97648 | -79946.41222 |
| 2510522 | 5714 | Avella | 03/09/2019 | 02/11/2019 | New Business | -16452 | -11230.93829 |
+---------------+--------------------+----------------+---------------+----------------+-----------------+------------+--------------+
I then created two functions, one using my version and the other using the groupby method.
In case anyone was wondering, both perform similarly, but the groupby method comes out on top, with less variance and a mean run time about 1 second shorter (11.1 s versus 12.1 s).
def loop_save_unique(df):
    for loc in df["OfficeLOcation"].unique():
        save_df = df[df["OfficeLOcation"] == loc]
        save_df.to_excel("output\\test1\\" + loc + ".xlsx")

def loop_save_groupby(df):
    for location, d in df.groupby('OfficeLOcation'):
        d.to_excel(f'output\\test2\\{location}.xlsx')
%timeit loop_save_unique(df)
12.1 s ± 556 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
%timeit loop_save_groupby(df)
11.1 s ± 183 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
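%timeit is an IPython magic, so it only works in IPython or a notebook. As a rough sketch, a similar comparison can be run in a plain script with the standard timeit module. The version below times only the split step, not the Excel writing, so its numbers are not comparable to the ones above. The frame, the function names and the repeat counts are all hypothetical stand-ins.

```python
import timeit

import pandas as pd

# Hypothetical stand-in data: 1,000 rows spread over four offices.
frame = pd.DataFrame({
    "OfficeLOcation": ["Aldora", "Ava", "Avery", "Alden"] * 250,
    "BondAmount": range(1000),
})


def split_unique(df):
    """Filter the frame once per distinct office (boolean-mask approach)."""
    return {loc: df[df["OfficeLOcation"] == loc]
            for loc in df["OfficeLOcation"].unique()}


def split_groupby(df):
    """Let groupby partition the frame in a single pass."""
    return {loc: d for loc, d in df.groupby("OfficeLOcation")}


def time_split(func, df, repeat=3, number=5):
    """Return the best observed time per call, in seconds."""
    totals = timeit.repeat(lambda: func(df), repeat=repeat, number=number)
    return min(totals) / number


unique_seconds = time_split(split_unique, frame)
groupby_seconds = time_split(split_groupby, frame)
print(f"unique:  {unique_seconds:.6f} s per call")
print(f"groupby: {groupby_seconds:.6f} s per call")
```

To include the file writes in the measurement, the same time_split helper could be pointed at loop_save_unique and loop_save_groupby instead.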