
How to replace a column using pandas with the matching value from another file?

I have an Excel sheet with the following columns:

Date       Ship-to-party  Volume
1/09/2019  PQ01           1000
2/09/2019  PQXC           2500
...

A second sheet looks like this:

Document Date Deliveryid Sales
01/09/2019    153678     350
01/09/2019    236742     400

A third sheet looks like this:

Time        Site    Dips
01/09/2019  SiteA   1500
01/09/2019  SiteB   1222
...

In another Excel/CSV file I have defined what Ship-to party means. That is, I originally have 3 different worksheets, each of which identifies the site in a column with a different name (Ship-to party, Sitename, Deliveryid). My requirement is to combine all 3 worksheets into a single sheet, keyed on date and site name, along with the other values. To do that, I have a separate mapping sheet with 3 columns that records which Ship-to party value corresponds to which Sitename and which Deliveryid.

How can I replace the site column in each of the 3 original worksheets with a single Sitename and combine them into one Excel sheet using pandas?

My mapping sheet contains:

ship-to party  Sitename  Deliveryid

PQ01           SiteA      543892
PQXY           SiteB      539081
....

I expect my final sheet to look like this:

Date       Sitename  Sales Dips Volume
1/09/2019   SiteA    500   1000 1500
1/09/2019   SiteB    100   500  2000
....

I tried the following, as Hue suggested:

import pandas as pd

def write_dips(writer):
    file_path = '/Users/ratha/PycharmProjects/DataLoader/output.xlsx'
    mappingfilepath = '/Users/ratha/PycharmProjects/DataLoader/data/mappings/File Mapping.csv'

    # Load the three summary sheets and the mapping file
    df_dips = pd.read_excel(file_path, sheet_name='DipsSummary')
    df_sales = pd.read_excel(file_path, sheet_name='SaleSummary')
    df_delivery = pd.read_excel(file_path, sheet_name='DeliverySummary')
    df_mapping = pd.read_csv(mappingfilepath, delimiter=',', skiprows=[1])

    # Chain the merges through the mapping columns
    df2 = (
        df_dips.merge(df_mapping, left_on='Site', right_on='SHIP TO NAME')
        .merge(df_sales, left_on='Delivery ID', right_on='Deliveryid')
        .merge(df_delivery, left_on='SHIP-TO PARTY', right_on='Ship-To Party')
    )

    print(df2.dtypes)  # this prints all columns, so merging works
    # atg_aggregation is defined elsewhere (not shown in the question)
    x = df2.groupby(['Dip Time', 'Site', 'Tank ID', 'Product', 'Volume',
                     'IdassId', 'TankNo', 'GradeNo', 'Sales', 'Ship-To Party',
                     'Material', 'Qty in Stock UoM'], as_index=False).apply(atg_aggregation)
    x.to_excel(writer, sheet_name='DipsNewSummary')

But the final output file doesn't contain anything. Is the groupby I am using correct? (I group by all the columns present in the 3 sheets.)

After merging, I expect my sheet to look like this (I want to pick a few columns from each of the 3 sheets, which is why I pass all of those columns to groupby):

Dip Time  Site  Tank ID Product Dips DeliveryId Sales Ship-To Party 
1/09/2019 SiteA  1      Diesel  500  526781     150   PQ01

To combine the 3 sheets you only need merge, not groupby.

Here is some sample code you can try; let me know if it helps.

df
Out[29]: 
  ship-to party Sitename  Deliveryid
1          PQ01    SiteA    543892.0
2          PQXY    SiteB    539081.0

df1
Out[30]: 
        Date Ship-to-party  Volume
0  1/09/2019          PQ01  1000.0
1  2/09/2019          PQXC  2500.0

df1=df1.merge(df,left_on='Ship-to-party',right_on='ship-to party')

df1
Out[32]: 
        Date Ship-to-party    ...      Sitename Deliveryid
0  1/09/2019          PQ01    ...         SiteA   543892.0

[1 rows x 6 columns]
df1.columns=['Date', 'Ship-to-party', 'Volume', 'ship-to party', 'Site',
       'Deliveryid']
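
Note that the merge above kept only one of the two rows in df1: merge defaults to an inner join, so PQXC, which has no entry in the mapping, was dropped. If no key matches at all, the result has every column but zero rows, which may be why the output file in the question is empty. A minimal sketch for finding such keys, using the sample data from the question (the names mapping, delivery, checked and unmatched are only illustrative):

```python
import pandas as pd

# Sample data copied from the question; the frame names are illustrative.
mapping = pd.DataFrame({
    "ship-to party": ["PQ01", "PQXY"],
    "Sitename": ["SiteA", "SiteB"],
    "Deliveryid": [543892, 539081],
})
delivery = pd.DataFrame({
    "Date": ["1/09/2019", "2/09/2019"],
    "Ship-to-party": ["PQ01", "PQXC"],
    "Volume": [1000, 2500],
})

# how="left" keeps every delivery row; indicator=True adds a "_merge"
# column saying whether each row found a partner in the mapping.
checked = delivery.merge(
    mapping,
    left_on="Ship-to-party",
    right_on="ship-to party",
    how="left",
    indicator=True,
)

# Keys that exist in the delivery sheet but not in the mapping
unmatched = checked.loc[checked["_merge"] == "left_only", "Ship-to-party"].tolist()
print(unmatched)  # ['PQXC']
```

Running the same check after each merge in a chain should show at which step the rows disappear.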

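Now merge df1 with the other two sheets (here df2 would be the dips sheet, which has a Site column, and df3 the sales sheet, which has a Deliveryid column):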

df1.merge(df2,on='Site').merge(df3,on='Deliveryid')
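
Putting the pieces together, here is a hedged end-to-end sketch rather than a definitive implementation. It assumes the column names shown in the question, that dates are written day-first, and that each site has one row per date in every sheet. The mapping rows come from the question; the other rows are made up so that the keys line up, and parse_dates is a hypothetical helper.

```python
import pandas as pd

# Mapping rows from the question
mapping = pd.DataFrame({
    "ship-to party": ["PQ01", "PQXY"],
    "Sitename": ["SiteA", "SiteB"],
    "Deliveryid": [543892, 539081],
})
# Illustrative rows (not the question's data) whose keys match the mapping
delivery = pd.DataFrame({
    "Date": ["1/09/2019", "1/09/2019"],
    "Ship-to-party": ["PQ01", "PQXY"],
    "Volume": [1000, 2500],
})
sales = pd.DataFrame({
    "Document Date": ["01/09/2019", "01/09/2019"],
    "Deliveryid": [543892, 539081],
    "Sales": [350, 400],
})
dips = pd.DataFrame({
    "Time": ["01/09/2019", "01/09/2019"],
    "Site": ["SiteA", "SiteB"],
    "Dips": [1500, 1222],
})


def parse_dates(df, col):
    """Return a copy with `col` renamed to 'Date' and parsed day-first.

    Parsing matters: the strings '1/09/2019' and '01/09/2019' are
    different merge keys until they become real timestamps.
    """
    out = df.rename(columns={col: "Date"})
    out["Date"] = pd.to_datetime(out["Date"], dayfirst=True)
    return out


# Give every sheet the same two key columns: Date and Sitename
delivery_n = parse_dates(delivery, "Date").merge(
    mapping[["ship-to party", "Sitename"]],
    left_on="Ship-to-party",
    right_on="ship-to party",
)
sales_n = parse_dates(sales, "Document Date").merge(
    mapping[["Deliveryid", "Sitename"]], on="Deliveryid"
)
dips_n = parse_dates(dips, "Time").rename(columns={"Site": "Sitename"})

# Merge the three normalised sheets on the shared keys
keys = ["Date", "Sitename"]
final = (
    sales_n[keys + ["Sales"]]
    .merge(dips_n[keys + ["Dips"]], on=keys)
    .merge(delivery_n[keys + ["Volume"]], on=keys)
)
print(final)
```

If a sheet can hold several rows for the same date and site, aggregating it first (for example `sales_n.groupby(keys, as_index=False)["Sales"].sum()`) keeps the merge from multiplying rows. The result can then be written with `final.to_excel(writer, sheet_name=..., index=False)`.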

