
How can I copy the value of one cell in a csv file to another csv file using Python and Pandas?

I am following this Datafish tutorial because I have been tasked with updating a price list. One dataframe (the target) has over 5000 rows and the other (the source) has 900. I am stuck on how to take the difference produced by comparing the two dataframes, as in the tutorial, and add it to the second dataframe so that the second dataframe is updated. Could someone point me in the right direction, either the method to use or a snippet showing how to do it?

The snippet from the tutorial below creates a price difference column (second line). I want to take that result and add it to the Price2 column, or, if there is a way, simply use the True/False logic created in the first line to copy Price1 to Price2.

df1['pricesMatch?'] = np.where(df1['Price1'] == df2['Price2'], 'True', 'False')
df1['priceDiff?'] = np.where(df1['Price1'] == df2['Price2'], 0, df1['Price1'] - df2['Price2'])

Sample DataFrame

firstProductSet = {'Product1': ['Computer','Phone','Printer','Desk'],
                   'Price1': [1200,800,200,350]}
df1 = pd.DataFrame(firstProductSet,columns= ['Product1', 'Price1'])


secondProductSet = {'Product2': ['Computer','Phone','Printer','Desk'],
                    'Price2': [900,800,300,350]}
df2 = pd.DataFrame(secondProductSet,columns= ['Product2', 'Price2'])

If I understand correctly, I would merge on the product name and then calculate the difference:

import pandas as pd

# Sample data
firstProductSet = {'Product1': ['Computer','Phone','Printer','Desk'],
                   'Price1': [1200,800,200,350]}
df1 = pd.DataFrame(firstProductSet,columns= ['Product1', 'Price1'])

secondProductSet = {'Product2': ['Computer','Phone','Printer','Desk'],
                    'Price2': [900,800,300,350]}
df2 = pd.DataFrame(secondProductSet,columns= ['Product2', 'Price2'])

# merge your frames together on products
df_m = df1.merge(df2, left_on='Product1', right_on='Product2')
# use .diff across the two price columns to get Price1 - Price2
df_m['diff'] = df_m[['Price2', 'Price1']].diff(axis=1)['Price1']

   Product1  Price1  Product2  Price2   diff
0  Computer    1200  Computer     900  300.0
1     Phone     800     Phone     800    0.0
2   Printer     200   Printer     300 -100.0
3      Desk     350      Desk     350    0.0
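To go one step further and actually write the prices back, as the question asks, here is a minimal sketch. It assumes df1 holds the new (source) prices, df2 is the list to update, and each product name appears only once in df1; the 'Chair' row is a made-up product added to show what happens when the source has no match.

```python
import pandas as pd

# Source prices (assumed to be the up-to-date ones)
df1 = pd.DataFrame({'Product1': ['Computer', 'Phone', 'Printer', 'Desk'],
                    'Price1': [1200, 800, 200, 350]})

# Target list to update; 'Chair' is a hypothetical product missing from df1
df2 = pd.DataFrame({'Product2': ['Phone', 'Printer', 'Desk', 'Computer', 'Chair'],
                    'Price2': [800, 300, 350, 900, 150]})

# Look up each df2 product in df1 by name (NaN where df1 has no such product).
# This assumes the product names in df1 are unique.
source_prices = df2['Product2'].map(df1.set_index('Product1')['Price1'])

# Overwrite Price2 where a source price exists, keep the old price otherwise
df2['Price2'] = source_prices.fillna(df2['Price2']).astype(int)

print(df2)
```

Because the lookup is by product name rather than by row position, it does not matter that the two frames have different lengths or a different row order.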

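Also, the reason for using merge is that the == comparison passed to np.where pairs rows by index, not by product name, so if a product sits at a different index in each dataframe you will not get the expected result. (If the two frames have different indexes altogether, as they would with 5000 and 900 rows, the comparison raises an error instead.) For example, suppose we move Computer in df2 from index 0 to index 3: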

firstProductSet = {'Product1': ['Computer','Phone','Printer','Desk'],
                   'Price1': [1200,800,200,350]}
df1 = pd.DataFrame(firstProductSet,columns= ['Product1', 'Price1'])

secondProductSet = {'Product2': ['Phone','Printer','Desk', 'Computer'],
                    'Price2': [800,300,350,900]}
df2 = pd.DataFrame(secondProductSet,columns= ['Product2', 'Price2'])

Then when you do np.where(df1['Price1'] == df2['Price2'], 'True', 'False') every result will be 'False', even though Phone and Desk have the same price in both dataframes.
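The reordered example can be checked directly. This short sketch reuses the frames from the answer and compares the row-by-row result with the result after merging on the product name; the variable names positional and matched are only for illustration.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({'Product1': ['Computer', 'Phone', 'Printer', 'Desk'],
                    'Price1': [1200, 800, 200, 350]})
# Same products and prices as before, but Computer moved to the last row
df2 = pd.DataFrame({'Product2': ['Phone', 'Printer', 'Desk', 'Computer'],
                    'Price2': [800, 300, 350, 900]})

# Row-by-row: row 0 of df1 (Computer) is compared with row 0 of df2 (Phone)
positional = np.where(df1['Price1'] == df2['Price2'], 'True', 'False')
print(positional)  # ['False' 'False' 'False' 'False']

# Merging first pairs each product with its own counterpart
df_m = df1.merge(df2, left_on='Product1', right_on='Product2')
matched = np.where(df_m['Price1'] == df_m['Price2'], 'True', 'False')
print(matched)  # ['False' 'True' 'False' 'True']
```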


 