How can I copy the value of one cell in a csv file to another csv file using Python and Pandas?
I am following this Datafish tutorial because I have been tasked with updating a price list. There are over 5000 rows of data in one dataframe (the target) and 900 in the other (the source). I am stuck on how to take (in the context of the tutorial) the difference produced by comparing the two dataframes and apply it to the second dataframe so as to update it. Could someone point me in the right direction: which method to use, or a snippet showing how to do it?
The snippet from the tutorial below creates a price difference column (second line). I want to take that result and add it to the Price2 column, or, if there is a way, simply use the True/False logic created in the first line to copy Price1 to Price2.
df1['pricesMatch?'] = np.where(df1['Price1'] == df2['Price2'], 'True', 'False')
df1['priceDiff?'] = np.where(df1['Price1'] == df2['Price2'], 0, df1['Price1'] - df2['Price2'])
Sample DataFrame
firstProductSet = {'Product1': ['Computer','Phone','Printer','Desk'],
'Price1': [1200,800,200,350]}
df1 = pd.DataFrame(firstProductSet,columns= ['Product1', 'Price1'])
secondProductSet = {'Product2': ['Computer','Phone','Printer','Desk'],
'Price2': [900,800,300,350]}
df2 = pd.DataFrame(secondProductSet,columns= ['Product2', 'Price2'])
IIUC, I would merge on products and then calculate the difference:
import pandas as pd

# Sample data
firstProductSet = {'Product1': ['Computer','Phone','Printer','Desk'],
'Price1': [1200,800,200,350]}
df1 = pd.DataFrame(firstProductSet,columns= ['Product1', 'Price1'])
secondProductSet = {'Product2': ['Computer','Phone','Printer','Desk'],
'Price2': [900,800,300,350]}
df2 = pd.DataFrame(secondProductSet,columns= ['Product2', 'Price2'])
# merge your frames together on products
df_m = df1.merge(df2, left_on='Product1', right_on='Product2')
# use .diff to calculate the difference in price
df_m['diff'] = df_m[['Price2', 'Price1']].diff(axis=1)['Price1']
Product1 Price1 Product2 Price2 diff
0 Computer 1200 Computer 900 300.0
1 Phone 800 Phone 800 0.0
2 Printer 200 Printer 300 -100.0
3 Desk 350 Desk 350 0.0
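To go one step further and actually copy Price1 into Price2 where the two differ, something along these lines might work. This is only a sketch on the sample data from the question; the `changed` mask is a name introduced here for illustration, and it plays the role of the True/False column from the tutorial.

```python
import pandas as pd

# Sample data from the question
df1 = pd.DataFrame({'Product1': ['Computer', 'Phone', 'Printer', 'Desk'],
                    'Price1': [1200, 800, 200, 350]})
df2 = pd.DataFrame({'Product2': ['Computer', 'Phone', 'Printer', 'Desk'],
                    'Price2': [900, 800, 300, 350]})

# Align the two frames by product name rather than by row position
df_m = df1.merge(df2, left_on='Product1', right_on='Product2')

# Boolean mask (illustrative name): True where the two prices disagree
changed = df_m['Price1'] != df_m['Price2']

# Copy Price1 into Price2 only on the rows where they differ
df_m.loc[changed, 'Price2'] = df_m.loc[changed, 'Price1']

print(df_m[['Product2', 'Price2']])
```

Note that the default merge is an inner join, so `df_m` only contains products present in both frames.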
Also, the reason for using merge is that the comparison inside np.where is made between rows with the same index, so if the products do not have the same index you will not get the expected result. For example, suppose we move Computer in df2 from index 0 to index 3:
firstProductSet = {'Product1': ['Computer','Phone','Printer','Desk'],
'Price1': [1200,800,200,350]}
df1 = pd.DataFrame(firstProductSet,columns= ['Product1', 'Price1'])
secondProductSet = {'Product2': ['Phone','Printer','Desk', 'Computer'],
'Price2': [800,300,350,900]}
df2 = pd.DataFrame(secondProductSet,columns= ['Product2', 'Price2'])
Then when you do np.where(df1['Price1'] == df2['Price2'], 'True', 'False') every result will be 'False'.
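If the goal is to update the target frame itself, one possible approach is to look prices up by product name, which does not depend on row order. The sketch below assumes that product names are unique in the source frame (`Series.map` with a Series needs a unique index). The extra 'Chair' row is made up here to stand in for target rows that have no match in the source, and `new_price` is an illustrative name.

```python
import pandas as pd

# Source frame with the new prices (sample data from the question)
df1 = pd.DataFrame({'Product1': ['Computer', 'Phone', 'Printer', 'Desk'],
                    'Price1': [1200, 800, 200, 350]})

# Target frame in a different row order; 'Chair' is a hypothetical
# product that does not exist in the source
df2 = pd.DataFrame({'Product2': ['Phone', 'Printer', 'Desk', 'Computer', 'Chair'],
                    'Price2': [800, 300, 350, 900, 75]})

# Look up each target product's source price by name;
# products missing from df1 come back as NaN
new_price = df2['Product2'].map(df1.set_index('Product1')['Price1'])

# Keep the old Price2 wherever the source has no price for the product
df2['Price2'] = new_price.fillna(df2['Price2']).astype(int)

print(df2)
```

With real files, the two frames would come from `pd.read_csv` and the updated target could be written back with `df2.to_csv(..., index=False)`.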