
Update values with earlier date based on list by another dataframe in Python

I would like to update the column Value of the dataframe dateEANdf with the values from the dataframe valueEANdf, but only for the rows with the earlier date.

An excerpt of valueEANdf looks like this:

EAN-Unique  Value
3324324   3.0
asd2343   2.0
Xjkhfsd   1.2
5234XAR   4.5
3434343   2.6

An excerpt of dateEANdf looks like this. It contains every EAN twice, once with an earlier and once with a later date.

EAN-Unique  Date         Start  Value 
3324324     2018-06-01   yes
3324324     2019-04-30   no
asd2343     2015-03-23   yes
asd2343     2015-07-11   no 
Xjkhfsd     1999-04-12   yes 
Xjkhfsd     2001-02-01   no 
5234XAR     2000-12-13   yes
5234XAR     2013-12-13   no
3434343     1972-05-23   yes   
3434343     1980-11-01   no 

The updated dateEANdf should look like this:

EAN-Unique  Date         Start  Value 
3324324     2018-06-01   yes    3.0
3324324     2019-04-30   no
asd2343     2015-03-23   yes    2.0
asd2343     2015-07-11   no 
Xjkhfsd     1999-04-12   yes    1.2
Xjkhfsd     2001-02-01   no 
5234XAR     2000-12-13   yes    4.5
5234XAR     2013-12-13   no
3434343     1972-05-23   yes    2.6
3434343     1980-11-01   no 

My attempt was:

dateEANdf.loc[ (dateEANdf['EAN-Unique'].isin(valueEANdf.unique().tolist())) & ( dateEANdf['Start'] == 'yes') , 'Value' ]  = valueEANdf['Value']

However, this puts the values seemingly at random "somewhere", not on the rows with the earlier date. How can I fix this?

Thanks.

Try slicing with loc, then map:

s = dateEANdf['Start'].eq('yes')
dateEANdf.loc[s, 'Value'] = (dateEANdf.loc[s, 'EAN-Unique']
                                 .map(valueEANdf.set_index('EAN-Unique')['Value'])
                            )

Or map the whole series, then use where:

dateEANdf['Value'] = (dateEANdf['EAN-Unique'].map(valueEANdf.set_index('EAN-Unique')['Value'])
                          .where(dateEANdf['Start'].eq('yes'))
                     )

Output:

  EAN-Unique        Date Start  Value
0    3324324  2018-06-01   yes    3.0
1    3324324  2019-04-30    no    NaN
2    asd2343  2015-03-23   yes    2.0
3    asd2343  2015-07-11    no    NaN
4    Xjkhfsd  1999-04-12   yes    1.2
5    Xjkhfsd  2001-02-01    no    NaN
6    5234XAR  2000-12-13   yes    4.5
7    5234XAR  2013-12-13    no    NaN
8    3434343  1972-05-23   yes    2.6
9    3434343  1980-11-01    no    NaN
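
For anyone who wants to reproduce this, here is a minimal sketch that rebuilds both frames from the excerpts in the question (the names `is_start`, `misaligned`, `lookup` and `fixed` are mine, not the OP's). It also shows what I believe is going wrong in the original attempt: assigning a Series through loc aligns on the row labels, not on EAN-Unique, so the values from rows 0 to 4 of valueEANdf end up on whichever selected rows happen to carry those labels. map looks the value up by EAN instead.

```python
import pandas as pd

valueEANdf = pd.DataFrame({
    "EAN-Unique": ["3324324", "asd2343", "Xjkhfsd", "5234XAR", "3434343"],
    "Value": [3.0, 2.0, 1.2, 4.5, 2.6],
})
dateEANdf = pd.DataFrame({
    "EAN-Unique": ["3324324", "3324324", "asd2343", "asd2343", "Xjkhfsd",
                   "Xjkhfsd", "5234XAR", "5234XAR", "3434343", "3434343"],
    "Date": ["2018-06-01", "2019-04-30", "2015-03-23", "2015-07-11", "1999-04-12",
             "2001-02-01", "2000-12-13", "2013-12-13", "1972-05-23", "1980-11-01"],
    "Start": ["yes", "no"] * 5,
})

is_start = dateEANdf["Start"].eq("yes")

# Simplified version of the original attempt: the right-hand Series is
# aligned on row labels (0..4), not on EAN, so values land on labels 0, 2, 4.
misaligned = dateEANdf.copy()
misaligned["Value"] = float("nan")
misaligned.loc[is_start, "Value"] = valueEANdf["Value"]

# Lookup by EAN: a Series indexed by EAN-Unique, used as the mapping for map().
lookup = valueEANdf.set_index("EAN-Unique")["Value"]
fixed = dateEANdf.copy()
fixed["Value"] = fixed["EAN-Unique"].map(lookup).where(is_start)

print(misaligned)
print(fixed)
```

Note that map with a Series as the mapping needs the EANs in valueEANdf to be unique; with duplicated EANs the lookup index is no longer unique and map raises an error.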

You can do a merge and then update the values with np.where:

import numpy as np

# If 'Value' is not already a column of dateEANdf, leave out the `.drop('Value', axis=1)` call
dateEANdf = dateEANdf.drop('Value', axis=1).merge(valueEANdf, how='left', on='EAN-Unique')
# Keep the merged value on the 'yes' rows only and blank out the 'no' rows
dateEANdf['Value'] = np.where(dateEANdf['Start'] == 'no', np.nan, dateEANdf['Value'])
dateEANdf
Out[1]: 
  EAN-Unique        Date Start  Value
0    3324324  2018-06-01   yes    3.0
1    3324324  2019-04-30    no    NaN
2    asd2343  2015-03-23   yes    2.0
3    asd2343  2015-07-11    no    NaN
4    Xjkhfsd  1999-04-12   yes    1.2
5    Xjkhfsd  2001-02-01    no    NaN
6    5234XAR  2000-12-13   yes    4.5
7    5234XAR  2013-12-13    no    NaN
8    3434343  1972-05-23   yes    2.6
9    3434343  1980-11-01    no    NaN
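
A small variation on the merge approach, as a sketch: passing validate="many_to_one" makes merge raise if an EAN shows up more than once in valueEANdf, which would otherwise silently duplicate rows in the result. The frames are rebuilt from the question's excerpts, and `merged` is just a name I picked. I used Series.where here instead of np.where, which does the same masking without the numpy import.

```python
import pandas as pd

valueEANdf = pd.DataFrame({
    "EAN-Unique": ["3324324", "asd2343", "Xjkhfsd", "5234XAR", "3434343"],
    "Value": [3.0, 2.0, 1.2, 4.5, 2.6],
})
dateEANdf = pd.DataFrame({
    "EAN-Unique": ["3324324", "3324324", "asd2343", "asd2343", "Xjkhfsd",
                   "Xjkhfsd", "5234XAR", "5234XAR", "3434343", "3434343"],
    "Date": ["2018-06-01", "2019-04-30", "2015-03-23", "2015-07-11", "1999-04-12",
             "2001-02-01", "2000-12-13", "2013-12-13", "1972-05-23", "1980-11-01"],
    "Start": ["yes", "no"] * 5,
})

# Left merge keeps every row of dateEANdf; validate checks that the
# right-hand keys are unique (many rows on the left, one on the right).
merged = dateEANdf.merge(
    valueEANdf, how="left", on="EAN-Unique", validate="many_to_one"
)

# Keep the value where Start is 'yes'; everything else becomes NaN.
merged["Value"] = merged["Value"].where(merged["Start"].eq("yes"))

print(merged)
```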
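
You could also do it like this:

import pandas as pd
import numpy as np

# Look up the value for each EAN (assumes every EAN occurs exactly once in valueEANdf)
dateEANdf['Value'] = dateEANdf['EAN-Unique'].apply(
    lambda ean: float(valueEANdf.loc[valueEANdf['EAN-Unique'] == ean, 'Value'].iloc[0]))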

This gives you:

  EAN-Unique          Date Start  Value
0    3324324    2018-06-01   yes    3.0
1    3324324    2019-04-30    no    3.0
2    asd2343    2015-03-23   yes    2.0
3    asd2343    2015-07-11    no    2.0
4    Xjkhfsd    1999-04-12   yes    1.2
5    Xjkhfsd    2001-02-01    no    1.2
6    5234XAR    2000-12-13   yes    4.5
7    5234XAR    2013-12-13    no    4.5
8    3434343    1972-05-23   yes    2.6
9    3434343    1980-11-01    no    2.6

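Then blank out every second entry in Value, based on this thread: Pandas every nth row. Note that this relies on the rows alternating strictly between the earlier and the later date.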

dateEANdf.loc[1::2,'Value']=np.nan

This results in:

  EAN-Unique          Date Start  Value
0    3324324    2018-06-01   yes    3.0
1    3324324    2019-04-30    no    NaN
2    asd2343    2015-03-23   yes    2.0
3    asd2343    2015-07-11    no    NaN
4    Xjkhfsd    1999-04-12   yes    1.2
5    Xjkhfsd    2001-02-01    no    NaN
6    5234XAR    2000-12-13   yes    4.5
7    5234XAR    2013-12-13    no    NaN
8    3434343    1972-05-23   yes    2.6
9    3434343    1980-11-01    no    NaN
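
If you would rather not depend on the row order or on the Start column at all, one option is to pick the earliest Date per EAN directly. This is only a sketch under the assumption that Date parses cleanly with pd.to_datetime; the names `shuffled`, `lookup`, `earliest_idx` and `is_earliest` are mine. The rows are shuffled on purpose to show that the order does not matter.

```python
import pandas as pd

valueEANdf = pd.DataFrame({
    "EAN-Unique": ["3324324", "asd2343", "Xjkhfsd", "5234XAR", "3434343"],
    "Value": [3.0, 2.0, 1.2, 4.5, 2.6],
})
dateEANdf = pd.DataFrame({
    "EAN-Unique": ["3324324", "3324324", "asd2343", "asd2343", "Xjkhfsd",
                   "Xjkhfsd", "5234XAR", "5234XAR", "3434343", "3434343"],
    "Date": ["2018-06-01", "2019-04-30", "2015-03-23", "2015-07-11", "1999-04-12",
             "2001-02-01", "2000-12-13", "2013-12-13", "1972-05-23", "1980-11-01"],
    "Start": ["yes", "no"] * 5,
})

# Shuffle the rows (labels are kept) so the result cannot depend on ordering.
shuffled = dateEANdf.sample(frac=1, random_state=0)
shuffled["Date"] = pd.to_datetime(shuffled["Date"])

# Row label of the earliest Date within each EAN group.
earliest_idx = shuffled.groupby("EAN-Unique")["Date"].idxmin()
is_earliest = shuffled.index.isin(earliest_idx)

# Look the value up by EAN and keep it only on the earliest row of each EAN.
lookup = valueEANdf.set_index("EAN-Unique")["Value"]
shuffled["Value"] = shuffled["EAN-Unique"].map(lookup).where(is_earliest)

print(shuffled.sort_index())
```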
