I have two DFs as follows:
| shopID | itemID |
|--------|--------|
| 2 | 30 |
| 2 | 31 |
| 2 | 32 |
| 2 | 33 |
| 2 | 38 |
| date | shopID | itemID | price | cnt |
|------|--------|--------|--------|-----|
| 0.0 | 2.0 | 33.0 | 499.0 | 1.0 |
| 0.0 | 2.0 | 482.0 | 3300.0 | 1.0 |
| 0.0 | 2.0 | 491.0 | 600.0 | 1.0 |
| 0.0 | 2.0 | 839.0 | 3300.0 | 1.0 |
| 0.0 | 2.0 | 1007.0 | 449.0 | 3.0 |
...
The second one is a time series DF, where date is the month (for simplicity, it starts at 0 and ends at 33). The combination of shopID and itemID is not guaranteed to appear in both DFs. I want to left merge DF1 with DF2 on shopID and itemID. I did:
pd.merge(df1, df2, how="left", on=["shopID", "itemID"])
As usual, it gives me the following DF:
| shopID | itemID | date | price | cnt |
|--------|--------|------|--------|-----|
| 2 | 30 | 2.0 | 359.00 | 1.0 |
| 2 | 30 | 5.0 | 399.00 | 1.0 |
| 2 | 30 | 15.0 | 169.00 | 1.0 |
| 2 | 30 | 16.0 | 169.00 | 1.0 |
| 2 | 31 | 1.0 | 699.00 | 4.0 |
| 2 | 31 | 2.0 | 698.50 | 1.0 |
| 2 | 31 | 3.0 | 699.00 | 1.0 |
| 2 | 31 | 16.0 | 415.92 | 1.0 |
| 2 | 31 | 33.0 | 399.00 | 1.0 |
| 2 | 32 | 12.0 | 119.00 | 1.0 |
...
My question is: I want to merge them and keep only the latest price, i.e. the row where date is largest for each shopID-itemID combination. How can I do this?
EDIT: Expected output (last month only)
| shopID | itemID | date | price | cnt |
|--------|--------|------|--------|-----|
| 2 | 30 | 16.0 | 169.0 | 1.0 |
| 2 | 31 | 33.0 | 399.00 | 1.0 |
| 2 | 32 | 31.0 | 149.00 | 1.0 |
...
It's hard to answer without more information. Is this just the maximum date for each itemID? If so, you can sort by date and then use drop_duplicates, like so:
import pandas as pd

# Sample data mirroring the first rows of the merged frame in the question
df = pd.DataFrame({'shopID': [2, 2, 2, 2, 2, 2, 2, 2, 2, 2],
                   'itemID': [30, 30, 30, 30, 31, 31, 31, 31, 31, 32],
                   'date': [2.0, 5.0, 15.0, 16.0, 1.0, 2.0, 3.0, 16.0, 33.0, 12.0],
                   'price': [359.0, 399.0, 169.0, 169.0, 699.0,
                             698.5, 699.0, 415.92, 399.0, 119.0],
                   'cnt': [1.0, 1.0, 1.0, 1.0, 4.0, 1.0, 1.0, 1.0, 1.0, 1.0]})

# Sort so the largest date comes last within each itemID, then keep only that last row
df.sort_values(by=['itemID', 'date']).drop_duplicates(subset=['itemID'], keep='last')
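If the goal is the latest row per shopID-itemID combination rather than per itemID alone, the same idea can probably be applied to DF2 before the merge, so the left merge returns at most one row per key. Below is a minimal sketch of that, not a tested solution for the full dataset. The names df1, df2, latest and result are assumptions, and the sample rows are a small illustrative subset loosely based on the tables in the question. It also assumes the float keys in DF2 hold whole numbers, so they can be cast to integers to match DF1.

```python
import pandas as pd

# Illustrative stand-in for DF1: the shop/item pairs of interest
df1 = pd.DataFrame({"shopID": [2, 2, 2, 2, 2],
                    "itemID": [30, 31, 32, 33, 38]})

# Illustrative stand-in for DF2: monthly records with float-typed keys
df2 = pd.DataFrame({
    "date":   [2.0, 5.0, 15.0, 16.0, 1.0, 33.0, 12.0],
    "shopID": [2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0],
    "itemID": [30.0, 30.0, 30.0, 30.0, 31.0, 31.0, 32.0],
    "price":  [359.0, 399.0, 169.0, 169.0, 699.0, 399.0, 119.0],
    "cnt":    [1.0, 1.0, 1.0, 1.0, 4.0, 1.0, 1.0],
})

# Assumption: the keys are whole numbers, so cast them to match df1's dtype
df2 = df2.astype({"shopID": "int64", "itemID": "int64"})

# Keep only the row with the largest date for each (shopID, itemID) pair
latest = (df2.sort_values("date")
             .drop_duplicates(subset=["shopID", "itemID"], keep="last"))

# The left merge now yields one row per pair in df1; pairs missing from df2 get NaN
result = df1.merge(latest, how="left", on=["shopID", "itemID"])
print(result)
```

Reducing DF2 first keeps the merged frame small. Sorting and dropping duplicates after the merge should give the same rows, but the intermediate result would hold every month for every pair.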