Get the rate of change by finding the change in price

UPDATE: I'm getting a strange result in the output. Occasionally, an item's earliest date does not appear in the first group of columns but in the second or third, for example:

Item    Kg  Date_1      Price_1  change_1  Date_2      Price_2  change_2
Apples  1   2022-02-01  1        NA        2022-02-16  2        1
Meat    NA  NA          NA       NA        2022-02-03  1        NA

As you can see, Meat shows nothing in the first group of columns, and its first record shows up in the second group instead. This occurs throughout the results. Any idea why?

I am fairly new to programming. I am working on my portfolio and am looking at a dataset about the price of food going from distribution centers to a grocery store. The data contains the price, item, and date of each transaction. I want to find the rate of change in price from the distribution center to the store, and when each change happened.

Note: the price changes come from the distribution center.

Here is an example of what I am looking at:

Date        Item   Price  Kg
01.02.2022  Apple  $1.00  1
02.02.2022  Meat   $4.00  1
03.02.2022  Fish   $3.00  1
03.02.2022  Bread  $1.00  1
15.02.2022  Meat   $5.00  1
15.02.2022  Meat   $3.00  1
16.02.2022  Apple  $2.00  1
20.02.2022  Fish   $3.00  1
25.02.2022  Apple  $0.50  1

As you can see, the price for the same quantity of the same product changes unpredictably over time. What I would like to analyse is:

  1. The rate of change per item
  2. When the change occurred

This is the ideal outcome:

item   kg  1st_price  1st_price_date  2nd_price  2nd_price_date  amount_of_change
Apple  1   $1.00      01.02.2022      $2.00      16.02.2022      +$1.00
Meat   1   $4.00      02.02.2022      $5.00      15.02.2022      +$1.00
Bread  1   $1.00      03.02.2022      N/A        N/A             N/A
Fish   1   $3.00      03.02.2022      $3.00      20.02.2022      +$0.00

The table continues below; these columns would go to the right of the columns above. Unfortunately, Stack Overflow was not able to render a table with everything together. total_change covers the entire period.

item   3rd_price  3rd_price_date  amount_of_change  change_duration_period  total_change
Apple  $0.50      25.02.2022      -$1.50            01.02.2022-25.02.2022   -$0.50
Meat   $3.00      15.02.2022      -$2.00            02.02.2022-15.02.2022   -$1.00
Bread  N/A        N/A             N/A               03.02.2022-03.02.2022   +$0.00
Fish   $3.00      20.02.2022      +$0.00            03.02.2022-20.02.2022   +$0.00

As you can see, some items have more price changes per month than others. Some items change drastically, and some do not change at all.

Given that there are over 14,000 unique items, how would you recommend gathering the data and placing it in a table like the one in the "ideal outcome" section?

I am still new to programming, so please don't be too harsh!

Thanks!

Something like this?

library(tidyverse)

df %>%
  # convert Date to a date, and Price to a number
  mutate(Date = as.Date(Date, format = "%d.%m.%Y"),
         Price = parse_number(Price)) %>%

  # for each Item, arrange by Date, tally, and calc price change
  group_by(Item) %>%
  arrange(Date) %>%
  mutate(appearance = row_number(),
         change = Price - lag(Price)) %>%
  ungroup() %>%

  # use the tally to reshape wider the date, price and change
  pivot_wider(names_from = appearance, 
              values_from = c(Date, Price, change),
              names_vary = "slowest")

Result

# A tibble: 4 × 11
  Item     Kg Date_1     Price_1 change_1 Date_2     Price_2 change_2 Date_3     Price_3 change_3
  <chr> <int> <date>       <dbl>    <dbl> <date>       <dbl>    <dbl> <date>       <dbl>    <dbl>
1 Apple     1 2022-02-01       1       NA 2022-02-16       2        1 2022-02-25     0.5     -1.5
2 Meat      1 2022-02-02       4       NA 2022-02-15       5        1 2022-02-15     3       -2  
3 Fish      1 2022-02-03       3       NA 2022-02-20       3        0 NA            NA        0  
4 Bread     1 2022-02-03       1       NA NA              NA        0 NA            NA        0  
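
The question's ideal outcome also lists a change_duration_period and a total_change per item, which the wide table above does not include. A minimal sketch of one way to get them, assuming total_change means last price minus first price and that rows sharing a date keep their input order; the names item_summary, first_date and last_date are illustrative, not from the question:

```r
library(dplyr)
library(readr)

# Sample rows copied from the question
df <- data.frame(
  Date  = c("01.02.2022", "02.02.2022", "03.02.2022", "03.02.2022", "15.02.2022",
            "15.02.2022", "16.02.2022", "20.02.2022", "25.02.2022"),
  Item  = c("Apple", "Meat", "Fish", "Bread", "Meat", "Meat", "Apple", "Fish", "Apple"),
  Price = c("$1.00", "$4.00", "$3.00", "$1.00", "$5.00", "$3.00", "$2.00", "$3.00", "$0.50"),
  Kg    = 1L
)

item_summary <- df %>%
  # same conversions as in the answer
  mutate(Date = as.Date(Date, format = "%d.%m.%Y"),
         Price = parse_number(Price)) %>%
  arrange(Date) %>%
  group_by(Item, Kg) %>%
  # one row per item: first and last date, and last price minus first price
  summarise(first_date = first(Date),
            last_date = last(Date),
            total_change = last(Price) - first(Price),
            .groups = "drop")

print(item_summary)
```

This yields one row per item, so it could be joined to the wide table by Item and Kg (for example with left_join()) if everything is wanted side by side.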

Source data

df <- data.frame(
  stringsAsFactors = FALSE,
              Date = c("01.02.2022","02.02.2022",
                       "03.02.2022","03.02.2022","15.02.2022","15.02.2022",
                       "16.02.2022","20.02.2022","25.02.2022"),
              Item = c("Apple","Meat","Fish",
                       "Bread","Meat","Meat","Apple","Fish","Apple"),
             Price = c("$1.00","$4.00","$3.00",
                       "$1.00","$5.00","$3.00","$2.00","$3.00","$0.50"),
                Kg = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L)
) 
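
Regarding the UPDATE in the question (an item whose first record lands in Date_2 instead of Date_1): one possible cause, offered as a guess because the full dataset is not shown, is that pivot_wider() uses every column not named in names_from or values_from as a row identifier. Here that means Item and Kg together, while the appearance counter is numbered per Item only. If an item occurs with more than one Kg value (including NA), its records split across several output rows and some of those rows start at _2 or later. The Meat row in the UPDATE has Kg = NA, which would be consistent with this. The sketch below uses made-up data (toy) and a hypothetical helper (widen) to show the effect and one way around it, grouping by both Item and Kg; it assumes tidyr 1.2.0 or later for names_vary:

```r
library(dplyr)
library(tidyr)

# Hypothetical data: Meat has one row with Kg = 1 and a later row with Kg missing
toy <- data.frame(
  Date  = as.Date(c("2022-02-01", "2022-02-02", "2022-02-03", "2022-02-16")),
  Item  = c("Apple", "Meat", "Meat", "Apple"),
  Price = c(1, 4, 1, 2),
  Kg    = c(1L, 1L, NA, 1L)
)

# Hypothetical helper: the answer's pipeline with the grouping columns as arguments
widen <- function(data, ...) {
  data %>%
    arrange(Date) %>%
    group_by(...) %>%
    mutate(appearance = row_number(),
           change = Price - lag(Price)) %>%
    ungroup() %>%
    pivot_wider(names_from = appearance,
                values_from = c(Date, Price, change),
                names_vary = "slowest")
}

by_item    <- widen(toy, Item)      # numbering per Item: the Meat/NA row starts at Date_2
by_item_kg <- widen(toy, Item, Kg)  # numbering per Item and Kg: every row starts at Date_1

print(by_item)
print(by_item_kg)
```

If Kg should not distinguish rows at all, an alternative is to drop it before reshaping or to pass id_cols = Item to pivot_wider(). Which of these is right depends on what the real data looks like.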
