How to calculate "average sales share"
My data concerns a company and includes Total Sales and the amount of sales in three states: CA, TX and WI.
Data:
> dput(head(WalData))
structure(list(CA = c(11047, 9925, 11322, 12251, 16610, 14696
), TX = c(7381, 5912, 9006, 6226, 9440, 9376), WI = c(6984, 3309,
8883, 9533, 11882, 8664), Total = c(25412, 19146, 29211, 28010,
37932, 32736), date = structure(c(1296518400, 1296604800, 1296691200,
1296777600, 1296864000, 1296950400), tzone = "UTC", class = c("POSIXct",
"POSIXt")), event_type = c("NA", "NA", "NA", "NA", "NA", "Sporting"
), snap_CA = c(1, 1, 1, 1, 1, 1), snap_TX = c(1, 0, 1, 0, 1,
1), snap_WI = c(0, 1, 1, 0, 1, 1)), row.names = c(NA, -6L), class = c("tbl_df",
"tbl", "data.frame"))
I am struggling to calculate the average share of the company's total sales that each of the three states accounts for.
Furthermore, I must calculate the same average percentages for each year, month of the year and day of the week.
Any advice would be really helpful!
It would be easier to perform all the calculations if you get your data in long format.
library(dplyr)
library(tidyr)
WalData %>% pivot_longer(cols = CA:WI) %>% mutate(perc = value/Total)
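The long-format pipeline above stops at the per-row share. A minimal sketch of one way to finish the calculation follows, assuming "average sales share" means the mean of the daily shares per state. The column names state, sales and avg_perc are illustrative choices, not part of the original answer, and the data is rebuilt from the six rows in the question so the block runs on its own.

```r
library(dplyr)
library(tidyr)

# Same six rows as the dput() output in the question (only the columns needed here)
WalData <- tibble(
  CA    = c(11047, 9925, 11322, 12251, 16610, 14696),
  TX    = c(7381, 5912, 9006, 6226, 9440, 9376),
  WI    = c(6984, 3309, 8883, 9533, 11882, 8664),
  Total = c(25412, 19146, 29211, 28010, 37932, 32736)
)

avg_share <- WalData %>%
  # One row per (day, state); "state" and "sales" are illustrative names
  pivot_longer(cols = CA:WI, names_to = "state", values_to = "sales") %>%
  # Share of that day's total sales
  mutate(perc = sales / Total) %>%
  # Average the daily shares within each state
  group_by(state) %>%
  summarise(avg_perc = mean(perc, na.rm = TRUE))

avg_share
```

Adding a year, month or weekday column before group_by() and grouping by it together with state should give the per-period averages in the same long layout.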
Using dplyr, you can also try the following options. For the average sales, you can use the following code:
library(dplyr)
#Code 1
AvgSales <- WalData %>% select(c(CA,TX,WI)) %>%
summarise_all(mean,na.rm=T)
Output:
# A tibble: 1 x 3
CA TX WI
<dbl> <dbl> <dbl>
1 12642. 7890. 8209.
For the percentages, you need to compute the ratio against Total:
#Code 2
AvgSalesPerc <- WalData %>% select(c(CA,TX,WI,Total)) %>%
rowwise() %>% mutate(across(CA:WI,~./Total)) %>%
select(-Total) %>% ungroup() %>%
summarise_all(mean,na.rm=T)
Output:
# A tibble: 1 x 3
CA TX WI
<dbl> <dbl> <dbl>
1 0.444 0.278 0.278
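Code 2 averages the daily shares. "Average sales share" could also be read as each state's share of the pooled sales over the whole period, and the two readings generally give different numbers. The sketch below computes both so they can be compared. The names mean_of_shares and pooled_share are illustrative, and which definition is wanted is an assumption to settle with whoever asked for the figure.

```r
library(dplyr)

# Same six rows as the dput() output in the question (only the columns needed here)
WalData <- tibble(
  CA    = c(11047, 9925, 11322, 12251, 16610, 14696),
  TX    = c(7381, 5912, 9006, 6226, 9440, 9376),
  WI    = c(6984, 3309, 8883, 9533, 11882, 8664),
  Total = c(25412, 19146, 29211, 28010, 37932, 32736)
)

# Reading 1: mean of the daily shares (the quantity Code 2 computes)
mean_of_shares <- WalData %>%
  summarise(across(CA:WI, ~ mean(.x / Total, na.rm = TRUE)))

# Reading 2: state sales summed over all days, divided by total sales summed over all days
pooled_share <- WalData %>%
  summarise(across(CA:WI, ~ sum(.x, na.rm = TRUE) / sum(Total, na.rm = TRUE)))

mean_of_shares
pooled_share
```

The first reading weights every day equally, while the second gives high-volume days more weight.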
In the case of year, month and day, you can extract the value from your date variable, then use group_by() and obtain the summary. I will only do it for year, as it is easy to extend to month and day:
#Code 3 only year avg sales
AvgSalesYear <- WalData %>% mutate(Year=format(date,'%Y')) %>%
select(c(CA,TX,WI,Year)) %>%
group_by(Year) %>%
summarise_all(mean,na.rm=T)
Output:
# A tibble: 1 x 4
Year CA TX WI
<chr> <dbl> <dbl> <dbl>
1 2011 12642. 7890. 8209.
The same logic applies for percentages at year level:
#Code 4 only year avg sales percentage
AvgSalesPercYear <- WalData %>% mutate(Year=format(date,'%Y')) %>%
select(c(CA,TX,WI,Year,Total)) %>%
rowwise() %>% mutate(across(CA:WI,~./Total)) %>%
select(-Total) %>%
group_by(Year) %>%
summarise_all(mean,na.rm=T)
Output:
# A tibble: 1 x 4
Year CA TX WI
<chr> <dbl> <dbl> <dbl>
1 2011 0.444 0.278 0.278
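The extension to month and day of the week mentioned earlier could look like the sketch below. It is not the answerer's own code: the Month and Weekday columns and the result names are illustrative. It assumes "%m" (month number) and "%u" (ISO weekday number, Monday = 1) are suitable labels; "%A" would give weekday names instead, but those depend on the locale.

```r
library(dplyr)

# Same six rows as the dput() output in the question (only the columns needed here)
WalData <- tibble(
  CA    = c(11047, 9925, 11322, 12251, 16610, 14696),
  TX    = c(7381, 5912, 9006, 6226, 9440, 9376),
  WI    = c(6984, 3309, 8883, 9533, 11882, 8664),
  Total = c(25412, 19146, 29211, 28010, 37932, 32736),
  date  = as.POSIXct(
    c(1296518400, 1296604800, 1296691200, 1296777600, 1296864000, 1296950400),
    origin = "1970-01-01", tz = "UTC"
  )
)

# Daily shares plus the calendar fields to group by
shares <- WalData %>%
  mutate(
    across(CA:WI, ~ .x / Total),     # vectorised, so rowwise() is not needed
    Month   = format(date, "%m"),    # "01" to "12"
    Weekday = format(date, "%u")     # "1" (Monday) to "7" (Sunday)
  )

# Average share per month of the year
AvgSalesPercMonth <- shares %>%
  group_by(Month) %>%
  summarise(across(CA:WI, ~ mean(.x, na.rm = TRUE)))

# Average share per day of the week
AvgSalesPercWeekday <- shares %>%
  group_by(Weekday) %>%
  summarise(across(CA:WI, ~ mean(.x, na.rm = TRUE)))

AvgSalesPercMonth
AvgSalesPercWeekday
```

With only the six sample rows, every weekday group contains a single day, so the weekday averages only become meaningful on the full data set.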
We can use data.table:
library(data.table)
melt(setDT(WalData), measure = c("CA", "TX", "WI"))[, perc := value/Total][]
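The data.table call above yields the per-row share. One possible way to carry it through to the average share per state is sketched below. The names state, sales and avg_perc are illustrative, and the table is rebuilt from the six rows in the question so the block is self-contained.

```r
library(data.table)

# Same six rows as the dput() output in the question (only the columns needed here)
WalData <- data.table(
  CA    = c(11047, 9925, 11322, 12251, 16610, 14696),
  TX    = c(7381, 5912, 9006, 6226, 9440, 9376),
  WI    = c(6984, 3309, 8883, 9533, 11882, 8664),
  Total = c(25412, 19146, 29211, 28010, 37932, 32736)
)

# Long format: one row per (day, state); Total is kept as an id column
long <- melt(WalData,
             measure.vars  = c("CA", "TX", "WI"),
             variable.name = "state",
             value.name    = "sales")

# Share of the day's total, added by reference
long[, perc := sales / Total]

# Mean of the daily shares for each state
avg_share_dt <- long[, .(avg_perc = mean(perc, na.rm = TRUE)), by = state]

avg_share_dt
```

Per-period averages should follow the same pattern once a year, month or weekday column is added to the by argument.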