
Grouping and checking for a condition in R

I have two tables: restaurant_trans and restaurant_master.

restaurant_trans has name, date, net_sales.

This is a transaction file with sales for 50 restaurants recorded for 30 days each (1500 obs).

restaurant_master has name, go.live.date, franchise.

This is a master file with the name of each restaurant; go.live.date is the date a particular device was installed in the restaurant.

I want to find the net sales of each restaurant before and after the device was installed. I first want the data to be grouped.

I tried this code for subsetting the data:

dummayvar = 0;

for (i in 1:nrow(restaurant_master)){
  for (j in 1:nrow(restaurant_trans)){
    if(restaurant_trans$Restaurant.Name[j]==restaurant_master$Restaurant.Name[i]){
      if(restaurant_trans$Date[j] < restaurant_master$Go.Live.Date[i]){
      append(dummayvar, restaurant_trans$Date)
      }
    }
  }
}

This gives an error:

"level sets of factors are different"

Please help!!
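On the error itself: R raises "level sets of factors are different" when two factors with different sets of levels are compared with ==. The sketch below reproduces it with two made-up factors (trans_name and master_name are hypothetical stand-ins, assuming the name columns in both tables were read in as factors) and shows that comparing the values as character strings avoids it.

```r
# Hypothetical stand-ins for the name columns of the two tables
trans_name  <- factor(c("A", "B", "C"))   # levels: A, B, C
master_name <- factor(c("A", "B"))        # levels: A, B

# Comparing factors whose level sets differ raises the error;
# tryCatch captures the message instead of stopping the script
res <- tryCatch(trans_name[1] == master_name[1],
                error = function(e) conditionMessage(e))
print(res)

# Comparing the underlying strings works regardless of the levels
same <- as.character(trans_name[1]) == as.character(master_name[1])
print(same)
```

If the date columns were also read as factors, they would likewise need converting (for example with as.Date) before a comparison such as < is meaningful.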

Consider a merge() instead of nested for loops. Simply merge the netsales and master data frames by name, then subset the merged data frame by comparing each sale's date with the restaurant's go.live.date. Finally, aggregate net sales by restaurant name and franchise, or by either one individually.

# DATA FRAME EXAMPLES
netsales <- data.frame(name=c('A', 'A', 'A', 'A', 'A',
                              'B', 'B', 'B', 'B', 'B',
                              'C', 'C', 'C', 'C', 'C'),
                       date=c('6/1/2015', '6/15/2015', '7/1/2015', '9/1/2015', '11/15/2015',
                              '6/5/2015', '6/20/2015', '7/15/2015', '8/1/2015', '10/15/2015',
                              '6/10/2015', '7/10/2015', '8/15/2015', '9/20/2015', '9/30/2015'),
                       net_sales=c(1500,  600,  1200,  850,  750,
                                   1120,  560,  720,  340,  890,
                                   1150,  410,  300,  250,  900))
netsales$date <- as.Date(strptime(netsales$date, '%m/%d/%Y'))
str(netsales)

master <- data.frame(name=c('A', 'B', 'C'),
                     go.live.date=c('7/25/2015', '8/1/2015', '7/1/2015'),
                     franchise=c('R Co.', 'Python, Inc.', 'C# Ltd.'))

master$go.live.date <- as.Date(strptime(master$go.live.date, '%m/%d/%Y'))
str(master)    

# MERGE AND AGGREGATE BEFORE GO LIVE SALES
beforelive <- merge(netsales, master, by='name')
beforelive <- beforelive[beforelive$date < beforelive$go.live.date,]

beforelivesales <- aggregate(net_sales ~ name + franchise, beforelive, FUN=sum)

# MERGE AND AGGREGATE AFTER GO LIVE SALES
afterlive <- merge(netsales, master, by='name')
afterlive <- afterlive[afterlive$date >= afterlive$go.live.date,]

afterlivesales <- aggregate(net_sales ~ name + franchise, afterlive, FUN=sum)
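As a possible variation on the two blocks above, the merge can be done once and each sale labelled as falling before or after go-live, so that a single aggregate() call returns both totals. This is only a sketch: the data frames ns and ms are small made-up examples in the same shape as netsales and master, and the period column is a name introduced here for illustration.

```r
# Made-up data in the same shape as the netsales and master examples
ns <- data.frame(name = c('A', 'A', 'B', 'B'),
                 date = as.Date(c('2015-06-01', '2015-09-01',
                                  '2015-06-05', '2015-10-15')),
                 net_sales = c(1500, 850, 1120, 890))
ms <- data.frame(name = c('A', 'B'),
                 go.live.date = as.Date(c('2015-07-25', '2015-08-01')),
                 franchise = c('R Co.', 'Python, Inc.'))

# Merge once, then label each sale relative to the go-live date
sales <- merge(ns, ms, by = 'name')
sales$period <- ifelse(sales$date < sales$go.live.date, 'before', 'after')

# One row per restaurant, franchise and period
periodsales <- aggregate(net_sales ~ name + franchise + period, sales, FUN = sum)
print(periodsales)
```

Keeping both periods in one data frame makes it easier to compare them side by side, at the cost of one extra column.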
