简体   繁体   中英

Grouping and checking for a condition in R

I have two tables: restaurant_trans and restaurant_master

restaurant_trans has name, date, net_sales

This is a transaction file with sales for 50 restaurant recorded for 30 days each (1500 obs).

restaurant_master has name, go.live.date, franchise

This is a master file with name of the restaurant and 'go.live.date' is the date a particular device was installed in the restaurant.

I want to find the net sales of the restaurant before and after the device was installed. I first want the data to be grouped.

I tried this code for subsetting the data

dummayvar = 0;

for (i in 1:nrow(restaurant_master)){
  for (j in 1:nrow(restaurant_trans)){
    if(restaurant_trans$Restaurant.Name[j]==restaurant_master$Restaurant.Name[i]){
      if(restaurant_trans$Date[j] < restaurant_master$Go.Live.Date[i]){
      append(dummayvar, restaurant_trans$Date)
      }
    }
  }
}

This is giving an error :

"level sets of factors are different"

Please help!!

Consider a merge() instead of nested for loops. Simply merge restaurant netsales and master data frames by name and then subset data frames according to net sales' dates and master's go.live.dates. Finally, aggregate net sales by restaurant name and franchise or individually.

# DATA FRAME EXAMPLES
netsales <- data.frame(name=c('A', 'A', 'A', 'A', 'A',
                              'B', 'B', 'B', 'B', 'B',
                              'C', 'C', 'C', 'C', 'C'),
              date=c('6/1/2015', '6/15/2015', '7/1/2015', '9/1/2015', '11/15/2015', 
                     '6/5/2015', '6/20/2015', '7/15/2015', '8/1/2015', '10/15/2015',
                     '6/10/2015', '7/10/2015', '8/15/2015', '9/20/2015', '9/30/2015'),
              net_sales=c(1500,  600,  1200,  850,  750,  
                          1120,  560,  720,  340,  890,  
                          1150,  410,  300,  250,  900))    
netsales$date <- as.Date(strptime(netsales$date, '%m/%d/%Y'))
str(netsales)    

master <- data.frame(name=c('A', 'B', 'C'),
                     go.live.date=c('7/25/2015', '8/1/2015', '7/1/2015'),
                     franchise=c('R Co.', 'Python, Inc.', 'C# Ltd.'))

master$go.live.date <- as.Date(strptime(master$go.live.date, '%m/%d/%Y'))
str(master)    

# MERGE AND AGGREGATE BEFORE GO LIVE SALES
beforelive <- merge(netsales, master, by='name')
beforelive <- beforelive[beforelive$date < beforelive$go.live.date,]

beforelivesales <- aggregate(net_sales ~ name + franchise, beforelive, FUN=sum)

# MERGE AND AGGREGATE AFTER GO LIVE SALES
afterlive <- merge(netsales, master, by='name')
afterlive <- afterlive[afterlive$date >= afterlive$go.live.date,]

afterlivesales <- aggregate(net_sales ~ name + franchise, afterlive, FUN=sum)

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM