Divide a table into groups based on an attribute in R

I have a table (a CSV file) whose first two attributes are Store and Dept, followed by other attributes such as Date and Sales. The table looks like this:

Store Dept Date Sales Holiday  
1      1    ...  ...   ...  (... means some random value)
1      1    ...  ...   ...  
1      2    ...  ...   ...  
1      2    ...  ...   ...  
1      3    ...  ...   ...  
2      1    ...  ...   ...   
2      1    ...  ...   ...  
2      2    ...  ...   ...   
2      2    ...  ...   ...  
  1. First, I loaded this file into a variable named train:

    train <- read.csv("train.csv")

  2. Then I grouped it by Store:

    dataByStore <- split(train, train$Store)

  3. Now I want to take dataByStore and divide it further by department, so that I end up with the data for each department of each store. I think that for this I will have to initialise an array whose size is the number of stores, e.g. dataByStoreByDept, and for each store i do:

    dataByStoreByDept[i] <- split(dataByStore[i], dataByStore[i]$Dept)

So dataByStoreByDept[i][0] will contain the data for the first department of store i, and so on. Can anyone tell me the syntax for this? I don't know how to declare such a 2D array. A short explanation with a few lines of code would suffice.

Please point out any of my assumptions above that are wrong.

Update:
For the third step, I want to write a function along the following lines (it is only the syntax that I don't know):

dataByStoreByDept <- array(seq_len(dataByStore))  # seq_len(dataByStore) is meant to be the number of stores

for (i in seq_len(dataByStore)) {
  dataByStoreByDept[i] <- split(dataByStore, dataByStore$dept)
}

Making up some data:

df <- data.frame(Store = sample(1:2, 20, replace = TRUE), 
                 Dept  = sample(1:2, 20, replace = TRUE))

Suppose we want to split the data.frame df first by Store and then by Dept. We can do that as follows:

lapply(split(df, as.factor(df$Store)), FUN = function(x) split(x, x$Dept))

The split(df, as.factor(df$Store)) part does the first split, by Store, and its result is a list. We then use lapply to apply split to each element of that list. I wrapped split in an anonymous function so that I could pass the second split factor, x$Dept, to split.
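To make the two stages easier to follow, here is the same idea written as separate steps. This is only a sketch: the small deterministic data frame, its Sales values, and the variable names byStore and byStoreByDept are made up for illustration and are not from the question's train.csv.

```r
# Hypothetical data: 2 stores x 2 departments, 2 rows each.
df <- data.frame(Store = rep(1:2, each = 4),
                 Dept  = rep(c(1, 1, 2, 2), times = 2),
                 Sales = 1:8)  # made-up values

# Step 1: split by Store. The result is a list of data frames named by store.
byStore <- split(df, df$Store)

# Step 2: split each store's data frame by Dept.
# lapply() calls the anonymous function once per store and collects
# the results in a list that keeps the store names.
byStoreByDept <- lapply(byStore, function(x) split(x, x$Dept))

# Inspect the nesting: outer names are stores, inner names are departments.
names(byStoreByDept)
lapply(byStoreByDept, names)
```

No array has to be declared beforehand; lapply builds the outer list itself.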

This gives you a list of lists, as you describe.
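Regarding the dataByStoreByDept[i][0] notation in the question: R indexes from 1, not 0, and a list element is extracted with [[ ]], whereas [ ] returns a sub-list. The sketch below shows how the nested result might be accessed. The data and the names nested, s1d2, firstPiece and flat are hypothetical. The last line shows an alternative that may also suit you, a single flat list produced by passing both factors to split at once.

```r
# Hypothetical data: 2 stores x 2 departments, 2 rows each.
df <- data.frame(Store = rep(1:2, each = 4),
                 Dept  = rep(c(1, 1, 2, 2), times = 2),
                 Sales = 1:8)  # made-up values

nested <- lapply(split(df, df$Store), function(x) split(x, x$Dept))

# Extract by name: the data frame for store 1, department 2.
s1d2 <- nested[["1"]][["2"]]

# Extract by position: first store, first department (indexing starts at 1).
firstPiece <- nested[[1]][[1]]

# Alternative: one flat list with one element per Store/Dept combination.
# drop = TRUE omits combinations that do not occur in the data.
flat <- split(df, list(df$Store, df$Dept), drop = TRUE)
names(flat)
```

The nested form is convenient when you loop over stores; the flat form is simpler if you only ever need one store/department pair at a time.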
