Divide a table into groups based on an attribute in R
I have a table (a CSV file) whose first two attributes are Store and Dept, followed by other attributes such as Date, Sales, etc. The table looks like this:
Store Dept Date Sales Holiday
1 1 ... ... ... (... means some random value)
1 1 ... ... ...
1 2 ... ... ...
1 2 ... ... ...
1 3 ... ... ...
2 1 ... ... ...
2 1 ... ... ...
2 2 ... ... ...
2 2 ... ... ...
First, I loaded this file into a variable named train:
train <- read.csv("train.csv")
Then I grouped it by Store:
dataByStore <- split(train, train$Store)
Now I want to take dataByStore and divide it further by department, so that I end up with the data for each department of each store. I think that for this I will have to initialise an array whose size is the number of stores, e.g. dataByStoreByDept, and for each store i do:
dataByStoreByDept[i] <- split(dataByStore[i], dataByStore[i]$Dept)
So dataByStoreByDept[i][0] will contain the data for the first department of store i, and so on. Can anyone tell me the syntax to do this? I don't know how to declare such a 2D array. A short explanation with a few lines of code would suffice.
Please mention if any of my assumptions above are wrong.
Update:
For the third step, I want to write a function that goes roughly as follows (it is only the syntax that I don't know):
dataByStoreByDept <- array(seq_len(dataByStore))  # seq_len(dataByStore) is meant to be the number of stores
for (i in seq_len(dataByStore)) {
  dataByStoreByDept[i] <- split(dataByStore, dataByStore$dept)
}
Making up some data:
df <- data.frame(Store = sample(1:2, 20, replace = TRUE),
Dept = sample(1:2, 20, replace = TRUE))
Suppose we want to split the data.frame df, first by Store and then by Dept. We can do that as follows:
lapply(split(df, as.factor(df$Store)), FUN = function(x) split(x, x$Dept))
The split(df, as.factor(df$Store)) part does the first split, by Store. The result of that is a list. We then use lapply to apply split to each element of the list created by split(df, as.factor(df$Store)). I put the split into a wrapper function so that I could pass the second split factor to split.
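For readers who think in terms of the loop from the question, the lapply call is roughly equivalent to an explicit loop over the per-store list. The following is only a sketch: the names by_store, by_store_by_dept and nested are made up for illustration, and set.seed is added purely so the made-up data is reproducible.

```r
set.seed(1)  # assumption: added only so the sample data is reproducible
df <- data.frame(Store = sample(1:2, 20, replace = TRUE),
                 Dept  = sample(1:2, 20, replace = TRUE))

# First split: one data.frame per store
by_store <- split(df, df$Store)

# Pre-allocate a list with one slot per store; a list (not an array)
# is the natural container because each slot holds another list
by_store_by_dept <- vector("list", length(by_store))
names(by_store_by_dept) <- names(by_store)

for (i in seq_along(by_store)) {
  # [[i]] extracts the data.frame itself; [i] would return a one-element list
  by_store_by_dept[[i]] <- split(by_store[[i]], by_store[[i]]$Dept)
}

# The lapply version, for comparison
nested <- lapply(split(df, as.factor(df$Store)), FUN = function(x) split(x, x$Dept))
identical(by_store_by_dept, nested)
```

The lapply form is usually preferred in R because it needs no pre-allocation and keeps the store names automatically.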
This will give you a list of lists, as you describe.
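One assumption in the question does need correcting: R indexes from 1, not 0, and single brackets on a list return a sub-list rather than the element itself, so dataByStoreByDept[i][0] would not give the first department of store i. Below is a small sketch of how the nested list can be indexed, using hand-made data (the Sales values and the variable names are invented purely for illustration).

```r
# Hypothetical toy data; the Sales values are made up
df <- data.frame(Store = c(1, 1, 1, 2, 2, 2),
                 Dept  = c(1, 2, 2, 1, 1, 2),
                 Sales = c(10, 20, 30, 40, 50, 60))

nested <- lapply(split(df, df$Store), function(x) split(x, x$Dept))

# Index by name: store "2", department "1"
store2_dept1 <- nested[["2"]][["1"]]

# Index by position: first store, second department
store1_dept2 <- nested[[1]][[2]]

nrow(store2_dept1)   # 2 (the rows with Sales 40 and 50)
store1_dept2$Sales   # 20 30
```

Indexing by name is usually safer than indexing by position, since positions shift if a store or department is missing from the data.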