R programming; double indexing loop to find mean of subsetted data frame
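I want to run a "for" loop that uses three indices. Basically, I want to subset a data frame, find the mean of the subset, and then place that mean into a new data frame. I am having trouble running this loop; all I get is NaN's.
The first index matches the rows of the new data frame (which I call data.avg); the second index goes through the vector used in the first half of the subsetting condition (the date values come from a particular month); the third index works the same way, but for the second half of the subsetting condition (the row is associated with Breakfast / Dinner / Snacks).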
# Create the data frame
data1 = data.frame(date = sort(rep(as.Date(42948:43101, origin = "1899-12-30"),3)),
serving = rep(c("Breakfast", "Dinner", "Snacks"), 154),
units = rep(c(1,5,49), 154)
)
View(data1[order(data1$date),])
# take mean of each subset and place it in a new data frame called data.avgs
# it should consist of 8x3 data frame; rows (column1) are "August","September", "October", "November", "December", "January","February", "March".
# columns should be "Breakfast", "Dinner", "Snack"
month.index = c(8:12, 1)
serving.index = c("Breakfast", "Dinner", "Snack")
# create the data frame with the means using placeholder data
data.avg = data.frame(months = c(month.name[8:12], month.name[1]),
bf.avg = c(1:6),
dinner.avg = c(1:6),
snack.avg = c(1:6))
# now start replacing; find the mean of the subset of the original data frame.
# find the mean of all dates that are for August, and whose serving type are for Breakfast.
for(j in 1:6){
for(i in month.index){
for(v in 2:4){
data.avg[j,v] = mean(
subset(data1,
months(data1$date) == month.name[i] & data1$serving == serving.index[v])$units
)
}
}
}
For example, when I compute the mean without the loop, like this:
mean(subset(data1,
months(data1$date) == "September" & data1$serving == "Breakfast")$unit)
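I get the correct mean. So I think my problem may be in how the indices are set up.
Any help would be greatly appreciated,
Thanks
Edit: fixed the code above. The resulting data frame is as follows: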
months bf.avg dinner.avg snack.avg
1 August 5 49 NaN
2 September 5 49 NaN
3 October 5 49 NaN
4 November 5 49 NaN
5 December 5 49 NaN
6 January 5 49 NaN
This is what I am looking for:
> mean(subset(data1,
+ months(data1$date) == "September" & data1$serving == "Breakfast")$unit)
[1] 1
> mean(subset(data1,
+ months(data1$date) == "September" & data1$serving == "Dinner")$unit)
[1] 5
> mean(subset(data1,
+ months(data1$date) == "September" & data1$serving == "Snacks")$unit)
[1] 49
My understanding is that these should be data1.avg [1,1:3]
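You set "Snack" in serving.index, but data1 contains "Snacks". Because no row matches "Snack", that subset is empty and its mean is NaN.
Then try the following code inside the for loop: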
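The mismatch can be checked directly before running any loop. The following is only a minimal sketch: it rebuilds the serving column from the question as a plain vector, and the name missing.labels is a hypothetical helper introduced here for illustration.

```r
# Serving labels as they appear in data1 in the question
serving <- rep(c("Breakfast", "Dinner", "Snacks"), 154)

# Labels the loop asks for (note "Snack" without the trailing "s")
serving.index <- c("Breakfast", "Dinner", "Snack")

# Requested labels that never occur in the data
missing.labels <- setdiff(serving.index, unique(serving))
print(missing.labels)

# A condition that matches no rows leaves an empty vector,
# and the mean of an empty numeric vector is NaN
print(mean(numeric(0)))
```

Any label reported by setdiff() will produce a NaN column in data.avg.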
data.avg[j,v+1] = mean(
subset(data1,months(data1$date) == month.name[i] & as.character(data1$serving) == serving.index[v])$units)
data.avg
months bf.avg dinner.avg snack.avg
1 August 1 5 49
2 September 1 5 49
3 October 1 5 49
4 November 1 5 49
5 December 1 5 49
6 January 1 5 49
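For reference, here is one way the whole loop might look once the label is corrected. This is a sketch rather than the original poster's exact code, and it makes two assumptions of its own: it uses the row index j to pick the month (so the separate loop over month.index is dropped), and it compares numeric months via format(date, "%m") instead of months(), so the result does not depend on the locale's month names. The variables row.month and keep are hypothetical helpers.

```r
# Data frame from the question
data1 <- data.frame(
  date = sort(rep(as.Date(42948:43101, origin = "1899-12-30"), 3)),
  serving = rep(c("Breakfast", "Dinner", "Snacks"), 154),
  units = rep(c(1, 5, 49), 154)
)

month.index <- c(8:12, 1)
serving.index <- c("Breakfast", "Dinner", "Snacks")  # "Snacks", matching data1

# Result frame: one row per month, one column per serving type
data.avg <- data.frame(
  months = month.name[month.index],
  bf.avg = NA_real_,
  dinner.avg = NA_real_,
  snack.avg = NA_real_
)

# Numeric month of every row (assumption: used instead of months())
row.month <- as.integer(format(data1$date, "%m"))

for (j in seq_along(month.index)) {      # row of data.avg
  for (v in seq_along(serving.index)) {  # serving type
    keep <- row.month == month.index[j] &
      as.character(data1$serving) == serving.index[v]
    # Column 1 holds the month name, so the means start at column 2
    data.avg[j, v + 1] <- mean(data1$units[keep])
  }
}

print(data.avg)
```

Tying the month to j means each row is filled from its own month, rather than being overwritten on every pass of an inner month loop.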