Loop through columns of a data frame in R
I have the following problem:
levelsvar <- c("arrears", "expenses", "warmhome", "telephone", "colorTV", "washer", "car", "meatfish", "holiday")
variables <- NULL
for (i in 1:length(levelsvar)) {
variables <- sapply(levelstest, function(x) (length(test$levelsvar[i][test$country==x & test$levelsvar[i]=="1"]) + length(test$levelsvar[i][test$country==x & test$levelsvar[i]=="2"])) / length(test$levelsvar[i][test$country==x]))
}
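I want to use a for loop to run the function shown above 9 times, once for each entry of "levelsvar". I have tried many times, but it always fails. I think the problem is that R reads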
test$"arrears"
instead of
test$arrears
I have already tried noquote(), but it did not help.
Do you have a solution to this problem?
Thanks in advance!
Edit:
As an example:
levelstest <- c("AT", "BE")
levelsvar <- c("arrears", "expenses", "warmhome", "telephone", "colorTV", "washer", "car", "meatfish", "holiday")
test <- structure(list(country = c("AT", "AT", "AT", "BE", "BE", "BE"
), arrears = c(1L, 1L, 1L, 2L, 1L, 1L), expenses = c(3L, 1L,
3L, 1L, 1L, 2L), warmhome = c(1L, 2L, 2L, 1L, 1L, 1L), telephone = c(4L,
1L, 4L, 4L, 3L, 3L), colorTV = c(2L, 1L, 3L, 4L, 3L, 1L), washer = c(4L,
1L, 3L, 3L, 1L, 2L), car = c(4L, 4L, 4L, 4L, 3L, 2L), meatfish = c(2L,
1L, 1L, 4L, 1L, 1L), holiday = c(2L, 2L, 1L, 3L, 4L, 2L)), .Names = c("country",
"arrears", "expenses", "warmhome", "telephone", "colorTV", "washer",
"car", "meatfish", "holiday"), row.names = c(NA, 6L), class = "data.frame")
Now I try
variables <- NULL
for (i in 1:length(levelsvar)) {
variables <- sapply(levelstest, function(x) (length(test[levelsvar[i]][test$country==x & test[levelsvar[i]]=="1"]) + length(test[levelsvar[i]][test$country==x & test[levelsvar[i]]=="2"])) / length(test[levelsvar[i]][test$country==x]))
}
But this does not work.
What I want to obtain is the percentage
(length(test$arrears[test$country==x & test$arrears=="1"]) + length(test$arrears[test$country==x & test$arrears=="2"])) / length(test$arrears[test$country==x])
that is, the share of values 1 and 2, for every variable in levelsvar and for every country in levelstest.
The solution to my problem is as follows:
test <- (structure(list(country = c("AT", "AT", "AT", "BE", "BE", "BE"
), arrears = c(1L, 1L, 1L, 2L, 1L, 1L), expenses = c(3L, 1L,
3L, 1L, 1L, 2L), warmhome = c(1L, 2L, 2L, 1L, 1L, 1L), telephone = c(4L,
1L, 4L, 4L, 3L, 3L), colorTV = c(2L, 1L, 3L, 4L, 3L, 1L), washer = c(4L,
1L, 3L, 3L, 1L, 2L), car = c(4L, 4L, 4L, 4L, 3L, 2L), meatfish = c(2L,
1L, 1L, 4L, 1L, 1L), holiday = c(2L, 2L, 1L, 3L, 4L, 2L)), .Names = c("country",
"arrears", "expenses", "warmhome", "telephone", "colorTV", "washer",
"car", "meatfish", "holiday"), row.names = c(NA, 6L), class = "data.frame"))
levelsvar <- c("arrears", "expenses", "warmhome", "telephone", "colorTV", "washer", "car", "meatfish", "holiday")
levelstest <- c("AT", "BE")
variables <- NULL
for (i in 1:length(levelsvar)) {
variables <- cbind(variables, sapply(levelstest, function(x) (length(test[levelsvar[i]][test[1]==x & test[levelsvar[i]]=="1"]) + length(test[levelsvar[i]][test[1]==x & test[levelsvar[i]]=="2"])) / length(test[levelsvar[i]][test[1]==x])))
}
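A note on why indexing with test[levelsvar[i]] works where test$levelsvar[i] did not: R parses test$levelsvar[i] as (test$levelsvar)[i], and `$` takes the name after it literally, so it looks for a column called "levelsvar" and returns NULL. The sketch below illustrates this on a cut-down copy of the example data (an assumption: only two of the nine variables are kept to stay short); the names by_dollar and by_name are made up for the illustration.

```r
# Cut-down copy of the example data (assumption: two variables are enough here)
test <- data.frame(
  country  = c("AT", "AT", "AT", "BE", "BE", "BE"),
  arrears  = c(1L, 1L, 1L, 2L, 1L, 1L),
  expenses = c(3L, 1L, 3L, 1L, 1L, 2L)
)
levelsvar <- c("arrears", "expenses")
i <- 1

# `$` takes the name literally: this looks for a column called "levelsvar"
by_dollar <- test$levelsvar[i]
print(is.null(by_dollar))

# `[[` evaluates its argument, so the string stored in levelsvar[i] is used
by_name <- test[[levelsvar[i]]]
print(identical(by_name, test$arrears))
```

Both checks should print TRUE. Using test[[levelsvar[i]]] returns a plain vector, which is usually easier to subset than the one-column data frame returned by test[levelsvar[i]].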
All you need to do is this (tested on your example):
apply(test[-1],MARGIN = 2,function(x){
tapply(x,test$country,function(y){
sum(y %in% c(1,2))/length(y)
})
})
apply() with MARGIN = 2 goes along your columns, and tapply() computes the custom function by group (country). It even keeps your variable names. test[-1] skips the country column.
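If only some columns or some countries are of interest, a variant that selects them by name may be handy. This is a minimal sketch, not part of the answer above: it assumes a cut-down copy of the example data, and the names keep and shares are made up for the illustration. It uses `[[` to look up each column by the string in levelsvar, and mean() of a logical vector to get the share of values 1 and 2.

```r
# Cut-down copy of the example data (assumption: three variables only)
test <- data.frame(
  country  = c("AT", "AT", "AT", "BE", "BE", "BE"),
  arrears  = c(1L, 1L, 1L, 2L, 1L, 1L),
  expenses = c(3L, 1L, 3L, 1L, 1L, 2L),
  warmhome = c(1L, 2L, 2L, 1L, 1L, 1L)
)
levelsvar  <- c("arrears", "expenses")  # only the variables of interest
levelstest <- c("AT", "BE")             # only the countries of interest

# Rows belonging to the requested countries
keep <- test$country %in% levelstest

# One column per variable, one row per country
shares <- sapply(levelsvar, function(name) {
  tapply(test[[name]][keep], test$country[keep],
         function(y) mean(y %in% c(1, 2)))
})
print(shares)
```

On this data the AT row for expenses should come out as 1/3 (one of the three values is 1 or 2), and every other cell as 1.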