R Error in `$<-.data.frame`(`*tmp*`, "newCol", value = "categories") : replacement has 1 row, data has 0
I am trying to categorize messy data using a data frame with four columns:
company_name        categories    search   company_type
John landscaping    Landscaping   lawn     NA
Brother Lawn care   Cleaning      clean    NA
Top cleaning        Painting      paint    NA
I would like my final result to look like this:
company_name        categories    search   company_type
John landscaping    Landscaping   lawn     Landscaping
Brother Lawn care   Cleaning      clean    Landscaping
Top cleaning        Painting      paint    Cleaning
I am using the function written by Chris Leonard, described here: https://r-dir.com/blog/2015/01/quickly-categorize-messy-data.ZFC35FDC70D5FC69D239888A
Here is the code:
df$company_type <- NA

categorizeDF <- function(df, searchColName, searchList, catList, newColName="Category") {
  catDF <- data.frame(matrix(ncol=ncol(df), nrow=0))
  colnames(catDF) <- paste0(names(df))
  df$sequence <- seq(nrow(df))
  for (i in seq_along(searchList)) {
    rownames(df) <- NULL
    index <- grep(searchList[i], df[,which(colnames(df) == searchColName)], ignore.case=TRUE)
    tempDF <- df[index,]
    tempDF$newCol <- catList[i]
    catDF <- rbind(catDF, tempDF)
    df <- df[-index,]
  }
  if (nrow(df) > 0) {
    df$newCol <- "OTHER"
    catDF <- rbind(catDF, df)
  }
  catDF <- catDF[order(catDF$sequence),]
  catDF$sequence <- NULL
  rownames(catDF) <- NULL
  catDF$newCol <- as.factor(catDF$newCol)
  colnames(catDF)[which(colnames(catDF) == "newCol")] <- newColName
  catDF
}

sorted <- categorizeDF(df, "company_name", "search", "categories", "company_type")
However, I get an error (with traceback):
Error in `$<-.data.frame`(`*tmp*`, "newCol", value = "categories") :
  replacement has 1 row, data has 0
4. stop(sprintf(ngettext(N, "replacement has %d row, data has %d",
       "replacement has %d rows, data has %d"), N, nrows), domain = NA)
3. `$<-.data.frame`(`*tmp*`, "newCol", value = "categories")
2. `$<-`(`*tmp*`, "newCol", value = "categories")
1. categorizeDF(df, "company_name", "search", "categories", "company_type")
Any help would be greatly appreciated.
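This is caused by a search string that does not match anything in the messy data column. When grep() finds no match it returns integer(0), so tempDF has zero rows, and assigning the single value catList[i] to tempDF$newCol fails with "replacement has 1 row, data has 0".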
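The failure can be reproduced outside the function. The snippet below is a minimal sketch on a made-up data frame (`toy` and its contents are assumptions for illustration, not the asker's real data). It also shows a second hazard of an empty index: negative indexing with `integer(0)` selects no rows at all rather than keeping every row.

```r
# Toy data frame, assumed for illustration only.
toy <- data.frame(
  company_name = c("John landscaping", "Brother Lawn care", "Top cleaning"),
  stringsAsFactors = FALSE
)

# No company name contains the literal text "search", so grep() finds nothing.
index <- grep("search", toy$company_name, ignore.case = TRUE)
length(index)  # 0

# Subsetting with an empty index gives a zero-row data frame.
empty <- toy[index, , drop = FALSE]
nrow(empty)  # 0

# Assigning a length-1 value to a column of a zero-row data frame is an error.
result <- tryCatch(
  {
    empty$newCol <- "categories"
    "no error"
  },
  error = function(e) conditionMessage(e)
)
result

# -integer(0) is still integer(0), so this keeps zero rows, not all rows.
remaining <- toy[-index, , drop = FALSE]
nrow(remaining)  # 0
```

Both problems go away if the loop skips an iteration whenever `index` is empty.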
Updated version, which works:
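categorizeDF <- function(df, searchColName, searchList, catList, newColName="Category") {
  # Empty data frame that collects the categorized rows
  catDF <- data.frame(matrix(ncol=ncol(df), nrow=0))
  colnames(catDF) <- paste0(names(df))
  # Remember the original row order so it can be restored at the end
  df$sequence <- seq(nrow(df))
  for (i in seq_along(searchList)) {
    rownames(df) <- NULL
    index <- grep(searchList[i], df[,which(colnames(df) == searchColName)], ignore.case=TRUE)
    # Skip search strings that match nothing; otherwise tempDF would have
    # zero rows and df[-index,] would drop every remaining row
    if (identical(index,integer(0))){
      next
    }
    tempDF <- df[index,]
    tempDF$newCol <- catList[i]
    catDF <- rbind(catDF, tempDF)
    # Remove matched rows so each row receives only its first matching category
    df <- df[-index,]
  }
  # Rows that matched no search string are labelled "OTHER"
  if (nrow(df) > 0) {
    df$newCol <- "OTHER"
    catDF <- rbind(catDF, df)
  }
  catDF <- catDF[order(catDF$sequence),]
  catDF$sequence <- NULL
  rownames(catDF) <- NULL
  catDF$newCol <- as.factor(catDF$newCol)
  colnames(catDF)[which(colnames(catDF) == "newCol")] <- newColName
  catDF
}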
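Note that the call in the question passes the literal strings "search" and "categories" as searchList and catList, so even with the guard in place every row would end up in "OTHER". The arguments are meant to be character vectors of patterns and labels. As a rough alternative sketch, the same "first match wins, otherwise OTHER" logic can be written without building and re-sorting intermediate data frames. The helper name `categorize_first_match` and the sample vectors are hypothetical, not part of the original function.

```r
# Hypothetical helper (not from the original post): the first matching
# pattern wins, and values that match no pattern are labelled `other`.
categorize_first_match <- function(x, patterns, categories, other = "OTHER") {
  stopifnot(length(patterns) == length(categories))
  out <- rep(NA_character_, length(x))
  for (i in seq_along(patterns)) {
    # Only consider values that have not been categorized yet.
    hit <- is.na(out) & grepl(patterns[i], x, ignore.case = TRUE)
    out[hit] <- categories[i]  # harmless no-op when nothing matches
  }
  out[is.na(out)] <- other
  factor(out)
}

# Sample data, assumed for illustration only.
df <- data.frame(
  company_name = c("John landscaping", "Brother Lawn care", "Top cleaning"),
  stringsAsFactors = FALSE
)
patterns   <- c("lawn", "clean", "paint")
categories <- c("Landscaping", "Cleaning", "Painting")

df$company_type <- categorize_first_match(df$company_name, patterns, categories)
df
```

With these sample patterns, "John landscaping" lands in "OTHER" because it does not contain "lawn"; a broader pattern (for example "landscap", an assumption about what the asker wants) would be needed to label it Landscaping. The "paint" pattern matches nothing and is simply skipped.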