[英]Count the occurrences of factors
我有這樣的df,由colname Var1上的1列組成
df <- read.table(text = "Var1
|12|24|22|1|4
|12|23|22|1|445
|12|22|22|1|4
|101|116
|101|116|116|174
|101|116|125|174
|101|116|150|174
|101|116|156
|101|116|156|174
|101|116|162", header = TRUE, stringsAsFactors = FALSE)
問題:
DF <- read.table(text = "Var1
|12|24|22|1|4
|12|23|22|1|445
|12|22|22|1|4
|101|116
|101|116|116|174
|101|116|125|174
|101|116|150|174
|101|116|156
|101|116|156|174
|101|116|162", header = TRUE, stringsAsFactors = FALSE)
x <- strsplit(DF$Var1, "|", fixed = TRUE)
sum(unlist(x) == "22")
#[1] 4
sum(sapply(x, function(s) "22" %in% s))
#[1] 3
您也可能只使用|
讀取了數據集|
作為列分隔符,那么所有操作將非常簡單
df <- as.matrix(read.table(text = "|12|24|22|1|4
|12|23|22|1|445
|12|22|22|1|4
|101|116
|101|116|116|174
|101|116|125|174
|101|116|150|174
|101|116|156
|101|116|156|174
|101|116|162", fill = TRUE, sep = "|"))
sum(df == 22, na.rm = TRUE)
# [1] 4
(rowSums(df == 22, na.rm = TRUE) > 0) + 0
# [1] 1 1 1 0 0 0 0 0 0 0
sum(rowSums(df == 22, na.rm = TRUE) > 0)
# [1] 3
另外,您也可以將原始df
轉換為data.table
並使用tstrsplit
函數
df <- read.table(text = "Var1
|12|24|22|1|4
|12|23|22|1|445
|12|22|22|1|4
|101|116
|101|116|116|174
|101|116|125|174
|101|116|150|174
|101|116|156
|101|116|156|174
|101|116|162", header = TRUE)
library(data.table)
DT <- setDT(df)[, tstrsplit(Var1, "|", fixed = TRUE)]
DT[, sum(.SD == 22, na.rm = TRUE)]
# [1] 4
DT[, sum(rowSums(.SD == 22, na.rm = TRUE) > 0)]
# [1] 3
使用正則表達式很容易
sum(grepl("\\|22(\\||$)", df$Var1))
請下次發布可復制的示例。
您可以使用帶有grepl的正則表達式來執行此操作。 使用df作為您的data.frame
length(df[grepl('|22|',df$Var, fixed=T),])
這將回答您的第二個問題,並且可以輕松地適用於問題1。
聲明:本站的技術帖子網頁,遵循CC BY-SA 4.0協議,如果您需要轉載,請注明本站網址或者原文地址。任何問題請咨詢:yoyou2525@163.com.