In R, how to search for a particular pattern in a table of strings
V1 V2 V3 V4 V5 V6 V7 V8 V9 V10 V11 V12 V13 V14 V15 V16 V17
1 1 1 round-0 10523 180 yellow NA NA NA NA NA NA NA NA
2 11973 1 round-1 19478 150 yellow NA NA NA NA NA NA NA NA
3 22428 1 round-2 28928 130 yellow 29928 150 brake 31433 160 red NA NA NA NA
4 39333 1 round-3 47333 160 yellow 48588 185 red NA NA NA NA NA NA
5 49788 1 round-4 56288 130 yellow 58038 165 brake 59038 175 red NA NA NA NA
6 64688 1 round-5 71693 140 yellow 74293 192 red 74393 194 crash NA NA NA NA
7 85148 1 round-6 91648 130 yellow 94648 190 red NA NA NA NA NA NA
8 95598 1 round-7 103653 130 yellow 104903 155 brake 105403 165 red NA NA NA NA
9 112703 1 round-8 122758 130 yellow 125758 190 red 125758 190 crash NA NA NA NA
10 136513 1 round-9 146563 130 yellow 147963 158 brake 148063 160 red NA NA NA NA
11 157118 1 round-10 164618 150 yellow 167118 200 red NA NA NA NA NA NA
12 167568 1 round-11 179123 120 yellow 182023 178 red 182373 185 brake 182623 190 crash NA NA
13 193378 1 round-12 200378 140 yellow 201878 170 red 203278 198 crash NA NA NA NA
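So I have a table in a CSV file, imported with read.table, that looks like the one above. As you can see, the rows are not all the same length. The data come in triples: V3, V6 and so on store strings, and each string corresponds to the numbers stored in the two cells before it.
What has been bugging me is that I can't figure out how to count the occurrences of a pattern in the table. My task is to count the number of times a crash appears right after a brake. I know I could use something like nested ifelse, as I did when extracting the number stored two cells before "brake":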
df$brake <- ifelse(df$V9 == "brake", df$V7,
            ifelse(df$V12 == "brake", df$V10,
            ifelse(df$V15 == "brake", df$V13,
            ifelse(df$V18 == "brake", df$V16,
            ifelse(df$V21 == "brake", df$V19,
            ifelse(df$V24 == "brake", df$V22,
            ifelse(df$V27 == "brake", df$V25, NA)))))))
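but I have 35 columns, and I am wondering whether there is a more streamlined way to do this. Any help would be much appreciated :)
Edit: Sorry for being unclear. The main problem I have is how to count how often a crash happens right after a brake. (I included the code for extracting the value two cells before "brake" because I thought the two solutions would be similar: one checks the label three cells after any brake for a crash, the other extracts the value two cells before the brake. Sorry for the confusion.)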
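Based on the code shown, I assume the "brake" columns are 9, 12, 15, and so on. So we create a numeric index ("indx") to extract those columns, and a logical matrix ("indx1") that marks where those columns equal "brake". Then we build a row index (1:nrow(df)) and a column index (max.col(indx1, 'first')), cbind them, and use the result to extract the matching elements from columns 7, 10, 13, etc. Finally, we change to NA the elements in rows where rowSums of "indx1" is 0, that is, rows with no "brake" at all.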
indx <- seq(9,ncol(df), by=3)
indx1 <- df[indx]=='brake'
df$brake <- df[indx-2][cbind(1:nrow(df), max.col(indx1, 'first'))]*
NA^!rowSums(indx1)
df$brake
#[1] NA NA 29928 NA 58038 NA NA 104903 NA 147963
#[11] NA 182373 NA
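To see what each piece of that expression does, here is a minimal sketch on made-up values. The names vals and flag are hypothetical stand-ins for df[indx-2] and indx1; they are not part of the original data.

```r
# Hypothetical stand-ins: 'vals' plays the role of df[indx-2],
# 'flag' plays the role of indx1 (TRUE where the label is "brake").
vals <- matrix(c(10, 20, 30,
                 40, 50, 60), nrow = 2, byrow = TRUE)
flag <- matrix(c(FALSE, TRUE,  FALSE,
                 FALSE, FALSE, FALSE), nrow = 2, byrow = TRUE)

# Column of the first TRUE in each row. A row with no TRUE at all
# is an all-way tie, so 'first' returns column 1 for it.
first_true <- max.col(flag, "first")        # 2 1

# A two-column matrix (row, column) picks one element per row.
picked <- vals[cbind(1:2, first_true)]      # 20 40

# NA^0 is 1 and NA^1 is NA, so rows without any TRUE become NA.
result <- picked * NA^!rowSums(flag)        # 20 NA
result
```

The last step matters because max.col still returns a column for rows that have no match, so those rows have to be blanked out separately.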
df <- structure(list(V1 = c(1L, 11973L, 22428L, 39333L, 49788L, 64688L,
85148L, 95598L, 112703L, 136513L, 157118L, 167568L, 193378L),
V2 = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L
), V3 = c("round-0", "round-1", "round-2", "round-3", "round-4",
"round-5", "round-6", "round-7", "round-8", "round-9", "round-10",
"round-11", "round-12"), V4 = c(10523L, 19478L, 28928L, 47333L,
56288L, 71693L, 91648L, 103653L, 122758L, 146563L, 164618L,
179123L, 200378L), V5 = c(180L, 150L, 130L, 160L, 130L, 140L,
130L, 130L, 130L, 130L, 150L, 120L, 140L), V6 = c("yellow",
"yellow", "yellow", "yellow", "yellow", "yellow", "yellow",
"yellow", "yellow", "yellow", "yellow", "yellow", "yellow"
), V7 = c(NA, NA, 29928L, 48588L, 58038L, 74293L, 94648L,
104903L, 125758L, 147963L, 167118L, 182023L, 201878L), V8 = c(NA,
NA, 150L, 185L, 165L, 192L, 190L, 155L, 190L, 158L, 200L,
178L, 170L), V9 = c("", "", "brake", "red", "brake", "red",
"red", "brake", "red", "brake", "red", "red", "red"), V10 = c(NA,
NA, 31433L, NA, 59038L, 74393L, NA, 105403L, 125758L, 148063L,
NA, 182373L, 203278L), V11 = c(NA, NA, 160L, NA, 175L, 194L,
NA, 165L, 190L, 160L, NA, 185L, 198L), V12 = c("", "", "red",
"", "red", "crash", "", "red", "crash", "red", "", "brake",
"crash"), V13 = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, 182623L, NA), V14 = c(NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, 190L, NA), V15 = c("", "", "", "", "", "", "",
"", "", "", "", "crash", ""), V16 = c(NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA), V17 = c(NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA)), .Names = c("V1", "V2",
"V3", "V4", "V5", "V6", "V7", "V8", "V9", "V10", "V11", "V12",
"V13", "V14", "V15", "V16", "V17"), class = "data.frame", row.names = c("1",
"2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13"
))
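For the count asked about in the edit (a crash immediately after a brake), one possible approach, which is not part of the code above, is to compare each label column with the label column to its right. The sketch below uses a small made-up label matrix; the names labels, is_brake, crash_follows and hits are assumptions for illustration.

```r
# Made-up stand-in for the label columns (V9, V12, V15, ...).
labels <- cbind(
  V9  = c("brake", "red",   "red",   "brake"),
  V12 = c("red",   "crash", "brake", "crash"),
  V15 = c("",      "",      "crash", "")
)

# All label columns except the last, tested for "brake" ...
is_brake      <- labels[, -ncol(labels), drop = FALSE] == "brake"
# ... lined up against all label columns except the first, tested for "crash".
crash_follows <- labels[, -1, drop = FALSE] == "crash"

# TRUE wherever a "brake" is immediately followed by a "crash".
hits <- is_brake & crash_follows

# na.rm = TRUE treats missing labels as "no match".
rows_with_hit <- rowSums(hits, na.rm = TRUE) > 0   # FALSE FALSE TRUE TRUE
n_brake_crash <- sum(hits, na.rm = TRUE)           # 2
n_brake_crash
```

With the df above, labels could instead be taken as as.matrix(df[seq(9, ncol(df), by = 3)]); in that data only row 12 has "brake" directly followed by "crash".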