How to separate data from one row into two rows in R?
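For rows that have the value "Separate" in the column "Submit.and.module", I want to insert a row directly above them and move the data from certain columns into that new row.
Specifically, I want to move the data in the columns named "Submit.help" and "Strategy" into the new row above.
Right now, my data looks like this:
I want the data to look like this:
How can I do this?
Thought I'd join the party with another solution.
Data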
data <- data.frame(Which.mod = c("TMH", "TMH-C", "TMH", "FC", "FC"),
Mod.time = c(1.43, 2.31, 0.67, 2.35, 8.22),
Submit.help = c(NA, "Help", NA, NA, "Submit"),
Strategy = c(NA, "Ratio", NA, NA, "Count"),
Submit.and.module = c(NA, "Separate", NA, NA, "Separate"))
Base R
Step 1: Create an ID column and a new data frame that is the subset of the rows to be separated.
data$id <- 1:nrow(data)
data1 <- subset(data, !is.na(Submit.and.module))
Step 2: Set the appropriate columns to NA in each data frame.
data[, c("Submit.help", "Strategy")] <- NA
data1[, c("Which.mod", "Mod.time", "Submit.and.module")] <- NA
Step 3: Bind the data frames and order the rows by id.
final <- rbind(data1, data)
final.ordered <- final[order(final$id), ]
# Which.mod Mod.time Submit.help Strategy Submit.and.module id
# 1 TMH 1.43 <NA> <NA> <NA> 1
# 2 <NA> NA Help Ratio <NA> 2
# 21 TMH-C 2.31 <NA> <NA> Separate 2
# 3 TMH 0.67 <NA> <NA> <NA> 3
# 4 FC 2.35 <NA> <NA> <NA> 4
# 5 <NA> NA Submit Count <NA> 5
# 51 FC 8.22 <NA> <NA> Separate 5
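The three steps above can be wrapped into a small reusable function. This is only a sketch: the name `split_rows`, its arguments, and the data frame `example_df` are made up for illustration and are not part of the original answer. It relies on `order()` being stable, so each new row (bound first) sorts ahead of the row it was split from.

```r
# Hypothetical helper (not from the original answer): for every row where
# `flag_col` equals `flag_value`, insert a row above it that holds only the
# values of `move_cols`, and blank those columns in the original row.
split_rows <- function(df, flag_col, flag_value, move_cols) {
  flagged <- df[[flag_col]] %in% flag_value   # %in% treats NA as FALSE
  if (!any(flagged)) return(df)               # nothing to split
  df$id <- seq_len(nrow(df))
  # New rows: keep only id and the moved columns
  upper <- df[flagged, , drop = FALSE]
  upper[, setdiff(names(df), c("id", move_cols))] <- NA
  # Original rows: blank the moved columns on flagged rows only
  df[flagged, move_cols] <- NA
  out <- rbind(upper, df)
  out <- out[order(out$id), ]                 # stable: new row precedes its original
  rownames(out) <- NULL
  out$id <- NULL
  out
}

example_df <- data.frame(
  Which.mod = c("TMH", "TMH-C", "TMH", "FC", "FC"),
  Mod.time = c(1.43, 2.31, 0.67, 2.35, 8.22),
  Submit.help = c(NA, "Help", NA, NA, "Submit"),
  Strategy = c(NA, "Ratio", NA, NA, "Count"),
  Submit.and.module = c(NA, "Separate", NA, NA, "Separate"),
  stringsAsFactors = FALSE
)

result <- split_rows(example_df, "Submit.and.module", "Separate",
                     c("Submit.help", "Strategy"))
```

Unlike the step-by-step version, this sketch blanks the moved columns only on the flagged rows, so values of "Submit.help" or "Strategy" on other rows, if there were any, would be left alone.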
Tidyverse
Nice and easy to follow. Same steps as above, but chained as much as possible.
library(tidyverse)
dat1 <- data %>% mutate(id = 1:n(), Submit.help = NA, Strategy = NA)
dat2 <- data %>% mutate(id = 1:n()) %>%
filter(!is.na(Submit.and.module)) %>%
mutate(Which.mod = NA, Mod.time = NA, Submit.and.module = NA)
final <- rbind(dat2, dat1) %>% arrange(id)
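The idea is to cut the columns to be moved out into a separate data table, append it to the end of the original data, and use a helper column named "sort" to put the rows in the right order: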
library(data.table)
data <- data.table(Which.mod = c("TMH", "TMH-C", "TMH", "FC", "FC"),
Mod.time = c(1.43, 2.31, 0.67, 2.35, 8.22),
Submit.help = c(NA, "Help", NA, NA, "Submit"),
Strategy = c(NA, "Ratio", NA, NA, "Count"),
Submit.and.module = c(NA, "Separate", NA, NA, "Separate"))
data[, sort := (1:.N) * 10] # add a row sorting value with a gap to fill in inserted rows
# cut columns to be duplicated and "insert" at the end
res <- rbind(data,
data[Submit.and.module == "Separate", .(sort, Submit.help, Strategy)] [, sort := sort - 1],
use.names = TRUE,
fill = TRUE)
# Purge content of moved columns (credits go to @ismiregal - I forgot this initially)
res[Submit.and.module %in% "Separate", c("Submit.help", "Strategy") := NA]
# sort the result accordingly
res <- res[order(sort),]
Result:
res
Which.mod Mod.time Submit.help Strategy Submit.and.module sort
1: TMH 1.43 <NA> <NA> <NA> 10
2: <NA> NA Help Ratio <NA> 19
3: TMH-C 2.31 <NA> <NA> Separate 20
4: TMH 0.67 <NA> <NA> <NA> 30
5: FC 2.35 <NA> <NA> <NA> 40
6: <NA> NA Submit Count <NA> 49
7: FC 8.22 <NA> <NA> Separate 50
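The gapped sort key is the core of this approach, and it can be illustrated on its own in base R. The snippet below is a minimal sketch with made-up vector names (`keys`, `flag`, `all_keys`); it only shows how keys spaced by 10 leave room for an inserted row at "key - 1".

```r
# Five original rows get keys 10, 20, ..., 50
keys <- seq_len(5) * 10
# Rows 2 and 5 are the ones flagged "Separate"
flag <- c(FALSE, TRUE, FALSE, FALSE, TRUE)
# Inserted rows are appended at the end with key - 1 (19 and 49)
all_keys <- c(keys, keys[flag] - 1)
# Ordering by key puts each appended row directly above its original
row_order <- order(all_keys)
row_order
# [1] 1 6 2 3 4 7 5
```

Any gap larger than the offset works; the factor of 10 just keeps the keys easy to read.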
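The code is ugly, but if you need it I'll try to explain what it does (if I remember in the morning). Surely there is another, more elegant way to do this. But just to celebrate diversity...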
dat <- data.frame(
Wich.mod = c("TMH", "TMH-C", "TMH", "FC", "FC"),
Mod.time = c(1.43, 2.31, 0.67, 2.35, 8.22),
Submit.help = c(NA, "Help", NA, NA, "Submit"),
Strategy = c(NA, "Ratio", NA, NA, "Count"),
Submit.and.module = c(NA, "Separate", NA, NA, "Separate"),
stringsAsFactors = FALSE
)
## create new data.frame, filled with NA. The same cols, but extra N("Separate") rows
newdata <- data.frame(matrix(NA, nrow(dat) + sum(grepl("Separate", dat[, 5])), ncol(dat)))
## insert data from dat, leaving empty spaces before "Separate"
newdata[1:nrow(dat) + cumsum(grepl("Separate", dat[, 5])), ] <- dat[1:nrow(dat),]
## give newdata column names from old data
colnames(newdata) <- colnames(dat)
## move Submit.help and Strategy related to "Separate" a row up
newdata[
which(newdata[, 5] == "Separate") - 1, 3:4
] <- newdata[which(newdata[, 5] == "Separate"), 3:4]
## for variables above, replace old values related to "Separate" with NA
newdata[which(newdata[, 5] == "Separate"), 3:4] <- NA
# Wich.mod Mod.time Submit.help Strategy Submit.and.module
# 1 TMH 1.43 NA NA NA
# 2 NA NA Help Ratio NA
# 3 TMH-C 2.31 NA NA Separate
# 4 TMH 0.67 NA NA NA
# 5 FC 2.35 NA NA NA
# 6 NA NA Submit Count NA
# 7 FC 8.22 NA NA Separate
Here is another data.table solution (looks like I was too slow):
library(data.table)
DT <- data.table(stringsAsFactors=FALSE,
Index = seq(5),
Which.mod = c("TMH", "TMH-C", "TMH", "FC", "FC"),
Mod.time = c(1.43, 2.31, 0.67, 2.35, 8.22),
Submit.help = c(NA, "Help", NA, NA, "Submit"),
Strategy = c(NA, "Ratio", NA, NA, "Count"),
Submit.and.module = c(NA, "Separate", NA, NA, "Separate")
)
DT <- rbindlist(list(DT, DT[Submit.and.module %in% "Separate", c("Index", "Submit.help", "Strategy")]), use.names=TRUE, fill=TRUE)
DT[Submit.and.module %in% "Separate", c("Submit.help", "Strategy") := NA]
setorder(DT, Index, Mod.time, na.last=FALSE)
print(DT)
Here is a tidyverse solution:
library(tidyverse)
nms <- names(df1)
df1 %>%
rowid_to_column %>%
gather(,,-rowid) %>%
filter(!is.na(value)) %>%
mutate(tmp = key %in% c("Which.mod","Mod.time","Submit.and.module")) %>%
spread(key,value) %>%
select_at(nms)
# Which.mod Mod.time Submit.help Strategy Submit.and.module
# 1 TMH 1.43 <NA> <NA> <NA>
# 2 <NA> <NA> Help Ratio <NA>
# 3 TMH-C 2.31 <NA> <NA> Separate
# 4 TMH 0.67 <NA> <NA> <NA>
# 5 FC 2.35 <NA> <NA> <NA>
# 6 <NA> <NA> Submit Count <NA>
# 7 FC 8.22 <NA> <NA> Separate
Data
df1 <- data.frame(
Which.mod = c("TMH", "TMH-C", "TMH", "FC", "FC"),
Mod.time = c(1.43, 2.31, 0.67, 2.35, 8.22),
Submit.help = c(NA, "Help", NA, NA, "Submit"),
Strategy = c(NA, "Ratio", NA, NA, "Count"),
Submit.and.module = c(NA, "Separate", NA, NA, "Separate"),
stringsAsFactors = FALSE
)
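In current tidyr, `gather()` and `spread()` are superseded by `pivot_longer()` and `pivot_wider()`. The following is a sketch of the same reshaping idea with the newer functions, not the original answerer's code. It assumes dplyr and tidyr are installed; the names `example_df`, `keep_cols` and `res` are made up for illustration. Because the value column mixes numeric and character data, the values are converted to character first and "Mod.time" is converted back at the end.

```r
library(dplyr)
library(tidyr)

example_df <- data.frame(
  Which.mod = c("TMH", "TMH-C", "TMH", "FC", "FC"),
  Mod.time = c(1.43, 2.31, 0.67, 2.35, 8.22),
  Submit.help = c(NA, "Help", NA, NA, "Submit"),
  Strategy = c(NA, "Ratio", NA, NA, "Count"),
  Submit.and.module = c(NA, "Separate", NA, NA, "Separate"),
  stringsAsFactors = FALSE
)

# Columns that stay on the original row; everything else moves up
keep_cols <- c("Which.mod", "Mod.time", "Submit.and.module")

res <- example_df %>%
  mutate(rowid = row_number()) %>%
  # Long format; coerce all values to character so they can share a column
  pivot_longer(-rowid, names_to = "key", values_to = "value",
               values_transform = list(value = as.character),
               values_drop_na = TRUE) %>%
  # tmp is FALSE for moved columns, so they end up on a row of their own
  mutate(tmp = key %in% keep_cols) %>%
  pivot_wider(names_from = key, values_from = value) %>%
  # FALSE sorts before TRUE: the moved row lands above its original row
  arrange(rowid, tmp) %>%
  select(all_of(names(example_df))) %>%
  mutate(Mod.time = as.numeric(Mod.time))
```

The explicit `arrange(rowid, tmp)` is there because this sketch does not rely on `pivot_wider()` returning rows in sorted order.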