How to transform observations into columns and represent the number of occurrences of these observations
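This is a subset of my dataset. Patients have adverse events (the variables) with different severity levels (the observations). I would like to create additional variables, one per severity level ("sévère", "grave", "modéré"), that give the number of events of that severity for each patient.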
mydata<-structure(list(record_id = c("2", "4", "5", "9", "10", "11",
"12", "15", "22", "23"), `Dégré Cytolyse hep ` = structure(c(NA,
3L, NA, NA, 1L, NA, 2L, NA, 3L, NA), .Label = c("modéré", "grave",
"sévère"), class = "factor"), `Dégré Trble digest` = structure(c(1L,
NA, NA, NA, NA, 2L, 1L, 1L, NA, 3L), .Label = c("modéré", "grave",
"sévère"), class = "factor"), `Dégré Erupt cutanées` = structure(c(NA_integer_,
NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_,
NA_integer_, NA_integer_, NA_integer_, NA_integer_), .Label = c("modéré",
"grave", "sévère"), class = "factor"), `Dégré Ins renale` = structure(c(NA,
NA, NA, 1L, NA, NA, NA, NA, NA, NA), .Label = c("modéré", "grave",
"sévère"), class = "factor"), `Dégré Neuropath` = structure(c(NA_integer_,
NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_,
NA_integer_, NA_integer_, NA_integer_, NA_integer_), .Label = c("modéré",
"grave", "sévère"), class = "factor"), `Dégré Autre 1` = structure(c(NA,
NA, 1L, NA, NA, 1L, NA, 1L, 3L, NA), .Label = c("modéré", "grave",
"sévère"), class = "factor"), `Dégré Autre 2` = structure(c(NA,
NA, NA, NA, NA, 1L, NA, 1L, NA, NA), .Label = c("modéré", "grave",
"sévère"), class = "factor"), `Dégré Autre 3` = structure(c(NA_integer_,
NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_,
NA_integer_, NA_integer_, NA_integer_, NA_integer_), .Label = c("modéré",
"grave", "sévère"), class = "factor"), `Dégré Autre 4` = structure(c(NA_integer_,
NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_,
NA_integer_, NA_integer_, NA_integer_, NA_integer_), .Label = c("modéré",
"grave", "sévère"), class = "factor"), `Dégré Autre 5` = structure(c(NA_integer_,
NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_,
NA_integer_, NA_integer_, NA_integer_, NA_integer_), .Label = c("modéré",
"grave", "sévère"), class = "factor")), row.names = c(NA, 10L
), class = "data.frame")
The expected dataset would be:
    record_id  Dégré Cytolyse hep  Dégré Trble digest  Dégré Erupt cutanées  Dégré Ins renale  Dégré Neuropath  Dégré Autre 1  Dégré Autre 2  Dégré Autre 3  Dégré Autre 4  Dégré Autre 5  modéré  sévère  grave
1   2          <NA>                modéré              <NA>                  <NA>              <NA>             <NA>           <NA>           <NA>           <NA>           <NA>           1       0       0
2   4          sévère              <NA>                <NA>                  <NA>              <NA>             <NA>           <NA>           <NA>           <NA>           <NA>           0       1       0
3   5          <NA>                <NA>                <NA>                  <NA>              <NA>             modéré         <NA>           <NA>           <NA>           <NA>           1       0       0
4   9          <NA>                <NA>                <NA>                  modéré            <NA>             <NA>           <NA>           <NA>           <NA>           <NA>           1       0       0
5   10         modéré              <NA>                <NA>                  <NA>              <NA>             <NA>           <NA>           <NA>           <NA>           <NA>           1       0       0
6   11         <NA>                grave               <NA>                  <NA>              <NA>             modéré         modéré         <NA>           <NA>           <NA>           2       0       1
7   12         grave               modéré              <NA>                  <NA>              <NA>             <NA>           <NA>           <NA>           <NA>           <NA>           1       0       1
8   15         <NA>                modéré              <NA>                  <NA>              <NA>             modéré         modéré         <NA>           <NA>           <NA>           3       0       0
9   22         sévère              <NA>                <NA>                  <NA>              <NA>             sévère         <NA>           <NA>           <NA>           <NA>           0       2       0
10  23         <NA>                sévère              <NA>                  <NA>              <NA>             <NA>           <NA>           <NA>           <NA>           <NA>           0       1       0
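Here is a tidyverse approach. It assumes that every column to be counted starts with "Dégré", and then uses rowwise() to sum, for each row, the number of those columns that match each severity level.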
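library(tidyverse)

mydata %>%
  rowwise() %>%  # evaluate the mutate() below one row at a time
  mutate(
    # For each row, count the "Dégré" columns equal to each severity level.
    sévère = sum(c_across(starts_with("Dégré")) == "sévère", na.rm = TRUE),
    modéré = sum(c_across(starts_with("Dégré")) == "modéré", na.rm = TRUE),
    grave  = sum(c_across(starts_with("Dégré")) == "grave",  na.rm = TRUE)
  ) %>%
  ungroup()  # drop the row-wise grouping again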
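To see why the comparison is wrapped in sum(..., na.rm = TRUE), here is a minimal base R sketch on a single made-up row. The vector x and its values are hypothetical and stand in for one patient's "Dégré" columns.

```r
# Hypothetical severity values for one patient (not taken from mydata).
lv <- c("modéré", "grave", "sévère")
x <- factor(c("sévère", NA, "modéré", "sévère", NA), levels = lv)

# Comparing a factor with a string gives TRUE/FALSE, and NA where the value is missing.
is_severe <- x == "sévère"

# Dropping the NAs before summing counts only the TRUE entries.
n_severe <- sum(is_severe, na.rm = TRUE)

# Without na.rm = TRUE, a single missing value makes the whole sum NA.
n_severe_with_na <- sum(is_severe)

print(n_severe)
print(n_severe_with_na)
```

Because most cells in the dataset are NA, leaving out na.rm = TRUE would turn nearly every count into NA.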
This is very similar to a question I answered about a week ago. Using apply() and a user-defined function:
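# Helper passed to apply(): counts the values in one row equal to the given level.
count.level = function(x, level) sum(x == level, na.rm = TRUE)

# Severity columns, assumed to be the ones whose names start with "Dégré".
degre.cols = grep("^Dégré", colnames(mydata), value = TRUE)

for (level in c("modéré", "sévère", "grave")) # Iterating over possible severity levels.
{
  # One count per row (MARGIN = 1), stored in a new column named after the level.
  mydata[[level]] = apply(mydata[degre.cols], MARGIN = 1, count.level, level = level)
}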
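The same per-row count can also be written without apply(), using rowSums() on a logical matrix. This is only a sketch of an alternative, not part of the answer above. The data frame toy and its column names "Dégré A" and "Dégré B" are made up for illustration, and the severity columns are again assumed to be identified by the "Dégré" prefix.

```r
# Hypothetical toy data with the same factor levels as in the question.
lv <- c("modéré", "grave", "sévère")
toy <- data.frame(
  record_id = c("a", "b", "c"),
  `Dégré A` = factor(c("modéré", NA, "sévère"), levels = lv),
  `Dégré B` = factor(c("modéré", "grave", NA), levels = lv),
  check.names = FALSE  # keep the spaces in the column names
)

# Select the severity columns by name, so the selection stays valid as columns are added.
degre_cols <- grep("^Dégré", names(toy), value = TRUE)

for (level in lv) {
  # toy[degre_cols] == level is a logical matrix; rowSums() counts the TRUEs per row.
  toy[[level]] <- rowSums(toy[degre_cols] == level, na.rm = TRUE)
}

print(toy[lv])
```

Selecting the columns by name rather than by a logical mask matters here: the loop adds columns to the data frame, and a fixed-length logical index would no longer line up with them.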