
How to extract particular values from a factor stored as a column value in a data frame

I have a data frame with the following columns:

df <- A B C 
    heart_rate  ['53.0', '1']   94
    heart_rate  ['54.0', '2']   1
    heart_rate  ['54.0', '1']   9
    heart_rate  ['55.0', '0']   1
    heart_rate  ['55.0', '1']   7

How can I build a data frame df1 that keeps only the first value of B[x, y], for the rows where y = 1? In other words:

Desired output data frame df1:

  df1 

    A B C  
    heart_rate  53.0    94
    heart_rate  54.0    9
    heart_rate  55.0    7


    structure(list(source = structure(c(1L, 1L, 1L), .Label = "heart_rate", class = "factor"), 
        values = structure(3:1, .Label = c("['171.0', '1']", "['172.0', '1']", 
        "['173.0', '0']"), class = "factor"), timediff = c(6L, 7L, 
        10L)), class = "data.frame", row.names = c(NA, -3L))

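Building on the answer from the previous post, we can use extract to split the data into separate columns B and D, then use filter to keep the rows where D == 1: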

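library(dplyr)  # provides the %>% pipe

# Capture the decimal value into B and the final digit into D
tidyr::extract(df, B, into = c('B', 'D'), "(\\d+\\.\\d+).*(\\d)") %>%
  dplyr::filter(D == 1) %>%
  dplyr::select(-D)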

#           A    B  C
#1 heart_rate 53.0 94
#2 heart_rate 54.0  9
#3 heart_rate 55.0  7

Data

df <- structure(list(A = structure(c(1L, 1L, 1L, 1L, 1L), 
.Label = "heart_rate", class = "factor"), 
B = structure(c(1L, 3L, 2L, 4L, 5L), .Label = c("[53.0, 1]", 
"[54.0, 1]", "[54.0, 2]", "[55.0, 0]", "[55.0, 1]"), class = "factor"), 
C = c(94L, 1L, 9L, 1L, 7L)), class = "data.frame", row.names = c(NA, -5L))
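The data above stores B without quotes ("[53.0, 1]"), while the question shows quoted values ("['53.0', '1']"). As a minimal base R sketch for the quoted form, one could strip every character that is not a digit, dot, or comma and then split on the comma. The names df_q, cleaned, parts, and keep are made up for illustration, and the sample values are copied from the question's printout:

```r
# Hypothetical sample in the quoted format shown in the question
df_q <- data.frame(
  A = "heart_rate",
  B = c("['53.0', '1']", "['54.0', '2']", "['54.0', '1']",
        "['55.0', '0']", "['55.0', '1']"),
  C = c(94L, 1L, 9L, 1L, 7L),
  stringsAsFactors = TRUE
)

# Drop brackets, quotes and spaces, leaving strings such as "53.0,1"
cleaned <- gsub("[^0-9.,]", "", as.character(df_q$B))

# Split on the comma into a two-column character matrix: value, flag
parts <- do.call(rbind, strsplit(cleaned, ",", fixed = TRUE))

# Keep rows whose second element is 1; B becomes the numeric first element
keep <- as.integer(parts[, 2]) == 1L
df1 <- df_q[keep, ]
df1$B <- as.numeric(parts[keep, 1])
rownames(df1) <- NULL
df1
```

This assumes every B value holds exactly two comma-separated elements; rows with a different shape would need extra handling.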

We can do this with parse_number after a filter:

library(dplyr)
library(stringr)
df %>%
    filter(str_detect(B, "1\\]")) %>% 
    mutate(B = readr::parse_number(as.character(B)))
#           A  B  C
#1 heart_rate 53 94
#2 heart_rate 54  9
#3 heart_rate 55  7
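The filter above keys on the pattern "1\\]", which fits the sample data but would also match a second element such as 11, and would miss quoted values like '1'. A stricter pattern is sketched below with base grepl; the vector b, including the 11 case, is invented purely to show the difference:

```r
# Hypothetical values: plain, non-matching, multi-digit, and quoted
b <- c("[53.0, 1]", "[54.0, 2]", "[54.0, 11]", "['55.0', '1']")

# Loose pattern: any 1 directly before the closing bracket
loose <- grepl("1\\]", b)
loose
# [1]  TRUE FALSE  TRUE FALSE

# Stricter pattern: the element after the comma is exactly 1, optionally quoted
strict <- grepl(", *'?1'?\\]$", b)
strict
# [1]  TRUE FALSE FALSE  TRUE
```

The stricter pattern can be passed to str_detect in the same way, since it uses only basic regular expression syntax.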

Another option is base R:

transform(subset(cbind(df, read.csv(text = gsub("[][ ]", "", 
      df$B), header = FALSE)), V2 == 1), B = V1)[names(df)]
#           A  B  C
#1 heart_rate 53 94
#3 heart_rate 54  9
#5 heart_rate 55  7
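The nested one-liner above can be hard to follow, so here is the same idea unpacked into steps. This is only a sketch: the intermediate names csv_rows, parsed, keep, and out are made up, and df is rebuilt inline so the block runs on its own:

```r
# Same values as the answer's data, rebuilt here so the block is self-contained
df <- data.frame(
  A = factor(rep("heart_rate", 5)),
  B = factor(c("[53.0, 1]", "[54.0, 2]", "[54.0, 1]", "[55.0, 0]", "[55.0, 1]")),
  C = c(94L, 1L, 9L, 1L, 7L)
)

# Step 1: remove brackets and spaces so each value reads like a CSV row ("53.0,1")
csv_rows <- gsub("[][ ]", "", df$B)

# Step 2: parse those rows; read.csv names the columns V1 and V2
parsed <- read.csv(text = csv_rows, header = FALSE)

# Step 3: keep rows whose second element is 1 and overwrite B with the first element
keep <- parsed$V2 == 1
out <- df[keep, ]
out$B <- parsed$V1[keep]
out
```

As in the one-liner, the original row names (1, 3, 5) are kept, which is why the printed output is numbered that way.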

