How to join certain entries together to form a new column in R

I have a tibble dataframe, df:

df <- structure(list(prob_blue = c(0.929572356778338, 0.0260967458595827, 
0.941634205740072, 0.000908530458014964, 0, 0.0322897338624395, 
0.96947026747672, 0.0549822742699063, 0.39632563113532, 1.49342246697533e-05
), prob_red = c(0.0289283895123213, 0.125496787021455, 0.0294092713166607, 
0.000337896513434257, 1, 0.945123549045104, 0.0189977638740104, 
0.00632470440415813, 0.505560271745452, 0.999781439802145), prob_green = c(0.0414992537093407, 
0.848406467118963, 0.0289565229432678, 0.998753573028551, 0, 
0.0225867170924565, 0.0115319686492698, 0.938693021325936, 0.0981140971192273, 
0.000203625973185612), predicted_colour = c("blue", "green", 
"blue", "green", "red", "red", "blue", "green", "red", "red"), 
    actual_colour = c("green", "green", "blue", "green", "red", 
    "blue", "blue", "green", "green", "red")), row.names = c(NA, 
-10L), class = c("tbl_df", "tbl", "data.frame"))

# A tibble: 10 x 5
   prob_blue prob_red prob_green predicted_colour actual_colour
       <dbl>    <dbl>      <dbl> <chr>            <chr>        
 1 0.930     0.0289     0.0415   blue             green        
 2 0.0261    0.125      0.848    green            green        
 3 0.942     0.0294     0.0290   blue             blue         
 4 0.000909  0.000338   0.999    green            green        
 5 0         1          0        red              red          
 6 0.0323    0.945      0.0226   red              blue         
 7 0.969     0.0190     0.0115   blue             blue         
 8 0.0550    0.00632    0.939    green            green        
 9 0.396     0.506      0.0981   red              green        
10 0.0000149 1.00       0.000204 red              red 

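For each row, I want to take the value from the prob_* column that matches that row's actual_colour, and build a new dataframe from the result. For example, if actual_colour is green, take the prob_green value and put it in a new prob column. Essentially, I want to create this: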

colours
# A tibble: 10 x 3
   actual_colour   prob predicted_colour
   <chr>          <dbl> <chr>           
 1 red           1      red             
 2 red           1.00   red             
 3 green         0.0415 blue            
 4 green         0.848  green           
 5 green         0.999  green           
 6 green         0.939  green           
 7 green         0.0981 red             
 8 blue          0.942  blue            
 9 blue          0.0323 red             
10 blue          0.969  blue

Currently I am doing this:

library(dplyr)

blue <- df %>% 
    filter(actual_colour == "blue") %>%
    select(actual_colour, prob = prob_blue, predicted_colour)

green <- df %>% 
    filter(actual_colour == "green") %>%
    select(actual_colour, prob = prob_green, predicted_colour)

red <- df %>% 
    filter(actual_colour == "red") %>%
    select(actual_colour, prob = prob_red, predicted_colour)

colours <- rbind(red, green, blue)

Is there a simpler way to do this?

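The main trick with pivot_longer is the names_prefix argument: it strips the "prob_" prefix from the column names, so the resulting name column holds plain colour names that can be compared directly with actual_colour. If you want to get rid of the name column, just add select(-name) to the end of the chain.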

library(tidyverse)

df %>%
    pivot_longer(cols = starts_with("prob"),
                 names_prefix = "prob_",
                 values_to = "prob") %>%
    filter(actual_colour == name)

 #   predicted_colour actual_colour name    prob
 #   <chr>            <chr>         <chr>  <dbl>
 # 1 blue             green         green 0.0415
 # 2 green            green         green 0.848 
 # 3 blue             blue          blue  0.942 
 # 4 green            green         green 0.999 
 # 5 red              red           red   1     
 # 6 red              blue          blue  0.0323
 # 7 blue             blue          blue  0.969 
 # 8 green            green         green 0.939 
 # 9 red              green         green 0.0981
 #10 red              red           red   1.00  
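
To reproduce the three-column layout asked for in the question, the chain can be extended with a select() and, optionally, an arrange(). The following is a minimal sketch, not part of the original answer. It assumes a small tibble, df_small (a hypothetical name), built from rows 1, 2, 3, 5 and 6 of df with the values rounded as they appear in the printed tibble. The ordering by red, green, blue is an assumption made only to mimic the rbind(red, green, blue) call in the question.

```r
library(dplyr)
library(tidyr)

# Hypothetical subset of df from the question (rows 1, 2, 3, 5, 6),
# using the rounded values shown in the printed tibble.
df_small <- tibble(
  prob_blue        = c(0.930, 0.0261, 0.942, 0, 0.0323),
  prob_red         = c(0.0289, 0.125, 0.0294, 1, 0.945),
  prob_green       = c(0.0415, 0.848, 0.0290, 0, 0.0226),
  predicted_colour = c("blue", "green", "blue", "red", "red"),
  actual_colour    = c("green", "green", "blue", "red", "blue")
)

colours <- df_small %>%
  # One row per (original row, colour); the colour lands in the default `name` column
  pivot_longer(cols = starts_with("prob_"),
               names_prefix = "prob_",
               values_to = "prob") %>%
  # Keep only the probability that belongs to the actual colour
  filter(actual_colour == name) %>%
  # Drop `name` and put the columns in the order requested in the question
  select(actual_colour, prob, predicted_colour) %>%
  # Assumed ordering: red, then green, then blue, as in rbind(red, green, blue)
  arrange(factor(actual_colour, levels = c("red", "green", "blue")))

print(colours)
```

Running the same chain on the full df should give one row per original row, since each row has exactly one prob_* column matching its actual_colour.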
