How to join certain entries together to form a new column in R
I have a tibble, df:
df <- structure(list(prob_blue = c(0.929572356778338, 0.0260967458595827,
0.941634205740072, 0.000908530458014964, 0, 0.0322897338624395,
0.96947026747672, 0.0549822742699063, 0.39632563113532, 1.49342246697533e-05
), prob_red = c(0.0289283895123213, 0.125496787021455, 0.0294092713166607,
0.000337896513434257, 1, 0.945123549045104, 0.0189977638740104,
0.00632470440415813, 0.505560271745452, 0.999781439802145), prob_green = c(0.0414992537093407,
0.848406467118963, 0.0289565229432678, 0.998753573028551, 0,
0.0225867170924565, 0.0115319686492698, 0.938693021325936, 0.0981140971192273,
0.000203625973185612), predicted_colour = c("blue", "green",
"blue", "green", "red", "red", "blue", "green", "red", "red"),
actual_colour = c("green", "green", "blue", "green", "red",
"blue", "blue", "green", "green", "red")), row.names = c(NA,
-10L), class = c("tbl_df", "tbl", "data.frame"))
# A tibble: 10 x 5
prob_blue prob_red prob_green predicted_colour actual_colour
<dbl> <dbl> <dbl> <chr> <chr>
1 0.930 0.0289 0.0415 blue green
2 0.0261 0.125 0.848 green green
3 0.942 0.0294 0.0290 blue blue
4 0.000909 0.000338 0.999 green green
5 0 1 0 red red
6 0.0323 0.945 0.0226 red blue
7 0.969 0.0190 0.0115 blue blue
8 0.0550 0.00632 0.939 green green
9 0.396 0.506 0.0981 red green
10 0.0000149 1.00 0.000204 red red
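For each row, I want to take the value from the prob_* column that matches the actual_colour entry, and build a new data frame from the results. For example, if actual_colour is green, take the prob_green value and put it in a new prob column. Essentially, I want to create this: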
colours
# A tibble: 10 x 3
actual_colour prob predicted_colour
<chr> <dbl> <chr>
1 red 1 red
2 red 1.00 red
3 green 0.0415 blue
4 green 0.848 green
5 green 0.999 green
6 green 0.939 green
7 green 0.0981 red
8 blue 0.942 blue
9 blue 0.0323 red
10 blue 0.969 blue
Currently I am doing this:
blue <- df %>%
filter(actual_colour == "blue") %>%
select(actual_colour, prob = prob_blue, predicted_colour)
green <- df %>%
filter(actual_colour == "green") %>%
select(actual_colour, prob = prob_green, predicted_colour)
red <- df %>%
filter(actual_colour == "red") %>%
select(actual_colour, prob = prob_red, predicted_colour)
colours <- rbind(red, green, blue)
Is there a simpler way to do this?
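The main trick with pivot_longer is the names_prefix argument: it strips "prob_" from the column names, so the resulting name column holds bare colour names ("blue", "red", "green") that can be compared directly with actual_colour. If you want to get rid of the name column, just add select(-name) to the end of the chain.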
library(tidyverse)
df %>%
pivot_longer(cols = starts_with("prob"),
names_prefix = "prob_",
values_to = "prob") %>%
filter(actual_colour == name)
# predicted_colour actual_colour name prob
# <chr> <chr> <chr> <dbl>
# 1 blue green green 0.0415
# 2 green green green 0.848
# 3 blue blue blue 0.942
# 4 green green green 0.999
# 5 red red red 1
# 6 red blue blue 0.0323
# 7 blue blue blue 0.969
# 8 green green green 0.939
# 9 red green green 0.0981
#10 red red red 1.00
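To get exactly the three columns from the question, the chain can end with a select() call. The following is a minimal sketch under stated assumptions: toy is a small made-up tibble (not the df from the question), and the column order passed to select() is copied from the desired output shown there.

```r
library(dplyr)
library(tidyr)

# Toy data, invented for illustration only (assumption: same column layout as df)
toy <- tibble(
  prob_blue        = c(0.9, 0.1, 0.2),
  prob_red         = c(0.05, 0.2, 0.7),
  prob_green       = c(0.05, 0.7, 0.1),
  predicted_colour = c("blue", "green", "red"),
  actual_colour    = c("green", "green", "red")
)

colours <- toy %>%
  pivot_longer(cols = starts_with("prob_"),
               names_prefix = "prob_",   # "prob_green" becomes "green"
               values_to = "prob") %>%   # names go to the default "name" column
  filter(actual_colour == name) %>%      # keep only the row for the actual colour
  select(actual_colour, prob, predicted_colour)

colours
```

Rows come back in their original order rather than grouped by colour as rbind(red, green, blue) would give; adding arrange(actual_colour) to the chain sorts them alphabetically by colour if grouping matters.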