How can I use purrr to match records from a lookup table?

I have this dataset:

library(dplyr)
data_frame(Q1= c('AL', NA, 'TX', 'FL'), Q2=c('MN', 'CO', NA, NA), value=c(10,24,12,54)) 
# A tibble: 4 x 3
     Q1    Q2 value
  <chr> <chr> <dbl>
1    AL    MN    10
2  <NA>    CO    24
3    TX  <NA>    12
4    FL  <NA>    54

I am trying to use purrr to convert the values in Q1 and Q2 into full state names using a lookup table:

lktState <- data_frame(abb=state.abb, name=state.name)

So far I've tried this, but it doesn't work:

data_frame(Q1= c('AL', NA, 'TX', 'FL'), Q2=c('MN', 'CO', NA, NA), value=c(10,24,12,54)) %>% 
  mutate_at(vars('Q1','Q2'), purrr::map(.x = ., lktState$name[match(.x, lktState$abb)]))

Error in match(.x, lktState$abb) : object '.x' not found

Base R version (which can be vectorized, but this illustrates the concept):

xdf <- data.frame(
  Q1= c('AL', NA, 'TX', 'FL'),
  Q2 = c('MN', 'CO', NA, NA),
  value = c(10, 24, 12, 54),
  stringsAsFactors=FALSE
)

xdf
##     Q1   Q2 value
## 1   AL   MN    10
## 2 <NA>   CO    24
## 3   TX <NA>    12
## 4   FL <NA>    54
lktState <- setNames(state.name, state.abb)

xdf$Q1 <- lktState[xdf$Q1]
xdf$Q2 <- lktState[xdf$Q2]

xdf
##        Q1        Q2 value
## 1 Alabama Minnesota    10
## 2    <NA>  Colorado    24
## 3   Texas      <NA>    12
## 4 Florida      <NA>    54
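
As a rough sketch of the same base R idea applied to both columns in one step, the lookup can be run over a vector of column names with lapply. The helper variable cols is introduced here for illustration only and is not part of the original answer.

```r
# Same example data as above
xdf <- data.frame(
  Q1 = c("AL", NA, "TX", "FL"),
  Q2 = c("MN", "CO", NA, NA),
  value = c(10, 24, 12, 54),
  stringsAsFactors = FALSE
)

# Named vector: abbreviations are the names, full state names are the values
lktState <- setNames(state.name, state.abb)

# Columns to translate (hypothetical helper variable, not in the original)
cols <- c("Q1", "Q2")

# Index the named vector with each column; unname() drops the
# abbreviation names that would otherwise be carried along
xdf[cols] <- lapply(xdf[cols], function(x) unname(lktState[x]))

xdf
```

Indexing a named character vector with NA returns NA, so missing abbreviations stay missing without any special handling.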

"tidyverse" version:

library(dplyr)

xdf <- data_frame(
  Q1= c('AL', NA, 'TX', 'FL'),
  Q2 = c('MN', 'CO', NA, NA),
  value = c(10, 24, 12, 54)
)

xdf
## # A tibble: 4 x 3
##      Q1    Q2 value
##   <chr> <chr> <dbl>
## 1    AL    MN    10
## 2  <NA>    CO    24
## 3    TX  <NA>    12
## 4    FL  <NA>    54
lktState <- setNames(state.name, state.abb)

mutate_at(xdf, .vars=vars(-value), .funs=funs(lktState[.]))
## # A tibble: 4 x 3
##        Q1        Q2 value
##     <chr>     <chr> <dbl>
## 1 Alabama Minnesota    10
## 2    <NA>  Colorado    24
## 3   Texas      <NA>    12
## 4 Florida      <NA>    54

There's no need to use "apply"-like idioms for this basic lookup table assignment.
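
In dplyr 1.0.0 and later, mutate_at() is superseded and funs() is deprecated, so the same lookup would more likely be written with across(). The following is a minimal sketch under that assumption. It uses tibble() in place of data_frame(), and the result name res is chosen here for illustration.

```r
library(dplyr)

xdf <- tibble(
  Q1 = c("AL", NA, "TX", "FL"),
  Q2 = c("MN", "CO", NA, NA),
  value = c(10, 24, 12, 54)
)

# Named vector: abbreviations are the names, full state names are the values
lktState <- setNames(state.name, state.abb)

# across() applies the lookup to every column except `value`;
# unname() keeps the abbreviation names out of the resulting columns
res <- mutate(xdf, across(-value, ~ unname(lktState[.x])))

res
```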

I agree with Sotos that a join is the natural way to do this. However, your purrr solution is definitely fixable.

You are missing three things:

  1. For anything other than a simple single function, you need to use funs in mutate_at.
  2. map functions use ~ notation for anonymous functions.
  3. You don't want to return a list, but rather a character vector, so use the _chr variant (map_chr).

mutate_at(df,
          vars('Q1', 'Q2'), 
          funs(purrr::map_chr(.x = ., ~lktState$name[match(.x, lktState$abb)])))

Gives:

# A tibble: 4 x 3
       Q1        Q2 value
    <chr>     <chr> <dbl>
1 Alabama Minnesota    10
2    <NA>  Colorado    24
3   Texas      <NA>    12
4 Florida      <NA>    54

Data

df <- data_frame(Q1= c('AL', NA, 'TX', 'FL'), Q2=c('MN', 'CO', NA, NA), value=c(10,24,12,54))
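
For comparison, the join mentioned at the top of this answer could look roughly like the sketch below. It assumes the two-column lookup table from the question (abb, name) and joins once per column. The result name res and the use of tibble() in place of data_frame() are choices made here, not part of the original answer.

```r
library(dplyr)

df <- tibble(
  Q1 = c("AL", NA, "TX", "FL"),
  Q2 = c("MN", "CO", NA, NA),
  value = c(10, 24, 12, 54)
)

# Lookup table as a data frame, as in the question
lktState <- tibble(abb = state.abb, name = state.name)

res <- df %>%
  # Join on Q1, overwrite Q1 with the full name, drop the helper column
  left_join(lktState, by = c("Q1" = "abb")) %>%
  mutate(Q1 = name) %>%
  select(-name) %>%
  # Repeat for Q2
  left_join(lktState, by = c("Q2" = "abb")) %>%
  mutate(Q2 = name) %>%
  select(-name)

res
```

Rows whose abbreviation is NA find no match in the lookup table, so they come through the left join with NA as the state name.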
