简体   繁体   中英

How to mutate selected rows while keeping the whole dataframe using dplyr

I want to filter by a few criteria but keep the dataframe as is. Is there a filter equivalent that does this?

The overall dataset has 650 entries with 400 columns, and want to edit one subject ID while keeping the rest of the other 650 entries, this is the code I thought of using:

#191-35193895 3 ciders
df %>% 
  select(subject_id, alc_exp_1:alcohol_exposure_questionnaire_enrolment_complete) %>% 
  filter (subject_id == "191-25881304" & alc_exp_1b == 999) %>% 
  mutate(alc_exp_1b = recode(alc_exp_1b, '999' = "3")) %>% 
  mutate(alc_exp_1c = replace_na(alc_exp_1c, 7))

In detail, I am trying to pick a subject ID and mutate the alc_exp_1b column by replacing 999 with 3, then modify alc_exp_1c by replacing the NA with 7.

This code of course works but over-rides the original dataset.

Please help.

As per my comment, using data.table :

library(data.table)
setDT(df)

df[subject_id == "191-25881304" & alc_exp_1b == 999, alc_exp_1b := 3]#replace 999 with 3
df[subject_id == "191-25881304" & is.na(alc_exp_1b), alc_exp_1b := 7]#replace NA with 7

Note that this is updating by reference , so there is not need to write something like df <- df %>%... in this situation.

If I understand you correctly, you want to replace "999" with "3" using mutate() .

Here is an example using the palmerpenguins dataset :

# Load the libraries
library(tidyverse)
library(palmerpenguins)

# Look at the first 6 liens of the dataset
head(penguins)
#> # A tibble: 6 × 8
#>   species island    bill_length_mm bill_depth_mm flipper_l…¹ body_…² sex    year
#>   <fct>   <fct>              <dbl>         <dbl>       <int>   <int> <fct> <int>
#> 1 Adelie  Torgersen           39.1          18.7         181    3750 male   2007
#> 2 Adelie  Torgersen           39.5          17.4         186    3800 fema…  2007
#> 3 Adelie  Torgersen           40.3          18           195    3250 fema…  2007
#> 4 Adelie  Torgersen           NA            NA            NA      NA <NA>   2007
#> 5 Adelie  Torgersen           36.7          19.3         193    3450 fema…  2007
#> 6 Adelie  Torgersen           39.3          20.6         190    3650 male   2007
#> # … with abbreviated variable names ¹​flipper_length_mm, ²​body_mass_g

# Change the first penguin's bill_length_mm from "39.1" to "999"
penguins$bill_length_mm[1] <- 999

penguins %>%
  na.omit() %>%
  head()
#> # A tibble: 6 × 8
#>   species island    bill_length_mm bill_depth_mm flipper_l…¹ body_…² sex    year
#>   <fct>   <fct>              <dbl>         <dbl>       <int>   <int> <fct> <int>
#> 1 Adelie  Torgersen          999            18.7         181    3750 male   2007
#> 2 Adelie  Torgersen           39.5          17.4         186    3800 fema…  2007
#> 3 Adelie  Torgersen           40.3          18           195    3250 fema…  2007
#> 4 Adelie  Torgersen           36.7          19.3         193    3450 fema…  2007
#> 5 Adelie  Torgersen           39.3          20.6         190    3650 male   2007
#> 6 Adelie  Torgersen           38.9          17.8         181    3625 fema…  2007
#> # … with abbreviated variable names ¹​flipper_length_mm, ²​body_mass_g

# Use mutate to replace any "999" value in the bill_length_mm column with "3"
penguins %>%
  na.omit() %>%
  mutate(bill_length_mm = ifelse(bill_length_mm == 999, 3, bill_length_mm)) %>%
  head()
#> # A tibble: 6 × 8
#>   species island    bill_length_mm bill_depth_mm flipper_l…¹ body_…² sex    year
#>   <fct>   <fct>              <dbl>         <dbl>       <int>   <int> <fct> <int>
#> 1 Adelie  Torgersen            3            18.7         181    3750 male   2007
#> 2 Adelie  Torgersen           39.5          17.4         186    3800 fema…  2007
#> 3 Adelie  Torgersen           40.3          18           195    3250 fema…  2007
#> 4 Adelie  Torgersen           36.7          19.3         193    3450 fema…  2007
#> 5 Adelie  Torgersen           39.3          20.6         190    3650 male   2007
#> 6 Adelie  Torgersen           38.9          17.8         181    3625 fema…  2007
#> # … with abbreviated variable names ¹​flipper_length_mm, ²​body_mass_g

If you only want to change the "999" value for the first penguin, and not for every penguin with "999", you can do it using an identifier (id):

# Add an "id" column and give each penguin a unique value (in this case, the rownumber)
penguins %>%
  na.omit() %>%
  mutate(id = row_number()) %>%
  head()
#> # A tibble: 6 × 9
#>   species island    bill_length_mm bill_dept…¹ flipp…² body_…³ sex    year    id
#>   <fct>   <fct>              <dbl>       <dbl>   <int>   <int> <fct> <int> <int>
#> 1 Adelie  Torgersen          999          18.7     181    3750 male   2007     1
#> 2 Adelie  Torgersen           39.5        17.4     186    3800 fema…  2007     2
#> 3 Adelie  Torgersen           40.3        18       195    3250 fema…  2007     3
#> 4 Adelie  Torgersen           36.7        19.3     193    3450 fema…  2007     4
#> 5 Adelie  Torgersen           39.3        20.6     190    3650 male   2007     5
#> 6 Adelie  Torgersen           38.9        17.8     181    3625 fema…  2007     6
#> # … with abbreviated variable names ¹​bill_depth_mm, ²​flipper_length_mm,
#> #   ³​body_mass_g
  
# Then, for only the first penguin (id == 1), change "999" to "3"
penguins %>%
  na.omit() %>%
  mutate(id = row_number()) %>%
  mutate(bill_length_mm = ifelse(bill_length_mm == 999 & id == 1, 3, bill_length_mm)) %>%
  head()
#> # A tibble: 6 × 9
#>   species island    bill_length_mm bill_dept…¹ flipp…² body_…³ sex    year    id
#>   <fct>   <fct>              <dbl>       <dbl>   <int>   <int> <fct> <int> <int>
#> 1 Adelie  Torgersen            3          18.7     181    3750 male   2007     1
#> 2 Adelie  Torgersen           39.5        17.4     186    3800 fema…  2007     2
#> 3 Adelie  Torgersen           40.3        18       195    3250 fema…  2007     3
#> 4 Adelie  Torgersen           36.7        19.3     193    3450 fema…  2007     4
#> 5 Adelie  Torgersen           39.3        20.6     190    3650 male   2007     5
#> 6 Adelie  Torgersen           38.9        17.8     181    3625 fema…  2007     6
#> # … with abbreviated variable names ¹​bill_depth_mm, ²​flipper_length_mm,
#> #   ³​body_mass_g

Created on 2022-09-19 by the reprex package (v2.0.1)

Does this approach solve your problem?

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM