
How to select one row from multiple rows with the same value in a column, keeping only rows with a non-empty value in that column, in R?

Assume I have an ID column, a Gene_ID column, and a Value column. More than one row can have the same Gene_ID, and some rows have no Gene_ID value.

I'd like to keep only the rows with a non-empty value in that column, and only one row for each Gene_ID. For example, I have the data frame below:

 # ID Gene_ID  Value
 # 6  26470  1.137318
 # 7  10878  -1.051181
 # 8   ""    -1.316229
 # 9 26470  -1.015734

And I want the result to be:

 # ID Gene_ID  Value
 # 6  26470  1.137318
 # 7  10878  -1.051181

library(tidyverse)

df %>%
  filter(Gene_ID != '') %>%  # drop rows with an empty Gene_ID
  group_by(Gene_ID) %>%
  slice(1) %>%               # keep the first row of each group
  ungroup()

This keeps the first row for each Gene_ID.
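
For comparison, the same "first row per Gene_ID" idea could be sketched in base R with duplicated(). This is only an illustration, not part of the original answer; the data frame is rebuilt from the question's example, and the names non_empty and first_rows are made up here.

```r
# Example data rebuilt from the question
df <- data.frame(
  ID      = c(6, 7, 8, 9),
  Gene_ID = c("26470", "10878", "", "26470"),
  Value   = c(1.137318, -1.051181, -1.316229, -1.015734),
  stringsAsFactors = FALSE
)

# Drop rows whose Gene_ID is an empty string
non_empty <- df[df$Gene_ID != "", ]

# duplicated() flags repeats after the first occurrence, so negating it
# keeps the first row seen for each Gene_ID
first_rows <- non_empty[!duplicated(non_empty$Gene_ID), ]
```

Unlike the grouped version, this keeps the rows in their original order.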

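Note that the filter() condition depends on how missing Gene_ID values are stored: Gene_ID != '' works when they are empty strings, but if they are NA, use !is.na(Gene_ID) instead.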
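
A minimal sketch that covers both cases, assuming missing gene IDs may be stored either as NA or as empty strings. The data frame is rebuilt from the question's example, with one hypothetical NA row (ID 10) added purely for illustration. It also swaps group_by() and slice(1) for distinct(), which keeps the first row per Gene_ID as well.

```r
library(dplyr)

# Example data from the question, plus one hypothetical row (ID 10)
# whose Gene_ID is NA, added only to illustrate the NA case
df <- data.frame(
  ID      = c(6, 7, 8, 9, 10),
  Gene_ID = c("26470", "10878", "", "26470", NA),
  Value   = c(1.137318, -1.051181, -1.316229, -1.015734, 0.5),
  stringsAsFactors = FALSE
)

result <- df %>%
  filter(!is.na(Gene_ID), Gene_ID != "") %>%  # drop NA and empty Gene_IDs
  distinct(Gene_ID, .keep_all = TRUE)         # keep the first row per Gene_ID
```

With .keep_all = TRUE, distinct() keeps all columns of the first row it sees for each Gene_ID and leaves the rows in their original order.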
