I would like to remove some rows from my data.frame. Let's start with an example:
> tbl_EOD[20:40,]
AGI.identifier location_subacon
20 AT1G11360.4 plastid
21 AT1G11650.2 nucleus
22 AT1G11930.2 cytosol
23 AT1G12010.1 peroxisome
24 AT1G12080.2 nucleus
25 AT1G12140.1 plasma membrane
26 AT1G12250.2 cytosol,nucleus ## row which I want to delete
27 AT1G12520.2 peroxisome
28 AT1G13320.2 cytosol
29 AT1G13930.3 nucleus
30 AT1G14250.1 extracellular,plasma membrane ## row which I want to delete
31 AT1G15340.2 nucleus
32 AT1G15470.1 cytosol
33 AT1G16460.4 cytosol
34 AT1G16820.2 cytosol,mitochondrion ## row which I want to delete
35 AT1G17150.1 extracellular
36 AT1G17330.1 cytosol
37 AT1G17470.2 cytosol
38 AT1G17890.3 cytosol
39 AT1G19730.1 cytosol
40 AT1G20060.1 nucleus
As the example shows, I just want to remove the rows that have two localizations separated by a comma.
You can use grepl for this.
tbl_EOD <- tbl_EOD[!grepl(",", tbl_EOD$location_subacon), ]
Explanation: grepl searches a character vector, call it S, for a pattern. It returns a logical vector of the same length, with TRUE where the corresponding element of S contains the pattern and FALSE otherwise. In this case, the pattern is ",". What you really want are the rows that do not contain a comma, so you put "!" in front of grepl, which turns every TRUE into FALSE and vice versa.
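To make the logical indexing concrete, here is a minimal sketch on a toy data frame. The three rows are copied from the question's example; the names df, has_comma and kept are made up for illustration. It also passes fixed = TRUE, which is optional here and just tells grepl to treat "," as a literal string rather than a regular expression.

```r
# Toy data frame; the values are copied from the example rows in the question.
df <- data.frame(
  AGI.identifier   = c("AT1G12140.1", "AT1G12250.2", "AT1G12520.2"),
  location_subacon = c("plasma membrane", "cytosol,nucleus", "peroxisome"),
  stringsAsFactors = FALSE
)

# TRUE for rows whose location contains a comma.
has_comma <- grepl(",", df$location_subacon, fixed = TRUE)

# Negate the logical vector to keep only the rows without a comma.
kept <- df[!has_comma, ]
print(kept)
```

Only the row for AT1G12250.2 is dropped; the other two rows, including their original row names, are kept.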
If you want to keep all rows, but remove everything after the comma, you could use gsub.
tbl_EOD$location_subacon <- gsub("(.*),.*", "\\1", tbl_EOD$location_subacon)
Explanation: gsub searches a character vector S for a pattern and replaces every occurrence of that pattern with the replacement. In this case, the pattern is "(.*),.*" and the replacement is "\\1". The pattern is a regular expression that means roughly "zero or more characters (captured), followed by a comma, followed by zero or more characters". The parentheses capture the enclosed portion so that you can refer to it later. The replacement "\\1" is simply that captured portion.
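One thing worth checking if a value can ever contain more than one comma: ".*" is greedy, so the captured group runs up to the last comma and only the text after the last comma is removed. The sketch below compares that with sub(",.*", "", x), which cuts at the first comma. The vector x and the value "a,b,c" are hypothetical and only there to show the difference.

```r
# "a,b,c" is a hypothetical value with two commas; the others come from the question.
x <- c("cytosol,nucleus", "nucleus", "a,b,c")

# Greedy capture: keeps everything before the LAST comma.
greedy <- gsub("(.*),.*", "\\1", x)

# Alternative: delete from the FIRST comma to the end of the string.
first_only <- sub(",.*", "", x)

print(greedy)
print(first_only)
```

For the data shown in the question, where each value has at most one comma, both versions give the same result.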