
How to remove certain junk values from a column in a data frame?

I have a column called Region in a data frame which is of character type. It has certain junk values as below which I want to remove:

"#VALUE!","10.1","10.2","138","145","161"

But when I try to remove them with something like subset, they are not removed:

subset(pro_202_data,Region != c("#VALUE!","10.1","10.2","138","145","161"))

I have also tried using only !=, but that doesn't work either.

Please help.

Does this answer your question? I created a data frame from the values you provided and filtered out the first two. You can do the same for your entire data frame by listing all the junk values inside c().

> pro_202_data
   Region
1 #VALUE!
2    10.1
3    10.2
4     138
5     145
6     161
> subset(pro_202_data, !(Region %in% c("#VALUE!","10.1")))
  Region
3   10.2
4    138
5    145
6    161
> 
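As for why the original != attempt leaves the junk in place: != compares element by element and recycles the shorter vector along the column, so a row is only dropped when it happens to line up with the same value at the same position. %in% instead tests each row against the whole set. A minimal sketch of the difference, assuming a reconstruction of the data in which "North" and "South" are hypothetical valid regions added for illustration:

```r
# Hypothetical reconstruction of the data; "North" and "South" are
# made-up valid values, not taken from the question.
pro_202_data <- data.frame(
  Region = c("#VALUE!", "10.1", "10.2", "138", "145", "161", "North", "South"),
  stringsAsFactors = FALSE
)
junk <- c("#VALUE!", "10.1", "10.2", "138", "145", "161")

# `!=` is element-wise: row 1 is compared with junk[1], row 2 with junk[2],
# and so on. With the same junk values in a different order, nothing matches
# position by position, so every row is kept.
shuffled <- data.frame(Region = rev(junk), stringsAsFactors = FALSE)
kept_by_neq <- subset(shuffled, Region != junk)

# `%in%` checks each row against the whole set of junk values.
cleaned <- subset(pro_202_data, !(Region %in% junk))

nrow(kept_by_neq)  # all junk rows survive the `!=` filter
cleaned$Region     # only the non-junk rows remain
```

With a real column that is longer than the junk vector, the != version would typically also emit a recycling warning when the lengths are not multiples of each other.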

You can subset like this:

Single vector:

x <- c("#VALUE!","10.1","10.2","138","145","161")
x[x != "#VALUE!"]
[1] "10.1" "10.2" "138"  "145"  "161"

Dataframe:

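df <- data.frame(
  Region = c("#VALUE!","10.1","10.2","138","145","161"), stringsAsFactors = FALSE
)

df[df$Region != "#VALUE!", , drop = FALSE]
  Region
2   10.1
3   10.2
4    138
5    145
6    161

Note the comma after the row condition: it selects all columns of the data frame. Because df has only one column here, drop = FALSE is needed to get a data frame back; without it, R simplifies the result to a character vector.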
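The behaviour of row subsetting on a single-column data frame can be checked directly: by default df[rows, ] simplifies the result to a plain vector, while passing drop = FALSE keeps it a data frame. A minimal sketch, reusing the df from this answer (the names keep, as_vector and as_frame are introduced here only for illustration):

```r
df <- data.frame(
  Region = c("#VALUE!", "10.1", "10.2", "138", "145", "161"),
  stringsAsFactors = FALSE
)

# Logical index of the rows to keep.
keep <- df$Region != "#VALUE!"

# Only one column is selected, so the result is simplified to a vector.
as_vector <- df[keep, ]

# drop = FALSE keeps the data frame structure.
as_frame <- df[keep, , drop = FALSE]

is.data.frame(as_vector)
is.data.frame(as_frame)
```

With two or more columns the default already returns a data frame, so drop = FALSE mainly matters for single-column cases like this one.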
