Deleting a specific row without knowing row number

Question

I would like to delete a row from a data frame and sum the resulting columns. I know the row I want to delete based on its contents, but not its row number. Below I present three examples, two of which work. Using - to delete the row only works if the first row is to be deleted. Why is that?

My question is similar to this one: How to delete the first row of a dataframe in R? However, there the row is deleted based on its row number.

# This works.

state = 'OH'

my.data = read.table(text = "
      county  y1990 y2000
        cc       NA    2
        OH       NA   10
        bb       NA    1
", sep = "", header = TRUE, na.strings = "NA", stringsAsFactors = FALSE)

my.colsums2 <- colSums(my.data[!(my.data$county == state), 2:ncol(my.data)], na.rm=TRUE)
my.colsums2

# y1990 y2000 
#    0     3

# This works.

my.data = read.table(text = "
      county  y1990 y2000
        OH       NA   10
        cc       NA    2
        bb       NA    1
", sep = "", header = TRUE, na.strings = "NA", stringsAsFactors = FALSE)

my.colsums2 <- colSums(my.data[-(my.data$county == state), 2:ncol(my.data)], na.rm=TRUE)
my.colsums2

# y1990 y2000 
#    0     3

# This does not work.

my.data = read.table(text = "
      county  y1990 y2000
        cc       NA    2
        OH       NA   10
        bb       NA    1
", sep = "", header = TRUE, na.strings = "NA", stringsAsFactors = FALSE)

my.colsums2 <- colSums(my.data[-(my.data$county == state), 2:ncol(my.data)], na.rm=TRUE)
my.colsums2

# y1990 y2000 
#    0    11

I guess I am still confused over the difference between ! and - . Thank you for any advice.

Answer 1

This should clear up the difference between - and !, and I suspect you can take it from there ;) (The output below is for your second data set, where OH is the first row.)

my.data$county == state
# [1]  TRUE FALSE FALSE

!(my.data$county == state)
# [1] FALSE  TRUE  TRUE

-(my.data$county == state)
# [1] -1  0  0

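!, which negates Boolean values, is the operator that you should be using here. - instead coerces TRUE/FALSE to -1/0, and as a row index -1 means "drop row 1" while 0 is ignored, so it only appears to work when the matching row happens to be the first one.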
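
To see why the third example in the question returns 11 rather than 3, it may help to look at the index that - actually produces there. This is only a minimal sketch reusing the question's third data set; the names idx.neg and idx.not are made up for illustration.

```r
state <- "OH"

# Third data set from the question: OH is in row 2
my.data <- read.table(text = "
      county  y1990 y2000
        cc       NA    2
        OH       NA   10
        bb       NA    1
", header = TRUE, na.strings = "NA", stringsAsFactors = FALSE)

idx.neg <- -(my.data$county == state)  # logical coerced to integers: 0 -1 0
idx.not <- !(my.data$county == state)  # logical negation: TRUE FALSE TRUE

# Zeros are ignored and -1 means "drop row 1", so cc is removed and OH stays
my.data[idx.neg, "county"]  # "OH" "bb"

# The logical index keeps every row whose county is not OH
my.data[idx.not, "county"]  # "cc" "bb"
```

With idx.neg the OH row survives, which is where the 10 in the y2000 sum comes from.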

Answer 2

I think it's important to remember what you're doing. When you pass a condition to subset rows or columns, it needs to be either a full-length vector of TRUE/FALSE values or a vector of numbers that identify the rows (or columns).

Here's a simple example with a vector. Try entering these conditions into the console to see what they return:

x <- rnorm(20)

## This uses integer values for indexing
x[which(x > 1)]  # Numeric: only the positions that match

## These use logical values for indexing
x[x > 1]    # Logical: only the elements where the test is TRUE
x[!(x < 1)] # Logical: only the elements where x < 1 is FALSE

Bad Behaviour:

x[-which(x > 1)] # Negates the positions; returns nothing when there are no matches = BAD
x[!which(x > 1)] # Converts numbers to logical (all FALSE) = BAD
x[-(x > 1)] # Converts logical to numeric (0 and -1) = BAD
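
Because rnorm() gives different numbers on every run, the failure modes above can be easier to follow with a fixed vector. The following is a small sketch under that assumption; the values in x are arbitrary and chosen only so that the results are reproducible.

```r
x <- c(0.5, 2.0, -1.0, 3.0)   # fixed values instead of rnorm()

# When something matches, negating which() does drop the matches ...
x[-which(x > 1)]    # 0.5 -1.0

# ... but when nothing matches, which() is empty and so is the result
x[-which(x > 10)]   # numeric(0), not the full vector

# !which() turns every non-zero position into FALSE, so nothing is selected
x[!which(x > 1)]    # numeric(0)

# -(logical) yields only 0s and -1s, so element 1 is the only one ever dropped
x[-(x > 1)]         # 2 -1 3

# The logical form behaves the same whether or not anything matches
x[!(x > 10)]        # 0.5 2.0 -1.0 3.0
```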

Specific to your example:

!(my.data$county == state) # Converts TRUE/FALSE to FALSE/TRUE
which(my.data$county != state) # Rows where my.data$county is not equal to state

Personally, I recommend using which() in all cases to avoid accidentally negating a logical or converting a numeric index. It also tends to be easier to "translate".
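
Putting the two recommended forms back into the original colSums() call might look like the sketch below. It reuses the question's third data set; keep.logical, keep.rows, sums.logical and sums.which are illustrative names, not anything from the question.

```r
state <- "OH"

# Third data set from the question: OH is in row 2
my.data <- read.table(text = "
      county  y1990 y2000
        cc       NA    2
        OH       NA   10
        bb       NA    1
", header = TRUE, na.strings = "NA", stringsAsFactors = FALSE)

keep.logical <- my.data$county != state         # TRUE FALSE TRUE
keep.rows    <- which(my.data$county != state)  # 1 3

# Both index types select the same rows, so both sums leave out OH
sums.logical <- colSums(my.data[keep.logical, 2:ncol(my.data)], na.rm = TRUE)
sums.which   <- colSums(my.data[keep.rows, 2:ncol(my.data)], na.rm = TRUE)

sums.logical
sums.which
```

One difference that may matter on real data: which() silently skips positions where the comparison is NA, whereas a logical index containing NA returns an NA row for each such position.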

solution1
6 ACCEPTED 2013-04-02 22:39:58

solution2
3 2013-04-02 22:46:04
