I would like to delete a row from a data frame and sum the resulting columns. I know the row I want to delete based on its contents, but not its row number. Below I present three examples, two of which work. Using the - operator to delete the row only works if the first row is the one to be deleted. Why is that?
My question is similar to this one: How to delete the first row of a dataframe in R? However, there the row is deleted based on its row number.
# This works.
state = 'OH'
my.data = read.table(text = "
county y1990 y2000
cc NA 2
OH NA 10
bb NA 1
", sep = "", header = TRUE, na.strings = "NA", stringsAsFactors = FALSE)
my.colsums2 <- colSums(my.data[!(my.data$county == state), 2:ncol(my.data)], na.rm=TRUE)
my.colsums2
# y1990 y2000
# 0 3
# This works.
my.data = read.table(text = "
county y1990 y2000
OH NA 10
cc NA 2
bb NA 1
", sep = "", header = TRUE, na.strings = "NA", stringsAsFactors = FALSE)
my.colsums2 <- colSums(my.data[-(my.data$county == state), 2:ncol(my.data)], na.rm=TRUE)
my.colsums2
# y1990 y2000
# 0 3
# This does not work.
my.data = read.table(text = "
county y1990 y2000
cc NA 2
OH NA 10
bb NA 1
", sep = "", header = TRUE, na.strings = "NA", stringsAsFactors = FALSE)
my.colsums2 <- colSums(my.data[-(my.data$county == state), 2:ncol(my.data)], na.rm=TRUE)
my.colsums2
# y1990 y2000
# 0 11
I guess I am still confused about the difference between ! and -. Thank you for any advice.
This should clear up the difference between - and !, and I suspect you can take it from there ;)
# Using the data from the second example (OH in the first row):
my.data$county == state
# [1] TRUE FALSE FALSE
!(my.data$county == state)
# [1] FALSE TRUE TRUE
-(my.data$county == state)
# [1] -1 0 0
The ! operator, which negates logical values, is the one you should be using here.
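To spell out why - only appears to work when the matching row comes first: unary minus coerces the logical vector to numbers (TRUE becomes -1, FALSE becomes 0), and in a numeric index the zeros are ignored while -1 means "drop row 1". Here is a minimal sketch using the data from the third example in the question; the names idx_minus, kept_minus, and kept_bang are only illustrative and do not appear in the original code.

```r
state <- "OH"
my.data <- data.frame(
  county = c("cc", "OH", "bb"),
  y1990  = c(NA, NA, NA),
  y2000  = c(2, 10, 1),
  stringsAsFactors = FALSE
)

# Unary minus coerces logical to numeric: TRUE -> -1, FALSE -> 0
idx_minus <- -(my.data$county == state)
idx_minus
# [1]  0 -1  0

# Zeros are ignored in a numeric index, so this is the same as my.data[-1, ]:
# it removes row 1 ("cc"), not the "OH" row
kept_minus <- my.data[idx_minus, ]
kept_minus$county
# [1] "OH" "bb"

# Logical negation keeps every row where the test is FALSE
kept_bang <- my.data[!(my.data$county == state), ]
kept_bang$county
# [1] "cc" "bb"
```

This is why the third example sums to 11 (10 + 1) instead of 3: the OH row is still in the data and the cc row is gone.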
I think it's important to remember what you're doing. When you pass a condition to subset rows or columns, the index needs to be either a full-length vector of TRUE/FALSE values or a vector of numbers that identify the rows (or columns).
Here's a simple example with a vector. Try entering these conditions into the console to see what they return:
x <- rnorm(20)
## These use integer values for indexing
x[which(x > 1)] # Numbers > Only those numbers which match
## These use logical values for indexing
x[x > 1] # Logical > Only those that are true
x[!(x < 1)] # Logical > Only those that are false
Bad Behaviour:
x[-which(x > 1)] # Positive numbers to negative numbers; drops everything if nothing matches = BAD
x[!which(x > 1)] # Converts numbers to logical = BAD
x[-(x > 1)] # Converts logical to numeric = BAD
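Because rnorm() gives different values on every run, the effect of the three "bad" forms is easier to see on a fixed vector. The sketch below uses made-up values chosen for illustration; the threshold 10 in the last line is likewise just an assumption picked so that nothing matches.

```r
x <- c(0.5, 2, 3, -1)   # fixed values instead of rnorm() so results are reproducible

which(x > 1)
# [1] 2 3

# -(x > 1) becomes c(0, -1, -1, 0): zeros are ignored, so only element 1 is dropped
x[-(x > 1)]
# [1]  2  3 -1

# !which(x > 1) turns the nonzero positions 2 and 3 into c(FALSE, FALSE),
# which is recycled, so nothing is selected
x[!which(x > 1)]
# numeric(0)

# -which() gives the intended result when at least one element matches ...
x[-which(x > 1)]
# [1]  0.5 -1.0

# ... but when nothing matches, which() returns integer(0) and everything is dropped
x[-which(x > 10)]
# numeric(0)
```

The last case is the usual reason to avoid negating the result of which(): an empty match silently returns an empty vector rather than the whole of x.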
Specific to your example:
!(my.data$county == state) # Converts TRUE/FALSE to FALSE/TRUE
which(my.data$county != state) # Rows where my.data$county is not equal to state
Personally, I recommend using which() in all cases to avoid accidentally negating a logical or converting to numeric. It also tends to be easier to "translate".
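Applied to the third example from the question, the which() approach might look like the sketch below. The variable names keep and my.colsums are illustrative and not taken from the original code.

```r
state <- "OH"
my.data <- read.table(text = "
county y1990 y2000
cc NA 2
OH NA 10
bb NA 1
", header = TRUE, na.strings = "NA", stringsAsFactors = FALSE)

# Positions of the rows to keep (every county except the one in `state`)
keep <- which(my.data$county != state)
keep
# [1] 1 3

# Sum the year columns over the kept rows only
my.colsums <- colSums(my.data[keep, 2:ncol(my.data)], na.rm = TRUE)
my.colsums
# y1990 y2000
#     0     3
```

One practical difference from a plain logical index: which() leaves out comparisons that evaluate to NA, whereas a logical index containing NA returns a row of NAs for each of them.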