Extract a portion of 1 column from data.frame/matrix

I get flummoxed by some of the simplest of things. In the following code I wanted to extract just a portion of one column in a data.frame called 'a'. I get the right values, but the final entity is padded with NAs which I don't want. 'b' is the extracted column, 'c' is the correct portion of data but has extra NA padding at the end.

How do I best do this so that 'c' ends up naturally only 9 elements long (i.e. the 15 original minus the 6 I skipped)?

NumBars = 6
a = as.data.frame(c(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15))
a[,2] = c(11,12,13,14,15,16,17,18,19,20,21,22,23,24,25)
names(a)[1] = "Data1"
names(a)[2] = "Data2"

# Use 1st column of data only

b = as.matrix(a[,1])
c = as.matrix(b[NumBars+1:length(b)])

The immediate reason you're getting NAs is that the sequence operator : takes precedence over the addition operator +, as detailed in the R Language Definition. Therefore NumBars+1:length(b) is not the same as (NumBars+1):length(b). The first adds NumBars to every element of the vector 1:length(b), while the second does the addition first and then builds the sequence.

ind.1 <- 1+1:3   # == 2:4
ind.2 <- (1+1):3 # == 2:3 

When you index with this longer vector, you get all the elements you want, but you are also asking for entries like b[length(b)+1], which the R Language Definition tells us returns NA. That's why you have trailing NAs.

If i is positive and exceeds length(x) then the corresponding selection is NA. A negative out of bounds value for i causes an error.

b <- c(1,2,3)
b[ind.1] 
#[1] 2 3 NA
b[ind.2] 
#[1] 2 3
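Applied to the data in the question, the same precedence fix can be sketched as follows. The names padded and fixed are illustrative only and do not appear in the original code; the data frame is rebuilt with integer sequences for brevity.

```r
NumBars <- 6
a <- data.frame(Data1 = 1:15, Data2 = 11:25)
b <- as.matrix(a[, 1])

# Without parentheses: 6 + (1:15) gives indices 7..21, so 16..21 return NA
padded <- b[NumBars + 1:length(b)]

# With parentheses: (6 + 1):15 gives indices 7..15 only
fixed <- b[(NumBars + 1):length(b)]

length(padded)      # 15, of which the last 6 are NA
length(fixed)       # 9
```

Wrapping fixed in as.matrix() gives the 9 x 1 matrix the question asks for.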

From a design perspective, the other solutions listed here are good choices to help avoid this mistake.

It is often easier to think of what you want to remove from your vector / matrix. Use negative subscripts to remove items.

c = as.matrix(b[-1:-NumBars])
c
##      [,1]
## [1,]    7
## [2,]    8
## [3,]    9
## [4,]   10
## [5,]   11
## [6,]   12
## [7,]   13
## [8,]   14
## [9,]   15
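Two other ways to drop the first NumBars elements are shown below as a minimal sketch, working on the plain column vector rather than the matrix b. The names dropped and tailed are hypothetical and only used for illustration.

```r
NumBars <- 6
a <- data.frame(Data1 = 1:15, Data2 = 11:25)

# Negative subscripts built with seq_len(): removes positions 1..NumBars
dropped <- a$Data1[-seq_len(NumBars)]

# tail() with a negative n keeps everything except the first NumBars elements
tailed <- utils::tail(a$Data1, -NumBars)

identical(dropped, tailed)   # both hold the values 7 through 15
```

Neither form uses the : operator next to +, so the precedence issue does not arise.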

If your goal is to remove NAs from a column, you can also do something like

c <- na.omit(a[,1])

For example:

> x
[1]  1  2  3 NA NA
> na.omit(x)
[1] 1 2 3
attr(,"na.action")
[1] 4 5
attr(,"class")
[1] "omit"

You can ignore the attributes - they are there to let you know what elements were removed.
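If the extra attributes get in the way, a plain vector can be obtained instead. The sketch below assumes the same x as in the example above; kept and stripped are illustrative names. Note that this only removes NAs after the fact and does not address the indexing that produced them.

```r
x <- c(1, 2, 3, NA, NA)

# Logical indexing keeps the non-NA elements and adds no attributes
kept <- x[!is.na(x)]

# as.vector() drops the "na.action" attribute that na.omit() attaches
stripped <- as.vector(na.omit(x))

identical(kept, stripped)   # both are the plain vector 1, 2, 3
```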


 