
What's the difference between these two versions of R?

products <- data.frame(key=c("Kettles", "Fryers", "Toasters", "Irons"),
    price=c(20, 90, 60, 80))


prod <- sample(products$key, 5, replace=T, prob=c(4, 1, 2, 3))

prod

str(prod)

amount <- products[prod,]$price

amount      # problem in 4.0.3 but not in 3.6.1: 4.0.3 gives [1] NA NA NA NA NA

The difference that matters here is that products$key is a factor in R 3.x but a character vector in R 4.x.

R3> str(products)
'data.frame':   4 obs. of  2 variables:
 $ key  : Factor w/ 4 levels "Fryers","Irons",..: 3 1 4 2
 $ price: num  20 90 60 80

R4> str(products)
'data.frame':   4 obs. of  2 variables:
 $ key  : chr  "Kettles" "Fryers" "Toasters" "Irons"
 $ price: num  20 90 60 80

In R 3.6 and earlier, the default was data.frame(..., stringsAsFactors=TRUE). Changing that default to stringsAsFactors=FALSE was a long-standing request from many (but not all) users, and it took effect in R 4.0.0.

You can mimic R4's behavior in R3 with:

R3> products[as.character(prod),]$price
[1] NA NA NA NA NA

or

R3> products <- data.frame(key=c("Kettles", "Fryers", "Toasters", "Irons"),
    price=c(20, 90, 60, 80),
    stringsAsFactors = FALSE)
R3> prod <- sample(products$key, 5, replace=T, prob=c(4, 1, 2, 3))
R3> prod
[1] "Fryers"   "Fryers"   "Kettles"  "Toasters" "Irons"   
R3> products[prod,]$price
[1] NA NA NA NA NA
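
The NA values appear because a character row index is matched against the row names of the data frame, and the default row names here are "1" to "4" rather than the keys. A minimal sketch of that behaviour, reusing the products frame from the question (the names by_key and by_name are only illustrative):

```r
products <- data.frame(key = c("Kettles", "Fryers", "Toasters", "Irons"),
                       price = c(20, 90, 60, 80),
                       stringsAsFactors = FALSE)

# Default row names are the character strings "1" to "4", not the keys.
rownames(products)

# A character row index is matched against those row names.
by_key  <- products["Fryers", ]$price   # no row is named "Fryers", so NA
by_name <- products["2", ]$price        # row name "2" exists, so 90
```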

The reason it returns non-NA values when prod is a factor is that, underneath, a factor is really just an integer vector of level codes. Returning to the factor-based frame:

R3> products <- data.frame(key=c("Kettles", "Fryers", "Toasters", "Irons"),
    price=c(20, 90, 60, 80))
R3> set.seed(42)
R3> prod <- sample(products$key, 5, replace=T, prob=c(4, 1, 2, 3))
R3> prod
[1] Fryers   Fryers   Kettles  Toasters Irons   
Levels: Fryers Irons Kettles Toasters
R3> as.integer(prod)
[1] 1 1 3 4 2
R3> products[prod,]$price
[1] 20 20 60 80 90
R3> products[as.integer(prod),]$price
[1] 20 20 60 80 90

So R 3 is really just indexing the rows by the underlying integer codes of the factor. I cannot find a clear bullet in https://cran.r-project.org/doc/manuals/r-devel/NEWS.html that covers this specific use, but it seems a reasonable explanation to me.
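
Note that the level codes follow the alphabetical order of the levels, not the row order of products, so in the R 3 output above Fryers (level 1) is paired with 20, the price in row 1 (Kettles). If the intent is to fetch each sampled product's own price, a lookup by value is one option that should behave the same in both versions. The following is a minimal sketch, not the code from the question; prod is hard-coded to the sample shown above, and the names idx and price_by_key are illustrative:

```r
products <- data.frame(key = c("Kettles", "Fryers", "Toasters", "Irons"),
                       price = c(20, 90, 60, 80),
                       stringsAsFactors = FALSE)
prod <- c("Fryers", "Fryers", "Kettles", "Toasters", "Irons")

# match() gives, for each element of prod, the row position of that key.
idx <- match(prod, products$key)
idx                    # 2 2 1 3 4
products$price[idx]    # 90 90 20 60 80

# Alternative: a named vector used as a lookup table.
price_by_key <- setNames(products$price, products$key)
unname(price_by_key[prod])
```

match() compares factors as character strings, so the same call can be used when key or prod is a factor.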
