
Using plyr/apply in Place of a For Loop

I have a for loop that splits a character column of dimensions (length x width x depth) and assigns the pieces to new columns. Right now I loop over every observation in my data frame to produce those variables. Here is my code, which works but is slow.

library(stringi)

for(i in 1:nrow(data3)){
  data3$Length[i] <- stri_split_fixed(str = as.character(data3$Dimensions[i]),pattern = "x", omit_empty = NA)[[1]][1]
  data3$Width[i] <- stri_split_fixed(str = as.character(data3$Dimensions[i]),pattern = "x", omit_empty = NA)[[1]][2]
  data3$Depth[i] <- stri_split_fixed(str = as.character(data3$Dimensions[i]),pattern = "x", omit_empty = NA)[[1]][3]
}

Is there a plyr or apply function that would speed this operation up? If so, what is the syntax?

Update: Sample data:

structure(list(Shape = structure(c(24L, 24L, 24L, 24L, 24L, 24L, 
24L, 24L, 24L, 24L), .Label = c("Asscher", "Baguette", "Briolette", 
"Bullets", "Circular Brilliant", "Cushion", "Emerald", "Flanders", 
"Half Moon", "Heart", "Hexagon", "Kite", "Lozenge", "Marquise", 
"Octagonal", "Old European", "Old Miner", "Other", "Oval", "Pear", 
"Princess", "Radiant", "Rose Cut", "Round", "Shield", "Square", 
"Tapered Baguette", "Trapezoid", "Triangular"), class = "factor"), 
    Carats = c(0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4
    ), Color = structure(c(6L, 7L, 2L, 4L, 6L, 1L, 1L, 7L, 8L, 
    2L), .Label = c("D", "E", "F", "G", "H", "I", "J", "K", "L", 
    "M", "N", "O", "P", "Q", "R", "S", "T", "U", "V", "W", "X", 
    "Y", "Z"), class = "factor"), Dimensions = c("4.58x4.61x2.97", 
    "4.66x4.69x2.95", "4.71x4.75x2.93", "4.73x4.74x2.91", "4.62x4.67x2.93", 
    "4.79x4.82x2.91", "4.6x4.62x2.92", "4.7x4.73x2.93", "4.66x4.7x2.91", 
    "4.68x4.71x2.92")), .Names = c("Shape", "Carats", "Color", 
"Dimensions"), row.names = 400:409, class = "data.frame")

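No need for a loop. stri_split_fixed is vectorized, meaning it operates on the whole column at once and returns a list with one character vector per row. Binding those vectors row-wise gives a three-column matrix, so you can create all three columns in one go: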

data3[,c("Length","Width","Depth")] =  
          do.call(rbind, stri_split_fixed(str = as.character(data3$Dimensions), 
                                          pattern = "x", omit_empty = NA))

    Shape Carats Color     Dimensions Length Width Depth
400 Round    0.4     I 4.58x4.61x2.97   4.58  4.61  2.97
401 Round    0.4     J 4.66x4.69x2.95   4.66  4.69  2.95
402 Round    0.4     E 4.71x4.75x2.93   4.71  4.75  2.93
403 Round    0.4     G 4.73x4.74x2.91   4.73  4.74  2.91
404 Round    0.4     I 4.62x4.67x2.93   4.62  4.67  2.93
405 Round    0.4     D 4.79x4.82x2.91   4.79  4.82  2.91
406 Round    0.4     D  4.6x4.62x2.92    4.6  4.62  2.92
407 Round    0.4     J  4.7x4.73x2.93    4.7  4.73  2.93
408 Round    0.4     K  4.66x4.7x2.91   4.66   4.7  2.91
409 Round    0.4     E 4.68x4.71x2.92   4.68  4.71  2.92
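
Note that the new columns in the output above hold character strings, not numbers. As a possible variation (a minimal sketch, not part of the original answer), the simplify = TRUE argument of stri_split_fixed returns a character matrix directly, which can then be converted to numeric before assignment. The three-row data3 below is a cut-down stand-in for the question's sample data, and it assumes every Dimensions value has exactly three parts.

```r
library(stringi)

# Cut-down stand-in for the question's data (assumption: three parts per value)
data3 <- data.frame(
  Dimensions = c("4.58x4.61x2.97", "4.66x4.69x2.95", "4.6x4.62x2.92"),
  stringsAsFactors = FALSE
)

# simplify = TRUE returns a character matrix with one row per input string
parts <- stri_split_fixed(data3$Dimensions, pattern = "x", simplify = TRUE)

# Convert the text to numbers while keeping the matrix shape
storage.mode(parts) <- "double"

# Assign the three matrix columns to three new data frame columns
data3[, c("Length", "Width", "Depth")] <- parts

str(data3)
```

With numeric columns, arithmetic such as data3$Length * data3$Width works without a further conversion step.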
