
R - using mutate function from dplyr creates the wrong col name

I am using mutate() to add a new column to my data. I tried two variations, but both created the column with the wrong name: "Domestic_boxoffice" instead of "total_boxoffice".

See the implementation below:

Option 1:

starwars <- mutate(starwars, total_boxoffice = Domestic_boxoffice + Worldwide_boxoffice, .after=Worldwide_boxoffice)

Option 2:

starwars %>% mutate(total_boxoffice = Domestic_boxoffice + Worldwide_boxoffice)


Any idea why this could happen? In the screenshot, the first print shows the data structure and the second print shows the data after the column was added.

Full code:

library(dplyr)  # provides mutate() and %>%

homeDir <- getwd()

csvPath <- file.path(homeDir, "starwars.csv")

starwars <- read.csv(csvPath)

starwars <- starwars %>% mutate(total_boxoffice = Domestic_boxoffice + Worldwide_boxoffice, .after=Worldwide_boxoffice)

starwars

dput(starwars):

structure(list(Release_date = c("Dec 20, 2019", "May 25, 2018", 
"Dec 15, 2017", "Dec 16, 2016", "Dec 18, 2015", "Aug 15, 2008", 
"May 19, 2005", "May 16, 2002", "May 19, 1999", "May 25, 1983", 
"May 21, 1980", "May 25, 1977"), Movie = c("Star Wars: The Rise of Skywalker", 
"Solo: A Star Wars Story", "Star Wars Ep. VIII: The Last Jedi", 
"Rogue One: A Star Wars Story", "Star Wars Ep. VII: The Force Awakens", 
"Star Wars: The Clone Wars", "Star Wars Ep. III: Revenge of the Sith", 
"Star Wars Ep. II: Attack of the Clones", "Star Wars Ep. I: The Phantom Menace", 
"Star Wars Ep. VI: Return of the Jedi", "Star Wars Ep. V: The Empire Strikes Again", 
"Star Wars Ep. IV: A New Hope"), Production_budget = structure(c(275, 
275, 200, 200, 306, 8.5, 115, 115, 115, 32.5, 23, 11), dim = c(12L, 
1L), dimnames = list(NULL, "Production_budget")), Opening_weekend = structure(c(177.383864, 
84.420489, 220.009584, 155.081681, 247.966675, 14.611273, 108.435841, 
80.027814, 64.81097, 23.019618, 4.910483, 1.554475), dim = c(12L, 
1L), dimnames = list(NULL, "Opening_weekend")), Domestic_boxoffice = structure(c(515.202542, 
213.767512, 620.181382, 532.177324, 936.662225, 35.161554, 380.270577, 
310.67674, 474.544677, 309.205079, 291.73896, 460.998007), dim = c(12L, 
1L), dimnames = list(NULL, "Domestic_boxoffice")), Worldwide_boxoffice = structure(c(1072.848487, 
393.151347, 1331.635141, 1055.135598, 2064.615817, 68.695443, 
848.998877, 656.695615, 1027.044677, 475.106177, 549.001242, 
775.398007), dim = c(12L, 1L), dimnames = list(NULL, "Worldwide_boxoffice")), 
    total_boxoffice = structure(c(1588.051029, 606.918859, 1951.816523, 
    1587.312922, 3001.278042, 103.856997, 1229.269454, 967.372355, 
    1501.589354, 784.311256, 840.740202, 1236.396014), dim = c(12L, 
    1L), dimnames = list(NULL, "Domestic_boxoffice")), US_avg_ticket_price_in_USD = c(9.16, 
    9.11, 8.97, 8.65, 8.43, 7.18, 6.41, 5.81, 5.08, 3.15, 2.69, 
    2.23)), row.names = c(NA, -12L), class = "data.frame")

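As suggested in the comments, the problem is that some of your columns are one-column matrices rather than plain vectors. The dput() output shows that the new column really is stored under the name total_boxoffice, but it is a 12 x 1 matrix that inherited its dimnames ("Domestic_boxoffice") from the first operand of the sum, and that inner name is apparently what shows up when the data frame is printed. @JonSpring suggests as.numeric(), but drop() might be more precise: it only removes the redundant dimension and leaves non-matrix columns unchanged. I also suggest checking the upstream path: it's not clear to me how the code in your question (read.csv()) could produce output of this form; maybe there were other steps you didn't tell us about?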
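A minimal base-R sketch of the mechanism, using hypothetical toy columns (a, b and total are made-up names, not from the question): adding two one-column matrices returns a matrix that keeps the first operand's dimnames, so the name stored on the data frame and the inner matrix name can disagree.

```r
# Hypothetical toy data; the column names a, b and total are made up.
df <- data.frame(id = 1:3)
df$a <- matrix(c(1, 2, 3), ncol = 1, dimnames = list(NULL, "a"))
df$b <- matrix(c(10, 20, 30), ncol = 1, dimnames = list(NULL, "b"))

# Arithmetic on two matrices keeps the attributes of the first operand,
# so the new column is a 3 x 1 matrix whose inner column name is "a".
df$total <- df$a + df$b

names(df)              # names stored on the data frame: "id" "a" "b" "total"
colnames(df$total)     # inner name carried by the matrix column: "a"
sapply(df, is.matrix)  # TRUE for every column that is a matrix
```

The dplyr pipeline that follows removes those matrix dimensions before computing the sum.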

(starwars
   |> mutate(across(everything(), drop),   # flatten the 12 x 1 matrix columns
             total_boxoffice = Domestic_boxoffice + Worldwide_boxoffice,
             .after = Worldwide_boxoffice)
)
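
For comparison, a rough base-R sketch of the same flattening step without dplyr, on made-up toy data (label, value and flat are hypothetical names). It also illustrates one plausible reading of why drop() is the more targeted choice: it returns non-array columns unchanged, whereas as.numeric() is only safe on columns that are already numeric.

```r
# Hypothetical toy data: one character column and one 3 x 1 matrix column.
df <- data.frame(label = c("x", "y", "z"))
df$value <- matrix(c(1.5, 2.5, 3.5), ncol = 1, dimnames = list(NULL, "value"))

# drop() removes dimensions of extent 1 and returns non-arrays unchanged,
# so it can be applied to every column of the data frame.
flat <- df
flat[] <- lapply(df, drop)

is.matrix(flat$value)  # FALSE: value is now a plain numeric vector
flat$label             # the character column is left as it was

# as.numeric() also flattens the matrix column, but it should be limited
# to numeric columns: on a character column such as label it would
# return NA values with a coercion warning.
as.numeric(df$value)
```

With the columns flattened this way, a sum of two of them is a plain vector and carries no inner column name.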
