
R - Warning: "argument is not an atomic vector" when attempting to remove whitespace

I'm at the final stage of tidying my data before analysis and have run into an issue I don't really understand when removing whitespace from the data frame. The complete code, with a description of each step, is below.

I started from the question "How to remove all whitespace from a string?" and have tried to troubleshoot using other pages about errors and warnings involving atomic vectors, without luck.

At step 6 I received the following warning:

In stri_replace_all_fixed(allData, " ", "") :
  argument is not an atomic vector; coercing

And at step 7 I received the following warnings:

> #Change sold and taxed columns from character to numeric
> allData$SoldAmount <- as.numeric(allData$SoldAmount)
Warning message:
NAs introduced by coercion 
> allData$Tax <- as.numeric(allData$Tax)
Warning message:
NAs introduced by coercion

Both steps 6 and 7 seem to run, but two of the columns end up containing NA (see image).

[Image: result after whitespace is removed]

The complete code is listed below. I would love some advice on how to get steps 6 and 7 to give me columns that are free of whitespace and numeric.

#Step 1: Load needed library 
library(tidyverse) 
library(rvest) 
library(jsonlite)
library(stringi)

#Step 2: Access the URL 
url <- "https://www.forsvarsbygg.no/ListApi/ListContent/78635/SoldEstates/0/10/" 

#Step 3: Direct JSON as format of data in URL 
data <- jsonlite::fromJSON(url, flatten = TRUE) 

#Step 4: Access all items in API 
totalItems <- data$TotalNumberOfItems 

#Step 5: Summarize all data from API 
allData <- paste0('https://www.forsvarsbygg.no/ListApi/ListContent/78635/SoldEstates/0/', totalItems,'/') %>% 
  jsonlite::fromJSON(., flatten = TRUE) %>% 
  .[1] %>% 
  as.data.frame() %>% 
  rename_with(~str_replace(., "ListItems.", ""), everything())

#Step 6: remove columns not needed
allData <- allData[, -c(1,4,8,9,11,12,13,14,15)]

#Step 6: remove whitespace in all columns
stri_replace_all_fixed(allData, " ", "")

#Step 7: Change sold and taxed columns from character to numeric
allData$SoldAmount <- as.numeric(allData$SoldAmount)
allData$Tax <- as.numeric(allData$Tax)

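You call stri_replace_all_fixed(allData, " ", "") but discard its output, so allData is never changed and the spaces are still there when step 7 runs. Saving the output would not help either: allData is a data frame (a list of columns), not an atomic vector, so stringi coerces the whole thing to a single character vector, which is what the warning is telling you. Apply the replacement column by column instead and assign the result back.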
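To see why the warning appears, here is a minimal sketch on a made-up two-column data frame (toy is hypothetical, not the real API result). It uses base R's as.character() to illustrate roughly what the coercion does; I am assuming stringi's internal coercion behaves similarly.

```r
# Hypothetical toy data standing in for allData (not the real API result)
toy <- data.frame(
  Tax  = c(" 2 400 000", " 61 950"),
  Type = c("Bolig", "Tomt"),
  stringsAsFactors = FALSE
)

# A data frame is a list of columns, so it is not an atomic vector;
# each individual column is.
is.atomic(toy)       # FALSE
is.atomic(toy$Tax)   # TRUE

# Coercing the whole data frame to character yields one deparsed
# string per column rather than one string per cell.
flat <- as.character(toy)
length(flat)         # 2

# Applying the replacement column by column keeps the table shape.
toy[] <- lapply(toy, gsub, pattern = " ", replacement = "", fixed = TRUE)
toy$Tax              # "2400000" "61950"
```

The toy[] <- lapply(...) form assigns into the existing data frame, so the column names and the data frame class are preserved.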

#Step 6: remove whitespace in all columns
allData[] <- lapply(allData, gsub, pattern = " ", replacement = "")

#Step 7: Change sold and taxed columns from character to numeric
allData$SoldAmount <- as.numeric(allData$SoldAmount)
allData$Tax <- as.numeric(allData$Tax)
head(allData)
#     County Municipality      Tax SoldAmount           Type Date
# 1 Akershus        FROGN  2400000    2550000          Bolig 2004
# 2 Akershus        FROGN  2225000    2100000          Bolig 2004
# 3 Akershus          SKI  7600000   18000000    Næringstomt 2006
# 4  Østfold    SARPSBORG  3000000    3815000           Tomt 2004
# 5  Østfold        RYGGE 10000000   16000000 Næringseiendom 2006
# 6 Vestfold       LARVIK    61950      61950           Tomt 2013
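The "NAs introduced by coercion" warning at step 7 follows from the same cause: the strings still contained spaces between the digit groups when as.numeric() saw them. A small sketch, using made-up values in the same format as the Tax column (raw, bad and good are illustrative names):

```r
# Hypothetical values in the same format as the Tax column
raw <- c(" 2 400 000", " 61 950")

# Spaces between the digit groups make the strings unparseable, so
# as.numeric() returns NA (normally with the coercion warning,
# suppressed here to keep the output clean).
bad <- suppressWarnings(as.numeric(raw))
bad                  # NA NA

# With the spaces removed first, the conversion succeeds.
good <- as.numeric(gsub(" ", "", raw, fixed = TRUE))
good                 # 2400000 61950
```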

Alternatively, do it once, and only to the columns you need:

# allData <- paste0(...) %>% ...
allData <- allData[, -c(1,4,8,9,11,12,13,14,15)]
allData[c("Tax", "SoldAmount")] <- lapply(allData[c("Tax", "SoldAmount")], function(z) as.numeric(gsub(" ", "", z)))
head(allData)
#     County Municipality      Tax SoldAmount           Type Date
# 1 Akershus        FROGN  2400000    2550000          Bolig 2004
# 2 Akershus        FROGN  2225000    2100000          Bolig 2004
# 3 Akershus          SKI  7600000   18000000    Næringstomt 2006
# 4  Østfold    SARPSBORG  3000000    3815000           Tomt 2004
# 5  Østfold        RYGGE 10000000   16000000 Næringseiendom 2006
# 6 Vestfold       LARVIK    61950      61950           Tomt 2013

Restricting the replacement to those two columns matters, because several values in other columns legitimately contain spaces, and I doubt you intended to squash those as well:

str(sapply(allData, function(z) unique(grep(" ", z, value = TRUE)), simplify = FALSE))
# List of 6
#  $ County      : chr [1:2] "Møre og Romsdal" "Sogn- og fjordane"
#  $ Municipality: chr [1:4] "EVJE OG HORNNES" "VESTRE TOTEN" "ØSTRE TOTEN" "NORDRE LAND"
#  $ Tax         : chr [1:414] " 2 400 000" " 2 225 000" " 7 600 000" " 3 000 000" ...
#  $ SoldAmount  : chr [1:538] " 2 550 000" " 2 100 000" " 18 000 000" " 3 815 000" ...
#  $ Type        : chr "Annen kategori"
#  $ Date        : chr(0) 
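As a rough illustration of the difference, the sketch below runs both approaches on a made-up data frame (toy, blanket, targeted and num_cols are hypothetical names; the county value is taken from the output above):

```r
# Hypothetical toy data with one text column and one numeric-looking column
toy <- data.frame(
  County = c("Sogn- og fjordane", "Akershus"),
  Tax    = c(" 2 400 000", " 61 950"),
  stringsAsFactors = FALSE
)

# Blanket replacement: every column loses its spaces, including names.
blanket <- toy
blanket[] <- lapply(blanket, gsub, pattern = " ", replacement = "", fixed = TRUE)
blanket$County       # "Sogn-ogfjordane" "Akershus"

# Targeted replacement: only the listed columns are cleaned and converted.
num_cols <- "Tax"
targeted <- toy
targeted[num_cols] <- lapply(
  targeted[num_cols],
  function(z) as.numeric(gsub(" ", "", z, fixed = TRUE))
)
targeted$County      # "Sogn- og fjordane" "Akershus"
targeted$Tax         # 2400000 61950
```

Listing the columns by name, rather than by position, also keeps the code working if the column order returned by the API changes.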
