Commas instead of points with XLConnect and readWorksheet in R

I am using the XLConnect library to read .xlsx data for panel data analysis. My problem: when I read a worksheet into a data frame, the decimal separator comes out as a comma instead of a point. I am not sure why this happens or how to fix it. I am from Europe, but I use a decimal point in Excel.

A fully reproducible example is difficult to provide, but here are the important lines:

library(XLConnect)
wb <- loadWorkbook("Bel_PANEL_DATA.xlsx")
df_price <- readWorksheet(wb, sheet = "Prices", keep = c(3, 10))
colnames(df_price) <- c("Year", "Price")

A few rows of the output:

      Year          Price
38    2000          175,1735
39    2001          196,2913
40    2002          204,3013
41    2003          251,2955
42    2004          259,8135
43    2005          265,1185
44    2006          370,9554
45    2007          367,2868
46    2008          339,0321
47    2009          348,6053

and the column is stored as character:

> typeof(df_price$Price)
[1] "character"

If I use as.numeric() on the column, every value becomes NA.

as.numeric() only accepts "." as the decimal separator, so replace the "," with a "." before converting:

df_price$Price <- as.numeric(sub(",", ".", df_price$Price))
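A minimal, self-contained sketch of that substitution, in case it helps to try it without the workbook. The vector price_chr is a hypothetical stand-in for df_price$Price, filled with three values copied from the question's output; fixed = TRUE is an optional addition that makes the comma match literally rather than as a regular expression.

```r
# Hypothetical stand-in for df_price$Price (values copied from the question)
price_chr <- c("175,1735", "196,2913", "204,3013")

# as.numeric() does not understand a decimal comma, so this yields NA
# for every element (with a coercion warning, suppressed here)
na_result <- suppressWarnings(as.numeric(price_chr))

# Replace the comma with a point, then convert
price_num <- as.numeric(sub(",", ".", price_chr, fixed = TRUE))
print(price_num)
```

Since each value contains only one comma, sub() is enough; gsub() would be needed only if there were several separators per value.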

Alternatively, with stringr you can replace the comma directly:

data <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "
Year          Price
2000          175,1735
2001          196,2913
2002          204,3013
2003          251,2955
2004          259,8135
2005          265,1185
2006          370,9554
2007          367,2868
2008          339,0321
2009          348,6053")
library(stringr)
# Swap the decimal comma for a point, then convert to numeric
data$Price <- as.numeric(str_replace_all(data$Price, fixed(","), "."))
str(data)
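
Base R may also handle this without any string replacement: type.convert() accepts a dec argument that names the decimal separator. The sketch below assumes the column arrives as character, as in the question; the data frame df and its three rows are made up for illustration from the question's output.

```r
# Hypothetical example data; in the real case this would be df_price
df <- data.frame(
  Year  = 2000:2002,
  Price = c("175,1735", "196,2913", "204,3013"),
  stringsAsFactors = FALSE
)

# Tell type.convert() that "," is the decimal separator;
# as.is = TRUE keeps non-numeric strings as character instead of factor
df$Price <- type.convert(df$Price, dec = ",", as.is = TRUE)
str(df)
```

This leaves the column unchanged as character if any value cannot be parsed as a number, so it is worth checking is.numeric(df$Price) afterwards.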
