
Reading a haven-created .dta file in Stata: how to deal with dots in variable names?

We are working in Stata with data created in R and exported with the haven package. We ran into an issue with variables that have a dot in their name. Minimal R code to replicate the problem:

library("haven")
var.1 <- c(1,2,3)
var_2 <- c(1,2,3)
test_df <- data.frame(var.1, var_2)
str(test_df)
write_dta(test_df, "D:/test_df.dta")

Now, in Stata, when I do:

use "D:\test_df.dta"
d

First problem: I get an empty dataset. Second problem: we get a variable name with a dot, which should be illegal in Stata. As a result, any command that refers to the variable directly by name, such as

drop var.1

returns an error:

factor variables and time-series operators not allowed
r(101);

What is causing such behaviour? Any solutions to this problem?

This will drop var.1 in Stata:

drop var?1

Here, as in Excel, ? is a wildcard that matches any single character (the equivalent of . in a regular expression).

Unfortunately, this will also drop var_1, if it exists.
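To see in advance which names a pattern like var?1 would catch, the single-character wildcard can be imitated on the R side. This is only a sketch: it uses glob2rx from R's utils package to translate the glob into a regular expression, and it assumes that Stata's ? behaves like an ordinary glob ? (exactly one character). The candidate names are made up for illustration.

```r
# Illustrative variable names; only var.1 comes from the question.
candidates <- c("var.1", "var_1", "var_2", "var11", "var.10")

# glob2rx() turns a glob pattern into an anchored regular expression.
pattern <- glob2rx("var?1")

# Names that a one-character wildcard would match.
matched <- candidates[grepl(pattern, candidates)]
print(matched)
```

Any name that shows up here besides var.1 would be dropped too, so it is worth checking before running the drop in Stata.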

I am not sure what causes the missing values when writing a .dta file with haven. I am able to replicate this result with Stata 14.1 and haven 0.2.0. However, reading the file back into R with the read_dta function from haven,

temp2 <- read_dta("test_df.dta")

returns the data.frame. As an alternative to haven, I have used the readstata13 package in the past without issues.

library(readstata13)
save.dta13(test_df, "testdf.dta")

While this code has the same variable name issue, it produced a .dta file that contained the correct values when read into Stata 14.1. save.dta13 has a convert.underscore argument that is intended to remove characters that are not valid in Stata variable names. I verified that it works properly for this example in readstata13 version 0.8.5, but it had a bug in some earlier versions, including 0.8.2.
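Another option that does not depend on either package is to rename the columns in R before exporting. Below is a minimal sketch in base R. to_stata_names is a hypothetical helper, not part of haven or readstata13, and it assumes the usual Stata naming rules: letters, digits and underscores only, no leading digit, at most 32 characters.

```r
var.1 <- c(1, 2, 3)
var_2 <- c(1, 2, 3)
test_df <- data.frame(var.1, var_2)

# Hypothetical helper: make column names legal for Stata.
to_stata_names <- function(x) {
  # Replace every character that is not a letter, digit or underscore.
  x <- gsub("[^A-Za-z0-9_]", "_", x)
  # A name must not start with a digit, so prefix those with an underscore.
  starts_with_digit <- grepl("^[0-9]", x)
  x[starts_with_digit] <- paste0("_", x[starts_with_digit])
  # Refuse to continue rather than silently merge or truncate names.
  if (anyDuplicated(x) > 0 || any(nchar(x) > 32)) {
    stop("sanitized names are duplicated or longer than 32 characters")
  }
  x
}

names(test_df) <- to_stata_names(names(test_df))
print(names(test_df))

# The renamed data frame can then be exported as before, for example:
# haven::write_dta(test_df, "test_df.dta")
```

The helper stops with an error when two names collapse into one (a data frame with both a.b and a_b, say), because picking a new name automatically could hide the clash. This only addresses the variable names, not the empty dataset.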

