
Error when reading SPSS (.sav) file in R: file is not in any supported SPSS format

Trying to read an SPSS file (.sav format) in R raises:

Error: file is not in any supported SPSS format.

This happens when reading the .sav file with read.spss from the foreign package. Trying the memisc package with as.data.set(spss.system.file("my_file")) raises:

Error in spss.readheader(file): not a sysfile

The file is a very large SPSS file with over 2 million entries and hundreds of factors. The factors vary: many are categorical ("Yes" / "No" / "Missing" / "None"), some are numerical (IDs etc.), some are labelled with text ("State One" / "State 2" / "State 3"), and some are mixed ("1" / "20" / "3732" / "Technical Problem"). Sadly, I can't share a subset of my data (severe privacy restrictions, and I don't have an SPSS license).

Reading this file and storing it as a feather file (.fea format) has already worked on another computer, which may have had a different version of R installed. I have no way of checking which version that was, though. Currently, I'm working in R version 3.4.4 (2018-03-15) on Windows 10, with packages memisc_0.99.17.2 and foreign_0.8-71. The file is stored on a server; my R is installed in a user directory on the local drive.

This is the code I've tried:

library(foreign)  # provides read.spss()
ws <- "my_workspace_in_local_user"
setwd(ws)
dataDir <- "my_directory_on_the_server_containing_the_file"
# paste0() adds no separator, so dataDir must end with a path separator
fn <- paste0(dataDir, "my_file.sav")
dat <- read.spss(fn, to.data.frame = TRUE)

and

library(foreign)
ws <- "my_workspace_in_local_user"
setwd(ws)
dataDir <- "my_directory_on_the_server_containing_the_file"
fn <- paste0(dataDir, "my_file.sav")
install.packages("memisc")
library(memisc)
# open the system file via memisc's importer, as in the call quoted above
dat <- as.data.set(spss.system.file(fn))

Does anybody have an idea why this wouldn't work? I suspect it's a problem with the versions of R and the packages in use...?
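One thing worth ruling out before blaming versions is the path itself: paste0() joins its arguments with no separator, whereas file.path() inserts "/" between them. The following is only a sketch using the placeholder names from the code above; the real directory and file name would need to be substituted, and file.exists() will of course return FALSE for the placeholders.

```r
# Placeholder values copied from the question; substitute the real locations.
dataDir <- "my_directory_on_the_server_containing_the_file"

fn_pasted <- paste0(dataDir, "my_file.sav")     # no separator is inserted
fn_joined <- file.path(dataDir, "my_file.sav")  # "/" is inserted between parts

print(fn_pasted)
print(fn_joined)

# Confirm that R can actually see the file at the resulting path.
print(file.exists(fn_joined))
```

If the directory string already ends with a separator, both forms should point at the same file, and the path can be excluded as a cause.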

Your first set of code worked for me on macOS 10.15.1 (Catalina) and R 3.6.1 with memisc_0.99.17.2 and foreign_0.8-71.


R version 3.6.1 (2019-07-05) -- "Action of the Toes"
Copyright (C) 2019 The R Foundation for Statistical Computing
Platform: x86_64-apple-darwin15.6.0 (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

[R.app GUI 1.70 (7684) x86_64-apple-darwin15.6.0]


> require(foreign)
Loading required package: foreign
> dataDir <- "~/Samples/English/"
> fn <- paste0(dataDir, "accidents.sav")
> dat <- read.spss(fn, to.data.frame = TRUE)
> print(dat)
    agecat gender accid    pop
1 Under 21 Female 57997 198522
2    21-25 Female 57113 203200
3    26-30 Female 54123 200744
4 Under 21   Male 63936 187791
5    21-25   Male 64835 195714
6    26-30   Male 66804 208239

The "accidents.sav" file is an example data file that ships with IBM SPSS Statistics versions 19.0 through 26.0.

If this code works for you against known data from IBM SPSS, then you can probably rule out your R version and configuration as a cause. Unfortunately, that probably means your *.sav file is corrupted in some way.
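A quick way to test the corruption theory without an SPSS license is to look at the first four bytes of the file. An SPSS system file starts with the signature "$FL2", and a ZLIB-compressed one (.zsav) starts with "$FL3", which as far as I know read.spss does not handle. The helper below, sav_signature, is a hypothetical name of my own and not part of foreign or memisc; treat it as a rough sketch, demonstrated here on temporary files rather than on real data.

```r
# Hypothetical helper (not part of foreign or memisc): classify a file by
# the four-byte signature at its start.
sav_signature <- function(path) {
  con <- file(path, open = "rb")
  on.exit(close(con))
  magic <- readBin(con, what = "raw", n = 4L)
  if (identical(magic, charToRaw("$FL2"))) {
    "sav"      # ordinary SPSS system file
  } else if (identical(magic, charToRaw("$FL3"))) {
    "zsav"     # ZLIB-compressed SPSS system file
  } else {
    "unknown"  # not an SPSS system file, or a damaged header
  }
}

# Self-contained demonstration on two temporary files.
demo_sav <- tempfile(fileext = ".sav")
writeBin(charToRaw("$FL2@(#) header"), demo_sav)
demo_txt <- tempfile(fileext = ".sav")
writeBin(charToRaw("not an SPSS file"), demo_txt)

print(sav_signature(demo_sav))
print(sav_signature(demo_txt))
```

Running sav_signature(fn) on the real file and getting "unknown" would suggest the file was damaged or is not a system file at all (for example, an incomplete copy from the server). Comparing file.size(fn) with the size of the original is another cheap check.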
