I am trying to read data from the PISA 2012 study ( http://pisa2012.acer.edu.au/downloads.php ) into R using the read.table function. This is the code I tried:
pisa <- read.table("pisa2012.txt", sep = "")
Unfortunately, I keep getting the following error message:
Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,
: line 2 did not have 184 elements
I have tried to set
header = T
but then I get the following error message:
Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,
:line 1 did not have 184 elements
Lastly, this is what the .txt file looks like ...
http://postimg.org/image/4u9lqtxqd/
Thanks for your help!
You can see from the first line that you'll need some sort of control file to delimit the individual variables. From working with PISA in other environments, I know the first three columns correspond to the ISO three-letter country code (e.g., ALB). What follows are numbers and letters that only make sense once they are separated into individual variables. You could use the codebook for this ( https://pisa2012.acer.edu.au/downloads/M_stu_codebook.pdf ), but doing that by hand for every single variable is a real bear. Why not download it in SPSS or SAS and import it from there? Not a 'slick' solution, but without a control file you'd have a lot of manual work to do.
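To illustrate what "delimiting the individual variables" means for a file without separators, here is a minimal sketch in base R. The sample lines, widths, and column names are made up for illustration and are not the real PISA layout; only the leading three-letter country code follows the description above.

```r
# Hypothetical sample lines, NOT real PISA records; only the leading
# 3-letter country code mirrors the layout described above.
sample_lines <- c("ALB0000100001", "ALB0000100002", "ARG0000200001")

# Write them to a temporary file so the example is self-contained.
tmp <- tempfile(fileext = ".txt")
writeLines(sample_lines, tmp)

# read.fwf() splits each line by character widths instead of a separator.
# The widths 3, 5, 5 and the column names are assumptions for this toy file.
toy <- read.fwf(tmp, widths = c(3, 5, 5),
                col.names = c("country", "school", "student"),
                colClasses = "character")
print(toy)
```

read.table fails on such a file because there is no separator to split on; a fixed-width reader needs the width of every variable, which is exactly what a control file provides.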
I just read the files using the readr package. You will need: the readr package, the SAScii package, the TXT data file, and the associated SAS file.
So, let's say you want to read the student files. Then you will need the following files: INT_STU12_DEC03.txt and INT_STU12_DEC03.sas.
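##################### READING STUDENT DATA ###################
library(readr)   # read_fwf(), fwf_widths()
library(SAScii)  # parse.SAScii()
## Loading the dictionary (use the name of the .sas file you downloaded)
dic_student <- parse.SAScii(sas_ri = 'INT_STU12_SAS.sas')
## Creating the positions to read_fwf
student <- read_fwf(file = 'INT_STU12_DEC03.txt', col_positions = fwf_widths(dic_student$width), progress = TRUE)
## Naming the columns after the variables in the dictionary
colnames(student) <- dic_student$varname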
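One caveat, as a hedged sketch: parse.SAScii() marks gaps between fields with a negative width (and an NA variable name), and fwf_widths() expects plain widths. Whether the PISA control file contains such gaps is not confirmed here. The toy dictionary below is hypothetical; it only shows how widths could be turned into start and end positions while skipping gaps.

```r
# Hypothetical dictionary in the shape returned by parse.SAScii():
# one row per field, negative width = columns to skip (varname is NA).
dic <- data.frame(
  varname = c("CNT", NA, "SCHOOLID"),
  width   = c(3, -2, 7),
  stringsAsFactors = FALSE
)

# The end position of every field is the running sum of absolute widths;
# the start position is the end minus the width plus one.
end   <- cumsum(abs(dic$width))
start <- end - abs(dic$width) + 1

# Keep only the real variables (drop the gap rows).
keep <- dic$width > 0
positions <- data.frame(
  varname = dic$varname[keep],
  start   = start[keep],
  end     = end[keep],
  stringsAsFactors = FALSE
)
print(positions)
```

If gaps are present, these positions could be passed to readr's fwf_positions(positions$start, positions$end, positions$varname) in place of fwf_widths().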
OBS 1: As I'm using Linux, I needed to delete the first lines from the SAS file and change its encoding to UTF-8.
OBS 2: The deleted lines were:
libname M_DEC03 "C:\XXX";
filename STU "C:\XXX\INT_STU12_DEC03.txt";
options nofmterr;
OBS 3: The dataset takes about 1 GB, so you will need enough RAM.
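The clean-up described in OBS 1 and OBS 2 can also be done from R rather than by hand. This is only a sketch: the toy control file below is made up (apart from the three lines quoted in OBS 2), and "latin1" as the original encoding is an assumption, so check the real file before relying on it.

```r
# Toy stand-in for the SAS control file; the first three lines are the
# ones quoted in OBS 2, the remaining lines are made up for illustration.
sas_lines <- c(
  'libname M_DEC03 "C:\\XXX";',
  'filename STU "C:\\XXX\\INT_STU12_DEC03.txt";',
  'options nofmterr;',
  'data M_DEC03.stu;',
  'infile STU;',
  'input CNT $ 1-3;',
  'run;'
)

# Drop the libname / filename / options lines.
drop <- grepl("^\\s*(libname|filename|options)\\b", sas_lines,
              ignore.case = TRUE, perl = TRUE)
cleaned <- sas_lines[!drop]

# Re-encode to UTF-8. "latin1" as the source encoding is an assumption.
cleaned <- iconv(cleaned, from = "latin1", to = "UTF-8")

# Write the cleaned copy to a temporary file for this example.
out <- tempfile(fileext = ".sas")
writeLines(cleaned, out, useBytes = TRUE)
```

For the real file, sas_lines would come from readLines() on the downloaded .sas file, and the cleaned copy would be the one handed to parse.SAScii().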