I have a large ASCII file that looks something like this if you open it in a text editor:
11112223423 4434 555534 5533 54534 5354 5532 434 4 43434 23424234 34 4534 34453 345345345 345344 344 43423453453 43444 99098 234090 4354550 345399 43453 9900 4
I have been given a mapping of the columns. For example, the first variable sits in columns 1-9, the second variable sits in columns 104-105, and so on.
Is there an easy way to read this type of data into R so that I end up with a data.frame?
Thanks for the help!
I've used the standard read.fwf() for this kind of thing.
I also like read_fwf() from the readr package. For example:
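library(readr)
#create some dummy fixed-width-field data (note the two spaces after "line1" and "line2", which pad the first field to 7 characters)
fixed_width_data <- "line1  field1 datafield2 dataetc\nline2  field1 datafield2 dataetc\n"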
#specify the data columns
field_info <- fwf_widths(c(7, 11, 11, 3), col_names=c("line_number", "field1", "field2", "fieldn"))
#read it in
parsed <- read_fwf(fixed_width_data, field_info)
To specify start/end positions for the columns of data, you can use fwf_positions() instead of fwf_widths():
#create some dummy fixed-width-field data
fixed_width_data2 <- "line1  field1 datafield2 dataTEXT TO SKIPetc\nline2  field1 datafield2 dataTEXT TO SKIPetc\n"
#specify the data columns using start and end positions
field_info2 <- fwf_positions(start=c(1, 8, 19, 42), end=c(5, 18, 29, 44), col_names=c("line_number", "field1", "field2", "fieldn"))
#read it in
parsed2 <- read_fwf(fixed_width_data2, field_info2)
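Applied to the layout in the question (columns 1-9 and 104-105), the same approach might look like the sketch below. The sample line is synthetic, and the column names var1 and var2 are placeholders I made up, so substitute the names from your own mapping. I read both fields as character here to avoid guessing at the real types.

```r
library(readr)

# Synthetic 105-character line, for illustration only:
# 9 digits, 94 filler characters, then 2 digits.
sample_line <- paste0("111122234", strrep("x", 94), "42", "\n")

# Start/end positions taken from the question; names are placeholders.
layout <- fwf_positions(start = c(1, 104), end = c(9, 105),
                        col_names = c("var1", "var2"))

# Everything outside the listed positions is ignored.
# col_types = "cc" reads both fields as character.
parsed3 <- read_fwf(sample_line, layout, col_types = "cc")

# read_fwf() returns a tibble; convert it if you need a plain data.frame.
parsed3_df <- as.data.frame(parsed3)
```

For a real file, pass the file path as the first argument instead of the literal string.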
You can do this in base R using read.fwf() (fixed-width fields). I wrote a file containing your single line of input and got:
FullFile = read.fwf("Test.txt", widths=c(9,94,2))
Interesting = FullFile[,c(1,3)]
Interesting
V1 V3
1 111122234 42
Note that I am reading the columns to skip into a variable and then just discarding that variable.
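As an alternative to reading and then discarding the filler, read.fwf() also accepts negative widths, which skip that many characters. The sketch below assumes your mapping is a sorted, non-overlapping list of start/end positions. positions_to_widths() is a hypothetical helper written for this example (not part of base R), and the sample line and column names are made up.

```r
# Hypothetical helper: turn sorted, non-overlapping start/end positions
# into a widths vector for read.fwf(), using negative widths for gaps.
positions_to_widths <- function(start, end) {
  stopifnot(length(start) == length(end), all(end >= start))
  # Number of characters to skip before each field.
  gaps <- start - c(0, head(end, -1)) - 1
  stopifnot(all(gaps >= 0))
  # Interleave (skip, field width) pairs, then drop zero-length skips.
  widths <- as.vector(rbind(-gaps, end - start + 1))
  widths[widths != 0]
}

widths <- positions_to_widths(start = c(1, 104), end = c(9, 105))
print(widths)

# Synthetic 105-character line, for illustration only.
sample_line <- paste0("111122234", strrep("x", 94), "42")
tmp <- tempfile(fileext = ".txt")
writeLines(sample_line, tmp)

# Only the two wanted fields end up in the data frame.
wanted <- read.fwf(tmp, widths = widths, col.names = c("var1", "var2"))
unlink(tmp)
print(wanted)
```

With many variables this keeps the skipped columns out of the data frame entirely, so there is nothing to drop afterwards.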