
R write.table to txt with specified leading 0s

I have a data frame that I need to export as a text file, with some variables written at a fixed width. A simplified example (the actual data frame has around 300 columns):

col1 <-  c(1,2,3,4,5)
col2 <-  c(1,6,7,10,1)

df <- data.frame(col1,col2)
write.table(df, file = "dataset.txt", sep = "", row.names = FALSE, col.names = FALSE)

What I get is:

11
26
37
410
51

But what I need is:

101
206
307
410
501

Because some variables can have a width of 2, I need to add leading zeros. In other languages, such as SPSS, you can do something like this:

WRITE OUTFILE=dataset.txt /
col1                (N1)
col2                (N2).

Is there something like this for R? Thx!

A simple approach is to add the leading zeros before exporting the data frame (the column becomes a character vector, so pass quote = FALSE to write.table() when writing it):

df$col2 <- sprintf("%02d", df$col2)
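With around 300 columns, the same idea can be driven by a table of widths rather than one line per column. The following is a minimal sketch under stated assumptions: the named vector `widths` is hypothetical (it is not part of the question), formatC() is used as a base R alternative to sprintf(), and the file is written to a temporary path.

```r
# Toy data from the question
df <- data.frame(col1 = c(1, 2, 3, 4, 5), col2 = c(1, 6, 7, 10, 1))

# Hypothetical width specification: one entry per column to be zero-padded
widths <- c(col1 = 1, col2 = 2)

# Zero-pad each listed column to its width; formatC() returns character vectors
padded <- df
for (nm in names(widths)) {
  padded[[nm]] <- formatC(df[[nm]], width = widths[[nm]], format = "d", flag = "0")
}

out_file <- tempfile(fileext = ".txt")
# quote = FALSE keeps write.table() from wrapping the padded strings in quotes
write.table(padded, file = out_file, sep = "", quote = FALSE,
            row.names = FALSE, col.names = FALSE)
readLines(out_file)
```

Keeping the widths in one named vector means the specification could also be read from an external file, much like the SPSS format list in the question.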

Here is a solution that combines lapply() with gdata::write.fwf() to write a fixed record file.

We will use the mtcars data, convert the row names to a column in the data frame, format the columns with sprintf(), and write them to an output file that can be read with utils::read.fwf() or another program that reads fixed record files.

data <- cbind(car = rownames(mtcars),mtcars)
fmtList <- c("%20s","%03.1f","%02d","%05.1f","%04d","%04.2f",
             "%06.3f","%05.2f","%02d","%02d","%02d","%02d")
# apply the i-th format to the i-th column; lapply() returns a list of character vectors
result <- lapply(1:12,function(x,y,z) {
     sprintf(z[x],y[[x]])
},data,fmtList)

output <- do.call(cbind,result)
library(gdata)
write.fwf(output,'./data/output.dat',
          rownames = FALSE,colnames = FALSE,
          formatInfo = TRUE)

With formatInfo = TRUE, write.fwf() produces a format listing that gives the start position and width of each variable in the output file.

> write.fwf(output,'./data/output.dat',
+           rownames = FALSE,colnames = FALSE,
+           formatInfo = TRUE)
   colname nlevels position width digits exp
1       V1      32        1    20      0   0
2       V2      25       22     4      0   0
3       V3       3       27     2      0   0
4       V4      27       30     5      0   0
5       V5      22       36     4      0   0
6       V6      22       41     4      0   0
7       V7      29       46     6      0   0
8       V8      30       53     5      0   0
9       V9       2       59     2      0   0
10     V10       2       62     2      0   0
11     V11       3       65     2      0   0
12     V12       6       68     2      0   0
>

...and the first 10 rows of the resulting output file:

           Mazda RX4 21.0 06 160.0 0110 3.90 02.620 16.46 00 01 04 04
       Mazda RX4 Wag 21.0 06 160.0 0110 3.90 02.875 17.02 00 01 04 04
          Datsun 710 22.8 04 108.0 0093 3.85 02.320 18.61 01 01 04 01
      Hornet 4 Drive 21.4 06 258.0 0110 3.08 03.215 19.44 01 00 03 01
   Hornet Sportabout 18.7 08 360.0 0175 3.15 03.440 17.02 00 00 03 02
             Valiant 18.1 06 225.0 0105 2.76 03.460 20.22 01 00 03 01
          Duster 360 14.3 08 360.0 0245 3.21 03.570 15.84 00 00 03 04
           Merc 240D 24.4 04 146.7 0062 3.69 03.190 20.00 01 00 04 02
            Merc 230 22.8 04 140.8 0095 3.92 03.150 22.90 01 00 04 02
            Merc 280 19.2 06 167.6 0123 3.92 03.440 18.30 01 00 04 04

To eliminate the spaces separating the columns, pass sep = "" to write.fwf().

write.fwf(output,'./data/output.dat',
          rownames = FALSE,colnames = FALSE,
          formatInfo = TRUE,sep = "")

...and the first 10 rows of the modified output file:

           Mazda RX421.006160.001103.9002.62016.4600010404
       Mazda RX4 Wag21.006160.001103.9002.87517.0200010404
          Datsun 71022.804108.000933.8502.32018.6101010401
      Hornet 4 Drive21.406258.001103.0803.21519.4401000301
   Hornet Sportabout18.708360.001753.1503.44017.0200000302
             Valiant18.106225.001052.7603.46020.2201000301
          Duster 36014.308360.002453.2103.57015.8400000304
           Merc 240D24.404146.700623.6903.19020.0001000402
            Merc 23022.804140.800953.9203.15022.9001000402
            Merc 28019.206167.601233.9203.44018.3001000404
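The round trip through read.fwf() mentioned above can be sketched with base R only. This is an illustrative reduction, not the answer's exact code: it formats just three mtcars columns, builds the records with paste0() and writeLines() instead of gdata::write.fwf(), and writes to a temporary file. The names `fmt`, `records`, and `back` are chosen here for illustration.

```r
# Fixed-width formats for three mtcars columns (widths 4, 2 and 4)
fmt <- c(mpg = "%04.1f", cyl = "%02d", hp = "%04d")
cols <- lapply(names(fmt), function(nm) sprintf(fmt[[nm]], mtcars[[nm]]))

# Concatenate the formatted columns row by row, with no separator
records <- do.call(paste0, cols)

out_file <- tempfile(fileext = ".dat")
writeLines(records, out_file)

# Read the file back; the widths must match the formats used above
back <- read.fwf(out_file, widths = c(4, 2, 4), col.names = names(fmt))
head(back, 3)
```

Because every record has the same length, the reader needs only the column widths, which is the information the format listing from write.fwf() provides.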

You can use str_pad() from stringr:

library(stringr)
df$col3 <- str_c(df$col1, str_pad(df$col2, 2, pad = "0"))
df

#  col1 col2 col3
#1    1    1  101
#2    2    6  206
#3    3    7  307
#4    4   10  410
#5    5    1  501
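Since col3 already holds one complete record per row, it can be written directly with writeLines(). A short sketch, assuming the stringr package is installed and using a temporary file in place of the real output path:

```r
library(stringr)

df <- data.frame(col1 = c(1, 2, 3, 4, 5), col2 = c(1, 6, 7, 10, 1))

# Left-pad col2 to width 2 with zeros, then glue it onto col1
df$col3 <- str_c(df$col1, str_pad(df$col2, 2, pad = "0"))

out_file <- tempfile(fileext = ".txt")
# One record per line, no quoting and no separators
writeLines(df$col3, out_file)
readLines(out_file)
```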
