I have a series of very big single-lined files of space separated values. It looks like
0.993194 0.9684194 0.846847658 1.0 1.0 1.0 1.0 0.78499 0.54879564 0.9998545 ...
I would like to copy the first n elements of each file.
I could convert the spaces into newlines ( cat file.txt | tr ' ' '\n' > file2.txt
) and then read it line by line, saving each line to a new file ( head -n $n file2.txt | while read line; do echo $line >> file3.txt; done
), but that would be very slow. (Code above is untested.)
How can I efficiently copy the first n values of a single-lined file?
Note: I am fine with copying the first n characters even if this corresponds to an undefined number of values.
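Since the question allows copying the first n characters, head -c is a direct option. A minimal sketch (the file name sample.txt, its contents, and the byte count 20 are assumptions for illustration):

```shell
# Create a small sample file (hypothetical data for illustration).
printf '0.993194 0.9684194 0.846847658 1.0 1.0 1.0 1.0 0.78499\n' > sample.txt

# Copy the first 20 bytes; head -c never reads past that point,
# so this is fast even on a huge single-line file. Note it may
# split the last value in half, which the question allows.
head -c 20 sample.txt > first20.txt
cat first20.txt   # -> 0.993194 0.9684194 0
```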
How about just using awk
, specifying the number of fields you want?
awk -v n=5 '{for(i=1;i<=n;i++) print $i}' file
0.993194
0.9684194
0.846847658
1.0
1.0
(or) to print in the same line using printf
awk -v n=5 '{for(i=1;i<=n;i++) printf "%s ",$i}' file
0.993194 0.9684194 0.846847658 1.0 1.0
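Because the file is one huge line, the loop above still makes awk load the entire line into memory as a single record. A variant that avoids this (a sketch; sample.txt is an assumed input name) sets the record separator to a space so awk streams one value at a time and can exit early:

```shell
# Create a small sample file (hypothetical data for illustration).
printf '0.993194 0.9684194 0.846847658 1.0 1.0 1.0 1.0 0.78499\n' > sample.txt

# Treat each space-separated value as its own record (RS=' '),
# print the first n records, then stop reading the file.
awk -v n=3 -v RS=' ' 'NR > n { exit } { print }' sample.txt
```

This prints the first n values one per line without ever holding the whole line in memory.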
(or) using cut
with POSIX-compliant options: -d
to set the delimiter and -f 1-5
to select fields 1 through 5.
cut -d' ' -f 1-5 file
0.993194 0.9684194 0.846847658 1.0 1.0
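To use a variable count rather than a hard-coded 1-5, the field range can be built from a shell variable (a sketch; sample.txt and n=5 are assumptions for illustration):

```shell
# Create a small sample file (hypothetical data for illustration).
printf '0.993194 0.9684194 0.846847658 1.0 1.0 1.0 1.0 0.78499\n' > sample.txt

n=5
# Select fields 1 through n, space-delimited.
cut -d' ' -f 1-"$n" sample.txt   # -> 0.993194 0.9684194 0.846847658 1.0 1.0
```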
I'd use a carefully designed regex with egrep
(or the equivalent grep -E), with the -o
flag so it prints only the part of the line that matches:
egrep -e '^([0-9.]+[ ]*){3}' -o file.txt
Prints out:
0.993194 0.9684194 0.846847658
As grep is a well-known and heavily optimized tool, this performs well; I just tried it on a 3-megabyte text file and it didn't take significantly longer than it took on a 30-byte text file.