sed: how to delete the first 17 lines and the last 8 lines of a file
I have a big 150GB CSV file and I would like to remove the first 17 lines and the last 8 lines. I have tried the following, but it does not seem to work correctly:
sed -i -n -e :a -e '1,8!{P;N;D;};N;ba'
and
sed -i '1,17d'
I wonder if someone can help with sed or awk? A one-liner would be great.
head and tail are better suited to the job than sed or awk:
tail -n+18 file | head -n-8 > newfile
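To check the line arithmetic before running this on the real 150GB file, a small sketch on generated data can help. It assumes GNU head (a negative count for -n is a GNU extension) and seq; the function name trim_lines and its argument order are made up for this example.

```shell
# trim_lines FILE FRONT TAIL: print FILE without its first FRONT and last TAIL lines.
# (trim_lines is an illustrative name; it needs GNU head for the negative count.)
trim_lines() {
    # tail -n +K starts output at line K, so skipping FRONT lines means K = FRONT + 1
    tail -n "+$(( $2 + 1 ))" "$1" | head -n "-$3"
}

demo=$(mktemp)
seq 1 30 > "$demo"          # 30 numbered lines
trim_lines "$demo" 17 8     # prints 18 through 22
rm -f "$demo"
```

Both tools stream the data, so memory use stays small regardless of file size.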
awk -v nr="$(wc -l < file)" 'NR>17 && NR<=(nr-8)' file
All awk:
awk 'NR>y+x{print A[NR%y]} {A[NR%y]=$0}' x=17 y=8 file
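A commented sketch of the all-awk version, which reads the file once and keeps only the last y lines in memory. The wrapper name trim_awk is made up for this example, and it assumes y is at least 1 (NR%y would divide by zero otherwise).

```shell
# trim_awk FRONT TAIL FILE: one pass over FILE with a ring buffer of TAIL lines.
# (trim_awk is an illustrative name; TAIL must be >= 1.)
trim_awk() {
    awk -v x="$1" -v y="$2" '
        # buf[NR % y] still holds line NR-y here; once NR-y > x that line is
        # past the header, and since line NR exists it is not among the last y.
        NR > x + y { print buf[NR % y] }
        # Overwrite the slot with the current line.
        { buf[NR % y] = $0 }
    ' "$3"
}
```

Called as trim_awk 17 8 file, it never prints the last 8 lines because they are still sitting in the buffer when the input ends.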
Try this:
sed '{[/]<n>|<string>|<regex>[/]}d' <fileName>
sed '{[/]<addr1>[,<addr2>][/]}d' <fileName>
where
/.../ = delimiters
n = line number
string = string found in the line
regex = regular expression corresponding to the searched pattern
addr = address of a line (number or pattern)
d = delete
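A few concrete instances of these address forms may make the template easier to read. The four-line sample input and the helper name sample are made up for illustration.

```shell
# Sample input: four lines, a to d (illustrative helper).
sample() { printf 'a\nb\nc\nd\n'; }

sample | sed '2d'      # line number: deletes line 2 (b)
sample | sed '/c/d'    # regex: deletes every line matching c
sample | sed '2,3d'    # range: deletes lines 2 through 3 (b and c)
sample | sed '1,17d'   # the case in the question: deletes the first 17 lines (here, all of them)
```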
LENGTH=`wc -l < file`
head -n $((LENGTH-8)) file | tail -n $((LENGTH-17)) > newfile
Edit: As mtk posted in a comment, this won't work. Only LENGTH-8 lines reach tail, so it has to keep LENGTH-8-17 of them. If you want to use wc and track the file length, you should use:
LENGTH=`wc -l < file`
head -n $((LENGTH-8)) file | tail -n $((LENGTH-8-17)) > newfile    # "> file" would truncate the input
or:
LENGTH=`wc -l < file`
head -n $((LENGTH-8)) file > file.tmp    # redirecting to file itself would truncate it before head reads it
LENGTH=`wc -l < file.tmp`
tail -n $((LENGTH-17)) file.tmp > newfile
Which makes this solution less elegant than the one posted by choroba :)
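Since redirecting a pipeline back into the file it reads from truncates that file, an in-place variant has to go through a temporary file. A minimal sketch, assuming there is enough free disk space for a second copy; trim_in_place is a hypothetical helper, not something from the answers here.

```shell
# trim_in_place FILE FRONT TAIL (hypothetical helper): rewrite FILE without its
# first FRONT and last TAIL lines, going through a temporary file.
trim_in_place() {
    total=$(wc -l < "$1")
    keep=$(( total - $2 - $3 ))          # lines that survive the trim
    [ "$keep" -ge 0 ] || return 1        # file shorter than FRONT+TAIL: refuse
    tmp=$(mktemp "$1.XXXXXX") || return 1
    # head drops the last TAIL lines; tail then keeps the last $keep of what is left
    head -n "$(( total - $3 ))" "$1" | tail -n "$keep" > "$tmp" && mv "$tmp" "$1"
}
```

Note that mv replaces the original with the temporary file, so the original's ownership and permissions are not carried over.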
I learnt this today for the shell:
{
ghead -17 > /dev/null
sed -n -e :a -e '1,8!{P;N;D;};N;ba'
} < my-bigfile > subset-of
One has to use a non-consuming head (one that does not read past the lines it prints), hence the use of ghead from the GNU coreutils.
Similar to Thor's answer, but a bit shorter:
sed -i '' -e $'1,17d;:a\nN;19,25ba\nP;D' file.txt
The -i '' tells sed to edit the file in place. (The syntax may be a bit different on your system. Check the man page.)
If you want to delete front lines from the front and tail lines from the end, you'd have to use the following numbers:
1,{front}d;:a\nN;{front+2},{front+tail}ba\nP;D
(I put them in curly braces here, but that's just pseudocode. You'll have to replace them by the actual numbers. Also, it should work with {front+1}, but it doesn't on my machine (macOS 10.12.4). I think that's a bug.)
I'll try to explain how the command works. Here's a human-readable version:
1,17d # delete lines 1 ... 17, goto start
:a # define label a
N # add next line from file to buffer, quit if at end of file
19,25ba # if line number is 19 ... 25, goto start (label a)
P # print first line in buffer
D # delete first line from buffer, go back to start
First we skip 17 lines. That's easy. The rest is tricky, but basically we keep a buffer of eight lines. We only start printing lines when the buffer is full, but we stop printing when we reach the end of the file, so at the end, there are still eight lines left in the buffer that we didn't print. In other words, we deleted them.
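This explanation relies on N quitting silently at end of file, which is how the sed shipped with macOS behaves. GNU sed documents that it prints the pattern space before exiting when N is issued on the last line, so there the buffered lines would be printed after all. A minimal sketch of the same sliding-window idea that does not depend on that behaviour, using the well-known $d guard in a second sed pass; the function name drop_17_and_8 is made up for illustration.

```shell
# drop_17_and_8 FILE (illustrative name): print FILE without its first 17 and
# last 8 lines, using two sed passes.
drop_17_and_8() {
    # Pass 1 deletes the header. Pass 2 keeps a sliding window of 8 lines:
    #   $d        on the last input line, throw away the whole window
    #   N; 2,8ba  keep appending lines until the window holds 8
    #   P; D      print the oldest line, drop it, and loop without reading input
    sed '1,17d' "$1" | sed -e :a -e '$d;N;2,8ba' -e 'P;D'
}
```

Splitting the work into two passes restarts the line numbering in the second sed, which keeps its addresses independent of the header length.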