How to delete the first 17 lines and the last 8 lines of a file with sed

I have a big 150 GB CSV file and I would like to remove the first 17 lines and the last 8 lines. I have tried the following, but it doesn't seem to work correctly:

sed -i -n -e :a -e '1,8!{P;N;D;};N;ba' 

and

sed -i '1,17d' 

I wonder if someone can help with sed or awk? A one-liner would be great.

head and tail are better suited to the job than sed or awk.

tail -n+18 file | head -n-8 > newfile
awk -v nr="$(wc -l < file)" 'NR>17 && NR<=(nr-8)' file

All awk:

awk 'NR>y+x{print A[NR%y]} {A[NR%y]=$0}' x=17 y=8 file
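
The all-awk one-liner works as a ring buffer of y lines. Here is a commented sketch of the same idea wrapped in a shell function; the name drop_head_tail and the array name buf are only illustrative, and y is assumed to be at least 1 because it is used as the modulus.

```shell
# drop_head_tail X Y FILE: skip the first X lines and the last Y lines of FILE.
# Sketch only: the function name is illustrative, and Y must be at least 1
# because it is used as the modulus of the ring buffer.
drop_head_tail() {
  awk -v x="$1" -v y="$2" '
    # Line NR is about to overwrite slot NR % y, which still holds line NR - y.
    # Print that older line only once it lies past the first x lines.
    NR > x + y { print buf[NR % y] }
    # Store the current line; the last y lines are stored but never printed.
    { buf[NR % y] = $0 }
  ' "$3"
}
```

It would be called as drop_head_tail 17 8 file > newfile. Only y lines are held in memory at a time, so the size of the file should not matter.
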

Try this:

sed '{[/]<n>|<string>|<regex>[/]}d' <fileName>
sed '{[/]<addr1>[,<addr2>][/]}d' <fileName>

where

  1. /.../ = delimiters

  2. n = line number

  3. string = string found in the line

  4. regex = regular expression corresponding to the searched pattern

  5. addr = address of a line (number or pattern)

  6. d = delete

Refer to this link.
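
A few concrete instances of the templates above may help. This is only a sketch: sample.txt and its contents are made up for illustration, and each command prints to standard output without changing the file.

```shell
# Hypothetical four-line sample file, created only for this illustration.
printf '%s\n' alpha beta gamma delta > sample.txt

sed '2d' sample.txt                # delete line 2
sed '/gam/d' sample.txt            # delete every line matching the regex gam
sed '2,3d' sample.txt              # delete lines 2 through 3
sed '/beta/,/gamma/d' sample.txt   # delete from a line matching beta through the next line matching gamma
```

For the question itself, the number-range form is the relevant one: sed '1,17d' file removes the first 17 lines, but it does nothing about the last 8.
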

LENGTH=$(wc -l < file)
head -n $((LENGTH-8)) file | tail -n $((LENGTH-17)) > newfile

Edit: As mtk pointed out in a comment, this won't work. If you want to use wc and track the file length, you should use:

LENGTH=$(wc -l < file)
head -n $((LENGTH-8)) file | tail -n $((LENGTH-8-17)) > newfile

or:

LENGTH=$(wc -l < file)
head -n $((LENGTH-8)) file > tmpfile
LENGTH=$(wc -l < tmpfile)
tail -n $((LENGTH-17)) tmpfile > newfile

Which makes this solution less elegant than the one posted by choroba :)
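
Note that redirecting output to the same file a command is still reading truncates that file before anything is read, so replacing the original needs a temporary file. A minimal sketch, assuming GNU head (for the negative line count) and enough free disk space for a second copy; the function name trim_in_place is made up for illustration.

```shell
# trim_in_place FILE: rewrite FILE without its first 17 and last 8 lines.
# Sketch only: assumes GNU head and room on disk for a second copy of the data.
trim_in_place() {
  file=$1
  # Create the temporary file next to the original so that mv is a rename
  # on the same filesystem rather than a copy.
  tmp=$(mktemp "$file.XXXXXX") || return 1
  if tail -n +18 "$file" | head -n -8 > "$tmp"; then
    mv "$tmp" "$file"
  else
    rm -f "$tmp"
    return 1
  fi
}
```

The replaced file ends up with the permissions mktemp gave the temporary file, so these may need to be restored afterwards.
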

I learnt this shell technique today.

{
  ghead -17  > /dev/null
  sed -n -e :a -e '1,8!{P;N;D;};N;ba'
} < my-bigfile > subset-of

One has to use a head that does not consume more input than the lines it prints, hence the use of ghead from GNU coreutils.
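
If ghead is not available, the shell's own read can skip the 17 lines, since it takes exactly one line per call and leaves the rest of the input for the next command. This is only a sketch of that substitution, not part of the original answer; the function name skip_then_trim is made up.

```shell
# skip_then_trim FILE: drop the first 17 lines with read, then let sed
# drop the last 8. Sketch only; the function name is illustrative.
skip_then_trim() {
  {
    i=0
    # read consumes one line per call and leaves the rest of the input untouched
    while [ "$i" -lt 17 ] && IFS= read -r line; do
      i=$((i + 1))
    done
    # sed starts counting from the first line it sees, i.e. original line 18
    sed -n -e :a -e '1,8!{P;N;D;};N;ba'
  } < "$1"
}
```

Usage would be skip_then_trim my-bigfile > subset-of.
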

Similar to Thor's answer, but a bit shorter:

sed -i '' -e $'1,17d;:a\nN;19,25ba\nP;D' file.txt

The -i '' tells sed to edit the file in place. (The syntax may be a bit different on your system. Check the man page.)

If you want to delete front lines from the start of the file and tail lines from the end, you have to use the following numbers:

1,{front}d;:a\nN;{front+2},{front+tail}ba\nP;D

(I put them in curly braces here, but that's just pseudocode. You'll have to replace them with the actual numbers. Also, it should work with {front+1}, but it doesn't on my machine (macOS 10.12.4). I think that's a bug.)

I'll try to explain how the command works. Here's a human-readable version:

1,17d     # delete lines 1 ... 17, goto start
:a        # define label a
N         # add next line from file to buffer, quit if at end of file
19,25ba   # if line number is 19 ... 25, goto start (label a)
P         # print first line in buffer
D         # delete first line from buffer, go back to start

First we skip 17 lines. That's easy. The rest is tricky, but basically we keep a buffer of eight lines. We only start printing lines when the buffer is full, but we stop printing when we reach the end of the file, so at the end there are still eight lines left in the buffer that we didn't print. In other words, we deleted them.
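
The command above relies on N quitting without printing the buffer at the end of the file. GNU sed prints the pattern space in that case unless -n is given, so there the last eight lines would still come out. Here is a sketch of a variant that adds an explicit $d and takes the two counts as arguments; the names trim_lines, n_front and n_tail are made up, and it assumes n_front >= 1 and n_tail >= 2 because the address ranges degenerate otherwise.

```shell
# trim_lines FRONT TAIL FILE: print FILE without its first FRONT and last TAIL lines.
# Sketch only: assumes FRONT >= 1 and TAIL >= 2.
trim_lines() {
  n_front=$1 n_tail=$2 file=$3
  if [ "$n_front" -lt 1 ] || [ "$n_tail" -lt 2 ]; then
    echo "trim_lines: need FRONT >= 1 and TAIL >= 2" >&2
    return 1
  fi
  # 1,FRONTd : drop the leading lines
  # :a       : label for the fill loop
  # $d       : on the last input line, discard the whole buffer
  # N        : append the next input line to the buffer
  # range ba : keep filling until the buffer holds TAIL + 1 lines
  # P;D      : print the oldest buffered line, drop it, and start over
  sed -e "1,${n_front}d" \
      -e ':a' \
      -e '$d' \
      -e 'N' \
      -e "$((n_front + 2)),$((n_front + n_tail))ba" \
      -e 'P;D' "$file"
}
```

For the numbers in the question this would be called as trim_lines 17 8 file.txt > newfile.
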
