How do I extract lines of text from a file?

Question

I have a directory full of files and I need to pull the headers and footers off of them. They are all variable length, so using head or tail isn't going to work. Each file does have a line I can search for, but I don't want to include that line in the results.

The part I want usually starts after a line like

*** Start (more text here)

and ends before a line like

*** Finish (more text here)

I want the file names to stay the same, so I need to either overwrite the originals or write to a different directory and overwrite them myself.

Oh yeah, it's on a Linux server of course, so I have Perl, sed, awk, grep, etc.

Answer 1

Try the flip-flop operator ("..")!

# flip-flop.pl
use strict;
use warnings;

my $start  = qr/^\*\*\* Start/;
my $finish = qr/^\*\*\* Finish/;

while ( <> ) {
    if ( /$start/ .. /$finish/ ) {
        next  if /$start/ or /$finish/;
        print $_;
    }
}

You can then use Perl's -i switch to update your file(s) in place, like so:

 $ perl -i'copy_*' flip-flop.pl data.txt

...which changes data.txt but first saves a copy as "copy_data.txt".

Answer 2

GNU coreutils are your friend...

csplit inputfile '%^\*\*\* Start%+1' '/^\*\*\* Finish/' '%%' '{*}'

This produces your desired file as xx00. You can change this behaviour through the options --prefix, --suffix, and --digits; see the manual for details. Since csplit is designed to produce a number of files, it cannot produce a file without a suffix, so you will have to do the overwriting manually or through a script:

csplit "$1" '%^\*\*\* Start%+1' '/^\*\*\* Finish/' '%%' '{*}'
mv -f xx00 "$1"

Add loops as you desire.
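
One way such a loop might look is sketched below. The function name strip_with_csplit, the directory argument, and the temporary-directory handling are illustrative assumptions, not part of the answer above. The sketch also leaves out the trailing '%%' '{*}' patterns, so csplit writes the footer to a second piece, which is simply thrown away with the temporary directory.

```shell
#!/bin/sh
# Sketch: for every regular file in a directory, keep only the lines between
# the "*** Start" line and the "*** Finish" line (both markers excluded).
# strip_with_csplit is a hypothetical helper name.
strip_with_csplit() {
    dir=$1
    for f in "$dir"/*; do
        [ -f "$f" ] || continue
        tmp=$(mktemp -d) || return 1
        # -s: do not print piece sizes; -f: prefix (here a path) for the pieces.
        # %...%+1 skips through the Start line, /.../ ends piece xx00 just
        # before the Finish line; the rest of the input goes to xx01.
        if csplit -s -f "$tmp/xx" "$f" '%^\*\*\* Start%+1' '/^\*\*\* Finish/'; then
            # Write through the existing name so the file keeps its permissions.
            cat "$tmp/xx00" > "$f"
        else
            echo "skipped (markers not found): $f" >&2
        fi
        rm -rf "$tmp"
    done
}
```

Calling strip_with_csplit /path/to/files rewrites the files in place, so it is probably wise to try it on a copy of the directory first. Files in which csplit cannot find both markers are left untouched.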

Answer 3

To strip the header (print everything after the Start line):

awk '{if (d > 0) print $0} /.*Start.*/ {d = 1}' yourFileHere

To strip the footer (print everything before the Finish line):

awk '/.*Finish.*/ {d = 1} {if (d < 1) print $0}' yourFileHere

To get only the part between the header and the footer, as you want:

awk '/.*Start.*/ {d = 1; next} /.*Finish.*/ {d = 0; next} {if (d > 0) print $0}' yourFileHere
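
Since the question allows writing to a different directory, the awk approach could be wrapped roughly as follows. This is only a sketch: extract_body_to_dir, src_dir and out_dir are made-up names, and the patterns are anchored to the "*** Start" / "*** Finish" lines from the question rather than matching "Start" or "Finish" anywhere in a line.

```shell
#!/bin/sh
# Sketch: copy the lines strictly between the marker lines of every file in
# src_dir into a file of the same name in out_dir. Originals are not modified.
# extract_body_to_dir, src_dir and out_dir are hypothetical names.
extract_body_to_dir() {
    src_dir=$1
    out_dir=$2
    mkdir -p "$out_dir" || return 1
    for f in "$src_dir"/*; do
        [ -f "$f" ] || continue
        awk '
            /^\*\*\* Start/  { inside = 1; next }  # start copying after this line
            /^\*\*\* Finish/ { inside = 0; next }  # stop copying at this line
            inside                                  # print lines in between
        ' "$f" > "$out_dir/$(basename "$f")"
    done
}
```

After checking the results in out_dir, the originals can be overwritten by hand, which is the second option the question mentions.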

There's one more way: with the csplit command, you could try something like:

csplit yourFileHere /Start/ /Finish/

Then examine the output files named 'xxNN', where NN is a running number, and take a look at the csplit man page.

Answer 4

Maybe? Select the Start-to-Finish range and delete everything that is not in it:

$ sed -i '/^\*\*\* Start/,/^\*\*\* Finish/!d' *

or...less sure of it...but, if it works, this should remove the Start and Finish lines as well:

$ sed -i -e '1,/^\*\*\* Start/d' -e '/^\*\*\* Finish/,$d' *

Note that the ! goes before the command it negates (!d, not d!). -i may depend on the build of sed you have: GNU sed accepts it as shown.
And, I wrote that entirely from (probably poor) memory.
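
Because -i overwrites files, it may be worth previewing the result first. The sketch below assumes sed's line-range addresses 1,/re/ and /re/,$ and prints the body of one file to standard output without changing it. preview_body is a made-up helper name.

```shell
#!/bin/sh
# Sketch: print the lines between the markers without modifying the file.
# preview_body is a hypothetical helper name.
preview_body() {
    # 1,/Start/d  : delete from line 1 through the first "*** Start" line
    # /Finish/,$d : delete from the "*** Finish" line through the last line
    sed -e '1,/^\*\*\* Start/d' -e '/^\*\*\* Finish/,$d' "$1"
}
```

For example, preview_body somefile | less shows what would remain. One caveat: with the address 1,/re/ the end pattern is only checked from line 2 onward, so a file whose very first line is the Start line would not be handled correctly. GNU sed offers 0,/re/ for that case.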

Answer 5

A quick Perl hack, not tested. I am not fluent enough in sed or awk to get this effect with them, but I would be interested in how that would be done.

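#!/usr/bin/perl -w
use strict;
use Tie::File;

my $Filename = shift;
# Tie the file to an array: removing array elements removes lines from the file.
tie my @File, 'Tie::File', $Filename or die "could not access $Filename.\n";
# Drop lines from the top, up to and including the Start line.
while ( @File and shift(@File) !~ /^\*\*\* Start/ ) {}
# Drop lines from the bottom, back to and including the Finish line.
while ( @File and pop(@File) !~ /^\*\*\* Finish/ ) {}
untie @File;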

Answer 6

A Perl solution that overwrites the original file.

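#!/usr/bin/perl -ni
# -n loops over the input lines; -i edits the named files in place.
# The flip-flop returns a sequence number: 1 on the Start line, and a
# value with "E0" appended (e.g. "5E0") on the Finish line.
if(my $num = /^\*\*\* Start/ .. /^\*\*\* Finish/) {
    # Skip the first line (Start) and the "E0"-tagged last line (Finish).
    print if $num != 1 and $num + 0 eq $num;
}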

Answer 7

Some of the examples in perlfaq5, "How do I change, delete, or insert a line in a file, or append to the beginning of a file?", may help. You'll have to adapt them to your situation. Also, Leon's flip-flop operator answer is the idiomatic way to do this in Perl, although you don't have to modify the file in place to use it.

7 answers

Answer 1: 3, accepted, 2008-11-17 19:54:14
Answer 2: 2, 2008-11-17 23:26:50
Answer 3: 1, 2008-11-17 18:23:54
Answer 4: 0, 2008-11-17 18:17:15
Answer 5: 0, 2008-11-17 18:34:51
Answer 6: 0, 2008-11-17 19:37:24
Answer 7: 0, 2008-11-17 22:52:31
