How to get the first few lines of the csv files inside a tar file without extracting it in Linux?

Question

I have a tar file which has a lot of csv files in it. How can I get the first few lines of each csv file without extracting the archive?

I tried:

$(tar -Oxf $tarfile $file | head -n "$NL") >> cdn.log

But I got an error saying:

time(http:index: command not found

This is a line from one of the csv files. Similar errors are reported for all csv files. Any ideas?

Answer 1

With -O you can tell tar to write an extracted file to standard output instead of to disk. So you should be able to first use tar tf <YOUR_FILE> to list the files in the archive, filter that list with grep to find the CSV files, and then for each file run tar xf <YOUR_FILE> <NAME_OF_CSV> -O | head to send the beginning of the file to stdout. This may be a bit inefficient, since you read the archive as many times as there are CSV files, but it should work.
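The list, filter, and extract steps described in this answer could be put together roughly as follows. This is only a sketch: the function name csv_heads, its arguments, and the "== name ==" separator line are made up for illustration, and it assumes member names contain no newlines. Note that, unlike the attempt in the question, the pipeline is not wrapped in $( ). Command substitution makes the shell try to run the pipeline's output as a command, which is where the "command not found" errors come from.

```shell
# Print the first N lines of every .csv member of a tar archive.
# Usage: csv_heads <tarfile> [number_of_lines]
csv_heads() {
    tarfile=$1
    nl=${2:-5}    # default to 5 lines when no count is given

    # List the archive members, keep only names ending in .csv,
    # then extract each one to stdout and cut it off with head.
    tar -tf "$tarfile" | grep '\.csv$' | while IFS= read -r file; do
        printf '== %s ==\n' "$file"
        tar -xOf "$tarfile" "$file" | head -n "$nl"
    done
}
```

It can then be called as csv_heads "$tarfile" "$NL" >> cdn.log to append the result to the log file from the question.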

Answer 2

You can use perl and its Archive::Tar module. Here is a one-liner that extracts the first two lines of each file:

perl -MArchive::Tar -E '
    for (Archive::Tar->new(shift)->get_files) { 
        say (join qq|\n|, (split /\n/, $_->get_content, 3)[0..1]) 
    }
' file.tar

It assumes that the tar file contains only text files and that they are all csv. Otherwise you will have to grep the list to filter the ones you want.

2 answers

Answer 1
2, accepted, 2013-09-27 10:23:41

Answer 2
0, 2013-09-27 10:50:56