zcat to read gzip files and then concatenate them in Perl

I need to write a Perl script that reads gzipped files from a text file listing their paths, concatenates them, and writes the result to a new gzipped file. (It has to be Perl because it will be part of a pipeline.) I am not sure how to handle the zcat and concatenation part. The files are gigabytes in size, so I also need to keep an eye on storage and run time.

So far I have come up with this:

use strict;
use warnings;
use IO::Compress::Gzip qw(gzip $GzipError) ;

#-------check the input file specified-------------#

$num_args = $#ARGV + 1;
if ($num_args != 1) {
    print "\nUsage: name.pl Filelist.txt \n";
exit;

$file_list = $ARGV[0];

#-------------Read the file into array-------------#

my @fastqc_files;   #Array that contains gzipped files 
use File::Slurp;
my @fastqc_files = $file_list;


#-------use the zcat over the array contents 
my $outputfile = "combined.txt"
open(my $combined_file, '>', $outputfile) or die "Could not open file '$outputfile' $!";

for my $fastqc_file (@fastqc_files) {

    open(IN, sprintf("zcat %s |", $fastqc_file)) 
      or die("Can't open pipe from command 'zcat $fastqc_file' : $!\n");
    while (<IN>) {
        while ( my $line = IN ) {
          print $outputfile $line ;
        }
    }
    close(IN);

my $Final_combied_zip = new IO::Compress::Gzip($combined_file);
  or die "gzip failed: $GzipError\n";

Somehow I am not able to get it to run. Also, could anyone advise on the correct way to write the gzipped output file?

Thanks!

You don't need Perl for this. You don't even need zcat/gzip, because gzipped files can be concatenated directly with cat:

cat $(cat pathfile) >resultfile
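To see why plain cat is enough, here is a minimal sketch under a few assumptions: a POSIX shell with gzip and mktemp available, and made-up file names (a.gz, b.gz) created in a temporary directory. It uses gzip -dc, which does the same job as zcat here. The gzip format allows several compressed members back to back, so the concatenated file should decompress to the contents of both inputs in order.

```shell
#!/bin/sh
# Sketch: concatenated gzip files decompress as one continuous stream.
# a.gz, b.gz, pathfile and resultfile are demo names, not from the question.
set -e
workdir=$(mktemp -d)
cd "$workdir"

# Two small gzip files plus a list of their paths
printf 'first\n'  | gzip > a.gz
printf 'second\n' | gzip > b.gz
printf '%s\n' a.gz b.gz > pathfile

# Byte-level concatenation; nothing is decompressed at this step
cat $(cat pathfile) > resultfile

# gzip -dc reads every member in turn
combined=$(gzip -dc resultfile)
printf '%s\n' "$combined"
```

Note that the unquoted $(cat pathfile) splits on whitespace, so this form assumes the listed paths contain no spaces.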

But if you really need the extra compression that comes from recompressing everything as a single stream:

zcat $(cat pathfile)|gzip >resultfile
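As a rough illustration of the trade-off, the sketch below builds ten small gzip files with similar content (hypothetical names part_1.gz to part_10.gz), then compares plain concatenation with the decompress-and-recompress pipeline. It assumes a POSIX shell with gzip, wc and mktemp. Because the data is piped, the uncompressed text is never written to disk, which matters when the inputs are gigabytes in size. How much space recompression saves on real data depends on the files.

```shell
#!/bin/sh
# Sketch: compare plain concatenation with streaming recompression.
# part_N.gz, joined.gz and recompressed.gz are demo names only.
set -e
workdir=$(mktemp -d)
cd "$workdir"

# Build ten small gzip files with similar content and list their paths
: > pathfile
for i in 1 2 3 4 5 6 7 8 9 10; do
    printf 'sample line from a repeated record\n' | gzip > "part_$i.gz"
    printf '%s\n' "part_$i.gz" >> pathfile
done

# Plain concatenation, kept for comparison
cat $(cat pathfile) > joined.gz

# Decompress and recompress in one pipeline (gzip -dc behaves like zcat)
gzip -dc $(cat pathfile) | gzip > recompressed.gz

joined_size=$(wc -c < joined.gz | tr -d ' ')
recompressed_size=$(wc -c < recompressed.gz | tr -d ' ')
printf 'joined: %s bytes, recompressed: %s bytes\n' "$joined_size" "$recompressed_size"
```

Both outputs decompress to the same text. The recompressed file is one gzip member, while the concatenated file keeps one member per input, each with its own header and trailer.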

Addendum: also note the very first "related" link on the right, which already seems to answer this very question: How to concat two or more gzip files/streams

Thanks for the replies. The script runs well now:

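#!/usr/bin/perl
use strict;
use warnings;
use File::Slurp;
use IO::Compress::Gzip qw(gzip $GzipError);

# Read the list of gzipped file paths, one path per line
my @data = read_file('./File_list.txt');
my $out = "./test.txt";

foreach my $data_file (@data)
{
    chomp($data_file);
    next unless length $data_file;    # skip blank lines in the list

    # Decompress each file and append it to the plain-text output.
    # The quoting assumes paths contain no single quotes.
    system("zcat '$data_file' >> '$out'") == 0
        or die "zcat failed for '$data_file': $?\n";
}

# Compress the combined text into a single gzip file
my $outzip = "./test.gz";
gzip $out => $outzip
    or die "gzip failed: $GzipError\n";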
