简体   繁体   中英

How to open/join more than one file (depending on user input) and then use 2 files simultaneously

EDIT: Sorry for the misunderstanding, I have edited a few things, to hopefully actually request what I want.

I was wondering if there was a way to open/join two or more files to run the rest of the program on.

For example, my directory has these files:

taggedchpt1_1.txt , parsedchpt1_1.txt , taggedchpt1_2.txt , parsedchpt1_2.txt etc...

The program must call a tagged and parsed simultaneously. I want to run the program on both of chpt1_1 and chpt1_2, preferably joined together in one.txt file, unless it would be very slow to do so. For instance run what would be accomplished having two files:

taggedchpt1_1_and_chpt1_2 and parsedchpt1_1_and_chpt1_2

Can this be done through Perl? Or should I just combine the text files myself(or automate that process, making chpt1.txt which would include chpt1_1, chpt1_2, chpt1_3 etc...)

#!/usr/bin/perl
use strict;
use warnings FATAL => "all";
print "Please type in the chapter and section NUMBERS in the form chp#_sec#:\n"; ##So the user inputs 31_3, for example
chomp (my $chapter_and_section = "chpt".<>);
print "Please type in the search word:\n";
chomp (my $search_key = <>);

open(my $tag_corpus, '<', "tagged${chapter_and_section}.txt") or die $!;
open(my $parse_corpus, '<', "parsed${chapter_and_section}.txt") or die $!;

For the rest of the program to work, I need to be able to have:

my @sentences = <$tag_corpus>; ##right now this is one file, I want to make it more
my @typeddependencies = <$parse_corpus>; ##same as above

EDIT2 : Really sorry about the misunderstanding. In the program, after the steps shown, I do 2 for loops. Reading through the lines of the tagged and parsed.

What I want is to accomplish this with more files from the same directory, without having to re-input the next files. (ie. I can run taggedchpt31_1.txt and parsedchpt31_1.txt...... I want to run taggedchpt31 and parsedchpt31 - which includes ~chpt31_1, ~chpt31_2, etc...)

Ultimately, it would be best if I joined all the tagged files and all the parsed files that have a common chapter (in the end still requiring only two files I want to run) but not have to save the joined file to the directory... Now that I put it into words, I think I should just save files that include all the sections.

Sorry and Thanks for all your time. Look at FMc's breakdown of my question for more help.

You could iterate over the file names, opening and reading each one in turn. Or you could produce an iterator that knows how to read lines from sequence of files.

sub files_reader {
    # Takes a list of file names and returns a closure that
    # will yield lines from those files.
    my @handles = map { open(my $h, '<', $_) or die $!; $h } @_;
    return sub {
        shift @handles while @handles and eof $handles[0];
        return unless @handles;
        return readline $handles[0];
    }
}

my $reader = files_reader('foo.txt', 'bar.txt', 'quux.txt');

while (my $line = $reader->()) {
    print $line;
}

Or you could use Perl's built-in iterator that can do the same thing:

local @ARGV = ('foo.txt', 'bar.txt', 'quux.txt');
while (my $line = <>) {
    print $line;
}

Edit in response to follow-up questions:

Perhaps it would help to break your problem down into smaller sub-tasks. As I understand it, you have three steps.

  • Step 1 is to get some input from the user -- perhaps a directory name, or maybe a couple of file name patterns ( taggedchpt and parsedchpt ).

  • Step 2 is for the program to find all of the relevant file names. For this task, glob() or readdir() might be useful. There are many questions on StackOverflow related to such issues. You'll end up with two lists of file names, one for the tagged files and one for the parsed files.

  • Step 3 is to process the lines across all of the files in each of the two sets. Most of the answers you have received, including mine, will help you with this step.

You're almost there... this is a bit more efficient than discrete opens on each file...

#!/usr/bin/perl
use strict;
use warnings FATAL => "all";
print "Please type in the chapter and section NUMBERS in the for chp#_sec#:\n";
chomp (my $chapter_and_section = "chpt".<>);
print "Please type in the search word:\n";
chomp (my $search_key = <>);

open(FH, '>output.txt') or die $!;   # Open an output file for writing
foreach ("tagged${chapter_and_section}.txt", "parsed${chapter_and_section}.txt") {
    open FILE, "<$_" or die $!;      # Read a filename (from the array)
    foreach (<FILE>) {
       $_ =~ s/THIS/THAT/g;   # Regex replace each line in the open file (use 
                              #     whatever you like instead of "THIS" &
                              #     "THAT"
       print FH $_;           # Write to the output file
    }
}

No one has mentioned the @ARGV hack yet? Ok, here it is.

{
    local @ARGV = ('taggedchpt1_1.txt', 'parsedchpt1_1.txt', 'taggedchpt1_2.txt',  
                   'parsedchpt1_2.txt');
    while (<ARGV>) {
       s/THIS/THAT/;
       print FH $_;
    }
}

ARGV is a special filehandle that iterates through all the filenames in @ARGV , closing a file and opening the next one as necessary. Normally @ARGV contains the command-line arguments that you passed to perl , but you can set it to anything you want.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM