How to compare two directories and their files in Perl

Fred here again with a little issue I'm having that I hope you guys can help me with.

I'm reviewing for midterms and going over an old file I found on here and I wanted to get it working. I can't find it on here anymore but I still have the source code so I'll make another question on it.

So here was his assignment: Write a Perl script that will compare two directories for differences in regular files. All regular files with the same names should be tested with the Unix command /usr/bin/diff -q, which will determine whether they are identical. A file in dir1 which does not have a similarly named file in dir2 will have its name printed after the string <<<, while a file in dir2 without a corresponding dir1 entry will be prefixed with the string >>>. If two files have the same name but are different, then the file name will be surrounded by > <.

Here is the script:

#!/usr/bin/perl -w 
use File::Basename;

@files1 = `/usr/bin/find $ARGV[0] -print`;
chop @files1;
@files2 = `/usr/bin/find $ARGV[1] -print`;
chop @files2;

statement:
for ($i=1; @files1 >= $i; $i++) {
    for ($x=1; @files2 >= $x; $x++) {

        $file1 = basename($files1[$i]);
        $file2 = basename($files2[$x]);

        if ($file1 eq $file2) {
            shift @files1;
            shift @files2;
            $result = `/usr/bin/diff -q $files1[$i] $files2[$x]`;
            chop $result;

            if ($result eq "Files $files1[$i] and $files2[$x] differ") {
                print "< $file1 >\n";
                next statement;
        } else {
                print "> $file1 <\n";
            }
        } else  {
            if ( !-e "$files1[$i]/$file2") { print ">>> $file2\n";}
            unless ( -e "$files2[$x]/$file1") { print "<<< $file1\n";}
        }
    }
}

This is the output:

> file2 <
>>> file5
<<< file1

The output should be:

> file1 <
> file2 <
<<< file4
>>> file5

I already checked the files to make sure that they all match, but I'm still having problems. If anyone can help me out I would greatly appreciate it!

First off, always use these:

use strict;
use warnings;

They come with a short learning curve, but they more than make up for it in the long run.

Some notes:

  • You should use the File::Find module instead of shelling out to find.
  • You start your loops at array index 1. In Perl, the first array index is 0, so you skip the first element.
  • Your loop condition is wrong. @files >= $x lets $x reach the number of elements, which is one past the last valid index. You want either $x < @files or $x <= $#files.
  • You should use chomp, which is a safer version of chop: chomp removes only a trailing newline, while chop removes the last character whatever it is.
  • Altering the arrays you are iterating over (the shift calls inside the loops) is a sure way to cause yourself some confusion.
  • Why use if (! -e ...) and then unless (-e ...)? They mean the same thing, so mixing them just adds confusion.
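
To make the indexing and chomp notes concrete, here is a small sketch. The array contents are made up for illustration and only stand in for what find would print; the second loop copies the loop header from the question so the two index ranges can be compared.

```perl
use strict;
use warnings;

# Hypothetical sample data standing in for the output of find.
# The last entry deliberately has no trailing newline.
my @files = ("dir1/file1\n", "dir1/file2\n", "dir1/file4");

# chomp removes a trailing newline only if one is present;
# chop would have cut the "4" off the last entry.
chomp @files;

# A correct loop visits indices 0 through $#files.
my @visited;
for my $i (0 .. $#files) {
    push @visited, $i;
}

# The loop header from the question skips index 0 and runs one past the end.
my @buggy;
for (my $i = 1; @files >= $i; $i++) {
    push @buggy, $i;
}

print "correct: @visited\n";   # correct: 0 1 2
print "buggy:   @buggy\n";     # buggy:   1 2 3
```

With three elements, the question's loop never looks at $files[0] and reads $files[3], which is undef.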

And this part:

$file1 = basename($files1[$i]);
...
if ( !-e "$files1[$i]/$file2" )

Assuming @files1 contains file names and not just directories, this will never match anything. For example:

$file2 = basename("dir/bar.html");
$file1 = basename("foo/bar.html"); 
-e "foo/bar.html/bar.html";         # does not compute
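
The same point as a runnable sketch. The paths are hypothetical, and the dirname variant is only a guess at what the test was meant to check, not something stated in the question:

```perl
use strict;
use warnings;
use File::Basename qw(basename dirname);

# Hypothetical paths, in the form find would print them.
my $path1 = "foo/bar.html";
my $path2 = "dir/bar.html";

my $name2 = basename($path2);              # "bar.html"

# What the question's test builds: a file path with a file name appended.
my $wrong = "$path1/$name2";               # "foo/bar.html/bar.html"

# Possibly what was meant: look inside the directory that holds $path1.
my $right = dirname($path1) . "/$name2";   # "foo/bar.html"

print "$wrong\n$right\n";
```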

I would recommend using hashes for the lookup, assuming you only want to match against identical file names and missing file names:

use strict;
use warnings;
use File::Find;
use List::MoreUtils qw(uniq);   # from CPAN; List::Util 1.45+ also exports uniq

my (%files1, %files2);
my ($dir1, $dir2) = @ARGV;

# Map each regular file's base name to its full path.
# The statement-modifier form avoids a precedence trap:
# "-f && $h{$_} = ..." parses as "(-f && $h{$_}) = ..." and does not compile.
find( sub { $files1{$_} = $File::Find::name if -f }, $dir1);
find( sub { $files2{$_} = $File::Find::name if -f }, $dir2);

my @all = uniq(keys %files1, keys %files2);

for my $file (@all) {
    my $result;
    if (exists $files1{$file} && exists $files2{$file}) { # file exists in both dirs
        $result = qx(/usr/bin/diff -q $files1{$file} $files2{$file});
        # ... etc
    } elsif (exists $files1{$file}) {                     # file only exists in dir1
    } else {                                              # file only exists in dir2
    }
}

In the find() subroutine, $_ holds the base name and $File::Find::name holds the name including the path (which is suitable for use with diff). The -f check ensures that only regular files end up in the hash.
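
For completeness, here is one way the hash approach could be filled in. This is a sketch under assumptions, not the one true solution. The helper names regular_files, diff_differs and report are made up for this example. It assumes base names are unique within each tree, that identical files produce no output (as the expected output in the question suggests), and that diff follows the usual exit-status convention (0 for identical, 1 for different, 2 for trouble). It runs diff through a list-form pipe open rather than qx, so file names never pass through a shell.

```perl
use strict;
use warnings;
use File::Find;

# Map base name => full path for every regular file below $dir.
# Assumption: base names are unique per tree; otherwise the last one found wins.
sub regular_files {
    my ($dir) = @_;
    my %found;
    find(sub { $found{$_} = $File::Find::name if -f }, $dir);
    return \%found;
}

# Run diff -q on two paths and return true if they differ.
sub diff_differs {
    my ($path1, $path2) = @_;
    open(my $fh, '-|', '/usr/bin/diff', '-q', $path1, $path2)
        or die "Cannot run diff: $!";
    my @ignored = <$fh>;     # discard diff's own "Files ... differ" message
    close $fh;               # waits for diff and sets $?
    my $exit = $? >> 8;
    die "diff failed on $path1 and $path2\n" if $exit > 1;
    return $exit == 1;
}

# Build the report lines from two name => path hashes.
# $differs is a code ref, so the comparison can be swapped out.
sub report {
    my ($files1, $files2, $differs) = @_;
    my %seen;
    my @names = grep { !$seen{$_}++ } (keys %$files1, keys %$files2);
    my @lines;
    for my $name (sort @names) {
        if (exists $files1->{$name} && exists $files2->{$name}) {
            push @lines, "> $name <"
                if $differs->($files1->{$name}, $files2->{$name});
        } elsif (exists $files1->{$name}) {
            push @lines, "<<< $name";    # only in dir1
        } else {
            push @lines, ">>> $name";    # only in dir2
        }
    }
    return @lines;
}

# Only run against real directories when two arguments are given.
if (@ARGV == 2) {
    my ($dir1, $dir2) = @ARGV;
    print "$_\n"
        for report(regular_files($dir1), regular_files($dir2), \&diff_differs);
}
```

Saved under a name of your choice, for example compare.pl, it would be run as perl compare.pl dir1 dir2. Because report sorts the names, the lines come out in name order, which matches the order of the expected output in the question.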
