
Perl merging two text files with common column

I am very new to Perl and this is giving me a headache. I have a text file with user-specific information such as addresses, email, and dates. I want to compare this file to the /etc/passwd file and, if the user name exists in the passwd file, merge the /etc/passwd data into an array to manipulate further. The user names are alphanumeric, like "abc123".

This is what I have done so far:

    open PASSWD, "</etc/passwd" or die "$!";      # /etc/passwd is opened to be searched for a matching user name
    open INFO, "<gtc_members.txt" or die "$!";    # gtc_members.txt is opened to search for a matching user name

    while (<PASSWD>) {
        chomp;

        my $userName = (split ":", $_)[0];
        $homeDirectory = (split ":", $_)[5];
        $fullName = (split ":", $_)[4];
        $shell = (split ":", $_)[6];

        push @userNames, ($fullName, $homeDirectory, $shell); # if $userName eq $searchName;
    }

    while (<INFO>) {
        chomp;

        my $userName1 = (split ",", $_)[2];
        $userAddress = (split ",", $_)[3];
        $userEmail = (split ",", $_)[4];
        $userManager = (split ",", $_)[6];

        push @userNames_1, ($userAddress, $userEmail, $userManager); # if $userName1 eq $searchName;
    }

And this is where I get stuck. Despite many attempts with different code, I cannot merge the two arrays @userNames and @userNames_1 on a common user name. It has become so convoluted that none of it makes sense any more. I have tried using a hash, but I am struggling to match the common user name as in this example.

First of all, never parse /etc/passwd directly; use the getpw* functions instead. Second, it is not clear what you want to do with the data once you have it, but you can take the idea from this:

    perl -F, -plae'if(my($fullName, $Dir, $Shell) = (getpwnam($F[2]))[6,7,8]) {$_ .= ",$fullName,$Dir,$Shell"}' gtc_members.txt
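For readers new to Perl, the one-liner above can be spelled out as a small script. This is only a sketch of the same idea: the sub name append_passwd_fields is made up for illustration, and it assumes, as the one-liner does, that the login name is the third comma-separated field.

```perl
use strict;
use warnings;

# Append full name (GECOS), home directory and shell to a CSV line whose
# third field is a login name. Lines naming unknown users are returned
# unchanged. (append_passwd_fields is a hypothetical helper name.)
sub append_passwd_fields {
    my ($line) = @_;
    my $login = (split /,/, $line)[2];
    return $line unless defined $login && length $login;

    # In list context getpwnam returns
    # (name, passwd, uid, gid, quota, comment, gcos, dir, shell, expire);
    # an empty list means there is no such user.
    my @pw = getpwnam($login) or return $line;
    my ($fullName, $dir, $shell) = @pw[6, 7, 8];
    return join ',', $line, $fullName, $dir, $shell;
}

# Process the files named on the command line, e.g. gtc_members.txt.
if (@ARGV) {
    while (my $line = <>) {
        chomp $line;
        print append_passwd_fields($line), "\n";
    }
}
```

Saved under any file name, it would be run with gtc_members.txt as its argument. Lines whose user is not known to the system pass through untouched, just as with the one-liner.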

All /etc/passwd entries can be obtained with:

    perl -E'$,=":";say @a while @a = getpwent()'
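The same getpwent loop can also fill a lookup table keyed by login name, so that no code has to open /etc/passwd itself. A minimal sketch, where load_passwd_hash is a hypothetical helper name and the three stored fields are the ones the question asks for:

```perl
use strict;
use warnings;

# Build { login => [fullName, homeDirectory, shell] } from the system
# user database. (load_passwd_hash is a hypothetical helper name.)
sub load_passwd_hash {
    my %pwhash;
    setpwent();                          # rewind to the first entry
    while (my @pw = getpwent()) {
        my ($login, $fullName, $dir, $shell) = @pw[0, 6, 7, 8];
        # Keep the first entry seen if a login name appears twice.
        $pwhash{$login} //= [$fullName, $dir, $shell];
    }
    endpwent();                          # close the database again
    return \%pwhash;
}

my $pwhash = load_passwd_hash();
printf "%d accounts loaded\n", scalar keys %$pwhash;
```

A per-line lookup then becomes a plain hash access such as $pwhash->{$userName}, which is undef for users the system does not know.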

It is indeed better to use the getpw* functions to fetch system user data, but if you insist on reading /etc/passwd, then use the following strategy:

  1. Read /etc/passwd into a hash, storing an anonymous array of the interesting passwd fields under each user name, so the @userNames array is no longer needed.
  2. Parse the INFO file and push the interesting fields; then look up $userName in the hash and, if it exists, push the array stored there, which holds the /etc/passwd data.

This example does not invent anything new, but here it is anyway. Notes: the final array @userNames_1 is reset on every iteration of the second loop; I assume you do something with it inside that loop. Putting everything into one large array is not very useful, IMHO, but if you need that, remove the my keyword right after push (declaring the array before the loop if you use strict) and it will no longer be reset. $userName is lexically scoped to each while loop block, so the same name is used in both. Also, I fetch the anonymous array as a reference first, because dereferencing it directly as @{$hash{$index}} gives me an "Uninitialized value" error. I am not sure this is entirely correct, but it works.

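    use strict;
    use warnings;

    open PASSWD, '<', '/etc/passwd'     or die "Cannot open /etc/passwd: $!";
    open INFO,   '<', 'gtc_members.txt' or die "Cannot open gtc_members.txt: $!";

    # user name => [full name, home directory, shell]
    my %pwhash;

    while (<PASSWD>) {
        chomp;
        my ($userName, $fullName, $homeDirectory, $shell) = (split /:/)[0, 4, 5, 6];
        $pwhash{$userName} = [$fullName, $homeDirectory, $shell];
    }
    close PASSWD;

    while (<INFO>) {
        chomp;
        my ($userName, $userAddress, $userEmail, $userManager) = (split /,/)[2, 3, 4, 6];
        # Declared with "my" here, so the array starts empty for every line;
        # declare it before the loop instead to accumulate across lines.
        push my @userNames_1, ($userAddress, $userEmail, $userManager);
        # Fetch the array reference first; it is undef for users not in passwd.
        my $pwinfo = $pwhash{$userName};
        push @userNames_1, @$pwinfo if $pwinfo;
        # test printout
        print join(',', @userNames_1), "\n";
    }
    close INFO;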
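On the dereferencing note above: the "Uninitialized value" message most likely appears for users that are in the INFO file but not in /etc/passwd, because the hash then holds no array reference to dereference. A small sketch of a guarded lookup, using made-up sample data and a hypothetical helper named passwd_fields:

```perl
use strict;
use warnings;

# Hypothetical lookup table in the same shape as %pwhash above.
my %pwhash = (abc123 => ['Alice Example', '/home/abc123', '/bin/bash']);

# Return the stored fields for a login, or an empty list if it is unknown.
sub passwd_fields {
    my ($table, $login) = @_;
    return exists $table->{$login} ? @{ $table->{$login} } : ();
}

for my $login (qw(abc123 xyz789)) {
    my @fields = passwd_fields(\%pwhash, $login);
    print "$login: ", (@fields ? join(',', @fields) : '(not in passwd)'), "\n";
}
```

Checking with exists before dereferencing keeps the lookup quiet under warnings and avoids a fatal error under strict when the key is missing.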

Try this out:

    use strict;
    use warnings;
    use Data::Dumper;

    open PASSWD, '<', '/etc/passwd'     or die "Cannot open /etc/passwd: $!";
    open INFO,   '<', 'gtc_members.txt' or die "Cannot open gtc_members.txt: $!";

    # user name => [full name, home directory, shell]
    my %users_in_passwd;
    while (<PASSWD>) {
        chomp;
        my ($userName, $fullName, $homeDirectory, $shell) = (split ":", $_)[0, 4, 5, 6];
        $users_in_passwd{$userName} = [$fullName, $homeDirectory, $shell];
    }

    while (<INFO>) {
        chomp;
        my ($userName1, $userAddress, $userEmail, $userManager) = (split ",", $_)[2, 3, 4, 6];
        # Look up the name read from the INFO file; $userName from the
        # first loop is out of scope here.
        if (defined $userName1 && exists $users_in_passwd{$userName1}) {
            my @data_in_passwd = @{ $users_in_passwd{$userName1} };
            print Dumper($userName1, \@data_in_passwd);
            ### do something
        }
    }
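If the merged data is needed after the loop, as the question suggests, one option is to collect one hash reference per matching user into an array. The sketch below runs on in-memory sample lines rather than the real files; the sample values, the sub name merge_users, and the meaning of the unused INFO columns are assumptions, and only the field positions come from the question.

```perl
use strict;
use warnings;

# Sample data standing in for /etc/passwd and gtc_members.txt (made-up values).
my @passwd_lines = (
    'abc123:x:1001:1001:Alice Example:/home/abc123:/bin/bash',
    'def456:x:1002:1002:Bob Example:/home/def456:/bin/sh',
);
my @info_lines = (
    '1,2020-01-01,abc123,1 Main St,alice@example.com,x,Carol',
    '2,2020-02-01,zzz999,2 Side St,nobody@example.com,x,Dave',
);

# Return one hash reference per INFO line whose user exists in passwd.
sub merge_users {
    my ($passwd, $info) = @_;
    my %pw;
    for my $line (@$passwd) {
        my ($login, $fullName, $home, $shell) = (split /:/, $line)[0, 4, 5, 6];
        $pw{$login} = { fullName => $fullName, homeDirectory => $home, shell => $shell };
    }
    my @merged;
    for my $line (@$info) {
        my ($login, $address, $email, $manager) = (split /,/, $line)[2, 3, 4, 6];
        next unless defined $login && exists $pw{$login};
        push @merged, {
            userName => $login,
            address  => $address,
            email    => $email,
            manager  => $manager,
            %{ $pw{$login} },            # add the passwd fields
        };
    }
    return @merged;
}

for my $user (merge_users(\@passwd_lines, \@info_lines)) {
    print join(',', @{$user}{qw(userName fullName email shell)}), "\n";
}
```

With this sample data only abc123 appears in both inputs, so a single merged record is printed. Named keys make later manipulation easier than remembering positions in a flat array.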
