
Update an XML database file with a Perl script

I want to update a 150 MB XML database file with 3 other XML files, applied in chronological order. Someone gave me a quick example of a Perl script that does this, but I don't know how the script is supposed to open the base file and the 3 update files: through command-line arguments (@args), or by opening each file directly with the open function. He told me that I do not need to parse the XML, because the updates consist of chunks of database entries. It is enough to put the lines of each chunk into a hash with the entry id as the hash key, then read all files sequentially so that each entry either updates an existing one or creates a new one, and finally write out the hash in numerical order of the keys.

#! /usr/bin/perl -CIOE
use strict;

my %h = ();
my $head = '';
my $has_data = 0;

while (<>) {
   /<db_entry db_id="(\d+)">/ and do {
    my $entry = $_;
    my $id = $1;
    while (<>) {
      $entry .= $_;
      /<\/db_entry>/ and last;
    }
    $h{$id} = $entry;
    $has_data = 1;
    next;
  };
  if (! $has_data) {
    $head .= $_;
    next;
  }
  /\s*<timestamp/ and do {
    $head .= $_;
    next;
  };
}

my $count = scalar keys %h;
print $head;
foreach (sort { $a <=> $b } keys %h) {
  print $h{$_};
}
print qq|  <db_entry_count count="$count" />
</databank_export>
|;

I don't know how this script is supposed to read the files sequentially: from the command line or with the open function. Doing it this way should be simpler than parsing with XML::Twig or something similar.

Best regards.

I've added some comments to the script to help you understand it. Comments in Perl are written with "#". In the script below, apart from the first line (the shebang), everything that follows a # symbol is a comment.

For example, # Hash initialization in the code below is a comment written for your understanding. I highly recommend using the Notepad++ editor for writing Perl code or for reading the script below. It gives a clearer view, since Notepad++ displays comments in green. Hope this is helpful!

#! /usr/bin/perl -CIOE
use strict;

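my %h = ();                     # Hash initialization (entry id => entry text)
my $head = '';                  # Variable initialization (header text)
my $has_data = 0;               # Variable initialization (numeric flag)

# Perl's diamond operator <> reads line by line from the files named on the
# command line (@ARGV), one file after another, or from STDIN if none are given.
# Each call acts like readline().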

while (<>) {

    # "\d" matches a digit; for ASCII input it is the same as [0-9]
    # "+" means one or more of the preceding item

    /<db_entry db_id="(\d+)">/ and do {
    my $entry = $_;             # "$_" is the default variable (the current line)
    my $id = $1;                # $1 holds the first capture group, here the digits of db_id
    while (<>) {                # Keep reading the lines of this entry
      $entry .= $_;             # Append each line to "$entry"
      /<\/db_entry>/ and last;  # Stop at the closing tag of the entry
    }
    $h{$id} = $entry;           # Store the entry under its id; an entry read later
                                # with the same id replaces the earlier one

    $has_data = 1;              # At least one entry has been seen
    next;                       # Go on to the next line of input
  };
  if (! $has_data) {            # No entry seen yet, so this line belongs to the header
    $head .= $_;
    next;
  }
  /\s*<timestamp/ and do {
    $head .= $_;
    next;
  };
}

my $count = scalar keys %h;    # In scalar context, "keys" returns the number of keys in the hash
print $head;

# <=> is Perl's numeric comparison operator; this sort block orders the ids numerically
foreach (sort { $a <=> $b } keys %h) {
  print $h{$_};
}
print qq|  <db_entry_count count="$count" />
</databank_export>
|;
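
To make the sequential reading concrete, here is a minimal, self-contained sketch of how <> walks through the files listed in @ARGV in the order given. The file names (base.xml, update1.xml) and their contents are made up for illustration, and @ARGV is filled in by hand only so that the example runs on its own; normally the shell fills it from the command line.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical sample files, created only so this sketch is self-contained.
my $dir = tempdir(CLEANUP => 1);
my %content = (
    'base.xml'    => qq{<db_entry db_id="1">\nold 1\n</db_entry>\n}
                   . qq{<db_entry db_id="2">\nold 2\n</db_entry>\n},
    'update1.xml' => qq{<db_entry db_id="2">\nnew 2\n</db_entry>\n}
                   . qq{<db_entry db_id="10">\nnew 10\n</db_entry>\n},
);
for my $name (keys %content) {
    open my $fh, '>', "$dir/$name" or die "Cannot write $name: $!";
    print {$fh} $content{$name};
    close $fh or die "Cannot close $name: $!";
}

# On a real run the shell fills @ARGV; it is set by hand here.
# Order matters: base file first, then the updates in chronological order.
@ARGV = ("$dir/base.xml", "$dir/update1.xml");

my %h;
while (my $line = <>) {                    # <> moves on to the next file by itself
    next unless $line =~ /<db_entry db_id="(\d+)">/;
    my ($id, $entry) = ($1, $line);
    while (defined(my $more = <>)) {       # collect the rest of this entry
        $entry .= $more;
        last if $more =~ m{</db_entry>};
    }
    $h{$id} = $entry;                      # a later file overwrites an earlier entry
}

# Entries come out in numerical id order: 1, 2 (updated), 10.
print $h{$_} for sort { $a <=> $b } keys %h;
```

So no explicit open is needed in the script from the question. Saved under a name such as update.pl (hypothetical) and made executable, it would be started as ./update.pl base.xml update1.xml update2.xml update3.xml > merged.xml. Because its first line carries the -CIOE switch, starting it as perl update.pl ... may be rejected by Perl unless the same switch is also given on the command line.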
