Update XML database file with Perl script
I want to update a 150 MB XML database file with three other XML files, applied in chronological order. Someone gave me a quick example of a Perl script that does this, but I don't know how the script is supposed to open the base file and the three update files: through command-line arguments (@ARGV) or by opening each file directly with the open function.
He told me that I don't need to parse the XML, because the updates consist of chunks of database entries. I only need to put the lines of each chunk into a hash with the entry id as the hash key, read all the files sequentially so that each entry either replaces an existing one or creates a new one, and then write out the hash in numerical order of the keys.
#! /usr/bin/perl -CIOE
use strict;
my %h = ();
my $head = '';
my $has_data = 0;
while (<>) {
    /<db_entry db_id="(\d+)">/ and do {
        my $entry = $_;
        my $id = $1;
        while (<>) {
            $entry .= $_;
            /<\/db_entry>/ and last;
        }
        $h{$id} = $entry;
        $has_data = 1;
        next;
    };
    if (! $has_data) {
        $head .= $_;
        next;
    }
    /\s*<timestamp/ and do {
        $head .= $_;
        next;
    };
}
my $count = scalar keys %h;
print $head;
foreach (sort { $a <=> $b } keys %h) {
    print $h{$_};
}
print qq| <db_entry_count count="$count" />
</databank_export>
|;
I don't know how this script is supposed to read the files sequentially, whether from the command line or with the open function. Doing it this way should be simpler than parsing with XML::Twig or something similar.
Best regards.
I've added some comments to the script to help you understand it better. Comments in Perl start with "#": apart from the first line (the #! line), anything that follows a # symbol is a comment. For example, where the code below has # Hash initialization, that is a comment for your understanding.
I highly recommend using the Notepad++ editor for writing Perl code or for reading the script below. It makes the code easier to follow, because Notepad++ shows comments in green. Hope this is helpful!
#! /usr/bin/perl -CIOE
use strict;
my %h = ();        # Hash initialization: entry id => full text of the entry
my $head = '';     # Variable initialization: collects the header lines
my $has_data = 0;  # Variable initialization (numeric flag): 0 until the first entry is seen
# Perl's diamond operator <> reads line by line from the files named on the
# command line, one file after another (or from STDIN if no files are given).
# It acts like readline().
while (<>) {
    # "\d" matches a digit (for ASCII text, the same as [0-9])
    # "+" means one or more of the preceding item
    /<db_entry db_id="(\d+)">/ and do {
        my $entry = $_;    # "$_" is the default variable; here it holds the current line
        my $id = $1;       # $1 holds what the first (...) group captured: the entry id
        while (<>) {       # Keep reading lines of the same entry
            $entry .= $_;  # Append each line to "$entry"
            /<\/db_entry>/ and last;  # Stop at the closing tag of the entry
        }
        $h{$id} = $entry;  # Store the entry under its id; an entry with the
                           # same id read earlier is overwritten
        $has_data = 1;     # Set the flag: at least one entry has been read
        next;              # Go on to the next line of input
    };
    if (! $has_data) {     # True while "$has_data" is still 0: lines before
        $head .= $_;       # the first entry are kept as the header
        next;
    }
    /\s*<timestamp/ and do {  # After the first entry, only timestamp lines
        $head .= $_;          # are still added to the header
        next;
    };
}
my $count = scalar keys %h;  # "keys" in scalar context gives the number of keys
print $head;
# <=> is Perl's numeric comparison operator, so the ids are sorted as numbers
foreach (sort { $a <=> $b } keys %h) {
    print $h{$_};
}
print qq| <db_entry_count count="$count" />
</databank_export>
|;
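On the question of command line versus open: the script never calls open itself, because <> opens each file named on the command line in turn. To make that concrete, here is a minimal sketch that does the same job with explicit open() calls. The sub merge_entries, the file names base.xml and update1.xml, and their contents are made up for illustration and are not part of the original script; the sketch also assumes, as the script above does, that each entry spans several lines.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper: read the given files in order with explicit open()
# calls and return a hash of id => entry text. This mirrors what the
# script above gets for free from <> and @ARGV.
sub merge_entries {
    my @files = @_;
    my %entries;
    for my $file (@files) {
        open my $fh, '<', $file or die "Cannot open $file: $!";
        while (my $line = <$fh>) {
            next unless $line =~ /<db_entry db_id="(\d+)">/;
            my ($id, $entry) = ($1, $line);
            # Collect the remaining lines of this entry (assumes the
            # closing tag is on a later line than the opening tag).
            while (defined(my $more = <$fh>)) {
                $entry .= $more;
                last if $more =~ m{</db_entry>};
            }
            $entries{$id} = $entry;  # a later file overwrites an earlier one
        }
        close $fh;
    }
    return %entries;
}

# Tiny demo with made-up file names and contents.
my $dir  = tempdir(CLEANUP => 1);
my %demo = (
    'base.xml'    => qq{<db_entry db_id="2">\nold\n</db_entry>\n}
                   . qq{<db_entry db_id="10">\nten\n</db_entry>\n},
    'update1.xml' => qq{<db_entry db_id="2">\nnew\n</db_entry>\n},
);
for my $name (keys %demo) {
    open my $out, '>', "$dir/$name" or die "Cannot write $name: $!";
    print {$out} $demo{$name};
    close $out;
}

# Order matters: the base file first, then the updates, oldest to newest.
my %merged = merge_entries("$dir/base.xml", "$dir/update1.xml");

# The numeric sort puts id 2 before id 10; a plain string sort would not.
print $merged{$_} for sort { $a <=> $b } keys %merged;
```

With the original script no such sub is needed: passing the files in chronological order, for example perl -CIOE update.pl base.xml update1.xml update2.xml update3.xml > merged.xml (script and file names are placeholders), should have the same effect, since <> works through @ARGV from left to right. Repeating -CIOE on the command line is a precaution: newer Perl versions can refuse a -C switch that appears only on the #! line when the script is started as perl update.pl.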