[英]Faster way to process xml file using PHP
I have this xml file named flight-itinerary.xml. 我有一个名为flight-itinerary.xml的xml文件。 A scaled-down version is shown below. 缩小版本如下所示。
<itin line="1" dep="LOS" arr="ABV">
<flt>
<fltav>
<cb>1</cb>
<id>C</id>
<av>10</av>
<cur>NGN</cur>
<CurInf>2,0.01,0.01</CurInf>
<pri>15000.00</pri>
<tax>30800.00</tax>
<fav>1</fav>
<miles></miles>
<fid>11</fid>
<finf>0,0,1</finf>
<cb>2</cb>
<id>J</id>
<av>10</av>
<cur>NGN</cur>
<CurInf>2,0.01,0.01</CurInf>
<pri>13000.00</pri>
<tax>26110.00</tax>
<fav>1</fav>
<miles></miles>
<fid>12</fid>
<finf>0,0,0</finf>
</fltav>
</flt>
</itin>
The complete file contains 8 itinerary <itin>
elements. 完整的文件包含8个行程<itin>
元素。 The <fltav>
element of each of the <itin>
elements contains 11 of the <cb>1</cb>
to <finf>0,0,1</finf>
groups. 所述<fltav>
每个的元素<itin>
元素包含的11 <cb>1</cb>
到<finf>0,0,1</finf>
基团。
And below is the code I am using to process the file: 下面是我用来处理文件的代码:
<?php
function processFlightsData()
{
$data = array();
$dom= new DOMDocument();
$dom->load('flight-itinerary.xml');
$classbands = $dom->getElementsByTagName('classbands')->item(0);
$bands = $classbands->getElementsByTagName('band');
$itineraries = $dom->getElementsByTagName('itin');
$counter = 0;
foreach($itineraries AS $itinerary)
{
$flt = $itinerary->getElementsByTagName('flt')->item(0);
$dep = $flt->getElementsByTagName('dep')->item(0)->nodeValue;
$arr = $flt->getElementsByTagName('arr')->item(0)->nodeValue;
$time_data = $flt->getElementsByTagName('time')->item(0);
$departure_day = $time_data->getElementsByTagName('ddaylcl')->item(0)->nodeValue;
$departure_time = $time_data->getElementsByTagName('dtimlcl')->item(0)->nodeValue;
$departure_date = $departure_day. ' '. $departure_time;
$arrival_day = $time_data->getElementsByTagName('adaylcl')->item(0)->nodeValue;
$arrival_time = $time_data->getElementsByTagName('atimlcl')->item(0)->nodeValue;
$arrival_date = $arrival_day. ' '. $arrival_time;
$flight_duration = $time_data->getElementsByTagName('duration')->item(0)->nodeValue;
$flt_det = $flt->getElementsByTagName('fltdet')->item(0);
$airline_id = $flt_det->getElementsByTagName('airid')->item(0)->nodeValue;
$flt_no = $flt_det->getElementsByTagName('fltno')->item(0)->nodeValue;
$flight_number = $airline_id. $flt_no;
$airline_type = $flt_det->getElementsByTagName('eqp')->item(0)->nodeValue;
$stops = $flt_det->getElementsByTagName('stp')->item(0)->nodeValue;
$av_data = $flt->getElementsByTagName('fltav')->item(0);
$cbs = iterator_to_array($av_data->getElementsByTagName('cb')); //11 entries
$ids = iterator_to_array($av_data->getElementsByTagName('id')); //ditto
$seats = iterator_to_array($av_data->getElementsByTagName('av')); //ditto
$curr = iterator_to_array($av_data->getElementsByTagName('cur')); //ditto
$price = iterator_to_array($av_data->getElementsByTagName('pri')); //ditto
$tax = iterator_to_array($av_data->getElementsByTagName('tax')); //ditto
$miles = iterator_to_array($av_data->getElementsByTagName('miles')); //ditto
$fid = iterator_to_array($av_data->getElementsByTagName('fid')); //ditto
$inner_counter = 0;
for($i = 0; $i < count($ids); $i++)
{
$data[$counter][$inner_counter] = array
(
'flight_number' => $flight_number,
'flight_duration' => $flight_duration,
'departure_date' => $departure_date,
'departure_time' => substr($departure_time, 0, 5),
'arrival_date' => $arrival_date,
'arrival_time' => substr($arrival_time, 0, 5),
'departure_airport_code' => $dep,
'departure_airport_location_name' => get_airport_data($dep, $data_key='location'),
'arrival_airport_code' => $arr,
'arrival_airport_location_name' => get_airport_data($arr, $data_key='location'),
'stops' => $stops,
'cabin_class' => $ids[$i]->nodeValue,
'ticket_class' => $ids[$i]->nodeValue,
'ticket_class_nicename' => formate_ticket_class_name($ids[$i]->nodeValue),
'available_seats' => $seats[$i]->nodeValue,
'currency' => $curr[$i]->nodeValue,
'price' => $price[$i]->nodeValue,
'tax' => $tax[$i]->nodeValue,
'miles' => $miles[$i]->nodeValue,
);
++$inner_counter;
}
return $data;
}
?>
Now, the outer loop iterates 8 times for each <itin>
element, and during each iteration of the outer loop, the inner loop iterates 11 times, resulting in a total of 88 iterations per pass and causing serious performance issues. 现在,外循环为每个<itin>
元素迭代8次,并且在外循环的每次迭代过程中,内循环迭代11次,导致每遍总共进行88次迭代,并导致严重的性能问题。 What I am looking for is a faster method of processing the file. 我正在寻找的是一种处理文件的更快方法。 Any helps will be greatly appreciated. 任何帮助将不胜感激。
I don't think the loop is the bottle-neck. 我认为循环不是瓶颈。 You should check your operations that are called within the loop, get_airport_data
and formate_ticket_class_name
. 您应该检查循环中调用的操作get_airport_data
和formate_ticket_class_name
。
Trying your code (without the auxiliary operations) on a number of itin
elements takes less than a second, check this fiddle: http://phpfiddle.org/main/code/7fpi-b3ka (Note that the XML might not be similar to yours, I've guessed a lot of elements that were missing). 在多个itin
元素上尝试代码(不执行辅助操作)不到一秒钟,请检查以下小提琴: http : //phpfiddle.org/main/code/7fpi-b3ka (请注意,XML可能与您的,我猜想很多元素都缺失了)。
If there are operations that are called which increases the processing time substantially, try to call the operation with bulk data or cache the responses. 如果有被调用的操作大大增加了处理时间,请尝试使用批量数据调用该操作或缓存响应。
声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.