
Faster way to process an XML file using PHP

I have an XML file named flight-itinerary.xml. A scaled-down version is shown below.

<itin line="1" dep="LOS" arr="ABV">
    <flt>
        <fltav>
            <cb>1</cb>
            <id>C</id>
            <av>10</av>
            <cur>NGN</cur>
            <CurInf>2,0.01,0.01</CurInf>
            <pri>15000.00</pri>
            <tax>30800.00</tax>
            <fav>1</fav>
            <miles></miles>
            <fid>11</fid>
            <finf>0,0,1</finf>

            <cb>2</cb>
            <id>J</id>
            <av>10</av>
            <cur>NGN</cur>
            <CurInf>2,0.01,0.01</CurInf>
            <pri>13000.00</pri>
            <tax>26110.00</tax>
            <fav>1</fav>
            <miles></miles>
            <fid>12</fid>
            <finf>0,0,0</finf>
        </fltav>
    </flt>
</itin>

The complete file contains 8 itinerary <itin> elements. The <fltav> element of each <itin> element contains 11 of the <cb>1</cb> to <finf>0,0,1</finf> groups.

Below is the code I am using to process the file:

<?php

function processFlightsData()
{
    $data = array();
    $dom= new DOMDocument();
    $dom->load('flight-itinerary.xml');

    $classbands  = $dom->getElementsByTagName('classbands')->item(0);
    $bands       = $classbands->getElementsByTagName('band');
    $itineraries = $dom->getElementsByTagName('itin');
    $counter     = 0;

    foreach($itineraries AS $itinerary)
    { 
        $flt = $itinerary->getElementsByTagName('flt')->item(0);

        $dep = $flt->getElementsByTagName('dep')->item(0)->nodeValue;
        $arr = $flt->getElementsByTagName('arr')->item(0)->nodeValue;

        $time_data       = $flt->getElementsByTagName('time')->item(0);
        $departure_day   = $time_data->getElementsByTagName('ddaylcl')->item(0)->nodeValue;
        $departure_time  = $time_data->getElementsByTagName('dtimlcl')->item(0)->nodeValue;
        $departure_date  = $departure_day. ' '. $departure_time;
        $arrival_day     = $time_data->getElementsByTagName('adaylcl')->item(0)->nodeValue;
        $arrival_time    = $time_data->getElementsByTagName('atimlcl')->item(0)->nodeValue;
        $arrival_date    = $arrival_day. ' '. $arrival_time;
        $flight_duration = $time_data->getElementsByTagName('duration')->item(0)->nodeValue;

        $flt_det       = $flt->getElementsByTagName('fltdet')->item(0);
        $airline_id    = $flt_det->getElementsByTagName('airid')->item(0)->nodeValue;
        $flt_no        = $flt_det->getElementsByTagName('fltno')->item(0)->nodeValue;
        $flight_number = $airline_id. $flt_no;
        $airline_type  = $flt_det->getElementsByTagName('eqp')->item(0)->nodeValue;
        $stops         = $flt_det->getElementsByTagName('stp')->item(0)->nodeValue;

        $av_data = $flt->getElementsByTagName('fltav')->item(0);

        $cbs     = iterator_to_array($av_data->getElementsByTagName('cb')); //11 entries
        $ids     = iterator_to_array($av_data->getElementsByTagName('id')); //ditto
        $seats   = iterator_to_array($av_data->getElementsByTagName('av')); //ditto
        $curr    = iterator_to_array($av_data->getElementsByTagName('cur')); //ditto
        $price   = iterator_to_array($av_data->getElementsByTagName('pri')); //ditto
        $tax     = iterator_to_array($av_data->getElementsByTagName('tax')); //ditto
        $miles   = iterator_to_array($av_data->getElementsByTagName('miles')); //ditto
        $fid     = iterator_to_array($av_data->getElementsByTagName('fid')); //ditto    

        $inner_counter = 0;

        for($i = 0; $i < count($ids); $i++)
        {
            $data[$counter][$inner_counter] = array
            (
                'flight_number'                   => $flight_number,
                'flight_duration'                 => $flight_duration, 
                'departure_date'                  => $departure_date,
                'departure_time'                  => substr($departure_time, 0, 5),
                'arrival_date'                    => $arrival_date,
                'arrival_time'                    => substr($arrival_time, 0, 5),
                'departure_airport_code'          => $dep,
                'departure_airport_location_name' => get_airport_data($dep, $data_key='location'),
                'arrival_airport_code'            => $arr,
                'arrival_airport_location_name'   => get_airport_data($arr, $data_key='location'),
                'stops'                           => $stops,
                'cabin_class'                     => $ids[$i]->nodeValue,
                'ticket_class'                    => $ids[$i]->nodeValue,
                'ticket_class_nicename'           => formate_ticket_class_name($ids[$i]->nodeValue),
                'available_seats'                 => $seats[$i]->nodeValue,
                'currency'                        => $curr[$i]->nodeValue,
                'price'                           => $price[$i]->nodeValue,
                'tax'                             => $tax[$i]->nodeValue,
                'miles'                           => $miles[$i]->nodeValue,
            );

            ++$inner_counter;
        }

        ++$counter; // move on to the next itinerary's slot in $data
    }

    return $data;
}

?>

The outer loop runs once for each of the 8 <itin> elements, and on each of those iterations the inner loop runs 11 times, for a total of 88 iterations per pass. This is causing serious performance issues. What I am looking for is a faster method of processing the file. Any help will be greatly appreciated.

I don't think the loop is the bottleneck. You should check the operations that are called within the loop, get_airport_data and formate_ticket_class_name.
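One way to check whether a helper is the slow part is to time it in isolation. The sketch below is only an illustration: the get_airport_data body is a hypothetical stand-in (the real function is not shown in the question), the usleep() delay is an invented cost, and time_calls is a helper name made up for this example.

```php
<?php

// Hypothetical stand-in for the real get_airport_data();
// usleep() simulates a slow lookup such as a database or HTTP call.
function get_airport_data($code, $data_key = 'location')
{
    usleep(2000); // assumed cost of about 2 ms per call
    return $code . ' ' . $data_key;
}

// Hypothetical helper: runs $fn $times times and returns
// the elapsed wall-clock time in seconds.
function time_calls(callable $fn, $times)
{
    $start = microtime(true);
    for ($i = 0; $i < $times; $i++) {
        $fn();
    }
    return microtime(true) - $start;
}

// 8 itineraries x 11 fare groups x 2 airport lookups per row = 176 calls.
$calls   = 8 * 11 * 2;
$elapsed = time_calls(function () {
    get_airport_data('LOS', 'location');
}, $calls);

printf("%d calls took %.3f s\n", $calls, $elapsed);
```

If the same measurement on the real helpers accounts for most of the total run time, the DOM traversal itself is probably not the problem.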

Running your code (without the auxiliary operations) on a number of itin elements takes less than a second; check this fiddle: http://phpfiddle.org/main/code/7fpi-b3ka (Note that the XML might not match yours; I had to guess a lot of the elements that were missing.)

If any of the called operations increases the processing time substantially, try to call it with bulk data or cache the responses.
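As a rough sketch of the caching idea, a small memoizing wrapper can make sure each airport code is looked up only once. Everything here is an assumption for illustration: the body of get_airport_data is a stand-in for the real function, get_airport_data_cached is a hypothetical wrapper name, and the lookup_count counter exists only to show how many real lookups happen.

```php
<?php

// Counts how often the (assumed slow) lookup really runs.
$GLOBALS['lookup_count'] = 0;

// Hypothetical stand-in for the real get_airport_data().
function get_airport_data($code, $data_key = 'location')
{
    $GLOBALS['lookup_count']++;
    return $code . ' ' . $data_key;
}

// Hypothetical memoizing wrapper: each distinct ($code, $data_key)
// pair is passed to get_airport_data() at most once per request.
function get_airport_data_cached($code, $data_key = 'location')
{
    static $cache = array();

    $key = $code . '|' . $data_key;
    if (!array_key_exists($key, $cache)) {
        $cache[$key] = get_airport_data($code, $data_key);
    }

    return $cache[$key];
}

// Simulate the 8 x 11 loop from the question, same route on every row.
for ($i = 0; $i < 88; $i++) {
    $from = get_airport_data_cached('LOS');
    $to   = get_airport_data_cached('ABV');
}

echo $GLOBALS['lookup_count'], " real lookups for 176 requests\n";
```

With only two distinct airport codes, the 176 requests collapse into 2 real lookups. Since $dep and $arr do not change inside the inner loop of the question's code, the two lookups could also simply be done once per itinerary, before the for loop.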
