
PHP to reformat XML steps needed

I want to write the code myself, but I need someone to tell me what I'm dealing with and lay out the basic steps, not the actual code. Right now my PHP calls file_get_contents against an HTTP GET endpoint, and the data returned is XML. It is a medical claims result, so there could be one claim or 200 claims returned; every claim has the same element structure, and the elements just repeat. I need to take each element name inside the CLAIM element and print those names horizontally, delimited by the dot I use in the second example output below. The names should be listed only once, not repeated. Then I need the actual data inside the elements to display horizontally as well, with the same dot delimiter. So if 100 claims are returned, the data should keep running horizontally, dot-delimited.

<CLAIM_LIST>
  <CLAIM>
    <fund_code>TTG-PMA N351</fund_code>
    <fund_name>TTG</fund_name>
    <ProviderTIN>444555666</ProviderTIN>
  </CLAIM>
  <CLAIM>
    <fund_code>XXX-PMA N444</fund_code>
    <fund_name>ILWU</fund_name>
    <ProviderTIN>888777666</ProviderTIN>
  </CLAIM>
</CLAIM_LIST>

Turn the above into the below. I already know how to add the dot delimiter and the column_names and data elements.


<column_names>
    fund_code·fund_name·ProviderTIN
 </column_names>

<data>
 TTG-PMA N351·TTG·444555666·XXX-PMA N444·ILWU·888777666
 </data>


I did it, and actually surprised myself. $file is the entire response I wanted to modify. This gives me the horizontal output I wanted. The only problem now is getting a row count. Any suggestions? Maybe some math: take the column count I got, count the dot-delimited positions in the full data row, and divide one by the other?

To strip out the column names, insert the dot delimiter, and count the columns:

$xml = simplexml_load_string($file);

$claimsNames = '';
$col_count = 0;

// The children of the first CLAIM element supply the column names
foreach ($xml->children()->children() as $child) {
    $claimsNames .= $child->getName() . "·";
    $col_count++;
}

Then to strip the data out of all the elements and insert the dot delimiter:

// Replace every tag with three spaces
$claimsData = trim(preg_replace('/<[^>]*>/', '   ', $file));
// A run of six spaces (two tags back to back) becomes one dot
$claimsData = str_replace('      ', '·', $claimsData);

My final code:

    // Fetch the XML using the HTTP headers set in $context above
    $file = file_get_contents($remote_url, false, $context);

    $start_time  = microtime(true);
    $col_count   = 0;
    $row_count   = 0;
    $claimsNames = '';

    $xml = simplexml_load_string($file);

    // THE LOOP! Pull the column names out of the first CLAIM and count the columns
    foreach ($xml->children()->children() as $child) {
        $claimsNames .= $child->getName() . "·";
        $col_count++;
    }

    // Replace tags with spaces, then turn each run of six spaces into a dot
    $claimsData = trim(preg_replace('/<[^>]*>/', '   ', $file));
    $claimsData = str_replace('      ', '·', $claimsData);

    $row_count1 = count(explode('·', $claimsData));  // total dot-delimited positions, counting from 1
    $ColPlusOne = $col_count + 1;                    // 28 plus 1 = 29
    $row_count2 = $row_count1 / $ColPlusOne;         // divide by the number of columns plus one
    $row_count3 = ceil($row_count2);                 // round up to give the total number of rows

    // Anything other than 28 columns means no record came back
    if ($col_count != 28) {
        $col_count  = 0;
        $row_count3 = 0;
    }

    $time = round(microtime(true) - $start_time, 4);
    ?>

    <response>
    <time><?=$time?></time>
    <cols><?=$col_count?></cols>
    <rows><?=$row_count3?></rows>
    <column_names>
    <?=$claimsNames?>    
    </column_names>
    <data>
    <?=$claimsData?>                                                       
    </data>
    </response>

It gives output like this:
<response>
<time>0.0029</time>
<cols>28</cols>
<rows>83</rows>
<column_names>
fund_code·fund_name·ProviderTIN·provider_name·claim_num·status·dos·dos_end·ProcessDate·patient_id·patient_dob·patient_name·patient_lastname·patient_firstname·patient_middlename·patient_relationship·Payee·AmountBilled·AmountCovered·AmountPaid·AmountCopay·Discount·Deductible·PatientAmount·dup·Source·ClaimSource·OriginalClaimNumber·
</column_names>
<data>
TTG-PMA N351·TTG·111222999··20200312-209·Issued·20200303·20200303·20200312·0000037725·19510915·VAN HALEN EDDIE·VAN HALEN·EDDIE··Participant·Provider·8127.00·2888.80·2888.80·0.00·5238.20·0.00·0.00··AMBICAB·SG·20200312-209··TTG-PMA N351·TTG·111222999··20200318-1361·Issued·20200303·20200303·20200318·0000037725·19510915·VAN HALEN EDDIE·VAN HALEN·EDDIE··Participant·Provider·26.00·9.99·9.99·0.00·16.01·0.00·0.00··AMBICAB·SG·20200318-1361··TTG-PMA N351·TTG·111222999··20200318-1362·Issued·20200303·20200303·20200318·0000037725·19510915·VAN HALEN EDDIE·VAN HALEN·EDDIE··Participant·Provider·17.00·10.31·10.31·0.00·6.69·0.00·0.00··AMBICAB·SG·20200318-1362··TTG-PMA N351·TTG·252363454··20200407-1405·Issued·20200303·20200303·20200407·0000037725·19510915·VAN HALEN EDDIE·VAN HALEN·EDDIE··Participant·Provider·765.00·180.57·180.57·0.00·584.43·0.00·0.00··AMBICAB·SG·20200407-1405··TTG-PMA N351·TTG·472728752··20191119-3554·Issued·20191021·20191021·20191120·0000037725·19510915·VAN HALEN

 

I really appreciate that you took the time here, Jack, and wrote the code you did. I have a lot to learn from your code. I would never have known that about regex, and I have never used "DOM" before. My code is, well, a hack job, and the math took me a while: I tested it against 30 different claims returns, and it always gave me the correct row count. This is for a Cisco IVR, so the XML has to stay XML, but formatted this way so the Cisco can keep its string delimiter counts for processing. None of this will ever appear on a terminal screen, as it's 100% machine to machine, hence the XML format all the way through. Column counts and row counts are oh SO important in the IVR world.
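The row and column counts can also be read straight off the parsed document instead of being derived from separator arithmetic. A minimal sketch, assuming the response always has the CLAIM_LIST/CLAIM shape shown in the question; countClaims is a hypothetical helper name, not part of the original code:

```php
<?php
// Hypothetical helper: rows = number of CLAIM elements,
// cols = number of child elements in the first CLAIM.
function countClaims(string $xmlString): array
{
    libxml_use_internal_errors(true);          // keep parse errors out of the output
    $xml = simplexml_load_string($xmlString);
    if ($xml === false) {
        return ['rows' => 0, 'cols' => 0];     // unparseable response: report no records
    }

    $rows = count($xml->CLAIM);
    $cols = $rows > 0 ? count($xml->CLAIM[0]->children()) : 0;

    return ['rows' => $rows, 'cols' => $cols];
}

$sample = <<<XML
<CLAIM_LIST>
  <CLAIM>
    <fund_code>TTG-PMA N351</fund_code>
    <fund_name>TTG</fund_name>
    <ProviderTIN>444555666</ProviderTIN>
  </CLAIM>
  <CLAIM>
    <fund_code>XXX-PMA N444</fund_code>
    <fund_name>ILWU</fund_name>
    <ProviderTIN>888777666</ProviderTIN>
  </CLAIM>
</CLAIM_LIST>
XML;

$counts = countClaims($sample);
echo "rows={$counts['rows']} cols={$counts['cols']}\n";   // prints rows=2 cols=3
```

Counting elements this way does not depend on how many separators end up in the data string, so empty fields or a trailing dot cannot skew the result.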

First things first: you are dealing with XML, and fairly complex XML at that. Working on XML (or HTML, for that matter) with regex is not a good idea. Search around and you'll see that's an almost universal consensus.
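To make the concern concrete, here is a small sketch comparing tag stripping by regex with a real parser. The input is made up for illustration (the CDATA section and the self-closing dup element are assumptions, not taken from the actual claims feed):

```php
<?php
// Illustrative input: one field wrapped in CDATA, one empty field, one ordinary field.
$xml = '<CLAIM><provider_name><![CDATA[Smith & Sons]]></provider_name><dup/><status>Issued</status></CLAIM>';

// Regex approach: anything between "<" and the next ">" is treated as a tag.
// The whole CDATA section matches that pattern, so its content is thrown away.
$viaRegex = trim(preg_replace('/<[^>]*>/', ' ', $xml));

// Parser approach: every child element yields exactly one value, even when empty.
$claim = simplexml_load_string($xml, 'SimpleXMLElement', LIBXML_NOCDATA);
$viaParser = [];
foreach ($claim->children() as $child) {
    $viaParser[] = (string) $child;
}

var_dump($viaRegex);    // string(6) "Issued"
var_dump($viaParser);   // three entries: "Smith & Sons", "", "Issued"
```

With the regex, the provider name disappears and the empty field leaves no position of its own, which matters when the consumer relies on fixed column positions.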

The most appropriate tools for working with XML are XPath and XQuery. Unfortunately, XPath support in PHP is limited (its DOM extension implements XPath 1.0 only), so getting your expected output is going to involve a lot of mental gymnastics.

Having said that, since you asked to do it in php, here's an answer in php:

$string = <<<XML
[your xml snippet above]
XML;

//loading boilerplate
$claimsdoc = new DOMDocument();
$claimsdoc->loadXML($string);
$claimsdoc_xpath = new DOMXPath($claimsdoc);

$claims = $claimsdoc_xpath->evaluate('count(//CLAIM)'); //get the number of CLAIMs
$tags = $claimsdoc_xpath->evaluate('count(//CLAIM[1]//*)');//get the number of tags per CLAIM 

//get the column names and create the xml output
$cols = $claimsdoc_xpath->evaluate(".//CLAIM[1]//*");
$colnames = htmlspecialchars("<column_names>", ENT_QUOTES) . "\n<br>";
for ($x = 0; $x < $tags; $x++) {
    $result = $cols->item($x);
    $colnames .= $result->tagName;
    if ($x < $tags - 1) {
        $colnames .= " * ";
    }
}
$colnames .= "\n<br>" . htmlspecialchars("</column_names>", ENT_QUOTES);

//get the claim data and create the xml output
$data = htmlspecialchars("<data>", ENT_QUOTES) . "\n<br>";
for ($x = 1; $x <= $claims; $x++) {
    // concat() joins the three known fields of the $x-th CLAIM
    $result = $claimsdoc_xpath->evaluate("concat(.//CLAIM[$x]//fund_code/text(),' * ',.//CLAIM[$x]//fund_name,' * ',.//CLAIM[$x]//ProviderTIN)");
    $data .= $result;
    if ($x < $claims) {
        $data .= " * ";
    }
}
$data .= "\n<br>" . htmlspecialchars("</data>", ENT_QUOTES);

echo $colnames;
echo "\n<br>";
echo $data;

Output:

<column_names>
fund_code * fund_name * ProviderTIN
</column_names>
<data>
TTG-PMA N351 * TTG * 444555666 * XXX-PMA N444 * ILWU * 888777666
</data>
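The concat() call above names the three fields explicitly, which gets unwieldy for the 28-column responses described in the question. A possible generalization, sketched under the assumption that every CLAIM is a direct child of CLAIM_LIST and holds only flat child elements; flattenClaims is a hypothetical helper, and the "·" separator is taken from the question:

```php
<?php
// Hypothetical helper: collects column names from the first CLAIM and
// the values of every CLAIM, without hard-coding any field names.
function flattenClaims(string $xmlString, string $sep = '·'): array
{
    $doc = new DOMDocument();
    $doc->loadXML($xmlString);
    $xpath = new DOMXPath($doc);

    // Column names: element names of the first CLAIM's children.
    $names = [];
    foreach ($xpath->query('/CLAIM_LIST/CLAIM[1]/*') as $node) {
        $names[] = $node->nodeName;
    }

    // Data: text of every child of every CLAIM, in document order.
    $values = [];
    $claims = $xpath->query('/CLAIM_LIST/CLAIM');
    foreach ($claims as $claim) {
        foreach ($xpath->query('*', $claim) as $field) {
            $values[] = $field->textContent;
        }
    }

    return [
        'cols'         => count($names),
        'rows'         => $claims->length,
        'column_names' => implode($sep, $names),
        'data'         => implode($sep, $values),
    ];
}

$sample = <<<XML
<CLAIM_LIST>
  <CLAIM>
    <fund_code>TTG-PMA N351</fund_code>
    <fund_name>TTG</fund_name>
    <ProviderTIN>444555666</ProviderTIN>
  </CLAIM>
  <CLAIM>
    <fund_code>XXX-PMA N444</fund_code>
    <fund_name>ILWU</fund_name>
    <ProviderTIN>888777666</ProviderTIN>
  </CLAIM>
</CLAIM_LIST>
XML;

$flat = flattenClaims($sample);
echo $flat['column_names'], "\n";   // fund_code·fund_name·ProviderTIN
echo $flat['data'], "\n";           // TTG-PMA N351·TTG·444555666·XXX-PMA N444·ILWU·888777666
```

Because textContent returns decoded text, values should be passed through htmlspecialchars($value, ENT_XML1 | ENT_QUOTES, 'UTF-8') before being written back into an XML response.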

As I mentioned in the beginning, if your dataset is large enough and you have to do this often enough, it may be worth your while to learn XPath/XQuery and to work with an XML database like BaseX.


 