
I need to obtain JSON with data extracted from an HTML document

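I am trying to convert a company phone list, which contains pictures, names and phone numbers, into a JSON file. I tried looping over all the <a> elements to find the img src and the div.employee-desc text, but without success. I also tried DOMDocument(), and that failed as well.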

<section>
        <a href="tel:+471234567890">
        <article class="clearfix">
          <div class="employee-image">       
            <img src"image_1.jpg">
          </div>
          <div class="employee-desc">
            Emma doe <br>
            +471234567890
          </div>
        </article>
        </a>
        <a href="tel:+471234567890">
        <article class="clearfix">
          <div class="employee-image">       
            <img src"image_2.jpg">
          </div>
          <div class="employee-desc">
            Frank doe <br>
            +471234567890
          </div>
        </article>
        </a>
        <a href="tel:+xxxxxxxx">
        <article class="clearfix">
          <div class="employee-image">       
            <img src"image_3.jpg">
          </div>
          <div class="employee-desc">
            John doe <br>
            +471234567890
          </div>
        </article></a>
    </section>

Ideally, the JSON file would look like this:

[  
   {  
      "image":"image_1.jpg",
      "name":"Emma doe",
      "phone":"+47 1234567890"
   },
   {  
      "image":"image_2.jpg",
      "name":"Frank doe",
      "phone":"+47 1234567890"
   },
   {  
      "image":"image_3.jpg",
      "name":"John doe",
      "phone":"+47 1234567890"
   }
]

Does anyone know how to do this in PHP?

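You can find the code below. Note that the img tags in your example are malformed: they should read 'img src="..."' rather than 'img src"..."'.

I assume your HTML is in the $html variable.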

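$json_arr = array();

// Keep only the markup between <section> and </section>.
$html = substr($html, strpos($html, '<section>') + strlen('<section>'));
$html = substr($html, 0, strpos($html, '</section>'));

// Every chunk after the first starts with the href value of one <a>.
$arr = explode('<a href="', $html);
foreach ($arr as $k => $line) {
    if ($k == 0) continue; // text before the first <a>

    // Phone: the href value up to its closing quote, without the tel: prefix.
    $phone = substr($line, 0, strpos($line, '"'));
    $phone = str_replace('tel:', '', $phone);
    $phone = trim($phone);

    // Image: the src value (requires the corrected <img src="..."> form).
    $marker = '<img src="';
    $image = substr($line, strpos($line, $marker) + strlen($marker));
    $image = substr($image, 0, strpos($image, '"'));

    // Name: the text between the opening employee-desc div and the <br>.
    $marker = '<div class="employee-desc">';
    $name = substr($line, strpos($line, $marker) + strlen($marker));
    $name = substr($name, 0, strpos($name, '<br'));
    $name = trim($name); // trim last, so the space before <br> is removed

    $json_arr[] = array('image' => $image, 'name' => $name, 'phone' => $phone);
}

$json = json_encode($json_arr);
echo $json . "\n";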
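
One detail worth noting: the desired output shows the phone as "+47 1234567890", while the href values contain no space, so the code above produces "+471234567890". If the space matters, a small helper along these lines could be applied to $phone before it is stored. This is only a sketch: formatPhone is a hypothetical name, and it assumes a fixed two-digit country code as in the sample data, which does not hold for phone numbers in general.

```php
<?php
// Hypothetical helper, not part of the original answer.
// Assumes the number starts with "+" followed by digits only, and that the
// country code has a fixed length (2 digits here, as in +47).
function formatPhone(string $phone, int $countryCodeDigits = 2): string
{
    $pattern = '/^\+(\d{' . $countryCodeDigits . '})(\d+)$/';
    if (preg_match($pattern, $phone, $m) === 1) {
        return "+{$m[1]} {$m[2]}";
    }
    return $phone; // anything unexpected is returned unchanged
}

echo formatPhone('+471234567890'), "\n"; // prints "+47 1234567890"
```

Values that do not match the expected shape, such as the placeholder "+xxxxxxxx" in the sample, pass through untouched.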

A shorter approach using PHP Simple HTML DOM Parser:

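// HtmlDomParser is assumed to be the class from a Composer package that wraps
// PHP Simple HTML DOM Parser; add the use statement for the package you installed.
// $data holds the HTML string.
$html = HtmlDomParser::str_get_html($data);

$result = array();
foreach ($html->find('a') as $element) {
    // a > article > div.employee-image > img
    $image = $element->children(0)->children(0)->children(0)->src;
    // div.employee-desc contains "name <br> phone"
    list($name, $phone) = array_map('trim', explode('<br>', $element->children(0)->children(1)->innertext));
    $result[] = (object) compact('image', 'name', 'phone');
}

$output = json_encode($result, JSON_PRETTY_PRINT);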
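
Since the question mentions that DOMDocument() did not work, here is a minimal sketch of how the same extraction might look with PHP's built-in DOMDocument and DOMXPath, with no external library. It assumes the img tags have been corrected to src="...", and extractEmployees is a name chosen only for illustration. The sample markup is a shortened copy of the HTML from the question.

```php
<?php
// Shortened sample from the question, with the img tags corrected.
$html = <<<'HTML'
<section>
  <a href="tel:+471234567890">
    <article class="clearfix">
      <div class="employee-image"><img src="image_1.jpg"></div>
      <div class="employee-desc">
        Emma doe <br>
        +471234567890
      </div>
    </article>
  </a>
  <a href="tel:+471234567890">
    <article class="clearfix">
      <div class="employee-image"><img src="image_2.jpg"></div>
      <div class="employee-desc">
        Frank doe <br>
        +471234567890
      </div>
    </article>
  </a>
</section>
HTML;

// Illustrative helper: returns one array per <a href="tel:..."> entry.
function extractEmployees(string $html): array
{
    $doc = new DOMDocument();
    // Collect parser warnings internally; <section> and <article> trigger
    // "invalid tag" warnings in libxml's HTML parser.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $rows = [];
    foreach ($xpath->query('//a[starts-with(@href, "tel:")]') as $a) {
        $img  = $xpath->query('.//img', $a)->item(0);
        $desc = $xpath->query('.//div[contains(@class, "employee-desc")]', $a)->item(0);

        // The text nodes on either side of <br> hold the name and the phone.
        $parts = [];
        if ($desc !== null) {
            foreach ($desc->childNodes as $node) {
                $text = trim($node->nodeValue ?? '');
                if ($node->nodeType === XML_TEXT_NODE && $text !== '') {
                    $parts[] = $text;
                }
            }
        }

        $rows[] = [
            'image' => $img instanceof DOMElement ? $img->getAttribute('src') : '',
            'name'  => $parts[0] ?? '',
            // Fall back to the href value if the div has no phone text.
            'phone' => $parts[1] ?? substr($a->getAttribute('href'), strlen('tel:')),
        ];
    }
    return $rows;
}

echo json_encode(extractEmployees($html), JSON_PRETTY_PRINT), "\n";
```

Reading the text nodes around the br element, rather than splitting a string on '<br>', keeps the sketch independent of how the br tag happens to be written in the source.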

