
Scraping HTML with PHP and encoding problems

I am trying to scrape the following URL with PHP: http://www.clubedoricardo.com.br/Produto/Smartphone-Samsung-Galaxy-Win-2-Duos-G360-Cinza-Dual-Chip-4G-Tela-45-Camera-5MP-Frontal-2MP-Quad-Core-12Ghz-8GB/44-491-496-568187

$url="http://www.clubedoricardo.com.br/Produto/Smartphone-Samsung-Galaxy-Win-2-Duos-G360-Cinza-Dual-Chip-4G-Tela-45-Camera-5MP-Frontal-2MP-Quad-Core-12Ghz-8GB/44-491-496-568187";
$dom = new DOMDocument;
$dom->loadHTMLFile($url);
$page_content = $dom->saveHTML();
echo($page_content);

But the text comes out with garbled characters. I tried converting it to UTF-8 and ISO-8859, but nothing changed.

Any ideas?

When I follow the link you provided, a blank page appears. Try:

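// loadHTML() expects markup, not a URL, so fetch the page body first and convert that.
$dom->loadHTML(mb_convert_encoding(file_get_contents($url), 'HTML-ENTITIES', 'UTF-8'));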
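
A slightly fuller sketch of the same idea, in case it helps: take the markup as a string, normalize it to UTF-8, and turn every non-ASCII character into a numeric entity before handing it to DOMDocument. The helper name `loadHtmlAsUtf8` and the sample markup are made up for illustration and are not from the question. `mb_encode_numericentity` is used here instead of the `'HTML-ENTITIES'` target because PHP 8.2 deprecated that pseudo-encoding. The sketch assumes the `mbstring` and `dom` extensions are enabled and that this file itself is saved as UTF-8.

```php
<?php
// Hypothetical helper: parse an HTML string into a DOMDocument without
// letting libxml guess the character encoding.
function loadHtmlAsUtf8(string $html, string $sourceEncoding = 'UTF-8'): DOMDocument
{
    // Normalize the input to UTF-8 if the caller says it is something else.
    if (strcasecmp($sourceEncoding, 'UTF-8') !== 0) {
        $html = mb_convert_encoding($html, 'UTF-8', $sourceEncoding);
    }

    // Replace every code point from U+0080 upward with a numeric entity
    // (for example "â" becomes "&#226;"), leaving only ASCII bytes behind.
    $html = mb_encode_numericentity($html, [0x80, 0x10FFFF, 0, 0x1FFFFF], 'UTF-8');

    $dom = new DOMDocument();
    // Real-world pages are rarely valid HTML; collect parser warnings quietly.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $dom;
}

// Illustrative sample markup; no network access is needed to try the helper.
$sample = '<html><body><p>Câmera frontal</p></body></html>';
$dom = loadHtmlAsUtf8($sample);
echo $dom->getElementsByTagName('p')->item(0)->textContent, "\n";
```

For the page in the question, something like `loadHtmlAsUtf8(file_get_contents($url), 'ISO-8859-1')` might be what is needed, but the page's real encoding is a guess here. Check its `Content-Type` header or `<meta charset>` tag first and pass that value as the second argument.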
