
How to decode email?

I am scraping a page that contains e-mail addresses like ...mailto:Stewart.Smi&#1... and similar. They are encoded; how can I decode them with PHP? Thanks (for educational purposes only).

These are just ordinary ASCII characters which, for mysterious reasons, have been encoded as HTML numeric character references, i.e. the letter "a" is coded as &#97;.

A list of common encodings

The built-in PHP function html_entity_decode() should convert these back to readable UTF-8.

Try html_entity_decode() to get the decoded value.

For example:

// The entity needs its trailing semicolon: &#111; is the letter "o".
$str = "mailt&#111;";
$string = html_entity_decode($str);
echo $string; // mailto
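
As a slightly fuller sketch of the same idea, the snippet below decodes a whole mailto: link and strips the scheme so only the address remains. The sample address and the helper name decode_mailto are made up for illustration; neither comes from the scraped page.

```php
<?php
// Hypothetical helper (not from the original post): decode the character
// references in a mailto: href and return the bare e-mail address.
function decode_mailto(string $href): string
{
    // ENT_QUOTES also decodes quote entities; ENT_HTML5 selects the HTML5 entity table.
    $decoded = html_entity_decode($href, ENT_QUOTES | ENT_HTML5, 'UTF-8');

    // Drop a leading "mailto:" scheme if present (case-insensitive).
    return (string) preg_replace('/^mailto:/i', '', $decoded);
}

// Made-up sample: "jane@example.com" with some characters written as entities.
$href = 'mailto:&#106;&#97;&#110;&#101;&#64;example&#46;com';
echo decode_mailto($href), PHP_EOL; // jane@example.com
```

Passing the charset explicitly keeps the result the same across PHP versions whose default encoding may differ.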

Each entity is the decimal representation of a character. This Perl code will translate simple ASCII.

use strict;
use warnings;

my $mail = 'mailto:Stewart.Smi&#1';

$mail =~ s/&#(\d+);/chr $1/eg;

print $mail;

OUTPUT

mailto:Stewart.Smi&#1
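
For comparison, the same substitution can be sketched in PHP without html_entity_decode(). This is only an illustration: the function name decode_decimal_refs and the sample string are hypothetical, and mb_chr() assumes PHP 7.2 or later with the mbstring extension enabled.

```php
<?php
// Hypothetical helper: replace each decimal reference &#NNN; with its character,
// mirroring the Perl substitution s/&#(\d+);/chr $1/eg.
function decode_decimal_refs(string $text): string
{
    return (string) preg_replace_callback(
        '/&#(\d+);/',
        function (array $m): string {
            // mb_chr maps a code point to a UTF-8 string, or returns false if invalid.
            $char = mb_chr((int) $m[1], 'UTF-8');
            // Leave the reference untouched when the code point is not valid.
            return $char === false ? $m[0] : $char;
        },
        $text
    );
}

echo decode_decimal_refs('mailt&#111;:a&#64;b'), PHP_EOL; // mailto:a@b
```

As with the Perl pattern, a reference that lacks its closing semicolon does not match and is passed through unchanged.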
