PHP regex to parse a mailto link

I have the following HTML source string:

<a href="mailto:abcd@test.com?body=This%20is%20the%20body%20-123-&subject=Subject%20Text&Content-Type=text/plain">Reply To Post</a>

From the above string I want to extract:

  1. Email address, i.e. the part after mailto: and before ?
  2. Body
  3. Subject

Any help with the regex will be appreciated. Thanks in advance.

You would not need a regex for the second part. It can be parsed as a query string, IMO.

Something like: ( $s is the value of href in the following code)

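// Group 1 is the address, group 2 is the raw query string after the "?"
if (preg_match('/mailto:(.*?)\?(.*)/', $s, $matches)) {
    echo "Email: " . $matches[1] . "\n";
    // parse_str() also URL-decodes the values, so %20 becomes a space
    parse_str($matches[2], $output);
    echo "Body: " . $output['body'] . "\n";
    echo "Subject: " . $output['subject'] . "\n";
}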

Actually, if you are sure the string always appears in exactly this form, you could also just take the substring between the index of ":" and the index of "?".
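
The substring idea above can be sketched with strpos() and substr(). This is only a minimal illustration: the function name splitMailto() and the shape of the returned array are made up for the example, and it assumes the input is the bare href value rather than the whole tag.

```php
<?php
// Hypothetical helper (not from the original answer): split a mailto href
// using plain string functions instead of a regex.
function splitMailto(string $href): ?array
{
    $colon = strpos($href, ':');
    if ($colon === false) {
        return null; // no "scheme:" prefix at all
    }
    $question = strpos($href, '?', $colon);
    if ($question === false) {
        // No query part: everything after the colon is the address.
        return ['email' => substr($href, $colon + 1), 'params' => []];
    }
    // Address sits strictly between ":" and "?".
    $email = substr($href, $colon + 1, $question - $colon - 1);
    // parse_str() splits on "&" and URL-decodes the values.
    parse_str(substr($href, $question + 1), $params);
    return ['email' => $email, 'params' => $params];
}

$href = 'mailto:abcd@test.com?body=This%20is%20the%20body%20-123-&subject=Subject%20Text&Content-Type=text/plain';
$parts = splitMailto($href);
echo $parts['email'], "\n";             // abcd@test.com
echo $parts['params']['body'], "\n";    // This is the body -123-
echo $parts['params']['subject'], "\n"; // Subject Text
```

This avoids the regex entirely, but it does no validation: any string containing a colon is accepted.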

This assumes you have only a single mailto link:

// $str will be your string content from the question
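// Capture everything up to the closing quote; a lazy [^"]+? with nothing after it would stop after one character
if (preg_match('/"mailto:([^"]+)"/', $str, $matches) && false !== ($info = parse_url($matches[1]))) {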
        $emailAddress = $info['path'];
        $emailParameters = array();
        if (isset($info['query'])) {
                parse_str($info['query'], $emailParameters);
        }
        var_dump($emailAddress, $emailParameters);
}

It matches from "mailto: up to the next double quote and uses parse_url to do the rest: the address ends up in the path component and the parameters in the query component.
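
If the page may contain more than one mailto link, the same parse_url idea can be applied in a loop. The sketch below is an assumption-laden variant, not the answerer's code: extractMailtoLinks() is a made-up name, it uses preg_match_all(), and it additionally decodes HTML entities because well-formed HTML usually writes "&" as "&amp;" inside attributes.

```php
<?php
// Hypothetical helper: collect every mailto link found in an HTML fragment.
function extractMailtoLinks(string $html): array
{
    $links = [];
    // Capture everything between "mailto: and the closing double quote.
    preg_match_all('/"mailto:([^"]+)"/i', $html, $matches);
    foreach ($matches[1] as $raw) {
        // Turn "&amp;" back into "&" before splitting the query string.
        $info = parse_url(html_entity_decode($raw));
        if ($info === false || !isset($info['path'])) {
            continue; // skip anything parse_url() cannot handle
        }
        $params = [];
        if (isset($info['query'])) {
            parse_str($info['query'], $params);
        }
        $links[] = ['email' => $info['path'], 'params' => $params];
    }
    return $links;
}

$html = '<a href="mailto:abcd@test.com?body=Hi&amp;subject=Hello">A</a> '
      . '<a href="mailto:xyz@test.com">B</a>';
$links = extractMailtoLinks($html);
print_r($links);
```

Like the original, this only recognises double-quoted attribute values; for arbitrary HTML a real parser such as DOMDocument is likely the safer choice.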

I haven't tried it in PHP, but it works fine in Regex Hero:

"mailto:([\\w%.+-]+?@[\\w.-]+?)(?:[?&](?:body=(.*?)|subject=(.*?)|[\\w-]+=.*?))+?"

This should result in the following capture groups:

  • 1: email address
  • 2: body
  • 3: subject

You might want to do some more intensive testing though, as I'm not sure whether the pattern covers all valid email addresses.
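
For reference, here is one way the pattern above might be carried over to PHP. It is a sketch, not a tested port by the answerer: the doubled backslashes are reduced to single ones, ~ is chosen as the delimiter, and the leading and trailing double quotes are treated as literal characters that match the quotes around the href value. The captures are still percent-encoded, so rawurldecode() is applied afterwards.

```php
<?php
$html = '<a href="mailto:abcd@test.com?body=This%20is%20the%20body%20-123-&subject=Subject%20Text&Content-Type=text/plain">Reply To Post</a>';

// Same pattern with single backslashes, wrapped in ~ delimiters for PCRE.
$pattern = '~"mailto:([\w%.+-]+?@[\w.-]+?)(?:[?&](?:body=(.*?)|subject=(.*?)|[\w-]+=.*?))+?"~';

if (preg_match($pattern, $html, $m)) {
    $email   = $m[1];
    // Groups 2 and 3 hold the raw (still URL-encoded) body and subject.
    $body    = rawurldecode($m[2] ?? '');
    $subject = rawurldecode($m[3] ?? '');
    echo $email, "\n", $body, "\n", $subject, "\n";
}
```

Note that the pattern requires at least one parameter after the address, so a bare "mailto:abcd@test.com" link would not match.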

Try this:

preg_match('/mailto:(.+?)\?/', $s, $m);

It matches the word mailto followed by a colon, then a capturing group (the parentheses) containing any character (.) one or more times (+), non-greedily (the ? after the + makes the capture as short as possible), followed by an escaped question mark (\?). The email address ends up in $m[1]; $s is the string to search.
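
To see why the non-greedy quantifier matters, compare it with the greedy form on a string that contains a second question mark. The input string here is invented for the illustration.

```php
<?php
// Made-up input with a "?" inside the body as well.
$s = 'mailto:a@b.com?body=what?&subject=x';

preg_match('/mailto:(.+?)\?/', $s, $lazy);   // stops at the first "?"
preg_match('/mailto:(.+)\?/', $s, $greedy);  // runs on to the last "?"

echo $lazy[1], "\n";   // a@b.com
echo $greedy[1], "\n"; // a@b.com?body=what
```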
