PHP - Export name and email address from file

I have a file that contains a list of people, phone numbers, and email addresses,

for example

Coulthard
Sally Coulthard
Location: Surrey
Expertise Covered: Horse, Dog, Horse and Rider
Website: www.veterinaryphysio.co.uk
Tel: 07865095005
Email: sally@veterinaryphysio.co.uk

Kate Haynes
Location: Surrey, Sussex, Kent
Expertise Covered: Horse, Performance, Horse and Rider
Tel: 07957 344688
Email: katehaynesphysio@yahoo.co.uk

The list is like the above but with hundreds of entries. How do I create a regex that reads the file from the top down, extracts the first-and-last-name line and the email address, and puts them together like the following?

first and last name, email address

Any help would be awesome

I have the code below, but it reads only the email addresses:

$string = file_get_contents("physio.txt"); // Load text file contents

// don't need to preassign $matches, it's created dynamically

// this regex handles more email address formats like a+b@google.com.sg, and the i makes it case insensitive
$pattern = '/[a-z0-9_\-\+]+@[a-z0-9\-]+\.([a-z]{2,3})(?:\.[a-z]{2})?/i';

// preg_match_all returns the number of full matches; the matched text is stored in $matches
preg_match_all($pattern, $string, $matches);

// the data you want is in $matches[0], dump it with var_export() to see it
echo "<pre>";
$input = $matches[0];
echo count($input);
echo "<br>";
$result = array_unique($input);
echo count($result);
echo "<br>";
//print_r($result);
echo "</pre>";

Regex is a sensible way to parse this data, provided the pattern includes enough literal context to keep the matching accurate.

I'll suggest the following:

Pattern: ~^(.+)\RLocation:[\s\S]*?^Email: (\S*)~m

Nearby substrings Location: and Email: are used to ensure the correct substrings are targeted.

The m pattern modifier is used to improve pattern accuracy through the ^ character matching the start of a line (not just the start of the string).
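To make the effect of the m modifier concrete, here is a minimal sketch that runs the same anchored pattern with and without it. The two-line sample string and the address kate@example.com are made up for illustration and are not taken from the question's file.

```php
<?php
// Hypothetical two-line sample: "Email:" starts the second line, not the string.
$text = "Kate Haynes\nEmail: kate@example.com";

// Without m, ^ only anchors at the very start of the string, so nothing matches.
$withoutM = preg_match('~^Email: (\S*)~', $text, $a);

// With m, ^ also anchors right after each newline, so the second line matches.
$withM = preg_match('~^Email: (\S*)~m', $text, $b);

var_dump($withoutM); // int(0)
var_dump($withM);    // int(1)
echo $b[1], "\n";    // kate@example.com
```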

Breakdown:

~          #pattern delimiter
^          #match start of a line
(.+)       #capture one or more non-newline characters (Capture Group #1)
\R         #match a newline character (\r, \n, \r\n)
Location:  #match literal: "Location" followed by colon
[\s\S]*?   #match (lazily) zero or more of any character
^Email:    #match start of a line, literal: "Email", colon, space
(\S*)      #capture zero or more non-whitespace characters (Capture Group #2 -- quantifier means the email value can be blank and still valid)
~          #pattern delimiter
m          #pattern modifier tells regex engine that ^ means start of a line instead of start of the string

Code:

$input = "Coulthard
Sally Coulthard
Location: Surrey
Expertise Covered: Horse, Dog, Horse and Rider
Website: www.veterinaryphysio.co.uk
Tel: 07865095005
Email: sally@veterinaryphysio.co.uk

Kate Haynes
Location: Surrey, Sussex, Kent
Expertise Covered: Horse, Performance, Horse and Rider
Tel: 07957 344688
Email: katehaynesphysio@yahoo.co.uk";

if (preg_match_all("~^(.+)\RLocation:[\s\S]*?^Email: (\S*)~m", $input, $matches, PREG_SET_ORDER)) {
    foreach ($matches as $data) {
        echo "{$data[1]}, {$data[2]}\n";
    }
}

Output:

Sally Coulthard, sally@veterinaryphysio.co.uk
Kate Haynes, katehaynesphysio@yahoo.co.uk
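Since the question reads from physio.txt rather than from a string literal, the same pattern could be wrapped in a small helper and fed the file contents. This is only a sketch: the function name extractContacts is hypothetical, and it assumes every record has a name line directly above its Location: line. The trim() call is there because, in a file with Windows line endings, the name capture may end in a stray \r.

```php
<?php
/**
 * Return "Name, email" strings for every record found in $text.
 * Assumption: the name line sits directly above the "Location:" line.
 */
function extractContacts(string $text): array
{
    $pattern = '~^(.+)\RLocation:[\s\S]*?^Email: (\S*)~m';
    $lines = [];
    if (preg_match_all($pattern, $text, $matches, PREG_SET_ORDER)) {
        foreach ($matches as $m) {
            // trim() drops a trailing "\r" left over from CRLF files
            $lines[] = trim($m[1]) . ', ' . $m[2];
        }
    }
    // drop duplicate records and reindex from 0
    return array_values(array_unique($lines));
}

$path = 'physio.txt'; // file name taken from the question
if (is_readable($path)) {
    echo implode("\n", extractContacts(file_get_contents($path))), "\n";
}
```

Single quotes around the pattern keep PHP from interpreting any backslash sequences before the regex engine sees them.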

You could split the content on double line breaks and then process each block. To get the first and last name, take the last line of the block that does not contain ": ":

$blocks = explode("\n\n", $string);
foreach ($blocks as $block) {
    $lines = explode("\n", $block);
    $mail = end($lines);
    $mail = substr($mail, strlen('Email: '));
    $lines = array_reverse($lines);
    $fnln = '';
    foreach ($lines as $line) {
        if (strpos($line, ': ') === false) {
            $fnln = $line;
            break;
        }
    }
    echo $fnln . ", " . $mail . "<br>";
}

Output:

Sally Coulthard, sally@veterinaryphysio.co.uk
Kate Haynes, katehaynesphysio@yahoo.co.uk

Or, if the email is not always the last line of a block:

$blocks = explode("\n\n", $string);
foreach ($blocks as $block) {
    $lines = explode("\n", $block);
    $lines = array_reverse($lines);
    $fnln = '';
    $mail = ''; // reset per block so a missing Email: line does not reuse the previous address
    foreach ($lines as $line) {
        if (substr($line, 0, 6) == 'Email:') {
            $mail = substr($line, 7);
        }
        if (strpos($line, ': ') === false) {
            $fnln = $line;
            break;
        }
    }
    echo $fnln . ", " . $mail . "<br>";
}
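Both loops above split on "\n\n", which assumes Unix line endings. A possible variant, sketched here under the assumption that the file may also use \r\n, splits with preg_split and \R instead. The function name contactsFromBlocks is hypothetical, and blocks that lack either a name or an email are simply skipped.

```php
<?php
/**
 * Split $text into blank-line-separated blocks and return "Name, email" strings.
 * Assumption: the name is the last line in a block that does not contain ": ".
 */
function contactsFromBlocks(string $text): array
{
    $out = [];
    // \R matches \n, \r\n and \r, so two or more in a row mark a blank-line gap
    $blocks = preg_split('/\R{2,}/', trim($text));
    foreach ($blocks as $block) {
        $name = '';
        $mail = '';
        foreach (preg_split('/\R/', $block) as $line) {
            if (strpos($line, 'Email:') === 0) {
                $mail = trim(substr($line, strlen('Email:')));
            } elseif (strpos($line, ': ') === false) {
                $name = trim($line); // a later plain line overwrites an earlier one
            }
        }
        if ($name !== '' && $mail !== '') {
            $out[] = "$name, $mail";
        }
    }
    return $out;
}
```

Iterating forward and letting later plain lines overwrite earlier ones gives the same name as the reversed loop, without needing array_reverse or break.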
