
PHP preg_match and regular expressions

I am fairly new to PHP and regular expressions. After some reading, I have got this far in trying to understand how to extract the correct information.

Sample data

2011/09/20  00:57       367,044,608 S1E04 - Cancer Man.avi
2012/03/12  03:01       366,991,496 Family Guy - S09E01 - And Then There Were Fewer.avi
2012/03/25  00:27        53,560,510 Avatar- The Legend of Korra S01E01.avi

What I would like to extract is the date, the file size and the name of the file, bearing in mind that the file name can start with basically anything and that the file size changes all the time.

What I have currently:

$dateModifyed = substr($file, 0, 10); 
$fileSize = preg_match('[0-9]*/[0-9]*/[0-9]*/s[0-9]*:[0-9]*/s*', $file, $match)
$FileName = 

Full code I am working on:

function recursivePrint($folder, $subFolders, $Jsoncounter) {
$f = fopen("file.json", "a");

echo '{ "id" : "' . $GLOBALS['Jsoncounter'] . '", "parent" : "' . "#" . '", "Text" : "' . $folder . '" },' . "\n";
$PrintString = '{ "id" : "' . $GLOBALS['Jsoncounter'] . '", "parent" : "' . "#" . '", "Text" : "' . $folder . '" },' . "\n";
fwrite($f, $PrintString);
$foldercount = $GLOBALS['Jsoncounter'];
$GLOBALS['Jsoncounter']++;
foreach($subFolders->files as $file) {


    // Use # as the delimiter so the slashes inside the date do not end the pattern early.
    preg_match('#^(\d{4}/\d{2}/\d{2}\s+\d{2}:\d{2})\s+([\d,]+)\s+(.*)$#', $file, $match);
    $dateModified = $match[1];
    $fileSize = str_replace(',', '', $match[2]);
    $fileName = $match[3];
    echo $dateModified . $fileSize . $fileName;


    echo '{ "id" : "' . $GLOBALS['Jsoncounter'] . '", "parent" : "' . $foldercount . '", "Text" : "' . $file . '" },';
    $PrintString ='{ "id" : "' . $GLOBALS['Jsoncounter'] . '", "parent" : "' . $foldercount . '", "Text" : "' . $file . '" },';
    fwrite($f, $PrintString);
    $GLOBALS['Jsoncounter']++;
}

foreach($subFolders->folders as $folder => $subSubFolders) {
    recursivePrint($folder, $subSubFolders, $Jsoncounter);
}
fclose($f); 

}

Any help extracting the correct values would be greatly appreciated.

There are several problems in your regex:

preg_match('[0-9]*/[0-9]*/[0-9]*/s[0-9]*:[0-9]*/s*', $file, $match)
            ^--missing delimiter ^            ^-- asterisk instead of plus
                                 |--literal s instead of \s

And of course you haven't used anchors or capturing groups, and the regex isn't finished yet.

Try the following:

preg_match_all(
    '%^                     # Start of line
    ([0-9]+/[0-9]+/[0-9]+)  # Date (group 1)
    \s+                     # Whitespace
    ([0-9]+:[0-9]+)         # Time (group 2)
    \s+                     # Whitespace
    ([0-9,]+)               # File size (group 3)
    \s+                     # Whitespace
    (.*)                    # Rest of the line%mx', 
    $file, $result, PREG_SET_ORDER);
for ($matchi = 0; $matchi < count($result); $matchi++) {
    for ($backrefi = 0; $backrefi < count($result[$matchi]); $backrefi++) {
        # Matched text = $result[$matchi][$backrefi];
    }
}

So, for example, if $file holds all three sample lines, $result[0][1] will contain 2011/09/20, and $result[2][4] will contain Avatar- The Legend of Korra S01E01.avi, etc.
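To make the indexing concrete, here is a minimal sketch that runs the same pattern (written on one line, without the x flag) over the three sample lines from the question. It assumes the lines are held in a single newline-separated string; the variable name $listing is made up for illustration.

```php
<?php
// Hypothetical input: the three sample lines joined with newlines.
$listing = "2011/09/20  00:57       367,044,608 S1E04 - Cancer Man.avi\n"
         . "2012/03/12  03:01       366,991,496 Family Guy - S09E01 - And Then There Were Fewer.avi\n"
         . "2012/03/25  00:27        53,560,510 Avatar- The Legend of Korra S01E01.avi";

// The m flag lets ^ match at the start of every line, not just the start of the string.
$pattern = '%^([0-9]+/[0-9]+/[0-9]+)\s+([0-9]+:[0-9]+)\s+([0-9,]+)\s+(.*)%m';

preg_match_all($pattern, $listing, $result, PREG_SET_ORDER);

foreach ($result as $row) {
    // $row[0] is the whole matched line; $row[1]..$row[4] are the capture groups.
    list(, $date, $time, $size, $name) = $row;
    echo "$date | $time | $size | $name\n";
}
```

With PREG_SET_ORDER the first index selects the matched line and the second selects the group within it, which is why $result[2][4] is the file name on the third line.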

You need to use capture groups to get the parts of the string that are matched by different parts of the regular expression. Capture groups use parentheses around portions of the regexp.

preg_match('#^(\d{4}/\d{2}/\d{2}\s+\d{2}:\d{2})\s+([\d,]+)\s+(.*)$#', $string, $match);
$dateModified = $match[1];
$fileSize = str_replace(',', '', $match[2]);
$fileName = $match[3];
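If some lines of the listing might not match (headers, blank lines), it is safer to check the return value of preg_match before reading $match. A small sketch of that idea follows; the helper name parseListingLine and the array keys are my own choices, not part of the question's code.

```php
<?php
// Hypothetical helper: returns null when the line does not look like a listing entry.
function parseListingLine($line) {
    $pattern = '#^(\d{4}/\d{2}/\d{2}\s+\d{2}:\d{2})\s+([\d,]+)\s+(.*)$#';
    // preg_match returns 1 on a match, 0 on no match and false on error.
    if (preg_match($pattern, $line, $match) !== 1) {
        return null;
    }
    return array(
        'dateModified' => $match[1],
        'fileSize'     => (int) str_replace(',', '', $match[2]), // strip thousands separators
        'fileName'     => $match[3],
    );
}

$entry = parseListingLine('2012/03/25  00:27        53,560,510 Avatar- The Legend of Korra S01E01.avi');
if ($entry !== null) {
    echo $entry['fileName'] . ' (' . $entry['fileSize'] . " bytes)\n";
}
```

Note that group 1 keeps the date and time together exactly as they appear in the line, including the spaces between them.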

Other problems in your regexp:

  • You left out the delimiters at the beginning and end.
  • You used /s instead of \s for whitespace characters.
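The effect of these two points can be seen with a quick check. This is only a sketch: $broken is the original attempt with delimiters added but /s left in place, and $fixed is the same idea with \s and with + where at least one character is required. Both variable names are illustrative.

```php
<?php
$line = '2011/09/20  00:57       367,044,608 S1E04 - Cancer Man.avi';

// With /s the pattern demands a third literal slash followed by the letter s.
$broken = '#[0-9]*/[0-9]*/[0-9]*/s[0-9]*:[0-9]*/s*#';
// With \s the pattern matches whitespace between the date and the time.
$fixed  = '#[0-9]+/[0-9]+/[0-9]+\s+[0-9]+:[0-9]+\s*#';

var_dump(preg_match($broken, $line));         // int(0)
var_dump(preg_match($fixed, $line, $match));  // int(1)
echo rtrim($match[0]) . "\n";                 // the date and time part of the line
```

This still lacks anchors and capture groups, so it only shows that the corrected pattern finds the date and time; extracting the individual fields needs the parentheses described above.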

There's a tutorial on regular expressions at www.regular-expressions.info.
