
How would I compare two text files for matches with PHP

$domains = file('../../domains.txt');
$keywords = file('../../keywords.txt');

$domains will be in the format:

3kool4u.com,9/29/2013 12:00:00 AM,AUC
3liftdr.com,9/29/2013 12:00:00 AM,AUC
3lionmedia.com,9/29/2013 12:00:00 AM,AUC
3mdprod.com,9/29/2013 12:00:00 AM,AUC
3mdproductions.com,9/29/2013 12:00:00 AM,AUC

$keywords will be in the format:

keyword1
keyword2
keyword3

I guess what I really want is to build an array of keywords from one file and search each line of domains.txt for matches. I'm not sure where to start, because I'm confused about the difference between preg_match, preg_match_all, and strpos, and about when to use one over the other.

Thanks in advance for the help.

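//Empty array to hold each line of the domains file that has a match
$matches = array();

//For each line of the domains file
foreach ($domains as $domain) {

    //For each keyword
    foreach ($keywords as $keyword) {

        //file() keeps the trailing newline on every line, so trim it off first
        $keyword = trim($keyword);
        if ($keyword === '') {
            continue;
        }

        //If the domain line contains the keyword at any position, regardless of case
        //(preg_quote() escapes characters that have a special meaning in a regex)
        if (preg_match('/' . preg_quote($keyword, '/') . '/i', $domain)) {
            //Add the domain line to the matches array, once per line
            $matches[] = $domain;
            break;
        }
    }
}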

$matches now holds every line of the domains file that contains at least one of the keywords.
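On the question of when to use preg_match, preg_match_all, or strpos, here is a rough sketch of how they differ. The sample line is taken from the domains.txt format in the question; the search terms 'PROD' and '/\d+/' are only illustrative assumptions.

```php
<?php
// Sample line in the domains.txt format shown in the question.
$line = '3mdproductions.com,9/29/2013 12:00:00 AM,AUC';

// stripos(): case-insensitive search for a literal substring (strpos() is the
// case-sensitive version). It returns the offset of the first hit, which can
// be 0, or false when nothing is found, so always compare with !== false.
$pos = stripos($line, 'PROD');
$containsProd = ($pos !== false);

// preg_match(): regular-expression search that stops at the first match.
// It returns 1 on a match, 0 on no match, and false on error.
$found = preg_match('/\d+/', $line, $first);

// preg_match_all(): same pattern, but it collects every match and
// returns the number of matches found.
$count = preg_match_all('/\d+/', $line, $all);

var_dump($containsProd, $first[0], $count);
```

Since the keywords here are plain text rather than patterns, stripos() is probably enough for this task; the preg_* functions are worth it once you need real patterns such as anchors or character classes.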

Note that the previous approach loads both files into memory in full. Depending on the file sizes, you can run out of memory, or the OS will start using swap, which is much slower than RAM.

Here is another, more memory-efficient approach that reads only one line of each file at a time.

<?php

// Allow automatic detection of line endings
// (this setting is deprecated as of PHP 8.1 and only matters for old "\r" line endings)
ini_set('auto_detect_line_endings', true);

//Array that will hold the lines that match
$matches = array();

//Open the two files in read mode
$domains_handle = fopen('../../domains.txt', 'r');
$keywords_handle = fopen('../../keywords.txt', 'r');

if ($domains_handle === false || $keywords_handle === false) {
    die('Could not open one of the input files');
}

//Iterate over the domains file one line at a time
while (($domains_line = fgets($domains_handle)) !== false) {

    //For each line of the domains file, iterate over the keywords file one line at a time
    while (($keywords_line = fgets($keywords_handle)) !== false) {

        //Remove any whitespace or newline from the beginning and end of the string
        $trimmed_keyword = trim($keywords_line);

        //Skip blank lines: an empty pattern would match every domain
        if ($trimmed_keyword === '') {
            continue;
        }

        //Check whether the domain line contains the keyword at any position,
        //using a case-insensitive comparison; preg_quote() escapes regex metacharacters
        if (preg_match('/' . preg_quote($trimmed_keyword, '/') . '/i', trim($domains_line))) {
            //Add the domain line to the matches array, once per line
            $matches[] = $domains_line;
            break;
        }
    }
    //Set the pointer back to the beginning of the keywords file
    rewind($keywords_handle);
}

//Release the resources
fclose($domains_handle);
fclose($keywords_handle);

var_dump($matches);
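A possible middle ground, assuming the keywords file is small compared to the domains file: keep the keywords in memory and stream only the domains file, so the keywords file is not re-read for every domain line. This is just a sketch; the function name find_matching_lines is made up for illustration, and it uses stripos() instead of a regex because the keywords are treated as literal text.

```php
<?php
// Hypothetical helper (not part of the code above): keeps the keyword list
// in memory and reads the domains file one line at a time.
function find_matching_lines(string $domainsPath, string $keywordsPath): array
{
    // Assumption: the keyword list is small enough to hold in memory.
    $keywords = file($keywordsPath, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
    if ($keywords === false) {
        throw new RuntimeException("Cannot read $keywordsPath");
    }
    // Trim stray whitespace and drop keywords that end up empty.
    $keywords = array_filter(array_map('trim', $keywords), fn($k) => $k !== '');

    $handle = fopen($domainsPath, 'r');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $domainsPath");
    }

    $matches = [];
    while (($line = fgets($handle)) !== false) {
        $line = rtrim($line, "\r\n");
        foreach ($keywords as $keyword) {
            // Literal, case-insensitive substring test.
            if (stripos($line, $keyword) !== false) {
                $matches[] = $line;
                break; // record each line once, even if several keywords match
            }
        }
    }
    fclose($handle);

    return $matches;
}
```

With the paths from the question, it could be called as find_matching_lines('../../domains.txt', '../../keywords.txt'). Like the code above, it searches the whole line, including the date and the AUC column, not just the domain name.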
