
Show a repeated value only once when many lines in a file share it, using PHP

I want to read data from a text file and group the rows that share the same value. My code is below:

$PMTA_DATE = date("Y-m-d");
$PMTA_FILE = file_get_contents("../stats_domain_emetteur_recepteur.affinitead.net.".$PMTA_DATE.".txt");
$lineFromText = explode("\n", $PMTA_FILE);
$result = array();
$cate = "";
$total = "";
$fail = "";
$mailSuc = "";
$title = "";
foreach ($lineFromText as $line) {
    $words = explode(";", $line);
    echo $words[5];
    echo "<br>";
    if ($title == "") {
        $title = $words[0];
    }

    $cate .= ','."'$words[6]'";
    $total .= ','.$words[7];
    $fail .= ','.$words[8];
    $mailSuc .= ','.((int)$words[7] - (int)$words[8]);
}

The file contains:

2012-12-19-0830;affinitead.net;1409462;231830;16.44;boulevard-des-ventes.com;hotmail.fr;150116;90753;60.45
2012-12-19-0830;affinitead.net;1409462;231830;16.44;boulevard-des-ventes.com;hotmail.com;108478;65766;60.62
2012-12-19-0830;affinitead.net;1409462;231830;16.44;boulevard-des-ventes.com;free.fr;81431;97;.11
2012-12-19-0830;affinitead.net;1409462;231830;16.44;boulevard-des-ventes.com;wanadoo.fr;77786;15;.01
2012-12-19-0830;affinitead.net;1409462;231830;16.44;boulevard-des-ventes.com;gmail.com;77325;1;0
2012-12-19-0830;affinitead.net;1409462;231830;16.44;boulevard-des-ventes.com;orange.fr;44768;13;.02
2012-12-19-0830;affinitead.net;1409462;231830;16.44;boulevard-des-ventes.com;laposte.net;33844;16;.04
2012-12-19-0830;affinitead.net;1409462;231830;16.44;boulevard-des-ventes.com;neuf.fr;29918;26;.08
2012-12-19-0830;affinitead.net;1409462;231830;16.44;boulevard-des-ventes.com;yahoo.fr;23232;1;0
2012-12-19-0830;affinitead.net;1409462;231830;16.44;friendcorp.fr;yahoo.fr;21073;2;0
2012-12-19-0830;affinitead.net;1409462;231830;16.44;boulevard-des-ventes.com;voila.fr;19692;3;.01
2012-12-19-0830;affinitead.net;1409462;231830;16.44;messengear.fr;free.fr;18234;5;.02
2012-12-19-0830;affinitead.net;1409462;231830;16.44;friendcorp.fr;free.fr;17658;12;.06
2012-12-19-0830;affinitead.net;1409462;231830;16.44;lebuzzdesbonsplans.com;yahoo.fr;15856;103;.64
2012-12-19-0830;affinitead.net;1409462;231830;16.44;cwlunit.com;laposte.net;13463;1;0
2012-12-19-0830;affinitead.net;1409462;231830;16.44;boulevard-des-ventes.com;msn.com;12044;7222;59.96
2012-12-19-0830;affinitead.net;1409462;231830;16.44;boulevard-des-ventes.com;live.fr;11491;6983;60.76
2012-12-19-0830;affinitead.net;1409462;231830;16.44;boulevard-des-ventes.com;aliceadsl.fr;11145;17;.15
2012-12-19-0830;affinitead.net;1409462;231830;16.44;cwlunit.com;sfr.fr;11135;1;0
2012-12-19-0830;affinitead.net;1409462;231830;16.44;tendancity.com;yahoo.fr;10631;0;0
2012-12-19-0830;affinitead.net;1409462;231830;16.44;boulevard-des-ventes.com;sfr.fr;9878;1;.01
2012-12-19-0830;affinitead.net;1409462;231830;16.44;boulevard-des-ventes.com;club-internet.fr;9868;4;.04
2012-12-19-0830;affinitead.net;1409462;231830;16.44;friendcorp.fr;wanadoo.fr;9533;0;0
2012-12-19-0830;affinitead.net;1409462;231830;16.44;boulevard-des-ventes.com;aol.com;9253;7729;83.52
2012-12-19-0830;affinitead.net;1409462;231830;16.44;lebuzzdesbonsplans.com;hotmail.com;8656;252;2.91
2012-12-19-0830;affinitead.net;1409462;231830;16.44;messengear.fr;laposte.net;8616;1;.01

As you can see, some values are repeated: boulevard-des-ventes.com, for example, appears many times. I don't want that. When a value appears many times, I want to show it only once.

This is the output I need:

boulevard-des-ventes.com        hotmail.fr  150116
                                hotmail.com 108478
                                free.fr      81431
                                ..................
                                ..................
                                ..................
friendcorp.fr                   yahoo.fr    21073
                                free.fr      17658
cwlunit.com                     laposte.net  13463
                                sfr.fr       11135
                                ..................
                                ..................
                                ..................
..................................................
..................................................
..................................................

Use an associative array to remember which domains you have already processed, and skip to the next line when you hit a duplicate. This keeps only the first line for each domain.

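$domains_seen = array();
foreach ($lineFromText as $line) {
  $words = explode(";", $line);
  if (count($words) < 6) {
    continue; // skip blank or malformed lines (e.g. after a trailing newline)
  }
  $domain = $words[5];
  // Skip lines whose domain has already been handled
  if (array_key_exists($domain, $domains_seen)) {
    continue;
  }
  $domains_seen[$domain] = true;
  ...
}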
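
The loop above drops every later line for a domain it has already seen. The desired output in the question instead lists all rows under each domain and prints the domain name only once, so a grouping variant may be closer to what was asked. The following is a minimal sketch, not a definitive implementation. It assumes column 5 is the domain to group on, column 6 is the second domain, and column 7 is the count shown in the desired output. The helper names `groupByDomain` and `renderGroups` and the column widths are hypothetical.

```php
<?php
// Hypothetical helper: collect rows under the domain found in column 5,
// keeping domains and rows in the order they first appear.
function groupByDomain(array $lines)
{
    $groups = array();
    foreach ($lines as $line) {
        $words = explode(";", trim($line));
        if (count($words) < 8) {
            continue; // skip blank or malformed lines
        }
        // Assumed layout: [5] = grouping domain, [6] = second domain, [7] = count.
        $groups[$words[5]][] = array($words[6], (int) $words[7]);
    }
    return $groups;
}

// Hypothetical helper: print the grouping domain only on the first row of its group.
function renderGroups(array $groups)
{
    $out = "";
    foreach ($groups as $domain => $rows) {
        foreach ($rows as $i => $row) {
            $label = ($i === 0) ? $domain : "";
            // Column widths (32 and 20) are arbitrary choices for this sketch.
            $out .= sprintf("%-32s%-20s%d\n", $label, $row[0], $row[1]);
        }
    }
    return $out;
}

// Three rows copied from the sample data in the question.
$sample = array(
    "2012-12-19-0830;affinitead.net;1409462;231830;16.44;boulevard-des-ventes.com;hotmail.fr;150116;90753;60.45",
    "2012-12-19-0830;affinitead.net;1409462;231830;16.44;friendcorp.fr;yahoo.fr;21073;2;0",
    "2012-12-19-0830;affinitead.net;1409462;231830;16.44;boulevard-des-ventes.com;free.fr;81431;97;.11",
);
echo renderGroups(groupByDomain($sample));
```

With these three rows, both boulevard-des-ventes.com rows are printed together (the second with a blank label), followed by the friendcorp.fr row. In the question's code, `$lineFromText` could be passed to `groupByDomain` in place of `$sample`.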
