
PHP Get X Amount of Lines from File until Done

I have a text file with 2000 lines and need to get 100 at a time in a continuous loop. I can get it to work, but I have to manually change the $i value for each chunk of 100. Here is the code:

    $file = file('postcode_2000.txt');
    for ($i = 0; $i <= 99; $i++) {

        $str = str_replace(PHP_EOL, '', $file[$i]);
        $list[] = $str;

    }

    $list_json = json_encode($list);

How can I iterate over this, getting the next 100 lines each time?

Once you have the file in an array you can use array_chunk(). First read the file into an array:

$lines = file('postcode_2000.txt');

Then split the array into chunks of 100 lines:

$chunks = array_chunk($lines, 100, true); // true preserves the original line indexes as keys

The output will be something like this:

Array
(
    [0] => Array
        (
            [0] => line in file
            [1] => line in file
            [2] => line in file
            [3] => line in file
            [4] => line in file
            ...
            [94] => line in file
            [95] => line in file
            [96] => line in file
            [97] => line in file
            [98] => line in file
            [99] => line in file
        )

    [1] => Array
        (
            [100] => line in file
            [101] => line in file
            [102] => line in file
            [103] => line in file
            [104] => line in file
            ...
            [194] => line in file
            [195] => line in file
            [196] => line in file
            [197] => line in file
            [198] => line in file
            [199] => line in file
        )

    [2] => Array
        (
            [200] => line in file
            [201] => line in file
            [202] => line in file
            [203] => line in file
            [204] => line in file
            ...
            [294] => line in file
            [295] => line in file
            [296] => line in file
            [297] => line in file
            [298] => line in file
            [299] => line in file
        )
    ...    
    [18] => Array
        (
            [1800] => line in file
            [1801] => line in file
            [1802] => line in file
            [1803] => line in file
            [1804] => line in file
            ...
            [1894] => line in file
            [1895] => line in file
            [1896] => line in file
            [1897] => line in file
            [1898] => line in file
            [1899] => line in file
        )

    [19] => Array
        (
            [1900] => line in file
            [1901] => line in file
            [1902] => line in file
            [1903] => line in file
            [1904] => line in file
            ...
            [1994] => line in file
            [1995] => line in file
            [1996] => line in file
            [1997] => line in file
            [1998] => line in file
            [1999] => line in file
        )

)
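
As a rough sketch of how the chunks might then be consumed, the loop below encodes each chunk as JSON, which is what the question does with its list. The temporary file and its 250 numbered lines are made up for the demonstration, and FILE_IGNORE_NEW_LINES is used here so the line endings do not have to be stripped by hand. The preserve-keys flag is left off on purpose: with keys such as 100..199, json_encode() would emit a JSON object instead of a JSON array.

```php
<?php
// Demo data: a temporary file with 250 numbered lines (illustrative only).
$path = tempnam(sys_get_temp_dir(), 'pc');
file_put_contents($path, implode("\n", range(1, 250)) . "\n");

// FILE_IGNORE_NEW_LINES strips the line endings, so no str_replace() is needed.
$lines = file($path, FILE_IGNORE_NEW_LINES);
unlink($path);

$encoded = [];
foreach (array_chunk($lines, 100) as $chunk) {
    // Each $chunk holds up to 100 lines; the last one holds the remainder.
    $encoded[] = json_encode($chunk);
}

echo count($encoded), " chunks", PHP_EOL;
```

If the original line indexes are needed as keys, pass true as the third argument and expect JSON objects for every chunk after the first.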

If you really want to use loops instead of array_chunk :)

<?php

$file = file('/tmp/postcodes_2000.txt');

$list = array();
$line_count = count($file);
$chunk_size = 100;
$current_line = 0;
echo "Line count: {$line_count}" . PHP_EOL;
while ($current_line < $line_count) {
  $i = 0;
  $chunk = array();
  // Also stop at the end of the file so a short final chunk does not read past the array.
  while ($i < $chunk_size && $current_line < $line_count) {
    // Strip the trailing line ending, whatever the platform.
    $str = rtrim($file[$current_line], "\r\n");
    $chunk[] = $str;
    echo "> Line {$current_line} : " . $str . PHP_EOL;
    $i++;
    $current_line++;
  }
  $list[] = $chunk;
}

$list_json = json_encode($list);
echo $list_json . PHP_EOL;

Generators are a good fit when you are working through a large set of results and do not want to allocate memory for all of them at once, or when you do not know whether you will need every result. Because results are produced one at a time, memory only has to be held for the current result, which keeps the footprint small.

So this is a way to achieve your goal that consumes less memory:

We open the file, read a fixed number of lines, and then yield the result while keeping the state for the next iteration. This lets you handle the chunks one at a time with less memory consumption.

Proceeding this way avoids going over all the data twice to build the chunks (once to read the file into a big array containing all the lines, and once more to loop through that array and build the chunks) before you finally loop, a third time, through the array of chunks.

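function file_get_x_lines($file, $amount = 100) {
    // Yield nothing if the path is not a readable regular file.
    if (!is_file($file) || !is_readable($file)) {
        return;
    }
    if (!is_int($amount) || $amount < 1) {
        $amount = 1;
    }
    $handle = fopen($file, 'rb');
    if ($handle === false) {
        return;
    }
    $chunk = array();
    // Compare against false so a final line containing only "0" is not treated as EOF.
    while (($line = fgets($handle)) !== false) {
        $chunk[] = rtrim($line, "\r\n");
        if (count($chunk) === $amount) {
            yield $chunk;
            $chunk = array();
        }
    }
    fclose($handle);
    // Yield the final, shorter chunk only if some lines are left over.
    if ($chunk !== array()) {
        yield $chunk;
    }
}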

Then you can use it like any other generator:

foreach(file_get_x_lines(__FILE__) as $list)    {
    $list=json_encode($list);
    //do stuff
}
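
To illustrate the lazy behaviour described earlier, here is a small self-contained sketch. The helper read_chunks(), the $stats counter object and the 250-line temporary file are all hypothetical names and data invented for this demonstration; they are not part of the answer's function. The point is only that when the caller stops after the first chunk, no further lines are pulled from the file.

```php
<?php
// Hypothetical helper (not from the answer): yields chunks of $size lines and
// records in $stats->linesRead how many lines fgets() has returned so far.
function read_chunks(string $path, int $size, stdClass $stats): Generator
{
    $handle = fopen($path, 'rb');
    if ($handle === false) {
        return;
    }
    try {
        $chunk = [];
        while (($line = fgets($handle)) !== false) {
            $stats->linesRead++;
            $chunk[] = rtrim($line, "\r\n");
            if (count($chunk) === $size) {
                yield $chunk;
                $chunk = [];
            }
        }
        if ($chunk !== []) {
            yield $chunk; // final, shorter chunk
        }
    } finally {
        fclose($handle); // also runs if the caller stops iterating early
    }
}

// Demo data: a temporary file with 250 numbered lines (illustrative only).
$path = tempnam(sys_get_temp_dir(), 'pc');
file_put_contents($path, implode("\n", range(1, 250)) . "\n");

$stats = new stdClass();
$stats->linesRead = 0;

$first = null;
foreach (read_chunks($path, 100, $stats) as $chunk) {
    $first = $chunk;
    break; // stop after the first chunk
}
unlink($path);

echo "Lines read: {$stats->linesRead}", PHP_EOL;
```

Wrapping the loop in try/finally is one possible design choice: it closes the handle even when the consumer breaks out of the foreach before the file is exhausted.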

If you want the keys to match the line numbers, you can slightly alter the function this way:

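function file_get_x_lines($file, $amount = 100) {
    // Yield nothing if the path is not a readable regular file.
    if (!is_file($file) || !is_readable($file)) {
        return;
    }
    if (!is_int($amount) || $amount < 1) {
        $amount = 1;
    }
    $handle = fopen($file, 'rb');
    if ($handle === false) {
        return;
    }
    $chunk = array();
    $j = 1; // line number, starting at 1, used as the array key
    // Compare against false so a final line containing only "0" is not treated as EOF.
    while (($line = fgets($handle)) !== false) {
        $chunk[$j] = rtrim($line, "\r\n");
        $j++;
        if (count($chunk) === $amount) {
            yield $chunk;
            $chunk = array();
        }
    }
    fclose($handle);
    // Yield the final, shorter chunk only if some lines are left over.
    if ($chunk !== array()) {
        yield $chunk;
    }
}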
