
Reading .xls file via PHPExcel throws Fatal error: allowed memory size… even with chunk reader

I'm reading an .xls file using PHPExcel. After a very short time I run into this error:

Fatal error: Allowed memory size of 1073741824 bytes exhausted (tried to allocate 730624 bytes) in Excel\PHPExcel\Shared\OLERead.php on line 93

After some googling, I tried a chunk reader to prevent this (it is even mentioned on the PHPExcel home page), but I am still stuck with this error.

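My thinking is that with the chunk reader I read the file piece by piece, so memory should not overflow. But is there some serious memory leak? Or am I freeing memory incorrectly? I even tried raising the server RAM to 1 GB. The file I am trying to read is about 700k, which is not that much (I also read ~20 MB pdf, xlsx, docx, doc, etc. files without problems). So I assume I am overlooking some small gremlin.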
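To tell a real leak apart from a single large allocation, it can help to measure how much memory each unit of work leaves behind. The following is only a minimal sketch: `measureMemory()` is a hypothetical helper, and the closure is a stand-in workload where one `load()` call per chunk would go.

```php
<?php
// Hypothetical helper: run one unit of work and report how much memory it
// left allocated afterwards, the peak seen so far, and the configured limit.
function measureMemory(callable $work)
{
    $before = memory_get_usage(true);
    $work();
    gc_collect_cycles(); // collect circular references before measuring again
    $after = memory_get_usage(true);

    return array(
        'retained_bytes' => $after - $before,
        'peak_bytes'     => memory_get_peak_usage(true),
        'limit'          => ini_get('memory_limit'),
    );
}

// Stand-in workload: allocate and release a 5 MB string.
$report = measureMemory(function () {
    $buffer = str_repeat('x', 5 * 1024 * 1024);
    unset($buffer);
});

printf(
    "retained: %d bytes, peak: %d bytes, limit: %s\n",
    $report['retained_bytes'],
    $report['peak_bytes'],
    $report['limit']
);
```

If the retained figure grows from one iteration to the next, memory is not being released between chunks. If only the peak is high, a single step needs that much memory at once, and reading in chunks will not help.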

The code looks like this:

function parseXLS($fileName){
    // dirname() returns the directory without a trailing slash, so the suffix must start with '/'
    require_once dirname(__FILE__) . '/sphider_design/include/Excel/PHPExcel/IOFactory.php';
    require_once dirname(__FILE__) . '/sphider_design/include/Excel/PHPExcel/ChunkReadFilter.php';

    $inputFileType = 'Excel5';

    /**  Create a new Reader of the type defined in $inputFileType  **/
    $objReader = PHPExcel_IOFactory::createReader($inputFileType);
    /**  Define how many rows we want to read for each "chunk"  **/ 
    $chunkSize = 20;
    /**  Create a new Instance of our Read Filter  **/ 
    $chunkFilter = new chunkReadFilter(); 
    /**  Tell the Reader that we want to use the Read Filter that we've Instantiated  **/ 
    $objReader->setReadFilter($chunkFilter); 

    /**  Loop to read our worksheet in "chunk size" blocks  **/ 
    /**  $startRow is set to 2 initially because we always read the headings in row #1  **/
    for ($startRow = 2; $startRow <= 65536; $startRow += $chunkSize) { 
        /**  Tell the Read Filter, the limits on which rows we want to read this iteration  **/ 
        $chunkFilter->setRows($startRow,$chunkSize); 
        /**  Load only the rows that match our filter from $inputFileName to a PHPExcel Object  **/ 
        $objPHPExcel = $objReader->load($fileName); 
        //    Do some processing here 

        //    Free up some of the memory 
        $objPHPExcel->disconnectWorksheets(); 
        unset($objPHPExcel); 
    }
}

And here is the code for the chunk reader:

class chunkReadFilter implements PHPExcel_Reader_IReadFilter
{
    private $_startRow = 0;
    private $_endRow = 0;

    /**  Set the list of rows that we want to read  */ 
    public function setRows($startRow, $chunkSize) { 
        $this->_startRow    = $startRow; 
        $this->_endRow      = $startRow + $chunkSize;
    } 

    public function readCell($column, $row, $worksheetName = '') {
        //  Only read the heading row, and the rows that are configured in $this->_startRow and $this->_endRow 
        if (($row == 1) || ($row >= $this->_startRow && $row < $this->_endRow)) { 
           return true;
        }
        return false;
    } 
}
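To see exactly which rows this filter lets through on each iteration, the row test can be tried in isolation. This is only a sketch: `rowIsRead()` is a hypothetical standalone copy of the condition in `readCell()`, written as a plain function so it runs without PHPExcel.

```php
<?php
// Standalone copy of the row test used by readCell(); rowIsRead() is a
// hypothetical name introduced for this illustration.
function rowIsRead($row, $startRow, $chunkSize)
{
    $endRow = $startRow + $chunkSize; // exclusive upper bound
    return ($row == 1) || ($row >= $startRow && $row < $endRow);
}

// List the rows accepted for the first two chunks with a chunk size of 20.
foreach (array(2, 22) as $startRow) {
    $accepted = array();
    for ($row = 1; $row <= 45; $row++) {
        if (rowIsRead($row, $startRow, 20)) {
            $accepted[] = $row;
        }
    }
    echo "start $startRow: rows ", min($accepted), ", ", $accepted[1], "..", max($accepted), "\n";
}
// start 2: rows 1, 2..21
// start 22: rows 1, 22..41
```

Because the end row is exclusive, consecutive chunks do not overlap, and the heading row is accepted in every chunk.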

So I found an interesting solution in "How to read large worksheets from large Excel files (27MB+) with PHPExcel?", given as Addendum 3 in that question.

Edit 1: Even with that solution I still choked on my favourite error message, but I found something about caching, so I implemented this:

$cacheMethod = PHPExcel_CachedObjectStorageFactory::cache_to_phpTemp;
$cacheSettings = array('memoryCacheSize' => '8MB'); // key must match exactly, without surrounding spaces
PHPExcel_Settings::setCacheStorageMethod($cacheMethod, $cacheSettings);

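So far I have tested it only on xls files smaller than 10 MB, but it seems to work (I also set $objReader->setReadDataOnly(true);), and it seems balanced enough between speed and memory consumption. (I will follow my thorny path further, if possible.)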
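Two details of this kind of cache setup can be checked without PHPExcel. The sketch below rests on two assumptions: that the cache backend looks up the exact key `memoryCacheSize`, and that `cache_to_phpTemp` is backed by PHP's `php://temp` stream. `resolveMemoryCacheSize()` and its `1MB` fallback are hypothetical and only illustrate the lookup.

```php
<?php
// Hypothetical lookup that mirrors how a consumer would read the setting by
// its exact key; the 1MB fallback is an assumption made for this sketch.
function resolveMemoryCacheSize(array $settings)
{
    return isset($settings['memoryCacheSize']) ? $settings['memoryCacheSize'] : '1MB';
}

// Array keys are compared exactly, so a key padded with spaces is not found.
var_dump(resolveMemoryCacheSize(array(' memoryCacheSize ' => '8MB'))); // string(3) "1MB"
var_dump(resolveMemoryCacheSize(array('memoryCacheSize' => '8MB')));   // string(3) "8MB"

// php://temp keeps data in memory up to maxmemory bytes, then spills to a
// temporary file, which is why it suits a bounded cache.
$stream = fopen('php://temp/maxmemory:' . (8 * 1024 * 1024), 'w+');
fwrite($stream, serialize(array('A1' => 'heading', 'A2' => 42)));
rewind($stream);
$restored = unserialize(stream_get_contents($stream));
fclose($stream);

var_dump($restored['A2']); // int(42)
```

The first part shows why the spelling of the settings key matters: a key with surrounding spaces is silently ignored rather than rejected.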

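Edit 2: I did some further research and found that, for my purposes, the chunk reader is not needed. (For me the memory problem is the same with the chunk reader as without it.) So my final answer to my own question is the function below, which reads an .xls file (only the data from cells, no formatting, and with formulas filtered out). With cache_to_phpTemp I am able to read xls files (tested up to 10 MB) with about 10k rows and many columns in a few seconds and without memory problems.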

function parseXLS($fileName){

/** PHPExcel_IOFactory */
    // dirname() returns the directory without a trailing slash, so the suffix must start with '/'
    require_once dirname(__FILE__) . '/sphider_design/include/Excel/PHPExcel/IOFactory.php';
    require_once dirname(__FILE__) . '/sphider_design/include/Excel/PHPExcel/ChunkReadFilter.php';
    require_once dirname(__FILE__) . '/sphider_design/include/Excel/PHPExcel.php';

    $inputFileName = $fileName;
    $fileContent = "";

    //get inputFileType (most of time Excel5)
    $inputFileType = PHPExcel_IOFactory::identify($inputFileName);

    //initialize cache, so the phpExcel will not throw memory overflow
    $cacheMethod = PHPExcel_CachedObjectStorageFactory::cache_to_phpTemp;
    $cacheSettings = array('memoryCacheSize' => '8MB'); // key must match exactly, without surrounding spaces
    PHPExcel_Settings::setCacheStorageMethod($cacheMethod, $cacheSettings);

    //initialize object reader by file type
    $objReader = PHPExcel_IOFactory::createReader($inputFileType);

    //read only data (without formating) for memory and time performance
    $objReader->setReadDataOnly(true);

    //load file into PHPExcel object
    $objPHPExcel = $objReader->load($inputFileName);

    //get worksheetIterator, so we can loop sheets in workbook
    $worksheetIterator = $objPHPExcel->getWorksheetIterator();

    //loop all sheets
    foreach ($worksheetIterator as $worksheet) {    

            //use worksheet rowIterator, to get content of each row
            foreach ($worksheet->getRowIterator() as $row) {
                //use cell iterator, to get content of each cell in row
                $cellIterator = $row->getCellIterator();
                //also visit cells that were never populated, not only existing ones
                $cellIterator->setIterateOnlyExistingCells(false);      

                //iterate each cell
                foreach ($cellIterator as $cell) {
                    //check if cell exists
                    if (!is_null($cell)) {
                        //get raw value (without formating, and all unnecessary trash)
                        $rawValue = $cell->getValue();
                        //if cell isnt empty, print its value
                        if ((trim($rawValue) <> "") and (substr(trim($rawValue),0,1) <> "=")){
                            $fileContent .= $rawValue . " ";                                            
                        }
                    }
                }       
            }       
    }

    return $fileContent;
}
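The condition in the innermost loop decides what ends up in the returned text. As a rough sketch, it can be exercised on its own: `appendCellText()` is a hypothetical helper that mirrors that condition, and the sample values stand in for raw cell values.

```php
<?php
// Hypothetical helper mirroring the condition in the inner loop: keep a raw
// cell value only if it is non-empty and does not look like a formula.
function appendCellText($fileContent, $rawValue)
{
    $text = trim((string) $rawValue);
    if ($text !== '' && substr($text, 0, 1) !== '=') {
        $fileContent .= $rawValue . ' ';
    }
    return $fileContent;
}

$content = '';
foreach (array('Name', null, '  ', '=SUM(A1:A3)', 42) as $value) {
    $content = appendCellText($content, $value);
}
var_dump($content); // string(8) "Name 42 "
```

Empty cells, whitespace-only cells and anything starting with "=" are skipped, so only plain values reach the output.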

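Here is what I did, based on your example. I found that a few PHP engine settings have to be set for the function to succeed. Take a look. I removed the parts that insert into my database, but the main idea is here.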

$upload_dir = dirname(__DIR__) . "/uploads/";
$inputFileName = $upload_dir . basename($_FILES["fileToUpload"]["name"]);
$insertOk = FALSE;

// get inputFileType (most of time Excel5)
$inputFileType = PHPExcel_IOFactory::identify($inputFileName);

// initialize cache, so the phpExcel will not throw memory overflow
ini_set('memory_limit', '-1');
ini_set('max_execution_time', 180); // 180 seconds of execution time maximum
$cacheMethod = PHPExcel_CachedObjectStorageFactory::cache_to_phpTemp;
$cacheSettings = array('memoryCacheSize' => '8MB'); // key must match exactly, without surrounding spaces
PHPExcel_Settings::setCacheStorageMethod($cacheMethod, $cacheSettings);

// initialize object reader by file type
$objReader = PHPExcel_IOFactory::createReader($inputFileType);

// read only data (without formating) for memory and time performance
$objReader->setReadDataOnly(true);

// the workbook is loaded chunk by chunk in the loop below; an unfiltered
// load here would pull the whole file into memory first

$spreadsheetInfo = $objReader->listWorksheetInfo($inputFileName);
$maxRowsAllowed = $spreadsheetInfo[0]['totalRows']; 

// Define how many rows we want to read for each "chunk"
$chunkSize = 200;

// Create a new Instance of our Read Filter
$chunkFilter = new ReportChunkReadFilter();

//  Tell the Reader that we want to use the Read Filter that we've
//  Instantiated
$objReader->setReadFilter($chunkFilter);

// Loop to read our worksheet in "chunk size" blocks
for ($startRow = 0; $startRow <= $maxRowsAllowed; $startRow += $chunkSize) {
    // Tell the Read Filter, the limits on which rows we want to 
    // read this iteration
    $chunkFilter->setRows($startRow,$chunkSize);

    // Load only the rows that match our filter from $inputFileName
    // to a PHPExcel Object
    $objPHPExcel = $objReader->load($inputFileName);
    $sheetData = $objPHPExcel->getActiveSheet()->toArray(null,true,true,true);

    // loop on the rows of the filtered excel file (the chunk)
    foreach ($sheetData as $rowArray) {                                    
      echo $rowArray['A'];  
      // do your stuff here
    }

    // Free up some of the memory 
    $objPHPExcel->disconnectWorksheets(); 
    unset($objPHPExcel);                    
}

unlink($inputFileName); 
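Since every iteration of a chunked loop calls load() on the same file again, the number of chunks translates directly into the number of times the file is parsed. The sketch below counts them: `chunkStartRows()` is a hypothetical helper, and its default first row of 2 is an assumption that matches a sheet with a heading in row 1.

```php
<?php
// Hypothetical helper: list the start row of every chunk needed to cover
// $totalRows rows, beginning at $firstRow.
function chunkStartRows($totalRows, $chunkSize, $firstRow = 2)
{
    $starts = array();
    for ($startRow = $firstRow; $startRow <= $totalRows; $startRow += $chunkSize) {
        $starts[] = $startRow;
    }
    return $starts;
}

print_r(chunkStartRows(700, 200));         // start rows 2, 202, 402, 602
echo count(chunkStartRows(65536, 20)), "\n"; // 3277
```

Bounding the loop by the real row count, as listWorksheetInfo() allows, keeps the number of loads small. A fixed bound of 65536 with a chunk size of 20 would mean 3277 separate loads, whatever the actual size of the sheet.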
