Reading .xls file via PHPExcel throws Fatal error: allowed memory size… even with chunk reader
I'm using PHPExcel to read .xls files. Within quite a short time I hit this error:
Fatal error: Allowed memory size of 1073741824 bytes exhausted (tried to allocate 730624 bytes) in Excel\PHPExcel\Shared\OLERead.php on line 93
After some googling I tried a chunk reader to prevent this (it is even mentioned on the PHPExcel home site), but I'm still stuck with this error. My thought is that with the chunk reader I read the file part by part, so memory won't overflow. But there must be some serious memory leak? Or am I freeing memory badly? I even tried raising the server RAM to 1GB. The file I'm trying to read is about 700k, which is not that much (I also read ~20MB pdf, xlsx, docx, doc, etc. files without issue). So I assume there is just some minor gotcha I overlooked.
The code looks like this:
function parseXLS($fileName){
    // dirname() returns no trailing slash, so the relative part must start with '/'
    require_once dirname(__FILE__) . '/sphider_design/include/Excel/PHPExcel/IOFactory.php';
    require_once dirname(__FILE__) . '/sphider_design/include/Excel/PHPExcel/ChunkReadFilter.php';
    $inputFileType = 'Excel5';
    /** Create a new Reader of the type defined in $inputFileType **/
    $objReader = PHPExcel_IOFactory::createReader($inputFileType);
    /** Define how many rows we want to read for each "chunk" **/
    $chunkSize = 20;
    /** Create a new Instance of our Read Filter **/
    $chunkFilter = new chunkReadFilter();
    /** Tell the Reader that we want to use the Read Filter that we've Instantiated **/
    $objReader->setReadFilter($chunkFilter);
    /** Loop to read our worksheet in "chunk size" blocks **/
    /** $startRow is set to 2 initially because we always read the headings in row #1 **/
    for ($startRow = 2; $startRow <= 65536; $startRow += $chunkSize) {
        /** Tell the Read Filter, the limits on which rows we want to read this iteration **/
        $chunkFilter->setRows($startRow, $chunkSize);
        /** Load only the rows that match our filter from $fileName to a PHPExcel Object **/
        $objPHPExcel = $objReader->load($fileName);
        // Do some processing here
        // Free up some of the memory
        $objPHPExcel->disconnectWorksheets();
        unset($objPHPExcel);
    }
}
And here is the code for the chunk read filter:
class chunkReadFilter implements PHPExcel_Reader_IReadFilter
{
    private $_startRow = 0;
    private $_endRow = 0;

    /** Set the list of rows that we want to read */
    public function setRows($startRow, $chunkSize) {
        $this->_startRow = $startRow;
        $this->_endRow = $startRow + $chunkSize;
    }

    public function readCell($column, $row, $worksheetName = '') {
        // Only read the heading row, and the rows that are configured in $this->_startRow and $this->_endRow
        if (($row == 1) || ($row >= $this->_startRow && $row < $this->_endRow)) {
            return true;
        }
        return false;
    }
}
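To see which rows one chunk actually covers, here is a minimal sketch that copies the filter's row logic into a standalone class, so it runs without PHPExcel. The class name RowWindow is made up for this illustration; the interface is dropped only so that the library is not needed.

```php
<?php
// Stand-in for chunkReadFilter with the same row logic, minus the
// PHPExcel_Reader_IReadFilter interface. RowWindow is a hypothetical name.
class RowWindow
{
    private $startRow = 0;
    private $endRow = 0;

    public function setRows($startRow, $chunkSize)
    {
        $this->startRow = $startRow;
        $this->endRow = $startRow + $chunkSize; // exclusive upper bound
    }

    public function readCell($column, $row, $worksheetName = '')
    {
        // Row 1 (headings) is always accepted, plus rows in [startRow, endRow).
        return $row == 1 || ($row >= $this->startRow && $row < $this->endRow);
    }
}

$filter = new RowWindow();
$filter->setRows(2, 20);

// Collect every row between 1 and 30 that the filter lets through.
$accepted = array();
for ($row = 1; $row <= 30; $row++) {
    if ($filter->readCell('A', $row)) {
        $accepted[] = $row;
    }
}

echo implode(',', $accepted), "\n";
```

With setRows(2, 20) the accepted rows are the heading row plus rows 2 to 21, because the upper bound is exclusive. The next chunk then starts at row 22, so no data row is read twice.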
So I found an interesting solution in "How to read large worksheets from large Excel files (27MB+) with PHPExcel?", given as Addendum 3 in that question.
edit1: Even with this solution I hit a chokepoint with my favourite error message, but I found something about caching, so I implemented this:
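$cacheMethod = PHPExcel_CachedObjectStorageFactory::cache_to_phpTemp;
// the key must be exactly 'memoryCacheSize' (no surrounding spaces), otherwise the setting is ignored
$cacheSettings = array('memoryCacheSize' => '8MB');
PHPExcel_Settings::setCacheStorageMethod($cacheMethod, $cacheSettings);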
So far I have tested it only with xls files smaller than 10MB, but it seems to work (I also set $objReader->setReadDataOnly(true);) and it seems to strike a reasonable balance between speed and memory consumption. (I will follow my thorny path further, if possible.)
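The caching idea rests on PHP's php://temp stream, which holds data in memory up to a limit and then moves it to a temporary file. Below is a minimal sketch of that behaviour, assuming (as far as I can tell from the PHPExcel source) that cache_to_phpTemp opens such a stream with memoryCacheSize as the limit. The 1024-byte limit and the payload are illustrative only.

```php
<?php
// php://temp keeps data in memory until it exceeds maxmemory bytes,
// then transparently spills to a temporary file on disk.
$limitBytes = 1024; // illustrative limit, far smaller than a real cache size
$handle = fopen('php://temp/maxmemory:' . $limitBytes, 'w+');

// Write more than the in-memory limit; the stream keeps working regardless.
$payload = str_repeat('x', 4096);
fwrite($handle, $payload);

// Read everything back from the start of the stream.
rewind($handle);
$readBack = stream_get_contents($handle);
fclose($handle);

echo strlen($readBack), "\n";
```

The data reads back intact whether it stayed in memory or went to disk, so a larger limit trades memory for fewer disk accesses.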
edit2: So I did some further research and found the chunk reader unnecessary in my case. (It seems to me the memory issue is the same with the chunk reader and without it.) So my final answer to my question is something like the following, which reads an .xls file (only data from cells, without formatting, even filtering out formulas). When I use cache_to_phpTemp I'm able to read xls files (tested up to 10MB) with about 10k rows and multiple columns in a matter of seconds and without memory issues:
function parseXLS($fileName){
    /** PHPExcel_IOFactory */
    // dirname() returns no trailing slash, so the relative part must start with '/'
    require_once dirname(__FILE__) . '/sphider_design/include/Excel/PHPExcel/IOFactory.php';
    require_once dirname(__FILE__) . '/sphider_design/include/Excel/PHPExcel/ChunkReadFilter.php';
    require_once dirname(__FILE__) . '/sphider_design/include/Excel/PHPExcel.php';
    $inputFileName = $fileName;
    $fileContent = "";
    //get inputFileType (most of the time Excel5)
    $inputFileType = PHPExcel_IOFactory::identify($inputFileName);
    //initialize cache, so PHPExcel will not run out of memory
    $cacheMethod = PHPExcel_CachedObjectStorageFactory::cache_to_phpTemp;
    //the key must be exactly 'memoryCacheSize' (no surrounding spaces)
    $cacheSettings = array('memoryCacheSize' => '8MB');
    PHPExcel_Settings::setCacheStorageMethod($cacheMethod, $cacheSettings);
    //initialize object reader by file type
    $objReader = PHPExcel_IOFactory::createReader($inputFileType);
    //read only data (without formatting) for memory and time performance
    $objReader->setReadDataOnly(true);
    //load file into PHPExcel object
    $objPHPExcel = $objReader->load($inputFileName);
    //get worksheetIterator, so we can loop sheets in workbook
    $worksheetIterator = $objPHPExcel->getWorksheetIterator();
    //loop all sheets
    foreach ($worksheetIterator as $worksheet) {
        //use worksheet rowIterator, to get content of each row
        foreach ($worksheet->getRowIterator() as $row) {
            //use cell iterator, to get content of each cell in row
            $cellIterator = $row->getCellIterator();
            //false = visit every cell in the row, including empty ones
            $cellIterator->setIterateOnlyExistingCells(false);
            //iterate each cell
            foreach ($cellIterator as $cell) {
                //check if cell exists
                if (!is_null($cell)) {
                    //get raw value (without formatting and other unnecessary data)
                    $rawValue = $cell->getValue();
                    //if cell isn't empty and isn't a formula, append its value
                    if ((trim($rawValue) <> "") and (substr(trim($rawValue),0,1) <> "=")){
                        $fileContent .= $rawValue . " ";
                    }
                }
            }
        }
    }
    return $fileContent;
}
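To check the "without memory issues" claim on your own files, one option is to compare PHP's peak memory before and after the call. This is a minimal sketch: measurePeakMemory is a hypothetical helper, not part of PHPExcel, and the closure stands in for a call such as parseXLS($fileName).

```php
<?php
// Hypothetical helper: run a callable and report how far it pushed
// PHP's peak memory (as allocated from the system).
function measurePeakMemory($fn)
{
    $before = memory_get_peak_usage(true);
    $result = call_user_func($fn);
    $after = memory_get_peak_usage(true);

    return array('result' => $result, 'peakDeltaBytes' => $after - $before);
}

// Placeholder workload; replace the closure body with the real parsing call.
$report = measurePeakMemory(function () {
    return strlen(str_repeat('x', 4 * 1024 * 1024));
});

echo 'result: ', $report['result'], ', peak grew by ', $report['peakDeltaBytes'], " bytes\n";
```

The peak is tracked for the whole process, so the delta only shows growth beyond whatever the script had already used before the call.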
Here is what I did based on your examples. I found that some PHP engine settings need to be set to ensure the function succeeds. Take a look at this. I removed the part that inserts into my database, but the main idea is here.
$upload_dir = dirname(__DIR__) . "/uploads/";
$inputFileName = $upload_dir . basename($_FILES["fileToUpload"]["name"]);
$insertOk = FALSE;
// get inputFileType (most of the time Excel5)
$inputFileType = PHPExcel_IOFactory::identify($inputFileName);
// initialize cache, so PHPExcel will not run out of memory
ini_set('memory_limit', '-1'); // -1 removes PHP's memory limit entirely
ini_set('max_execution_time', 180); // 180 seconds of execution time maximum
$cacheMethod = PHPExcel_CachedObjectStorageFactory::cache_to_phpTemp;
// the key must be exactly 'memoryCacheSize' (no surrounding spaces)
$cacheSettings = array('memoryCacheSize' => '8MB');
PHPExcel_Settings::setCacheStorageMethod($cacheMethod, $cacheSettings);
// initialize object reader by file type
$objReader = PHPExcel_IOFactory::createReader($inputFileType);
// read only data (without formatting) for memory and time performance
$objReader->setReadDataOnly(true);
// load file into PHPExcel object (no read filter is set yet, so this reads the whole workbook)
$objPHPExcel = $objReader->load($inputFileName);
$objPHPExcel->setActiveSheetIndex(0);
$spreadsheetInfo = $objReader->listWorksheetInfo($inputFileName);
$maxRowsAllowed = $spreadsheetInfo[0]['totalRows'];
// Define how many rows we want to read for each "chunk"
$chunkSize = 200;
// Create a new Instance of our Read Filter
$chunkFilter = new ReportChunkReadFilter();
// Tell the Reader that we want to use the Read Filter that we've
// Instantiated
$objReader->setReadFilter($chunkFilter);
// Loop to read our worksheet in "chunk size" blocks
for ($startRow = 0; $startRow <= $maxRowsAllowed; $startRow += $chunkSize) {
    // Tell the Read Filter, the limits on which rows we want to
    // read this iteration
    $chunkFilter->setRows($startRow, $chunkSize);
    // Load only the rows that match our filter from $inputFileName
    // to a PHPExcel Object
    $objPHPExcel = $objReader->load($inputFileName);
    $sheetData = $objPHPExcel->getActiveSheet()->toArray(null, true, true, true);
    // loop on the rows of the filtered excel file (the chunk)
    foreach ($sheetData as $rowArray) {
        echo $rowArray['A'];
        // do your stuff here
    }
    // Free up some of the memory
    $objPHPExcel->disconnectWorksheets();
    unset($objPHPExcel);
}
unlink($inputFileName);