
How to open a .docx, .doc, or .pdf file using PHP and read the first 50 characters of the first line

I want to know how to open a file with a .docx, .doc, or .pdf extension using PHP and then read the first 50 characters of its first line.

Code:

include_once 'inc/docx.php';
include_once 'inc/PdfParser.php';

if ($imageFileType == 'pdf') {
    $pdfObj = new PdfParser();
    $resumeText = $pdfObj->parseFile($target_file);
    // $resumeText = $pdfObj->getText();
} else {
    $docObj = new DocxConversion($target_file);
    $resumeText = $docObj->convertToText();
}

$fileInfo = explode(PHP_EOL, $resumeText);
$records = [];
foreach ($fileInfo as $row) {
    // if($row == '') continue;
    // $parts = explode(',12', $row);
    $parts = preg_split('/(?<=[.?!])\s+(?=[a-z])/i', $row);
    foreach ($parts as $part) {
        if ($part == '') {
            continue;
        }
        // echo $part.'<br><br>';
        $part = strtolower($part);

How to open a .docx, .doc, or .pdf file using PHP and read the first 50 characters of the first line

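I think you should use the PHPOffice/PHPWord library, which can read the Word formats you mention (.docx, and .doc with more limited support). Note that PHPWord can write PDF but does not read it, so the .pdf case still needs a separate parser such as the PdfParser class you already include.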
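A minimal sketch of what loading a Word file with PHPWord might look like. It assumes the library was installed with Composer and that `vendor/autoload.php` has already been required. The helper names `readerNameForExtension`, `elementText`, and `loadWordText` are hypothetical and not part of PHPWord. To my knowledge PHPWord ships no PDF reader, so `.pdf` deliberately maps to `null` here. Tables and other non-text elements are ignored in this sketch.

```php
<?php
// Assumption: PHPWord is installed via Composer and vendor/autoload.php is loaded.

/** Map a file extension to a PHPWord reader name, or null if none applies. */
function readerNameForExtension(string $extension): ?string
{
    $readers = [
        'docx' => 'Word2007', // Office Open XML
        'doc'  => 'MsDoc',    // legacy binary Word format (limited support)
    ];
    return $readers[strtolower($extension)] ?? null;
}

/** Recursively collect plain text from one PHPWord element. */
function elementText($element): string
{
    if (method_exists($element, 'getElements')) {
        // Containers such as TextRun hold nested elements.
        $text = '';
        foreach ($element->getElements() as $child) {
            $text .= elementText($child);
        }
        return $text;
    }
    if (method_exists($element, 'getText')) {
        $value = $element->getText();
        return is_string($value) ? $value : '';
    }
    return ''; // tables, images, etc. are skipped in this sketch
}

/** Load a .docx/.doc file and return its text, one top-level element per line. */
function loadWordText(string $path): string
{
    $extension = pathinfo($path, PATHINFO_EXTENSION);
    $reader = readerNameForExtension($extension);
    if ($reader === null) {
        throw new InvalidArgumentException("No PHPWord reader for .$extension");
    }
    $phpWord = \PhpOffice\PhpWord\IOFactory::load($path, $reader);
    $lines = [];
    foreach ($phpWord->getSections() as $section) {
        foreach ($section->getElements() as $element) {
            $lines[] = elementText($element);
        }
    }
    return implode("\n", $lines);
}
```

With this in place, `loadWordText($target_file)` could stand in for the `DocxConversion` branch of the question's code, while the PDF branch keeps using its own parser.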

You should probably check the document type first, then call the matching function to extract the text, and finally take the first 50 characters of the first line.
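The approach just described could be sketched roughly as follows. The function names `firstLineExcerpt` and `firstLineOfDocument` and the `$extractors` map are hypothetical, introduced only for illustration. Each extractor is assumed to be a callable that takes a file path and returns the document's plain text. The sketch treats the first non-empty line as "the first line", which is an assumption about what the question wants.

```php
<?php
/** Return the first $limit characters of the first non-empty line of $text. */
function firstLineExcerpt(string $text, int $limit = 50): string
{
    // Split on \r\n, \r, or \n so the result does not depend on PHP_EOL.
    $lines = preg_split('/\r\n|\r|\n/', $text) ?: [];
    foreach ($lines as $line) {
        $line = trim($line);
        if ($line !== '') {
            // Prefer mb_substr so multi-byte characters are not cut in half.
            return function_exists('mb_substr')
                ? mb_substr($line, 0, $limit, 'UTF-8')
                : substr($line, 0, $limit);
        }
    }
    return '';
}

/**
 * Pick an extractor by file extension, then return the first-line excerpt.
 * $extractors maps a lowercase extension to a callable(string $path): string.
 */
function firstLineOfDocument(string $path, array $extractors, int $limit = 50): string
{
    $extension = strtolower(pathinfo($path, PATHINFO_EXTENSION));
    if (!isset($extractors[$extension])) {
        throw new InvalidArgumentException("Unsupported file type: .$extension");
    }
    $text = $extractors[$extension]($path);
    return firstLineExcerpt($text, $limit);
}
```

The extractor map is where the classes from the question would plug in, for example a `'pdf'` entry wrapping `(new PdfParser())->parseFile($path)` and a `'docx'` entry wrapping `(new DocxConversion($path))->convertToText()`.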

I have already used this library successfully.
