PHP Regex: parse data from complex text file

Using PHP and regex, how can I extract data from a text file, as shown in the highlighted parts? (The highlighted parts are only an example; the goal is to extract the whole file.)

[Image: excerpt of the text file with the parts to extract highlighted]

I would like to put the highlighted parts (SHORT DESCRIPTION, LEN, TYPE, Description, SAS Name, and VALUES if they exist) into a multidimensional array:

$columns = [
    [
        'Provider Category Subtype Code',
        2,
        'VARCHAR2',
        'Identifies the subtype of the provider, with..and SNFs.',
        'PRVDR_CTGRY_SBTYP_CD',
        [
            '01' => 'Short Term',
            '02' => 'Long Term',
        ],
    ],
    [
        'Provider Category Code',
        2,
        'VARCHAR2',
        'Identifies the type of provider participating in..Medicaid program.',
        'PRVDR_CTGRY_CD',
        [
            '01' => 'Hospital',
        ],
    ]
    // rest of the columns..
];

So far I have this:

// For real file content
$str = file_get_contents('https://data.cms.gov/api/views/i4jy-dtss/files/8331bd77-e02d-42a1-b4a4-b4a3ef31655d?download=true&filename=POS_OTHER_LAYOUT_SEP17.txt');

$fileArray =  explode("\n", $str);

// Prepare columns
$columns = [];
$column = [];

// sets the start of a new column
$startOfNewColumn = false;

foreach ($fileArray as $line) {
    if (preg_match('/^\s{3}\S/m', $line) && !preg_match('/^\s{3}SHORT DESCRIPTION/m', $line)) {
        $column = [];
        $startOfNewColumn = true;
    }
}

This is the regex I am using.

Since this file doesn't have a "fixed" structure or pattern, it is useless to parse it with regex.

The final solution I settled on was a series of if/else statements inside a loop over each line. It is not the most elegant approach, but it is how I solved the problem.
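For anyone wondering what such a line-by-line loop might look like, here is a minimal sketch. It is not the exact code I used. The sample input is hypothetical: the real file's layout is not reproduced in this post, so the "LABEL: value" lines, the "code = meaning" value lines, the `parseColumns` function, and the `FIELD_KEYS` map are all assumptions made for illustration. The branches still use small per-line `preg_match` checks, which is a choice for this sketch; plain string functions would work as well.

```php
<?php
// Hypothetical sample input. The real file's layout is not reproduced in this
// post, so the "LABEL: value" lines and "code = meaning" lines are assumptions.
$sample = <<<TXT
   PRVDR_CTGRY_SBTYP_CD
   SHORT DESCRIPTION: Provider Category Subtype Code
   LEN: 2
   TYPE: VARCHAR2
   DESCRIPTION: Identifies the subtype of the provider.
   SAS NAME: PRVDR_CTGRY_SBTYP_CD
   VALUES:
      01 = Short Term
      02 = Long Term

   PRVDR_CTGRY_CD
   SHORT DESCRIPTION: Provider Category Code
   LEN: 2
   TYPE: VARCHAR2
   DESCRIPTION: Identifies the type of provider.
   SAS NAME: PRVDR_CTGRY_CD
   VALUES:
      01 = Hospital
TXT;

// Maps the assumed labels in the file to keys in the result array.
const FIELD_KEYS = [
    'SHORT DESCRIPTION' => 'short',
    'LEN'               => 'len',
    'TYPE'              => 'type',
    'DESCRIPTION'       => 'description',
    'SAS NAME'          => 'sas',
];

function parseColumns(string $text): array
{
    $columns  = [];
    $current  = null;   // column being built; null until the first header line
    $inValues = false;  // true while reading the lines under "VALUES:"

    foreach (preg_split('/\R/', $text) as $line) {
        if (trim($line) === '') {
            continue; // blank line
        } elseif (preg_match('/^\s{3}(SHORT DESCRIPTION|LEN|TYPE|DESCRIPTION|SAS NAME|VALUES):\s*(.*)$/', $line, $m)) {
            if ($current === null) {
                continue; // field line before any column header: ignore it
            }
            $inValues = ($m[1] === 'VALUES');
            if (!$inValues) {
                $value = rtrim($m[2]);
                $current[FIELD_KEYS[$m[1]]] = ($m[1] === 'LEN') ? (int) $value : $value;
            }
        } elseif ($inValues && preg_match('/^\s{4,}(\S+)\s*=\s*(.+)$/', $line, $m)) {
            // Indented "code = meaning" line under VALUES.
            $current['values'][$m[1]] = rtrim($m[2]);
        } elseif (preg_match('/^\s{3}\S/', $line)) {
            // Same test as in the question: three spaces, then text, starts a column.
            if ($current !== null) {
                $columns[] = $current;
            }
            $current  = ['short' => '', 'len' => 0, 'type' => '', 'description' => '', 'sas' => '', 'values' => []];
            $inValues = false;
        }
    }

    if ($current !== null) {
        $columns[] = $current; // flush the last column
    }

    return $columns;
}

$columns = parseColumns($sample);
print_r($columns);
```

The sketch returns associative arrays rather than the positional arrays shown in the question; `array_values()` on each entry would give the positional shape. It does not handle descriptions that wrap across several lines, which the real file may well contain, so extra branches would be needed there.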
