
Load data from flat file using PHP

I have a text file that serves as a database, with the following data format:

*NEW RECORD
NM = Stackoverflow
DT = 9/15/2006
DS = Overflow
DS = Stack
DS = stackoverflow.com
DS = FAQ

*NEW RECORD
NM = Google
DT = 9/4/1998
DS = G+
DS = Google
DS = Search engine
DS = Search

You get the idea.

The problem is that I do not know how to load specific data from a specific record using PHP, especially when the data is not in an array format. Do I need to convert the data to an array first, or is there a way to retrieve information from the current format?

For example, what is the equivalent code for this MySQL query:

SELECT DT FROM MY_TXT WHERE DS = "Google"

If you're stuck with this format, you need a custom deserialisation mechanism. Here's one that works for your sample data:

<?php

date_default_timezone_set("UTC");

class Record {
    public $nm = null;
    public $dt = null;
    public $ds = [];

    function isValid() {
        return $this->nm !== null && $this->dt !== null && count($this->ds) > 0;
    }

    function isEmpty() {
        return $this->nm == null && $this->dt == null && count($this->ds) == 0;
    }
}

function deserialise($filename, $newLineSeparator = "\n") {
    $incompleteRecords = 0;
    $records = [];
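    $lines = explode($newLineSeparator, file_get_contents($filename));

    // Append a sentinel marker so the loop below also flushes the last record.
    // (explode() always returns at least one element, so no check is needed.)
    $lines[] = "*NEW RECORD";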

    $record = new Record();
    foreach ($lines as $line) {
        $line = trim($line);
        if ($line == "*NEW RECORD") {
            if ($record->isValid())
                $records[] = $record;
            else if (!$record->isEmpty())
                $incompleteRecords++;

            $record = new Record();
        } else if (substr($line, 0, 5) == "NM = ") {
            $record->nm = substr($line, 5);
        } else if (substr($line, 0, 5) == "DT = ") {
            $record->dt = strtotime(substr($line, 5));
        } else if (substr($line, 0, 5) == "DS = ") {
            $record->ds[] = substr($line, 5);
        }
    }

    echo "Found $incompleteRecords incomplete records.\n";

    return $records;
}

I tried it with your data (printing the result of deserialise with print_r) and got this output:

Found 0 incomplete records.
Array
(
    [0] => Record Object
        (
            [nm] => Stackoverflow
            [dt] => 1158278400
            [ds] => Array
                (
                    [0] => Overflow
                    [1] => Stack
                    [2] => stackoverflow.com
                    [3] => FAQ
                )

        )

    [1] => Record Object
        (
            [nm] => Google
            [dt] => 904867200
            [ds] => Array
                (
                    [0] => G+
                    [1] => Google
                    [2] => Search engine
                    [3] => Search
                )

        )

)

Is this what you want?
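To connect this back to the query in the question, here is a minimal sketch of a filter over the records returned by deserialise. The function name selectDtWhereDs is hypothetical and not part of the code above, and the stand-in data only mirrors the printed output so that the snippet runs on its own.

```php
<?php
// Rough equivalent of: SELECT DT FROM MY_TXT WHERE DS = "Google"
// $records is expected to look like the return value of deserialise():
// objects with a $dt timestamp and a $ds array of strings.
// selectDtWhereDs is a hypothetical name used for illustration only.
function selectDtWhereDs(array $records, string $value): array {
    // Keep only the records that have $value among their DS entries.
    $matches = array_filter($records, function ($record) use ($value) {
        return in_array($value, $record->ds, true);
    });
    // Pull out the DT field; array_values() reindexes from 0 after filtering.
    return array_values(array_map(function ($record) {
        return $record->dt;
    }, $matches));
}

// Stand-in data shaped like the output shown above (assumption: real code
// would pass deserialise("test.txt") instead).
$records = [
    (object) ['nm' => 'Stackoverflow', 'dt' => 1158278400,
              'ds' => ['Overflow', 'Stack', 'stackoverflow.com', 'FAQ']],
    (object) ['nm' => 'Google', 'dt' => 904867200,
              'ds' => ['G+', 'Google', 'Search engine', 'Search']],
];

foreach (selectDtWhereDs($records, 'Google') as $timestamp) {
    echo gmdate('n/j/Y', $timestamp), "\n"; // prints 9/4/1998
}
```

The match is exact and case-sensitive, like the SQL string comparison under a case-sensitive collation; a looser match would need strcasecmp or similar.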

Some considerations

  • Loads everything into memory at once; no batching.
  • Uses strtotime to parse dates into timestamps; you might want to just load them as strings (easier), or use the DateTime class. If you use strtotime, set an adequate timezone first, as in the example (date_default_timezone_set).
  • Assumes that a record is invalid if no NM is set, or no DT is set, or no DS entries exist. You can change this constraint by adapting the isValid method of the Record class.
  • No error handling for broken format, lowercase, etc.
  • Assumes \n as the newline separator. If your file uses \r\n or \r, invoke the deserialise function with that separator as the second parameter.
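Two of these points, the memory use and the date parsing, can be addressed together. The following is only a sketch under a few assumptions: readRecords is a hypothetical name, records are plain arrays rather than Record objects, dates follow the month/day/year pattern of the sample data, and the file uses \n or \r\n line endings (a bare \r is not handled by fgets).

```php
<?php
// Hypothetical streaming variant: yields one record at a time instead of
// building the full list in memory.
function readRecords(string $filename): Generator {
    $handle = fopen($filename, 'r');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $filename");
    }
    $record = null;
    try {
        while (($line = fgets($handle)) !== false) {
            $line = trim($line); // also strips the "\r" left over from "\r\n"
            if ($line === '*NEW RECORD') {
                if ($record !== null) {
                    yield $record;
                }
                $record = ['nm' => null, 'dt' => null, 'ds' => []];
            } elseif ($record !== null && strncmp($line, 'NM = ', 5) === 0) {
                $record['nm'] = substr($line, 5);
            } elseif ($record !== null && strncmp($line, 'DT = ', 5) === 0) {
                // "!" resets all unparsed fields, so the time is 00:00:00.
                $date = DateTimeImmutable::createFromFormat(
                    '!n/j/Y', substr($line, 5), new DateTimeZone('UTC'));
                $record['dt'] = $date ?: null; // null when the date does not parse
            } elseif ($record !== null && strncmp($line, 'DS = ', 5) === 0) {
                $record['ds'][] = substr($line, 5);
            }
        }
        if ($record !== null) {
            yield $record; // flush the last record
        }
    } finally {
        fclose($handle);
    }
}

// Demo: write a small sample to a temporary file and stream it back.
$tmp = tempnam(sys_get_temp_dir(), 'flat');
file_put_contents($tmp, "*NEW RECORD\r\nNM = Google\r\nDT = 9/4/1998\r\nDS = G+\r\nDS = Google\r\n");
foreach (readRecords($tmp) as $record) {
    echo $record['nm'], ' ', $record['dt']->format('Y-m-d'), "\n"; // prints "Google 1998-09-04"
}
unlink($tmp);
```

Unlike deserialise, this sketch does not count incomplete records; that check could be added before each yield.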

Without validation!

$filename = "test.txt"; // Your Filename ;-)

$t = new FlatDbSearch($filename);

var_dump($t->select('DT', 'DS = "Google"'));

class FlatDbSearch {

    protected $lines;

    public function __construct($filename) {
        $this->lines = file($filename, FILE_IGNORE_NEW_LINES);
    }

    public function select($column, $where) {
        $parts = explode("=", $where);
        $searchKey = trim(str_replace('"', '', $parts[0]));
        $searchValue = trim(str_replace('"', '', $parts[1]));
        $column = trim(str_replace('"', '', $column));

        $lines = $this->searchForward($searchKey, $searchValue);
        if (count($lines) !== 0) {
            $results = $this->searchBackward($column, $lines);
            return $results;
        }
        return array();
    }

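    protected function searchBackward($column, $lines) {
        $results = array();
        foreach($lines as $key) {
            // Walk up from the matching line until the requested column is found.
            for ($i = $key; $i > -1; $i--) {
                // Stop at the record marker so a missing column never
                // picks up a value from the previous record.
                if (trim($this->lines[$i]) == "*NEW RECORD") {
                    break;
                }
                // Limit of 2 keeps "=" characters inside the value intact.
                $parts = explode("=", $this->lines[$i], 2);
                if (count($parts) == 2 && $column == trim(str_replace('"', '', $parts[0]))) {
                    $results[] = trim(str_replace('"', '', $parts[1]));
                    break;
                }
            }
        }
        return $results;
    }

    protected function searchForward($searchKey, $searchValue) {
        $result = array();
        for ($i = 0; $i < count($this->lines); $i++) {
            // Lines without "=" (blank lines, "*NEW RECORD") are skipped.
            $parts = explode("=", $this->lines[$i], 2);
            if (count($parts) == 2 && trim(str_replace('"', '', $parts[0])) == $searchKey) {
                if (trim(str_replace('"', '', $parts[1])) == $searchValue) {
                    $result[] = $i;
                }
            }
        }
        return $result;
    }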
}
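Both answers skip validation of individual lines. As a starting point, here is a minimal sketch of a line parser that rejects anything that is not a key/value pair. The helper name parseLine and the choice to normalise keys to uppercase are assumptions for illustration, not part of either answer.

```php
<?php
// Hypothetical helper: split one "KEY = value" line into [key, value],
// or return null for lines that are not key/value pairs.
function parseLine(string $line): ?array {
    // Limit of 2 keeps any "=" inside the value intact.
    $parts = explode('=', $line, 2);
    if (count($parts) !== 2) {
        return null; // blank lines, "*NEW RECORD", or malformed input
    }
    // Assumption: keys are case-insensitive, so "dt" is treated as "DT".
    $key = strtoupper(trim($parts[0]));
    if ($key === '') {
        return null; // a line such as "= value" has no key
    }
    return [$key, trim($parts[1])];
}

var_dump(parseLine('DS = a=b'));      // key "DS", value "a=b"
var_dump(parseLine('*NEW RECORD'));   // null
var_dump(parseLine('dt = 9/4/1998')); // key "DT", value "9/4/1998"
```

A stricter version could also check the key against the known set (NM, DT, DS) and report the line number of anything it rejects.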
