JS: reading multiple lines from a text file

Question

I have a text file with the following structure:

#DATA1 1000
#DATA2 1000
#DATA3 2000

#VER B 2 20190403 "Text" 20190413
{
#TRANS 3001 {1 "TEXT"} -14000 "" "" 0
#TRANS 2611 {1 "TEXT"} -3500 "" "" 0
#TRANS 1510 {1 "LIU"} 17500 "" "" 0
}
#VER C 1 20190426 "TEXT" 20190426
{
#TRANS 1930 {} 1875 "" "" 0
#TRANS 1510 {} -1875 "" "" 0
}

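I am trying to find a way to:

Split the text file into segments, each running from a line that starts with #VER up to (but not including) the next line that starts with #VER
Then run other code on each text line within a segment (not part of this question)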

Any suggestions on how to get started? I have been testing with this fiddle, but without success so far.

https://jsfiddle.net/236pbzqf/2/

Answer 1

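Basic parsing. I would match the lines that start with #. You can easily loop over every line and ignore the ones that contain only { or }; if the braces really matter, you will have to handle them as you go through each line.

But assuming the { and } are not really needed, you can do something like this.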

var txt = `#DATA1 1000
#DATA2 1000
#DATA3 2000

#VER B 2 20190403 "Text" 20190413
{
#TRANS 3001 {1 "TEXT"} -14000 "" "" 0
#TRANS 2611 {1 "TEXT"} -3500 "" "" 0
#TRANS 1510 {1 "LIU"} 17500 "" "" 0
}
#VER C 1 20190426 "TEXT" 20190426
{
#TRANS 1930 {} 1875 "" "" 0
#TRANS 1510 {} -1875 "" "" 0
}
`;

// parse out the commands: every run of text from a # to the end of its line
const commands = txt.match(/(#[^\n]+)/g)

// loop over the commands and build up the result object
const results = commands.reduce((acc, command) => {
  // break it up into its parts: the keyword after # and the rest of the line
  const [x, type, params] = command.match(/#([^\s]+)\s(.*)/)
  // if we find a ver, add new object to push to
  // if we find trans, push to the last object
  // else, assume it is data fields
  if (type === "VER") {
    acc.vers.push({ data: params, trans: [] });
  } else if (type === "TRANS") {
    acc.vers[acc.vers.length - 1].trans.push(params);
  } else {
    acc.data[type] = params;
  }
  return acc;
}, { data: {}, vers: [] });

console.log(results);
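If all you need is the raw lines grouped per #VER, as the question describes, a plain loop may be enough. The following is only a sketch under a few assumptions: `splitIntoVerSegments` is a made-up helper name, blank lines are dropped, and anything before the first #VER line (such as the #DATA lines) is ignored.

```javascript
// Group the lines of the file into segments, one per #VER line.
// splitIntoVerSegments is a hypothetical helper, not part of the code above.
function splitIntoVerSegments(text) {
  const segments = [];
  let current = null; // stays null until the first #VER line is seen

  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === "") continue;        // skip blank lines
    if (line.startsWith("#VER ")) {
      current = [line];               // a #VER line opens a new segment
      segments.push(current);
    } else if (current !== null) {
      current.push(line);             // everything else joins the open segment
    }
    // lines before the first #VER are ignored in this sketch
  }
  return segments;
}

// A shortened version of the sample file from the question
const sample = [
  '#DATA1 1000',
  '',
  '#VER B 2 20190403 "Text" 20190413',
  '{',
  '#TRANS 3001 {1 "TEXT"} -14000 "" "" 0',
  '}',
  '#VER C 1 20190426 "TEXT" 20190426',
  '{',
  '#TRANS 1930 {} 1875 "" "" 0',
  '}',
].join("\n");

const segments = splitIntoVerSegments(sample);
console.log(segments.length); // 2
console.log(segments[0]);
```

Each entry in `segments` is an array of lines beginning with its #VER line, so the per-line code can run inside a nested loop over that array. The brace lines are kept here; filter them out if they carry no meaning for you.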

Answer 2

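Taking a step back, it looks like you are trying to write a code interpreter. The basic steps for doing that are:

Convert the code into a sequence of tokens (lexical analysis, using a "lexer")
Convert the tokens into some structured format that can be executed (for example a syntax tree, using a parser)

You could write these yourself, but you may want to explore existing lexer and parser tooling, since it can do a lot of the hard work for you. The learning curve can be a bit steep, though.

For a simple language you can probably get away with something less formal. A quick look at your sample suggests that the code is broken up into lines, so newlines are significant. It also helps that the important keywords appear to be prefixed with a # symbol. Given that, I would probably start with something like the following: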

const data = `
#DATA1 1000
#DATA2 1000
#DATA3 2000

#VER B 2 20190403 "Text" 20190413
{
#TRANS 3001 {1 "TEXT"} -14000 "" "" 0
#TRANS 2611 {1 "TEXT"} -3500 "" "" 0
#TRANS 1510 {1 "LIU"} 17500 "" "" 0
}
#VER C 1 20190426 "TEXT" 20190426
{
#TRANS 1930 {} 1875 "" "" 0
#TRANS 1510 {} -1875 "" "" 0
}
`

// Get the data as individual lines
let dataLines = data.split("\n")

// Remove empty lines
dataLines = dataLines.filter(line => line !== "")

// Convert to tokens
const tokenisedData = dataLines.map(line => {
  let tokenName = "UNKNOWN";
  if (line.match(/^#VER .+/)) {
    tokenName = "VER_TOKEN"
  } else if (line.match(/^#TRANS .+/)) {
    tokenName = "TRANS_TOKEN"
  } else if (line.match(/^#(DATA1|DATA2|DATA3) .+/)) {
    tokenName = "DATA_TOKEN"
  } else if (line === "{") {
    tokenName = "OPEN_BLOCK"
  } else if (line === "}") {
    tokenName = "CLOSE_BLOCK"
  }
  return {
    token: tokenName,
    rawText: line
  }
})

// Contextual parsing based on known token sequences can now begin
const parsedData = [];

while (tokenisedData.length > 0) {
  // Consume the first token
  const currentToken = tokenisedData.shift();
  switch (currentToken.token) {

    // Convert known sequence VER_TOKEN, OPEN_BLOCK, <<nested commands>> , CLOSE_BLOCK
    case "VER_TOKEN":
      // Set up an object to contain the VER command and the nested block
      const verCommand = {
        token: "VER_COMMAND",
        // TODO - presumably need to parse the rawText here and populate in this verCommand object
        rawText: currentToken.rawText,
        nestedCommands: []
      }
      // We now expect an OPEN_BLOCK. Throw if not.
      let nextToken = tokenisedData.shift();
      if (!nextToken || nextToken.token !== "OPEN_BLOCK") {
        throw new Error("Parse error: expected OPEN_BLOCK for VER command but instead got " + (nextToken ? nextToken.token : "end of input"))
      }
      nextToken = tokenisedData.shift();
      // Add the nested commands into the VER nestedCommands array
      while (nextToken && nextToken.token !== "CLOSE_BLOCK") {
        verCommand.nestedCommands.push(nextToken)
        // Get the next token
        nextToken = tokenisedData.shift();
      }
      // We now must have a CLOSE_BLOCK token
      if (!nextToken || nextToken.token !== "CLOSE_BLOCK") {
        throw new Error("Parse error: expected CLOSE_BLOCK for VER command but instead got " + (nextToken ? nextToken.token : "end of input"))
      }
      // Add the parsed VER command to the resulting parsed data
      parsedData.push(verCommand);
      break;

    // Nothing special to do with this token - keep it as it is
    default:
      parsedData.push(currentToken);
      break;
  }
}

console.log(parsedData)

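This example turned out a bit longer than I originally planned. :)

Still, it is probably a very reasonable starting point for a basic lexer and parser for the language you are interpreting.

With the code above, the text file is converted into the following structured format: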

[
   {
      "rawText":"#DATA1 1000",
      "token":"DATA_TOKEN"
   },
   {
      "rawText":"#DATA2 1000",
      "token":"DATA_TOKEN"
   },
   {
      "rawText":"#DATA3 2000",
      "token":"DATA_TOKEN"
   },
   {
      "nestedCommands":[
         {
            "rawText":"#TRANS 3001 {1 \"TEXT\"} -14000 \"\" \"\" 0",
            "token":"TRANS_TOKEN"
         },
         {
            "rawText":"#TRANS 2611 {1 \"TEXT\"} -3500 \"\" \"\" 0",
            "token":"TRANS_TOKEN"
         },
         {
            "rawText":"#TRANS 1510 {1 \"LIU\"} 17500 \"\" \"\" 0",
            "token":"TRANS_TOKEN"
         }
      ],
      "rawText":"#VER B 2 20190403 \"Text\" 20190413",
      "token":"VER_COMMAND"
   },
   {
      "nestedCommands":[
         {
            "rawText":"#TRANS 1930 {} 1875 \"\" \"\" 0",
            "token":"TRANS_TOKEN"
         },
         {
            "rawText":"#TRANS 1510 {} -1875 \"\" \"\" 0",
            "token":"TRANS_TOKEN"
         }
      ],
      "rawText":"#VER C 1 20190426 \"TEXT\" 20190426",
      "token":"VER_COMMAND"
   }
]

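Note that the VER command, its opening and closing braces, and all of its nested commands have been consumed and are now contained in a single VER_COMMAND object.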
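Each object still carries its original line in rawText, which the TODO in the VER branch leaves unparsed. One possible way to break such a line into fields is sketched below. `splitFields` is a hypothetical helper, and the regular expression assumes that quoted strings contain no escaped quotes and that brace groups are not nested, which holds for the sample but may not hold for your real files.

```javascript
// Split one raw line into its keyword and arguments.
// splitFields is a hypothetical helper; it treats "..." and {...} as single fields.
function splitFields(rawText) {
  // Alternatives, tried in order: a quoted string, a brace group, a bare word
  const fields = rawText.match(/"[^"]*"|\{[^}]*\}|\S+/g) || [];
  const [first = "", ...args] = fields;
  return { keyword: first.replace(/^#/, ""), args };
}

const ver = splitFields('#VER B 2 20190403 "Text" 20190413');
console.log(ver.keyword); // VER
console.log(ver.args);

const trans = splitFields('#TRANS 3001 {1 "TEXT"} -14000 "" "" 0');
console.log(trans.args);
```

The quotes and braces are kept in the returned fields so that nothing is lost; strip them in a later step once you know what each position means.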

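Having the data in a formal structure like this makes it much easier to work with in code, because you can now iterate over the program and execute whichever parts you need.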
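As a rough illustration of that iteration, the sketch below walks a structure of the same shape and dispatches on the token name. The `handlers` table, the `run` function, and the `log` array are placeholders invented for this example; replace the handler bodies with whatever each command should actually do.

```javascript
// A small structure with the same shape as the parser output above
const parsedData = [
  { token: "DATA_TOKEN", rawText: "#DATA1 1000" },
  {
    token: "VER_COMMAND",
    rawText: '#VER C 1 20190426 "TEXT" 20190426',
    nestedCommands: [
      { token: "TRANS_TOKEN", rawText: '#TRANS 1930 {} 1875 "" "" 0' },
      { token: "TRANS_TOKEN", rawText: '#TRANS 1510 {} -1875 "" "" 0' },
    ],
  },
];

// Hypothetical per-token handlers; here they only record what they saw
const log = [];
const handlers = {
  DATA_TOKEN: (node) => log.push("data: " + node.rawText),
  TRANS_TOKEN: (node) => log.push("  trans: " + node.rawText),
  VER_COMMAND: (node) => {
    log.push("ver: " + node.rawText);
    node.nestedCommands.forEach((child) => run(child)); // descend into the block
  },
};

// Look up the handler for a node and call it
function run(node) {
  const handler = handlers[node.token];
  if (handler) handler(node); // tokens without a handler (e.g. UNKNOWN) are skipped
}

parsedData.forEach((node) => run(node));
console.log(log.join("\n"));
```

Keeping the handlers in a lookup table means a new command type only needs a new entry, without touching the loop itself.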
