JS read multiple lines from text file
I have a text file with the following structure:
#DATA1 1000
#DATA2 1000
#DATA3 2000
#VER B 2 20190403 "Text" 20190413
{
#TRANS 3001 {1 "TEXT"} -14000 "" "" 0
#TRANS 2611 {1 "TEXT"} -3500 "" "" 0
#TRANS 1510 {1 "LIU"} 17500 "" "" 0
}
#VER C 1 20190426 "TEXT" 20190426
{
#TRANS 1930 {} 1875 "" "" 0
#TRANS 1510 {} -1875 "" "" 0
}
I am trying to find a way to:
Any suggestions on how to get started? I have been testing with this fiddle, but with no success so far.
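Basic parsing: I would match the lines that start with #. You could just as easily loop over each line and ignore the ones that contain only { or }. If the braces really matter, you would need to loop over each line and track them.
But assuming the { and } are not really needed, you can do something like this.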
const txt = `#DATA1 1000
#DATA2 1000
#DATA3 2000
#VER B 2 20190403 "Text" 20190413
{
#TRANS 3001 {1 "TEXT"} -14000 "" "" 0
#TRANS 2611 {1 "TEXT"} -3500 "" "" 0
#TRANS 1510 {1 "LIU"} 17500 "" "" 0
}
#VER C 1 20190426 "TEXT" 20190426
{
#TRANS 1930 {} 1875 "" "" 0
#TRANS 1510 {} -1875 "" "" 0
}
`;

// parse out the commands: everything from a # up to the end of its line
const commands = txt.match(/#[^\n]+/g);

// loop over the commands and build up the result
const results = commands.reduce((acc, command) => {
  // break the command up into its type and its parameters
  const [, type, params] = command.match(/#(\S+)\s(.*)/);
  // if we find a VER, add a new object to push to
  // if we find a TRANS, push it to the last VER object
  // else, assume it is a data field
  if (type === "VER") {
    acc.vers.push({ data: params, trans: [] });
  } else if (type === "TRANS") {
    acc.vers[acc.vers.length - 1].trans.push(params);
  } else {
    acc.data[type] = params;
  }
  return acc;
}, { data: {}, vers: [] });

console.log(results);
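The snippet above leaves each parameter list as one raw string. If the individual fields are needed, a small helper could split them. This is only a sketch: the name `splitFields` is hypothetical, and it assumes a field is either a `{...}` group without nested braces, a double-quoted string without escaped quotes, or a run of non-space characters.

```javascript
// Split one parameter string into its fields.
// Assumption: a field is a {...} group (no nesting), a "quoted string"
// (no escaped quotes), or a run of non-space characters.
function splitFields(params) {
  const fieldPattern = /\{[^}]*\}|"[^"]*"|\S+/g;
  // match() returns null when nothing matches, so fall back to an empty list
  return params.match(fieldPattern) || [];
}

const fields = splitFields('3001 {1 "TEXT"} -14000 "" "" 0');
console.log(fields);
// fields holds six entries: 3001, {1 "TEXT"}, -14000, "", "", 0
```

The braces and quotes are kept in the returned fields, so a later step can still tell a quoted empty string from a missing value.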
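Taking a step back, it looks like you are trying to write a code interpreter. The basic steps in doing so are:
You could write these yourself, but you may want to explore standard code interpretation engines, as they can do a lot of the hard work for you. The learning curve can be a little steep, though.
For a simple language, you can probably get away with something less formal. A quick look at your example suggests that the code is broken into lines, so line breaks are significant. Helpfully, the important keywords appear to be prefixed with a # symbol. Given the above, I would probably start by doing the following: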
const data = `
#DATA1 1000
#DATA2 1000
#DATA3 2000
#VER B 2 20190403 "Text" 20190413
{
#TRANS 3001 {1 "TEXT"} -14000 "" "" 0
#TRANS 2611 {1 "TEXT"} -3500 "" "" 0
#TRANS 1510 {1 "LIU"} 17500 "" "" 0
}
#VER C 1 20190426 "TEXT" 20190426
{
#TRANS 1930 {} 1875 "" "" 0
#TRANS 1510 {} -1875 "" "" 0
}
`
// Get the data as individual lines (trimmed, so stray whitespace or \r is ignored)
let dataLines = data.split("\n").map(line => line.trim());
// Remove empty lines
dataLines = dataLines.filter(line => line !== "");
// Convert to tokens
const tokenisedData = dataLines.map(line => {
  let tokenName = "UNKNOWN";
  if (line.match(/^#VER .+/)) {
    tokenName = "VER_TOKEN";
  } else if (line.match(/^#TRANS .+/)) {
    tokenName = "TRANS_TOKEN";
  } else if (line.match(/^#(DATA1|DATA2|DATA3) .+/)) {
    tokenName = "DATA_TOKEN";
  } else if (line === "{") {
    tokenName = "OPEN_BLOCK";
  } else if (line === "}") {
    tokenName = "CLOSE_BLOCK";
  }
  return {
    token: tokenName,
    rawText: line
  };
});
// Contextual parsing based on known token sequences may begin
const parsedData = [];
// Name of a token for error messages; undefined means the input ran out
const describe = token => (token ? token.token : "end of input");
while (tokenisedData.length > 0) {
  // Consume the first token
  const currentToken = tokenisedData.shift();
  switch (currentToken.token) {
    // Convert known sequence VER_TOKEN, OPEN_BLOCK, <<nested commands>>, CLOSE_BLOCK
    case "VER_TOKEN": {
      // Set up an object to contain the VER command and the nested block
      const verCommand = {
        token: "VER_COMMAND",
        // TODO - presumably need to parse the rawText here and populate in this verCommand object
        rawText: currentToken.rawText,
        nestedCommands: []
      };
      // We now expect an OPEN_BLOCK. Throw if not.
      let nextToken = tokenisedData.shift();
      if (!nextToken || nextToken.token !== "OPEN_BLOCK") {
        throw new Error("Parse error: expected OPEN_BLOCK for VER command but instead got " + describe(nextToken));
      }
      nextToken = tokenisedData.shift();
      // Add the nested commands into the VER nestedCommands array
      while (nextToken && nextToken.token !== "CLOSE_BLOCK") {
        verCommand.nestedCommands.push(nextToken);
        // Get the next token
        nextToken = tokenisedData.shift();
      }
      // We now must have a CLOSE_BLOCK token (nextToken is undefined if the input ran out)
      if (!nextToken || nextToken.token !== "CLOSE_BLOCK") {
        throw new Error("Parse error: expected CLOSE_BLOCK for VER command but instead got " + describe(nextToken));
      }
      // Add the parsed VER command to the resulting parsed data
      parsedData.push(verCommand);
      break;
    }
    // Nothing special to do with this token - keep it as it is
    default:
      parsedData.push(currentToken);
      break;
  }
}
console.log(parsedData);
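This example turned out a bit longer than I originally planned. :)
However, it may well be a very reasonable starting point for a basic lexer and parser for the language you are interpreting.
Using the above, the text file is converted into the following structured format: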
[
  {
    "rawText":"#DATA1 1000",
    "token":"DATA_TOKEN"
  },
  {
    "rawText":"#DATA2 1000",
    "token":"DATA_TOKEN"
  },
  {
    "rawText":"#DATA3 2000",
    "token":"DATA_TOKEN"
  },
  {
    "nestedCommands":[
      {
        "rawText":"#TRANS 3001 {1 \"TEXT\"} -14000 \"\" \"\" 0",
        "token":"TRANS_TOKEN"
      },
      {
        "rawText":"#TRANS 2611 {1 \"TEXT\"} -3500 \"\" \"\" 0",
        "token":"TRANS_TOKEN"
      },
      {
        "rawText":"#TRANS 1510 {1 \"LIU\"} 17500 \"\" \"\" 0",
        "token":"TRANS_TOKEN"
      }
    ],
    "rawText":"#VER B 2 20190403 \"Text\" 20190413",
    "token":"VER_COMMAND"
  },
  {
    "nestedCommands":[
      {
        "rawText":"#TRANS 1930 {} 1875 \"\" \"\" 0",
        "token":"TRANS_TOKEN"
      },
      {
        "rawText":"#TRANS 1510 {} -1875 \"\" \"\" 0",
        "token":"TRANS_TOKEN"
      }
    ],
    "rawText":"#VER C 1 20190426 \"TEXT\" 20190426",
    "token":"VER_COMMAND"
  }
]
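Notably, the VER command, the opening and closing braces, and all of the nested commands have been consumed and are now contained in a single VER_COMMAND object.
Giving the data this kind of formal structure makes it much easier to work with in code, because you can now iterate over the program and execute just the parts you want.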
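The rawText of each VER_COMMAND is still an unparsed string, which is what the TODO comment in the code refers to. One possible way to fill it in is sketched below. The function name `parseVerLine` and the field names (`series`, `number`, `date`, `text`, `registered`) are hypothetical, since the question does not say what the columns mean. The layout is inferred only from the two VER lines in the sample, so treat the pattern as an assumption.

```javascript
// Parse the raw text of a VER line into named fields.
// Assumption: the line is "#VER <word> <number> <8 digits> "<text>" <8 digits>",
// as in the two sample lines. All field names are hypothetical.
function parseVerLine(rawText) {
  const verPattern = /^#VER\s+(\S+)\s+(\d+)\s+(\d{8})\s+"([^"]*)"\s+(\d{8})\s*$/;
  const match = rawText.match(verPattern);
  if (!match) {
    throw new Error("Parse error: unrecognised VER line: " + rawText);
  }
  const [, series, number, date, text, registered] = match;
  // Dates are kept as strings because their exact meaning is not stated
  return { series, number: Number(number), date, text, registered };
}

const ver = parseVerLine('#VER B 2 20190403 "Text" 20190413');
console.log(ver.series, ver.number, ver.text);
```

Throwing on an unrecognised line matches the way the parser treats a missing OPEN_BLOCK or CLOSE_BLOCK.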
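As a rough illustration of iterating over the parsed structure, the sketch below adds up one numeric field of the TRANS lines nested in each VER_COMMAND. It uses a small hand-written sample in the same shape as the output shown earlier. The helper names `transValue` and `sumPerVer` are hypothetical, and the choice of the number that follows the {...} group as the value to sum is an assumption about the format.

```javascript
// A small hand-written sample in the shape produced by the parser.
const sampleParsedData = [
  { token: "DATA_TOKEN", rawText: "#DATA1 1000" },
  {
    token: "VER_COMMAND",
    rawText: '#VER C 1 20190426 "TEXT" 20190426',
    nestedCommands: [
      { token: "TRANS_TOKEN", rawText: '#TRANS 1930 {} 1875 "" "" 0' },
      { token: "TRANS_TOKEN", rawText: '#TRANS 1510 {} -1875 "" "" 0' }
    ]
  }
];

// Assumption: the number right after the {...} group is the value of interest.
function transValue(rawText) {
  const match = rawText.match(/^#TRANS\s+\S+\s+\{[^}]*\}\s+(-?\d+(?:\.\d+)?)/);
  return match ? Number(match[1]) : NaN;
}

// Produce one total per VER_COMMAND, ignoring every other top-level token.
function sumPerVer(parsedData) {
  return parsedData
    .filter(command => command.token === "VER_COMMAND")
    .map(command => ({
      rawText: command.rawText,
      total: command.nestedCommands
        .filter(nested => nested.token === "TRANS_TOKEN")
        .reduce((sum, nested) => sum + transValue(nested.rawText), 0)
    }));
}

const totals = sumPerVer(sampleParsedData);
console.log(totals);
```

A TRANS line that does not fit the assumed pattern makes the total NaN, which makes a bad line visible in the result.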