[英]Lexical Analyzer not getting the next character
So I am working on a project where we are making a small compiler program but before I can move on to the other parts I am having troubles with getting the lexical analyzer to output anything after '\\BEGIN' afterwards I debugged it and it seems the value is stuck in a loop where the condition is saying the next character is always a newline. 因此,我正在开发一个小的编译器程序的项目中,但是在我继续进行其他部分之前,我很难让词法分析器在调试\\ BEGIN之后输出任何内容,之后我对其进行了调试,看来值卡在一个循环中,条件是说下一个字符始终是换行符。 Is it because I haven't added the pattern matching yet to the defined tokens?
是因为我还没有将模式匹配添加到已定义的令牌中?
Here is the code 这是代码
import java.util
//import com.sun.javafx.fxml.expression.Expression.Parser.Token
/*Lexical analyzer will be responsible for the following:
- finds the lexemes
- Checks each given character determining the tokens
* */
class MyLexicalAnalyzer extends LexicalAnalyzer {
//Array full of the keywords
//val SpecialCharacters = List(']', '#', '*', '+', '\\', '[', '(',')', "![", '=')
val TEXT = "[a-z] | _ | 0-9 | [A-Z]:"
private var sourceLine: String = null
private val lexeme: Array[Char] = new Array[Char](999)
private var nextChar: Char = 0
private var lexLength: Int = 0
private var position: Int = 0
private val lexems: util.List[String] = new util.ArrayList[String]
def start(line: String): Unit = {
initializeLexems()
sourceLine = line
position = 0
getChar()
getNextToken()
}
// A helper method to determine if the current character is a space.
private def isSpace(c: Char) = c == ' '
//Defined and intialized tokens
def initializeLexems(): Any = {
lexems.add("\\BEGIN")
lexems.add("\\END")
lexems.add("\\PARAB")
lexems.add("\\DEF[")
lexems.add("\\USE[")
lexems.add("\\PARAE")
lexems.add("\\TITLE[")
lexems.add("]")
lexems.add("[")
lexems.add("\\")
lexems.add("(")
lexems.add(")")
lexems.add("![")
lexems.add("=")
lexems.add("+")
lexems.add("#")
}
//val pattern = new regex("''").r
def getNextToken() ={
lexLength = 0
// Ignore spaces and add the first character to the token
getNonBlank()
addChar()
getChar()
// Continue gathering characters for the token
while ( {
(nextChar != '\n') && (nextChar != ' ')
}) {
addChar()
getChar()
}
// Convert the gathered character array token into a String
val newToken: String = new String(lexeme)
if (lookup(newToken.substring(0, lexLength)))
MyCompiler.setCurrentToken(newToken.substring(0,lexLength))
}
// A helper method to get the next non-blank character.
private def getNonBlank(): Unit = {
while ( {
isSpace(nextChar)
}) getChar()
}
/*
Method of function that adds the current character to the token
after checking to make sure that length of the token isn't too
long, a lexical error in this case.
*/
def addChar(){
if (lexLength <= 998) {
lexeme({
lexLength += 1; lexLength - 1
}) = nextChar
lexeme(lexLength) = 0
}
else
System.out.println("LEXICAL ERROR - The found lexeme is too long!")
if (!isSpace(nextChar))
while ( {
!isSpace(nextChar)
})
getChar()
lexLength = 0
getNonBlank()
addChar()
}
//Reading from the file its obtaining the tokens
def getChar() {
if (position < sourceLine.length)
nextChar = sourceLine.charAt ( {
position += 1;
position - 1
})
else nextChar = '\n'
def lookup(candidateToken: String): Boolean ={
if (!(lexems.contains(candidateToken))) {
System.out.println("LEXICAL ERROR - '" + candidateToken + "' is not recognized.")
return false
}
return true
}
}
else nextChar = '\\n'<- this is where the condition goes after rendering the first character '\\BEGIN' then just keeps outputting in the debug console as listed below.
This is what the debug console it outputting after '\\BEGIN' is read through Can anyone please let me know why that is?
这是通读'\\ BEGIN'后输出的调试控制台的内容。有人可以让我知道为什么吗? This happens after I keep stepping into it many times as well.
在我也多次介入之后,就会发生这种情况。
Here is the driver class that uses the lexical analyzer 这是使用词法分析器的驱动程序类
import scala.io.Source
object MyCompiler {
//check the arguments
//check file extensions
//initialization
//get first token
//call start state
var currentToken : String = ""
def main(args: Array[String]): Unit = {
val filename = args(0)
//check if an input file provided
if(args.length == 0) {
//usage error
println("USAGE ERROR: Must provide an input file. ")
System.exit(0)
}
if(!checkFileExtension(args(0))) {
println("USAGE ERROR: Extension name is invalid make sure its .gtx ")
System.exit(0)
}
val Scanner = new MyLexicalAnalyzer
val Parser = new MySyntaxAnalyzer
//getCurrentToken(Scanner.getNextToken())
//Parser.gittex()
for (line <- Source.fromFile(filename).getLines()){
Scanner.start(line)
println()
}
//.......
//If it gets here, it is compiled
//post processing
}
//checks the file extension if valid and ends with .gtx
def checkFileExtension(filename : String) : Boolean = filename.endsWith(".gtx")
def getCurrentToken() : String = this.currentToken
def setCurrentToken(t : String ) : Unit = this.currentToken = t
}
The code is operating as it is supposed to. 该代码正在按预期方式运行。 The first line contains only the string
\\BEGIN
so the lexical analyser is treating the end of the first line as an '\\n' as shown in this method: 第一行仅包含字符串
\\BEGIN
因此词法分析器将第一行的末尾视为“ \\ n”,如以下方法所示:
def getChar() {
if (position < sourceLine.length)
nextChar = sourceLine.charAt ( {
position += 1;
position - 1
})
else nextChar = '\n'
However, the comment directly above that method does not describe what the method actually does. 但是,该方法正上方的注释并未描述该方法的实际作用。 This could be a hint as to where your confusion lies.
这可能暗示了您的困惑所在。 If the comment says it should read from the file, but it is not reading from the file, maybe that's what you've forgotten to implement.
如果注释说应该从文件中读取,但不是从文件中读取,则可能是您忘记了要实现的内容。
声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.