
How to comment a string starting with "//"?

I am learning compiler construction these days, and I am having trouble writing the code that handles comments. When I write a string such as Hello //World in the Notepad file, the program prints "/" and reports the DIV operator, which I don't want. What I actually want is for Hello to be printed in the output and World to be treated as a comment. I know I have included code for the DIV operator, but that code is also necessary. How can I implement the comment logic while keeping the check for the DIV operator?

Here is the code:

import java.io.File;
import java.util.Scanner;
import java.io.FileNotFoundException;

public class main {
    
    public static void main(String[] args) throws FileNotFoundException{
        
        File newFile = new File("C:/temp/sourcecode.txt");
        Scanner scanFile = new Scanner(newFile);
        //Scanner scan = new Scanner(System.in);
        
        char ch;
        String str;
        
        
        
        while(scanFile.hasNextLine()){
        str = scanFile.nextLine();
        int l = str.length();
        if(!str.startsWith("//") && !str.startsWith("/*") && !str.endsWith("*/")) {
        for(int i =0; i<l ; i++) {
            ch = str.charAt(i);
            
            System.out.println(ch);
                
            if(ch == '*'){
                System.out.println("The Operator is MUL");
                System.out.println("arop\n");
            }
            if(ch == '/')
            {
                 System.out.println("The Operator is DIV");
                 System.out.println("arop\n");
            }
            

            
        }
        }
       
        
            int OP = 0;
            
            switch(OP){
                case 0: 
                    if(str.contains("<") && str.contains(">")){
                        System.out.println("The Operator is NE");
                        System.out.println("relop\n");
                        break;
                    }
                case 1: 
                    if(str.contains("<") && str.contains("=")){
                        System.out.println("The Operator is LE");
                        System.out.println("relop\n");
                        break;
                    }
                    
                case 2: 
                    if(str.contains(">") && str.contains("=")){
                        System.out.println("The Operator is GE");
                        System.out.println("relop\n");
                        break;
                    }
                    
                case 3: 
                    if(str.contains("<")){
                        System.out.println("The Operator is LT");
                        System.out.println("relop\n");
                        break;
                    }
                case 4: 
                    if(str.contains(">")){
                        System.out.println("The Operator is GT");
                        System.out.println("relop\n");
                        break;
                    }
                case 5: 
                    if(str.contains("==")){
                        System.out.println("The Operator is EQ");
                        System.out.println("relop\n");
                        break;
                    }
                case 6: 
                    if(str.contains("+")){
                        System.out.println("The Operator is ADD");
                        System.out.println("arop\n");
                        break;
                    }
                case 7: 
                    if(str.contains("-")){
                        System.out.println("The Operator is SUB");
                        System.out.println("arop\n");
                        break;
                    }
//                case 8: 
//                    if(str.contains("*")){
//                        System.out.println("The Operator is MUL");
//                        System.out.println("arop\n");
//                        break;
//                    }
//                case 9: 
//                    if(str.contains("/")){
//                        System.out.println("The Operator is DIV");
//                        System.out.println("arop\n");
//                        break;
//                    }
                case 10: 
                    if(str.contains("=")){
                        System.out.println("The Operator is ASN");
                        System.out.println("otop\n");
                        break;
                    }
                case 11: 
                    if(str.contains("'")){
                        System.out.println("The Operator is PRN");
                        System.out.println("otop\n");
                        break;
                    }
                case 12: 
                    if(str.contains(";")){
                        System.out.println("The Operator is LTRN");
                        System.out.println("otop\n");
                        break;
                    }
                case 13: 
                    if(str.contains("{")){
                        System.out.println("The Operator is LBRC");
                        System.out.println("otop\n");
                        break;
                    }
                case 14: 
                    if(str.contains("}")){
                        System.out.println("The Operator is RBRC");
                        System.out.println("otop\n");
                        break;
                    }      
            }
        }
    }
    }

Thank you in advance!

When programming a compiler, the different input words in your code are called tokens, and the phase that recognises the role of each token is called the lexical analysis phase.

Tokens are usually described with regular expressions (regex), which a lexer implements as a finite automaton.

You can read about it in much more detail here:

https://en.wikipedia.org/wiki/Lexical_analysis

You should replace the calls to contains with a lexer, which is the name of the tool that performs lexical analysis. A lexer uses regexes because the problem is not just about / versus //: there are many situations in which your compiler needs to decide which token to choose.
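As a rough illustration of that idea, here is a minimal sketch of a regex-based tokenizer for a single line. The class name RegexLexerSketch, the WORD token, and the "KIND:text" output format are assumptions made for this example; the operator names follow the question's code. The alternatives are ordered so that longer tokens such as // and <= are tried before / and <.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch, not a full lexer: characters that match no
// alternative (spaces, digits, ...) are silently skipped.
final class RegexLexerSketch {
    // Alternatives are tried left to right, so longer tokens come first:
    // "//" before "/", and "<>", "<=", ">=", "==" before "<", ">", "=".
    private static final Pattern TOKEN = Pattern.compile(
          "(?<COMMENT>//.*)"
        + "|(?<NE><>)"
        + "|(?<LE><=)"
        + "|(?<GE>>=)"
        + "|(?<EQ>==)"
        + "|(?<LT><)"
        + "|(?<GT>>)"
        + "|(?<ASN>=)"
        + "|(?<MUL>\\*)"
        + "|(?<DIV>/)"
        + "|(?<ADD>\\+)"
        + "|(?<SUB>-)"
        + "|(?<WORD>[A-Za-z_][A-Za-z0-9_]*)");

    private static final String[] KINDS = {
        "COMMENT", "NE", "LE", "GE", "EQ", "LT", "GT",
        "ASN", "MUL", "DIV", "ADD", "SUB", "WORD"
    };

    /** Returns the tokens of one line as "KIND:text"; comments are dropped. */
    static List<String> tokenize(String line) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(line);
        while (m.find()) {
            // Find which named group produced this match.
            for (String kind : KINDS) {
                if (m.group(kind) != null) {
                    if (!kind.equals("COMMENT")) {
                        tokens.add(kind + ":" + m.group(kind));
                    }
                    break;
                }
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("Hello //World")); // prints [WORD:Hello]
        System.out.println(tokenize("a / b"));         // prints [WORD:a, DIV:/, WORD:b]
    }
}
```

With this ordering, the DIV check stays in place and only fires when the slash is not the start of a comment.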

Here's an example of a finite automaton for recognising different tokens. Notice that for each prefix there can be many options for possible tokens:

[Figure: finite automaton for recognising tokens (image not available)]
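To make the "many options per prefix" point concrete, here is a small hand-written automaton for the relational and assignment operators from the question. It is only a sketch: the class name, the state numbering, and the method signature are assumptions for this example, and it is not a reconstruction of the automaton in the figure. After reading <, the automaton cannot yet decide between LT, LE and NE, so it reads one more character and keeps the longest match.

```java
// Hypothetical sketch of a tiny finite automaton; operator names follow the question.
final class RelopAutomatonSketch {
    /**
     * Returns the name of the longest operator starting at index i of s,
     * or null if no operator starts there.
     */
    static String match(String s, int i) {
        int state = 0;          // 0 = start, 1 = seen '<', 2 = seen '>', 3 = seen '='
        String accepted = null; // token accepted so far, if any
        for (int p = i; p < s.length(); p++) {
            char c = s.charAt(p);
            switch (state) {
                case 0:
                    if (c == '<') { state = 1; accepted = "LT"; }
                    else if (c == '>') { state = 2; accepted = "GT"; }
                    else if (c == '=') { state = 3; accepted = "ASN"; }
                    else return null;
                    break;
                case 1: // prefix "<" can still become "<=" or "<>"
                    if (c == '=') return "LE";
                    if (c == '>') return "NE";
                    return accepted;
                case 2: // prefix ">" can still become ">="
                    if (c == '=') return "GE";
                    return accepted;
                case 3: // prefix "=" can still become "=="
                    if (c == '=') return "EQ";
                    return accepted;
            }
        }
        return accepted; // input ended right after a one-character operator
    }

    public static void main(String[] args) {
        System.out.println(match("a <= b", 2)); // prints LE
        System.out.println(match("a < b", 2));  // prints LT
    }
}
```

The same pattern extends to / and //: the slash is a prefix shared by the DIV token and the comment token, and the next character decides between them.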

In Java you can use JFlex, which generates lexer code from your token definitions.

When you find a /, you need to check the next character.

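if (ch == '/') {
    // Look one character ahead; '\0' stands for "end of line".
    char nextCh = (i + 1 < l ? str.charAt(i + 1) : '\0');
    if (nextCh == '/') {
        // "//" starts an end-of-line comment: ignore the rest of the line.
        System.out.println("The Operator is EndOfLineComment");
        System.out.println("arop\n");
        break;
    } else if (nextCh == '*') {
        // "/*" starts a traditional comment: skip the '*' so it is not reported as MUL.
        System.out.println("The Operator is TraditionalComment");
        System.out.println("arop\n");
        i++;
    } else {
        // A lone '/' is the division operator.
        System.out.println("The Operator is DIV");
        System.out.println("arop\n");
    }
}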
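The same lookahead can be packaged into a small scanner that also remembers whether it is inside a /* ... */ comment from one line to the next. This is a sketch under assumptions, not the asker's program: the class CommentAwareScannerSketch, the scanLine method, and the idea of collecting operator names in a list instead of printing them are all invented for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: scans lines one at a time, skipping comments.
final class CommentAwareScannerSketch {
    // True while the scanner is between "/*" and the matching "*/".
    private boolean inBlockComment = false;

    /**
     * Returns the part of the line that lies outside comments and
     * appends "DIV" or "MUL" to ops for each such operator found there.
     */
    String scanLine(String str, List<String> ops) {
        StringBuilder code = new StringBuilder();
        int l = str.length();
        for (int i = 0; i < l; i++) {
            char ch = str.charAt(i);
            char nextCh = (i + 1 < l) ? str.charAt(i + 1) : '\0';
            if (inBlockComment) {
                if (ch == '*' && nextCh == '/') {
                    inBlockComment = false;
                    i++; // skip the '/' that closes the comment
                }
                continue; // everything inside the comment is ignored
            }
            if (ch == '/' && nextCh == '/') {
                break; // end-of-line comment: ignore the rest of the line
            }
            if (ch == '/' && nextCh == '*') {
                inBlockComment = true;
                i++; // skip the '*' so it is not reported as MUL
                continue;
            }
            if (ch == '/') {
                ops.add("DIV");
            } else if (ch == '*') {
                ops.add("MUL");
            }
            code.append(ch);
        }
        return code.toString();
    }

    public static void main(String[] args) {
        CommentAwareScannerSketch scanner = new CommentAwareScannerSketch();
        List<String> ops = new ArrayList<>();
        System.out.println(scanner.scanLine("Hello //World", ops)); // prints "Hello " (trailing space kept)
        System.out.println(scanner.scanLine("a / b", ops));         // prints "a / b"
        System.out.println(ops);                                    // prints [DIV]
    }
}
```

For the input Hello //World this keeps Hello, drops World, and reports no DIV operator, while a / b still reports DIV.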

