PLY lex is not counting lines

Question

I'm trying to write a program that counts certain things in a C program. The problem is that I'm trying to count lines with:

def t_newline(t):
    r'\n+'
    t.lexer.lineno += len(t.value)

It doesn't count the lines. Here is an example of the input and the output:

for
if
else
switch
exit
Number of if´s: 1
Number of for´s: 1
Number of While´s: 0
Number of else´s: 1
Number of switche´s: 1
Number of lines: 1

Every time I press Enter to write a new line of code, it doesn't get counted. Also, if I press Enter without writing anything, this error appears:

Traceback (most recent call last):
  File "C:/Users/User/PycharmProjects/practicas/firma_digital.py", line 80, in <module>
    if tok.type is not None:
AttributeError: 'NoneType' object has no attribute 'type'

Here is all my code:

import ply.lex as lex
import ply.yacc as yacc
FinishProgram=0
Enters=0
Fors=0
Whiles=0
ifs=0
elses=0
Switches=0

reserved = {
   'if' : 'IF',
   'for' : 'FOR',
   'while': 'WHILE',
   'else': 'ELSE',
   'switch': 'SWITCH'
}
tokens = [
    'ID',
    'COLON',
    'SEMICOLON',

    ]+ list(reserved.values()) # Reserved words

t_COLON= r','
t_SEMICOLON=r';'


def t_ID(t):
    r'[a-zA-Z_][a-zA-Z0-9_]*'
    t.type = reserved.get(t.value, 'ID')
    return t

t_ignore=r' '

def t_newline(t):
    r'\n+'
    t.lexer.lineno += len(t.value)

def t_error(t):
    print("This thing failed")
    t.lexer.skip(1)

lexer=lex.lex()


#def p_gram_sets(p):
 #   '''

  #  gram : SETS SEMICOLON
   #      | empty
    #'''
    #if p[1]:
     #   print(p[1])
      #  print("SETS")



def p_empty(p):
    '''
    empty :
    '''
    p[0]=None





def p_error(p):
    print("Syntax error in input!")


parser=yacc.yacc()

while FinishProgram==0:
    s=input('')
    lexer.input(s)
    tok = lexer.token()

    if tok.type is not None:
        if tok.type=='IF':
            ifs+=1
        elif tok.type=='FOR':
            Fors+=1
        elif tok.type=='WHILE':
            Whiles+=1
        elif tok.type=='ELSE':
            elses+=1
        elif tok.type=='SWITCH':
            Switches+=1

    #parser.parse(s)
    if "exit" in s:
        print("Number of if´s: "+ str(ifs) + "\n"+"Number of for´s: "+str(Fors)+"\n"+"Number of While´s: "+str(Whiles)+"\n"+"Number of else´s: "+str(elses)+"\n"+"Number of switche´s: "+str(Switches)+"\n"+"Number of lines: "+str(tok.lineno))
        FinishProgram=1

Answer 1

It's not that ply is not counting the newline characters. It's never seeing them, because you call it repeatedly using input().

From the Python docs (emphasis added):

input([prompt])

If the prompt argument is present, it is written to standard output without a trailing newline. The function then reads a line from input, converts it to a string (stripping a trailing newline), and returns that.
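To make the quoted behaviour concrete, here is a minimal sketch that uses only the standard library. The two helper names are hypothetical, and `io.StringIO` stands in for the keyboard. It counts how many newline characters a program would ever get to hand to a lexer when reading with input() versus reading everything at once.

```python
import io
import sys


def newlines_seen_via_input(text):
    """Read text line by line with input() and count the newlines received."""
    old_stdin = sys.stdin
    sys.stdin = io.StringIO(text)  # pretend the text was typed by the user
    seen = 0
    try:
        while True:
            try:
                line = input()
            except EOFError:  # raised once the simulated input runs out
                break
            seen += line.count("\n")
    finally:
        sys.stdin = old_stdin
    return seen


def newlines_seen_via_read(text):
    """Read the whole text in one call and count the newlines received."""
    return io.StringIO(text).read().count("\n")


sample = "for\nif\n\nelse\n"
print("via input():", newlines_seen_via_input(sample))
print("via read(): ", newlines_seen_via_read(sample))
```

With input(), every string that reaches the lexer has already lost its newline, so a rule matching `\n+` has nothing to match.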

The normal usage of lex.lex is to

Additionally, you are printing

... + str(tok.lineno)

rather than

... + str(lexer.lineno)

After the last token is tokenised, the lexer's token() method returns None, so you can expect tok to be None when your loop terminates, and therefore it is an error to try to extract its lineno attribute. (However, in your case it only happens if the line you just tried to tokenise was empty, because you only use the first token on each line.) You want the line count recorded in the lexer object, which is the count you update in your action.
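The pattern this implies is to keep calling token() until it returns None, and to read the line count from the lexer rather than from the last token. The sketch below illustrates that loop with a hypothetical stand-in class, `StandInLexer`, written with the standard `re` module. It is not PLY and only mimics the input()/token()/lineno behaviour described above, so the loop can be run without installing anything. The `drain` helper is also an assumption made for illustration.

```python
import re


class StandInLexer:
    """Hypothetical stand-in for a PLY lexer (not the real implementation)."""

    _pattern = re.compile(
        r"(?P<NEWLINE>\n+)|(?P<WORD>[A-Za-z_][A-Za-z0-9_]*)|(?P<SKIP>.)"
    )

    def __init__(self):
        self.lineno = 1
        self._matches = iter(())

    def input(self, text):
        self._matches = self._pattern.finditer(text)

    def token(self):
        # Resume scanning where the previous call stopped.
        for match in self._matches:
            if match.lastgroup == "NEWLINE":
                self.lineno += len(match.group())  # same idea as t_newline
            elif match.lastgroup == "WORD":
                return match.group()
        return None  # input exhausted, like the None described above


def drain(lexer, text):
    """Collect every token in text, stopping when token() returns None."""
    lexer.input(text)
    words = []
    while True:
        tok = lexer.token()
        if tok is None:  # check before touching any attribute of tok
            break
        words.append(tok)
    return words


lexer = StandInLexer()
print(drain(lexer, "if x else y\n\nfor\n"))
print(drain(lexer, ""))  # empty input: no tokens, and no AttributeError
print("lineno:", lexer.lineno)
```

Draining the whole input also picks up tokens after the first one on a line, which the loop in the question skips.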

If you want to work on an entire file (which is the usual case for parsers, other than line-by-line calculators), you need to read the entire contents of the file (or stdin, as the case may be). For non-interactive use, you would generally do that with the file object's read method. If you wanted to test your lexer, you would then use the fact that the lexer object implements Python's iteration protocol, so it will work in a for statement. So your main loop would be something like:

import sys

# Hand the whole input to the lexer at once so that it sees the newlines.
lexer.input(sys.stdin.read())
for tok in lexer:
    pass  # Update counts here

and you would terminate the input by typing an end-of-file character at the beginning of a line (control-D on Linux or control-Z on Windows).

Personally, I would implement the token type counting with a defaultdict:

from collections import defaultdict

counts = defaultdict(int)
for tok in lexer:
    counts[tok.type] += 1
# tok_type avoids shadowing the built-in type(); print() already adds the newline.
for tok_type, count in counts.items():
    print("Number of %s's: %d" % (tok_type, count))
# Or: print('\n'.join("Number of %s's: %d" % (tok_type, count) for tok_type, count in counts.items()))
print("Number of lines: %d" % lexer.lineno)
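One thing to be aware of with the approach above is that a token type which never occurs has no entry in the dictionary, so a line such as "Number of While's: 0" from the question's output would not be printed. A possible variation, sketched here with a hypothetical helper `format_report` and an assumed default list of keyword token types taken from the question's `reserved` table, is to iterate over the types you care about and let `collections.Counter` supply the zeros.

```python
from collections import Counter


def format_report(token_types, line_count,
                  wanted=("IF", "FOR", "WHILE", "ELSE", "SWITCH")):
    """Build the report text, including keyword types that never appeared."""
    counts = Counter(token_types)  # missing keys read as 0
    lines = ["Number of %s's: %d" % (name, counts[name]) for name in wanted]
    lines.append("Number of lines: %d" % line_count)
    return "\n".join(lines)


# Example with hand-written token types; other types such as ID are ignored.
print(format_report(["FOR", "IF", "ELSE", "SWITCH", "ID"], 5))
```

With the lexer from the question, the call would presumably look like `format_report((tok.type for tok in lexer), lexer.lineno)`.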


Answer 1: accepted, 2018-06-08 17:53:19
