
Computation of DFA states

Original 2013-05-22 17:37:50 5 1 regex/ lex/ flex-lexer

I want to compute the total number of DFA states for a certain regular expression using FLEX. Which C files or functions will help me to achieve this task using FLEX?

1 answer

If you look in the file generated by flex, the number of entries in yy_accept (and yy_base) will probably give a good indication of the number of states used by the generated DFA. If you use the -Cf option, then yy_nxt contains the transition function of the DFA, and the number of rows in that table is again the number of states used.
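As a rough illustration of reading those table sizes, here is a minimal Python sketch that scans the generated scanner source for one-dimensional yy_* array declarations and reports their declared lengths. The function name `table_sizes` and the declaration pattern are assumptions for illustration; the exact form of the declarations differs between flex versions, and a two-dimensional yy_nxt produced by -Cf is not matched by this pattern.

```python
import re

# Matches declarations such as "yy_accept[14] =" and captures name and size.
# Assumption: the table is declared with an explicit numeric length; the
# surrounding type names vary between flex versions and are ignored here.
TABLE_DECL = re.compile(r"\b(yy_[A-Za-z_]+)\s*\[\s*(\d+)\s*\]\s*=")


def table_sizes(source):
    """Return {table name: declared entry count} for one-dimensional yy_* tables."""
    return {name: int(size) for name, size in TABLE_DECL.findall(source)}


def report(path="lex.yy.c"):
    """Print the declared size of every table found in a generated scanner file."""
    with open(path) as handle:
        sizes = table_sizes(handle.read())
    for name in sorted(sizes):
        print(f"{name}: {sizes[name]} entries")
```

Calling `report("lex.yy.c")` after running flex prints one line per table. The declared length of yy_accept is only an approximation of the state count, since the table may include extra bookkeeping entries.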

You may have a different version of flex where the tables are named differently, but most likely their names will be very similar.

In reaction to your questions below: the number of states in a DFA could be considered quite well defined, assuming the DFA has been minimized. The number of transitions is however much less well defined.

First, flex has a transition for every input character, because its default rule will ECHO any character that is not part of the defined language. This is implemented by an additional state that handles that case. Using a debugger you could reverse engineer which state this is. But beware that if you use start conditions, there may be several such states. If you want to analyze many regular expressions, you may want to look into other tools, or take the sources of flex and go from there.

Second, flex has strategies to minimize the total size of all the tables. The -Cf option instructs it not to do that. One such optimization is finding equivalence classes of characters and storing transitions only per character class. An input character is first translated to its class, which in turn is used to determine the transition. As a consequence the number of transitions is much lower, but an additional table (see yy_ec) is required for determining the character class.
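The effect of character equivalence classes can be sketched in a few lines of Python. This is not how flex computes its classes internally; it is a simplified illustration that starts from a finished full transition table and merges input characters whose columns are identical. The names `equivalence_classes`, `ec`, and `compressed` are hypothetical.

```python
def equivalence_classes(table):
    """Compress a full DFA transition table by merging identical columns.

    table[s][c] is the next state from state s on input character c.
    Returns (ec, compressed), where ec[c] is the class of character c and
    compressed[s][k] is the next state from state s on class k.
    """
    class_of_column = {}  # column contents -> class index
    ec = []
    for c in range(len(table[0])):
        column = tuple(row[c] for row in table)
        # Characters with identical columns behave the same in every state.
        ec.append(class_of_column.setdefault(column, len(class_of_column)))

    compressed = [[None] * len(class_of_column) for _ in table]
    for c, k in enumerate(ec):
        for s, row in enumerate(table):
            compressed[s][k] = row[c]
    return ec, compressed


def next_state(ec, compressed, state, char):
    """Two-step lookup: character -> class, then (state, class) -> next state."""
    return compressed[state][ec[char]]
```

With a two-state table over four characters where the first two and the last two characters behave identically, the four columns collapse to two classes, so the table shrinks from 8 entries to 4 plus the 4-entry `ec` map. On a 256-character alphabet with only a handful of distinct classes the saving is far larger, which is why counting "transitions" depends on which representation you look at.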

As a consequence, the number of transitions is not a well-defined concept. If you are interested in the memory footprint of the scanner, I would look at the size of the data section of the scanner. Use for example objdump -h on the lex.yy.o file. The size of the .rodata section will give a quite accurate estimate of the total size of the tables.
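If you want to collect that number for several scanners, the section table printed by objdump -h can be parsed with a small script. The following Python sketch assumes the usual GNU objdump layout, in which each section line starts with an index followed by the section name and its size in hexadecimal; the helper names `section_sizes` and `rodata_size` are hypothetical.

```python
import subprocess


def section_sizes(objdump_output):
    """Parse `objdump -h` text into {section name: size in bytes}."""
    sizes = {}
    for line in objdump_output.splitlines():
        fields = line.split()
        # Section lines look like: "3 .rodata 000003c8 <VMA> <LMA> <offset> 2**5".
        # Header and flag lines do not start with a number and are skipped.
        if len(fields) >= 3 and fields[0].isdigit():
            sizes[fields[1]] = int(fields[2], 16)
    return sizes


def rodata_size(object_file="lex.yy.o"):
    """Run objdump on an object file and return the size of .rodata (0 if absent)."""
    result = subprocess.run(
        ["objdump", "-h", object_file],
        capture_output=True, text=True, check=True,
    )
    return section_sizes(result.stdout).get(".rodata", 0)
```

Note that .rodata also holds string literals and any other constant data in the object file, so the value is an upper bound on the table size rather than an exact figure.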

You seem to have already found the -v option of flex, which gives the number of states in the DFA in a more verbose form. In answer to why "a" {} gives 5 states, you may also use the --trace option, as it prints the DFA while it is generated. Apparently there is also an End Marker rule; I assume it is used for end-of-file. For each start condition there are two states, one that is used at the start of a line and one in the middle of a line. That makes 3 accepting states (one for "a", one for End Marker and one for (.|"\\n")) plus two states for the single start condition.

The source file dfa.c is not part of the generated code, but if you feel brave you could of course change the sources of flex to do further analysis of your own. I had a quick look, and it does seem that generation of the code is intertwined with the transformations, which makes it a bit less modular than one would desire for an experimentation platform. Also beware of the K&R prototypes, which effectively disable any type checking on the prototypes.




solution1 2 ACCEPTED 2013-05-26 09:54:36
