编译原理-词法分析
词法分析
实验要求
- 完成基于flex的C-minus词法分析器
- 完成基于bison的C-minus语法分析器
实验难点
- 根据C-minus的词法使用正则表达式完成词法分析器
- 理解flex和bison的用法以及掌握flex,bison联动构建语法分析器的方法
- 正确构建C-minus的语法树
实验设计
- 正则匹配
正确匹配C-minus的所有token,以及注释,空格,换行等特殊字符,利用flex与token的名字关联起来(作为返回值),例如:"if" {pos_start = pos_end-1; pos_end ++; pass_node(yytext); return IF;}
,同时维护pos_start
,pos_end
,lines
的值,表达式的顺序代表了匹配的优先级:
- 注释
\/\*(?:[^\*]|\*+[^\/\*])*\*+\/
- 空格,制表符
" "|\t
- 换行
\n
- 浮点数
[0-9]+\.[0-9]*|[0-9]*\.[0-9]+
- 整形
[0-9]+
- 变量名
[a-zA-Z_]*
- 其他token
"if" "else" "while" "return""void" "int" "float" "+" "-" "*" "/" "{" "}" "(" ")" "[" "]" ">=" ">" "==" "<=" "<" "!=" "=""," ";"
- token,type的数据类型
%union {
struct _syntax_tree_node *node;
}
%token <node> ERROR ADD SUB IF ELSE WHILE RETURN INT
%token <node>FLOAT VOID GE G E L LE NE MUX DIV DOT SEM
%token <node>LEFT1 LEFT2 LEFT3 RIGHT1 RIGHT2 RIGHT3 ASS
%token <node> INTEGER ID FLOATPOINT
%type <node> program
%type <node> type-specifier relop addop mulop
%type <node> declaration-list declaration var-declaration fun-declaration
%type <node> local-declarations compound-stmt statement-list statement expression-stmt
%type <node> iteration-stmt selection-stmt return-stmt expression simple-expression
%type <node> var additive-expression term factor integer float call
%type <node> params param-list param args arg-list
- 构建语法树
利用bison根据C-minus的语法构建语法树,其中每个type都是语法树的非叶子节点,token都是语法树的叶子节点。例如:
declaration-list: declaration-list declaration {$$=node("declaration-list",2,$1,$2);}
| declaration {$$=node("declaration-list",1,$1);};
param-list: param-list DOT param {$$=node("param-list",3,$1,$2,$3);}
|param {$$=node("param-list",1,$1);};
实验结果验证
- test测试样例
略
- 自行设计的样例
- 跨行注释和空行:
/*
This my testcase
*/
输出:不在parser和lexer的分析中出现
- 数组定义
int y[2]
输出
| | | | | >—+ var-declaration
| | | | | | >—+ type-specifier
| | | | | | | >— int
| | | | | | >— y
| | | | | | >— [
| | | | | | >— 2
| | | | | | >— ]
| | | | | | >— ;
- 循环
while (0) {y=0;}
输出
| >— while
| >— (
| >—+ expression
| | >—+ simple-expression
| | | >—+ additive-expression
| | | | >—+ term
| | | | | >—+ factor
| | | | | | >—+ integer
| | | | | | | >— 0
| >— )
| >—+ statement
| | >—+ compound-stmt
| | | >— {
| | | >—+ local-declarations
| | | | >— epsilon
| | | >—+ statement-list
| | | | >—+ statement-list
| | | | | >— epsilon
| | | | >—+ statement
| | | | | >—+ expression-stmt
| | | | | | >—+ expression
| | | | | | | >—+ var
| | | | | | | | >— y
| | | | | | | >— =
| | | | | | | >—+ expression
| | | | | | | | >—+ simple-expression
| | | | | | | | | >—+ additive-expression
| | | | | | | | | | >—+ term
| | | | | | | | | | | >—+ factor
| | | | | | | | | | | | >—+ integer
| | | | | | | | | | | | | >— 0
| | | | | | >— ;
| | | >— }