Building a Four-Function Calculator with ANTLR

2020-02-22 13:54

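While converting D's grammar to EBNF, I discovered that D also allows Chinese variable names, through what it calls UniversalAlpha. I looked through the dmd front end source code, and the function that checks whether a character is a UniversalAlpha looks like this: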

int isUniAlpha(unsigned u)
{
    static unsigned short table[][2] =
    {
        { 0x00AA, 0x00AA },
        { 0x00B5, 0x00B5 },
        { 0x00B7, 0x00B7 },
        ......
        ......
        ......
        { 0x3105, 0x312C },
        { 0x4E00, 0x9FA5 },
        { 0xAC00, 0xD7A3 },
    };
    if (u > 0xD7A3)
        goto Lisnot;

    // Binary search
    int mid;
    int low;
    int high;

    low = 0;
    high = sizeof(table) / sizeof(table[0]) - 1;
    while (low <= high)
    {
        mid = (low + high) >> 1;
        if (u < table[mid][0])
            high = mid - 1;
        else if (u > table[mid][1])
            low = mid + 1;
        else
            goto Lis;
    }

Lisnot:
    return 0;

Lis:
    return 1;
}

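However, I have no idea how to make Grammatica call a function like this during analysis. In theory a regular expression could express the same logic, but turning more than 200 lines of range data into a regular expression would not only run slowly; the conversion work alone would be unacceptable.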
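As a side note, the conversion step itself could probably be scripted instead of done by hand. The following is only a sketch in C under two assumptions: that the target regex dialect accepts \uXXXX escapes inside a character class (whether Grammatica's does is not confirmed here), and that the table holds only the six ranges visible in the excerpt above. The names sample_table and build_char_class are made up for illustration.

```c
#include <stdio.h>

/* Sample ranges only: the six entries visible in the dmd excerpt.
   The real table has many more rows, which stay elided here. */
static const unsigned short sample_table[][2] = {
    { 0x00AA, 0x00AA },
    { 0x00B5, 0x00B5 },
    { 0x00B7, 0x00B7 },
    { 0x3105, 0x312C },
    { 0x4E00, 0x9FA5 },
    { 0xAC00, 0xD7A3 },
};

/* Hypothetical helper: writes a character class such as
   [\u00AA\u3105-\u312C] into out. Returns the number of characters
   written, or -1 if the buffer is too small. */
int build_char_class(const unsigned short (*table)[2], size_t rows,
                     char *out, size_t cap)
{
    size_t len = 0;
    int n;

    if (cap < 3)
        return -1;
    out[len++] = '[';
    for (size_t i = 0; i < rows; i++) {
        if (table[i][0] == table[i][1])   /* single code point */
            n = snprintf(out + len, cap - len, "\\u%04X",
                         (unsigned)table[i][0]);
        else                              /* inclusive range */
            n = snprintf(out + len, cap - len, "\\u%04X-\\u%04X",
                         (unsigned)table[i][0], (unsigned)table[i][1]);
        if (n < 0 || (size_t)n >= cap - len)
            return -1;                    /* output was truncated */
        len += (size_t)n;
    }
    if (len + 2 > cap)
        return -1;
    out[len++] = ']';
    out[len] = '\0';
    return (int)len;
}
```

Called as build_char_class(sample_table, 6, buf, sizeof buf), this yields [\u00AA\u00B5\u00B7\u3105-\u312C\u4E00-\u9FA5\uAC00-\uD7A3]. How fast a regex engine matches against such a long class is a separate question that the sketch does not address.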

Later, while converting D's string, integer, and float definitions, I again ran into the serious limitations of Grammatica's regex-only approach, and finally decided to abandon Grammatica.

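The best approach would actually be to parse with the dmd front end's own source code. But that is nearly 1.6M of code with no documentation at all, and by the time I had read through it, it would be far too late.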

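The reason I had been reluctant to use ANTLR is that its grammar files mix grammar rules with embedded code, which looks messy. But ANTLR's power and its active community are genuinely attractive, so I decided to write the D parser with ANTLR. (I also came across another generator called Coco/R, which is said to be cleaner than ANTLR, but its grammar capabilities reportedly fall short of ANTLR's, so I am not considering it for now.)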

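As before, four-function arithmetic makes a good example. In the "five-minute tutorial" on the ANTLR home page I found a grammar file for arithmetic, and without embedded code it turns out to be quite readable. Since I am using ANTLR anyway, I wanted to try its ability to build an abstract syntax tree automatically, so I modified that grammar file into this: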

grammar SimpleCalc;

options {
    language=CSharp;
    output=AST;
    ASTLabelType=CommonTree;
}

tokens {
    PLUS     = '+' ;
    MINUS    = '-' ;
    MULT     = '*' ;
    DIV      = '/' ;
}

@members {
}

/*------------------------------------------------------------------
 * PARSER RULES
 *------------------------------------------------------------------*/

expr    : term ( ( PLUS^ | MINUS^ )  term )* ;
term    : factor ( ( MULT^ | DIV^ ) factor )* ;
factor  : NUMBER | '(' expr ')' -> expr ;

/*------------------------------------------------------------------
 * LEXER RULES
 *------------------------------------------------------------------*/

NUMBER     : (DIGIT)+ ;
WHITESPACE : ( '\t' | ' ' | '\r' | '\n'| '\u000C' )+     { $channel = HIDDEN; } ;
fragment DIGIT    : '0'..'9' ;

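The changes are: 1. it now outputs an AST (output=AST); 2. operators become subtree roots (the ^ suffix); 3. parentheses are supported, with the rewrite -> expr keeping the parenthesis tokens out of the tree.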
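To make the tree shape concrete, here is a hand-written recursive-descent sketch in C that mirrors the three parser rules. It is not the code ANTLR generates, it has no error handling, and every name in it (Node, parse_expr, tree_of, and so on) is made up for illustration. It prints the tree in a LISP-style prefix form, with each operator as the root of its subtree.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical node type: a leaf holds a number, an inner node holds an
   operator with exactly two children. */
typedef struct Node {
    char op;               /* '+', '-', '*', '/' or 0 for a number leaf */
    int value;             /* used only when op == 0 */
    struct Node *left, *right;
} Node;

static Node pool[64];      /* fixed pool, no bounds check: sketch only */
static int used;
static const char *src;    /* cursor into the input text */

static Node *new_node(char op, int value, Node *l, Node *r)
{
    Node *n = &pool[used++];
    n->op = op; n->value = value; n->left = l; n->right = r;
    return n;
}

static void skip_ws(void) { while (isspace((unsigned char)*src)) src++; }

static Node *parse_expr(void);

/* factor : NUMBER | '(' expr ')' -> expr ;  parentheses leave no node */
static Node *parse_factor(void)
{
    skip_ws();
    if (*src == '(') {
        src++;
        Node *inner = parse_expr();
        skip_ws();
        if (*src == ')') src++;
        return inner;
    }
    int v = 0;
    while (isdigit((unsigned char)*src)) v = v * 10 + (*src++ - '0');
    return new_node(0, v, NULL, NULL);
}

/* term : factor ( ( MULT^ | DIV^ ) factor )* ; */
static Node *parse_term(void)
{
    Node *left = parse_factor();
    for (skip_ws(); *src == '*' || *src == '/'; skip_ws()) {
        char op = *src++;
        left = new_node(op, 0, left, parse_factor()); /* operator is new root */
    }
    return left;
}

/* expr : term ( ( PLUS^ | MINUS^ ) term )* ; */
static Node *parse_expr(void)
{
    Node *left = parse_term();
    for (skip_ws(); *src == '+' || *src == '-'; skip_ws()) {
        char op = *src++;
        left = new_node(op, 0, left, parse_term());   /* operator is new root */
    }
    return left;
}

/* Appends the tree to out in the prefix form (op left right). */
static void print_tree(const Node *n, char *out)
{
    char buf[32];
    if (n->op == 0) {
        snprintf(buf, sizeof buf, "%d", n->value);
        strcat(out, buf);
        return;
    }
    snprintf(buf, sizeof buf, "(%c ", n->op);
    strcat(out, buf);
    print_tree(n->left, out);
    strcat(out, " ");
    print_tree(n->right, out);
    strcat(out, ")");
}

/* Parses text and writes its tree into out (caller supplies enough room). */
void tree_of(const char *text, char *out)
{
    src = text;
    used = 0;
    out[0] = '\0';
    print_tree(parse_expr(), out);
}
```

For the input 5-(3-2)+6*7, tree_of produces (+ (- 5 (- 3 2)) (* 6 7)): the last + applied becomes the overall root, and the parentheses affect only the nesting.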

After generating the files, I added code to the Program file that creates the lexer and parser, plus a depth-first tree visitor and the arithmetic itself:

using System;
using System.Collections.Generic;
using System.Text;
using Antlr.Runtime;
using Antlr.Runtime.Tree;

namespace Expr
{
    class Program
    {
        private static Stack<int> numbers = new Stack<int>();

        static void Main(string[] args)
        {
            SimpleCalcLexer lex = new SimpleCalcLexer(new ANTLRFileStream(args[0]));
            CommonTokenStream tokens = new CommonTokenStream(lex);
            SimpleCalcParser parser = new SimpleCalcParser(tokens);

            try
            {
                CommonTree ct = (CommonTree)parser.expr().Tree;
                VisitTree(ct);
                Console.WriteLine("The result is: {0}", numbers.Pop());
                Console.Read();
            }
            catch (RecognitionException e)
            {
                Console.Error.WriteLine(e.StackTrace);
            }
        }

        // Post-order (depth-first) traversal: visit the children first, then the node itself.
        static void VisitTree(ITree it)
        {
            for (int i = 0; i < it.ChildCount; i++)
            {
                ITree c = it.GetChild(i);
                VisitTree(c);
            }
            switch (it.Type)
            {
                case SimpleCalcLexer.PLUS:
                case SimpleCalcLexer.MINUS:
                case SimpleCalcLexer.MULT:
                case SimpleCalcLexer.DIV:
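                    // C# evaluates arguments left to right: the first Pop() yields the
                    // right operand (v2), the second Pop() yields the left operand (v1).
                    Operation(it.Text, numbers.Pop(), numbers.Pop());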
                    break;
                case SimpleCalcLexer.NUMBER:
                    numbers.Push(int.Parse(it.Text));
                    break;
            }
        }

        static void Operation(string opCode, int v2, int v1)
        {
            int result;
            switch (opCode)
            {
                case "+":
                    result = v1 + v2;
                    break;
                case "-":
                    result = v1 - v2;
                    break;
                case "*":
                    result = v1 * v2;
                    break;
                case "/":
                    result = v1 / v2;
                    break;
                default:
                    throw new ArgumentException("Unknown operator: " + opCode);
            }
            Console.WriteLine("{1} {0} {2} = {3}", opCode, v1, v2, result);
            numbers.Push(result);
        }
    }
}

Besides computing the result, the code above also prints every calculation step. Put "5-(3-2)+6*7" in the input file, compile and run the program, and you get:

3 - 2 = 1
5 - 1 = 4
6 * 7 = 42
4 + 42 = 46
The result is: 46
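The order of these steps follows directly from the post-order walk combined with the number stack. As a rough illustration, here is the same evaluation sketched in C over a hand-built tree for the same expression. It stands in for the C# program above rather than reproducing it: the Tree type and the names visit and evaluate_sample are assumptions made for this sketch, and ANTLR's CommonTree is not involved.

```c
#include <stdio.h>

/* Hypothetical tree node: op is 0 for a number leaf. */
typedef struct Tree {
    char op;
    int value;
    const struct Tree *left, *right;
} Tree;

static int stack[32];      /* no overflow check: sketch only */
static int top;            /* number of values currently on the stack */

/* Post-order walk: children first, then the node itself. */
static void visit(const Tree *t)
{
    if (t->op == 0) {
        stack[top++] = t->value;
        return;
    }
    visit(t->left);
    visit(t->right);
    int v2 = stack[--top]; /* right operand was pushed last, so it pops first */
    int v1 = stack[--top];
    int r = t->op == '+' ? v1 + v2
          : t->op == '-' ? v1 - v2
          : t->op == '*' ? v1 * v2
          :                v1 / v2;
    printf("%d %c %d = %d\n", v1, t->op, v2, r);
    stack[top++] = r;
}

/* Evaluates the tree for 5-(3-2)+6*7, built by hand. */
int evaluate_sample(void)
{
    static const Tree n5 = {0, 5, NULL, NULL}, n3 = {0, 3, NULL, NULL},
                      n2 = {0, 2, NULL, NULL}, n6 = {0, 6, NULL, NULL},
                      n7 = {0, 7, NULL, NULL};
    static const Tree inner = {'-', 0, &n3, &n2};
    static const Tree sub   = {'-', 0, &n5, &inner};
    static const Tree mul   = {'*', 0, &n6, &n7};
    static const Tree root  = {'+', 0, &sub, &mul};
    top = 0;
    visit(&root);
    return stack[--top];
}
```

evaluate_sample() returns 46 and prints the four intermediate steps in the same order as the listing above, because the innermost subtraction is the first operator whose two operands are both already on the stack.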

Having ANTLR build the AST for you is really comfortable, and there are plenty of examples around. Right, I will be using it from now on.

iteye_5407
