banner
cos

cos

愿热情永存,愿热爱不灭,愿生活无憾
github
tg_channel
bilibili

Compiler Principles Experiment Four LR(1) Parsing Method Program

The third experiment is skipped because the reverse Polish notation is relatively simple.

Source code repository: CompilePrincipleLearning/experiment_4 · yusixian/CompilePrincipleLearning (github.com)

In the demo folder~

I. Experiment Objectives#

  1. Master the basic principles of the LR(1) parsing method.
  2. Master the construction method of the LR(1) parsing table.
  3. Master the construction method of the LR(1) driver program.

II. Experiment Content and Requirements#

Construct an LR(1) parsing program to perform syntax analysis and determine whether the given symbol string is a sentence recognized by the grammar. Understand that the LR(K) parsing method strictly scans from left to right and uses a bottom-up syntax analysis method.

Debug the LR(1) parsing program based on a certain grammar to analyze any input symbol string. The main purpose of this experiment is to deepen the understanding of the LR(1) parsing method.

For the following grammar, analyze any input symbol string using the LR(1) parsing method:

(0) S' -> E

(1) E -> E + T

(2) E -> T

(3) T -> T * F

(4) T -> F

(5) F -> (E)

(6) F -> i

The output format is as follows:

(1) LR(1) parsing program, author: Name, Student ID, Class

(2) Input a symbol string ending with # (including + - * / ( ) i #): Enter the symbol string here

(3) The output process is as follows:

StepState StackSymbol StackRemaining Input StringAction
10#i+i*i#Shift

(4) The input symbol string is an illegal symbol string or a legal symbol string.

Note:

  1. The expression may use operators (+|*), delimiters (parentheses), character i, and the end symbol #;

  2. If an erroneous expression is encountered, an error message should be output (the more detailed the information, the better);

  3. For students with extra capacity, the test expressions are placed in a text file in advance, with one expression per line, separated by semicolons. The expected output results should also be written in another text file for comparison with the output;

  4. Other grammars may be used, but they must be based on the LR(1) parsing method.

III. Experiment Process#

1. Construct the DFA for Recognizing LR(1) Grammar and Prefix#

As shown: Open in a new tab, not blurry.

LR1_DFA.drawio.png

The action table and goto table are as follows:

image.png

2. Data Structures Used#

// ACTION table
// + * ( ) i #
string action[12][6];
// goto table
// a b #
int _goto[12][3];
string vt = "+*()i#";      // Terminal symbol table
string vn = "ETF";        // Non-terminal symbol table
string LR[6] = { "E->E+T", "E->T", "T->T*F", "T->F", "F->(E)", "F->i" };   // Store productions
stack<char> chars;  // Symbol stack
stack<int> state;   // State stack

3. Header File Declarations and Global Variable Definitions#

As follows.

#include <iostream>
#include <string>
#include <fstream>
#include <stack>
#include <vector>
using namespace std;
const string ExpFileName = "./exp.txt";
const string GotoFileName = "./goto.txt";
const string ActionFileName = "./action.txt";
const int Null = -1;
// ACTION table
// + * ( ) i #
string action[12][6];
// goto table
// a b #
int _goto[12][3];
string vt = "+*()i#";      // Terminal symbol table
string vn = "ETF";        // Non-terminal symbol table
string LR[6] = { "E->E+T", "E->T", "T->T*F", "T->F", "F->(E)", "F->i" };   // Store productions

4. Function Summary#

(1) Function Summary Table#

Function NameBrief Description
readFileReads a file and returns a dynamic array of strings, split by line
initInitialization function, initializes the goto table and action table
printActions / printGotosOutputs the goto table and action table
isTerminatorDetermines whether the current character c is a terminal symbol
findTerminatorReturns the index of the terminal symbol
findNonTerminatorReturns the index of the non-terminal symbol
s2stringConverts the stack to a string for easy output
analyzeLR1Analyzes the string exp using the LR1 parsing method, outputs its analysis steps
main Main program entry, calls the read file function to start analysis

(2) Function Call Relationships#

function4.drawio.png

(3) Flowchart#

main.drawio.png

5. Experiment Results#

Input#

action.txt file

N	N	s4	N	s5	N
s6	N	N	N	N	acc
r2	s7	N	r2	N	r2
r4	r4	N	r4	N	r4
N	N	s4	N	s5	N
r6	r6	N	r6	N	r6
N	N	s4	N	s5	N
N	N	s4	N	s5	N
s6	N	N	s11	N	N
r1	s7	N	r1	N	r1
r3	r3	N	r3	N	r3
r5	r5	N	r5	N	r5

goto.txt file

1	2	3
N	N	N
N	N	N
N	N	N
8	2	3
N	N	N
N	9	3
N	N	10
N	N	N
N	N	N
N	N	N
N	N	N

exp.txt file

i+(i*i)*(i+i)#
i*i+i*i#
i+i*i+i*(i+i*i)#
i+*(i)+i(i+i*i)#
i+i(i)#

Output#

image1.png

image2.png

image3.png

Complete Code#

/*
 * @Author: cos
 * @Date: 2022-04-30 14:20:51
 * @LastEditTime: 2022-05-01 02:34:12
 * @LastEditors: cos
 * @Description: Experiment 4 LR(1) Parsing Method
 * @FilePath: \CompileTheory\experiment_4\demo\main.cpp
 */
#include <iostream>
#include <string>
#include <fstream>
#include <stack>
#include <vector>
using namespace std;
const string ExpFileName = "./exp.txt";
const string GotoFileName = "./goto.txt";
const string ActionFileName = "./action.txt";
const int Null = -1;
// ACTION table
// + * ( ) i #
string action[12][6];
// goto table
// a b #
int _goto[12][3];
string vt = "+*()i#";      // Terminal symbol table
string vn = "ETF";        // Non-terminal symbol table
string LR[6] = { "E->E+T", "E->T", "T->T*F", "T->F", "F->(E)", "F->i" };   // Store productions
// Read file
vector<string> readFile(string fileName) {
    vector<string> res;
    try {
        ifstream fin;
        fin.open(fileName);
        string temp;
        while (getline(fin, temp))
            res.push_back(temp);
        return res;
    }
    catch (const exception& e) {
        cerr << e.what() << '\n';
        return res;
    }
}
void printActions() {
    cout << "-----------------ACTION Table------------------" << endl;
    cout << "+\t*\t(\t)\ti\t$" << endl;
    for (int i = 0; i < 12; ++i) {
        for (int j = 0; j < 6; ++j)
            cout << action[i][j] << "\t";
        cout << endl;
    }
}
void printGotos() {
    cout << "-----------------GOTO Table------------------" << endl;
    cout << "E\tT\tF" << endl;
    for (int i = 0; i < 12; ++i) {
        for (int j = 0; j < 3; ++j)
            cout << _goto[i][j] << "\t";
        cout << endl;
    }
}
void init() {
    vector<string> actions = readFile(ActionFileName);
    for (int i = 0; i < 12; ++i) {
        int cnt = 0;
        string row = actions[i];
        int len = actions[i].length();
        for (int j = 0; j < len; ++j) {
            string temp = "";
            while (j < len && row[j] != ' ' && row[j] != '\t') {
                temp += row[j];
                ++j;
            }
            while (j < len && (row[j] == ' ' || row[j] == '\t'))
                ++j;
            --j;
            action[i][cnt++] = temp;
        }

    }
    printActions();
    vector<string> gotos = readFile(GotoFileName);
    for (int i = 0; i < 12; ++i) {
        int cnt = 0;
        string row = gotos[i];
        int len = row.length();
        for (int j = 0; j < len; ++j) {
            string temp = "";
            while (j < len && row[j] != ' ' && row[j] != '\t') {
                temp += row[j];
                ++j;
            }
            while (j < len && (row[j] == ' ' || row[j] == '\t'))
                ++j;
            --j;
            _goto[i][cnt++] = (temp == "N") ? Null : stoi(temp);
        }
    }
    printGotos();
}
bool isTerminator(char c) {
    return vt.find(c) != string::npos;
}
int findTerminator(char c) { // Returns the index of the terminal symbol
    return vt.find(c);
}
int findNonTerminator(char c) { // Returns the index of the non-terminal symbol
    return vn.find(c);
}
// Converts the stack to a string for output
string s2string(stack<int> s) {
    string str = "";
    while(!s.empty()) {
        str += to_string(s.top()) + " ";
        s.pop();
    }
    return str;
}
// Outputs the remaining input string
void printRestInput(string exp, int start, int len) {
    for(int i = start; i < len; ++i) 
        cout << exp[i];
    cout << '\t';
}
void analyzeLR1(string exp) {  // Analyzes an expression
    int len = exp.length();
    stack<char> chars;  // Symbol stack
    stack<int> state;   // State stack
    state.push(0);  // Initial state is 0
    chars.push('#');  // Initial symbol is #
    string charsStr = "#";
    stack<int> copyState;
    copyState.push(0);
    int cnt = 0;    // Sequence number
    int idx = 0;  // Current input pointer
    cout << "Sequence\tState Stack\tSymbol Stack\tInput String\tDescription" << endl;
    cout << cnt++ << '\t' << s2string(copyState) << '\t' << charsStr << '\t' << exp << '\t' << " Initial state " << endl;
    while(1) {
        int nowState = state.top();
        char nowChar = exp[idx];    // Current input character
        int isT = findTerminator(nowChar);
        if(isT == Null) {   // Non-terminal
            cout << "Error!" << " Illegal character encountered, program exiting" <<endl;
            return;
        }
        string actionStr = action[nowState][isT];
        if(actionStr == "acc") {
            cout << cnt++ << '\t' << s2string(copyState) << '\t' << charsStr << '\t' << exp << '\t' << " accept Accepted! " << endl;
            return;
        } else if(actionStr == "N") {
            cout << cnt++ << '\t' << s2string(copyState) << '\t' << charsStr << '\t' << exp << '\t' << "Error! Program exited abnormally" << endl;
            return;
        } else if(actionStr[0] == 'r') {   // Reduction
            int num = stoi(actionStr.substr(1));    // Select which production to reduce
            int len = LR[num-1].length()-3;
            while(len--) {
                chars.pop();        // Pop from stack, reduce
                state.pop();
                charsStr = charsStr.substr(0, charsStr.length()-1);
                copyState.pop();   // For output convenience
            }
            chars.push(LR[num-1][0]);   // Push the left part of the production onto the symbol stack
            charsStr += LR[num-1][0];

            int nowState = state.top();
            int gidx = findNonTerminator(LR[num-1][0]);
            int newState = _goto[nowState][gidx];
            state.push(newState);
            copyState.push(newState);

            cout << cnt++ << '\t' << s2string(copyState) << '\t' << charsStr  << '\t';
            printRestInput(exp, idx, len);
            cout << '\t' << " Reduce " << LR[num-1] << endl;
        } else if(actionStr[0] == 's') {    // Shift
            int newState =  stoi(actionStr.substr(1));
            state.push(newState);
            copyState.push(newState);

            chars.push(nowChar);
            charsStr += nowChar;
            ++idx;  // Move input pointer forward

            cout << cnt++ << '\t' << s2string(copyState) << '\t' << charsStr << '\t';
            printRestInput(exp, idx, len);
            cout << '\t' << actionStr << " Shift " << endl;
        } else {
            cout << "Error!" << " Program exited abnormally" <<endl;
            return;
        }
    }
}
int main() {
    cout << "LR(1) Parsing Program, Author: xxx xxxxxxxx xxxx Class" << endl;
    cout << "Note: This program can only analyze expressions composed of 'i', '+', '*', '/', '(', ')' ending with '#', one expression per line" << endl;
    cout << "The file name read is: " << ExpFileName << endl; 
    init();
    vector<string> exps = readFile(ExpFileName);
    int len = exps.size();
    for (int i = 0; i < len; i++) {
        string exp = exps[i];
        cout << "\n------------------Expression to Analyze " << i+1 << ":" << exp << "--------------------" << endl;
        bool flag = true;
        for (int j = 0; j < exp.length(); j++) {
            if (!isTerminator(exp[j])) {
                cout << "The string input in line " <<   i+1 << " is illegal, please re-enter" << endl;
                flag = false;
                break;
            }
        }
        if (flag) {
            cout << "Expression "  << i+1 << ":" << exp << " analysis begins" << endl;
            analyzeLR1(exp);
        }
    }
    return 0;
}
Loading...
Ownership of this post data is guaranteed by blockchain and smart contracts to the creator alone.