Having trouble with algorithm to build a function tree

Jamin2112 · May 6, 2015

So I've been writing an algorithm to build a function tree and I believe I'm aaaaallllllmost there. The intent is to take a math function like

Code:

x^2+x+1*5

and build it into

Code:

            /        +         \
         / +  \               /  \
      / ^ \     x           1 *  5
    x       2

My idea to parse the function is to think of it being like

Code:

[Exp_0][Op_1][Exp_1][Op_2][Exp_2]...[Op_n][Exp_n]

where [Exp_i] represents an expression (x, a number, or a function in parentheses) and [Op_i] represents an operator. In the above function

Code:

Exp_0 = 'x'
Op_1 = '^'
Exp_1 = 2
Op_2 = '+'
Exp_2 = 'x'
Op_3 = '+'
Exp_3 = 1
Op_4 = '*'
Exp_4 = 5

I know how to write the algorithm if the operators are in order of ascending or descending precedence. For instance, something like

x^2*2+1

is easy to build into a tree because as you grab the expressions from left to right you simply take the previous tree and make it be the left-hand side of a new tree and make the right-hand side of the tree be the current expression. The building of that tree (skipping a few steps) would look like

Code:

 / ^ \
x     2

,

Code:

    /      *    \
 / ^ \          2
x     2

,

Code:

                    /   +   \
                 /             1
          /     *    \
       / ^ \          2
      x     2

The algorithm is:

Code:

Get Exp_0

If there's no Op_1, return Exp_0; 
Else, 

Make a tree T like

     /                  [Op_1]             \
[Exp_0]                                  [Exp_1]

for (i = 2; i <= n; ++i)
{
      temp = T;
      T = new Tree; 
      T.leftHandSide = temp;
      T.oper = [Op_i];
      T.rightHandSide = [Exp_i];
}

Hopefully that pseudo-code is legible to you.

The algorithm for making a tree when the precedence of the operators is reversed, e.g.

1+2*x^2

is slightly more complicated, but I know how to write. I can also write an algorithm that combines the 2 algorithms I mentioned.

The problem is that I can't think of a general algorithm that looks at the precedence of each operator found and then rebuilds the tree accordingly. It seems like it would be extremely complex, and I've gotten close to making it, but maybe I'm going about things completely wrong.

Can someone give me a hint to push me along?

jim mcnamara · May 6, 2015

Compiler developers face exactly the same problem - decomposing strings into atomic operations. Have a Google for "smallc.c" the name may vary slightly. It is a C interpreter written in C, functions the way the BASIC language originally did. It uses a descent parser - which is what you need. Somewhere on my shelves is a book on C that has the code. If I bump into it I will post the reference.

At any rate check out a descent parser.

jim mcnamara · May 6, 2015

Found what you want: 'C: The complete reference Fourth Edition' Herbert Schildt, 2000, Obourne
You want Chapter 29.

I DO NOT fully recommend the book as a C reference - the edition I have has more than several errors.

Jamin2112 · May 6, 2015

jim mcnamara said:

Found what you want: 'C: The complete reference Fourth Edition' Herbert Schildt, 2000, Obourne
You want Chapter 29.

I DO NOT fully recommend the book as a C reference - the edition I have has more than several errors.

Awesome. I'll take a look at it.

Carno Raar · May 7, 2015

Pseudocode for this would look like this.

Tokenise the input. I would use regex.

Code:

20 * (32 + 4) + 1 --> integer mulop bra integer addop integer ket addop integer

Now the grammar:

Code:

<expression> =
        <term> <addop> <term>
    or  <term>
 
<term> =
        <factor> <mulop> <factor>
    or  <factor>

<factor> =
        <integer>
    or  <bra> <expression> <ket>

Now recursive descent parsing

Code:

def parse (tokens)
    value = expression ()
    return value
 
def expression ()
    value = term ()
    while 1
        if tokens[index].type == ADDOP
            index += 1
            value += term ()
        else if tokens[index].type == SUBOP
            index += 1
            value -= term ()
        else break
    return value

def term ()
    value = factor ()
    while 1
        if tokens[index].type == MULOP
            index += 1
            value *= factor ()
        else if tokens[index].type == DIVOP
            index += 1
            value /= factor ()  
        else break
    return value

def factor ()
    if tokens[index].type == INTEGER
        return tokens[index++].value
    else if tokens[index].type == BRA
        index += 1  # eat bracket
        result = expression()
        if tokens[index].type != KET
            throw SyntaxError
        index += 1  # eat bracket
    return result
 
def main ()
    tokens = lex ("  2 + 3 * (2+2)  ")
    answer = parse (tokens)
    print (answer)

Read Compilers by Aho (the Dragon Book) or just google how to write a parser. I did this quick and there might be an error or two.

Edited. Fixed expression, term.

Having trouble with algorithm to build a function tree

Thread 'Learning Assembly and computer architecture for x86'

Thread 'Microsoft Technical Interview question'

Similar threads

Hot Threads

Hackathon ideas?

Touch-typing for programmers

How to calculate Tension for a series of connected points?

Trying To Debug A Python File

Python Complaining About Python

Recent Insights

Insights Quantum Entanglement is a Kinematic Fact, not a Dynamical Effect

Insights What Exactly is Dirac’s Delta Function? - Insight

Insights Relativator (Circular Slide-Rule): Simulated with Desmos - Insight

Insights Fixing Things Which Can Go Wrong With Complex Numbers

Insights Fermat's Last Theorem

Insights Why Vector Spaces Explain The World: A Historical Perspective