I am looking into implementing common subexpression elimination (CSE) for expression graphs corresponding to large mathematical expressions (millions of nodes).

What algorithms are suitable for performing this? I was searching the internet for an easy-to-implement algorithm but I could not find anything. If possible the algorithm should have a linear complexity in the number of nodes of the complete expression graph.

1

Best Answer


These are expressions with no side effects? Then the easiest thing to do is to hash the trees for each sub-expression into buckets to determine candidates for sub-expression elimination.This is a special case of CSE where all the expressions are in a single (huge) "basic block". (I use this idea as the basis for detecting duplicate code.)

If the expressions have order and side effects, you may want to consider Value Numbering.