Reverse Polish notation

From Wikipedia, the free encyclopedia

Revision as of 23:22, 13 January 2009

Reverse Polish notation (or just RPN) is a mathematical notation wherein every operator follows all of its operands. It is named by analogy with the related Polish notation, a prefix notation introduced in 1920 by the Polish mathematician Jan Łukasiewicz. It is also known as postfix notation and is parenthesis-free.

The Reverse Polish scheme was proposed by F. L. Bauer and E. W. Dijkstra in the early 1960s to reduce computer memory access and utilize the stack to evaluate expressions. The notation and algorithms for this scheme were enriched by Australian philosopher and computer scientist Charles Hamblin in the mid-1960s.[1][2]

During the 1960s and 1970s, RPN had some currency even among the general public, as it was widely used in desktop calculators of the time.

Most of what follows is about binary operators. A unary operator for which the Reverse Polish notation is the general convention is the factorial.

Explanation

In Reverse Polish notation the operators follow their operands; for instance, to add three and four, one would write "3 4 +" rather than "3 + 4". If there are multiple operations, the operator is given immediately after its second operand; so the expression written "3 − 4 + 5" in conventional infix notation would be written "3 4 − 5 +" in RPN: first subtract 4 from 3, then add 5 to that. An advantage of RPN is that it obviates the need for parentheses that are required by infix. While "3 − 4 * 5" can also be written "3 − (4 * 5)", that means something quite different from "(3 − 4) * 5". In postfix, the former would be written "3 4 5 * −", which unambiguously means "3 (4 5 *) −".

Interpreters of Reverse Polish notation are often stack-based; that is, operands are pushed onto a stack, and when an operation is performed, its operands are popped from the stack and its result is pushed back on. Stacks, and therefore RPN, have the advantage of being easy to implement and very fast.

Note that, despite the name, reverse Polish notation is not exactly the reverse of Polish notation, as the operands of non-commutative operations are still written in the conventional order (e.g. "/ 6 3" in Polish notation corresponds to "6 3 /" in reverse Polish, both evaluating to 2, whereas "3 6 /" would evaluate to 0.5). Numbers are also written with the digits in the conventional order.

Practical implications

  • Calculations occur as soon as an operator is specified. Thus, expressions are not entered wholesale from left to right but calculated one piece at a time, most efficiently from the centre outwards. This results in fewer user errors when performing complex calculations.[citation needed]
  • The automatic stack permits the automatic storage of intermediate results for use later: this key feature is what permits RPN calculators easily to evaluate expressions of arbitrary complexity: they do not have limits on the complexity of expression they can calculate, unlike typical scientific calculators.
  • Brackets and parentheses are unnecessary: the user simply performs calculations in the order that is required, letting the automatic stack store intermediate results on the fly for later use. Likewise, there is no requirement for the precedence rules required in infix notation.
  • In RPN calculators, no equals key is required to force computation to occur.
  • RPN calculators do, however, require an enter key to separate two adjacent numeric operands.
  • The machine state is always a stack of values awaiting operation; it is impossible to enter an operator onto the stack. This makes use conceptually easy compared to more complex entry methods.
  • Educationally, RPN calculators have the advantage that the user must understand the expression being calculated: it is not possible to simply copy the expression from paper into the machine and read off the answer without understanding. One must calculate from the middle of the expression, which makes life easier but only if the user understands what they are doing.
  • Reverse Polish notation also reflects the way calculations are done on pen and paper. One first writes the numbers down and then performs the calculation. Thus the concept is easy to teach.
  • The widespread use of electronic calculators with infix notation in educational systems can make RPN impractical at times, since it does not conform to standard teaching methods. On the other hand, because RPN has no use for parentheses, it is faster and easier to calculate expressions, particularly the more complex ones, than with an infix calculator, owing to fewer keystrokes and greater visibility of intermediate results. It is also easy for a computer to convert infix notation to postfix, most notably via Dijkstra's shunting yard algorithm (see "Converting from infix notation").
  • Users must know the size of the stack, since practical implementations of RPN use different sizes for the stack. For example, the algebraic expression 1-1.001^(-6.2-2^(3*pi)), if performed with a stack size of 4 and executed from left to right, would exhaust the stack. The answer might be given as an imaginary number instead of approximately 0.5 as a real number which, to the novice user, could be inexplicably wrong (assuming a novice would notice).
  • When writing RPN on paper (something which even some users of RPN may not do) adjacent numbers need a separator between them. Using a space is not good practice because it requires clear handwriting to prevent confusion. For example, 12 34 + could look like 123 4 + but in a monospace font it is quite clear, while something like 12, 34 + is straightforward. The comma becomes a virtual Space key.
  • RPN is very easy to write and makes practical sense when it is adopted. The "learning" process to adopt RPN in writing usually comes later than adopting RPN on a calculator so that one may communicate more easily with non-RPN users.
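
The point about stack size can be checked with a small sketch that counts how many values an RPN expression holds on the stack at once. The function name `max_stack_depth`, the operator set, and the left-to-right transcription of the expression are assumptions made for illustration; ASCII `-` stands in for the minus sign, and the sketch measures stack depth only, without evaluating anything.

```python
# Assumed set of binary operators for this sketch.
BINARY_OPERATORS = {"+", "-", "*", "/", "^"}


def max_stack_depth(rpn_expression):
    """Return the largest number of values held on the stack at once."""
    depth = 0
    deepest = 0
    for token in rpn_expression.split():
        if token in BINARY_OPERATORS:
            if depth < 2:
                raise ValueError("not enough operands for " + token)
            depth -= 1  # two operands popped, one result pushed
        else:
            depth += 1  # any other token is treated as a value
        deepest = max(deepest, depth)
    return deepest


# 1-1.001^(-6.2-2^(3*pi)) transcribed strictly from left to right.
left_to_right = "1 1.001 -6.2 2 3 pi * ^ - ^ -"
print(max_stack_depth(left_to_right))
```

Under these assumptions the left-to-right transcription needs more than four stack levels, so a four-level stack cannot hold all of its pending operands.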

The postfix algorithm

The algorithm for evaluating any postfix expression is fairly straightforward:

  • While there are input tokens left
    • Read the next token from input.
    • If the token is a value
      • Push it onto the stack.
    • Otherwise, the token is an operator.
      • It is known a priori that the operator takes n arguments.
      • If there are fewer than n values on the stack
        • (Error) The user has not input sufficient values in the expression.
      • Else, Pop the top n values from the stack.
      • Evaluate the operator, with the values as arguments.
      • Push the returned results, if any, back onto the stack.
  • If there is only one value in the stack
    • That value is the result of the calculation.
  • If there are more values in the stack
    • (Error) The user input too many values.
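
The steps above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the `OPERATORS` table, the function name `evaluate_postfix`, and the choice to treat every non-operator token as a floating-point number are assumptions.

```python
import operator

# Assumed operator table: symbol -> (number of arguments, function).
OPERATORS = {
    "+": (2, operator.add),
    "-": (2, operator.sub),
    "*": (2, operator.mul),
    "/": (2, operator.truediv),
}


def evaluate_postfix(expression):
    """Evaluate a whitespace-separated postfix expression."""
    stack = []
    for token in expression.split():
        if token in OPERATORS:
            n, func = OPERATORS[token]
            if len(stack) < n:
                raise ValueError("insufficient values for operator " + repr(token))
            args = stack[-n:]      # top n values, in the order they were pushed
            del stack[-n:]
            stack.append(func(*args))
        else:
            stack.append(float(token))  # raises ValueError for a non-number
    if len(stack) == 1:
        return stack[0]
    raise ValueError("too many values" if stack else "empty expression")


print(evaluate_postfix("3 4 5 * -"))
```

Unlike a calculator session, where the stack persists between entries, this sketch treats each string as one complete expression, so leftover values are reported as an error.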

Example

The infix expression "5 + ((1 + 2) * 4) − 3" can be written down like this in RPN:

5 1 2 + 4 * + 3 −

The expression is evaluated left-to-right, with the inputs interpreted as shown in the following table (the Stack is the list of values the algorithm is "keeping track of" after the Operation given in the middle column has taken place):

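| Input | Operation    | Stack   | Comment                                      |
|-------|--------------|---------|----------------------------------------------|
| 5     | Push operand | 5       |                                              |
| 1     | Push operand | 5, 1    |                                              |
| 2     | Push operand | 5, 1, 2 |                                              |
| +     | Add          | 5, 3    | Pop two values (1, 2) and push result (3)    |
| 4     | Push operand | 5, 3, 4 |                                              |
| *     | Multiply     | 5, 12   | Pop two values (3, 4) and push result (12)   |
| +     | Add          | 17      | Pop two values (5, 12) and push result (17)  |
| 3     | Push operand | 17, 3   |                                              |
| −     | Subtract     | 14      | Pop two values (17, 3) and push result (14)  |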

When a computation is finished, its result remains as the top (and only) value in the stack; in this case, 14.
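
The rows of such a table can be generated mechanically. The sketch below records the stack after each token. The helper name `trace_rpn` and the `BINARY` table are hypothetical, operands are assumed to be integers, and ASCII `-` stands in for the minus sign.

```python
import operator

# Assumed table: symbol -> (operation name, function).
BINARY = {
    "+": ("Add", operator.add),
    "-": ("Subtract", operator.sub),
    "*": ("Multiply", operator.mul),
}


def trace_rpn(expression):
    """Return (token, operation, stack snapshot) for every input token."""
    stack = []
    rows = []
    for token in expression.split():
        if token in BINARY:
            name, func = BINARY[token]
            right = stack.pop()   # the second operand is on top
            left = stack.pop()
            stack.append(func(left, right))
        else:
            name = "Push operand"
            stack.append(int(token))
        rows.append((token, name, list(stack)))
    return rows


for token, name, snapshot in trace_rpn("5 1 2 + 4 * + 3 -"):
    print(token, name, snapshot, sep="\t")
```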

The above example could be rewritten by following the "chain calculation" method described by HP for their series of RPN calculators:

"As was demonstrated in the Algebraic mode, it is usually easier (fewer keystrokes) in working a problem like this to begin with the arithmetic operations inside the parentheses first."[1]

1 2 + 4 * 5 + 3 −

Converting from infix notation

Edsger Dijkstra invented the "shunting yard" algorithm to convert infix expressions to postfix (RPN), so named because its operation resembles that of a railroad shunting yard.
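
A compact sketch of the shunting yard idea is given below. It assumes the input is already split into tokens, handles only binary operators and parentheses, and uses a precedence table and a right-associative `^` chosen for illustration; the names `PRECEDENCE`, `RIGHT_ASSOCIATIVE`, and `infix_to_postfix` are not taken from any particular source.

```python
# Assumed precedence levels and associativity for this sketch.
PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2, "^": 3}
RIGHT_ASSOCIATIVE = {"^"}


def infix_to_postfix(tokens):
    """Convert a list of infix tokens to a list of postfix tokens."""
    output = []
    operators = []  # the operator stack (the "siding" of the shunting yard)
    for token in tokens:
        if token in PRECEDENCE:
            # Move operators that bind at least as tightly to the output first.
            while operators and operators[-1] != "(":
                top = operators[-1]
                higher = PRECEDENCE[top] > PRECEDENCE[token]
                equal_left = (PRECEDENCE[top] == PRECEDENCE[token]
                              and token not in RIGHT_ASSOCIATIVE)
                if not (higher or equal_left):
                    break
                output.append(operators.pop())
            operators.append(token)
        elif token == "(":
            operators.append(token)
        elif token == ")":
            while operators and operators[-1] != "(":
                output.append(operators.pop())
            if not operators:
                raise ValueError("mismatched parentheses")
            operators.pop()  # discard the matching "("
        else:
            output.append(token)  # operands go straight to the output
    while operators:
        op = operators.pop()
        if op == "(":
            raise ValueError("mismatched parentheses")
        output.append(op)
    return output


print(" ".join(infix_to_postfix("5 + ( ( 1 + 2 ) * 4 ) - 3".split())))
```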

There are other ways of producing postfix expressions from infix notation. Most Operator-precedence parsers can be modified to produce postfix expressions; in particular, once an abstract syntax tree has been constructed, the corresponding postfix expression is given by a simple post-order traversal of that tree.
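
The post-order traversal can be illustrated with a small tree type. The `BinaryOp` class and the `to_postfix` function are hypothetical names introduced for this sketch, and leaves are assumed to be plain numbers.

```python
from dataclasses import dataclass


@dataclass
class BinaryOp:
    """Interior node of an assumed abstract syntax tree."""
    operator: str
    left: object   # BinaryOp or a number
    right: object  # BinaryOp or a number


def to_postfix(node):
    """Post-order traversal: left subtree, right subtree, then the operator."""
    if isinstance(node, BinaryOp):
        return to_postfix(node.left) + to_postfix(node.right) + [node.operator]
    return [str(node)]


# Tree for the infix expression 5 + ((1 + 2) * 4) - 3.
tree = BinaryOp("-", BinaryOp("+", 5, BinaryOp("*", BinaryOp("+", 1, 2), 4)), 3)
print(" ".join(to_postfix(tree)))
```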

Implementations

The first computers to implement architectures enabling RPN were the English Electric Company's KDF9 machine, which was announced in 1960 and delivered (i.e. made available commercially) in 1963, and the American Burroughs B5000, announced in 1961 and also delivered in 1963. One of the designers of the B5000, Robert S. Barton, later wrote that he developed RPN independently of Hamblin, sometime in 1958 while reading a textbook on symbolic logic, and before he was aware of Hamblin's work.

Friden introduced RPN to the desktop calculator market with the EC-130 in June 1963. Hewlett-Packard (HP) engineers designed the 9100A Desktop Calculator in 1968 with RPN. This calculator popularized RPN among the scientific and engineering communities, even though early advertisements for the 9100A failed to mention RPN. The HP-35, the world's first handheld scientific calculator, introduced in 1972, also used RPN. HP used RPN on every handheld calculator it sold, whether scientific, financial, or programmable, until it introduced an adding machine-style calculator, the HP-10A. In the early 1980s HP introduced a line of LCD calculators that used RPN, such as the HP-10C, HP-11C, HP-15C, HP-16C, and the famous financial calculator, the HP-12C. When Hewlett-Packard introduced a later business calculator, the HP-19B, without RPN, feedback from financiers and others used to the HP-12C compelled the company to release the HP-19BII, which gave users the option of using algebraic notation or RPN.

Existing implementations using Reverse Polish notation include:

A Postfix evaluator implemented in Python 3.0

operators = {
    '+': float.__add__,
    '-': float.__sub__,
    '*': float.__mul__,
    '/': float.__truediv__,
    '^': float.__pow__,
    '%': float.__mod__,
}

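def eval_rpn(stack, expression):
    """Evaluates an expression in reverse Polish notation on a stack."""
    for token in expression.split():
        if token in operators:
            # pop(-2) removes the left operand, pop() the right operand
            stack.append(operators[token](stack.pop(-2), stack.pop()))
        else:
            stack.append(float(token))
    return stack

stack = []

# Read-eval-print loop; the stack persists between input lines
while True:
    try:
        stack = eval_rpn(stack, input('> '))
        print(stack[-1])
    except (ValueError, IndexError, ZeroDivisionError) as e:
        # IndexError: too few operands; ZeroDivisionError: e.g. "1 0 /"
        print('Bad input:', e)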

Notes

  1. ^ "Charles L. Hamblin and his work" by Peter McBurney
  2. ^ "Charles L. Hamblin: Computer Pioneer" by Peter McBurney, July 27, 2008. "Hamblin soon became aware of the problems of (a) computing mathematical formulae containing brackets, and (b) the memory overhead in having dealing with memory stores each of which had its own name. One solution to the first problem was Jan Lukasiewicz's Polish notation, which enables a writer of mathematical notation to instruct a reader the order in which to execute the operations (e.g. addition, multiplication, etc) without using brackets. Polish notation achieves this by having an operator (+, *, etc) precede the operands to which it applies, e.g., +ab, instead of the usual, a+b. Hamblin, with his training in formal logic, knew of Lukasiewicz's work."
