hoc5 builds on the VM model constructed for hoc4, enabling more complex constructs like if/else branching and while loops. As we’ll see, these flow-control constructs will be more useful if we can also perform some limited output, so we’ll add a print keyword as well.

These constructs are the first step to executing larger programs, instead of the single-line “programs” executed by hoc4. We’ll begin with an overview of these new features to better understand how our implementation might change.

Feature: Relational and Boolean Operators

The first feature we’ll need for conditional logic is a set of relational operators, so that we can compare two values. We’ll add the standard set of ==, !=, <, <=, >, and >=.

These operators, like our arithmetic operators, will be compiled into instructions which will pop two values from the stack, then push the value 1 if the given comparison is true, and 0 otherwise.
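As a rough illustration of the pop-two, push-one behavior described above, here is a minimal sketch in C. The `stack`, `push`, `pop`, `lt`, `lte`, and `eq` names are assumptions made for this example; hoc's real machine stores a richer value type on its stack, and its instruction names may differ.

```c
#include <assert.h>

/* Hypothetical operand stack of plain doubles (an assumption for this sketch). */
#define STACK_SIZE 256
double stack[STACK_SIZE];
int sp = 0;                         /* index of the next free slot */

void push(double v) { assert(sp < STACK_SIZE); stack[sp++] = v; }
double pop(void)    { assert(sp > 0); return stack[--sp]; }

/* Each comparison pops the right operand first, then the left,
 * and pushes 1 if the comparison holds, 0 otherwise. */
void lt(void)
{
    double right = pop();
    double left = pop();
    push(left < right ? 1.0 : 0.0);
}

void lte(void)
{
    double right = pop();
    double left = pop();
    push(left <= right ? 1.0 : 0.0);
}

void eq(void)
{
    double right = pop();
    double left = pop();
    push(left == right ? 1.0 : 0.0);
}
```

The operand order matters: for 2 < 5 the machine pushes 2 and then 5, so the first value popped is the right-hand side.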

The boolean operators, && and ||, operate the same way. In hoc5, unlike most languages, we’ll always evaluate both operands, instead of short-circuiting. We do want to inherit C’s approach to boolean precedence, however. Namely, the boolean operators should have a very low precedence, so that expressions like:

x < y && y < z

are evaluated as:

(x < y) && (y < z)

instead of:

x < (y && y) < z

Feature: Conditional Logic

The hoc language up to this point is meant for line-by-line interactions. The user enters an expression, and hoc prints a response. This “calculator-style” interaction does not work as well once we introduce conditional logic. Consider:

x = 10
y = 1
while (x > 0) {
    y = y * x
    x = x - 1
}
lg(y)

hoc code for lg(10!)

The first two lines of this example will execute correctly in hoc4. Each one will be parsed into a tiny “assignment” program, and execute()ed in the virtual machine.

The while loop, however, presents some brand-new challenges. There are at least 3 hoc4 “programs” we can see inside this snippet of code: for example, the condition x > 0 and the two assignments in the loop body.

Furthermore, some of these programs will be executed a variable number of times:

The instructions that evaluate the condition must always execute, since their result determines whether we enter the loop body.
The loop body, with its multiple internal expressions, executes only if the condition is true. After the body runs, we must return to the condition and evaluate it again.
Finally, once the condition evaluates to false, we must skip the body and move on to the instructions after the loop.

Obviously, our parser cannot know how many iterations of the loop we will run, so the machine must be able to “jump” between the condition, body, and post-body sections on its own.

We can visualize its behavior as follows:

We’ll need a new set of instructions for this: branching instructions, which allow us to move our pc to a new value and continue executing.
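To make the idea of moving the pc concrete, here is a small self-contained sketch of a stack machine with two generic branch opcodes, running the factorial loop from the example above. Everything in it (the `OP_*` opcodes, `vars`, `run`, and the hand-assembled `factorial_loop` program) is an assumption invented for illustration; it is not hoc5's instruction set, and it uses integer values where hoc uses floating point.

```c
#include <assert.h>

/* Hypothetical opcodes; OP_JMP and OP_JZ are the branching instructions. */
enum { OP_PUSH, OP_LOAD, OP_STORE, OP_MUL, OP_SUB, OP_GT, OP_JMP, OP_JZ, OP_HALT };

long vars[2];                       /* vars[0] is x, vars[1] is y */

/* x = 10; y = 1; while (x > 0) { y = y * x; x = x - 1 } */
const long factorial_loop[] = {
    /*  0 */ OP_PUSH, 10, OP_STORE, 0,
    /*  4 */ OP_PUSH, 1,  OP_STORE, 1,
    /*  8 */ OP_LOAD, 0,  OP_PUSH, 0, OP_GT,        /* condition: x > 0 */
    /* 13 */ OP_JZ, 31,                             /* false: skip past the loop */
    /* 15 */ OP_LOAD, 1,  OP_LOAD, 0, OP_MUL, OP_STORE, 1,
    /* 22 */ OP_LOAD, 0,  OP_PUSH, 1, OP_SUB, OP_STORE, 0,
    /* 29 */ OP_JMP, 8,                             /* back to the condition */
    /* 31 */ OP_HALT
};

void run(const long *prog)
{
    long stack[64];
    int sp = 0;                     /* next free stack slot */
    int pc = 0;                     /* index of the next instruction */
    long r;

    for (;;) {
        switch (prog[pc++]) {
        case OP_PUSH:  assert(sp < 64); stack[sp++] = prog[pc++]; break;
        case OP_LOAD:  assert(sp < 64); stack[sp++] = vars[prog[pc++]]; break;
        case OP_STORE: assert(sp > 0); vars[prog[pc++]] = stack[--sp]; break;
        case OP_MUL:   assert(sp > 1); r = stack[--sp]; stack[sp - 1] *= r; break;
        case OP_SUB:   assert(sp > 1); r = stack[--sp]; stack[sp - 1] -= r; break;
        case OP_GT:    assert(sp > 1); r = stack[--sp];
                       stack[sp - 1] = stack[sp - 1] > r; break;
        case OP_JMP:   pc = (int)prog[pc]; break;   /* unconditional branch */
        case OP_JZ:    assert(sp > 0);              /* branch if top of stack is 0 */
                       if (stack[--sp] == 0) pc = (int)prog[pc];
                       else pc++;                   /* step over the address cell */
                       break;
        case OP_HALT:  return;
        default:       assert(0 && "unknown opcode"); return;
        }
    }
}
```

Calling run(factorial_loop) leaves 10! in vars[1]. The parser never needs to know how many iterations occur; it only needs to supply the two addresses, 8 and 31.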

We’ll still need some help from the parser, since our machine needs to be told what address in the program to jump to. We can figure this out at parsing time, but only after the whole loop is parsed and all of its code is installed. So, we’ll need to be able to install the entire while construct, including the conditional expression and loop body, before we execute any of it. This will require some structural changes to our grammar.
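One common way to handle an address that is not known until later is to reserve a cell for it and fill it in once the rest of the code has been installed. The sketch below shows that idea in isolation. The names (`prog`, `progp`, `emit`, `install_while`) and the placeholder opcodes are assumptions for this example, and the condition and body are stand-in cells rather than real compiled code.

```c
#include <assert.h>

#define PROG_SIZE 64
int prog[PROG_SIZE];                /* hypothetical program memory */
int progp = 0;                      /* next free cell */

/* Placeholder opcodes, invented for this sketch. */
enum { OP_COND = 1, OP_BODY, OP_JZ, OP_JMP };

/* Store one cell and return the address it was stored at. */
int emit(int cell)
{
    assert(progp < PROG_SIZE);
    prog[progp] = cell;
    return progp++;
}

/* Install a loop whose condition and body occupy the given number of
 * cells; return the address of the loop's first instruction. */
int install_while(int cond_len, int body_len)
{
    int start = progp;              /* where the condition begins */

    for (int i = 0; i < cond_len; i++)
        emit(OP_COND);

    emit(OP_JZ);
    int hole = emit(-1);            /* exit address unknown: reserve a cell */

    for (int i = 0; i < body_len; i++)
        emit(OP_BODY);

    emit(OP_JMP);
    emit(start);                    /* the backward jump target is already known */

    prog[hole] = progp;             /* the exit address is known now: patch it */
    return start;
}
```

The backward jump is easy because the condition has already been installed. Only the forward jump needs a reserved cell, which is why the whole construct must be installed before any of it runs.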

Implementing Relational and Boolean Operators

For the most part, implementing relational operations is a straightforward extension of our current arithmetic operations. There is a small additional challenge regarding their syntax: the operators themselves may be 1 or 2 characters, and their prefixes overlap. For example, we must be able to distinguish < from <=, regardless of whitespace and surrounding digits/letters, e.g. 2 <=5 vs 2< 5.
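The overlap between prefixes can be resolved with one character of lookahead. The sketch below scans from a string rather than an input stream so that it stays self-contained; the `TOK_*` values and the `follow` and `next_operator` functions are hypothetical names for this example, not the tokens our grammar will declare.

```c
#include <assert.h>

/* Hypothetical token codes for this sketch. */
enum token {
    TOK_END = 0, TOK_LT, TOK_LTE, TOK_GT, TOK_GTE,
    TOK_EQ, TOK_NEQ, TOK_NOT, TOK_ASSIGN, TOK_OTHER
};

/* Look at the next character. On a match, consume it and return ifyes;
 * otherwise leave it in place and return ifno. */
static enum token follow(const char **src, char expect,
                         enum token ifyes, enum token ifno)
{
    if (**src == expect) {
        (*src)++;
        return ifyes;
    }
    return ifno;
}

/* Skip blanks, then read one operator token from *src. */
enum token next_operator(const char **src)
{
    while (**src == ' ' || **src == '\t')
        (*src)++;
    if (**src == '\0')
        return TOK_END;

    char c = *(*src)++;
    switch (c) {
    case '<': return follow(src, '=', TOK_LTE, TOK_LT);
    case '>': return follow(src, '=', TOK_GTE, TOK_GT);
    case '=': return follow(src, '=', TOK_EQ,  TOK_ASSIGN);
    case '!': return follow(src, '=', TOK_NEQ, TOK_NOT);
    default:  return TOK_OTHER;
    }
}
```

With this approach, the text after the 2 in 2 <=5 yields a less-than-or-equal token and leaves the 5 unread, while the text after the 2 in 2< 5 yields a plain less-than token and leaves the space unread.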

Step 1: Add Operator Tokens

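The relational operators are left-associative, just like the arithmetic ones. They get a lower precedence than any arithmetic operator, so that x < y + 2 parses as x < (y + 2). Similarly, OR and AND sit below the relational operators, so that comparisons are complete before their results are combined. In yacc, each successive precedence line binds more tightly than the one before it, so the new lines go between '=' and the arithmetic operators.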

  %type   <hoc_symbol>    assignable
  %right  '='       /* right-associative, much like C */
+ %left   OR
+ %left   AND
+ %left   EQ GT GTE LT LTE NEQ
  %left '+' '-'   /* left-associative, same precedence */
  %left '*' '/'   /* left-assoc; higher precedence */
  %left UNARY_MINUS NOT /* Prefix operators have highest priority */

hoc5: Flow Control

Feature: Relational and Boolean Operators

Feature: Conditional Logic

Implementing Relational and Boolean Operators

Step 1: Add Operator Tokens

Step 2: Updating our Lexer

Step 3: Grammar Changes

Step 4: Machine Instructions

Implementing Loops

Step 1: Adding Keywords

Step 2: Introducing Statements

Step 3: Exposing Addresses

Accessing Addresses During Parsing

Handling Addresses in our Parser

Operator Expressions

Example: Generating Addresses

Assignment Expressions

Statement Addresses

Step 4: Looping Instructions: whilecode

The Shape of a whilecode instruction

Executing whilecode

Step 5: Parsing & installing while statements

Reserving Space: the while Keyword

Checking Conditions: the cond Nonterminal

Installing STOPs: The end rule

Putting it all together: the while statement

Implementing Conditional Logic

Parsing if/else statements

Syntax Sidenote: Braces & Newlines (or skip)

Executing if/else statements

Implementing a print Statement

What’s Next?

Source Code
