hoc6, the last version of hoc defined in UPE, adds major features that complete its transition from calculator to minimally useful scripting language.

User-defined subroutines

Users can now define their own subroutines to create reusable behavior. The implementation changes for this feature are probably—with the exception of our transition from hoc3 to hoc4—the largest overall.

Improved Print Statement & String Literals

The print statement is much improved in hoc6. We can print multiple comma-separated expressions, and even more importantly, string literals!

User Input Support

hoc6 enables hoc programs to read input data from files and/or the console. Together with the enhanced print statement, hoc programs can now prompt the user for input, read pre-calculated data from files, and engage interactively with users.

Feature: User-Defined Subroutines

Subroutines are divided into two categories: functions, which are required to return a value, and procedures, which are required not to. You must decide what kind of subroutine you’re writing up-front, because they use different keywords:

The func keyword defines a function:

func theAnswer() {
    return 42
}

A simple hoc function

Whereas the proc keyword is used for procedures:

proc tellMeTheAnswer() {
    print 42
}

A simple hoc procedure

Procedures are still allowed to use the return keyword in order to exit early, but it must be a “bare return”, with no value.

Subroutines may call other subroutines or themselves; they can also use variables freely. Note, however, that the variables defined inside a subroutine are not scoped; they’ll still be available outside of the routine!

The following example returns the factorial of the global variable x. If x isn’t defined, we’ll get an error when we call it.

func factorialOfX() {
    if (x <= 0) {
      return 1
    }
    f = x
    result = 1
    while (f > 1) {
      result = result * f
      f = f - 1
    }
    return result
}

A hoc function that uses variables

You can see that all of the variables used inside the function are still available after it returns.

factorialOfX()
./hoc6/hoc: Undefined variable x (on line 15)
x = 5
factorialOfX()
    120
x
    5
f
    1
result
    120

Using Parameters

When defining a subroutine in hoc6, there is no list of formal parameters. Instead, the function/procedure body can freely use the syntax $1, $2, etc. to refer to the first, second, etc. argument. We can use this to make a much nicer factorial function:

func fac() {
  if ($1 <= 0) {
    return 1;
  } else {
    return $1 * fac($1-1)
  }
}

fac(k) computes k!

When executing your subroutine, hoc will only verify that such an argument was actually passed when you try to use it.

fac(11)
    39916800
fac()
./hoc6/hoc: Not enough arguments passed to subroutine: fac (on line 10)

Assigning to Parameters

The arguments to a subroutine are passed by value; we are allowed to modify their values without changing them in the caller. Notice how we change $1 at the beginning of compare() below.

func compare() {
  $1 = $1 - $2
  if ($1 < 0) {
    return -1;
  } else if ($1 == 0) {
    return 0;
  }
  return 1;
}

compare(x, y) returns -1, 0, or 1, for x < y, x = y, or x > y.
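C also passes arguments by value, so a rough C analogue of compare() can illustrate the behavior. This is only a sketch for comparison and is not part of hoc6; the parameter names a and b are stand-ins for $1 and $2.

```c
/* C analogue of the hoc compare(): parameters are local copies. */
int compare(double a, double b) {
    a = a - b;  /* like `$1 = $1 - $2`: only the local copy changes */
    if (a < 0) {
        return -1;
    } else if (a == 0) {
        return 0;
    }
    return 1;
}
```

Calling compare(x, y) leaves the caller's x and y untouched, which matches what hoc6 promises for its $1 and $2 arguments.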

Feature: Enhanced Print Statements

hoc6 has a much-improved print statement. Not only can multiple expressions be printed, e.g.:

print PI, sin(PI), cos(PI)

We can also now print string literals, mixed with arbitrary expressions.

print "The sine value of ", PI, " is ", int(sin(PI)), "."

Importantly, this is the only functionality supported for strings. They may not be:

Stored in a variable
Returned from a function
Used within an expression

Any of those changes would require support for multiple value types. We’ll get to this in adhoc.

Feature: Reading User Input

In hoc6, we add a new builtin:

read(varName)

This allows our hoc programs to read user input into a variable and check that the input was valid in a single call.

Specifying Input Sources

First, when starting hoc, we can specify a set of one or more input files, and/or stdin. The read(varName) function will read from one file at a time; when a file is exhausted, hoc transparently moves to the next file.

We’d use the command ./hoc6 datafile to read input from datafile, and ./hoc6 - to read from stdin (presumably, the console).

These sources can be combined arbitrarily. The following snippet first reads variables from two static files and, once they’re exhausted, reads any remaining data from stdin.

./hoc6 input-file1 input-file2 -

Figure: `read()`: Behavior for Multiple Files

Using Input Data in `hoc6`

To read a double from our current input sources, you can use the read(myVar) statement. hoc responds to the read request as follows:

hoc will attempt to read a double from the current input source.
If the data is valid, it will be stored into the variable named myVar, and read(myVar) will return 1.
If we are out of data, myVar is set to 0 and read(myVar) returns 0.
If the data was invalid, we get an exec_error.

This allows us to write programs that interactively use stdin during execution, such as:

> ./hoc -
{
  print "Enter some numbers; press EOF when finished: \n"
  sum = 0
  while (read(x)) {
    sum = sum + x
  }
  print "The sum is: ", sum, "\n"
}
Enter some numbers; press EOF when finished:
1 2 3 4 5 6 7 8
<Ctrl-D pressed>
The sum is: 36
<We can continue using hoc here>
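The read behavior described above can be sketched in C as follows. This is a minimal illustration, not hoc6's actual input layer: the function name read_next_double, the READ_* result codes, and the array of FILE pointers are all assumptions made for the example. The real implementation reports invalid data through exec_error rather than a return code.

```c
#include <stdio.h>

/* Hypothetical result codes for this sketch. */
enum { READ_INVALID = -1, READ_EOF = 0, READ_OK = 1 };

/*
 * Try to read one double from sources[*current .. count-1].
 * When a source is exhausted, move on to the next one.
 */
int read_next_double(FILE **sources, size_t count, size_t *current,
                     double *out) {
    while (*current < count) {
        int matched = fscanf(sources[*current], "%lf", out);
        if (matched == 1) {
            return READ_OK;
        }
        if (matched == EOF) {
            (*current)++;  /* this source is exhausted: try the next one */
            continue;
        }
        return READ_INVALID;  /* matched == 0: the input is not a number */
    }
    *out = 0.0;  /* out of data: the variable is set to 0 */
    return READ_EOF;
}
```

A loop such as `while (read_next_double(...) == READ_OK)` then mirrors the `while (read(x))` loop in the hoc session above.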

Implementation: Defining Subroutines

We can think of a user-defined subroutine as a named set of machine instructions that are permanently installed in the hoc virtual machine. When we define a subroutine, we install the instructions, but we don’t execute them. This is a significant change; until now, hoc has executed all top-level statements as soon as they’re recognized.

We’ll break down the changes into meaningful areas and work through them step by step. Since procedures and functions differ only in their return values, our discussion applies to both.

Definition Syntax - We need to support the syntax for creating named blocks of code
Subroutine Addresses - Because subroutines live in our machine’s program storage area, we need to be able to refer to “addresses” inside the prog area.
Reserving prog space - Previously, our hoc machine overwrote whatever code had been installed in prog as soon as it was executed. We need a way to mark subroutine instructions as “reserved” so that they are not overwritten by new statements.

Step 1: Declaration Syntax

First, let’s add the ability to recognize a subroutine definition. We have two kinds of subroutines, so we should examine their components:

func subrName() {
  <statements>
}

Some components of this syntax we already have; the block of statements can nicely be handled by stmtlist, for example.

However, we’ll need to add the func & proc keywords, as well as adding parser actions to associate the subroutine’s name Symbol with its instructions.

New Keywords

Like loops and conditionals, it’ll be helpful to have new tokens—PROC_KW and FUNC_KW—for our “introductory” keywords proc and func. This will let us take action before any other components of the declaration are parsed.

-%token  <hoc_symbol>    IF_KW ELSE_KW PRINT_KW WHILE_KW
+%token  <hoc_symbol>    IF_KW ELSE_KW PRINT_KW WHILE_KW FUNC_KW PROC_KW

New keywords for subroutine declarations hoc6/hoc.y

The keywords must also be added to our language builtins.

static struct hoc_keyword {
  const char *name;
  int         token_type;
} keywords[] = {
    {"if", IF_KW},
    ...
+   {"func", FUNC_KW},
+   {"proc", PROC_KW},
    {NULL, 0},
};

New keywords for subroutines hoc6/builtins.c

Subroutine Names & Symbols

Once we’ve found a func or proc keyword, we want to parse the subroutine’s name, and create an associated Symbol object. To that end, we’ll need a new Symbol type; in fact, we’ll define two, so that we can handle the differences in return-value behavior. For lack of a better alternative, we’ll name them UPROC and UFUNC, for procedures and functions respectively.

  %token  <hoc_value>    NUMBER
- %token  <hoc_symbol>   VAR CONST BUILTIN UNDEF
+ %token  <hoc_symbol>   VAR CONST BUILTIN UNDEF UFUNC UPROC
  %type   <hoc_symbol>   assignable

New token types for subroutine declarations hoc6/hoc.y

The rules for naming a function will be the same as those for naming any other symbol, so we can reuse the VAR token for parsing a subroutine’s name. However, when we eventually call a subroutine, we’ll need to be able to differentiate them from variables and builtin functions, so we create a small wrapper, subr_decl, to override the installed Symbol’s type. This is the same pattern we used for assignable before, to handle VAR vs CONST symbols.

First, we’ll declare that, like our assignable wrapper, subr_decl produces a Symbol.

  %token  <hoc_value>    NUMBER
- %type   <hoc_symbol>   assignable
+ %type   <hoc_symbol>   assignable subr_decl

subr_decl produces a UFUNC or UPROC hoc6/hoc.y

The production for a subr_decl is easy, since we have our introductory keywords letting us know when a declaration has begun. Our only job is to override the Symbol.type installed by the lexer.

+ subr_decl:      PROC_KW VAR
+                 {
+                     $2->type = UPROC;
+                     $$ = $2;
+                 }
+         |       FUNC_KW VAR
+                 {
+                     $2->type = UFUNC;
+                     $$ = $2;
+                 }
+         ;

The subr_decl production updates Symbol types hoc6/hoc.y

Q: What’s the value of $2->type before this production runs?

A: The yylex() function installs it as UNDEF.
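Outside of the grammar, the "retagging" pattern can be sketched in plain C. The names lex_new_name, declare_subr, and the Keyword enum are hypothetical and exist only for this illustration; hoc6 generates its real symbol types from the token declarations, and its Symbol carries more fields than shown here.

```c
/* Hypothetical symbol kinds; hoc6 derives the real ones from its tokens. */
enum SymbolType { UNDEF, VAR, UFUNC, UPROC };
enum Keyword { KW_FUNC, KW_PROC };

typedef struct Symbol {
    const char     *name;
    enum SymbolType type;
} Symbol;

/* What the lexer does for a name it has never seen: install it as UNDEF. */
Symbol lex_new_name(const char *name) {
    Symbol sym = { name, UNDEF };
    return sym;
}

/* What the subr_decl action does: override the type, pass the Symbol on. */
Symbol *declare_subr(enum Keyword kw, Symbol *sym) {
    sym->type = (kw == KW_FUNC) ? UFUNC : UPROC;
    return sym;
}
```

The returned pointer plays the role of `$$ = $2` in the grammar action: later productions receive the same Symbol, now tagged as a subroutine.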

Recognizing Subroutine Definitions

We can complete our language support for subroutine definitions by recognizing their overall shape, including the body.

Modeling ourselves after our if and while constructs, we note that these declarations should be statements, and we can use the stmt production to recognize their bodies.

Like all statements, procedure declarations return the address at which they begin. These declarations are entirely defined by their body, so we’ll return the address of their body stmt.

  stmt:           expr { install_instruction(inst_pop); }
          |       '{' stmtlist '}' { $$ = $2; }
+         |       subr_decl '(' ')' stmt
+                 {
+                     $$ = $4;
+                 }

A subroutine declaration is a new kind of stmt hoc6/hoc.y

Step 2: Subroutine Addresses

Since user-defined subroutines are really just a set of instructions, a Symbol referring to a subroutine is really holding the address of the associated instructions.

It seems like the time to add an Addr member to the Symbol.data union, but unfortunately, this causes a new problem. Remember that the MachineCode struct already contains a Symbol pointer, because that’s how we represent symbol-table references. This causes cyclic dependencies between the Symbol and MachineCode types.

We can solve this by moving our forward declarations of MachineCode & Addr above our Symbol definition in hoc.h, letting the compiler resolve the full struct layout later.

+ ///----------------------------------------------------------------
+ /// Addresses (Shared by multiple types)
+ ///----------------------------------------------------------------
+ typedef struct MachineCode MachineCode;
+ typedef MachineCode       *Addr;

...

  struct Symbol {
    char *name;
    short type;  // Generated from our token types
    Symbol *next;
    union {
      double             val;
+     Addr               addr;
      struct BuiltinFunc func;
    } data;
  };

+ ///----------------------------------------------------------------
+ /// Machine Code
+ ///----------------------------------------------------------------
...
- typedef struct MachineCode MachineCode;
- typedef MachineCode       *Addr;

hoc6/hoc.h
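A stripped-down sketch of why the forward declarations are enough: each struct only stores a pointer to the other, and C allows pointers to incomplete types. The structs below are simplified assumptions (no union, no extra fields), and refers_to_itself is a hypothetical helper added purely to exercise the cycle.

```c
#include <stddef.h>

/* Forward declarations: both names exist before either struct is laid out. */
typedef struct MachineCode MachineCode;
typedef MachineCode       *Addr;
typedef struct Symbol      Symbol;

struct Symbol {
    const char *name;
    Addr        addr;  /* only a pointer, so the incomplete type suffices */
};

struct MachineCode {
    Symbol *symbol;    /* symbol-table reference */
};

/* Hypothetical helper: true when sym's code begins with a reference to sym. */
int refers_to_itself(const Symbol *sym) {
    return sym->addr != NULL && sym->addr->symbol == sym;
}
```

By the time refers_to_itself dereferences sym->addr, the full definition of struct MachineCode is visible, so the compiler can resolve the member access.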

Step 3: Installing Subroutines

We need a way to “install” routines into our machine permanently, so that we can execute them multiple times. The biggest obstacle to this process is our machine_reset_program() function, which resets progp every time we execute a parsed statement. We need a way to prevent our subroutine bodies from being overwritten.

One workable method is to mark all space from prog to the end of the procedure as “reserved”. Instead of resetting progp back to the beginning of program memory, we’ll use a new variable, prog_start, which indicates the first non-reserved word that we can use.

  MachineCode prog[PROGSIZE];
+ Addr        prog_start = prog;  // No installation above this address
  Addr        progp;              // Next free location for code installation
  static Addr pc;                 // the current location in the executing program

hoc6/machine.c

The prog* variables have the following relationship:

Figure: Our new `prog_start` variable

Now, when we define a routine, we’ll preserve its code by shifting prog_start to just after its body. This is handled by reserve_subr, a new machine function. That function also updates the given Symbol to point to the location where the subroutine begins.

/**
 * Reserves all currently-installed code as a subroutine, under Symbol
 * `subr_sym`. The value of `subr_sym` will be updated to the address
 * of the installed routine. Ensures that the code will not be
 * overwritten by future installations, even after a machine reset.
 */
void reserve_subr(Symbol *subr_sym);

hoc6/hoc.h

  Addr install_code(MachineCode mc) {
    *progp = mc;
    Addr result = progp++;  // return location of THIS instruction
    return result;
  }

+ void reserve_subr(Symbol *subr_sym) {
+    subr_sym->data.addr = prog_start;
+    prog_start = progp;
+ }

hoc6/machine.c

Then, we make machine_reset_program() aware of prog_start, so that we never again overwrite any code between prog and prog_start.

void machine_reset_program(void) {
  stackp = stack;
- progp = prog;
+ progp = prog_start;

hoc6/machine.c
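To see the reservation scheme in isolation, here is a toy version in which "instructions" are plain ints. It is a sketch under simplifying assumptions: reserve_subr returns the start address instead of storing it in a Symbol, and reserved_words is a hypothetical helper used only to inspect the state.

```c
#include <assert.h>
#include <stddef.h>

#define PROGSIZE 64

static int  prog[PROGSIZE];     /* toy "instructions" are plain ints */
static int *prog_start = prog;  /* first word that is not reserved */
static int *progp = prog;       /* next free word for installation */

int *install_code(int word) {
    assert(progp < prog + PROGSIZE);
    *progp = word;
    return progp++;  /* location of THIS word */
}

/* Reserve everything installed since the last reset; return where it begins. */
int *reserve_subr(void) {
    int *begin = prog_start;
    prog_start = progp;
    return begin;
}

/* Later statements are installed after the reserved area, never over it. */
void machine_reset_program(void) {
    progp = prog_start;
}

/* Hypothetical helper: number of words currently protected. */
size_t reserved_words(void) {
    return (size_t)(prog_start - prog);
}
```

Installing two words, reserving them, and then resetting leaves progp just past the reserved pair, so the next statement lands at offset 2 and the subroutine body survives any number of further resets.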

Finally, we update our parser action for subroutine-declaration statements to call our new reservation function and populate the Symbol value. Now, writing a subroutine declaration will install code, but won’t execute any!

  stmt:      expr { install_instruction(inst_pop); }
        |    '{' stmtlist '}' { $$ = $2; }
        |    subr_decl '(' ')' stmt
             {
+                reserve_subr($1);
                 $$ = $4;
             }

hoc.y: Reserving space for procedures hoc6/hoc.y

Q: This isn’t 100% true. What instruction(s) are read / executed after parsing stmt, and where do they come from?

A: The machine will examine at least 1 instruction: STOP. It’s installed by the list: list stmt production, and since prog_start is updated before the list production runs, it will be evaluated and then complete.

Breakpoint: Test your Implementation

If you are following along, now is a good time to test your implementation. We can do that by adding a temporary expr production that understands how to produce a value for an installed declaration.

%%
expr: ...
...      |   UPROC
             {
                 $$ = install_instruction(inst_pushlit);
                 install_literal($1->data.addr - prog);
             }
         |   UFUNC
             {
                 $$ = install_instruction(inst_pushlit);
                 install_literal($1->data.addr - prog);
             }

Now, using the function/procedure name as a variable will evaluate to its address, relative to prog. As expected, a more complex body (like calling a builtin), corresponds to a larger block of instructions.

./hoc6/hoc
proc f1() { x = 123 + 345 }
proc f2() { print sin(cos(x+42)) }
proc f3() { y = 10 }
f1
    0
f2
    10
f3
    24

Implementation: Calling Subroutines

hoc now correctly installs our subroutines into permanent storage. We now focus on calling those subroutines, which requires a coordinated dance between the caller and called code.

The basic idea of executing subroutines in our hoc VM involves a combination of two cooperating instructions:

subrexec, to enter a named subroutine, saving our current location
uprocret, to return from a procedure to the saved location. (There’s also a ufuncret instruction, covered when we implement return values.)

We’ll add this functionality in stages:

Representing Subroutine Calls & Callers - Whenever we call a procedure, we must keep track of where we will resume execution afterwards. Since subroutines may call other subroutines (or themselves), we need to track an arbitrary stack of callers.
Language Changes
Executing Subroutines
Returning to Callers

Step 1: Representing “Calls”

The bare minimum required for executing a reserved block of code (ignoring, for now, arguments and return values) is the ability to jump to the subroutine’s address, then jump back to the caller after executing. The origin and destination data for a call are encoded into a call frame. Unlike most of our objects, the Frame structure has no use outside of the machine module, so we can define it inside machine.c.

typedef struct Frame {
  Symbol *subr_called;  // the UFUNC/UPROC we have jumped to
  Addr    ret_pc;       // return location
} Frame;

hoc6/machine.c

Below is an example of the relationship between the subrexec instruction, the Frame it creates, and the uprocret instruction, which uses that information to return to the caller.

Figure: Relationship between an “execute” call, its `Frame`, and the `Symbol` involved.

The Call Stack

Because subroutines may call other subroutines (or themselves), a single Frame of call data is not sufficient. Instead, this information is often organized into a stack of per-call data, imaginatively named the call stack. Since a Frame is pushed onto the stack every time a subroutine is invoked, and popped when a routine returns to its caller, the currently-executing routine always has access to its own data at the top of the stack.

#define FRAMELIMIT 100  // Recursion depth limit
static Frame        frames[FRAMELIMIT];
static const Frame *OVERFLOW_FRAME = frames + FRAMELIMIT;
static Frame       *fp = frames;   // Next frame to use

hoc6/machine.c

We also add some simple functions for manipulating the frame stack.


/**
 * Push a new frame for a subroutine call onto the frame stack, or fail on
 * overflow.
 *
 * The frame is initialized as a call to Symbol `s`, returning to address
 * `caller_pc`. Caller must initialize all other frame data.
 */
static Frame *frame_push(Symbol *s, Addr caller_pc) {
  if (fp == OVERFLOW_FRAME) {
    exec_error("Call-Stack Overflow while calling '%s'", s->name);
  }

  fp->subr_called = s;
  fp->ret_pc = caller_pc;
  return fp++;
}

/** Pop most recent Frame from the frame stack, or fail on underflow */
static Frame *frame_pop(void) {
  if (fp == frames) {
    exec_error("Call-Stack Underflow while attempting to return");
  }
  return --fp;
}

/** Access top of frame stack, or fail if no Frame exists */
static Frame *frame_peek(void) {
  if (fp == frames) {
    exec_error("Call-Stack Underflow while attempting to peek at top frame");
  }
  return fp - 1;
}

hoc6/machine.c

While we could keep the Frame objects on our existing stack, the UPE authors recommend separating the stacks for ease of implementation.

Step 2: The `subrexec` Instruction

With our new call stack, we can implement the subrexec instruction, which performs the work to actually change our PC location and push a new Frame onto our stack. This instruction is always two words long, since we also need the Symbol which points to our procedure.

Figure: The `subrexec` instruction format

To actually run the subroutine, we use a recursive execute() call, using Symbol->data.addr as the pc value.

int inst_subrexec(void) {
  Symbol *sym = (pc++)->symbol;
  Frame *f = frame_push(sym, pc);

  execute(sym->data.addr);
  return 0;
}

hoc6/machine.c

Sidenote: Why Recurse?

While it may seem as though we could simply change the value of the PC in order to “jump” to the subroutine, a nested execute() call is more compatible with our implementation of ifcode and whilecode. Since those instructions use recursion, a direct-jump subroutine implementation would end up executing code in C stack frames where we would not expect it.

As an example, consider the following hoc program.

proc sillyCode() {
  if (2) {
    if (3) {
      print 1
      return
    }
    print 4
  }
  print 5
}
{
 if(1) {
   sillyCode()
   print 2
 }
 print 3
}

If you assume a non-recursive implementation of the subrexec and the uprocret instructions, what would this program print?

Most likely, something like:

Because our ifcode instructions recurse, we’re inside 4 execute() functions by the time we reach print 1:

Main execute() from our REPL
execute() inside inst_ifcode for if (1).
execute() inside inst_ifcode for if (2).
execute() inside inst_ifcode for if (3).

Now, after we finish print 1, we return, and we assume that merely jumps to the instructions for print 2. Since that’s the end of a stmtlist block, the following instruction will be a STOP.

Now things get odd. This will exit the fourth layer of the execute() function, bringing us back to the inst_ifcode we were running for if (3)… inside our procedure! And that will begin running print 4! There are many problematic situations we can create for ourselves in this manner.

Step 3: Subroutine-Call Syntax

To install the subrexec instruction, we must add the language rules for calling a function or procedure. For now, we’ll disallow specifying any arguments, so the address we produce for the parser will just be the instruction for calling the procedure / function. Notice that procedure calls, which don’t return a value, must be statements, as they don’t leave a value on the stack.

%%
stmt: ...
     |        UPROC '(' ')'
              {
                  $$ = install_instruction(inst_subrexec);
                  install_ref($1);
              }

hoc.y: Calling procedures hoc6/hoc.y

Function calls are expressions, and so they can be used inside conditions, etc. (The job of placing a value onto the data stack will be handled by the return statement of the function; trying to call a function at this point will cause errors!)

%%
expr: ...
     |        UFUNC '(' ')'
              {
                  $$ = install_instruction(inst_subrexec);
                  install_ref($1);
              }

hoc.y: Calling functions hoc6/hoc.y

Step 4: Returning to our Caller

Our procedures must, at some point, return to their caller. We need a facility to “jump back” to our calling code. Since we’re executing recursively, this means we need to add a way to signal that it’s time to return from our current execute() call, and update the pc value to match our return address.

For hoc6, we will use a simple implementation, adding an is_returning flag that is examined by execute(). Whenever that flag is true, the VM knows it should just be unwinding nested execute calls, so the execute() function returns immediately, without incrementing the pc value again or executing anything.

That flag should stay true until we find ourselves back in the inst_subrexec function, since we’ve returned from the called subroutine. At that point, we can safely set the flag to false, knowing that the return instruction has already updated the PC.

You can see a visual representation of this flow below:

subrexec runs within some execute() call (dotted). It finds the procedure address, and starts its own execute() flow.
This flow runs instructions; eventually we reach a uprocret instruction
uprocret sets pc = frame.ret_pc, and sets the is_returning flag
execute() sees the flag is enabled, and returns to subrexec
subrexec disables is_returning, and returns to its enclosing execute() call. The pc value is still set to the value from uprocret and is not updated again, so execution continues correctly.

Machine Changes

First, we must add the flag for is_returning:

  #define FRAMELIMIT 100  // Recursion depth limit
  #define OVERFLOW_FRAME (frames + FRAMELIMIT)
  static Frame  frames[FRAMELIMIT];
  static Frame *fp = frames;   // Next frame to use
+ static bool   is_returning;  // true when returning from a user subroutine

hoc6/machine.c

Next, we’ll add the uprocret instruction; it must:

Pop the current Frame from our frame stack, so we know our return address
Set the pc to that return address, so that execution can resume in our caller.
Mark is_returning true, so that the next execute() iteration returns early, and we may return to subrexec.

int inst_uprocret(void) {
  Frame *f = frame_pop();
  is_returning = true;
  pc = f->ret_pc;
  return 0;
}

A “return” instruction hoc6/machine.c

Finally, we update execute to respect this state.

void execute(Addr start_addr) {
  pc = start_addr;
+ is_returning = false;
- while (!is_stop_inst(pc)) {
+ while (!is_stop_inst(pc) && !is_returning) {

hoc6/machine.c

An Issue: Loops and Conditionals

There’s one other aspect of our execution flow that must be considered before we’re done. To see it, consider the following hoc program, and decide what output will result when running it.

tester = 0
proc earlyReturnTest() {
  if (tester > 0) {
    print 2
    return
  }
  print 1
}

Q: What’s the output?

A: If tester is greater than 0 when we call earlyReturnTest(), it will print both 2 and 1!

The reason for this bug lies in our current implementation of the ifcode and whilecode instructions, and the nested execution flow we’re using. Because ifcode and whilecode also overwrite the pc value after their nested execute() calls, they may override the pc value just set by uprocret!

Consider the following flow:

Figure: Problems with `uprocret` and `ifcode`

Looking back at our example, the return statement inside our procedure sets pc and is_returning. We immediately return to inst_ifcode, whose nested execute() call just ended. But inst_ifcode also sets the pc value, overriding our return location.

The relevant parts of hoc5’s inst_ifcode() are commented below:

int inst_ifcode(void) {
  Addr ifbody_addr = pc[0].addr;
  ...
  Addr end_addr = pc[2].addr;
  Addr cond_addr = pc + 3;
  execute(cond_addr);
  Datum d = stack_pop();
  if (d.value) {
    execute(ifbody_addr);  // uprocret changes our PC inside here
  }
  pc = end_addr;           // and it's overwritten here...  :(
  return 0;
}

In hoc5, inst_ifcode unconditionally overwrites pc to end_addr

How can we make our return keyword force its way past any enclosing if and while blocks, until it reaches its parent subrexec instruction?

“Bubbling” Returns via `is_returning` Checks

As the UPE authors suggest, a simple (if hacky) fix is to make these instructions aware of our is_returning state, since they are part of our ‘pc-change’ flow. Before making any manual changes to the pc value, we check whether we’re actually returning (and so should stop executing). This lets us “bubble up” from a return statement, through any enclosing ifs or whiles, to the subroutine we’re returning from! Here’s our fixed ifcode flow:

Figure: `is_returning` aware conditional logic

The actual changes are minor:

`inst_ifcode`

  int inst_ifcode(void) {
    ...
    execute(cond_addr);
    Datum d = stack_pop();  // must be value
    if (d.value) {
      execute(ifbody_addr);
    } else if (maybe_elsebody.type == CT_ADDR) {
      execute(maybe_elsebody.addr);
    }

-   pc = end_addr;
+   if (!is_returning) {
+     pc = end_addr;
+   }
    return 0;
  }

hoc6/machine.c

`inst_whilecode`

  int inst_whilecode(void) {
    Addr body_addr = (pc++)->addr;
    Addr end_addr = (pc++)->addr;
    Addr cond_addr = pc;

    while (true) {
+     if (is_returning) {
+       return 0;  // PC is already set to the appropriate address
+     }

      execute(cond_addr);
      if (stack_pop().value) {
        execute(body_addr);
      } else {
        break;
      }
    }

    // condition false
    pc = end_addr;
    return 0;
  }

hoc6/machine.c
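The whole return flow, including the "bubbling" check, can be exercised in a toy machine. Everything below is a simplified assumption rather than hoc6's code: instructions are ints, the if condition is a literal operand, a bare array of return addresses stands in for the Frame stack, and is_returning is cleared only by the call instruction. The program mirrors the early-return example: a procedure that prints 2 and returns from inside an if, followed by a print 1 that must not run.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy opcodes for this sketch only; not hoc6's real instruction set. */
enum { OP_STOP, OP_PRINT, OP_IF, OP_CALL, OP_RET };

static const int prog[] = {
    /* procedure body, installed at address 0 */
    OP_IF, 1, 4, 8,  /*  0: if (1), body at 4, code after the if at 8 */
    OP_PRINT, 2,     /*  4: print 2 */
    OP_RET,          /*  6: return */
    OP_STOP,         /*  7: end of the if body */
    OP_PRINT, 1,     /*  8: print 1 (skipped by the early return) */
    OP_RET,          /* 10: return at the end of the procedure */
    OP_STOP,         /* 11 */
    /* top-level statement, installed at address 12 */
    OP_CALL, 0,      /* 12: call the procedure at address 0 */
    OP_PRINT, 3,     /* 14: print 3 */
    OP_STOP,         /* 16 */
};

static int  pc;
static bool is_returning;
static int  ret_stack[8], ret_depth;
static int  printed[8], printed_count;

static void execute(int start) {
    pc = start;
    while (!is_returning && prog[pc] != OP_STOP) {
        switch (prog[pc++]) {
        case OP_PRINT:
            assert(printed_count < 8);
            printed[printed_count++] = prog[pc++];
            break;
        case OP_IF: {
            int cond = prog[pc], body = prog[pc + 1], end = prog[pc + 2];
            if (cond) {
                execute(body);
            }
            if (!is_returning) {
                pc = end;  /* only skip past the if when not returning */
            }
            break;
        }
        case OP_CALL: {
            int target = prog[pc++];
            assert(ret_depth < 8);
            ret_stack[ret_depth++] = pc;  /* resume just after the call */
            execute(target);
            is_returning = false;  /* the return has reached its caller */
            break;
        }
        case OP_RET:
            assert(ret_depth > 0);
            pc = ret_stack[--ret_depth];
            is_returning = true;
            break;
        }
    }
}

/* Runs the program from address 12; copies the printed values into out. */
int run_return_demo(int *out, int max) {
    pc = 0;
    is_returning = false;
    ret_depth = 0;
    printed_count = 0;
    execute(12);
    int n = printed_count < max ? printed_count : max;
    for (int i = 0; i < n; i++) {
        out[i] = printed[i];
    }
    return n;
}
```

With the check in the OP_IF case, run_return_demo reports the values 2 and 3: the return bubbles out of the if body, through the procedure's execute() call, and back to the call instruction, which clears the flag before the top-level code continues.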

Adding `return` Statements

To allow users to trigger the return, we add a return keyword to our token declarations and language builtins:

- %token  <hoc_symbol>  IF_KW ... FUNC_KW PROC_KW
+ %token  <hoc_symbol>  IF_KW ... FUNC_KW PROC_KW RETURN_KW
