In the previous post we saw how 8-bit symbols represent literals, arguments, functions, and so on. Just as important was a good, efficient function encoding. In the context of function encoding we need to be able to answer the following questions:
- Where does the function definition begin?
- Where does it end?
- How many arguments does a function have?
Initially, I planned to use function identifiers as indirect references to physical memory locations, so functions would simply be indexed starting from 0. But this indirection could cost a lot in terms of hardware.
Instead, I use the actual symbol to determine the function address directly. The most significant bit signals whether the symbol in question is a function (it is a function if the bit is set, with the exception of EOX). The next 5 bits, multiplied by 8, give the function's physical address. For example, for %1aaa.aaxx the function definition starts at the physical binary address %aaaa.a000.
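To make the bit layout concrete, here is a minimal C sketch of that address extraction. The helper names and the `uint8_t` representation are my own, illustrative choices, not taken from the actual implementation:

```c
#include <stdint.h>

/* A symbol denotes a function if its most significant bit is set
   (the special EOX symbol is the one exception and must be
   filtered out by the caller). */
static int is_function(uint8_t sym) {
    return (sym & 0x80) != 0;
}

/* %1aaa.aaxx -> physical address %aaaa.a000:
   take bits 6..2 and multiply by 8 (shift left by 3). */
static uint8_t fn_address(uint8_t sym) {
    return (uint8_t)(((sym >> 2) & 0x1F) << 3);
}
```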
The observant reader may be wondering what the two least significant bits are used for. They encode the argument count in two's complement form (that is, as the negated count in two bits) as follows:
- %00 - 4 arguments
- %01 - 3 arguments
- %10 - 2 arguments
- %11 - 1 argument
Again, this rather strange encoding was chosen to make the processing easier in hardware.
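A corresponding sketch of the arity decoding (again with an illustrative helper name of my own): subtracting the two low bits from 4 is equivalent to reading them as the negated count in two-bit two's complement.

```c
/* Decode the two low bits: %00 -> 4, %01 -> 3, %10 -> 2, %11 -> 1 */
static uint8_t fn_argc(uint8_t sym) {
    return (uint8_t)(4 - (sym & 0x03));
}
```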
In light of the above, the full function encoding works as follows:
- %10000000 - is a user-defined function with 4 arguments, whose definition starts at address $00.
- %10000111 - is a user-defined function with 1 argument, whose definition starts at address $08.
- %11111010 - is a user-defined function with 2 arguments, whose definition starts at address $F0.
Please also note that the built-in function "if" is encoded in accordance with the above arity encoding, i.e. $FD (the symbol encoding "if") represents a 3-argument function. I should mention that the encoding of inc/dec does not conform to this scheme, but they are treated differently, as we will see.
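Putting the two sketched helpers together, a small (hypothetical) test reproduces the worked examples above, including the $FD encoding of "if":

```c
#include <assert.h>

int main(void) {
    assert(fn_address(0x80) == 0x00 && fn_argc(0x80) == 4); /* %10000000 */
    assert(fn_address(0x87) == 0x08 && fn_argc(0x87) == 1); /* %10000111 */
    assert(fn_address(0xFA) == 0xF0 && fn_argc(0xFA) == 2); /* %11111010 */
    assert(fn_argc(0xFD) == 3);                             /* built-in "if" */
    return 0;
}
```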