Homebrew CPU construction / házi készítésű processzor

Friss topikok


Intefacing with the Functional CPU

2014.07.29. 22:32 Budapesti álmodozó

Now let us take a look at how the functional CPU (FunCPU for short) can be programmed in reality, from the programmer’s point of view.



This will be covered by looking at each gizmo  (i.e. button, switch, LED, etc.) on the front/control panel.

On the bottom right side, the purpose of the power switch is self explanatory.

On the  bottom left side,  the switch is used to select between run and edit mode. In edit mode, the functions and/or expressions can be entered/modified/displayed. The computation can be initiated by pressing start button, only when the edit-run switch is turned to run mode.

The third switch is provided to switch between editing functions and expressions.

The two rotary switches next to it, are used to enter the 8-bit physical address in hexadecimal mode. The current physical address (both in edit or run modes) are indicated by the 9  address indicator LEDs. The most significant, the 9th bit LED is on, when expressions are edited in edit mode, or when in run mode the expression symbol is selected by the CPU.

Similarly, the current data byte stored at the actual address is displayed by the LEDs labelled Data.  Under these LEDs the eight switches are provided to enter or modify data in binary mode.

Store button (only working in edit mode) is actually used to enter the 8 bit value in either to function memory or expression memory depending on the position of the Func/Exp switch. When the store button is released, the previous data byte displayed by the 8 data LEDs, is overwritten, and the new value is displayed.

The most significant bit (8) denotes whether the value is a function (LED is on) or a constant (or possibly an argument) (LED is off). The following 5 LEDs as suggested by their corresponding label give the physical address of the user-function. The actual address can be obtained by multiplying this 5 bit value by 8 (or simply shifting it left three times). The last two LEDs encode the arity of a user-defined function: 00 , 01, 10, 11 represent arity 4, 3, 2 and 1 respectively.

Finally, some important values in binary form:

  • 0000 0000 – zero
  • 0111 11xx – argument 1, 2, 3  or 4 (in function mode)
  • 1111 1100 – dec function
  • 1111 1101 – if function
  • 1111 1110 – inc function
  • 1xxx xxaa – user-defined function at address %xxxx x000 with arity aa, provided that it is different from the code of the 3 built-in functions described above
  • 1111 1111 – end of expression / function

Current machine state is indicated by the four LEDs labelled as State. The initial, starting state is 0000, whereas 1111 denotes the succesfull termination of the computation. In the latter case, the result of the computation is displayed by the data LEDs in binary model.

Szólj hozzá!

Címkék: functional programming Homebrew CPU funCPU control board

FunCPU - Passing Arguments

2014.07.23. 20:13 Budapesti álmodozó

It is not trivial how to handle argument passing in the lack of stack, or any dedicated storage for parameters. Consider the following examples:

fac (n):= n*fac(n-1)



I wanted a simple solution, which still works without posing serious restrictions. I wanted arbitrary function calls with arbitrary depths.

I have come up with a very simple solution. The evaluation of a function is deferred up to the point when all of its parameters become constant. Referring the the term fac(add(2,3)), first add(2,3) is reduced to 5. Only after that is the fac function reduced by applying the previous result as an argument, that is, fac(5), which in turn is reduced to 120.

In similar fashion the evaluation of the second example happens.

First, fac(2), fac(1) (this appears twice) are reduced. Please note that there is a lot of potential to exploit parallelism and for optimization. These expressions can be reduced in paralllel, furthermore fac(1) should be calculated only once.

In the second step fac(fac(2)+fac(1)) is evaluated. Note: this time we have the constant value both for fac(2) and fac(1). In the third step add is performed, finally ack is executed.

Szólj hozzá!

Címkék: functional programming FunCPU Homebrew CPU argument passing

FunCPU - Parenthesis, Precedence

2014.07.22. 17:03 Budapesti álmodozó

When evaluating expressions, the CPU must exactly know how to evaluate an expression, what function/expression parts should be evaluated first, next, and last etc. Therefore the representation of expressions and functions must be unambigous with respect to parenthesis and priorities.

In mathematics, by convention some functions take precedence over other functions (e.g. multiplication has higher precedence than addition). To override such precedence, parenthesis can be used. See, for instance:

4+5*6 = 34

It is natural to think of introducing then parenthesis to make the meaning of expressions exact. However, this could result in additional complexity for evaluation, not to mention the fact, that parenthesis must be encoded by extra symbols.

Something simple was required, which could be implemented with the minimum hardware. So I have decided on using the Polish prefix notation. The above two expressions thus become the following respectively:
+ 4 * 5 6 = 34
* 6 + 4 5 = 45

As long as the arity of functions are fixed and known in advance, this notation represents the expressions unambiguously.

To implement this in hardware a simple 4 bit counter is used. The scope of a function then can be determined by using this counter as follows:

  • Counter initial value will be the arity of the function of which scope is to be determined.
  • If a constant is found, then counter is decreased by one.
  • If a function with arity n is found, then the counter is increased by n-1.
  • If counter reaches zero, then this means it was the last symbol in the function scope.

The following example demonstrates the use of the counter to determine the scope of the first „fac":

if 2 fac + fac 2 fac 1 if ... ac=1
if 2 fac + fac 2 fac 1 if ... ac=2 =1-1+2
if 2 fac + fac 2 fac 1 if ... ac=2=2-1+1
if 2 fac + fac 2 fac 1 if ... ac=1 =2-1
if 2 fac + fac 2 fac 1 if ... ac=1=2-1+1
if 2 fac + fac 2 fac 1 if ... ac=0=1-1

Please note that actually this is achieved in hardware a little bit differently. As arity is encoded in two’s complement, counter is increased by one, if a constant is found, and it is decreased when a function in scope is encountered. Also its initial value stores the function arity in two’s complement.

The choice of a 4 bit counter seems to be rather limited, but it is in balance with other hardware figures (memory, registers, etc.). Given a function f with 4 arguments, due to the 4 bit limitation the term
f(f(f(f(... is still fine (counter=4+3+3+3=13), but
f(f(f(f(f(... cannot be evaluated (counter=4+3+3+3+3=16), however,
f(f(f(f(const,f(... is ok again (4+3+3+3-1+3=15).

Szólj hozzá!

Címkék: functional programming FunCPU Homebrew CPU parenthesis operator precedence

FunCPU - Evaluation Strategy

2014.07.22. 16:52 Budapesti álmodozó

It is not trivial how expressions should be reduced, i.e. evaluated. I have already mentioned that while some strategy could be fruitful in reducing an expression in a given interpretation, others may be not.

Let us have a look at the following simple example of evaluation of the factorial function is defined under the alias fac.
fac(n):= if n=0 then 1 else n*fac(n-1)

The reduction of fac(1) may lead to

if 1=0 then 0 else 1*fac(0)
If 1=0 then 0 else 1*(if 0=0 then 1 else 0*fac(-1))
If 1=0 then 0 else 1*(if 0=0 then 1 else 0*(if -1=0 then 0 else -1*fac(-1))) .......

We can easily see that the reduction goes on forever, it never terminates.

The goals of the evaluation strategy are as follows:

  • Must terminate (simple applicative rules may result in infinite loop)
  • Must be efficient (as possible).
  • Must be simple (suitable to be represented in hardware directly).

The following strategy will be applied:

A user-defined function is reduced if and only if all of its arguments are constants. Similarly, inc and dec are reduced only, if their arguments are literals. As far as if-then-else is concerned: if its condition part is constant 0 (i.e. true), then it is reduced to exp1. If the condition part is a constant other than 0, then it is reduced to exp2. Othewise the reduction is postponed.

Corollary: all parameters passed to functions are constants.

The reduction algorithm is as follows:

  1. Scan input expression symbol by symbol.
  2. Copy symbol from source to output if symbol is not a function, or its a function but cannot be reduced in its current form (i.e. contains at least one argument, which is not a constant).
  3. Evaluate functions (if, inc, dec, and user-defined) if possible. That is, rewrite the function call by replacing the call with the function definition with bounded arguments.
  4. If EOX symbol is found, then the whole cycle restarts. Note: in the next cycle: the current output becomes the input and vice versa.
  5. Stop, if the first symbol is a constant. This literal is the result of the reduction.

It is clear that inc, dec terminates immediately, thus reduces the length of expression. The expression size also shrinks when in the expression „if cond exp1 exp2” cond is a constant. Eventually everything is based on/can be reduced to the three built-in functions. Therefore sooner or later, all „if”, „inc”, „dec” are evaluated/reduced, thus the evaluation terminates (if this is possible). 

Szólj hozzá!

Címkék: functional programming FunCPU evaluation strategy reduction strategy reducing expressions

FunCPU - Memory Models

2014.07.21. 14:05 Budapesti álmodozó

Needless to say that not only initial and final expression, but also temporary expressions should be stored in memory somehow. The way of how expressions are represented has impact on hardware design.

During reduction the cycles, expression symbols are being copied one by one. When a function symbol is found, then the symbol itself along with possible other consecutive symbols are rewritten in accordance with the function definition (this is used in a broader sense, and includes definition of built-in functions).

Please note that the following discussion is related only to expression evaluation, as user-defined functions are stored in a separate memory, possible in ROM.

Separate source and destination memory

Separate fragment of memory is used for source and destination expressions. Upon each reduction cycle the role of the memories are swapped. Source memory will become destination and vice versa. This is illustrated in the figure below. 



If l is the longest possible size (in memory cells including EOX) of the expression during the evaluation, then the minimum memory requirement is 2xl cells.


Shared memory, with index increasing and decreasing

A shared memory can also be used. In this case source index is increasing, whereas destination is decreasing while copying symbols. After each evaluation cycle, the role of the indexes will be swapped again. This memory model may be more econimic, since in some cases it will be suitable to reduce expressions over half of the memory size, provided that the reduced expression becomes smaller, satisfying the following equation, which in fact should hold in every cycle: e+r≤m, where e, r and m denote expression size (excluding EOX), reduced expression size and memory size respectively.


Shared memory with index following

The last model is also comparable to the previous one in econimical terms, as the same equation among expression lengths and memory size must hold. However, it has the additional benefit, namely that index roles do not need to be changed. This simplifies hardware design, as multiplexers (swapping the source and destination registers) and its relevant control signal and logic can be avoided.  As source index reaches the end of input expression marked by the EOX symbol, it should just copy and step over that symbol and the continue with the next reduction cycle. If memory end is reached, then indexes will wrap around starting from a lower location.



Szólj hozzá!

Címkék: functional programming FunCPU homebrew cpu memory model

FunCPU - Function Encoding Schema

2014.07.18. 22:43 Budapesti álmodozó

In the previous post we have seen how 8 bit symbols represent literals, arguments, functions, etc. Similarly, it was vital to have a good and efficient function encoding. Basically in the context of function encoding we need to be able to answer the following question:

  • Where does the function definition begin?
  • Where does it end?
  • How many arguments does a function have?

First, I have planned to use function identifiers as references to physical memory locations. So, for instance, functions will be indexed by starting from 0. But this indirection could cost a lot in terms of hardware.

Instead, I am using the actual symbol to determine the function address directly. The most significant bit itself signals whether the symbol in question is a function (it is a function, if it is on, except for EOX). The consecutive 5 bits multiplied by 8 will actually give the function physical address. For example, %1aaa.aaxx - function definition will start at physical binary address %aaaa.a000.

The observant reader may be wondering what the two least significant bits are used for. They represent the argument count encoded in two-complenent form as follows:

  • %00 - 4 arguments
  • %01 - 3 arguments
  • %10 - 2 arguments
  • %11 - 1 argument.

Again, this rather strange encoding is selected to facilitate the processing in hardware.

 In the light of the aforementioned, a full function encoding works as follows:

  • %10000000 - is a user-defined function with 4 arguments, of which definition starts at address $00.
  • %10000111 - is a user-defined function with one argument with definition beginning at address $08.
  • %11111010 - is a user-defined function with 2 arguments definition at address $F0.

Please also note that built-in function "if" is encoded in accordance with the above arity encoding, i.e $FD (symbol encoding "if") represents a 3 argument function. I should mention that the encoding of inc/dec does not conform to this, but they are being treated differently as we will see. 

Szólj hozzá!

Címkék: functional programming FunCPU Homebrew CPU function encoding

FunCPU - Tagged Architecture

2014.07.16. 22:03 Budapesti álmodozó

In the memory, different type of symbols are stored together. It is essential for the CPU to be able to differentiate among them. Therefore type information is required to be assigned to each unit of information. This piece of information is tagged to each symbol. A lot of effort has been dedicated to come up with an optimal encoding. I wanted to have an encoding/tagging scheme, which was very economical and yet efficient. 

I have decided to use a single 8 bit RAM/ROM for memory, so everything (including the actual data and its type information) must fit there. With 8 bits of data I could encode the following information:

  • 7 bit literals
  • built-in functions
  • user-defined functions
  • expression terminator.

The latter is used to mark the end of an expression (essentially the core of the functional program) or a function.

I have further classified the 8 bit symbols with respect to the above categories as follows:

  • zero constant, which also represents True;
  • other constants
  • arguments;
  • inc function
  • dec function
  • if function
  • user-dedinable function
  • expression terminatior (EOX).

This classification, as we will see, will heavily facilitate the processing of the expressions.


Szólj hozzá!

Címkék: functional programming funcpu tagged architecture homebrew cpu

FunCPU - Computational Model

2014.07.13. 22:40 Budapesti álmodozó

As the computational model will be purely functional without any imperative elements (which could introduce side effects), a simple expression rewriting system will be adequate.

User-defined functions will make up a library, which will in turn, also define the rewriting rules. Functions from library may refer to each other (i.e. one function can be used in another).

The expressions to be evaluated can be made of:

  • built-in functions;
  • user-defined functions (as stored in the library);
  • literals. 

EOX symbol will mark the end of such exrpession. Program execution will be not other, than evaluation of a closed expression (it is called such, because it may not contain any variables or quantifier) following the rewriting rules (as defined in the library) transforming an expression into a chain of expressions, and finally into a constant.

If it is possible to reduce the initial expression into constant in finite steps, then the evaluation stops succesfully, and we say that the expression is defined. Otherwise we say that the expression is undefined. The latter may happen

  • if expression or function deifinitions are not well formed, that is, they do not obey syntactical rules (e.g. a function is called with less or more arguments, than as its definition states,etc.);
  • if an infinite loop is encountered (via recursion);
  • if the particular evaluation strategy causes non-termination. This implies that there may co-exist other evaluation strategies, such as that provided that they are followed, the same expression with the same library (also called interpretation) can be reduced succesfully.

 For example:

f(x,y) :=x+y
g(x,y,z) :=f(x,y)-f(y,z)

Closed expression

g(3,2,1)+f(2,2) = ( f(3,2) – f(2,1) ) + (2+2) =( (3+2) – (2+1) )+ (2+2) = 6

Please note that as a function may refer to itself, it enables recursive function definition, and hence recursion. Along with function composition and atomic functions (inc, dec, if) it gives a real computation power to this simple model.

Szólj hozzá!

Címkék: expression evaluation functional programming funcpu rewrtiting computational model

FunCPU - Instruction set

2014.07.10. 21:42 Budapesti álmodozó

FunCPU will operate with 7-bit numerals (to understand why see the next couple of posts) and will directly support the following built-in functions:

inc(x) is used for incrementing a value, that is for any x: x+1=inc(x).

dec(x) is used for decrementing a value in the similar fashion.

if-then-else is used for conditional evaluation. if cond then exp1 else exp2 will yield exp1 if and only if cond is true, otherwise it will yield exp2 (assuming that evaluation of the condition terminates).

In addition to the above built-in functions, user definable functions may be entered arbitrary in the available 32 slots. These functions may use the built-in functions as well as any other user-defined functions. Functions can be embedded one into other at arbitrary depth. Functions can be created with arity 1, 2, 3 and 4. Constant functions still can be constructed, but they must have at least one argument. Thus, for example, if F(x) is a one-argument constant function, then F(x)=c for any x, where c is a constant.

Functions will only accept numeral constants as arguments (no higher order functions are supported), and will return also numerical value. Note: „0” represents True, any other value is considered to be False by convention.

By design, no advanced data structures (i.e. list, record, object, etc. ) will be supported natively. 

At first it may seem suprising that with this very limited instruction set (and of course with the poweful construct of function embedding), any computable function can be defined and thus computed within this model, provided that its encoded algorithm (and also all expressions as a result of the function evaluation) fits in the memory. 


Szólj hozzá!

Címkék: functional programming FunCPU Homebrew CPU built-in functions

FunCPU - Goals

2014.07.09. 14:28 Budapesti álmodozó

It has been a long time since my last CPU project, which focused on a very simple 8 bit CPU. Nothing groundbreaking, just a very simple variation of an accumulator machine. 

This time I have set out a different, more ambitious goal. I have wanted to create still something simple, but original unconventional architecture, which is still useful. 

I had considered to target a CPU optimized for logical programming, but finally I have decided on a CPU with an instruction set and architecture, which natively supports functional programming.

A tagged architecture is selected. Basically, this means that each unit of information carries its own type information. A special instruction set will be carefully selected, which would be most useful to support creation of functional programs in assembly or even in machine code. The whole design will be very simple, yet powerful enough to perform arbitrary complex computations (in theory) to be Turing-complete.

It will have 

  • no PC;
  • no SP or stack;
  • no conventional user readable flags;
  • no user readable registers (not even an accumulator in the conventional sense);
  • no branching/jumping instructions;
  • no I/O operations

So, what is what we have?

  • 7 bit literals (yep, not mispelt!);
  • 3 built-in,
  • 32 user-definable functions;
  • ROM – to store functions (256 bytes);
  • RAM – to evaluate expressions (256 bytes);
  • 8 bit data bus;
  • 8/9 bit address bus.


Szólj hozzá!

Címkék: cpu homebrew functional programming unconventional Turing-complete

Homebrew CPU Micro-Sequencer

2012.06.16. 19:18 Budapesti álmodozó

So far only the static part of the CPU has been subject to discussion. Now we are about to define the most interesting part: the control unit. The control unit in our case is basically a micro-sequencer, which is responsible for triggering the right control signals at the right moment and some logic gluing everything together.

The table below describes how the 16 operations are implemented by the micro-sequencer. For each instruction a number of steps is required. Some operations are made up of less steps, others comprise more. The most complex operation is the JSR, which contains four steps. The shortest, simplest one is NOP, which is essentailly a one-cycle operation. 

step 1 step 2 step 3 step 4




RR=RAM Required






(not taken)

same as above


(not required)

JSR same as above








same as above





ROR same as above





same as above





CMP same as above






LDA same as above






STA same as above






CLC same as above



SEC same as above



NOP same as above EOP


Each step requires one processor cycle. EOP (stands for end of operation) denotes end of a micro instruction sequence. When executing EOP, the micro-sequencer restarts immediately from step 1, and a new operation is fetched/selected, thus operation length (in terms of CPU cycles) must be calculated by ignoring steps with EOP.

For each step a number of control signals are activated as shown in the table above along with their respective operations. These signals control the registers and the ALU. 

The first step is the same for all operations. In this step the new instruction code is fetched to the intstruction register and also the upper four bit is fetched to the M register. Please note that to save steps as well as cycles (otherwise an extra step would have been required to increment PC), PC is incremented prior to instruction fetch. This has the consequence that jump instructions (JMP, JNC, JNZ, but not JSR) must point to target address-1, since after jump being executed, before the next instruction is fetched the PC is incremented. This small inconvenience can be hidden/cicrumvented by a good assembler. 

Please also pay attention to how and when (which phase of the cycle) control signals are triggered:

Let us have a look at some examples, which may help understand the micro-sequencer logic better. For instance, LDA works as follows:
step 1 - to fetch instruction as described above.
step 2 - PC is further increased. Then in the second phase of this cycle M register is set. By the end of this step M register contains the 12 bit address of the operand.
step 3 -  by signalling ADR_PC_IRM=0 means, that the address multiplexer now selects M register over PC. So content of M is directed to the address bus. In the second phase of this step the data is read from the databus and accumulator is set (A_SET), at the same time Z flag is set (Z_SET) to indicate whether or not the accumulator now has zero value. 

JSR is implemented as follows:
step 1 - fetch
step 2 - same as in case of LDA
step 3 - M is selected as a source for the address multiplexer. According to the control signal definitions higher 4 bits of PC prefixed with "F" (code of JMP instruction) is selected via the data multplexer and written to the memory in the second phase of this cycle. 
step 4 - M is incremented (actually just the LSB bit is set to 1). Lower 8 bits of PC is written to that address in second phase of this cycle. Also in the second phase, the remaining part of the PC (that is the lower 8 bits) is set to the new location defined by M register.

Note that subroutine address must start at an even address. Two bytes must be reserved at the start of the subroutine. The return address of the caller will be saved here. The subroutine itself starts at address+2. To return from the subroutine one must jump to the subroutine entry address (to where the caller location was stored), from where another JMP is issued to return to the caller. To illustrate the whole calling mechanism let us analyze the following code snippet:

loc: JSR  sub -- call subroutine "sub"
....                 -- next instruction is at loc+2

sub: NOP
.......                 -- first real subroutine instruction at sub+2

       JMP sub-1 -- essentially an RTS

When the return address is stored, then PC points to the location loc+1, from where the subroutine location sub was fetched. So loc+1 is saved to sub and sub+1 (two bytes used). When JSR micro-sequence is finished, then PC will have value of sub+1. This is fine, since in the course of the next instruction fetch first PC is incremented, thus effectively fetching instruction from sub+2. When returning, JMP stored at sub will point to loc+1, which is also fine because of the same reason.  

The cost of a subroutine in terms of CPU cycle can be given by:
4 cycles (to perform JSR) and another 6 cycles (two jump instructions), which sums up to 10 cycles. 

Szólj hozzá!

Címkék: cpu homebrew homebuilt micro-sequencer

Arithmetic and Logical Unit (ALU)

2012.04.25. 21:25 Budapesti álmodozó

The following diagram depicts the ALU along with the A and S registers:


ALU implements 8 arithmetical/logical functions. The 3 least significant bits of operation code determine the ALU function (ALU2..0) . As already mentioned input of A register is not directly connected to the databus, therefore identity function (ID) is used to actually bypass the ALU and load A register from memory. LDA (%1000) uses this ALU function when loading accumulator from memory.

Please note that ROR and ID only have one operand, which is the A register and data from databus respectively. Whereas all other functions need two operands: accumulator and 8 bit data coming from the databus via the XOR gates. 

Almost all ALU functions (except compare, substract, addition which are executed by the same unit/circuit, hence only the result of one of them is available at a given time) are performed in parallel, even if the operation being executed is not a logical/arithmetic opearation (e.g. conditional jump, load or store, etc.). However, the actual result is selected by the multiplexer, which is controlled by the ALU function code. The diagram above may indicate that there are two output lines from the multiplexer, but in reality there is just one 8 bit line connected to both data input of accumulator and the NOR gate. 

 Code   ALU function 

Data from bus

000 Identity  Yes ALU is basically bypassed. 
001 rotation  Yes
010 logical bitwise AND  No
011 logical, bitwise OR  No
100 compare  Yes basically add function is used, but with memory data being complemented (ones' complement created) 
101 substract  Yes Works exactly like compare. The only difference is that in case of compare only status register (C and Z flags) are updated, whereas substract also stores result in accumulator.
110 addition  No
111 logical bitwise XOR  No


Ones' complement is generated by setting ALU_SRV_CNV control line to high. The complement function is perfomed by the XOR function, which is feeded by the data bus. The ALU_SRC_INV line is high whenever ALU1 is low, but always low if MSB of instruction code is high. It is worth noting that a slightly modified design could have led to a little simpler (physical) implementation. Namely, if we let any ALU function bit (ALU2..0) directly control the ALU_SRC_INV line without the control signal being inverted. But I wanted to have op-code %0000 reserved for NOP. Naturally an off-the-shelf ALU chip (actually two) could have saved quite a number of components, but I wanted to build the ALU on my own.

When performing substraction, the two's complement of operand (other than accumulator) is generated by setting carry flag to high and at the same time negating the operand bitwise (ones' complement).

Compare (CMP) works just like substration with the difference, that result is thrown away, and not written to accumulator. Only zero and carry flags are affected.

Rotation can be simply done by wiring input lines accordingly, that is, no shift register or any kind of other component is necessary.

A 8 input NOR gate is used to detect if the result is zero. Please notice all the other logical gates are two input gates.

Szólj hozzá!

Címkék: cpu unit homebrew alu logical arithmetic

Control Signals

2012.04.23. 23:44 Budapesti álmodozó

The following table depicts the control signals used to control the CPU components (registers, multiplexers, ALU, etc.):


Phase  Description
PC PC_INC  1 increments program counter
PC PC_SET  2 sets program counter to a specific value
A A_SET  2 sets accumulator
I I_SET  full sets instruction register





increments M register (basically just ORing its LSB with the control signal)

Sets M register

Z Z_WRI  2 writes Z flag accordingly after aritmethic operation
C C_CLR_N  1 clears C flag
C C_SET_N  1 sets C flag
C C_WRI  2 writes C flag accordingly after aritmethic operation
Address Dest.
ADR_PC_IRM  full Directs 12 bit PC or 12 bit M to address bus
Data Dest.
DAT_OUT_SEL1  full Selects among A register, low 8 bit of PC and upper 4 bit of PC prefixed with JMP code
Data Dest.
DAT_OUT_SEL2  full

SEL1 SEL2 Selected
  0    0      A reg.
  0    1      A reg.
  1    0      PCH
  1    1      PCL


Data Dest.
DAT_OUT_DIS  full disables data output from data bus
ALU ALU0  full selects ALU function
000 identity
001 ROR
010 AND
011 ORA
100 ADD
101 SUB
110 CMP
111 XOR
ALU ALU1  full
ALU ALU2  full
ALU ALU_SRC_INV  full invert ALU source coming from data bus?
1 invert source
0  do not invert source
RAM RAM_OE_N  full disables RAM output
0 do not disable
1 disable 
RAM RAM_WRI_N  2 write data?
0 write
1 read 
any register MR or MR_N  n/a master reset


A very simple two phase clock signal is used to shorten micro-instruction cycle length. Value in phase column refers to when the signal in question is applied. Whether during the entire cycle (full), only during cycle 1 or cycle 2, or it is not applicable (n/a).

Szólj hozzá!

Címkék: control signal cpu homebrew bus

Homebrew CPU Architecture

2012.04.21. 23:21 Budapesti álmodozó

Based on operational concept I have sketched the CPU architecture as follows:

The block diagram below (hopefully) clearly illustrates what this architecture is capable of. Which is not really much, but it is good enough to support the operational concept.


It is a good starting point to have an overview of the architecture in form of a diagram, when designing any relatively complex system. Many pitfalls can be eliminated just by looking at the diagram and double-checking whether or not it conforms to the operational concept.

The address bus selects its source via a 12 bit multiplexer. When the next instruction (or operand) is fetched, the PC is directed through the multiplexer. 

Since the databus is 8 bits wide, always one byte information is read from the memory.

During instruction fetch, the higher 4 bits go into the I register, whereas the lower 4 bits are sent to the upper 4 bits of M register. The whole content (representing the instruction code) of I registers goes to the control unit. Please note that control lines are not indicated in the diagram.

When instruction without operand (NOP, ROR, CLC, SEC) is loaded, then no other fetch is executed. (Note in this case the lower 4 bits are ignored). Otherwise, another byte from the databus is loaded to the lower 8 bits of M register.

When performing arithmetic operations the 12 bit M register is selected by the address multiplexer and data available at the particular address will be present on the databus. Also when executing unconditional jump, conditional jump with condition fulfilled, or jump to subroutine then content of M register is written to the PC via the multiplexer and the address bus.

In addition to the logical and arithmetical operations depicted in the table of operational concept, ALU can act as an identity function, this is in fact, how A register is loaded, since A has no direct input other than from ALU.

It is also worth mentioning that one of the ALU sources could have been the output of the multiplexer, instead of A register. This could obviously lead to a more flexible design (at the cost of adding another couple of tri-state buffers), but this is not really required with the instruction in mind as described earlier. 

Generally, ALU is not restricted to performing aritmethic operations on accumulator solely. Many CPU architecture uses ALU , for instance, to increment PC as well. In our case ALU and PC are incompatible in size, therefore I opted for not using ALU to increment PC. PC will be physically implemented as a counter, being capable of inrcementing itself on its own. This way we may also exploit more parallelism (i.e. PC can be incremented, while ALU operation is performed on A register), this may result in shorter micro-instruction cycles.

Data multiplexer selects among accumulator, lower 8 bits of PC, higher 4 bits of PC prefixed by #$F, which happens to be the instruction code of unconditional jump. When storing return address before a subroutine call, 12 bit PC is saved along with the jump code. 

I must mention that some of the details in the diagram were created already having a concrete implementation in my mind. When content of M register is directed through the multiplexer, the LSB bit is the logical OR of LSB bit of M and a control signal. This comes handy while storing return address at subroutine entry point. Basically with the OR gate provided, we can "increment" the M register at the cost of a small inconvenience: every subroutine must start at even address. The benefit is to save some components/data paths.

Szólj hozzá!

Címkék: cpu architecture homebrew hommade

Operational Concept

2012.04.20. 11:56 Budapesti álmodozó


So here is the operational concept for my CPU.

- 12 bit PC (program counter)* 
- 8 bit A (accumulator)
- 4 bit I (instruction register)*
- 12 bit M (memory address register)*
- 2 bit (status register with Carry and Zero flags)

*Registers are marked with asterisk are internal registers, and as such are not accessible for the programmer.

Hence the address bus and data bus are 12 and 8 bit wide respectively. Address space is from $000 to $FFF (4 Kb). It also should support 8 bit I/O for communication.

Code   Mnemonic   Description  Operand   Flags  affected
0 NOP No operation none none
1 ROR simple rotate right operation. Note carry flag is ignored. Bits will cycle/shift in cicrular, so MSB will have the value of LSB after rotation. none Z
2 AND Logical AND  12 bit address Z
3 ORA Logical ORA 12 bit address Z
4 CMP compare accumulator and memory data . Carry flag must be set beforehand. 12 bit address C, Z
5 SUB substract memory data from accumulator. Carry flag must be set beforehand. 12 bit address C, Z
6 ADD add accumulator and memory data. Carry flag must be cleared beforehand. 12 bit address C, Z
7 XOR Logical exclusive or between accumulator and memory data. 12 bit address Z
8 LDA Load accumulator from memory. 12 bit address Z
9 STA Store accumultor in memory. 12 bit address none
A CLC Clear carry flag none C=0
B SEC Set clarry flag none C=1
C JNZ jump if not zero (if taken, program continues at address+1 afterwards) 12 bit address none
D JNC jump if not carry set (if taken, program continues at address+1 afterwards) 12 bit address none
E JSR jump to subroutine (address must be even) 12 bit address none
F JMP jump uncondionally (program continues at address+1 afterwards) 12 bit address none


1. All arithmetic operations are performed between accumulator and memory and their results are stored in accumulator. C and Z flags are set accordingly as depicted in the table above.
2.  Immediate values are not supported. Constant values must be stored in the memory, and references to those location must be made via a 12 bit address. Programmer must guarantee that the stored values for constant do not change in the course of the execution.
3. Although NOT operation is not available, NOT can be performed using XOR #$FF (which is essentially XOR $ADR, where [ADR]=$FF) 
4. It is interesting to note that Carry must be set beforehand when comparing 8 bit values. 
5. The observant reader may wonder, that there is a JSR instruction, whereas there is no stack and there is no RTS (return) statement either. Still subroutines are supported with some restrictions in a way similar to PDP-7 or PDP-11 (not sure which). When calling a subroutine the caller address is stored at the subroutine entry. Subroutine code starts at position 3 (2 positions are required for storing return address with JMP statement). To return from subroutine a jump must be issued to subroutine position 0. Limitations: Clearly, recursion is not supported. Subroutine (entirely) cannot be placed in ROM.
6. I/O is memory mapped, as there is no dedicated I/O operation.
7. As there is no support for indexing, indirection, useful code segments required to sweep through some memory must be stored in RAM.
8. Only rotation to right is included, since rotate to left is superfluous, as it can be circumvented by adding a value  to itself.
9. 16 (or nx8) bit arithmetic is supported in the ususal way. For example, 16 bit addition can be written as:

Where VR (pair of VRH and VRL) is the result of V1 (VH1, VL1) + V2 (VH2 , VL2).

I believe that with the instruction set highlighed above, useful programs can be constructed in the memory space (4 Kb) provided.

Szólj hozzá!

Címkék: homemade concept cpu homebrew built operational

Homebrew CPU Alternatives

2012.04.18. 22:22 Budapesti álmodozó

Many kind of homebrew CPUs have already been designed and produced. From 4 bit to 32 bit processors. Some of which is capable of interrupt handling and running unix-like operating systems. Quite an achievement for such home project.

Being aware of my electronic skills, or better put the lack of them, I immediately realized that I want something very simple. Since the more complex CPU one wants to build (with more operation, more registers, etc.), the quicker it becomes difficult its construction.  


I must mention I have been educated on this subject, namely computer architectures in college. I more or less have a high level concept of the basic principles how processors operate. I have also had lectures on advanced computer architectures, but clearly my project would not reach a level, where the latter knowledge may be proven useful.


Initially, I wanted the CPU to target the so-called L programming language , used in  computability theory. L is a very simple language, yet Turing-complete. It operates with variables storing positive integers only. It has three command:


  • Increase the value of a variable.

  • Decrease the value of the variable, provided that the value is greater than zero before decrementing it.

  • Jump to a label.


My initial design consists of operations:


  • INC M - increment memory by one

  • DEC M - decrement memory by one, provided that R is greater than zero before the operation

  • JNZ – jump if the last referenced memory cell is not zero

  • JZ – jump if the last referenced memory cell is zero.



Soon I have realized that with the architecture/hardware components (or the slight modification and rearrangement of them) targeted to implement L, a more complex and useful architecture can be built, which can be conveniently programmed despite its simplicity.



So I decided to come up with a processor only with 4 commands, but commands being different from the ones above. These are:


  • MOV [DR],[SR] – move data the source location identified by the value stored at SR, to the destination location identified by the value stored at DR. 

  • ADD [DR],[SR] - add source value to destination value and store it as destination value  (for interpretation of [DR], [SR] see MOV sematincs above)

  • NOR [DR],SR - same as ADD, except logical operation NOR is performed.
  • JNZ , JCC     - jump if not zero, jump if clarry clear

Where SR and DR denotes source and desitnation register respectively. The name source and desctination register may be misleading, as they are actually memory addresses. 

Please note that [] denotes (double) indirection.

It is also worth noting that as the operational concept reveals, no register is actually accessible for the programmer. One of its consequences is that almost every operation requires accessing memory a number of times. Thus micro-instructions become long. Depending on the actual architecture implementation ADD can be as long as 7 or 8 cycles. 

Constants must be stored along with the programs, as immediate values cannot be fetched (no such instruction and/or addressing mode). Furthermore, accessing a constant requires defining another (constant) pointer, which is pointing to the constant location. So it is quite cumbersome to program in the assembly language defined by these four operation, but it is not impossible. Suprisingly some tasks can be done quite efficiently, e.g. copying data from one location to another. 

Despite its unorthodox approach, or maybe because of it, I found interesting and poweful enough this concept. 

However, more problem arises when 8 and 12/16 bit arithmetic come into play. I wanted to have 8 bit databus and 12, maybe 16 bit wide address bus and I also wanted to have 8 bit registers (I talk about internal registers. Note that although accumulator register is hidden from the programmer still required.) I have come up a number of solutions to overcome the problem: referencing data via 8 bit registers on a 12 bit wide address bus. A separate data segment register was introduced to agument the 8 bits with additional 4 bits to have a total of 12 bits. 

Clearly the issue can be eliminated if that data and address buses are compatible, ie. they are both 12 or 16 bit wide. 

Anyway, the whole thing was getting more and more complicated. Finally, I had to drop the idea of going on with this concept.

I sketched a more conventional operational concept. To be honest quite a number of them. They all share the same property, that they support accumulator-based computation and the instructions sets look quite similar to those of MOS 6502, although many addressing modes and X and Y registers are missing.  

I started to search for an optimal instruction set, which can be encoded in four bits. 







Szólj hozzá!

Címkék: cpu homebrew alternatives

Motivation - background

2011.11.28. 14:39 Budapesti álmodozó

First I must emphasize that there are a number of different sites dedicated to producing and documenting homebrew CPUs and computers. Most of them much better than this blog.

So here is a link for the reader who might want to educate him/herself on this topic:

The question immediately arises: Why does anyone need another homebrew project?

I have noticed that I tend to have similar attributes/background to those working on these crazy projects.

I have been always fascinated with electronics, computers and programming (in this particular order). As a kid (at the age of something like from 10 to 12) I had been designing basic electronic circuits. I had not had much knowledge of electronics of course.

Once in a while I could meet my uncle who being an electrical engineer guided me. Unfortunately he lived in a different country (still does) and therefore we could meet like once for an afternoon for every two years. 

I was just rather experimenting with basic circuits, combined them and wanted to be able to build more complex and useful electronic components/devices. At that time I totally believed in myself and I though that I could build almost anything even not having any proper backgrounds, nor education on the topic.

Not just I was a true believer but I happened to be quite convincing...  

In grammar school I was designing small electronic "games" similar to those LCD handhelds, which enjoyed quite popularity in the 1980s. Of course my electrical devices were nothing but crap compared to the products available on the market in those days.

In fact, I was only able to build just one working prototype (very basic with some blinking light bulbs and two buttons).

Despite this fact, or because of it I produced catalogues of my devices (with product code, game design,  etc.) to come. I showed this brochure in school to those who were interested in owning a handheld device, but their parents were not able to afford to buy one. 

One afternoon a family of fours visited our house. Up to this point they had never been to our place. The guy was a classmate of mine.

My father welcomed them. They happened to come actually buy a product, a handheld device. Funny is not it?

They actually believed that I could produce such handheld devices and sell them. I was very embarrassed and just watching from the background. Father not knowing almost anything of my hobby and my promises, informed them kindly about where those devices are available for sale. 

Soon after they left with disappointments on their faces. 

I was something like 13, when I was given a C-64. From that point on, I gave up designing electronics and devoted my time to computers almost entirely. 

This turned out to be very useful as far as the tidiness of my room was concerned. With the C-64 I did not produce such mess, so my family was happier then ever before.

Once in the mid 1980s I had the opportunity to visit my uncle abroad. I spent like two weeks with them in the summer. They were very kind and welcoming, organizing all kind of outdoor programs/activities for me, but really all I longed for was just sitting in my uncle's working room and design electronic stuff with him.

In fact, this very summer I designed my first computer-like device. Which as far as I can recall was just based on some counter. It was supposed to be able to give a control signal each step when counting from zero to upwards.

Needless to say it never materialized. Actually it could not have worked since I just simply did not have the technical skills to design such thing.

Then I turned again away from electronics in favour of computing.

Later in college, I have had some kind of courses dedicated to electronics, but at that time I just simply did not want to learn electronics. All I was interested in programming and software development. To be honest, I actually could acquire almost no knowledge out of those courses. It is a shame... 

After being a software developer for many years, an Oracle consultant to be more precise, I turned to electronics again.

It happened exactly on 5th of October, 2011. Steve Jobs died on this day. Although I have never been an Apple fan, never paid particular attention to Apple products I was touched by hearing the sad news. It is difficult to tell what exactly I felt. It was a kind of emptiness. It was as I had lost a close relative. I guess that with Steve a whole era has died. The golden era of computing of the seventies and early eighties. 

I started to study Apple and its history. I came across with the other Steve, Steve Wozniak and his story, which gave me tremendous inspiration. I decided that I would study electronics. I searched for old books on this subject. I started reading them and refreshing my knowledge. Then I soon realized that analogue electronics is a huge and very difficult subject indeed. So I decided to turn my attention to digital electronics, which looked simpler and more promising at least in terms of how a cicruit can be designed and built. Learning by doing is an easy and fun way. That is why I decided that I will actually learn digital electronics by designing and building simple devices. 

I have then decided that the device should be a CPU. Obviously one cannot design/produce a CPU alone meeting today's standards and performance, but this is not really my goal. My goal is:

1. As already stated to learn digital electronics by doing it. That is to design and build a CPU, a very simple one, one of the simplest possible, which is still complex enough to be called a CPU. But more on this later.

2. To better understand computer architectures.

3. And last but not least having fun.

This is how this blog was born.

Szólj hozzá!

Címkék: design cpu computer homebrew