Homebrew CPU construction / házi készítésű processzor




Homebrew CPU Micro-Sequencer

2012.06.16. 19:18 Budapesti álmodozó

So far only the static part of the CPU has been subject to discussion. Now we are about to define the most interesting part: the control unit. The control unit in our case is basically a micro-sequencer, which is responsible for triggering the right control signals at the right moment and some logic gluing everything together.

The table below describes how the 16 operations are implemented by the micro-sequencer. Each instruction requires a number of steps: some operations are made up of fewer steps, others of more. The most complex operation is JSR, which takes four steps. The shortest and simplest one is NOP, which is essentially a one-cycle operation.

[Table: micro-sequencer steps per operation. Columns: step 1, step 2, step 3, step 4. The control-signal cells were images and did not survive in this copy. Surviving row labels: JSR, ROR, CMP, LDA, STA, CLC, SEC, NOP. Step 1 reads "same as above" in every surviving row, and the NOP row continues with EOP. Surviving cell notes: "RR=RAM Required", "(not taken)", "(not required)".]


Each step requires one processor cycle. EOP (end of operation) marks the end of a micro-instruction sequence. When EOP is executed, the micro-sequencer restarts immediately from step 1 and a new operation is fetched/selected; operation length (in CPU cycles) must therefore be calculated by ignoring the steps that contain EOP.

For each step a number of control signals are activated as shown in the table above along with their respective operations. These signals control the registers and the ALU. 

The first step is the same for all operations. In this step the new instruction code is fetched into the instruction register, and the upper four bits of the M register are loaded as well. Please note that, to save steps and cycles (otherwise an extra step would have been required to increment the PC), the PC is incremented prior to the instruction fetch. As a consequence, jump instructions (JMP, JNC, JNZ, but not JSR) must point to target address-1, since after the jump is executed the PC is incremented before the next instruction is fetched. This small inconvenience can be hidden/circumvented by a good assembler.
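To show how an assembler could hide the target address-1 rule, here is a minimal Python sketch. It is not part of the project: the helper name encode_jump is hypothetical, the byte layout (opcode in the high nibble, upper 4 address bits in the low nibble, lower 8 address bits in the second byte) follows the architecture description, and wrapping the address at 12 bits is my assumption.

```python
# Opcodes taken from the instruction table; everything else is illustrative.
OPCODES = {"JNZ": 0xC, "JNC": 0xD, "JSR": 0xE, "JMP": 0xF}
PRE_DECREMENTED = {"JMP", "JNC", "JNZ"}  # JSR is excluded, as stated above


def encode_jump(mnemonic, target):
    """Return the two instruction bytes for a jump to the 12 bit `target`."""
    if not 0 <= target <= 0xFFF:
        raise ValueError("target must fit in 12 bits")
    if mnemonic == "JSR" and target % 2:
        raise ValueError("subroutines must start at an even address")
    address = target
    if mnemonic in PRE_DECREMENTED:
        # PC is incremented before the next fetch, so store target - 1.
        # Wrapping at 12 bits is an assumption of this sketch.
        address = (target - 1) & 0xFFF
    first = (OPCODES[mnemonic] << 4) | (address >> 8)
    second = address & 0xFF
    return bytes([first, second])
```

With such a helper the programmer writes JMP $123 and the assembler emits the bytes for $122, so the fetch that follows the jump lands on $123.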

Please also pay attention to how and when (which phase of the cycle) control signals are triggered:

Let us have a look at some examples, which may help to understand the micro-sequencer logic better. For instance, LDA works as follows:
step 1 - the instruction is fetched as described above.
step 2 - PC is incremented again. Then, in the second phase of this cycle, the M register is set. By the end of this step the M register contains the 12 bit address of the operand.
step 3 - signalling ADR_PC_IRM=0 means that the address multiplexer now selects the M register over PC, so the content of M is directed to the address bus. In the second phase of this step the data is read from the databus and the accumulator is set (A_SET); at the same time the Z flag is set (Z_SET) to indicate whether or not the accumulator now holds zero.
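The three LDA steps can be traced with a small Python sketch. This is only a behavioural model under my own assumptions (memory as a dict of bytes, a function named run_lda, Z=1 meaning the accumulator is zero); it ignores the two clock phases within each step.

```python
def run_lda(memory, pc):
    """Trace the three LDA steps.

    `memory` maps 12 bit addresses to byte values; `pc` is the program
    counter value before the instruction is fetched.
    """
    # step 1: PC is incremented first, then the instruction byte is fetched.
    pc = (pc + 1) & 0xFFF
    byte = memory[pc]
    i_reg = byte >> 4              # instruction code goes to I
    m_reg = (byte & 0x0F) << 8     # low nibble becomes the upper 4 bits of M
    if i_reg != 0x8:
        raise ValueError("not an LDA instruction")

    # step 2: PC is incremented again, the lower 8 bits of M are loaded.
    pc = (pc + 1) & 0xFFF
    m_reg |= memory[pc]

    # step 3: M drives the address bus (ADR_PC_IRM=0); A and Z are set.
    a_reg = memory[m_reg]
    z_flag = int(a_reg == 0)
    return {"PC": pc, "M": m_reg, "A": a_reg, "Z": z_flag}
```

For example, with the bytes $83 $45 stored at $010 and $011, and PC=$00F before the fetch, the model ends with M=$345 and A holding the byte stored at $345.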

JSR is implemented as follows:
step 1 - fetch
step 2 - same as in the case of LDA
step 3 - M is selected as the source for the address multiplexer. According to the control signal definitions, the higher 4 bits of PC prefixed with "F" (the code of the JMP instruction) are selected via the data multiplexer and written to memory in the second phase of this cycle.
step 4 - M is incremented (actually just its LSB is set to 1). The lower 8 bits of PC are written to that address in the second phase of this cycle. Also in the second phase, the remaining part of the PC (that is, the lower 8 bits) is set to the new location defined by the M register.

Note that a subroutine must start at an even address. Two bytes must be reserved at the start of the subroutine; the return address of the caller is saved there. The subroutine itself starts at address+2. To return from the subroutine one must jump to the subroutine entry address (where the caller location was stored), from where another JMP is issued to return to the caller. To illustrate the whole calling mechanism, let us analyze the following code snippet:

loc: JSR  sub -- call subroutine "sub"
....                 -- next instruction is at loc+2

sub: NOP
.......                 -- first real subroutine instruction at sub+2

       JMP sub-1 -- essentially an RTS

When the return address is stored, PC points to location loc+1, from where the subroutine location sub was fetched. So loc+1 is saved to sub and sub+1 (two bytes are used). When the JSR micro-sequence is finished, PC holds sub+1. This is fine, since during the next instruction fetch PC is incremented first, thus effectively fetching the instruction from sub+2. When returning, the JMP stored at sub points to loc+1, which is also fine for the same reason.

The cost of a subroutine call in terms of CPU cycles is:
4 cycles (to perform JSR) plus another 6 cycles (two jump instructions of 3 cycles each), which sums up to 10 cycles.
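The calling mechanism can be checked with a short Python model. It is a sketch, not the hardware: the function names are hypothetical, memory is a plain dict, and only the effects of JSR steps 3 and 4 are modelled.

```python
JMP_CODE = 0xF  # instruction code of the unconditional jump


def jsr(memory, loc, sub):
    """Model JSR stored at `loc` (opcode byte) calling subroutine `sub`.

    Returns the PC value at the end of the micro-sequence.
    """
    if sub % 2:
        raise ValueError("subroutine must start at an even address")
    pc = (loc + 1) & 0xFFF           # PC after steps 1 and 2
    m_reg = sub
    # step 3: upper 4 bits of PC, prefixed with the JMP code, go to [M]
    memory[m_reg] = (JMP_CODE << 4) | (pc >> 8)
    # step 4: LSB of M is set, lower 8 bits of PC go to [M]
    m_reg |= 1
    memory[m_reg] = pc & 0xFF
    return m_reg                     # PC now holds sub+1


def next_fetch_address(pc):
    """PC is incremented before every instruction fetch."""
    return (pc + 1) & 0xFFF


def stored_jump_target(memory, sub):
    """Operand of the JMP that JSR wrote at the subroutine entry."""
    return ((memory[sub] & 0x0F) << 8) | memory[sub + 1]
```

Calling jsr with loc=$120 and sub=$200 writes $F1 $21 at $200/$201, i.e. a JMP whose operand is loc+1; the fetch after that JMP therefore continues at loc+2, and the fetch after JSR itself continues at sub+2.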


Tags: cpu homebrew homebuilt micro-sequencer

Arithmetic and Logical Unit (ALU)

2012.04.25. 21:25 Budapesti álmodozó

The following diagram depicts the ALU along with the A and S registers:


The ALU implements 8 arithmetical/logical functions. The 3 least significant bits of the operation code determine the ALU function (ALU2..0). As already mentioned, the input of the A register is not directly connected to the databus; therefore the identity function (ID) is used to bypass the ALU and load the A register from memory. LDA (%1000) uses this ALU function when loading the accumulator from memory.

Please note that ROR and ID have only one operand: the A register and the data from the databus, respectively. All other functions need two operands: the accumulator and the 8 bit data coming from the databus via the XOR gates.

Almost all ALU functions are performed in parallel, even if the operation being executed is not a logical/arithmetic operation (e.g. conditional jump, load or store, etc.). The exceptions are compare, subtract and addition, which are executed by the same unit/circuit, hence the result of only one of them is available at a given time. The actual result is selected by the multiplexer, which is controlled by the ALU function code. The diagram above may suggest that there are two output lines from the multiplexer, but in reality there is just one 8 bit line connected to both the data input of the accumulator and the NOR gate.

Code  ALU function         Bus data inverted  Notes
000   identity             yes                ALU is basically bypassed.
001   rotation             yes
010   logical bitwise AND  no
011   logical bitwise OR   no
100   compare              yes                Basically the add function is used, but with the memory data complemented (ones' complement).
101   subtract             yes                Works exactly like compare. The only difference is that compare updates only the status register (C and Z flags), whereas subtract also stores the result in the accumulator.
110   addition             no
111   logical bitwise XOR  no


Ones' complement is generated by setting the ALU_SRC_INV control line high. The complement is performed by the XOR gates, which are fed by the data bus. The ALU_SRC_INV line is high whenever ALU1 is low, but always low if the MSB of the instruction code is high. It is worth noting that a slightly modified design could have led to a somewhat simpler (physical) implementation, namely if an ALU function bit (ALU2..0) controlled the ALU_SRC_INV line directly, without the control signal being inverted. But I wanted to have op-code %0000 reserved for NOP. Naturally, an off-the-shelf ALU chip (actually two) could have saved quite a number of components, but I wanted to build the ALU on my own.

When performing subtraction, the two's complement of the operand (other than the accumulator) is generated by setting the carry flag high and at the same time negating the operand bitwise (ones' complement).

Compare (CMP) works just like subtraction, with the difference that the result is thrown away and not written to the accumulator. Only the zero and carry flags are affected.

Rotation can be done simply by wiring the input lines accordingly; that is, no shift register or any other kind of component is necessary.

An 8-input NOR gate is used to detect whether the result is zero. Please notice that all the other logic gates are two-input gates.
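The behaviour described in this post can be summarised in a small Python model. It is a sketch under assumptions of mine: the function names are hypothetical, the carry output is the raw carry of the 8 bit adder, and the function returns the multiplexer output without deciding whether A is actually written (that is up to the micro-sequencer, e.g. CMP discards the result).

```python
def alu_src_inv(opcode):
    """ALU_SRC_INV: high when ALU1 is low, forced low when the opcode MSB is high."""
    alu1 = (opcode >> 1) & 1
    msb = (opcode >> 3) & 1
    return int(alu1 == 0 and msb == 0)


def alu(opcode, a, data, carry):
    """Return (result, carry_out, zero) for the 4 bit `opcode`.

    The low 3 opcode bits (ALU2..0) select the function.
    """
    function = opcode & 0b111
    # XOR gates between databus and ALU: ones' complement when ALU_SRC_INV=1
    operand = data ^ (0xFF if alu_src_inv(opcode) else 0x00)
    carry_out = carry
    if function == 0b000:      # identity: bus data passes through
        result = operand
    elif function == 0b001:    # rotate right, done purely by wiring
        result = (a >> 1) | ((a & 1) << 7)
    elif function == 0b010:    # AND
        result = a & operand
    elif function == 0b011:    # OR
        result = a | operand
    elif function == 0b111:    # XOR
        result = a ^ operand
    else:                      # 100 compare, 101 subtract, 110 addition share the adder
        total = a + operand + carry
        result = total & 0xFF
        carry_out = total >> 8
    zero = int(result == 0)    # the 8-input NOR gate
    return result, carry_out, zero
```

For instance, SUB (%0101) with A=$10, memory data $01 and the carry set computes $10 + $FE + 1, giving $0F with the carry output high, which is the two's complement subtraction described above.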


Tags: cpu unit homebrew alu logical arithmetic

Control Signals

2012.04.23. 23:44 Budapesti álmodozó

The following table depicts the control signals used to control the CPU components (registers, multiplexers, ALU, etc.):


Component      Signal         Phase      Description
PC             PC_INC         1          increments program counter
PC             PC_SET         2          sets program counter to a specific value
A              A_SET          2          sets accumulator
I              I_SET          full       sets instruction register
M              (missing)      (missing)  increments M register (basically just ORing its LSB with the control signal)
M              (missing)      (missing)  sets M register
Z              Z_WRI          2          writes Z flag accordingly after arithmetic operation
C              C_CLR_N        1          clears C flag
C              C_SET_N        1          sets C flag
C              C_WRI          2          writes C flag accordingly after arithmetic operation
Address dest.  ADR_PC_IRM     full       directs 12 bit PC or 12 bit M to address bus
Data dest.     DAT_OUT_SEL1   full       selects among A register, low 8 bits of PC, and upper 4 bits of PC prefixed with JMP code
Data dest.     DAT_OUT_SEL2   full
Data dest.     DAT_OUT_DIS    full       disables data output from data bus
ALU            ALU0           full       selects ALU function
ALU            ALU1           full
ALU            ALU2           full
ALU            ALU_SRC_INV    full       invert ALU source coming from data bus? (1 = invert source, 0 = do not invert source)
RAM            RAM_OE_N       full       disables RAM output (0 = do not disable, 1 = disable)
RAM            RAM_WRI_N      2          write data? (0 = write, 1 = read)
any register   MR or MR_N     n/a        master reset

Data output selection:

SEL1  SEL2  Selected
0     0     A reg.
0     1     A reg.
1     0     PCH
1     1     PCL

ALU function codes (ALU2..0), matching the three least significant bits of the operation codes:

000 identity
001 ROR
010 AND
011 ORA
100 CMP
101 SUB
110 ADD
111 XOR

A very simple two-phase clock signal is used to shorten the micro-instruction cycle length. The value in the Phase column refers to when the signal in question is applied: during the entire cycle (full), only during phase 1 or phase 2, or not applicable (n/a).
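As an illustration of the data output selection, here is a minimal Python sketch. The function name data_out is hypothetical, and PCH is taken to be the upper 4 bits of PC prefixed with the JMP code $F, as described for the data multiplexer.

```python
JMP_CODE = 0xF  # instruction code of the unconditional jump


def data_out(sel1, sel2, a_reg, pc):
    """Return the byte the data multiplexer places on the data bus."""
    if sel1 == 0:
        return a_reg & 0xFF                          # SEL2 is a don't-care here
    if sel2 == 0:
        return (JMP_CODE << 4) | ((pc >> 8) & 0x0F)  # PCH: $F + upper 4 bits of PC
    return pc & 0xFF                                 # PCL: lower 8 bits of PC
```

With A=$3C and PC=$ABC the three selectable values are $3C, $FA and $BC.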


Tags: control signal cpu homebrew bus

Homebrew CPU Architecture

2012.04.21. 23:21 Budapesti álmodozó

Based on the operational concept I have sketched the CPU architecture as follows:

The block diagram below (hopefully) clearly illustrates what this architecture is capable of, which is not really much, but it is good enough to support the operational concept.


It is a good starting point to have an overview of the architecture in form of a diagram, when designing any relatively complex system. Many pitfalls can be eliminated just by looking at the diagram and double-checking whether or not it conforms to the operational concept.

The address bus selects its source via a 12 bit multiplexer. When the next instruction (or operand) is fetched, the PC is directed through the multiplexer. 

Since the databus is 8 bits wide, exactly one byte is read from memory at a time.

During instruction fetch, the higher 4 bits go into the I register, whereas the lower 4 bits are sent to the upper 4 bits of the M register. The whole content of the I register (representing the instruction code) goes to the control unit. Please note that control lines are not indicated in the diagram.

When an instruction without operand (NOP, ROR, CLC, SEC) is loaded, no further fetch is executed (note that in this case the lower 4 bits are ignored). Otherwise, another byte from the databus is loaded into the lower 8 bits of the M register.
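The fetch rules above can be sketched in Python as a decoder. This is a simplified model with hypothetical names: it takes the address of the instruction directly and leaves out the PC pre-increment of the real micro-sequence.

```python
# Instruction codes without operand: NOP, ROR, CLC, SEC
NO_OPERAND = {0x0, 0x1, 0xA, 0xB}


def fetch(memory, address):
    """Decode the instruction stored at `address`.

    Returns (i_reg, m_reg, length); m_reg is None when the instruction
    has no operand, because the low nibble is ignored in that case.
    """
    first = memory[address]
    i_reg = first >> 4                 # higher 4 bits go to the I register
    m_reg = (first & 0x0F) << 8        # lower 4 bits: upper 4 bits of M
    if i_reg in NO_OPERAND:
        return i_reg, None, 1
    m_reg |= memory[address + 1]       # second byte: lower 8 bits of M
    return i_reg, m_reg, 2
```

So the byte $10 decodes as a one-byte ROR, while $83 $45 decodes as LDA with M=$345.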

When performing arithmetic operations the 12 bit M register is selected by the address multiplexer and data available at the particular address will be present on the databus. Also when executing unconditional jump, conditional jump with condition fulfilled, or jump to subroutine then content of M register is written to the PC via the multiplexer and the address bus.

In addition to the logical and arithmetical operations listed in the table of the operational concept, the ALU can act as an identity function. This is, in fact, how the A register is loaded, since A has no direct input other than from the ALU.

It is also worth mentioning that one of the ALU sources could have been the output of the multiplexer instead of the A register. This would obviously lead to a more flexible design (at the cost of adding another couple of tri-state buffers), but it is not really required with the instruction set described earlier in mind.

Generally, an ALU is not restricted to performing arithmetic operations solely on the accumulator. Many CPU architectures use the ALU, for instance, to increment the PC as well. In our case the ALU and the PC are incompatible in size, therefore I opted for not using the ALU to increment the PC. The PC will be physically implemented as a counter, capable of incrementing itself. This way we may also exploit more parallelism (i.e. the PC can be incremented while an ALU operation is performed on the A register), which may result in shorter micro-instruction cycles.

Data multiplexer selects among accumulator, lower 8 bits of PC, higher 4 bits of PC prefixed by #$F, which happens to be the instruction code of unconditional jump. When storing return address before a subroutine call, 12 bit PC is saved along with the jump code. 

I must mention that some of the details in the diagram were created with a concrete implementation already in mind. When the content of the M register is directed through the multiplexer, the LSB is the logical OR of the LSB of M and a control signal. This comes in handy when storing the return address at the subroutine entry point. Basically, with the OR gate provided, we can "increment" the M register at the cost of a small inconvenience: every subroutine must start at an even address. The benefit is that some components/data paths are saved.


Tags: cpu architecture homebrew hommade

Operational Concept

2012.04.20. 11:56 Budapesti álmodozó


So here is the operational concept for my CPU.

- 12 bit PC (program counter)* 
- 8 bit A (accumulator)
- 4 bit I (instruction register)*
- 12 bit M (memory address register)*
- 2 bit S (status register with Carry and Zero flags)

*Registers marked with an asterisk are internal registers, and as such are not accessible to the programmer.

Hence the address bus and the data bus are 12 and 8 bits wide, respectively. The address space runs from $000 to $FFF (4 KB). The CPU should also support 8 bit I/O for communication.

Code  Mnemonic  Operand         Flags affected  Description
0     NOP       none            none            No operation.
1     ROR       none            Z               Simple rotate right. The carry flag is ignored; bits shift circularly, so after rotation the MSB holds the previous value of the LSB.
2     AND       12 bit address  Z               Logical AND.
3     ORA       12 bit address  Z               Logical OR.
4     CMP       12 bit address  C, Z            Compare accumulator and memory data. Carry flag must be set beforehand.
5     SUB       12 bit address  C, Z            Subtract memory data from accumulator. Carry flag must be set beforehand.
6     ADD       12 bit address  C, Z            Add accumulator and memory data. Carry flag must be cleared beforehand.
7     XOR       12 bit address  Z               Logical exclusive OR between accumulator and memory data.
8     LDA       12 bit address  Z               Load accumulator from memory.
9     STA       12 bit address  none            Store accumulator in memory.
A     CLC       none            C=0             Clear carry flag.
B     SEC       none            C=1             Set carry flag.
C     JNZ       12 bit address  none            Jump if not zero (if taken, program continues at address+1 afterwards).
D     JNC       12 bit address  none            Jump if carry is not set (if taken, program continues at address+1 afterwards).
E     JSR       12 bit address  none            Jump to subroutine (address must be even).
F     JMP       12 bit address  none            Jump unconditionally (program continues at address+1 afterwards).


1. All arithmetic operations are performed between the accumulator and memory, and their results are stored in the accumulator. C and Z flags are set accordingly, as shown in the table above.
2. Immediate values are not supported. Constant values must be stored in memory, and references to those locations must be made via a 12 bit address. The programmer must guarantee that the stored constants do not change in the course of execution.
3. Although a NOT operation is not available, NOT can be performed using XOR #$FF (which is essentially XOR $ADR, where [ADR]=$FF).
4. It is interesting to note that Carry must be set beforehand when comparing 8 bit values.
5. The observant reader may wonder why there is a JSR instruction when there is no stack and no RTS (return) instruction either. Subroutines are still supported, with some restrictions, in a way similar to PDP-7 or PDP-11 (not sure which). When calling a subroutine, the caller address is stored at the subroutine entry. Subroutine code starts at position 2 (positions 0 and 1 are required for storing the return address with the JMP instruction). To return from the subroutine, a jump must be issued to subroutine position 0. Limitations: clearly, recursion is not supported, and a subroutine cannot be placed (entirely) in ROM.
6. I/O is memory mapped, as there is no dedicated I/O operation.
7. As there is no support for indexing or indirection, code that sweeps through a memory range has to modify its own operand addresses and must therefore be stored in RAM.
8. Only rotation to the right is included, since rotation to the left is superfluous: it can be emulated by adding a value to itself.
9. 16 (or n x 8) bit arithmetic is supported in the usual way. For example, 16 bit addition can be written as:

Where VR (the pair VRH and VRL) is the result of V1 (VH1, VL1) + V2 (VH2, VL2).
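The original listing for this example did not survive in this copy. As a stand-in, here is a hedged Python sketch that interprets the instruction sequence one would usually write for it (CLC, then add the low bytes, then add the high bytes with the carry carried over). The sequence and the tiny interpreter are my reconstruction, not the author's listing; only the four instructions needed here are modelled, and the symbols are used directly as memory keys.

```python
def run(program, memory):
    """Execute (mnemonic, symbol) pairs on `memory`, a dict of byte values."""
    a, carry = 0, 0
    for mnemonic, symbol in program:
        if mnemonic == "CLC":
            carry = 0
        elif mnemonic == "LDA":
            a = memory[symbol]          # LDA does not touch the carry flag
        elif mnemonic == "ADD":
            total = a + memory[symbol] + carry
            a, carry = total & 0xFF, total >> 8
        elif mnemonic == "STA":
            memory[symbol] = a          # STA affects no flags
        else:
            raise ValueError("instruction not modelled: " + mnemonic)
    return memory


# Assumed instruction sequence for VR = V1 + V2 (16 bit)
ADD16 = [
    ("CLC", None),
    ("LDA", "VL1"), ("ADD", "VL2"), ("STA", "VRL"),
    ("LDA", "VH1"), ("ADD", "VH2"), ("STA", "VRH"),
]
```

The sequence works because neither LDA nor STA affects the carry flag, so the carry produced by the low-byte ADD is still available to the high-byte ADD.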

I believe that with the instruction set outlined above, useful programs can be constructed in the memory space (4 KB) provided.


Tags: homemade concept cpu homebrew built operational

Homebrew CPU Alternatives

2012.04.18. 22:22 Budapesti álmodozó

Many kinds of homebrew CPUs have already been designed and built, from 4 bit to 32 bit processors. Some of them are capable of interrupt handling and of running Unix-like operating systems. Quite an achievement for a home project.

Being aware of my electronic skills, or rather the lack of them, I immediately realized that I wanted something very simple, since the more complex the CPU one wants to build (more operations, more registers, etc.), the more quickly its construction becomes difficult.


I must mention that I studied this subject, namely computer architectures, in college. I have a more or less high-level understanding of the basic principles of how processors operate. I also had lectures on advanced computer architectures, but clearly my project would not reach a level where that knowledge might prove useful.


Initially, I wanted the CPU to target the so-called L programming language, used in computability theory. L is a very simple language, yet Turing-complete. It operates on variables storing positive integers only. It has three commands:


  • Increase the value of a variable.

  • Decrease the value of the variable, provided that the value is greater than zero before decrementing it.

  • Jump to a label.


My initial design consisted of the following operations:


  • INC M - increment memory by one

  • DEC M - decrement memory by one, provided that its value is greater than zero before the operation

  • JNZ – jump if the last referenced memory cell is not zero

  • JZ – jump if the last referenced memory cell is zero.



I soon realized that with the hardware components targeted at implementing L (or with a slight modification and rearrangement of them), a more complex and useful architecture could be built, one which can be conveniently programmed despite its simplicity.



So I decided to come up with a processor with only 4 commands, but different from the ones above. These are:


  • MOV [DR],[SR] - move data from the source location identified by the value stored at SR to the destination location identified by the value stored at DR.

  • ADD [DR],[SR] - add the source value to the destination value and store the result as the destination value (for the interpretation of [DR], [SR] see the MOV semantics above)

  • NOR [DR],SR - same as ADD, except that the logical operation NOR is performed.
  • JNZ , JCC     - jump if not zero, jump if carry clear

Where SR and DR denote the source and destination register, respectively. The names source and destination register may be misleading, as they are actually memory addresses.

Please note that [] denotes (double) indirection.

It is also worth noting that as the operational concept reveals, no register is actually accessible for the programmer. One of its consequences is that almost every operation requires accessing memory a number of times. Thus micro-instructions become long. Depending on the actual architecture implementation ADD can be as long as 7 or 8 cycles. 

Constants must be stored along with the programs, as immediate values cannot be fetched (there is no such instruction and/or addressing mode). Furthermore, accessing a constant requires defining another (constant) pointer, which points to the constant's location. So it is quite cumbersome to program in the assembly language defined by these four operations, but it is not impossible. Surprisingly, some tasks can be done quite efficiently, e.g. copying data from one location to another.

Despite its unorthodox approach, or maybe because of it, I found this concept interesting and powerful enough.

However, more problems arise when 8 and 12/16 bit arithmetic come into play. I wanted to have an 8 bit databus and a 12, maybe 16 bit wide address bus, and I also wanted to have 8 bit registers (I am talking about internal registers; note that although the accumulator register is hidden from the programmer, it is still required). I came up with a number of solutions to overcome the problem of referencing data via 8 bit registers on a 12 bit wide address bus. For example, a separate data segment register was introduced to augment the 8 bits with an additional 4 bits, for a total of 12 bits.

Clearly the issue can be eliminated if the data and address buses are compatible, i.e. they are both 12 or 16 bits wide.

Anyway, the whole thing was getting more and more complicated. Finally, I had to drop the idea of going on with this concept.

I sketched a more conventional operational concept. To be honest, quite a number of them. They all share the same property: they support accumulator-based computation, and their instruction sets look quite similar to that of the MOS 6502, although many addressing modes and the X and Y registers are missing.

I started to search for an optimal instruction set, which can be encoded in four bits. 








Tags: cpu homebrew alternatives

Motivation - background

2011.11.28. 14:39 Budapesti álmodozó

First I must emphasize that there are a number of different sites dedicated to producing and documenting homebrew CPUs and computers. Most of them are much better than this blog.

So here is a link for the reader who might want to educate him/herself on this topic:

The question immediately arises: Why does anyone need another homebrew project?

I have noticed that I tend to have similar attributes/background to those working on these crazy projects.

I have always been fascinated with electronics, computers and programming (in this particular order). As a kid (at the age of something like 10 to 12) I was designing basic electronic circuits. I did not have much knowledge of electronics, of course.

Once in a while I could meet my uncle, who, being an electrical engineer, guided me. Unfortunately he lived in a different country (he still does), and therefore we could meet for about one afternoon every two years.

I was rather just experimenting with basic circuits, combining them, and I wanted to be able to build more complex and useful electronic components/devices. At that time I totally believed in myself and I thought that I could build almost anything, even without any proper background or education on the topic.

Not only was I a true believer, but I happened to be quite convincing...

In grammar school I was designing small electronic "games" similar to those LCD handhelds which were quite popular in the 1980s. Of course my electrical devices were nothing but crap compared to the products available on the market in those days.

In fact, I was able to build only one working prototype (very basic, with some blinking light bulbs and two buttons).

Despite this fact, or because of it, I produced catalogues of my devices to come (with product codes, game designs, etc.). I showed this brochure in school to those who were interested in owning a handheld device but whose parents could not afford to buy one.

One afternoon a family of four visited our house. Up to this point they had never been to our place. The guy was a classmate of mine.

My father welcomed them. They had actually come to buy a product, a handheld device. Funny, isn't it?

They actually believed that I could produce such handheld devices and sell them. I was very embarrassed and just watched from the background. My father, knowing almost nothing of my hobby and my promises, kindly informed them where those devices were available for sale.

Soon after, they left with disappointment on their faces.

I was something like 13, when I was given a C-64. From that point on, I gave up designing electronics and devoted my time to computers almost entirely. 

This turned out to be very useful as far as the tidiness of my room was concerned. With the C-64 I did not produce such a mess, so my family was happier than ever before.

Once in the mid 1980s I had the opportunity to visit my uncle abroad. I spent about two weeks with them in the summer. They were very kind and welcoming, organizing all kinds of outdoor programs/activities for me, but really all I longed for was just sitting in my uncle's working room and designing electronic stuff with him.

In fact, that very summer I designed my first computer-like device, which, as far as I can recall, was just based on some counter. It was supposed to give a control signal at each step while counting upwards from zero.

Needless to say, it never materialized. Actually it could not have worked, since I simply did not have the technical skills to design such a thing.

Then I turned away from electronics again in favour of computing.

Later in college I had some courses dedicated to electronics, but at that time I simply did not want to learn electronics. All I was interested in was programming and software development. To be honest, I acquired almost no knowledge from those courses. It is a shame...

After being a software developer for many years, an Oracle consultant to be more precise, I turned to electronics again.

It happened exactly on the 5th of October, 2011. Steve Jobs died on this day. Although I have never been an Apple fan and never paid particular attention to Apple products, I was touched by hearing the sad news. It is difficult to tell what exactly I felt. It was a kind of emptiness. It was as if I had lost a close relative. I guess that with Steve a whole era has died: the golden era of computing of the seventies and early eighties.

I started to study Apple and its history. I came across the other Steve, Steve Wozniak, and his story, which gave me tremendous inspiration. I decided that I would study electronics. I searched for old books on this subject. I started reading them and refreshing my knowledge. Then I soon realized that analogue electronics is a huge and very difficult subject indeed. So I decided to turn my attention to digital electronics, which looked simpler and more promising, at least in terms of how a circuit can be designed and built. Learning by doing is an easy and fun way. That is why I decided that I would actually learn digital electronics by designing and building simple devices.

I then decided that the device should be a CPU. Obviously one cannot design/produce a CPU alone that meets today's standards and performance, but this is not really my goal. My goal is:

1. As already stated to learn digital electronics by doing it. That is to design and build a CPU, a very simple one, one of the simplest possible, which is still complex enough to be called a CPU. But more on this later.

2. To better understand computer architectures.

3. And last but not least having fun.

This is how this blog was born.


Tags: design cpu computer homebrew