Microcode

The C74-6502 microcode is composed of 195 individual microinstructions, each designed to configure the CPU’s datapath in particular ways to achieve specific operations. Looking at these various microinstructions, and how they are used to implement the 6502 Instruction Set, is a great way to understand the C74-6502’s datapath and its operation.

The syntax of the C74-6502 microcode is designed to be mnemonic. That is, it describes function rather than detailing the values of control signals. The microcode is then “assembled” into a binary representation of microinstructions stored in Control ROMs on the CPU. This listing of all C74-6502 Microinstructions shows a mapping between the mnemonic syntax of microinstructions and their binary representation.

The C74-6502 uses vertical microcode, so the binary form of microinstructions is decoded on the fly by the CPU’s Control Unit. The CU’s decoders generate the specific control signals that drive the datapath. This Decoder Values table outlines how C74-6502 microinstructions are encoded.

You can find more information about the specific function of control signals in the descriptions of each CPU card found in the Internals section. For our purposes here, we will focus on the syntax and semantics of the microcode itself.

Basic Microinstructions

Let’s begin by looking at two microinstructions in particular, FetchOpcode and FetchOperand. These microinstructions occur in every 6502 instruction, so they are a good place to start. FetchOpcode is as follows:

IR := *PC; PC += 1; END

C74-6502 microinstructions are composed of distinct operations, separated by semicolons, which the datapath executes concurrently. Each operation typically configures a specific section of the datapath, causing data to flow through various logic elements, and then latching the result into target registers.

In the FetchOpcode microinstruction, the first operation, “IR := *PC”, uses the address at PC to read memory and latches the data into the Instruction Register. (C programmers will note the borrowed syntax to dereference PC. In function, it means that the CPU will output the address in the PC register onto the Address Bus.)

The next operation, “PC += 1”, increments PC by one. On the C74-6502, this is done by a dedicated 16-bit Incrementer circuit. In the microcode, the “+=” and “-=” operators specifically denote the use of the Incrementer, as opposed to the ALU, to perform increment or decrement operations.

Finally, the “END” microcode operation tells the CPU Control Unit that this marks the end of the current 6502 instruction. Internally, the C74-6502 keeps a “Q” Step Counter to index into the microcode ROMs. The Q Counter is incremented by one on every cycle to fetch the next microinstruction. The END operation resets this counter to zero so the first microinstruction of the next opcode will be fetched next.

So, to summarize, the FetchOpcode microinstruction uses the address at PC to load the next opcode into the IR, increments PC by one, and resets the Q counter to 0 for the next instruction.

Let’s now examine the FetchOperand microinstruction:

DPL := B := *PC; PC += 1

The first operation here, “DPL := B := *PC”, dereferences PC once again, but this time the fetched byte is loaded into both the B Data Latch (“B”) and the Data Pointer Low (“DPL”) registers simultaneously. As we will see, the B Data Latch is a special purpose register at the “B” input of the ALU, while DPL is an internal register used in address calculations. The second operation in this microinstruction is the same “increment PC” operation we saw in FetchOpcode, “PC += 1”. So, in summary, FetchOperand uses the address at PC to load an opcode’s operand byte into the B and DPL registers, and increments PC.
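The two fetch microinstructions can be modelled in a few lines of Python. This is only an illustrative sketch: the State class, the flat mem list and the function names are assumptions made for the example and are not part of the C74-6502 design. The one detail taken from the text is that all operations in a microinstruction work from the register values at the start of the cycle, so the memory read uses PC before it is incremented.

```python
from dataclasses import dataclass

@dataclass
class State:
    pc: int = 0   # Program Counter (16 bits)
    ir: int = 0   # Instruction Register
    b: int = 0    # B Data Latch
    dpl: int = 0  # Data Pointer Low
    q: int = 0    # Q step counter

def fetch_opcode(s, mem):
    """IR := *PC; PC += 1; END"""
    data = mem[s.pc]              # the read uses the old PC
    s.ir = data
    s.pc = (s.pc + 1) & 0xFFFF    # 16-bit Incrementer
    s.q = 0                       # END resets the step counter

def fetch_operand(s, mem):
    """DPL := B := *PC; PC += 1"""
    data = mem[s.pc]
    s.b = s.dpl = data            # both registers latch the same byte
    s.pc = (s.pc + 1) & 0xFFFF
    s.q += 1                      # no END, so Q advances

mem = [0] * 0x10000
mem[0x0200:0x0202] = [0xA9, 0x42]   # LDA #$42
s = State(pc=0x0200, q=3)
fetch_opcode(s, mem)
fetch_operand(s, mem)
```

After both calls, IR holds the opcode, B and DPL hold the operand byte, and PC points at the byte that follows the operand.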

Microcode For A Simple 6502 Instruction

Now, let’s see how microinstructions work together to specify a real 6502 instruction. Below is the microcode for the LDA Immediate opcode, $A9. The instruction begins with the $A9 opcode already in the IR, so the first microinstruction for this opcode sequence is a FetchOperand. The microcode is as follows:

DPL := B := *PC; PC += 1
A := B; SETF(NZ); IR := *PC; PC += 1; END

In this case, the FetchOperand retrieves the immediate value for the LDA from memory, at the address pointed to by PC. The next microinstruction transfers the B Data Latch value into register A. This transfer, “A := B”, is done by way of the ALU. On the C74-6502, register transfer operations like this one pass the data value unchanged through the ALU so the Status Flags can be evaluated in the process. In this case, the “SETF(NZ)” operation indicates that the N and Z flags will be updated based on the data value. Finally, this microinstruction also includes a FetchOpcode operation, which will be executed concurrently with the transfer. The C74-6502’s datapath allows values to be loaded from memory while the ALU is in use, so the FetchOpcode can complete during the ALU cycle itself.

In summary then, the LDA Immediate instruction is made up of two microinstructions, each of which executes in one cycle. The first fetches the immediate operand from memory. The second transfers the value to register A, updates the N and Z flags, and fetches the next opcode.
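The same sequence can be traced in a short Python sketch. The register dictionary, the flag dictionary and the function names are hypothetical and exist only for this example; the order of operations is the part taken from the microcode above.

```python
def set_nz(p, value):
    """SETF(NZ): N copies bit 7, Z is set when the value is zero."""
    p["N"] = (value >> 7) & 1
    p["Z"] = int(value == 0)

def lda_immediate(regs, p, mem):
    """Run the two LDA # cycles; the opcode is assumed to be in IR already."""
    # Cycle 1: DPL := B := *PC; PC += 1
    regs["B"] = regs["DPL"] = mem[regs["PC"]]
    regs["PC"] = (regs["PC"] + 1) & 0xFFFF
    # Cycle 2: A := B; SETF(NZ); IR := *PC; PC += 1; END
    regs["A"] = regs["B"]            # passes through the ALU unchanged
    set_nz(p, regs["A"])
    regs["IR"] = mem[regs["PC"]]     # next opcode, fetched in the same cycle
    regs["PC"] = (regs["PC"] + 1) & 0xFFFF
    return 2                         # cycles used

mem = {0x0201: 0x80, 0x0202: 0xEA}   # operand $80, then a NOP opcode
regs = {"PC": 0x0201, "A": 0, "B": 0, "DPL": 0, "IR": 0xA9}
flags = {"N": 0, "Z": 0}
cycles = lda_immediate(regs, flags, mem)
```

Loading $80 leaves N set and Z clear, and the next opcode is already in IR when the instruction ends.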

An ALU Instruction

Let’s now look at an ALU instruction proper, ADC Immediate. The microcode is as follows:

DPL := B := *PC; PC += 1
A := A ADC B; SETF(NZCV); IR := *PC; PC += 1; END.D

By now, this microcode should look familiar. The logic is very similar to LDA Immediate, except that in this case the ALU performs an ADC operation rather than simply passing the value unchanged. The N, Z, C and V flags are updated. (In the microcode, “ADC” is used to indicate an add with carry operation. A “+” is used for add operations where the carry is ignored.) There is also a special “END.D” operation. This performs the same function as the END we saw above, but also places the ALU in Decimal Mode if the D status flag is on. When in Decimal Mode, the BCD datapath in the ALU is enabled. Only the ADC and SBC instructions use this special version of the END operation.
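For reference, the flag results of a binary-mode ADC can be written out as follows. This sketch follows the standard 6502 definition of the N, Z, C and V flags rather than anything specific to the C74-6502 ALU circuitry, and it does not model Decimal Mode. The function name is an assumption made for the example.

```python
def adc_binary(a, b, carry_in):
    """Return (result, flags) for an 8-bit add with carry, binary mode only."""
    total = a + b + carry_in
    result = total & 0xFF
    flags = {
        "N": (result >> 7) & 1,
        "Z": int(result == 0),
        "C": int(total > 0xFF),
        # V: both operands share a sign and the result's sign differs
        "V": int(((a ^ result) & (b ^ result) & 0x80) != 0),
    }
    return result, flags

r1, f1 = adc_binary(0x50, 0x50, 0)   # 80 + 80 overflows the signed range
r2, f2 = adc_binary(0xFF, 0x01, 0)   # wraps to zero with carry out
```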

A Store Instruction

Let’s now look at a store instruction, STA Absolute. The microcode is as follows:

DPL := B := *PC; PC += 1
DPH := *PC; PC += 1
*DP := A
IR := *PC; PC += 1; END

The absolute address is in memory in low-byte/high-byte format, immediately following the opcode. In the first cycle, the familiar FetchOperand microinstruction “DPL := B := *PC; PC += 1” loads the low-byte of the address into the Data Pointer Low register (“DPL”) and increments PC. The next microinstruction loads the high-byte into the Data Pointer High register (“DPH”), and increments PC.

Once these two microinstructions are completed, DPL/DPH contains the full 16-bit absolute address. The memory write operation “*DP := A” outputs address DPL/DPH on the Address Bus, the A register on the Data Bus, and takes the R/W pin low for the write. The microcode sequence ends with the normal FetchOpcode microinstruction to read the next opcode. (Unlike ALU operations, memory write operations cannot be performed in the same cycle as a FetchOpcode. Hence an additional cycle is required.)

Addressing Modes

The general form of various 6502 instructions will reflect the Addressing Mode in effect for a given opcode. It’s useful, therefore, to look at these Addressing Modes in turn, as these patterns will repeat throughout the instruction set. We already have encountered the microcode for the Immediate and Absolute addressing modes above. Here they are once again, along with others:

Immediate — e.g., “LDA #$A0”

DPL := B := *PC; PC += 1
A := B; SETF(NZ); IR := *PC; PC += 1; END

Absolute — e.g., “LDA $A0A0”

DPL := B := *PC; PC += 1
DPH := *PC; PC += 1
B := *DP
A := B; SETF(NZ); IR := *PC; PC += 1; END

Zero Page — e.g., “LDA $A0”

DPL := B := *PC; PC += 1
B := *zDP
A := B; SETF(NZ); IR := *PC; PC += 1; END

The *zDP notation in the second cycle indicates that the upper 8 bits of the Address Bus are forced to zero, while the lower 8 bits come from DPL. Hence a Zero Page address is output to the Address Bus.

Zero Page Indexed — e.g., “LDA $A0,X”

DPL := B := *PC; PC += 1
DPL := B + X; B := *zDP
B := *zDP
A := B; SETF(NZ); IR := *PC; PC += 1; END

The “DPL := B + X” operation in the second cycle applies the index to the Base Address, using the ALU. From the point of view of the external bus, this is a so-called “dead bus-cycle”. That is, a cycle during which the CPU is busy with an internal operation and performs a superfluous read from memory. In this case, the “B := *zDP” operation in the second cycle above performs a read from the Zero Page Base Address (before the index is applied). The data fetched during this dead-cycle is simply discarded, and memory is read a second time in the following cycle, this time with the fully resolved target address. Since the 6502, 65C02 and 65816 MPUs sometimes differ in the way they treat dead-cycles, the C74-6502 microcode explicitly encodes dead-cycle behaviour as required (see C74-6502 Dead Cycles).
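The two reads can be sketched as below. The function name is hypothetical. The sketch assumes the ALU result written to DPL is truncated to 8 bits, which is consistent with *zDP forcing the high byte of the address to zero, so the indexed address wraps within page zero.

```python
def zp_indexed_reads(base, x):
    """Return the two addresses read for LDA base,X after the operand fetch."""
    dead_read = base                 # cycle 2: B := *zDP, before DPL is updated
    dpl = (base + x) & 0xFF          # cycle 2: DPL := B + X, carry is dropped
    real_read = dpl                  # cycle 3: B := *zDP, high byte forced to $00
    return dead_read, real_read

reads = zp_indexed_reads(0xA0, 0x70)
```

With a base of $A0 and X = $70, the dead cycle reads $00A0 and the real read wraps around to $0010.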

Absolute Indexed — e.g., “LDA $A0A0,X”

DPL := B := *PC; PC += 1
DPL := B + X; DPH.db := *PC; PC += 1; INCDPH.C
DPH := DPH + 1 # (dynamically inserted microinstruction)
B := *DP
A := B; SETF(NZ); IR := *PC; PC += 1; END

The second cycle of Absolute Indexed instructions has some special features. The “DPL := B + X” operation adds the X register as an offset to the low-byte of the base address using the ALU. At the same time, “DPH.db := *PC” reads the high-byte of the base address from memory directly into the DPH register. (There is a dedicated path in the CPU that connects the DPH directly to the Data Bus so this operation can be performed.) The INCDPH.C operation tests the carry output of the ALU to see if a page-boundary is being crossed by the addition of the index. If so, the Control Logic will insert an additional cycle into the instruction stream to adjust DPH. The inserted microinstruction, DPH := DPH + 1, is not present in the microcode. Instead, the control circuitry generates the microinstruction on the fly as needed. When that happens, the instruction completes in five cycles rather than the usual four.
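The page-crossing logic amounts to the following arithmetic. This is a sketch with a hypothetical function name; it models only the address calculation and the cycle count described above, not the control circuitry that inserts the extra microinstruction.

```python
def absolute_indexed(base, x):
    """Return (target address, cycle count) for LDA base,X."""
    low_sum = (base & 0xFF) + x          # DPL := B + X through the ALU
    dpl = low_sum & 0xFF
    carry = low_sum > 0xFF               # the condition INCDPH.C tests
    dph = base >> 8                      # DPH.db := *PC
    cycles = 4
    if carry:
        dph = (dph + 1) & 0xFF           # inserted cycle: DPH := DPH + 1
        cycles += 1
    return (dph << 8) | dpl, cycles

same_page = absolute_indexed(0xA0A0, 0x10)
crossed = absolute_indexed(0xA0F0, 0x20)
```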

Indirect Indexed — e.g., “LDA ($A0),Y”

DPL := B := *PC; PC += 1
B := *zDP; DPL += 1
DPL := B + Y; DPH.db := *zDP; INCDPH.C
DPH := DPH + 1 # (dynamically inserted microinstruction)
B := *DP
A := B; SETF(NZ); IR := *PC; PC += 1; END

The first cycle of the Indirect Indexed sequence loads the one-byte zero page address into DPL. We then load the low-byte of the indirect base address into B, and increment DPL. The third cycle of the sequence applies the index to the low-byte of the indirect address, loads the high-byte of the indirect address into DPH, and tests for a page-crossing. As above, the control logic will add an additional cycle to adjust DPH if a page boundary is crossed. Once the target address is fully resolved, the target byte is read from memory by “B := *DP”, and the value is transferred to register A in the final cycle.

Indexed Indirect — e.g., “LDA ($A0,X)”

DPL := B := *PC; PC += 1
DPL := B + X; B := *zDP
T := *zDP; DPL += 1
DPH := *zDP
B := *DPt
A := B; SETF(NZ); IR := *PC; PC += 1; END

The Indexed Indirect mode applies the index to the zero-page address in the second cycle. This is a dead-cycle. We do not need to check the carry since zero-page addresses are 8 bits wide. The third and fourth cycles read the low and high bytes of the indirect address from zero page. The fifth cycle uses the fully resolved address to read the target byte from memory. The “*DPt” notation indicates that the lower 8 bits of the address bus will come from the T register, and the upper 8 bits from DPH. The final cycle in the sequence transfers the value read to the A register through the ALU, and sets flags accordingly.

Read, Modify, Write Instructions — e.g., “ASL $A0”

DPL := B := *PC; PC += 1
B := *zDP; ML
T := 0 ASL B; SETF(NZC)
*zDP := T
IR := *PC; PC += 1; END

By now, this microcode should look fairly familiar. We fetch the operand (in this case a zero-page address), read the byte, shift it left through the ALU while setting flags, and then write it back to the zero-page address. One special aspect is that the modify cycle is a dead-cycle. The NMOS 6502 performs a write to the zero-page address during this cycle, while the CMOS 65C02 performs a read. The microcode is not explicit about this. Rather, dedicated circuitry in the CPU is triggered by the “ML” operation, which produces the correct behaviour for the instruction-set currently in effect. “ML” also brings the 65C02 ML pin low during the Read, Modify and Write cycles.

One-Byte Opcodes — e.g., “PLP”

DPL := B := *PC; PC += 1
SP += 1; B := *SP
P := *SP
IR := *PC; PC += 1; END

As we will see below, the C74-6502’s microcode pipeline requires that all opcodes use exactly the same FetchOperand microinstruction (“DPL := B := *PC; PC += 1”). One-byte opcodes, like the one above, are automatically detected by the CPU’s Control Unit and the 16-bit Incrementer is inhibited during this cycle. This ensures that PC is not advanced, and is left pointing correctly to the next opcode. This instruction also uses the Incrementer in the second cycle to modify SP. This operation is unaffected by the change. Note that only the lower 8 bits of the Incrementer’s output are used when manipulating SP.

Flag-modifying Instructions — e.g., “CLI”

DPL := B := *PC; PC += 1
SETF(OPCODE 0); IR := *PC; PC += 1; END

Flag-modifying instructions are also one-byte opcodes, so the Incrementer is inhibited during the FetchOperand cycle. The “SETF(OPCODE 0)” operation in the second cycle tells the CPU to clear the status flag indicated by the upper two bits of the opcode. The “SETF(OPCODE 1)” directive does the same but sets the selected flag instead.
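The flag selection can be checked against the standard 6502 opcode values. In the sketch below the table and function names are hypothetical, and the mapping of the two opcode bits to C, I, V and D is derived from the published opcodes ($18, $38, $58, $78, $B8, $D8, $F8), not from the C74-6502 decoder tables.

```python
FLAG_BY_TOP_BITS = ["C", "I", "V", "D"]   # opcode bits 7..6 = 00, 01, 10, 11

def setf_opcode(p, opcode, value):
    """SETF(OPCODE value): write value to the flag selected by the opcode."""
    p[FLAG_BY_TOP_BITS[opcode >> 6]] = value

p = {"C": 1, "I": 1, "V": 1, "D": 1}
setf_opcode(p, 0x58, 0)   # CLI
setf_opcode(p, 0xB8, 0)   # CLV

# Which flag each flag-modifying opcode selects
selected = {name: FLAG_BY_TOP_BITS[op >> 6] for name, op in
            {"CLC": 0x18, "SEC": 0x38, "CLI": 0x58, "SEI": 0x78,
             "CLV": 0xB8, "CLD": 0xD8, "SED": 0xF8}.items()}
```

Note that bit 5 of the opcode matches the set or clear value for every opcode except CLV ($B8), which may be one reason the value is given explicitly in the microcode instead of being taken from the opcode.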

Branch Instructions — e.g., “BPL”

DPL := B := *PC; PC += 1
PCL := PCL + B; EXIT.CC
PCH := PCH + signextend(*); USE(IC)
IR := *PC; PC += 1; END

Branches require that the specified status bit be tested during the FetchOperand cycle. Once again, since the microcode pipeline requires that all instructions use exactly the same FetchOperand microinstruction, special circuitry is necessary for the branch test. The Control Unit automatically detects branches, performs the appropriate branch test, and terminates the execution of the branch if the test fails. It does so by dynamically replacing the next microinstruction in the sequence with a FetchOpcode (“IR := *PC; PC += 1; END”). If, on the other hand, the test succeeds and the branch needs to be taken, then execution of the microcode sequence continues. In the second cycle of the sequence, the branch offset is applied to the Program Counter. The “EXIT.CC” operation will terminate the execution of the microcode sequence if the carry is clear (again by executing a FetchOpcode in the following cycle). Otherwise, the third cycle will execute to adjust the high-byte of PC. The low-byte result is sign-extended (meaning a $00 for a positive result and an $FF for a negative result) and is added to PCH using the Internal Carry (IC) from the prior ALU operation. Finally, with PC fully adjusted, the microcode will execute the normal FetchOpcode microinstruction at the end of the sequence.
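The byte-at-a-time arithmetic can be sketched as follows. The function names are hypothetical, and the sketch assumes that the value being sign-extended is the branch offset held in B, which is one reading of “signextend(*)”. It shows only that the two 8-bit additions give the same result as a signed 16-bit addition; it does not model EXIT.CC or the cycle count.

```python
def branch_target(pc, offset):
    """Apply a signed 8-bit branch offset to PC one byte at a time."""
    pcl, pch = pc & 0xFF, pc >> 8
    low_sum = pcl + offset                   # PCL := PCL + B (unsigned add)
    ic = low_sum >> 8                        # Internal Carry out of the low byte
    ext = 0xFF if offset & 0x80 else 0x00    # sign extension of the offset
    pch = (pch + ext + ic) & 0xFF            # PCH := PCH + signextend + IC
    return (pch << 8) | (low_sum & 0xFF)

def reference_target(pc, offset):
    """The same calculation done as one signed 16-bit addition."""
    signed = offset - 0x100 if offset & 0x80 else offset
    return (pc + signed) & 0xFFFF

forward = branch_target(0x10F0, 0x20)    # crosses into the next page
backward = branch_target(0x1005, 0xF0)   # offset -16, crosses into the previous page
```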

BIT Instruction

DPL := B := *PC; PC += 1
B := *zDP
A AND B; SETF(NZV); BIT; IR := *PC; PC += 1; END

The BIT instruction performs a generic AND ALU operation but discards the ALU result and updates the flags in its own unique way. This ALU microinstruction includes the “BIT” directive which signals to the Control Unit to update flags accordingly.
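The flag behaviour in question is the standard 6502 definition of BIT: Z reflects the AND result, while N and V are copied from bits 7 and 6 of the memory operand. A minimal sketch, with a hypothetical function name:

```python
def bit_flags(a, m):
    """Flags after BIT with accumulator a and memory operand m."""
    return {
        "Z": int((a & m) == 0),   # from the AND result, which is then discarded
        "N": (m >> 7) & 1,        # copied from bit 7 of the operand
        "V": (m >> 6) & 1,        # copied from bit 6 of the operand
    }

no_overlap = bit_flags(0x01, 0xC0)
overlap = bit_flags(0xFF, 0x01)
```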

BRK Instruction

DPL := B := *PC; PC += 1
*SP := PCH; SP -= 1
*SP := PCL; SP -= 1
*SP := P; SP -= 1; PBR.CLR
PCL := *fCP; DPL += 1
PCH := *fDP; SETF(SEI/CLD)
IR := *PC; PC += 1; END.INT

The seven-cycle BRK instruction displays some new constructs worth noting. After the normal FetchOperand cycle, the PC and P registers are pushed onto the stack in cycles two, three and four. The 65816 PBR register is also cleared in cycle four (this function will be a NOP if the K24 card is not installed). The interrupt vector is then fetched from high memory. “fCP” is a special address. The high-byte is $FF, while the low-byte value depends on the state of internal interrupt-detect flags. It will be $FE, $FC, and $FA for IRQ, Reset and NMI interrupts respectively. “fDP” sets the high-byte of the address bus to $FF, and the low-byte to the value in the DPL register. “SETF(SEI/CLD)” will set the I Flag and, if running the 65C02 or 65816 instruction sets, will also clear the D Flag. Finally, “END.INT” is a special form of the “END” operation which simply signals to the Control Unit that an interrupt sequence has just completed.
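The vector addresses follow directly from the low-byte values listed above. The sketch below uses hypothetical names and a dictionary in place of memory. It shows only where the two vector bytes are read from; how DPL comes to hold the low byte for the second read is not modelled.

```python
VECTOR_LOW = {"IRQ": 0xFE, "RESET": 0xFC, "NMI": 0xFA}

def vector_addresses(source):
    """Return the addresses of the low and high bytes of the vector."""
    low = 0xFF00 | VECTOR_LOW[source]    # high-byte of the address is $FF
    return low, low + 1

def load_vector(mem, source):
    """Return the new PC value read through the vector."""
    lo, hi = vector_addresses(source)
    return (mem[hi] << 8) | mem[lo]

new_pc = load_vector({0xFFFE: 0x34, 0xFFFF: 0x12}, "IRQ")
```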

Microcode ROMs

Microcode is stored in six 8-bit ROMs (ROMs A through F); two on each of the three cards of the CPU. The ROMs are accessed using the opcode in the IR, a 4-bit Q State Counter and a two-bit instruction-set selector. The Q Counter is incremented once per cycle until an “END” resets it to zero. Hence, the Control Unit will walk down the microcode of each opcode in sequence, and then execute the FetchOpcode microinstruction to jump to the next instruction.
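Together the three selectors make up 2 + 8 + 4 = 14 address bits. The sketch below packs them into a single ROM address. The bit ordering (instruction set in the top bits, Q in the bottom bits) is an assumption made for illustration, and the function name is hypothetical; the actual wiring of the ROM address lines may differ.

```python
def rom_address(iset, opcode, q):
    """Pack the three selectors into a 14-bit ROM address (layout assumed)."""
    assert 0 <= iset < 4 and 0 <= opcode < 256 and 0 <= q < 16
    return (iset << 12) | (opcode << 4) | q

ROM_WORDS = 1 << (2 + 8 + 4)   # number of addressable microinstruction slots

first_step = rom_address(0, 0xA9, 0)     # first microinstruction of opcode $A9
last_slot = rom_address(3, 0xFF, 15)
```

Under this layout each opcode has 16 consecutive slots in each of the four instruction-set banks.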

There is an important subtlety here. The C74-6502 Control Unit prefetches microcode from ROM one cycle ahead. That means that a new instruction has already been fetched by the time a FetchOpcode completes. The Control Unit will simply discard this fetched microinstruction and execute a FetchOperand microinstruction instead. The C74-6502 microcode is specifically encoded such that FetchOperand is an “all-zeroes” microinstruction, so one can easily be generated when needed.

For this reason, the FetchOperand for every opcode is in fact in the position following the FetchOpcode at the end of the sequence, rather than at the beginning. So, in fact, this microcode:

DPL := B := *PC; PC += 1
DPH := *PC; PC += 1
B := *DP
A := B; SETF(NZ); IR := *PC; PC += 1; END

is stored in ROM as follows:

DPH := *PC; PC += 1
B := *DP
A := B; SETF(NZ); IR := *PC; PC += 1; END
DPL := B := *PC; PC += 1

Effectively the FetchOperand (“DPL := B := *PC; PC += 1”) is moved from the first to the final cycle. With that change, microcode will be fetched correctly by the microcode pipeline.
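Expressed as a list operation, the reordering is a rotation by one position. The function name below is hypothetical and the sketch works on the mnemonic text only, not on the assembled binary.

```python
def to_rom_order(sequence):
    """Move the leading FetchOperand to the end of the sequence."""
    return sequence[1:] + sequence[:1]

lda_absolute = [
    "DPL := B := *PC; PC += 1",
    "DPH := *PC; PC += 1",
    "B := *DP",
    "A := B; SETF(NZ); IR := *PC; PC += 1; END",
]
stored = to_rom_order(lda_absolute)
```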

Now, one implication of this scheme is that any processing specific to a particular opcode cannot be encoded in microcode. Instead, it must be handled by the Control Unit itself. To this end, the CU includes logic to detect specific opcodes and execute the required operations when necessary. Please see C74-6502 Microcode Pipeline Notes for details.

Instruction Sets

The C74-6502 implements four different instruction-sets, each stored in a separate bank of microcode in the ROMs. These are the 6502, 65C02, 6502+NOPs and K24 instruction-sets. Details of these instruction-sets can be found in the C74-6502 Draft Datasheet.

Microcode Listings

The C74-6502 Microcode contains a complete listing of all four instruction sets and contents of all Control ROMs. Also available is a listing of all C74-6502 Microinstructions, as well as C74-6502 Decoder Values. These values are used by the Control Unit to decode microinstructions into all the necessary control signals used to drive the datapath. They are the link between the microcode and the internal circuitry of the CPU.

Binary images and assembler source code for all six Control ROMs can be found in the Project Files download available in the Internals Section.