The C74-6502 microcode is composed of 195 individual microinstructions, each designed to configure the CPU’s datapath in particular ways to achieve specific operations. Looking at these various microinstructions, and how they are used to implement the 6502 Instruction Set, is a great way to understand the C74-6502’s datapath and its operation.
The syntax of the C74-6502 microcode is designed to be mnemonic. That is, it is descriptive of function rather than detailing the values of control signals. The microcode is then “assembled” into a binary representation of microinstructions stored in Control ROMs on the CPU. This listing of all C74-6502 Microinstructions shows a mapping between the mnemonic syntax of microinstructions and their binary representation.
The C74-6502 uses vertical microcode, so the binary form of microinstructions is decoded on the fly by the CPU’s Control Unit. The CU’s decoders generate the specific control signals that drive the datapath. This Decoder Values table outlines how C74-6502 microinstructions are encoded.
You can find more information about the specific function of control signals in the descriptions of each CPU card found in the Internals section. For our purposes here, we will focus on the syntax and semantics of the microcode itself.
Let’s begin by looking at two microinstructions in particular, FetchOpcode and FetchOperand. These microinstructions occur in every 6502 instruction, so it’s a great place to start. FetchOpcode is as follows:
IR := *PC; PC += 1; END
C74-6502 microinstructions are composed of distinct operations, separated by semicolons, which the datapath executes concurrently. Each operation typically configures a specific section of the datapath, causing data to flow through various logic elements. The results are then latched into target registers.
In the FetchOpcode microinstruction, the first operation, “IR := *PC”, uses the address at PC to read memory and latches the data into the Instruction Register. (C programmers will note the borrowed syntax to de-reference PC. In function, it means that the CPU will output the address in the PC register on to the Address Bus).
The next operation, “PC += 1”, increments PC by one. On the C74-6502, this is done by a dedicated 16-bit Incrementer circuit. In the microcode, the “+=” and “-=” operators specifically denote the use of the Incrementer, as opposed to the ALU, to perform increment or decrement operations.
Finally, the “END” microcode operation tells the CPU Control Unit that this marks the end of the current 6502 instruction. Internally, the C74-6502 keeps a “Q” Step Counter to index into the microcode for a given opcode. The Q Counter is incremented by one every cycle in order to fetch the next microinstruction. The END operation resets this counter to zero so it points to the first microinstruction of the opcode just fetched.
To summarize, the FetchOpcode microinstruction uses the address at PC to load the next opcode into the IR, increments PC by one, and resets the Q counter to 0 to start the instruction.
Let’s now examine the FetchOperand microinstruction:
DPL := B := *PC; PC += 1
The first operation here, “DPL := B := *PC”, dereferences PC once again, but this time the fetched byte is loaded into both the B Data Latch (“B”) and the Data Pointer Low (“DPL”) registers simultaneously. As we will see, the B Data Latch is a special purpose register at the “B” input of the ALU, while DPL is an internal register used in address calculations. The second operation in this microinstruction is the same “increment PC” operation we saw in FetchOpcode, “PC += 1”. So, in summary, FetchOperand uses the address at PC to load an opcode’s operand byte into the B and DPL registers, and increments PC.
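To make the two microinstructions concrete, here is a minimal Python sketch of what each one does to the CPU state. The dictionary-based register file, the memory dictionary and the function names are assumptions made for illustration; they are not part of the C74-6502 design files. Each function reads PC once and derives all of its results from that value, which is one simple way to mimic operations that the datapath executes concurrently.

```python
# Hypothetical model: the CPU is a dict of registers, memory is a dict of bytes.
def fetch_opcode(cpu, mem):
    """IR := *PC; PC += 1; END"""
    pc = cpu["PC"]                   # every operation sees the same starting PC
    cpu["IR"] = mem[pc]              # IR := *PC
    cpu["PC"] = (pc + 1) & 0xFFFF    # PC += 1 (16-bit Incrementer)
    cpu["Q"] = 0                     # END resets the Q step counter


def fetch_operand(cpu, mem):
    """DPL := B := *PC; PC += 1"""
    pc = cpu["PC"]
    cpu["DPL"] = cpu["B"] = mem[pc]  # one byte latched into both registers
    cpu["PC"] = (pc + 1) & 0xFFFF
    cpu["Q"] += 1                    # no END, so the step counter advances


mem = {0x0200: 0xA9, 0x0201: 0x42}   # LDA #$42
cpu = {"PC": 0x0200, "IR": 0x00, "B": 0x00, "DPL": 0x00, "Q": 3}
fetch_opcode(cpu, mem)
fetch_operand(cpu, mem)
print({name: hex(value) for name, value in cpu.items()})
```

After the two calls, IR holds the opcode, B and DPL both hold the operand byte, and PC has advanced by two.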
Microcode For A Simple 6502 Instruction
Now, let’s see how microinstructions work together to specify a real 6502 instruction. Below is the microcode for the LDA Immediate opcode, $A9. The instruction begins with the $A9 opcode already in the IR, so the first microinstruction for this opcode sequence is a FetchOperand. The microcode is as follows:
DPL := B := *PC; PC += 1
A := B; SETF(NZ); IR := *PC; PC += 1; END
In this case, the FetchOperand retrieves the immediate value for the LDA from memory, at the address pointed to by PC. The next microinstruction transfers the B Data Latch value into register A. This transfer, “A := B”, is done by way of the ALU. On the C74-6502, register transfer operations like this one pass the data value unchanged through the ALU so the Status Flags can be evaluated in the process. In this case, the “SETF(NZ)” operation indicates that the N and Z flags will be updated based on the data value. Finally, this microinstruction also includes a FetchOpcode operation, which will be executed concurrently with the transfer. The C74-6502’s datapath allows values to be loaded from memory while the ALU is in use, so the FetchOpcode can complete during the ALU cycle itself.
In summary then, the LDA Immediate instruction is made up of two microinstructions, each of which executes in one cycle. The first fetches the immediate operand from memory. The second transfers the value to register A, updates the N and Z flags, and fetches the next opcode.
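The two cycles of LDA Immediate can be traced with a short Python sketch. The register dictionary and the function name are hypothetical, and the flags are stored as separate 0/1 entries purely for readability. The sketch starts where the microcode starts, with the opcode already in IR and PC pointing at the operand.

```python
def lda_immediate(cpu, mem):
    """Apply the two LDA Immediate microinstructions; return the cycle count."""
    # Cycle 1: DPL := B := *PC; PC += 1
    cpu["DPL"] = cpu["B"] = mem[cpu["PC"]]
    cpu["PC"] = (cpu["PC"] + 1) & 0xFFFF

    # Cycle 2: A := B; SETF(NZ); IR := *PC; PC += 1; END
    value = cpu["B"]                  # passed unchanged through the ALU
    cpu["A"] = value
    cpu["N"] = value >> 7             # N mirrors bit 7 of the result
    cpu["Z"] = int(value == 0)        # Z is set when the result is zero
    cpu["IR"] = mem[cpu["PC"]]        # next opcode fetched in the same cycle
    cpu["PC"] = (cpu["PC"] + 1) & 0xFFFF
    return 2


mem = {0x0201: 0x80, 0x0202: 0xEA}    # operand $80, then the next opcode
cpu = {"PC": 0x0201, "IR": 0xA9}
cycles = lda_immediate(cpu, mem)
```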
An ALU Instruction
Let’s now look at an ALU instruction proper, ADC Immediate. The microcode is as follows:
DPL := B := *PC; PC += 1
A := A ADC B; SETF(NZCV); IR := *PC; PC += 1; END.D
By now, this microcode should look familiar. The logic is very similar to LDA Immediate, except that in this case the ALU performs an ADC operation rather than simply passing the value unchanged. The N, Z, C and V flags are updated. (In the microcode, “ADC” is used to indicate an add with carry operation. A “+” is used for add operations where the carry is ignored). There is also a special “END.D” operation. This performs the same function as the END we saw above, but also places the ALU in Decimal Mode if the D status flag is on. When in Decimal Mode, the BCD datapath in the ALU is enabled. Only the ADC and SBC instructions use this special version of the END operation.
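For reference, the binary-mode result and flags of “A := A ADC B; SETF(NZCV)” can be written out as a small Python function. This is a sketch of the standard 6502 flag definitions rather than a description of the C74-6502 ALU circuitry, the function name is hypothetical, and Decimal Mode is deliberately left out.

```python
def adc_binary(a, b, carry_in):
    """Binary-mode add with carry; returns the 8-bit result and the NZCV flags."""
    total = a + b + carry_in
    result = total & 0xFF
    flags = {
        "N": result >> 7,                  # sign bit of the result
        "Z": int(result == 0),             # result is zero
        "C": int(total > 0xFF),            # carry out of bit 7
        # Overflow: both inputs have the same sign and the result's sign differs.
        "V": int(((a ^ result) & (b ^ result) & 0x80) != 0),
    }
    return result, flags


print(adc_binary(0x50, 0x50, 0))   # signed overflow, no carry
print(adc_binary(0xFF, 0x01, 0))   # carry out, result wraps to zero
```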
A Store Instruction
Let’s now look at a store instruction, STA Absolute. The microcode is as follows:
DPL := B := *PC; PC += 1
DPH := *PC; PC += 1
*DP := A
IR := *PC; PC += 1; END
The absolute address is in memory in low-byte/high-byte format, immediately following the opcode. In the first cycle, the familiar FetchOperand microinstruction “DPL := B := *PC; PC += 1” loads the low-byte of the address into the Data Pointer Low register (“DPL”) and increments PC. The next microinstruction loads the high-byte into the Data Pointer High register (“DPH”), and increments PC.
Once these two microinstructions are completed, DPL/DPH contains the full 16-bit absolute address. The memory write operation, “*DP := A”, outputs the absolute address in DPL/DPH onto the Address Bus, the A register onto the Data Bus, and takes the R/W pin low for the write. The microcode sequence ends with the normal FetchOpcode microinstruction to read the next opcode. (Unlike ALU operations, memory write operations cannot be performed in the same cycle as a FetchOpcode. Hence an additional cycle is required for the write itself).
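The bus activity of the four STA Absolute cycles can be sketched as follows. The model is an assumption made for illustration: registers and memory are plain dictionaries, and the trace records only the R/W direction and the address of each cycle.

```python
def sta_absolute(cpu, mem):
    """Run the four STA Absolute cycles; return the (R/W, address) bus trace."""
    trace = []

    def read_pc():
        # Shared by every "*PC; PC += 1" pair in the sequence.
        addr = cpu["PC"]
        trace.append(("R", addr))
        cpu["PC"] = (addr + 1) & 0xFFFF
        return mem[addr]

    cpu["DPL"] = cpu["B"] = read_pc()        # DPL := B := *PC; PC += 1
    cpu["DPH"] = read_pc()                   # DPH := *PC; PC += 1
    dp = (cpu["DPH"] << 8) | cpu["DPL"]      # full 16-bit address in DPH/DPL
    trace.append(("W", dp))
    mem[dp] = cpu["A"]                       # *DP := A
    cpu["IR"] = read_pc()                    # IR := *PC; PC += 1; END
    return trace


mem = {0x0201: 0x34, 0x0202: 0x12, 0x0203: 0xEA}   # STA $1234, next opcode
cpu = {"PC": 0x0201, "A": 0x77}
trace = sta_absolute(cpu, mem)
```

The write occupies a bus cycle of its own, which is why the trace has four entries rather than three.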
The general form of various 6502 instructions will reflect the Addressing Mode in effect for a given opcode. It’s useful, therefore, to look at these Addressing Modes in turn, as these patterns will repeat throughout the instruction set. We already have encountered the microcode for the Immediate and Absolute addressing modes above. Here they are once again, along with others:
Immediate — e.g., “LDA #$A0”
DPL := B := *PC; PC += 1
A := B; SETF(NZ); IR := *PC; PC += 1; END
Absolute — e.g., “LDA $A0A0”
DPL := B := *PC; PC += 1
DPH := *PC; PC += 1
B := *DP
A := B; SETF(NZ); IR := *PC; PC += 1; END
Zero Page — e.g., “LDA $A0”
DPL := B := *PC; PC += 1
B := *zDP
A := B; SETF(NZ); IR := *PC; PC += 1; END
The *zDP notation in the second cycle indicates that the upper bits of the Address Bus are forced to zero, while the lower 8-bits come from DPL. Hence a Zero Page address is output to the Address Bus.
Zero Page Indexed — e.g., “LDA $A0,X”
DPL := B := *PC; PC += 1
DPL := B + X; B := *zDP
B := *zDP
A := B; SETF(NZ); IR := *PC; PC += 1; END
The “DPL := B + X” operation in the second cycle applies the index to the Base Address, using the ALU. From the point of view of the external bus, this is a so-called “dead bus-cycle”. That is, a cycle during which the CPU is busy with an internal operation and performs a superfluous read from memory. In this case, the “B := *zDP” operation in the second cycle above performs a read from the Zero Page Base Address (before the index is applied). The data fetched during this dead-cycle is simply discarded, and memory is read a second time in the following cycle, this time with the fully resolved target address. Since the 6502, 65C02 and 65816 MPUs sometimes differ in the way they treat dead-cycles, the C74-6502 microcode explicitly encodes dead-cycle behaviour as required. (See 6502 Dead Cycles.)
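The dead cycle is easiest to see in an address trace. The Python sketch below steps through the four Zero Page Indexed cycles and records the address placed on the bus in each one. The dictionary-based model and the function name are assumptions for illustration, and the flag update is omitted to keep the focus on the bus.

```python
def lda_zero_page_x(cpu, mem):
    """Trace the address bus across the four LDA Zero Page Indexed cycles."""
    trace = []

    # Cycle 1: DPL := B := *PC; PC += 1
    trace.append(cpu["PC"])
    cpu["DPL"] = cpu["B"] = mem[cpu["PC"]]
    cpu["PC"] = (cpu["PC"] + 1) & 0xFFFF

    # Cycle 2 (dead cycle): DPL := B + X; B := *zDP
    base = cpu["DPL"]                            # the un-indexed address is on the bus
    trace.append(base)
    indexed = (cpu["B"] + cpu["X"]) & 0xFF       # "+" ignores the carry
    cpu["B"] = mem.get(base, 0x00)               # superfluous read, discarded
    cpu["DPL"] = indexed

    # Cycle 3: B := *zDP (upper address bits forced to zero)
    trace.append(cpu["DPL"])
    cpu["B"] = mem[cpu["DPL"]]

    # Cycle 4: A := B; SETF(NZ); IR := *PC; PC += 1; END
    trace.append(cpu["PC"])
    cpu["A"] = cpu["B"]
    cpu["IR"] = mem[cpu["PC"]]
    cpu["PC"] = (cpu["PC"] + 1) & 0xFFFF
    return trace


# LDA $F0,X with X = $20: the indexed address wraps around to $10.
mem = {0x0201: 0xF0, 0x0202: 0xEA, 0x00F0: 0x11, 0x0010: 0x99}
cpu = {"PC": 0x0201, "X": 0x20}
trace = lda_zero_page_x(cpu, mem)
```

The second entry of the trace is the dead-cycle read from the base address, and the third is the read from the resolved address.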
Absolute Indexed — e.g., “LDA $A0A0,X”
DPL := B := *PC; PC += 1
DPL := B + X; DPH.db := *PC; PC += 1; INCDPH.C
DPH := DPH + 1 # (dynamically inserted microinstruction)
B := *DP
A := B; SETF(NZ); IR := *PC; PC += 1; END
The second cycle of Absolute Indexed instructions has some special features. The “DPL := B + X” operation adds the X register as an offset to the low-byte of the base address using the ALU. At the same time, “DPH.db := *PC” reads the high-byte of the base address from memory directly into the DPH register. (There is a dedicated path in the CPU that connects the DPH directly to the Data Bus so this operation can be performed). The INCDPH.C operation tests the carry output of the ALU to see if a page-boundary is being crossed by the addition of the index. If so, the Control Logic will insert an additional cycle into the instruction stream to adjust DPH. The inserted microinstruction, DPH := DPH + 1, is not present in the microcode. Instead, the control circuitry generates the microinstruction on the fly as needed. When a page boundary is crossed, the instruction therefore completes in five cycles rather than four.
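The page-crossing test reduces to a little arithmetic, sketched here in Python. The function name is hypothetical. The cycle counts of four and five are the ones given in the text above.

```python
def absolute_indexed(base, index):
    """Resolve base + index; return the target address and the cycle count."""
    low_sum = (base & 0xFF) + index      # DPL := B + X, performed by the ALU
    dpl = low_sum & 0xFF
    dph = base >> 8                      # DPH.db := *PC
    cycles = 4
    if low_sum > 0xFF:                   # INCDPH.C sees the ALU carry
        dph = (dph + 1) & 0xFF           # inserted cycle: DPH := DPH + 1
        cycles += 1
    return (dph << 8) | dpl, cycles


print(absolute_indexed(0xA0A0, 0x10))    # stays within the page
print(absolute_indexed(0xA0F0, 0x20))    # crosses into the next page
```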
Indirect Indexed — e.g., “LDA ($A0),Y”
DPL := B := *PC; PC += 1
B := *zDP; DPL += 1
DPL := B + Y; DPH.db := *zDP; INCDPH.C
DPH := DPH + 1 # (dynamically inserted microinstruction)
B := *DP
A := B; SETF(NZ); IR := *PC; PC += 1; END
The first cycle of the Indirect Indexed sequence loads the one-byte zero page address into DPL. We then load the low-byte of the indirect base address into B, and increment DPL. The third cycle of the sequence applies the index to the low-byte of the indirect address, loads the high-byte of the indirect address into DPH, and tests for a page-crossing. As above, the control logic will add an additional cycle to adjust DPH if a page boundary is crossed. Once the target address is fully resolved, the target byte is read from memory by “B := *DP”, and the value is transferred to register A in the final cycle.
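The address resolution for this mode can be sketched in a few lines of Python. The function name and the memory dictionary are illustrative. The sketch assumes that “DPL += 1” wraps within eight bits, so the pointer's high byte is always read from zero page. The cycle counts follow from counting the microinstructions listed above, with and without the inserted one.

```python
def indirect_indexed(mem, zp_addr, y):
    """Resolve (zp),Y; return the target address and the cycle count."""
    low = mem[zp_addr]                    # B := *zDP; DPL += 1
    high = mem[(zp_addr + 1) & 0xFF]      # DPH.db := *zDP
    low_sum = low + y                     # DPL := B + Y
    cycles = 5
    if low_sum > 0xFF:                    # INCDPH.C detects the page crossing
        high = (high + 1) & 0xFF          # inserted cycle: DPH := DPH + 1
        cycles += 1
    return (high << 8) | (low_sum & 0xFF), cycles


mem = {0xA0: 0xF0, 0xA1: 0x12}            # the pointer at $A0 holds $12F0
print(indirect_indexed(mem, 0xA0, 0x05))
print(indirect_indexed(mem, 0xA0, 0x20))
```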
Indexed Indirect — e.g., “LDA ($A0,X)”
DPL := B := *PC; PC += 1
DPL := B + X; B := *zDP
T := *zDP; DPL += 1
DPH := *zDP
B := *DPt
A := B; SETF(NZ); IR := *PC; PC += 1; END
The Indexed Indirect mode applies the index to the zero-page address in the second cycle. This is a dead-cycle. We do not need to check the carry since zero-page addresses are 8-bits wide. The third and fourth cycles read the low and high bytes of the indirect address from zero page. The fifth cycle uses the fully resolved address to read the target byte from memory. The “*DPt” notation indicates that the lower 8-bits of the address bus will come from the T register, and the upper 8-bits from DPH. The final cycle in the sequence transfers the value read to the A register through the ALU, and sets flags accordingly.
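Here the index is applied before the pointer is read, so no page-crossing adjustment is ever needed. A minimal Python sketch of the resolution follows. The names are hypothetical, and both the indexed address and the incremented DPL are assumed to wrap within zero page.

```python
def indexed_indirect(mem, zp_addr, x):
    """Resolve (zp,X) and return the 16-bit target address."""
    pointer = (zp_addr + x) & 0xFF        # DPL := B + X, carry not needed
    t = mem[pointer]                      # T := *zDP; DPL += 1
    dph = mem[(pointer + 1) & 0xFF]       # DPH := *zDP
    return (dph << 8) | t                 # *DPt: low byte from T, high from DPH


mem = {0x10: 0x34, 0x11: 0x12}
target = indexed_indirect(mem, 0xF0, 0x20)   # $F0 + $20 wraps to $10
```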
Read, Modify, Write Instructions — e.g., “ASL $A0”
DPL := B := *PC; PC += 1
B := *zDP; ML
T := 0 ASL B; SETF(NZC)
*zDP := T
IR := *PC; PC += 1; END
By now, this microcode should look fairly familiar. We first fetch the operand (in this case a zero-page address). We then read the target byte from memory (Read Cycle “B := *zDP”), shift it left through the ALU (Modify Cycle “T := 0 ASL B”) and then write it back to the zero-page address (Write Cycle “*zDP := T”). The Modify Cycle is a dead-cycle. The NMOS 6502 performs a write to the zero-page address during this cycle, while the CMOS 65C02 performs a read. The microcode is not explicit about this. Rather, dedicated circuitry in the CPU is triggered by the “ML” operation which produces the correct behaviour. “ML” also brings the ML CPU pin low during the Read, Modify and Write cycles.
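The Read, Modify and Write cycles can be sketched as follows, including the difference in dead-cycle behaviour between the NMOS and CMOS parts. The function and its nmos parameter are hypothetical stand-ins for what the dedicated circuitry does, and the trace records only the R/W direction and address of those three cycles.

```python
def asl_zero_page(mem, addr, nmos=True):
    """Run the Read, Modify and Write cycles of ASL on a zero-page byte."""
    trace = [("R", addr)]                  # Read cycle: B := *zDP; ML
    b = mem[addr]

    t = (b << 1) & 0xFF                    # Modify cycle: T := 0 ASL B
    flags = {"N": t >> 7, "Z": int(t == 0), "C": b >> 7}
    # Dead cycle on the bus: NMOS parts write here, CMOS parts read.
    trace.append(("W" if nmos else "R", addr))

    mem[addr] = t                          # Write cycle: *zDP := T
    trace.append(("W", addr))
    return flags, trace


mem = {0xA0: 0x81}
flags, trace = asl_zero_page(mem, 0xA0)
print(hex(mem[0xA0]), flags, trace)
```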
One-Byte Opcodes — e.g., “PLP”
DPL := B := *PC; PC += 1
SP += 1; B := *SP
P := *SP
IR := *PC; PC += 1; END
As we will see below, the C74-6502’s microcode pipeline requires that all opcodes use exactly the same FetchOperand microinstruction (“DPL := B := *PC; PC += 1”). One-byte opcodes, like the one above, are detected by the CPU’s Control Unit and the 16-bit Incrementer is inhibited during this cycle. This ensures that PC is not advanced, and is left pointing correctly to the next opcode. This instruction also uses the Incrementer in the second cycle to modify SP (“SP += 1”). Note that only the lower 8-bits of the Incrementer are used when manipulating SP.
Flag-modifying Instructions — e.g., “CLI”
DPL := B := *PC; PC += 1
SETF(OPCODE 0); IR := *PC; PC += 1; END
Flag-modifying instructions are also one-byte opcodes, so the Incrementer is inhibited during the FetchOperand cycle. The “SETF(OPCODE 0)” operation in the second cycle tells the CPU to clear the status flag indicated by the upper two bits of the opcode. The “SETF(OPCODE 1)” directive does the same but sets the selected flag instead.
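The flag selection can be illustrated with a short Python sketch. The mapping from the upper two opcode bits to a flag is inferred here from the standard 6502 opcode values (CLC/SEC are $18/$38, CLI/SEI are $58/$78, CLV is $B8, and CLD/SED are $D8/$F8); the table and function names are hypothetical and do not describe the actual decoder.

```python
# Upper two opcode bits -> status flag (inferred from the standard opcode map).
FLAG_BY_TOP_BITS = {0b00: "C", 0b01: "I", 0b10: "V", 0b11: "D"}


def setf_opcode(flags, opcode, value):
    """SETF(OPCODE value): write `value` to the flag the opcode selects."""
    flag = FLAG_BY_TOP_BITS[opcode >> 6]
    flags[flag] = value
    return flag


flags = {"C": 1, "I": 1, "V": 1, "D": 0}
setf_opcode(flags, 0x58, 0)    # CLI uses SETF(OPCODE 0)
setf_opcode(flags, 0xF8, 1)    # SED uses SETF(OPCODE 1)
```

Note that the set-or-clear choice comes from the microcode operand, not from the opcode bits, which is why the sketch takes it as a separate argument.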
Branch Instructions — e.g., “BPL”
DPL := B := *PC; PC += 1
PCL := PCL + B; EXIT.CC
PCH = PCH + signextend(*); USE(IC)
IR := *PC; PC += 1; END
Branches require that the specified status bit be tested during the FetchOperand cycle. Once again, since the microcode pipeline requires that all instructions use exactly the same FetchOperand microinstruction, special circuitry is necessary for the branch test. The Control Unit automatically detects branches, performs the appropriate branch test, and terminates the execution of the branch if the test fails. It does so by dynamically replacing the next microinstruction in the sequence with a FetchOpcode (“IR := *PC; PC += 1; END”). If, on the other hand, the test succeeds and the branch needs to be taken, then execution of the microcode sequence continues. In the second cycle of the sequence, the branch offset is applied to the Program Counter (“PCL := PCL + B”). The “EXIT.CC” operation will terminate the execution of the microcode sequence if the carry is clear (again by forcing the execution of a FetchOpcode in the following cycle). Otherwise, the third cycle will execute to adjust the high-byte of PC. To do so, the low-byte result is sign-extended (generating a $00 for a positive result and an $FF for a negative result) and is added to PCH using the Internal Carry (IC) from the prior ALU operation. Finally the microcode will execute a FetchOpcode microinstruction from the now fully adjusted PC value to complete the branch.
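The arithmetic of a taken branch can be sketched in Python as below. Several things here are assumptions. The function name is hypothetical, the $00/$FF extension byte is derived from the sign of the offset in the usual two's-complement way, and the exact condition tested by EXIT.CC is abstracted as “the high byte of PC does not need to change”. The cycle counts follow from counting the microinstructions that execute in each case.

```python
def branch(pc, offset, taken):
    """Return (new PC, cycles) for a relative branch.

    `pc` is the Program Counter after the FetchOperand cycle and `offset`
    is the raw operand byte (0..255).
    """
    if not taken:
        return pc, 2                          # FetchOperand, forced FetchOpcode
    low_sum = (pc & 0xFF) + offset            # PCL := PCL + B
    carry = low_sum >> 8                      # Internal Carry (IC)
    extension = 0xFF if offset & 0x80 else 0x00
    pch = ((pc >> 8) + extension + carry) & 0xFF
    new_pc = (pch << 8) | (low_sum & 0xFF)
    # A third cycle is only needed when the high byte actually changes.
    cycles = 3 if pch == (pc >> 8) else 4
    return new_pc, cycles


print(branch(0x0210, 0xFE, True))    # two bytes back, same page
print(branch(0x02F0, 0x20, True))    # forward across a page boundary
print(branch(0x0205, 0xF0, True))    # backward across a page boundary
```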
DPL := B := *PC; PC += 1
B := *zDP
A AND B; SETF(NZV); BIT; IR := *PC; PC += 1; END
The BIT instruction performs a generic AND ALU operation but updates the flags in its own unique way. The “BIT” operation signals to the Control Unit to update flags accordingly.
DPL := B := *PC; PC += 1
*SP := PCH; SP -= 1
*SP := PCL; SP -= 1
*SP := P; SP -= 1; PBR.CLR
PCL := *fCP; DPL += 1
PCH := *fDP; SETF(SEI/CLD)
IR := *PC; PC += 1; END.INT
The seven-cycle BRK instruction uses some new constructs worth noting. After the normal FetchOperand cycle, the PC and P registers are pushed onto the stack in cycles two, three and four. The 65816 PBR register is also cleared in cycle four by “PBR.CLR” (this operation will be ignored if the K24 Card is not installed). The interrupt vector is then fetched from high memory (“PCL := *fCP”). “fCP” refers to an internally-generated address where the high-byte is $FF, while the low-byte value depends on the interrupt being processed. The low-byte will be $FE, $FC, or $FA for IRQ, Reset and NMI interrupts respectively. In the next microinstruction (“PCH := *fDP”), “fDP” sets the high-byte of the address bus to $FF, and the low-byte to the value in the DPL register. “SETF(SEI/CLD)” will set the I Flag and will also clear the D Flag if the 65C02 or 65816 instruction sets are in effect. Finally, a FetchOpcode microinstruction is executed with a special “END.INT” operation to signal to the Control Unit that an interrupt sequence has just completed.
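The vector addressing can be summarized in a small Python sketch. The table and function names are hypothetical. The low-byte values are the ones listed above, and the high byte of the vector is assumed to sit at the next address, as on a standard 6502.

```python
# Low byte of the vector address for each interrupt type (high byte is $FF).
VECTOR_LOW = {"IRQ": 0xFE, "RESET": 0xFC, "NMI": 0xFA}


def vector_addresses(kind):
    """Addresses read by the PCL fetch and the PCH fetch respectively."""
    low_addr = 0xFF00 | VECTOR_LOW[kind]
    return low_addr, low_addr + 1


def load_vector(mem, kind):
    """Assemble the new PC from the two vector bytes."""
    low_addr, high_addr = vector_addresses(kind)
    return mem[low_addr] | (mem[high_addr] << 8)


mem = {0xFFFA: 0x00, 0xFFFB: 0x80, 0xFFFE: 0x34, 0xFFFF: 0x12}
print(hex(load_vector(mem, "IRQ")), hex(load_vector(mem, "NMI")))
```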
Microcode is stored in six 8-bit ROMs (ROMs A through F); two ROMs on each of the three cards of the CPU. The ROMs are accessed using the opcode in the IR, a 4-bit Q State Counter and a two-bit instruction-set selector. The Q Counter is incremented once per cycle until an “END” operation in a FetchOpcode microinstruction resets it to zero. Hence, the Control Unit will walk down the microcode of each opcode in sequence, and execute a FetchOpcode microinstruction to jump to the next instruction.
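To illustrate how these three fields could combine into a ROM address, here is a Python sketch. The field widths come from the description above, but the bit ordering (selector in the top bits, Q counter in the bottom bits) is purely an assumption, as is the function name. The actual wiring is documented in the project files.

```python
def rom_address(instruction_set, opcode, q):
    """Pack selector (2 bits), opcode (8 bits) and Q (4 bits) into 14 bits."""
    if not (0 <= instruction_set < 4 and 0 <= opcode < 256 and 0 <= q < 16):
        raise ValueError("field out of range")
    return (instruction_set << 12) | (opcode << 4) | q   # assumed layout


# Walking the Q counter steps through consecutive locations for one opcode.
addresses = [rom_address(0, 0xA9, q) for q in range(3)]
print([hex(address) for address in addresses])
```

With six 8-bit ROMs read in parallel, each of these addresses selects a 48-bit microinstruction word.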
The C74-6502 microcode pipeline prefetches microcode from ROM one cycle ahead. That means that a new microinstruction will have already been fetched by the time a FetchOpcode completes. This fetched microinstruction will therefore come from the current opcode, rather than the new one. Hence, the CU will discard the fetched microinstruction and force the execution of a generated FetchOperand microinstruction instead. This is why every opcode on the C74-6502 executes the same generic microinstruction for the FetchOperand cycle. The microcode is specifically encoded such that FetchOperand is an “all-zeroes” microinstruction and can easily be generated by the CU when needed.
For this reason, the FetchOperand for every opcode is stored in ROM following the FetchOpcode at the end of the microcode sequence, rather than at the beginning. In fact, this microcode:
DPL := B := *PC; PC += 1
DPH := *PC; PC += 1
B := *DP
A := B; SETF(NZ); IR := *PC; PC += 1; END
is stored in ROM as follows:
DPH := *PC; PC += 1
B := *DP
A := B; SETF(NZ); IR := *PC; PC += 1; END
DPL := B := *PC; PC += 1
Effectively, the FetchOperand (“DPL := B := *PC; PC += 1”) at the end of the sequence is there for documentation only, since the CU will discard it anyway. Hence, any processing specific to the FetchOperand cycle of a particular opcode cannot be encoded in microcode. Instead, it must be handled by the Control Unit itself. The CU therefore includes dedicated logic to detect specific opcodes and append any required operations to their FetchOperand cycles as necessary. Please see C74-6502 Microcode Pipeline Notes for details.
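The reordering amounts to rotating the sequence by one position, as this small Python sketch shows. The function name is hypothetical, and the microinstructions are kept as plain strings rather than assembled words.

```python
def rom_order(sequence):
    """Move the leading FetchOperand to the end, as it is stored in ROM."""
    return sequence[1:] + sequence[:1]


lda_absolute = [
    "DPL := B := *PC; PC += 1",
    "DPH := *PC; PC += 1",
    "B := *DP",
    "A := B; SETF(NZ); IR := *PC; PC += 1; END",
]
stored = rom_order(lda_absolute)
for q, microinstruction in enumerate(stored):
    print(q, microinstruction)
```

The printed listing starts with the second microinstruction of the instruction, with the FetchOperand in the last slot, after the FetchOpcode.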
The C74-6502 implements four different instruction-sets, each stored in a separate bank in the microcode ROMs. These are the 6502, 65C02, 6502+NOPs and K24 instruction-sets. Details of these instruction-sets can be found in the C74-6502 Draft Datasheet.
The C74-6502 Microcode contains a complete listing of all four instruction sets and contents of all Control ROMs. Also available is a listing of all C74-6502 Microinstructions, as well as C74-6502 Decoder Values. These values are used by the Control Unit to decode microinstructions into all the necessary control signals used to drive the datapath. They are the link between the microcode and the internal circuitry of the CPU.
Binary images and assembler source code for all six Control ROMs can be found in the Project Files download available in the Internals section.