www.igglybob.com: R.E.T.A.R.D. - The Core
I. Introduction
The R.E.T.A.R.D. is composed of input/output regions and an internal processor (referred to herein as 'the core' or 'the R.E.T.A.R.D. core'). The functionality of the input/output, as seen by the user, was described in the proposal. However, the functionality and architecture of the internal processor itself were not defined in the proposal; this document aims to clarify the internal processor's structure and abilities.

II. Basic Architecture and Flow

The R.E.T.A.R.D. core is made up of the same elements as most CPUs, but it is arranged differently. Its most notable feature is that it can load a register from RAM and execute an operation on two registers concurrently. To allow this, the RAM is not connected to the output of the register file (as it is in a simple single cycle datapath). Instead, it has a separate channel of communication with the register file and the instruction decoder.

This also eliminates the need for an immediate register. The equivalent of an immediate add is to load a value into a register and then issue an add instruction. The load is done concurrently with the previous instruction, so the R.E.T.A.R.D. core's implementation of an immediate add is not slower than that of a CPU with an immediate register.

A register transfer level schematic of the R.E.T.A.R.D. core can be seen in Figure 1, and it can be compared with an equivalent schematic of a simple single cycle datapath in Figure 2.

Figure 1. RTL schematic of the R.E.T.A.R.D. core.

Figure 2. RTL schematic of a simple single cycle datapath.

The instructions accepted by the R.E.T.A.R.D. core are similar to those found in most instruction sets, though there are fewer of them. The number of instructions is kept small to keep the complexity of the core low, which also keeps down the cost of the FPGA on which the R.E.T.A.R.D. core will be implemented. As discussed earlier, the R.E.T.A.R.D. core has no immediate register; therefore, instructions involving immediate registers are not required and do not exist in the ISA. However, instructions do exist to load values directly into RAM or a register. Table 1 details all the instructions accepted by the R.E.T.A.R.D. core, their syntax, and a description of what they do.

Because concurrently loading a memory value and performing an operation takes up a large amount of addressing space in each instruction, most operations store their output in one of the registers passed as input. Any programmer with basic thinking skills should be able to use this, and clever programmers should be able to use it to their advantage.
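As an illustration of the concurrent load and the in-place result convention described above, here is a minimal Python sketch. It is a toy model under stated assumptions, not the actual ISA: the `Step` structure, the `run` function, the register count, and the rule that a load wins if it targets the same register as the operation are all hypothetical, since Table 1 is the authoritative list of instructions.

```python
import operator
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Step:
    """One instruction slot (hypothetical encoding, for illustration only)."""
    # ALU part: (function, register that is both input A and destination, input B)
    op: Optional[Tuple[Callable[[int, int], int], int, int]] = None
    # Load part: (destination register, RAM address)
    load: Optional[Tuple[int, int]] = None


def run(steps: List[Step], regs: List[int], ram: List[int]) -> List[int]:
    """Execute steps; the op and the load of one step both see the pre-step registers."""
    regs = list(regs)
    for step in steps:
        new_regs = list(regs)
        if step.op is not None:
            fn, a, b = step.op
            # The result overwrites one of the input registers.
            new_regs[a] = fn(regs[a], regs[b])
        if step.load is not None:
            dst, addr = step.load
            # Assumption: if the load and the op share a destination, the load wins.
            new_regs[dst] = ram[addr]
        regs = new_regs
    return regs


# "r0 = r0 + 5" without an immediate register: the constant 5 sits in RAM and is
# loaded into r1 while the previous instruction (r2 = r2 - r3) is executing.
ram = [5, 0, 0, 0]
program = [
    Step(op=(operator.sub, 2, 3), load=(1, 0)),  # r2 -= r3, concurrently r1 = ram[0]
    Step(op=(operator.add, 0, 1)),               # r0 += r1
]
result = run(program, [10, 0, 7, 3], ram)
```

In this model the add costs no extra step, because the load of the constant rides along with an instruction that had to run anyway.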
IV. Input/Output of the R.E.T.A.R.D. Core (the Short Bus)

As discussed in the original proposal, the R.E.T.A.R.D. will communicate with other devices solely over the Short Bus. The Short Bus is a 37-bit bus through which all the devices contained in the R.E.T.A.R.D. are connected. It is a master-slave bus, where the R.E.T.A.R.D. core is the master and is the only device allowed to control the first 5 bits, which are the control bits. All other devices can only respond to control requests and put data on the other 32 bits of the bus.

The first 3 bits are used for addressing specific devices; this allows for up to 7 devices (not including the R.E.T.A.R.D. core itself), which is sufficient for the project. The next 2 bits specify flags that are being passed to the device. Each device accepts different flags for different functions, and will respond differently. Specific information about how each device will respond to flags is not defined in this document. Figure 3 shows how the Short Bus interconnects all the devices to the R.E.T.A.R.D. core.

Figure 3. Block diagram of the R.E.T.A.R.D.

Due to the architecture of the R.E.T.A.R.D. core, an instruction cannot be performed in a single clock cycle. However, to simplify timing, every instruction will take the same number of clock cycles to complete. Also to simplify timing, the instruction decoder will be clocked at a fraction of the frequency of the other components. As a result, the instruction decoder does not need to store information about what it is waiting on, since it begins each of its own clock cycles by fetching the next instruction. Therefore, the R.E.T.A.R.D. core is not a state machine, making it different from many other processors.

VI. Detailed Description of Core Components

a. Instruction Decoder
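The timing scheme described in this document (every instruction takes the same number of clock cycles, and the instruction decoder is clocked at a fraction of the frequency of the other components) can be illustrated with a small Python sketch. The divider value of 4 used in the example is purely an assumption, since the actual fraction is not given here, and the function names are hypothetical.

```python
from typing import List


def fetch_schedule(fast_cycles: int, divider: int) -> List[int]:
    """Return the fast-clock cycle indices on which the decoder fetches.

    The decoder clock is modelled as the fast clock divided by `divider`,
    so a fetch happens once every `divider` fast cycles.
    """
    if divider < 1:
        raise ValueError("divider must be at least 1")
    return [cycle for cycle in range(fast_cycles) if cycle % divider == 0]


def instructions_completed(fast_cycles: int, divider: int) -> int:
    """Every instruction takes exactly `divider` fast cycles in this model."""
    if divider < 1:
        raise ValueError("divider must be at least 1")
    return fast_cycles // divider


# Assumed divider of 4: the decoder fetches on fast cycles 0, 4 and 8.
schedule = fetch_schedule(12, 4)
done = instructions_completed(12, 4)
```

Because the fetch points depend only on the cycle count, the decoder in this model needs no record of what the previous instruction was waiting on.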
b. RAM
c. Register File
d. ALU

The ALU is actually three smaller components: an arithmetic unit, a logic unit, and a shift unit. Each of these components does what its name implies: the arithmetic unit adds, subtracts, increments, or decrements its input; the logic unit performs logical operations on two input registers; the shift unit performs shifting and rotating operations. This is functionally equivalent to the ALU in the simple single cycle datapath.

Figure 7. I/O diagram of ALU.
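The three-unit split can be sketched in Python as follows. This is only an illustrative model under assumptions: the 32-bit width is borrowed from the 32 data bits of the Short Bus and may not match the real register width, the operation names are hypothetical, and the particular logical operations (AND, OR, XOR) are a guess at what "logical operations" covers.

```python
WIDTH = 32                 # assumed register width
MASK = (1 << WIDTH) - 1    # keeps results within WIDTH bits


def arithmetic_unit(op: str, a: int, b: int = 0) -> int:
    """Add, subtract, increment, or decrement, wrapping at WIDTH bits."""
    if op == "add":
        result = a + b
    elif op == "sub":
        result = a - b
    elif op == "inc":
        result = a + 1
    elif op == "dec":
        result = a - 1
    else:
        raise ValueError(f"unknown arithmetic op: {op}")
    return result & MASK


def logic_unit(op: str, a: int, b: int) -> int:
    """Bitwise operations on two inputs (the set of ops is an assumption)."""
    if op == "and":
        return (a & b) & MASK
    if op == "or":
        return (a | b) & MASK
    if op == "xor":
        return (a ^ b) & MASK
    raise ValueError(f"unknown logic op: {op}")


def shift_unit(op: str, a: int, n: int) -> int:
    """Shift or rotate `a` by `n` bit positions within WIDTH bits."""
    a &= MASK
    n %= WIDTH
    if op == "shl":
        return (a << n) & MASK
    if op == "shr":
        return a >> n
    if op == "rol":
        return ((a << n) | (a >> (WIDTH - n))) & MASK
    if op == "ror":
        return ((a >> n) | (a << (WIDTH - n))) & MASK
    raise ValueError(f"unknown shift op: {op}")
```

For example, `arithmetic_unit("sub", 0, 1)` wraps around to `0xFFFFFFFF` in this model, and `shift_unit("rol", 0x80000001, 1)` gives `0x00000003`.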
f. Stack Module

The stack module holds information about the memory addresses currently on the stack. It receives input from the instruction decoder and the comparator. It is capable of adding registers to the stack and removing them from the stack. The stack can be up to 8 levels deep; recursion will not be necessary.

Figure 9. I/O diagram of stack module.

VII. References and Useful Links

[1] R.E.T.A.R.D. Project Proposal - http://www.igglybob.com/projects/retard/proposal.php
[2] Project Timeline - http://www.igglybob.com/projects/retard/project_timeline.php
[3] Requirements and Test Document - http://www.igglybob.com/projects/retard/requirements.php