CA2 Project 2

In this homework, you are going to extend your Homework 4 CPU into a pipelined CPU. This CPU has 32 registers and a 1KB instruction memory. It should take 32-bit binary codes as input, execute the corresponding RISC-V instructions, and save the results of arithmetic operations into the corresponding registers. Besides the instructions specified in Homework 4, you also have to support lw, sw, and beq in this project. We will check the correctness of your implementation by dumping the value of every register and of the data memory after each cycle.

 

1.1. Load / Store Operations
        In this project, you are provided with a “Data_Memory” module and are required to implement the lw and sw instructions. Figure 1 shows the data path after adding data memory to your Homework 4 CPU. The dashed lines are placeholders for modules introduced in the following sections; for now, you can treat them as solid lines.

  

Figure 1 Single-cycle CPU with Data Memory 

1.2. Pipeline Registers
        To support pipelined execution, the first step is to add pipeline registers to the CPU. Pipeline registers store the control signals and data produced by the previous stage and isolate the stages from each other. Note that you have to use non-blocking assignments in your pipeline registers to get the correct result, and that you have to properly initialize the reg values in your pipeline registers.

  Figure 2 Datapath after adding pipeline registers (colored in green) 
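
A pipeline register along these lines could look as follows. This is only a minimal sketch of an IF/ID register: the module name IF_ID, all port names, and the stall_i / flush_i inputs (which only become relevant with the hazard handling of Section 1.4) are assumptions and not part of the provided files. Treating an all-zero instruction as a nop is likewise an assumption about how your Control module decodes it.

```verilog
// Hypothetical IF/ID pipeline register; names are illustrative only.
module IF_ID (
    input             clk_i,
    input             stall_i,   // assumed: hold the current contents
    input             flush_i,   // assumed: squash the fetched instruction
    input      [31:0] pc_i,
    input      [31:0] instr_i,
    output reg [31:0] pc_o,
    output reg [31:0] instr_o
);

// Start from a known state so that no x values enter the pipeline.
initial begin
    pc_o    = 32'b0;
    instr_o = 32'b0;
end

// Non-blocking assignments: all pipeline registers sample their inputs
// on the same clock edge before any of them updates its outputs.
always @(posedge clk_i) begin
    if (flush_i) begin
        pc_o    <= 32'b0;
        instr_o <= 32'b0;        // assumed to decode as a nop
    end
    else if (!stall_i) begin
        pc_o    <= pc_i;
        instr_o <= instr_i;
    end
    // when stall_i is set, both registers keep their old values
end

endmodule
```

The initial block inside the module is one way to initialize the registers; initializing them from the testbench is another. With blocking assignments, the value written by one stage could be seen by the next stage within the same clock edge, which is why non-blocking assignments are required.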

1.3. Data Hazard and Forwarding Control
        The first issue after adding pipelined execution is handling data hazards. For example, if we execute “add t3, t1, t2” and “add t4, t1, t3” back to back, t3 will not be written back until the 5th cycle, but the second instruction enters the execution stage in the 4th cycle and reads the old value of t3, producing a different result from the single-cycle implementation. To handle such data hazards properly, we can forward the ALU result of the first instruction to the execution stage of the second instruction.

        As a result, we need a “forwarding unit” that determines whether data should be forwarded from the MEM stage or the WB stage to the EX stage. If the forwarding unit finds that an instruction in a later stage will write back to a register that is read in the EX stage, the data should be forwarded from that later stage to ensure a correct arithmetic operation. More specifically, Table 1, Listing 1, and Listing 2 show the method recommended by the textbook for resolving data hazards.

        Note that the forwarding unit is placed in the EX stage, which implies that we only forward to the EX stage. You don’t have to handle cases in which forwarding to the ID stage is necessary for correctness. For example, instruction sequences like

        add x5, x6, x7
        beq x5, x4, BRANCH

        will not be involved in our evaluation.

MUX       value  Source  Explanation
--------  -----  ------  -----------
ForwardA  00     ID/EX   The first ALU operand comes from the register file.
ForwardA  10     EX/MEM  The first ALU operand is forwarded from the prior ALU result.
ForwardA  01     MEM/WB  The first ALU operand is forwarded from data memory or an earlier ALU result.
ForwardB  00     ID/EX   The second ALU operand comes from the register file.
ForwardB  10     EX/MEM  The second ALU operand is forwarded from the prior ALU result.
ForwardB  01     MEM/WB  The second ALU operand is forwarded from data memory or an earlier ALU result.
Table 1 Forwarding Control 

if (EX/MEM.RegWrite and (EX/MEM.RegisterRd != 0)
    and (EX/MEM.RegisterRd = ID/EX.RegisterRs1)) ForwardA = 10

if (EX/MEM.RegWrite and (EX/MEM.RegisterRd != 0)
    and (EX/MEM.RegisterRd = ID/EX.RegisterRs2)) ForwardB = 10
Listing 1 EX hazard 

if (MEM/WB.RegWrite and (MEM/WB.RegisterRd != 0)
    and not(EX/MEM.RegWrite and (EX/MEM.RegisterRd != 0)
            and (EX/MEM.RegisterRd = ID/EX.RegisterRs1))
    and (MEM/WB.RegisterRd = ID/EX.RegisterRs1)) ForwardA = 01

if (MEM/WB.RegWrite and (MEM/WB.RegisterRd != 0)
    and not(EX/MEM.RegWrite and (EX/MEM.RegisterRd != 0)
            and (EX/MEM.RegisterRd = ID/EX.RegisterRs2))
    and (MEM/WB.RegisterRd = ID/EX.RegisterRs2)) ForwardB = 01
Listing 2 MEM hazard 

  

Figure 3 Data path after adding Forwarding Control (colored in blue)
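
The conditions in Listing 1 and Listing 2 can be sketched as a small combinational module. The module name Forwarding_Unit and every port name below are assumptions chosen for illustration; only the conditions and the 00 / 10 / 01 encodings come from Table 1 and the two listings.

```verilog
// Hypothetical forwarding unit; port names are illustrative only.
module Forwarding_Unit (
    input      [4:0] EX_rs1_i,        // ID/EX.RegisterRs1
    input      [4:0] EX_rs2_i,        // ID/EX.RegisterRs2
    input      [4:0] MEM_rd_i,        // EX/MEM.RegisterRd
    input            MEM_RegWrite_i,  // EX/MEM.RegWrite
    input      [4:0] WB_rd_i,         // MEM/WB.RegisterRd
    input            WB_RegWrite_i,   // MEM/WB.RegWrite
    output reg [1:0] ForwardA_o,
    output reg [1:0] ForwardB_o
);

always @(*) begin
    // Default: both operands come from the register file (ID/EX).
    ForwardA_o = 2'b00;
    ForwardB_o = 2'b00;

    // EX hazard (Listing 1) takes priority; the else-if branch is the
    // MEM hazard (Listing 2), which applies only if no EX hazard exists.
    if (MEM_RegWrite_i && (MEM_rd_i != 5'd0) && (MEM_rd_i == EX_rs1_i))
        ForwardA_o = 2'b10;
    else if (WB_RegWrite_i && (WB_rd_i != 5'd0) && (WB_rd_i == EX_rs1_i))
        ForwardA_o = 2'b01;

    if (MEM_RegWrite_i && (MEM_rd_i != 5'd0) && (MEM_rd_i == EX_rs2_i))
        ForwardB_o = 2'b10;
    else if (WB_RegWrite_i && (WB_rd_i != 5'd0) && (WB_rd_i == EX_rs2_i))
        ForwardB_o = 2'b01;
end

endmodule
```

Writing the MEM hazard as the else branch of the EX hazard expresses the “and not(EX hazard)” clause of Listing 2 without repeating it, so the most recent result always wins when both stages target the same register.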

1.4. Hazard Detection, Stall, and Flush
        Data hazards caused by arithmetic operations can be resolved by forwarding, but data hazards caused by a “load” cannot be resolved by forwarding alone and require a stall. Another major difference between Homework 4 and this project is that we have to support a branch instruction, which causes a control hazard whenever the prediction is wrong.

  

Figure 4 An example to stall from textbook 

        As a result, we have to implement a “hazard detection unit” that decides when to stall the pipeline and when to flush because a control hazard happens. For stalls, the hazard detection unit checks whether the instruction in the EX stage is a load whose rd is the same as rs1 or rs2 of the instruction in the ID stage. If so, it inserts a nop (no operation) into the pipeline to resolve the data hazard.

  

Figure 5 Final Datapath after Adding Hazard Detection Unit (colored in orange) 
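
The stall check can be sketched as follows, using the textbook's load-use condition (the instruction in EX reads memory and its rd matches a source register in ID). The module name Hazard_Detection, the port names, and the split into three outputs are assumptions for illustration; your data path may name or group these signals differently.

```verilog
// Hypothetical hazard detection unit; names are illustrative only.
module Hazard_Detection (
    input  [4:0] ID_rs1_i,       // rs1 of the instruction in ID
    input  [4:0] ID_rs2_i,       // rs2 of the instruction in ID
    input  [4:0] EX_rd_i,        // rd of the instruction in EX
    input        EX_MemRead_i,   // 1 when the instruction in EX is a load
    output       PCWrite_o,      // 0 keeps the PC unchanged for one cycle
    output       Stall_o,        // 1 keeps the IF/ID register unchanged
    output       NoOp_o          // 1 zeroes the control signals entering EX
);

// Load-use hazard: the loaded value is not available until after MEM,
// so the dependent instruction must wait one cycle in ID.
wire load_use = EX_MemRead_i &&
                ((EX_rd_i == ID_rs1_i) || (EX_rd_i == ID_rs2_i));

assign Stall_o   = load_use;
assign PCWrite_o = ~load_use;
assign NoOp_o    = load_use;

endmodule
```

After the one-cycle stall, the loaded value sits in the MEM/WB register and reaches the dependent instruction through the forwarding path with value 01.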

        Note that to mitigate the impact of the branch instruction, we make the branch decision in the ID stage without using the ALU, which is an implementation recommended by the textbook. You don’t have to handle forwarding to the ID stage. We will avoid test cases that require forwarding to the ID stage (see the example at the end of 1.3).
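
Deciding the branch in the ID stage amounts to an equality comparator and a separate adder next to the register file. The sketch below assumes a module named Branch_Unit, a Branch_i signal asserted by Control for beq, and an immediate that already contains the implicit zero in bit 0; all of these names are hypothetical.

```verilog
// Hypothetical ID-stage branch logic; names are illustrative only.
module Branch_Unit (
    input         Branch_i,     // assumed: 1 when the instruction in ID is beq
    input  [31:0] rs1_data_i,   // register file read data 1
    input  [31:0] rs2_data_i,   // register file read data 2
    input  [31:0] pc_i,         // PC of the instruction in ID
    input  [31:0] imm_i,        // sign-extended byte offset (bit 0 is 0)
    output        Flush_o,      // 1 when the branch is taken
    output [31:0] target_o      // branch target address
);

// beq is taken when both source registers hold the same value.
assign Flush_o  = Branch_i & (rs1_data_i == rs2_data_i);

// The target is computed with a dedicated adder instead of the ALU.
assign target_o = pc_i + imm_i;

endmodule
```

When Flush_o is 1, the instruction fetched after the beq is on the wrong path, so it has to be replaced by a nop while the PC is redirected to target_o.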

1.5. Instructions
        Besides the instructions specified in Homework 4, you have to support three additional instructions: lw, sw, and beq. The machine code of all supported instructions is as follows.

funct7        rs2       rs1  funct3  rd           opcode   function
------------  --------  ---  ------  -----------  -------  --------
0000000       rs2       rs1  111     rd           0110011  and
0000000       rs2       rs1  100     rd           0110011  xor
0000000       rs2       rs1  001     rd           0110011  sll
0000000       rs2       rs1  000     rd           0110011  add
0100000       rs2       rs1  000     rd           0110011  sub
0000001       rs2       rs1  000     rd           0110011  mul
imm[11:0]               rs1  000     rd           0010011  addi
0100000       imm[4:0]  rs1  101     rd           0010011  srai
imm[11:0]               rs1  010     rd           0000011  lw
imm[11:5]     rs2       rs1  010     imm[4:0]     0100011  sw
imm[12,10:5]  rs2       rs1  000     imm[4:1,11]  1100011  beq

For addi and lw, imm[11:0] spans the funct7 and rs2 fields.
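
The immediate fields in the table are scattered differently for each format, so a sketch of how they could be reassembled and sign-extended may help. The module name Imm_Gen and its ports are assumptions; the bit positions follow the table (funct7 is bits 31:25, rs2 is 24:20, rd is 11:7, opcode is 6:0).

```verilog
// Hypothetical immediate generator; names are illustrative only.
module Imm_Gen (
    input      [31:0] instr_i,
    output reg [31:0] imm_o
);

always @(*) begin
    case (instr_i[6:0])
        7'b0010011: begin                    // addi, srai
            if (instr_i[14:12] == 3'b101)
                imm_o = {27'b0, instr_i[24:20]};              // srai: imm[4:0]
            else
                imm_o = {{20{instr_i[31]}}, instr_i[31:20]};  // addi: imm[11:0]
        end
        7'b0000011:                          // lw: imm[11:0]
            imm_o = {{20{instr_i[31]}}, instr_i[31:20]};
        7'b0100011:                          // sw: imm[11:5], imm[4:0]
            imm_o = {{20{instr_i[31]}}, instr_i[31:25], instr_i[11:7]};
        7'b1100011:                          // beq: imm[12], imm[11], imm[10:5], imm[4:1], 0
            imm_o = {{19{instr_i[31]}}, instr_i[31], instr_i[7],
                     instr_i[30:25], instr_i[11:8], 1'b0};
        default:
            imm_o = 32'b0;                   // R-type instructions have no immediate
    endcase
end

endmodule
```

Note that the beq offset has no stored bit 0: it is always zero, which is why the sketch appends 1'b0 and the result can be added to the PC directly.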
 

1.6. Input / Output Format  
        Besides the modules listed above, you are also provided with “testbench.v” and “instruction.txt”. After you finish your modules and CPU, you should compile all of them, including “testbench.v”. A recommended compilation command is

$ iverilog *.v -o CPU.out 

Then by default, your CPU loads “instruction.txt”, which should be placed in the same directory as CPU.out, into the instruction memory. This part is written in “testbench.v”; you don’t have to change it. “instruction.txt” is a plain text file with one instruction per line, each written as 32 bits (ASCII 0 or 1). For example, the first 3 lines in “instruction.txt” are

 

0000000_00000_00000_000_01000_0110011 //add  $t0,$0,$0 

000000001010_00000_000_01001_0010011  //addi $t1,$0,10 

000000001101_00000_000_01010_0010011  //addi $t2,$0,13 

 

        Note that underscores and text after “//” (i.e. comments) are ignored; they are inserted simply for human readability. Therefore, the CPU should take “00000000000000000000010000110011” and execute it in the first cycle, then “00000000101000000000010010010011” in the second cycle, “00000000110100000000010100010011” in the third, and so on.
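
This behaviour matches how Verilog's $readmemb system task parses a file: it accepts underscores inside binary numbers and skips comments. The sketch below assumes the instruction memory is loaded this way; the module name readmem_demo and the array name memory are hypothetical, and the provided “testbench.v” may load the file differently.

```verilog
// Hypothetical stand-alone demo of loading instruction.txt.
module readmem_demo;

reg [31:0] memory [0:255];   // 256 words of 32 bits = 1KB
integer i;

initial begin
    // Underscores and // comments in the file are skipped by $readmemb.
    $readmemb("instruction.txt", memory);

    // Print the first three instructions as plain 32-bit binary strings.
    for (i = 0; i < 3; i = i + 1)
        $display("%b", memory[i]);
end

endmodule
```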

 

        Also, if you include the unchanged “testbench.v” in the compilation, the program will generate a plain text file named “output.txt”, which dumps the values of all registers and the data memory at each cycle of execution. The file is self-explanatory.  

             

1.7. Modules You Need to Add or Modify
1.7.1. Control and ALU_Control

Because your CPU has to support load/store instructions in this project, which were not involved in Homework 4, you have to add some control signals to the Control and ALU_Control modules.

1.7.2. Pipeline Registers

As introduced in Section 1.2, you have to implement 4 pipeline registers to isolate the 5 pipeline stages and pass essential information to the next stage.

1.7.3. Forwarding Unit

As introduced in Section 1.3, you need a forwarding unit and two multiplexers to control forwarding from the MEM stage or the WB stage.

1.7.4. Hazard Detection Unit

As introduced in Section 1.4, you need a hazard detection unit to handle the necessary stalls and nops (no operation).

1.7.5. testbench

You have to initialize the reg values in your pipeline registers before any instruction is executed. It is recommended that you initialize them in the “initial” block of testbench.v. Apart from register initialization, please do not change the output format (the $fdisplay part) of this file.

1.7.6. Others

You can add more modules than listed above if you want. Figure 5 is simply a recommended data path for you to refer to. You are free to change some details as long as your CPU can perform correctly.

1.7.7. CPU

Use structural modeling to connect the inputs and outputs of the modules, following the data path in Figure 5.
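
Structural modeling means that the top-level module contains only wires and module instances, with no behavioural logic of its own. As a small illustration, the sketch below wires an adder and a multiplexer into a next-PC selector. The modules Adder, MUX32, and Next_PC and all of their ports are hypothetical; your own modules from Homework 4 may use different names.

```verilog
// Hypothetical building blocks; names are illustrative only.
module Adder (
    input  [31:0] a_i,
    input  [31:0] b_i,
    output [31:0] sum_o
);
assign sum_o = a_i + b_i;
endmodule

module MUX32 (
    input  [31:0] a_i,      // selected when sel_i is 0
    input  [31:0] b_i,      // selected when sel_i is 1
    input         sel_i,
    output [31:0] data_o
);
assign data_o = sel_i ? b_i : a_i;
endmodule

// Structural top level: only wires and named port connections.
module Next_PC (
    input  [31:0] pc_i,
    input  [31:0] branch_target_i,
    input         flush_i,           // assumed: 1 when a branch is taken
    output [31:0] next_pc_o
);

wire [31:0] pc_plus4;

Adder Add_PC (
    .a_i   (pc_i),
    .b_i   (32'd4),
    .sum_o (pc_plus4)
);

MUX32 MUX_PC (
    .a_i    (pc_plus4),
    .b_i    (branch_target_i),
    .sel_i  (flush_i),
    .data_o (next_pc_o)
);

endmodule
```

Named port connections (.port(signal)) make it easier to check each wire against the figure than positional connections do.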

1.8. Reminder

        Project 2 will be strongly related to this homework. Please make sure you fully understand how to do this homework; otherwise, you may encounter difficulties in your next project. Plagiarism is strictly prohibited.

             

2.   Report 
 

2.1. Modules Explanation

        You should briefly explain how the modules you implement work in the report. You have to explain them in human-readable sentences. Either English or Chinese is welcome, but no Verilog; explaining Verilog modules in Verilog is nonsense. Simply pasting your code into the report with little or no explanation will get zero points for the report. You have to write in more detail than Section 1.7.

        Taking “PC.v” as an example, an acceptable report would be:

PC module reads clock signals, reset bit, start bit, and next cycle PC as input, and outputs the PC of the current cycle. This module changes its internal register “pc_o” at the positive edge of the clock signal. When the reset signal is set, PC is reset to 0. And PC will only be updated by next PC when the start bit is on.
 

        The following report, by contrast, would get zero points:

The inputs of PC are clk_i, rst_i, start_i, pc_i, and the output is pc_o. It works as follows:

 

always@(posedge clk_i or negedge rst_i) begin
    if(rst_i) begin
        pc_o <= 32'b0;
    end
    else begin
        if(start_i)
            pc_o <= pc_i;
        else
            pc_o <= pc_o;
    end
end
 

2.2. Members & Teamwork

        Specify your team members and your work division. For example, who writes the pipeline registers, who is in charge of connecting wires in the CPU, etc.

2.3. Difficulties Encountered and Solutions in This Project

        Write down any difficulties you encountered in doing this project and how you finally solved them.

2.4. Development Environment

        Please specify the OS (e.g. MacOS, Windows, Ubuntu 18.04) and the compiler (e.g. iverilog) or IDE (e.g. ModelSim) you use in the report, in case we cannot reproduce the result you obtained on your computer.  
