Transcription of Datapath & Control Design
Slide 1: Datapath & Control Design
We will design a simplified MIPS processor. The instructions supported are:
- memory-reference instructions: lw, sw
- arithmetic-logical instructions: add, sub, and, or, slt
- control flow instructions: beq, j
Generic implementation:
- use the program counter (PC) to supply the instruction address
- get the instruction from memory
- read registers
- use the instruction to decide exactly what to do
All instructions use the ALU after reading the registers. Why? Memory-reference? Arithmetic? Control flow?

Slide 2: What blocks we need
- We need an ALU. We have already designed that.
- We need memory to store instructions and data:
  - instruction memory takes an address and supplies an instruction
  - data memory takes an address and supplies data for lw
  - data memory takes an address and data and writes into memory
- We need to manage a PC and its update mechanism.
- We need a register file that includes 32 registers. We read two operands and write a result back into the register file.
- Sometimes part of the operand comes from the instruction.
- We may add support for the immediate class of instructions.
- We may add support for J, JR, JAL.

Slide 3: Simple Implementation
Include the functional units we need for each instruction. Why do we need this stuff?
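One way to approach the "Why?" above is to write down what the ALU computes for each supported instruction once the registers have been read. The sketch below is illustrative only: the names `alu_use`, `to_signed`, and `MASK32` are made up for this example, Python integers stand in for 32-bit register values, and `j` is left out.

```python
MASK32 = 0xFFFFFFFF  # keep results within 32 bits


def to_signed(value):
    """Interpret a 32-bit pattern as a two's-complement signed integer."""
    value &= MASK32
    return value - (1 << 32) if value & 0x80000000 else value


def alu_use(mnemonic, rs_val, rt_val, imm=0):
    """Return (alu_operation, alu_result) for one instruction.

    rs_val and rt_val are the two register values that were read;
    imm is the already sign-extended offset used by lw and sw.
    """
    if mnemonic in ("lw", "sw"):
        # Memory reference: the ALU adds the base register and the offset.
        return "add", (rs_val + imm) & MASK32
    if mnemonic == "add":
        return "add", (rs_val + rt_val) & MASK32
    if mnemonic == "sub":
        return "sub", (rs_val - rt_val) & MASK32
    if mnemonic == "and":
        return "and", rs_val & rt_val & MASK32
    if mnemonic == "or":
        return "or", (rs_val | rt_val) & MASK32
    if mnemonic == "slt":
        # Set on less than: signed comparison, result is 1 or 0.
        return "slt", int(to_signed(rs_val) < to_signed(rt_val))
    if mnemonic == "beq":
        # Control flow: subtract, then test the result for zero.
        return "sub", (rs_val - rt_val) & MASK32
    raise ValueError(f"unsupported instruction: {mnemonic}")
```

In this sketch, lw and sw use the ALU for address calculation, the arithmetic-logical instructions use it for the operation itself, and beq uses it for the comparison.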
[Figures: (a) instruction memory, (b) program counter, (c) adder; (a) registers, (b) ALU; (a) data memory unit, (b) sign-extension unit]

Slide 4: More Implementation Details
Abstract / simplified view. Two types of functional units:
- elements that operate on data values (combinational). Example: ALU
- elements that contain state (sequential). Examples: program and data memory, register file
[Figure: abstract view connecting the PC, instruction memory, registers, ALU, and data memory]

Slide 5: Managing State Elements
- Unclocked vs. clocked
- Clocks are used in synchronous logic: when should an element that contains state be updated?
[Figure: clock waveform marking the cycle time, rising edge, and falling edge]

Slide 6: MIPS Instruction Format
Format   31-26       25-21   20-16   15-11   10-6           5-0
JUMP     JUMP        JUMP ADDRESS (25-0)
BRANCH   BEQ/BNE/J   REG 1   REG 2   BRANCH ADDRESS OFFSET (15-0)
STORE    SW          REG 1   REG 2   STORE ADDRESS OFFSET (15-0)
LOAD     LW          REG 1   REG 2   LOAD ADDRESS OFFSET (15-0)
R-TYPE   R-TYPE      REG 1   REG 2   DST     SHIFT AMOUNT   ADD/AND/OR/SLT
I-TYPE   I-TYPE      REG 1   REG 2   IMMEDIATE DATA (15-0)

Slide 7: Building the Datapath
Use multiplexors to stitch them together.
[Figure: datapath with PC, instruction memory, registers, ALU, data memory, sign extend, shift left 2, two adders, and multiplexors; control signals RegWrite, MemRead, MemWrite, PCSrc, ALUSrc, MemtoReg, and the 3-bit ALU operation]

Slide 8: A Complete Datapath for R-Type Instructions
Lw, sw, add, sub, and, or, slt can be performed. For j (jump) we need an additional multiplexor.
[Figure: complete datapath using instruction fields [31-0], [25-21], [20-16], [15-11], [15-0], [5-0]; control signals RegDst, ALUSrc, MemtoReg, RegWrite, MemRead, MemWrite, ALUOp, PCSrc; ALU control block]

Slide 9: What Else is Needed in Data Path
Support for j and jr: for both of them, the PC value needs to come from somewhere else.
For J, the PC is formed from 4 bits (31:28) of the old PC, 26 bits from the IR (27:2), and 2 zero bits (1:0).
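The jump-address construction described for J can be sketched in a few lines. This is a minimal illustration, not the slide author's code: the function name `jump_target` is made up, the `pc` argument is whichever value the text calls the old PC, and the 26-bit field is taken from instruction bits 25:0 and placed into PC bits 27:2.

```python
def jump_target(pc, instruction):
    """Form the 32-bit jump target from the old PC and a J-format word."""
    upper = pc & 0xF0000000            # bits 31:28 come from the old PC
    field = instruction & 0x03FFFFFF   # 26-bit jump address field (bits 25:0)
    return upper | (field << 2)        # field fills bits 27:2; bits 1:0 are zero
```

For example, with an old PC of 0x40000008 and the instruction word 0x08000100 (MIPS opcode 2 for j, address field 0x100), the sketch yields 0x40000400.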
For JR, the PC value comes from a register.
Support for JAL: the address is the same as for the J instruction, and the old PC needs to be saved in register 31.
And what about immediate operand instructions? The second operand comes from the instruction, but without shifting.
Support for other instructions, like lw and immediate instruction write.

Slide 10: Operation for Each Instruction
Step  LW                   SW                   R/I/S-Type                 BR-Type                JMP-Type
1     READ INST            READ INST            READ INST                  READ INST              READ INST
2     READ REG 1, REG 2    READ REG 1, REG 2    READ REG 1, REG 2          READ REG 1, REG 2      -
3     ADD REG 1 + OFFSET   ADD REG 1 + OFFSET   OPERATE on REG 1 / REG 2   SUB REG 2 from REG 1   -
4     READ MEM             WRITE MEM            -                          -                      -
5     WRITE REG 2          -                    WRITE DST                  -                      -

Slide 11: Data Path Operation
[Figure: single-cycle datapath with PC, INST MEMORY (IA, INST), REG FILE (RA1, RA2, RD1, RD2, WA, WD), ALU, DATA MEMORY (MA, MD, WD), SignExt, Shift Left 2, two ADD units, and MUXes, plus CONTROL and ALU CON blocks; instruction fields 31-26, 25-21, 20-16, 15-11, 15-0, 5-0, 25-0; signals jmp, br, zero, WE, RDES, ALUSRC, ALUOP, MR, MW, Memreg]

Slide 12: Our Simple Control Structure
All of the logic is combinational. We wait for everything to settle down.
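The five steps tabulated under "Operation for Each Instruction" all take place within a single cycle, with state updated only at the end. A minimal sketch of that table as code follows. Everything named here (`run_one`, the `state` and `inst` dictionaries, the pre-decoded instruction form) is hypothetical and chosen for illustration; Python integers stand in for 32-bit words with overflow ignored, and the taken-branch target is assumed to be PC + 4 plus the offset shifted left by 2, matching the Shift Left 2 block in the datapath figure.

```python
def sign_extend16(value):
    """Sign-extend a 16-bit field to a Python integer."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


# ALU operations for the R-type instructions named in this document.
R_TYPE_OPS = {
    "add": lambda a, b: a + b,
    "sub": lambda a, b: a - b,
    "and": lambda a, b: a & b,
    "or": lambda a, b: a | b,
    "slt": lambda a, b: int(a < b),
}


def run_one(state, inst):
    """Execute one instruction and return the list of steps it used.

    state: {"pc": int, "regs": list of 32 ints, "mem": dict address -> word}
    inst:  hypothetical pre-decoded instruction, for example
           {"op": "lw", "rs": 2, "rt": 1, "imm": 8}
    """
    steps = ["READ INST"]                                # step 1
    op = inst["op"]
    next_pc = state["pc"] + 4
    if op == "j":
        state["pc"] = inst["target"]                     # steps 2 to 5 unused
        return steps
    reg1 = state["regs"][inst["rs"]]
    reg2 = state["regs"][inst["rt"]]
    steps.append("READ REG 1, REG 2")                    # step 2
    if op in ("lw", "sw"):
        address = reg1 + sign_extend16(inst["imm"])
        steps.append("ADD REG 1 + OFFSET")               # step 3
        if op == "lw":
            loaded = state["mem"].get(address, 0)
            steps.append("READ MEM")                     # step 4
            state["regs"][inst["rt"]] = loaded
            steps.append("WRITE REG 2")                  # step 5
        else:
            state["mem"][address] = reg2
            steps.append("WRITE MEM")                    # step 4
    elif op == "beq":
        steps.append("SUB REG 2 from REG 1")             # step 3
        if reg1 - reg2 == 0:
            next_pc += sign_extend16(inst["imm"]) << 2   # taken branch
    elif op in R_TYPE_OPS:
        result = R_TYPE_OPS[op](reg1, reg2)
        steps.append("OPERATE on REG 1 / REG 2")         # step 3
        state["regs"][inst["rd"]] = result
        steps.append("WRITE DST")                        # step 5
    else:
        raise ValueError(f"unsupported instruction: {op}")
    state["pc"] = next_pc
    return steps
```

Calling `run_one` with a load returns all five steps, while a jump returns only the first, which mirrors the empty cells in the table.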
Once everything has settled, the right thing is done. In our simple control structure, the ALU might not produce the right answer right away, so we use write signals along with the clock to determine when to write.
Cycle time is determined by the length of the longest path.
We are ignoring some details like setup and hold times.
[Figure: state element 1, combinational logic, state element 2, spanning one clock cycle]

Slide 13: Control Points
[Figure: datapath diagram with PC, INST MEMORY, REG FILE, ALU, DATA MEMORY, SignExt, Shift Left 2, ADD units, MUXes, CONTROL, and ALU CON; signals jmp, br, zero, WE, RDES, ALUSRC, ALUOP, MR, MW, Memreg]

Slide 14: LW Instruction Operation
[Figure: same datapath diagram]

Slide 15: SW Instruction Operation
[Figure: same datapath diagram]

Slide 16: R-Type Instruction Operation
[Figure: same datapath diagram]

Slide 17: BR-Instruction Operation
[Figure: same datapath diagram]

Slide 18: Jump Instruction Operation
[Figure: same datapath diagram]

Slide 19: Control
For each instruction:
- Select the registers to be read (always read two)
- Select the 2nd ALU input
- Select the operation to be performed by the ALU
- Select if data memory is to be read or written
- Select what is written and where in the register file
- Select what goes in the PC
Information comes from the 32 bits of the instruction.
Example: add $8, $17, $18
Instruction format:
000000   10001   10010   01000   00000   100000
op       rs      rt      rd      shamt   funct

Slide 20: Adding Control to DataPath
Instruction  RegDst  ALUSrc  MemtoReg  RegWrite  MemRead  MemWrite  Branch  ALUOp1  ALUOp0
R-format     1       0       0         1         0        0         0       1       0
lw           0       1       1         1         1        0         0       0       0
sw           X       1       X         0         0        1         0       0       0
beq          X       0       X         0         0        0         1       0       1
[Figure: datapath with the Control unit driven by Instruction [31-26] and producing RegDst, Branch, MemRead, MemtoReg, ALUOp, MemWrite, ALUSrc, RegWrite; ALU control driven by Instruction [5-0]]

Slide 21: ALU Control
The ALU's operation is based on instruction type and function code, e.g., what should the ALU do with a given instruction?
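Both the instruction type and the function code are fixed bit fields of the 32-bit word, so the add example and the control table from "Adding Control to DataPath" can be checked with a short sketch. The names `decode`, `CONTROL`, and `control_for` are made up for this illustration. The table rows are keyed by the standard MIPS opcodes (0 for R-format, 35 for lw, 43 for sw, 4 for beq), which the table itself does not list, and `None` stands in for the X (don't care) entries.

```python
def decode(word):
    """Split a 32-bit MIPS instruction word into its bit fields."""
    return {
        "op": (word >> 26) & 0x3F,     # bits 31-26
        "rs": (word >> 21) & 0x1F,     # bits 25-21
        "rt": (word >> 16) & 0x1F,     # bits 20-16
        "rd": (word >> 11) & 0x1F,     # bits 15-11
        "shamt": (word >> 6) & 0x1F,   # bits 10-6
        "funct": word & 0x3F,          # bits 5-0
        "imm": word & 0xFFFF,          # bits 15-0 (I-format view)
    }


# One row per instruction class; ALUOp packs ALUOp1 and ALUOp0 into two bits.
CONTROL = {
    0: dict(RegDst=1, ALUSrc=0, MemtoReg=0, RegWrite=1,
            MemRead=0, MemWrite=0, Branch=0, ALUOp=0b10),        # R-format
    35: dict(RegDst=0, ALUSrc=1, MemtoReg=1, RegWrite=1,
             MemRead=1, MemWrite=0, Branch=0, ALUOp=0b00),       # lw
    43: dict(RegDst=None, ALUSrc=1, MemtoReg=None, RegWrite=0,
             MemRead=0, MemWrite=1, Branch=0, ALUOp=0b00),       # sw
    4: dict(RegDst=None, ALUSrc=0, MemtoReg=None, RegWrite=0,
            MemRead=0, MemWrite=0, Branch=1, ALUOp=0b01),        # beq
}


def control_for(word):
    """Look up the control settings for an instruction word."""
    opcode = decode(word)["op"]
    if opcode not in CONTROL:
        raise ValueError(f"opcode {opcode} is not in the control table")
    return CONTROL[opcode]
```

Decoding the bit pattern given for add $8, $17, $18 with this sketch gives op = 0, rs = 17, rt = 18, rd = 8, shamt = 0, funct = 32, and the lookup selects the R-format row.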
ALU Control example: lw $1, 100($2)
35   2    1    100
op   rs   rt   16-bit offset
ALU control input:
000  AND
001  OR
010  add
110  subtract
111  set-on-less-than
Why is the code for subtract 110 and not 011?

Slide 22: Other Control Information
Must describe hardware to compute the 3-bit ALU control input, given:
- the instruction type: 00 = lw, sw; 01 = beq; 10 = arithmetic; 11 = jump
- the function code for arithmetic
ALUOp is computed from the instruction type.
Control can be described using a truth table:
ALUOp1  ALUOp0  F5  F4  F3  F2  F1  F0  Operation
0       0       X   X   X   X   X   X   010
X       1       X   X   X   X   X   X   110
1       X       X   X   0   0   0   0   010
1       X       X   X   0   0   1   0   110
1       X       X   X   0   1   0   0   000
1       X       X   X   0   1   0   1   001
1       X       X   X   1   0   1   0   111

Slide 23: Implementation of Control
Simple combinational logic to realize the truth tables.
[Figure: ALU control block with inputs ALUOp1, ALUOp0, and F3-F0 of F(5-0), and outputs Operation2, Operation1, Operation0]
[Figure: main control logic with inputs Op5-Op0, decoded into R-format, lw, sw, beq, and outputs RegDst, ALUSrc, MemtoReg, RegWrite, MemRead, MemWrite, Branch, ALUOp1, ALUOp0]

Slide 24: A Complete Datapath with Control

Slide 25: Datapath with Control and Jump Instruction

Slide 26: Timing: Single Cycle Implementation
Calculate the cycle time, assuming that all delays are negligible except those of a few major units.
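The ALU control truth table under "Other Control Information" is small enough to express directly as a lookup. The sketch below is one possible reading of it, with a made-up function name `alu_control`. Where two rows could both match (ALUOp = 11, which the text assigns to jump), this sketch arbitrarily lets the X1 row win; the table itself does not say which should apply.

```python
# Rows with ALUOp1 = 1: only F3..F0 of the funct field are examined.
FUNCT_TO_OPERATION = {
    0b0000: 0b010,  # add
    0b0010: 0b110,  # subtract
    0b0100: 0b000,  # AND
    0b0101: 0b001,  # OR
    0b1010: 0b111,  # set-on-less-than
}


def alu_control(alu_op, funct):
    """Return the 3-bit ALU control input.

    alu_op: 2-bit value with ALUOp1 as the high bit and ALUOp0 as the low bit.
    funct:  6-bit function field F5..F0 (F5 and F4 are ignored).
    """
    alu_op1 = (alu_op >> 1) & 1
    alu_op0 = alu_op & 1
    if alu_op1 == 0 and alu_op0 == 0:
        return 0b010                      # lw, sw: add for the address
    if alu_op0 == 1:
        return 0b110                      # beq: subtract for the comparison
    low_bits = funct & 0xF                # F3..F0
    if low_bits not in FUNCT_TO_OPERATION:
        raise ValueError(f"funct {funct:06b} is not in the truth table")
    return FUNCT_TO_OPERATION[low_bits]
```

With the standard MIPS function codes (32 add, 34 sub, 36 and, 37 or, 42 slt), an ALUOp of 10 maps to 010, 110, 000, 001, and 111 respectively.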
The non-negligible delays are: memory (2 ns), ALU and adders (2 ns), register file access (1 ns).
[Figure: complete single-cycle datapath with control signals RegDst, ALUSrc, MemtoReg, RegWrite, MemRead, MemWrite, ALUOp, PCSrc]

Slide 27: Where we are headed
- Design a data path for our machine, specified in the next 3 slides.
- Single cycle problems:
  - what if we had a more complicated instruction like floating point?
  - wasteful of area
- One solution: use a smaller cycle time and use different numbers of cycles for each instruction, using a multicycle datapath:
[Figure: multicycle datapath with PC, memory (address, instruction or data), instruction register, memory data register, registers, A, B, ALU, ALUOut]

Slide 28: Machine Specification
- 16-bit data path (can be 4, 8, 12, 16, 24, 32)
- 16-bit instruction (can be any number of them)
- 16-bit PC (can be 16, 24, 32 bits)
- 16 registers (can be 1, 4, 8, 16, 32)
- With m registers, log2 m bits for each register
- Offset depends on expected offset from registers
- Branch offset depends on expected jump address
- Many compromises are made based on the number of bits in the instruction

Slide 29: Instruction
LW R2, #v(R1) ; Load memory from address (R1) + v
SW R2, #v(R1) ; Store memory to address (R1) + v
R-Type: OPER R3, R2, R1 ; Perform R3 <- R2 OP R1
  Five operations: ADD, AND, OR, SLT, SUB
I-Type: OPER R2, R1, V ; Perform R2 <- R1 OP V
  Four operations: ADDI, ANDI, ORI, SLTI
B-Type: BC R2, R1, V ; Branch if condition met to address PC+V
  Two operations: BNE, BEQ
Shift class: SHIFT TYPE R2, R1 ; Shift R1 of type and result to R2
  One operation
Jump class: JAL and JR (JAL can be used for Jump). What are the implications of J vs JAL?
  Two instructions

Slide 30: Instruction bits needed
- LW/SW/BC: requires opcode, R2, R1, and V values
- R-Type: requires opcode, R3, R2, and R1 values
- I-Type: requires opcode, R2, R1, and V values
- Shift class: requires opcode, R2, R1, and shift type value
- JAL requires opcode and jump address
- JR requires opcode and register address
- Opcode can be a fixed number or a variable number of bits
- Register address: 4 bits if 16 registers
- How many bits in V?
- How many bits in shift type? 4 for 16 types, assuming a one-bit shift at a time
- How many bits in jump address?

Slide 31: Performance
- Measure, report, and summarize
- Make intelligent choices
- See through the marketing hype
- Key to understanding underlying organizational motivation
Why is some hardware better than others for different programs?
What factors of system performance are hardware related? (e.g., do we need a new machine, or a new operating system?)
How does the machine's instruction set affect performance?

Slide 32
Which of these airplanes has the best performance?
Airplane           Passengers  Range (mi)  Speed (mph)
Boeing 737-100     101         630         598
Boeing 747         470         4150        610
BAC/Sud Concorde   132         4000        1350
Douglas DC-8-50    146         8720        544
How much faster is the Concorde compared to the 747?
How much bigger is the 747 than the Douglas DC-8?

Slide 33: Computer Performance: TIME, TIME, TIME
Response time (latency):
- How long does it take for my job to run?
- How long does it take to execute a job?
- How long must I wait for the database query?
Throughput:
- How many jobs can the machine run at once?
- What is the average execution rate?
- How much work is getting done?
If we upgrade a machine with a new processor, what do we increase?
If we add a new machine to the lab, what do we increase?

Slide 34
Elapsed time counts everything (disk and memory accesses, I/O, etc.)
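The two questions asked about the airplane table are simple ratios, and the table also gives a concrete feel for response time versus throughput. The sketch below copies the table's figures; the helper names are made up, and "passengers times speed" as a throughput measure is an assumption added here for illustration, not something the table states.

```python
# (passengers, range in miles, speed in mph), copied from the airplane table
AIRPLANES = {
    "Boeing 737-100": (101, 630, 598),
    "Boeing 747": (470, 4150, 610),
    "BAC/Sud Concorde": (132, 4000, 1350),
    "Douglas DC-8-50": (146, 8720, 544),
}


def speed_ratio(a, b):
    """How many times faster airplane a is than airplane b."""
    return AIRPLANES[a][2] / AIRPLANES[b][2]


def capacity_ratio(a, b):
    """How many times more passengers airplane a carries than airplane b."""
    return AIRPLANES[a][0] / AIRPLANES[b][0]


def passenger_throughput(name):
    """Passengers times mph: one possible throughput measure (an assumption)."""
    passengers, _range_mi, speed = AIRPLANES[name]
    return passengers * speed


# Lowest "response time" for a single trip: the fastest airplane.
fastest = max(AIRPLANES, key=lambda name: AIRPLANES[name][2])
# Most work done per hour under the assumed throughput measure.
highest_throughput = max(AIRPLANES, key=passenger_throughput)
```

With these figures the Concorde is about 2.2 times as fast as the 747, and the 747 carries about 3.2 times as many passengers as the DC-8. Which airplane "performs best" depends on whether speed for one trip or passengers moved per hour is the measure.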