
MIPS Pipeline - Cornell University




Transcription of MIPS Pipeline - Cornell University

Hakim Weatherspoon
CS 3410, Spring 2012
Computer Science, Cornell University

MIPS Pipeline
See P&H Chapter

A Processor
Review: Single cycle processor
[Figure: single-cycle datapath with PC, +4 adder, instruction memory, register file, extend, ALU, compare (=?), data memory (addr, din, dout), control, and new pc selection (target, offset)]

What determines performance of Processor?
  A) Critical Path
  B) Clock Cycle Time
  C) Cycles Per Instruction (CPI)
  D) All of the above
  E) None of the above

Review: Single Cycle Processor
Advantages
  - A single cycle per instruction makes logic and clocking simple
Disadvantages
  - Since instructions take different amounts of time to finish, memory and functional units are not efficiently utilized
  - Cycle time is the longest delay (the load instruction)
  - Best possible CPI is 1
  - However, lower MIPS (millions of instructions per second) and longer clock period (lower clock frequency); hence, lower performance

Review: Multi Cycle Processor
Advantages
  - Better MIPS and smaller clock period (higher clock frequency)
  - Hence, better performance than the single cycle processor
Disadvantages
  - Higher CPI than the single cycle processor

Pipelining: want better performance
  - Want small CPI (close to 1) with high MIPS and a short clock period (high clock frequency)
  - CPU time = instruction count x CPI x clock cycle time

Single Cycle vs Pipelined Processor
See P&H Chapter
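As a worked example of the relation CPU time = instruction count x CPI x clock cycle time, the sketch below compares three designs. The instruction count, CPI values, and cycle times are hypothetical numbers chosen only for illustration; they are not taken from the lecture.

```python
def cpu_time_ns(instruction_count, cpi, clock_cycle_ns):
    """CPU time (ns) = instruction count x CPI x clock cycle time (ns)."""
    return instruction_count * cpi * clock_cycle_ns


# Hypothetical design points (assumptions, not measurements):
# single cycle: CPI of 1 but a long clock period set by the slowest instruction
# multi cycle:  short clock period but several cycles per instruction
# pipelined:    short clock period and CPI close to 1
DESIGNS = {
    "single cycle": {"cpi": 1.0, "cycle_ns": 10.0},
    "multi cycle": {"cpi": 4.0, "cycle_ns": 2.0},
    "pipelined": {"cpi": 1.2, "cycle_ns": 2.0},
}


def compare(designs, instruction_count=1_000_000):
    """Return {design name: CPU time in ns} for the same program size."""
    return {
        name: cpu_time_ns(instruction_count, d["cpi"], d["cycle_ns"])
        for name, d in designs.items()
    }


for name, ns in compare(DESIGNS).items():
    print(f"{name:>12}: {ns / 1e6:.1f} ms")
```

With these assumed numbers the pipelined design wins because it combines the short clock period of the multi-cycle design with a CPI close to that of the single-cycle design.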

The Kids
  - Alice
  - Bob
  - They don't always get along

The Bicycle

The Materials
  - Saw
  - Drill
  - Glue
  - Paint

The Instructions
  N pieces, each built following the same sequence: Saw, Drill, Glue, Paint

Design 1: Sequential Schedule
  - Alice owns the room
  - Bob can enter when Alice is finished
  - Repeat for remaining tasks
  - No possibility for conflicts

Sequential Performance
  [Figure: timeline over time steps 1 to 8]
  - Elapsed time for Alice: 4
  - Elapsed time for Bob: 4
  - Total elapsed time: 4*N
  - Can we do better?
  Latency:
  Throughput:
  Concurrency:
  CPI =

Design 2: Pipelined Design
  - Partition the room into stages of a pipeline
  - One person owns a stage at a time
  - 4 stages
  - 4 people working simultaneously
  - Everyone moves right in lockstep
  [Figure: Alice, Bob, Carol, and Dave, each occupying one stage]

Pipelined Performance
  Latency:
  Throughput:
  Concurrency:

Lessons
  Principle: throughput is increased by parallel execution
  Pipelining:
  - Identify pipeline stages
  - Isolate stages from each other
  - Resolve pipeline hazards (Thursday)
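The difference between the two designs can be sketched numerically. The functions below are a minimal model that assumes every stage takes exactly one time unit and that no conflicts (hazards) ever stall the pipeline; the function names are illustrative, not from the lecture.

```python
def sequential_time(n_tasks, n_stages=4, stage_time=1):
    """Design 1: each task finishes all stages before the next one starts."""
    return n_tasks * n_stages * stage_time


def pipelined_time(n_tasks, n_stages=4, stage_time=1):
    """Design 2: the first task takes n_stages steps; every later task
    finishes one step after the task ahead of it."""
    if n_tasks == 0:
        return 0
    return (n_stages + n_tasks - 1) * stage_time


def throughput(n_tasks, elapsed):
    """Completed tasks per time unit."""
    return n_tasks / elapsed


for n in (2, 4, 100):
    seq, pipe = sequential_time(n), pipelined_time(n)
    print(f"N={n:>3}: sequential={seq:>3} ({throughput(n, seq):.2f}/unit)  "
          f"pipelined={pipe:>3} ({throughput(n, pipe):.2f}/unit)")
```

In this model the latency of a single task is still 4 time units in both designs; only the throughput changes, approaching one task per time unit as N grows.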

A Processor
Review: Single cycle processor
[Figure: single-cycle datapath with PC, +4 adder, instruction memory, register file, extend, ALU, compare, data memory, control, and new pc selection]

A Processor
[Figure: the same datapath divided into five stages: Instruction Fetch, Instruction Decode, Execute, Memory, Write Back; the Execute stage computes jump/branch targets]

Basic Pipeline
Five stage "RISC" load-store architecture
  1. Instruction fetch (IF): get instruction from memory, increment PC
  2. Instruction Decode (ID): translate opcode into control signals and read registers
  3. Execute (EX): perform ALU operation, compute jump/branch targets
  4. Memory (MEM): access memory if needed
  5. Writeback (WB): update register file

Time Graphs
[Figure: five instructions, each passing through IF, ID, EX, MEM, WB, plotted against clock cycles 1 to 9]
  Latency:
  Throughput:
  Concurrency:

Principles of Pipelined Implementation
  - Break instructions across multiple clock cycles (five, in this case)
  - Design a separate stage for the execution performed during each clock cycle
  - Add pipeline registers (flip-flops) to isolate signals between different stages

Pipelined Processor
See: P&H Chapter

Pipelined Processor
[Figure: five-stage datapath (Instruction Fetch, Instruction Decode, Execute, Memory, Write Back) with pipeline registers IF/ID, ID/EX, EX/MEM, and MEM/WB carrying ctrl, imm, A, B, D, M, and PC+4 between stages]
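The time graph for the five-stage pipeline can be regenerated with a short script. This is a sketch that assumes an ideal pipeline: one new instruction enters IF every clock cycle and nothing ever stalls. The helper names `pipeline_chart` and `render` are made up for this example.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]


def pipeline_chart(n_instructions, stages=STAGES):
    """Row i lists, cycle by cycle, the stage instruction i occupies
    ('' when the instruction is not in the pipeline)."""
    n_cycles = n_instructions + len(stages) - 1
    rows = []
    for i in range(n_instructions):
        row = [""] * n_cycles
        for s, name in enumerate(stages):
            row[i + s] = name  # instruction i enters stage s in cycle i + s
        rows.append(row)
    return rows


def render(rows):
    """Format the chart as text, one instruction per line."""
    header = "cycle " + "".join(f"{c:>4}" for c in range(1, len(rows[0]) + 1))
    lines = [header]
    for i, row in enumerate(rows, start=1):
        lines.append(f"inst{i:<2}" + "".join(f"{cell:>4}" for cell in row))
    return "\n".join(lines)


chart = pipeline_chart(5)
print(render(chart))
```

For five instructions the chart spans 5 + 5 - 1 = 9 clock cycles, and in cycle 5 all five stages are busy at once.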

IF
Stage 1: Instruction Fetch
Fetch a new instruction every cycle
  - Current PC is the index into instruction memory
  - Increment the PC at end of cycle (assume no branches for now)
Write values of interest to pipeline register (IF/ID)
  - Instruction bits (for later decoding)
  - PC+4 (for later computing branch targets)
[Figure: IF stage with PC, instruction memory (addr, mc: 00 = read word), +4 adder, new pc selection (pcsel: pcreg, pcrel, pcabs), and the IF/ID register holding inst and PC+4]

ID
Stage 2: Instruction Decode
On every cycle:
  - Read IF/ID pipeline register to get instruction bits
  - Decode instruction, generate control signals
  - Read from register file
Write values of interest to pipeline register (ID/EX)
  - Control information, Rd index, immediates, offsets
  - Contents of Ra, Rb
  - PC+4 (for computing branch targets later)
[Figure: ID stage with decode, extend, register file (Ra, Rb, Rd, WE; result and dest written back), and the ID/EX register holding ctrl, imm, A, B, and PC+4]

EX
Stage 3: Execute
On every cycle:
  - Read ID/EX pipeline register to get values and control bits
  - Perform ALU operation
  - Compute targets (PC+4+offset, etc.) in case this is a branch
  - Decide if jump/branch should be taken
Write values of interest to pipeline register (EX/MEM)
  - Control information, Rd index
  - Result of ALU operation
  - Value in case this is a memory store instruction
[Figure: EX stage with ALU, target adder, branch? decision, pcsel/pcreg/pcrel/pcabs signals, and the EX/MEM register holding ctrl, B, D, and target]

MEM
Stage 4: Memory
On every cycle:
  - Read EX/MEM pipeline register to get values and control bits
  - Perform memory load/store if needed (address is ALU result)
Write values of interest to pipeline register (MEM/WB)
  - Control information, Rd index, ...
  - Result of memory operation
  - Pass result of ALU operation
[Figure: MEM stage with data memory (addr, din, dout, mc), branch? and target fed back as pcsel/pcrel/pcabs/pcreg, and the MEM/WB register holding ctrl, M, and D]
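The hand-off between stages through pipeline registers can be sketched as plain functions, each reading one register and producing the next. This is an illustrative model only: instructions are Python dicts rather than encoded MIPS words, branch offsets are assumed to be in bytes, and a single instruction is walked through the stages without modelling overlap or hazards. The field names a, b, d, and m loosely follow the A, B, D, and M labels in the figures.

```python
def fetch(pc, imem):
    """IF: index instruction memory with the PC; pass inst and PC+4 on."""
    return {"inst": imem[pc // 4], "pc4": pc + 4}  # IF/ID contents


def decode(if_id, regs):
    """ID: pull out the fields and read the register file."""
    inst = if_id["inst"]
    return {  # ID/EX contents
        "op": inst["op"],
        "rd": inst.get("rd"),
        "imm": inst.get("imm", 0),
        "a": regs[inst["ra"]],
        "b": regs[inst["rb"]],
        "pc4": if_id["pc4"],
    }


def execute(id_ex):
    """EX: ALU operation, branch target, and branch decision."""
    op, a, b, imm = id_ex["op"], id_ex["a"], id_ex["b"], id_ex["imm"]
    target, taken = None, False
    if op == "add":
        d = a + b
    elif op in ("lw", "sw"):
        d = a + imm  # memory address
    elif op == "beq":
        d = a - b
        target = id_ex["pc4"] + imm  # PC+4+offset (offset in bytes: assumption)
        taken = a == b
    else:
        raise ValueError(f"unsupported op: {op}")
    return {"op": op, "rd": id_ex["rd"], "d": d, "b": b,
            "target": target, "taken": taken}  # EX/MEM contents


def memory(ex_mem, dmem):
    """MEM: load or store if needed; the address is the ALU result."""
    op, m = ex_mem["op"], None
    if op == "lw":
        m = dmem[ex_mem["d"]]
    elif op == "sw":
        dmem[ex_mem["d"]] = ex_mem["b"]
    return {"op": op, "rd": ex_mem["rd"], "d": ex_mem["d"], "m": m}  # MEM/WB


# Walk one hypothetical load through the four stages.
imem = [{"op": "lw", "rd": 2, "ra": 1, "rb": 0, "imm": 4}]
regs = [0, 8, 0, 0]
dmem = {12: 99}
mem_wb = memory(execute(decode(fetch(0, imem), regs)), dmem)
print(mem_wb)
```

For the load above, the ALU produces address 8 + 4 = 12, and the MEM/WB contents carry the loaded value 99 together with destination index 2, ready to be written to the register file.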

WB
Stage 5: Write back
On every cycle:
  - Read MEM/WB pipeline register to get values and control bits
  - Select value and write to register file
[Figure: WB stage with the MEM/WB register (ctrl, M, D) feeding result and dest back to the register file]

[Figure: complete five-stage pipeline showing the fields held in IF/ID (inst, PC+4), ID/EX (OP, A, B, imm, Rd, Rt, PC+4), EX/MEM (OP, Rd, B, D), and MEM/WB (OP, Rd, M, D)]

Administrivia
HW2 due today
  - Fill out the survey online. Receive credit/points on homework for the survey
  - Survey is anonymous
Project1 (PA1) due week after prelim
  - Continue working diligently. Use design doc momentum
Save your work!
  - Save often. Verify file is non-zero. Periodically save to Dropbox, email.
  - Beware of MacOSX (leopard) and (snow leopard)
Use your resources
  - Lab Section, Office Hours, Homework Help Session, class notes, book, Sections, CSUGLab

Administrivia
Prelim1: next Tuesday, February 28th, in the evening
  - We will start at 7:30pm sharp, so come early
  - Prelim Review: this Wed / Fri, 3:30-5:30pm, in 155 Olin
  - Closed book: cannot use electronic devices or outside material
  - Practice prelims are online in CMS
Material covered: everything up to the end of this week
  - Appendix C (logic, gates, FSMs, memory, ALUs)
  - Chapter 4 (pipelined [and non-pipelined] MIPS processor with hazards)
  - Chapters 2 (Numbers / Arithmetic, simple MIPS instructions)
  - Chapter 1 (Performance)

