
04 ARM Architecture Overview - Electrical Engineering and ...



ARM Architecture Overview

Development of the ARM Architecture

v4T: ARM7TDMI, ARM922T
  - Thumb instruction set
v5TE: ARM926EJ-S, ARM946E-S, ARM966E-S
  - Improved ARM/Thumb interworking
  - DSP instructions
  - Extensions: Jazelle (5TEJ)
v6: ARM1136JF-S, ARM1176JZF-S, ARM11 MPCore
  - SIMD instructions
  - Unaligned data support
  - Extensions: Thumb-2 (6T2), TrustZone (6Z), Multicore (6K)
v7: Cortex-A8/R4/M3/M1
  - Thumb-2
  - Extensions:
    - v7-A (applications): NEON
    - v7-R (real time): HW divide
    - v7-M (microcontroller): HW divide and Thumb-2 only

Note: Implementations of the same architecture can be very different
  - ARM7TDMI: architecture v4T. Von Neumann core with 3-stage pipeline
  - ARM920T: architecture v4T. Harvard core with 5-stage pipeline and MMU

Processor architecture = instruction set + programmer's model

ARM Architecture profiles

Application profile (ARMv7-A, e.g. Cortex-A8)
  - Memory management support (MMU)
  - Highest performance at low power
  - Influenced by multi-tasking OS system requirements
  - TrustZone and Jazelle-RCT for a safe, extensible system
Real-time profile (ARMv7-R, e.g. Cortex-R4)
  - Protected memory (MPU)
  - Low latency and predictability for real-time needs
  - Evolutionary path for traditional embedded business
Microcontroller profile (ARMv7-M, e.g. Cortex-M3)

  - Lowest gate count entry point
  - Deterministic and predictable behavior a key priority
  - Deeply embedded use

Programmer's Model

Data Sizes and Instruction Sets

When used in relation to the ARM:
  - Halfword means 16 bits (two bytes)
  - Word means 32 bits (four bytes)
  - Doubleword means 64 bits (eight bytes)
Most ARMs implement two instruction sets:
  - 32-bit ARM instruction set
  - 16-bit Thumb instruction set
Latest ARM cores introduce a new instruction set, Thumb-2:
  - Provides a mixture of 32-bit and 16-bit instructions
  - Maintains code density with increased flexibility
Jazelle-DBX cores can also execute Java bytecode

Processor Modes

The ARM has seven basic operating modes:
  - Each mode has access to its own stack and a different subset of registers
  - Some operations can only be carried out in a privileged mode

Mode              Description
Supervisor (SVC)  Entered on reset and when a Software Interrupt instruction (SWI) is executed
FIQ               Entered when a high priority (fast) interrupt is raised
IRQ               Entered when a low priority (normal) interrupt is raised
Abort             Used to handle memory access violations
Undef             Used to handle undefined instructions
System            Privileged mode using the same registers as User mode
User              Unprivileged mode under which most applications / OS tasks run

Privileged modes: all modes except User.
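The mode list above can be captured as a small lookup. This is only an illustrative Python sketch, not part of the original slides: the 5-bit mode encodings are taken from the ARM Architecture Reference Manual rather than from this deck, and the names `MODES`, `mode_name`, `is_privileged` and `is_exception_mode` are hypothetical.

```python
# Illustrative sketch of the seven basic ARM processor modes.
# Assumption: the 5-bit encodings (the mode field, bits [4:0] of the status
# register) come from the ARM Architecture Reference Manual, not the slides.
MODES = {
    0b10000: "User",
    0b10001: "FIQ",
    0b10010: "IRQ",
    0b10011: "Supervisor",
    0b10111: "Abort",
    0b11011: "Undef",
    0b11111: "System",
}


def mode_name(mode_bits):
    """Return the mode name for a 5-bit mode field; reject unknown encodings."""
    field = mode_bits & 0x1F  # keep only the five mode bits
    if field not in MODES:
        raise ValueError(f"not one of the seven basic modes: {field:#07b}")
    return MODES[field]


def is_privileged(mode_bits):
    """Every mode except User is privileged."""
    return mode_name(mode_bits) != "User"


def is_exception_mode(mode_bits):
    """Exception modes are entered by an exception; User and System are not."""
    return mode_name(mode_bits) not in ("User", "System")
```

For example, `is_privileged(0b11111)` is true for System mode even though System shares its registers with User mode, while `is_exception_mode(0b11111)` is false.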

Exception modes: Supervisor (SVC), FIQ, IRQ, Abort and Undef.

The ARM Register Set

  - ARM has 37 registers, all 32 bits long
  - A subset of these registers is accessible in each mode

[Figure: register banks. Current mode (User): r0-r12, r13 (sp), r14 (lr), r15 (pc), cpsr. Banked out registers: FIQ has r8-r12, r13 (sp), r14 (lr), spsr; IRQ, SVC, Undef and Abort each have r13 (sp), r14 (lr), spsr.]

Program Status Registers

  - Condition code flags
    - N = Negative result from ALU
    - Z = Zero result from ALU
    - C = ALU operation Carried out
    - V = ALU operation oVerflowed
  - Sticky overflow flag (Q flag)
    - Architecture 5TE and later only
    - Indicates if saturation has occurred
  - J bit
    - Architecture 5TEJ and later only
    - J = 1: Processor in Jazelle state
  - Interrupt disable bits
    - I = 1: Disables IRQ
    - F = 1: Disables FIQ
  - T bit
    - T = 0: Processor in ARM state
    - T = 1: Processor in Thumb state
    - Introduced in Architecture 4T
  - Mode bits
    - Specify the processor mode
  - New bits in v6
    - GE[3:0] used by some SIMD instructions
    - E bit controls load/store endianness
    - A bit disables imprecise data aborts
    - IT[abcde]: IF THEN conditional execution of Thumb-2 instruction groups

[Figure: status register bit layout. Bit 31 N, 30 Z, 29 C, 28 V, 27 Q, 24 J, 19:16 GE[3:0], 15:10 IT cond_abcde, 9 E, 8 A, 7 I, 6 F, 5 T, 4:0 mode; remaining bits undefined. Fields: f = bits 31:24, s = 23:16, x = 15:8, c = 7:0.]

Data alignment

  - Prior to Architecture v6, data accesses must be appropriately aligned for the access size
  - Unaligned addresses will produce unexpected/undefined results
  - Unaligned data can be accessed using multiple aligned accesses combined with shift/mask operations

[Figure: byte access (byte aligned), halfword access (halfword aligned), word access (word aligned).]

Exception Handling

When an exception occurs, the core:

  - Copies CPSR into SPSR_<mode>
  - Sets appropriate CPSR bits
    - Change to ARM state
    - Change to exception mode
    - Disable interrupts (if appropriate)
  - Stores the return address in LR_<mode>
  - Sets PC to vector address
To return, the exception handler needs to:
  - Restore CPSR from SPSR_<mode>
  - Restore PC from LR_<mode>
Must be done in ARM state on most cores; capable cores can do this in Thumb state.

Vector table (can also be at 0xFFFF0000 on most cores):

Address  Exception
0x1C     FIQ
0x18     IRQ
0x14     (Reserved)
0x10     Data Abort
0x0C     Prefetch Abort
0x08     Software Interrupt
0x04     Undefined Instruction
0x00     Reset

Introduction to Instruction Sets

ARM Instruction Set

  - All instructions are 32 bits long / many execute in a single cycle
  - Instructions are conditionally executed
  - A load / store architecture

Example data processing instructions:
  SUB r0,r1,#5            r0 = r1 - 5
  ADD r2,r3,r3,LSL #2     r2 = r3 + (r3 * 4)
  ADDEQ r5,r5,r6          IF EQ condition true, r5 = r5 + r6
Example branching instruction:
  B <Label>               Branch forwards or backwards relative to current PC (+/-32MB range)
Example memory access instructions:
  LDR r0,[r1]             Load word at address r1 into r0
  STRNEB r2,[r3,r4]       IF NE condition true, store bottom byte of r2 to address r3+r4
  STMFD sp!,{r4-r8,lr}    Store registers r4 to r8 and lr on stack, then update stack pointer

Thumb Instruction Set

  - Thumb is a 16-bit instruction set
    - Optimized for code density from C code (~65% of ARM code size)
    - Improved performance from narrow memory
    - Subset of the functionality of the ARM instruction set
  - Thumb is not a "regular" instruction set!
    - Constraints are not generally consistent
    - Targeted at compiler generation, not hand coding

Thumb-2 Instruction Set

  - Thumb-2 is a major extension to the Thumb ISA
    - Adds 32-bit instructions to implement almost all of the ARM ISA functionality
    - Retains the complete 16-bit Thumb instruction set
  - Design objective:

    - ARM performance with Thumb code density
  - No switching between ARM-Thumb states
  - Compiler automatically selects mix of 16 and 32 bit instructions

Thumb-2 Performance / Density

[Figure: performance versus code density for 100% ARM code, 100% Thumb code, a random mix, a profiled mix, and Thumb-2.]

Processor Cores

ARM7TDMI Processor
  - Architecture v4T
  - 3-stage pipeline
  - Single interface to memory

ARM926EJ-S Processor
  - Architecture v5TE
  - 5-stage pipeline
  - Single-cycle 32x16 multiplier
  - Caches and TCMs
  - Memory management unit (MMU)
  - 2 AHB memory interfaces
  - Jazelle technology

ARM1176JZ(F)-S Processor Core
  - TrustZone
  - 8-stage pipeline
  - Branch prediction
  - Four AXI memory ports
  - IEM (Intelligent Energy Management)
  - Integrated VFP coprocessor

ARM11 MPCore Processor
  - 1-4 MP11 processors
  - Cache coherency
  - Distributed interrupt controller

ARM Cortex-M3 Processor
  - Architecture v7-M (Thumb-2 only)
  - Very different from previous ARM processors
    - No CPSR register
    - Vector table contains addresses, not instructions
    - Processor automatically saves/restores state in exceptions
    - Only 2 processor modes (Thread/Handler)

    - No Coprocessor 15
  - 3-stage pipeline with static branch prediction
  - Atypical implementation
    - Fixed memory map
    - Integrated interrupt controller
    - Serial-Wire Debug

ARM Cortex-A8 Processor
  - Architecture v7-A
  - 14 stage pipeline
  - NEON media processor

The Instruction Pipeline

  - The ARM7TDMI uses a 3-stage pipeline in order to increase the speed of the flow of instructions to the processor
    - Allows several operations to be performed simultaneously, rather than serially
  - The PC points to the instruction being fetched, not executed
    - Debug tools will hide this from you
    - This is now part of the ARM Architecture and applies to all processors

Stage    ARM     Thumb   Operation
FETCH    PC      PC      Instruction fetched from memory
DECODE   PC - 4  PC - 2  Decoding of registers used in instruction
EXECUTE  PC - 8  PC - 4  Register(s) read from register bank; shift and ALU operation; write register(s) back to register bank

Optimal Pipelining

[Figure: cycles 1 to 6 against the operations ADD, SUB, ORR, AND, EOR, ORR, each passing through F (Fetch), D (Decode) and E (Execute).]

  - All operations here are on registers (single cycle execution)
  - In this example it takes 6 clock cycles to execute 6 instructions
  - Clock cycles per Instruction (CPI) = 1

Breaking the pipeline: Branch Pipeline Example

  - Note that the core is executing in ARM state

Address  Operation
0x8000   BL 0x8FEC
0x8004   SUB
0x8008   ORR
0x8FEC   AND
0x8FF0   ORR
0x8FF4   EOR

[Figure: cycle-by-cycle diagram of the sequence above. Legend: F - Fetch, D - Decode, E - Execute, L - Linkret, A - Adjust.]

Cortex-A8 Integer Pipeline

[Figure: Cortex-A8 integer pipeline. Instruction fetch stages F0-F2, instruction decode stages D0-D4, instruction execute / load store stages E0-E5, with ALU/MUL pipe 0, ALU pipe 1 and a load/store pipe. The branch mispredict penalty and replay penalty are marked.]

  - Optimising code to make use of the processor pipeline is very difficult
  - Leave it to the compiler!
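The optimal-pipelining case and the PC offsets above can be sketched as a tiny model. This is a minimal Python sketch, assuming an ideal 3-stage pipeline in which every instruction spends exactly one cycle in each stage and nothing stalls; `schedule`, `total_cycles` and `pc_read_value` are hypothetical helper names, not anything from the slides.

```python
STAGES = ("F", "D", "E")  # Fetch, Decode, Execute


def schedule(instructions):
    """Map each instruction to the 1-based cycle in which it occupies each stage.

    Assumes an ideal pipeline: one new instruction is fetched per cycle and
    every stage takes exactly one cycle.
    """
    return [
        (name, {stage: start + offset for offset, stage in enumerate(STAGES)})
        for start, name in enumerate(instructions, start=1)
    ]


def total_cycles(n):
    """Cycles until the last of n instructions leaves Execute, fill time included."""
    return n + len(STAGES) - 1 if n else 0


def pc_read_value(executing_address, thumb=False):
    """Value the executing instruction sees when it reads the PC.

    The PC points at the instruction being fetched, two instructions ahead:
    +8 bytes in ARM state (4-byte instructions), +4 in Thumb state (2-byte).
    """
    return executing_address + (4 if thumb else 8)
```

With the pipeline already full, the model completes one instruction per cycle, which is the CPI of 1 quoted above; starting from an empty pipeline it counts two extra fill cycles, so `total_cycles(6)` is 8. The model deliberately ignores branches, which flush the fetched and decoded instructions as in the branch example.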

Reference Slides

Reference Material

  - ARM ARM ("Architecture Reference Manual")
    - ARM DDI 0100E covers v5TE DSP extensions
    - Can be purchased from booksellers: ISBN 0-201-737191 (Addison-Wesley)
    - Available for download from ARM's website
  - ARM v7-M ARM available for download from ARM's website
  - Contact ARM if you need a different version (v6, v7-AR, etc.)
  - Steve Furber, "ARM system-on-chip architecture", 2nd edition
    - ISBN 0-201-67519-6 (Addison-Wesley)
  - Sloss, Symes & Wright, "ARM System Developer's Guide"
    - ISBN 1-55860-874-5 (Morgan Kaufmann)
  - RVCT Assembler Guide
    - Available for download from ARM's website
  - Technical Reference Manuals for processor core being used
    - Available for download from ARM's website

Naming Conventions

  - ARMx1z (e.g. ARM710T) indicates cache & full MMU
  - ARMx2z (e.g. ARM720T) indicates cache, MMU & Process ID support
  - ARMx3z()

