
LPDDR3/4-ECC DRAM for High-reliability - IoT, Automotive ...


Title: LPDDR3/4-ECC DRAM for High-reliability - IoT, Automotive and Control System Applications
Author: Wolfgang Hokenmaier
Created Date: 10/1/2015 10:35:38 PM


Transcription of LPDDR3/4-ECC DRAM for High-reliability - IoT, Automotive ...

LPDDR3/4-ECC DRAM for High-reliability IoT, Automotive and Control System Applications
Wolfgang Hokenmaier
13th, 2015
Copyright Green Mountain Semiconductor

1 Memory Errors and Technology Scaling
2 Growth Markets for Memory
3 ECC for Safety Critical IoT, Automotive and Industrial Applications
4 A flexible ECC solution for LPDDR3/4 DRAM up to 2133 Mbps
5 Conclusion

Memory Errors and Technology Scaling

Faults
Failures in a DRAM can be classified as follows [Constantinescu, 2002].
- Hard faults: permanent, recurring faults. These faults cause the memory location to persistently return incorrect data.
- Intermittent faults: these faults cause a memory location to occasionally return incorrect data. This may be due to a weak cell and/or more extreme operating conditions (temperature).
- Transient faults: also known as soft errors, these faults are unpredictable and are not related to device damage. The memory location can be fixed by rewriting the correct data.

Fault Rates
Some study points on DRAM in a server farm [Sridharan, 2012].
- There is a chance a 2GB DDR2 DIMM will have a hard fault over an 11-month period
- Hard faults dominate, accounting for 70% of faults
- 47% of faults are single-bit
- ... of faults are from a single column or row
- These results are from a server farm, which is a controlled environment

Hard and Intermittent Faults
VRT errors increase with scaling [Kang, 2014]
- Cell capacitance decreasing
- Transistor leakage increasing (GIDL, charge trapping)
Circuit degradation increases with scaling [Constantinescu, 2007]
- Increased resistance, permanent or intermittent
- Crosstalk delays
- Ultra-thin oxide breakdown
Apparent hard and intermittent errors may result from system design errors, namely specification violations [Aichinger, 2012]
- Timing specification violations
- Refresh violations

Soft Error Rate Scaling: DRAM
[Figure: regions labeled Effective Scaling and Vertical Integration etc.]

[Figure annotation: 2015, 12Gb]
System-level soft error rate remained flat with technology scaling for DRAM, due to reduction in cell area, and voltage and capacitance not scaling proportionally. Vertical or ... integration will see a linear increase of system-level SER. [Baumann, 2005]

Soft Error Rate Scaling: SRAM
[Figure: regions labeled Effective Scaling and Vertical Integration?; annotations: with BPSG; 2015, 10nm]
SRAM soft error rate per bit remains flat with technology scaling due to reduction in node capacitance and aggressive voltage scaling. System-level SER continues to increase with memory usage. [Baumann, 2005]

Error Rate Scaling: DRAM
[Figure: regions labeled Effective Scaling and Vertical Integration; annotation: Vertical disturb negligible?]

Possibility to address systematic weakness through system/specification
Hypothesis, insufficient published data
- Errors demonstrated on 2GB DDR2 DRAM modules [Kim, 2014]
- Scaling risk: cells and word/bit lines move closer together
- 3D integration effect on disturb needs study
- Sensitive outlier cells may be screened (at high test cost)
- Targeted refresh (PARA) difficult to employ in practice

Standard Densities
[Figure]

Stacking
DRAM is growing vertically
- Vertical stacks mean larger amounts of memory per device
- Repair is typically done after assembly of all layers
- More layers increase redundancy flexibility
- Effective scaling does not reduce the DRAM cell size, and therefore does not reduce the soft error rate per cell.
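The stacking argument can be made concrete with a small back-of-the-envelope sketch: if the soft error rate per cell stays flat, the device-level rate grows in proportion to the number of cells, and so linearly with the number of stacked layers. The function name, the independence assumption and the per-Mbit rate below are illustrative placeholders, not values from the slides; the 12 Gbit density is borrowed from the "2015, 12Gb" figure annotation purely as an example.

```python
def device_ser_fit(fit_per_mbit: float, gbit_per_layer: float, layers: int = 1) -> float:
    """Device-level soft error rate in FIT (failures per 1e9 device-hours).

    Assumption: every bit contributes the same, independent per-bit rate, so
    the device rate is the rate per Mbit times the number of Mbit.
    """
    mbit = gbit_per_layer * 1024 * layers
    return fit_per_mbit * mbit


FIT_PER_MBIT = 1.0  # placeholder value, NOT a measured rate

for layers in (1, 2, 4, 8):
    rate = device_ser_fit(FIT_PER_MBIT, 12, layers)
    print(f"{layers} layer(s): {rate:.0f} FIT")
```

Doubling the layer count doubles the result, which is the linear increase the slides attribute to vertical integration.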

With more cells per device and an unchanged soft error rate per cell, this results in a potential increase in the soft error rate.

Growth Markets for Memory

- Market is maturing
- Growth is in other sectors

Environments
Server environment
- Temperature control
- Replaceable
- Device orientation control
Mobile environment
- No thermal control
- Uncontrolled radiation
- High physical stress
- Electromagnetic noise

System Requirements
Columns: IoT, Computer Market, Medical, Aviation, Manufacturing, Automotive, Energy, Consumer, Office, Server
Rows (X marks as transcribed; their column positions are not recoverable):
- Data Security/Privacy: XXXXXXX
- Reliability: XXXXXX
- Hack/Sabotage Resistance: XXXXXXX
- Low Maintenance: XXXX
- Fail-Safe: XXXXXX
- Early Warning: XXXXXX
- Environmental (temp, humidity): XXXXX
- Environmental (radiation): XX
- Documentation: XXXXXXX
- Cost: XXXX
Table: IoT system requirements, addressable by on-die ECC (yellow)

ECC for Safety Critical IoT, Automotive and Industrial Applications

ECC
[Figure: DRAM without ECC; DRAM with ECC (ninth module contains parity bits)]
- Traditional DRAM ECC is done external to the module
- Typically an extra device per 8 devices is used to include parity for ECC
- ECC drives more data lines for the parity bits, increasing power consumption

System ECC Implementation
In-package ECC solution more suitable for compact systems
- Embedded, industrial and mobile computing devices use one or a few multi-chip memory packages, soldered directly to the board or even onto ...
- ... memory configuration, single board
- PoP and SiP package solutions
- On-die ECC for lower power
- On-die ECC allows retrofit
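To put a number on the "extra device per 8 devices" rule, the sketch below counts the check bits a Hamming-based SEC-DED code needs for a given data word. The slides do not say which code the external ECC uses, so Hamming SEC-DED is an assumption here, and `secded_check_bits` is an illustrative name.

```python
def secded_check_bits(data_bits: int) -> int:
    """Check bits for a Hamming SEC-DED code over `data_bits` data bits.

    Single error correction needs the smallest r with 2**r >= data_bits + r + 1;
    one extra overall parity bit adds double error detection.
    """
    r = 1
    while (1 << r) < data_bits + r + 1:
        r += 1
    return r + 1


for k in (8, 16, 32, 64, 128, 256):
    c = secded_check_bits(k)
    print(f"{k:4d} data bits -> {c:2d} check bits ({100 * c / k:.1f}% overhead)")
```

Under this assumption a 64-bit data word needs 8 check bits, the same 1-in-8 ratio as one extra device per 8 devices, and the relative overhead shrinks as the protected word gets wider.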

ECC to Lower Power
Some developers have used ECC to clean up weak cells. This is under consideration for the LPDDR4 spec.
[Figure: cell charge over tREFW for a low-leak cell and a high-leak cell, showing cell failure due to leakage]
ECC can be used to push retention times even longer. ECC can correct the tail end of worst-case cells that contribute to retention fails. By letting ECC correct these fails, retention can be pushed out and total power consumption can be reduced.

- LPDDR4 has a large data prefetch, which makes an ECC design more appealing, since a large data word is more efficient for ECC
- Data masking poses a problem for ECC by invalidating the ECC solution, so LPDDR4 has introduced a dedicated Masked Write command
- This allows for a read-modify-write operation to recalculate a new syndrome
- However, correction on a large data word affects performance, and the ECC data still forces a fixed size
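The read-modify-write behind a masked write can be sketched as follows: decode the stored codeword (correcting a single-bit error if present), merge in only the unmasked bytes, then re-encode so the check bits match the merged word. This is a minimal illustration, not the on-die implementation: the Hamming SEC-DED code, the 32-bit word size, the byte-mask convention and all function names are assumptions made for the example.

```python
def _data_positions(k):
    """1-based codeword positions that carry data (every non-power-of-two index)."""
    out, pos = [], 1
    while len(out) < k:
        if pos & (pos - 1):          # non-zero means pos is not a power of two
            out.append(pos)
        pos += 1
    return out


def _syndrome(code):
    """XOR of the positions (>= 1) of all set bits; zero for a valid codeword."""
    s, pos = 0, 1
    code >>= 1                       # bit 0 holds the overall parity, skip it
    while code:
        if code & 1:
            s ^= pos
        code >>= 1
        pos += 1
    return s


def encode(data, k):
    """Hamming SEC-DED encode of a k-bit word; bit 0 is the overall parity."""
    code = 0
    for i, pos in enumerate(_data_positions(k)):
        if (data >> i) & 1:
            code |= 1 << pos
    s, b = _syndrome(code), 1
    while b <= s:                    # set parity bits so the syndrome becomes zero
        if s & b:
            code |= 1 << b
        b <<= 1
    return code | (bin(code).count("1") & 1)


def decode(code, k):
    """Return (data, status) with status 'ok', 'corrected' or 'uncorrectable'."""
    s, odd, status = _syndrome(code), bin(code).count("1") & 1, "ok"
    if odd:                          # single-bit error; s == 0 means the parity bit itself
        code ^= 1 << s
        status = "corrected"
    elif s:
        status = "uncorrectable"     # double error detected
    data = 0
    for i, pos in enumerate(_data_positions(k)):
        data |= ((code >> pos) & 1) << i
    return data, status


def masked_write(stored, new_data, byte_mask, k=32):
    """Read-modify-write: bit j of byte_mask set means byte j keeps its old value."""
    old, status = decode(stored, k)
    if status == "uncorrectable":
        raise ValueError("stored word has an uncorrectable error")
    merged = 0
    for j in range(k // 8):
        src = old if (byte_mask >> j) & 1 else new_data
        merged |= src & (0xFF << (8 * j))
    return encode(merged, k)         # fresh check bits for the merged word


word = encode(0x11223344, 32)
word ^= 1 << 5                       # inject a single-bit error before the write
updated = masked_write(word, 0xAABBCCDD, byte_mask=0b0110)
print(hex(decode(updated, 32)[0]))
```

Writing the unmasked bytes without the read step would leave check bits that no longer match the stored word, which is the problem the Masked Write command addresses.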

A flexible ECC solution for LPDDR3/4 DRAM up to 2133 Mbps

Flexible ECC solution
[Figure: Wide Bus ECC with a memory array (multiples of 16 bits), ECC blocks and data IO, next to a memory array storing data + parity with data IO]
Dedicated ECC architecture versus fully configurable

Flexible ECC solution
[Figure: general purpose memory and ECC protected memory, with ECC blocks and data IO]
User configurable ECC allocation through mode register

Correct/Repair (Scrub) During Refresh
[Figure: array and sense amplifier at the first, second and third refresh of ADDR 0, spaced by tREFW; labels: Refreshed WL, Corrected Area of WL]

Physical Repair During Refresh
- During refresh, the address of a fail can be stored
- If an address fails multiple times in a row, it is likely to be a hard fail
- If the failing address is a hard fail, the cell can be repaired either with a register or with a spare element
- Since refresh increments through all addresses, smarter repairs can be made
- If two or more bits fail on a bitline, a spare bitline can be used for replacement
- If two or more bits fail on a wordline, a spare wordline can be used for replacement
- The health of the chip observed during refresh scrubbing can be stored in a user-available chip health register

Repair during refresh
[Figure: three array diagrams with sense amplifier, bitlines and wordline]
- Hard failing cell: replace with redundant element, or with register
- Hard fail along a bitline: replace with spare bitline
- Hard fail along a wordline: replace with spare wordline

[Figure: ERR register, memory array, ECC blocks, data IO, SEC-DED out, self repair logic, redundancy activation]
- Real-time Single Error Correct, Double Error Detect output
- Fail register and self repair engine

Dynamic Refresh
Using ECC to control refresh rate
[Figure: refresh rate versus temperature, showing the typical refresh margin setting, the margin, the cell distribution of retention fails, and the ECC controlled refresh rate]
Increase the refresh rate if there are too many fails and reduce it if there are too few, always guaranteeing that the refresh rate follows the cell fails.
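The repair rules and the ECC-controlled refresh rate described in the slides above can be sketched in a few lines. Everything specific here is an assumption for illustration: the three-pass threshold for calling a cell a hard fail, the fail-count thresholds, the step factor and the interval limits are invented values, and the function names are hypothetical.

```python
from collections import Counter

HARD_FAIL_STREAK = 3  # assumed: consecutive failing scrub passes before a cell counts as hard


def update_fail_register(streaks, failing_cells):
    """Update the streak per (wordline, bitline) cell after one scrub pass.

    Cells that did not fail in this pass drop out, so only fails that occur
    multiple times in a row accumulate.
    """
    return {cell: streaks.get(cell, 0) + 1 for cell in failing_cells}


def plan_repairs(streaks):
    """Map hard fails to spare bitlines, spare wordlines or single-cell repairs."""
    hard = [cell for cell, n in streaks.items() if n >= HARD_FAIL_STREAK]
    per_wl = Counter(wl for wl, _ in hard)
    per_bl = Counter(bl for _, bl in hard)
    repairs = [("spare_bitline", bl) for bl, n in per_bl.items() if n >= 2]
    repairs += [("spare_wordline", wl) for wl, n in per_wl.items() if n >= 2]
    for wl, bl in hard:
        if per_bl[bl] < 2 and per_wl[wl] < 2:
            repairs.append(("register_or_spare_element", (wl, bl)))
    return repairs


def next_refresh_interval(interval_ms, corrected_fails, low=2, high=8,
                          step=1.25, min_ms=8.0, max_ms=512.0):
    """Shorten the interval when ECC corrects too many fails, lengthen it when too few."""
    if corrected_fails > high:
        return max(interval_ms / step, min_ms)   # refresh more often
    if corrected_fails < low:
        return min(interval_ms * step, max_ms)   # refresh less often, save power
    return interval_ms


streaks = {}
for _ in range(3):  # the same three cells fail in three consecutive scrub passes
    streaks = update_fail_register(streaks, [(7, 3), (9, 3), (20, 40)])
print(plan_repairs(streaks))
print(next_refresh_interval(64.0, corrected_fails=10))
```

In this run the two hard fails sharing bitline 3 are covered by one spare bitline, while the isolated cell falls back to a register or spare element.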

No need for tightly calibrated ...

Conclusion
- Scaling is pushing technologies to the limits, with error rates expected to increase
- Safety critical embedded applications introduce reliability challenges not met by traditional ECC
- Green Mountain's architecture is fully backwards compatible, with no additional latency or speed derating up to 2133 Mbps
- Low circuit area overhead allows for an economical ECC/non-ECC combo chip architecture, lowering product development cost
- Dynamic Refresh enables ultra low power at monitored, definable quality

References
- Tilak Agerwala - Data Centric Systems - The Next Paradigm in Computing (2014), Keynote Lecture, ICCP 2014
- Christian Constantinescu - Impact of Deep Submicron Technology on Dependability of VLSI Circuits (2002), International Conference on Dependable Systems and Networks (DSN), pp.

