Efficient Implementation of Ultrasound Color Doppler ...



Application Report SPRAB11, November 2008

Efficient Implementation of Ultrasound Color Doppler Algorithms on Texas Instruments C64x Platforms

[...] valve defects, [...] high performance C64x+ [...] wall filtering and flow power, velocity, [...]+ architecture, then applies this technique to Doppler processing algorithms, explains their mapping to the C64x architecture, [...] it will be shown that these algorithms can run on TI [...]

1 Background

[...] B (Brightness) [...] Color Doppler mode: Similar to B-mode, but transmits multiple pulses (ensemble) [...] direction, [...]'s high performance C64x+ DSP [1] [...] complexity estimates, [...]

2 Color Doppler Mode

[Figure: color Doppler system block diagram. Block labels that survive: Transducer, Frontend + RX Beamformer, B-Mode Imaging, Color Doppler Imaging, Tissue/Flow Decision, Color Scan Converter.]

[...] turbulence, and power estimates from the color Doppler imaging block are merged in the tissue flow decision block, [...]

- RF demodulation consisting of mixing, filtering, and decimation of echo data
- Wall filtering using a matrix initialized form of IIR filters [5] of demodulated data
- Color flow estimator [6] that estimates velocity, turbulence, and power together
- Flow power estimator that estimates the power of multiple sets (ensembles) [...]

3 Method for Estimating Complexity

[Figure: C64x+ CPU block diagram. Labels that survive: Instruction Fetch, SPLOOP Buffer, 16/32-Bit Instruction Dispatch, Path A Register File, Path B Register File.]

[...] (DSP cycles) on the C64x+ core, [...]+ cores, the general methodology carries over to other very long instruction word (VLIW) [...]

[...]+ Platform

During every clock cycle in the TI C64x+ platform, the instruction fetch, dispatch, and decode units deliver instructions to the eight functional units that reside in data paths A and B, [...] (L, S, M, and D) [...]
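Because each of the four unit types (L, S, M, D) exists once in data path A and once in data path B, a loop iteration cannot complete in fewer cycles than its busiest unit type allows. The following is a minimal sketch of that resource bound. The function name and the instruction counts are hypothetical, and the sketch ignores data dependencies and instructions that may run on more than one unit type.

```python
import math

UNITS_PER_TYPE = 2  # one L, S, M and D unit in each of data paths A and B


def resource_bound(instr_per_iteration):
    """Return (cycles, limiting unit type) for one loop iteration.

    instr_per_iteration maps a unit type ("L", "S", "M", "D") to the number
    of instructions of that type issued per iteration (hypothetical input).
    """
    cycles = {
        unit: math.ceil(count / UNITS_PER_TYPE)
        for unit, count in instr_per_iteration.items()
    }
    limiting = max(cycles, key=cycles.get)
    return cycles[limiting], limiting


# Hypothetical kernel: 4 loads/stores, 2 multiplies, 2 adds, 1 shift per iteration.
counts = {"D": 4, "M": 2, "L": 2, "S": 1}
bound, unit = resource_bound(counts)
print(bound, unit)
```

With these made-up counts the D units are the bottleneck, so the kernel would be described as load/store-limited.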

[...] instructions, and mapping of instructions to functional units, see the TMS320C64x/C64x+ DSP CPU and Instruction Set Reference Guide (SPRU732) [1]. Unless otherwise noted, [...] outer loop operations in nested-loop implementations, [...] the implementation may need to change the algorithm data path, without affecting performance, to get the lowest complexity on the C64x+ [...] the various units where these instructions could be mapped to are considered and, finally, the cycles on the unit that is most loaded are used as the estimate of the algorithm [...] [2], [3], [4]. In the next four sections, [...] (NMSE) [...]

[...] each scan line is processed independently by first mixing it with sinusoids to produce in-phase (I) and quadrature (Q) components, followed by low pass filtering (LPF) using a finite impulse response (FIR) filter to prevent aliasing, and finally decimating the filtered output by a factor, [...] (L) [...] (= T/S) [...]

$$\mathrm{Re}\{m_t\} = r_t \cos\left(\frac{2\pi f_c t}{f_s}\right), \quad 0 \le t \le T-1 \tag{1}$$

$$\mathrm{Im}\{m_t\} = r_t \sin\left(\frac{2\pi f_c t}{f_s}\right), \quad 0 \le t \le T-1 \tag{2}$$
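Equations 1 and 2 can be sketched in a few lines. This is a floating-point illustration only: the report works with integer input data and table-based sine/cosine values, which are not modeled here. The function name and the test tone are assumptions made for the example.

```python
import math


def mix_scan_line(r, fc, fs):
    """Mix one RF scan line r[0..T-1] into I/Q components (Equations 1 and 2)."""
    re_m, im_m = [], []
    for t, r_t in enumerate(r):
        phase = 2.0 * math.pi * fc * t / fs
        re_m.append(r_t * math.cos(phase))  # in-phase component
        im_m.append(r_t * math.sin(phase))  # quadrature component
    return re_m, im_m


# Hypothetical input: a pure tone at fc, sampled at fs = 4 * fc.
fc, fs, T = 5.0e6, 20.0e6, 8
r = [math.cos(2.0 * math.pi * fc * t / fs) for t in range(T)]
re_m, im_m = mix_scan_line(r, fc, fs)
print(sum(re_m) / T)
```

Mixing a tone at fc with a cosine at fc leaves a DC term of one half in the in-phase output, which is what the low pass filter in the next step is meant to keep.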

$$\mathrm{Re}\{e_d\} = \sum_{l=0}^{L-1} f_l \, \mathrm{Re}\{m_{t-l}\}, \quad 0 \le d \le D-1, \quad t = Sd \tag{3}$$

$$\mathrm{Im}\{e_d\} = \sum_{l=0}^{L-1} f_l \, \mathrm{Im}\{m_{t-l}\}, \quad 0 \le d \le D-1, \quad t = Sd \tag{4}$$

[...] (f_c) and front-end sampling frequency f_s, the beamformed RF data for each scan line, R_T, is first mixed to create in-phase and quadrature components of the mixed output vector, [...] lower-case letters are used as indexes into quantities whose maximum values are indicated by their upper-case counterparts, [...] (R, M), while their lowercase counterparts (r, m) [...]; whereas, subscripts used with elements denote their positions in the matrix/vector, [...] F, and down-sampled in a single step, [...] it has been assumed that the input R consists of 32-bit signed integers, [...] cosine values, [...]

[Figure: mixing kernel on the C64x+ core. Repeat the following kernel T/2 times to process T RF points: 16-bit sine and cosine values s_t, c_t, s_{t+1}, c_{t+1} and 32-bit inputs r_t, r_{t+1} are loaded from memory into registers and register pairs, multiplied with MPY2IR to give Re{m_t}, Im{m_t}, Re{m_{t+1}}, Im{m_{t+1}} in register pairs, then combined with PACKH2 (pack and store).]

[...] (<4 KB), [...]

[...] (mixing) to TI C64+ [...]

[...] this kernel can be considered to be load-limited and would need approximately T cycles for processing T RF points, [...]

Operation                  Cycles      C64x+ Instruction   CPU Units
Loads Input                (T/2)/2     LDDW                D1/D2
Loads Sine/Cosine          T/2         LDW                 D1/D2
Multiplies                 (2T/2)/2    MPY2IR              M1/M2
Adds (table indexing)      2T/4        ADD                 L1/L2
Shift                      T/2         SHR                 S1/S2
Modulus (table indexing)   T/4         AND                 L, S
Stores                     (2T/4)/2    STDW                D1/D2

[Figure: decimation kernel. The outer loop is repeated D/4 times to process D depth points and the inner loop is repeated L/2 times to process L filter coefficients. Filter coefficients f_l to f_{l+3} and packed real/imaginary samples are loaded with LDDW into register pairs and rearranged with PACK2 and PACKH2; the results are packed, shifted, and stored as complex y_d.]
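The decimating filter of Equations 3 and 4 evaluates the FIR sum only at the D output instants t = S*d instead of at all T input samples. A minimal floating-point sketch follows; the function name, the moving-average coefficients, and the treatment of samples before the start of the line as zero are assumptions made for the example, and the complex arithmetic applies the same real coefficients to the real and imaginary parts.

```python
def decimate_fir(m, f, S):
    """Filter and down-sample in one step: e[d] = sum_l f[l] * m[S*d - l].

    m is the mixed (complex) scan line, f the real FIR coefficients and S the
    decimation factor. Samples before the start of the line count as zero.
    """
    D = len(m) // S
    e = []
    for d in range(D):
        t = S * d
        acc = 0j
        for l, f_l in enumerate(f):
            if t - l >= 0:
                acc += f_l * m[t - l]
        e.append(acc)
    return e


# Hypothetical data: a 12-sample ramp filtered by a 4-tap moving average, S = 4.
m = [complex(t, -t) for t in range(12)]
f = [0.25, 0.25, 0.25, 0.25]
e = decimate_fir(m, f, S=4)
print(e)
```

Only D = T/S sums are formed, so the multiply count scales with D*L rather than T*L.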

[Figure: decimation kernel, continued. Pairs of DOTP2 products are summed with ADD into the real and imaginary outputs y^R_d and y^I_d.]

[...] [7], which is conceptually similar to clocking the mixed input samples into the two filters, one for the real part and one for the imaginary part, every cycle of the up-sampled clock, [...] (= T/S) [...] it runs for only D, not T, [...]

[...] (decimation) to TI C64+ Architecture

Operation        Cycles       C64x+ Instruction   CPU Units
Loads Input      (T/2)/2      LDDW                D1/D2
Loads Filter     (L/4)/2      LDDW                D1/D2
Multiplies       D*L/2        DOTP2               M1/M2
Adds             D*(L/2-1)    ADD                 L1/L2
Stores           (2D/4)/2     STDW                D1/D2

Test code using C and intrinsics was used to achieve pipelined kernel performance of L/2+S cycles-per-output point, [...]

[...] since it needs to filter very short sequences of ensemble size N, [...] a state-space formulation of the IIR filter [5] [...] the most common forms being zero, step, [...]

[...] X, with real coefficient matrix, W, given by,

$$Y_{D \times N} = X_{D \times N} \, W_{N \times N} \tag{5}$$

Both of these input and output matrices consist of D rows corresponding to the D depth points and N columns, [...] is created from the N decimated scan lines (E), [...] it is assumed that the input and output data are arranged ensemble-by-ensemble, ([...] all the N input/output points at a given depth point, d, lie adjacent to each other in memory, before the N input/output points of the next depth point).

[Figure: wall filter kernel. Packed real/imaginary samples at depth d and ensemble index n are loaded with LDDW; inner loop 1 (m) is repeated N times to process N ensemble points, starting from y^R = y^I = 0; outputs are packed with PACK2, shifted, and stored as complex y.]
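The wall filter amounts to multiplying each depth point's N complex ensemble samples by a real N x N coefficient matrix. The sketch below shows that product with the row-per-depth layout described above. The coefficient matrix used here is a hypothetical mean-removal matrix chosen only because its effect is easy to check; it is not the IIR-derived matrix of the report.

```python
def wall_filter(X, W):
    """Compute Y = X W for a D x N complex input and a real N x N matrix."""
    N = len(W)
    return [
        [sum(x_row[n] * W[n][m] for n in range(N)) for m in range(N)]
        for x_row in X
    ]


N = 4
# Hypothetical coefficients: identity minus the ensemble mean (removes the DC term).
W = [[(1.0 if n == m else 0.0) - 1.0 / N for m in range(N)] for n in range(N)]

X = [
    [5 + 2j] * N,                        # stationary echo: constant across the ensemble
    [1 + 0j, 0 + 1j, -1 + 0j, 0 - 1j],   # moving scatterer: phase rotates per pulse
]
Y = wall_filter(X, W)
print(Y[0])
print(Y[1])
```

The stationary row is suppressed while the rotating row passes through, which is the qualitative behavior expected of a wall filter. Each output point costs N complex-by-real products, so the total work grows as D*N*N.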

[Figure: wall filter kernel, continued. Inner loop 2 (n) is repeated N/2 times to process N ensemble points: coefficient pairs w_{n,m}, w_{n+1,m} are loaded with LDW, samples are rearranged with PACKH2, multiplied with DOTP2, and accumulated with ADD into y^R and y^I. The outer loop is repeated D times to process D depth points.]

[...] packing, [...] the two M units would be used the most because they would be used for the two sets (real and imaginary) [...] 2N real multiplies are needed for each point, taking DN^2 [...] this algorithm should need DN^2/2 [...]

Operation            Cycles     C64x+ Instruction   CPU Units
Loads Input          DN/4       LDDW                D1/D2
Loads Coefficients   N^2/2      LDDW                D1/D2
Multiplies           DN^2/2     DOTP2               M1/M2
Adds                 DN^2/2     ADD                 L1/L2
Stores               DN/4       STDW                D1

[...] (X_{DxN} = I_{DxN} + jQ_{DxN}), where all the matrices are of size DxN, the output vector P_D = {p_d} is computed as,

$$p_d = \sum_{n=0}^{N-1} \left( i_{d,n}^2 + q_{d,n}^2 \right), \quad 0 \le d \le (D-1) \tag{6}$$

[...] you can see that this algorithm is multiply-limited and would need DN/2 cycles to process D decimated points translating to a pipelined performance of N/2 [...] matching the benchmarks, [...]

[Figure: flow power kernel. The outer loop is repeated D/4 times to process D ensemble points and the inner loop is repeated N/2 times to process N ensemble points. Packed real/imaginary samples for depths d+1, d+2, d+3 are loaded with LDDW into register pairs, multiplied with DOTP2, and accumulated with ADD into SUM0, SUM1, SUM2, [...]; the results are shifted and stored.]

Operation     Cycles      C64x+ Instruction   CPU Units
Loads Input   DN/4        LDDW                D1/D2
Multiplies    DN/2        DOTP2               M1/M2
Adds          D(N-1)/2    ADD                 L1/L2
Stores        D/8         STDW                D1/D2

[...] Y, this module estimates blood flow velocity, power, [...] d, for computing both its velocity, v_d, and turbulence, t_d, estimates, it uses the correlations between adjacent ensemble points, c_d, and ensemble power, p_d, as shown below,

$$p_d = \sum_{n=0}^{N-1} y_{d,n} \, y_{d,n}^{*} \tag{7}$$

$$c_d = \sum_{n=0}^{N-2} y_{d,n+1} \, y_{d,n}^{*} \tag{8}$$

$$v_d = \tan^{-1}\left( \frac{\mathrm{Im}(c_d)}{\mathrm{Re}(c_d)} \right) \tag{9}$$

$$t_d = 1 - \frac{|c_d|}{p_d} \tag{10}$$

The module outputs estimates for flow power, velocity, [...] Y_{DxN} = {y_{d,n}} consists of packed 16-bit complex values; the power, velocity, [...] the DSP IQMath library [8] is used for the four quadrant inverse tangent (_IQNatan2), division (_IQNdiv), and magnitude (_IQNmag) [...]: one for power and correlation, the second one for velocity, [...] this kernel is multiply-limited [...]
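The power, lag-one correlation, velocity, and turbulence estimates for a single depth point can be sketched as follows. This is a floating-point illustration that uses Python's standard `math` and `cmath` modules in place of the fixed-point IQMath routines named above. The function name, the choice of which factor is conjugated in the correlation, and the guard against zero power are assumptions made for the example.

```python
import cmath
import math


def color_flow_estimates(y):
    """Estimate (power, velocity phase, turbulence) from N complex samples."""
    N = len(y)
    # Ensemble power: sum of |y_n|^2 over all N samples.
    p = sum((y[n] * y[n].conjugate()).real for n in range(N))
    # Correlation between adjacent ensemble points (N - 1 products).
    c = sum(y[n + 1] * y[n].conjugate() for n in range(N - 1))
    # Four-quadrant inverse tangent of the correlation gives the phase shift per pulse.
    v = math.atan2(c.imag, c.real)
    # Turbulence: one minus the normalized correlation magnitude (guarded for p == 0).
    t = 1.0 - abs(c) / p if p > 0 else 0.0
    return p, v, t


# Hypothetical input: unit-amplitude samples rotating by 0.5 rad per pulse, N = 8.
y = [cmath.exp(1j * 0.5 * n) for n in range(8)]
p, v, t = color_flow_estimates(y)
print(p, v, t)
```

For this synthetic input the recovered phase step is 0.5 rad. The turbulence value is not zero even for a perfectly steady rotation, because the correlation sums N - 1 products while the power sums N.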

