
DEEP COMPRESSION: COMPRESSING DEEP NEURAL NETWORKS …

Published as a conference paper at ICLR 2016. DEEP COMPRESSION: COMPRESSING DEEP NEURAL NETWORKS WITH PRUNING, TRAINED QUANTIZATION AND HUFFMAN CODING. Song Han (songhan@stanford.edu), Stanford University, Stanford, CA 94305, USA; Huizi Mao, Tsinghua University, Beijing, 100084, China; William J. Dally, Stanford University, Stanford, CA 94305, USA / NVIDIA, Santa Clara, CA 95050, USA. Neural networks are both computationally intensive and memory intensive, making them difficult to deploy on embedded systems with limited hardware resources. To address this limitation, we introduce "deep compression", a three-stage pipeline: pruning, trained quantization and Huffman coding, that work together to reduce the storage requirement of neural networks by 35× to 49× without affecting their accuracy. Our method first prunes the network by learning only the important connections. Next, we quantize the weights to enforce weight sharing; finally, we apply Huffman coding. After the first two steps we retrain the network to fine-tune the remaining connections and the quantized centroids.
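The three stages the abstract describes can be sketched in NumPy: magnitude pruning, weight sharing via 1-D k-means ("trained quantization" without the retraining step, which is omitted here), and a Huffman-coding size estimate over the cluster indices. The function names, threshold, and cluster count below are illustrative choices, not the paper's exact configuration.

```python
import heapq
from collections import Counter
import numpy as np

def prune(weights, threshold):
    """Stage 1 (pruning): zero out connections below a magnitude threshold."""
    return np.where(np.abs(weights) < threshold, 0.0, weights)

def quantize(weights, n_clusters, n_iter=10):
    """Stage 2 (weight sharing): 1-D k-means over the surviving weights.

    Returns one index per weight plus the shared centroid table; code 0 is
    reserved for pruned (zero) weights.
    """
    nz = weights[weights != 0]
    # linear initialization over the weight range, as the paper recommends
    centroids = np.linspace(nz.min(), nz.max(), n_clusters)
    for _ in range(n_iter):
        idx = np.argmin(np.abs(nz[:, None] - centroids[None, :]), axis=1)
        for k in range(n_clusters):
            if np.any(idx == k):
                centroids[k] = nz[idx == k].mean()
    nearest = np.argmin(np.abs(weights[:, None] - centroids[None, :]), axis=1)
    codes = np.where(weights == 0, 0, nearest + 1)
    return codes, np.concatenate(([0.0], centroids))

def huffman_bits(codes):
    """Stage 3 (Huffman coding): total bits to encode the cluster indices.

    Uses the fact that the encoded length equals the sum of the weights of
    all internal nodes of the Huffman tree.
    """
    counts = list(Counter(codes.tolist()).values())
    if len(counts) == 1:
        return len(codes)  # degenerate case: a single symbol still costs 1 bit each
    heapq.heapify(counts)
    total = 0
    while len(counts) > 1:
        merged = heapq.heappop(counts) + heapq.heappop(counts)
        total += merged  # every merge adds one bit to each symbol beneath it
        heapq.heappush(counts, merged)
    return total
```

A quick end-to-end run on synthetic weights, to see the relative sizes:

```python
rng = np.random.default_rng(0)
w = rng.normal(0.0, 1.0, 1000).astype(np.float32)        # 4000 bytes raw
codes, centroids = quantize(prune(w, 0.5), n_clusters=8)
print(huffman_bits(codes) / 8, "bytes of Huffman-coded indices")
```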



