
DEEP COMPRESSION: COMPRESSING DEEP NEURAL NETWORKS WITH PRUNING, TRAINED QUANTIZATION AND HUFFMAN CODING

Published as a conference paper at ICLR 2016.

Song Han, Stanford University, Stanford, CA 94305, USA; Huizi Mao, Tsinghua University, Beijing, 100084, China; William J. Dally, Stanford University, Stanford, CA 94305, USA / NVIDIA, Santa Clara, CA 95050, USA

Neural networks are both computationally intensive and memory intensive, making them difficult to deploy on embedded systems with limited hardware resources. To address this limitation, we introduce "deep compression", a three-stage pipeline: pruning, trained quantization and Huffman coding, that work together to reduce the storage requirement of neural networks by 35x to 49x without affecting their accuracy. Our method first prunes the network by learning only the important connections.
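The pruning stage described above keeps only the important connections. A minimal sketch of magnitude-based pruning (the function name and the percentile threshold are illustrative assumptions; in the paper the surviving connections are then retrained):

```python
import numpy as np

def magnitude_prune(weights, sparsity):
    """Zero out the smallest-magnitude weights.

    sparsity: fraction of connections to remove, e.g. 0.9 keeps ~10%.
    Returns the pruned weight matrix and the boolean keep-mask.
    """
    # Threshold at the given percentile of absolute weight values.
    threshold = np.percentile(np.abs(weights), sparsity * 100)
    mask = np.abs(weights) >= threshold
    return weights * mask, mask

rng = np.random.default_rng(0)
w = rng.standard_normal((64, 64))
pruned, mask = magnitude_prune(w, 0.9)  # roughly 10% of weights survive
```

After pruning, only the nonzero weights and the mask (stored as a sparse index structure) need to be kept, which is where the meta-data overhead mentioned below comes from.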

Huffman coding gives additional compression: between 35x and 49x overall. The compression rate already includes the meta-data for the sparse representation. ... During the update, all the gradients are grouped by centroid ("color"), summed together, multiplied by the learning rate, and subtracted from the shared centroids of the last iteration. For pruned AlexNet, we are able ...
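The centroid update just described can be sketched as follows (a minimal illustration, assuming the weights have already been clustered so that `assignments` maps each weight to its centroid index; the example values are made up):

```python
import numpy as np

def update_centroids(centroids, assignments, grads, lr):
    """One gradient step on shared (quantized) weights.

    Gradients of all weights sharing a centroid are summed,
    scaled by the learning rate, and subtracted from that centroid.
    """
    k = centroids.shape[0]
    # bincount with weights sums the gradients per centroid index.
    grouped = np.bincount(assignments.ravel(),
                          weights=grads.ravel(), minlength=k)
    return centroids - lr * grouped

centroids = np.array([-1.0, 0.0, 1.0, 2.0])       # shared weight values
assignments = np.array([[0, 1], [2, 2]])           # "color" of each weight
grads = np.array([[0.5, -0.2], [0.1, 0.3]])        # per-weight gradients
new_c = update_centroids(centroids, assignments, grads, lr=0.1)
# new_c == [-1.05, 0.02, 0.96, 2.0]
```

Because every weight in a cluster shares one stored value, only the centroid table and the per-weight cluster indices need to be kept, which is what makes the subsequent Huffman coding of the indices effective.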



