arXiv:1609.03499v2 [cs.SD] 19 Sep 2016
WAVENET: A GENERATIVE MODEL FOR RAW AUDIO

Aäron van den Oord, Sander Dieleman, Heiga Zen, Karen Simonyan, Oriol Vinyals, Alex Graves, Nal Kalchbrenner, Andrew Senior, Koray Kavukcuoglu
{avdnoord, sedielem, heigazen, simonyan, vinyals, gravesa, nalk, andrewsenior}@google.com
DeepMind, London, UK
Google, London, UK

ABSTRACT

This paper introduces WaveNet, a deep neural network for generating raw audio waveforms. The model is fully probabilistic and autoregressive, with the predictive distribution for each audio sample conditioned on all previous ones; nonetheless we show that it can be efficiently trained on data with tens of thousands of samples per second of audio. When applied to text-to-speech, it yields state-of-the-art performance, with human listeners rating it as significantly more natural sounding than the best parametric and concatenative systems for both English and Mandarin. A single WaveNet can capture the characteristics of many different speakers with equal fidelity, and can switch between them by conditioning on the speaker identity.
where $-1 < x_t < 1$ and $\mu = 255$. This non-linear quantization produces a significantly better reconstruction than a simple linear quantization scheme. …
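The quantization described here can be illustrated with a minimal sketch. The excerpt does not show the transformation itself, so the code assumes the standard mu-law companding formula, f(x) = sign(x) ln(1 + mu |x|) / ln(1 + mu), with mu = 255 as stated in the text. The function names, the rounding scheme, and the 256-level linear baseline are illustrative assumptions, not the authors' implementation.

```python
import math

MU = 255  # value of mu given in the text


def mu_law_encode(x, mu=MU):
    """Compand a sample x in (-1, 1) and quantize it to an integer in [0, mu].

    Assumes the standard mu-law formula; the rounding scheme is illustrative.
    """
    if not -1.0 < x < 1.0:
        raise ValueError("x must lie strictly between -1 and 1")
    # Logarithmic compression: small amplitudes get more of the output range.
    y = math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)
    # Map [-1, 1] onto the integer codes 0..mu, rounding to nearest.
    return int((y + 1) / 2 * mu + 0.5)


def mu_law_decode(q, mu=MU):
    """Invert mu_law_encode up to quantization error."""
    y = 2 * q / mu - 1
    return math.copysign(math.expm1(abs(y) * math.log1p(mu)) / mu, y)


def linear_encode(x, levels=256):
    """Uniform quantization baseline (hypothetical, for comparison only)."""
    return int((x + 1) / 2 * (levels - 1) + 0.5)


def linear_decode(q, levels=256):
    """Invert linear_encode up to quantization error."""
    return 2 * q / (levels - 1) - 1


if __name__ == "__main__":
    x = 0.01  # a quiet sample, typical of low-amplitude speech
    mu_err = abs(mu_law_decode(mu_law_encode(x)) - x)
    lin_err = abs(linear_decode(linear_encode(x)) - x)
    print(f"mu-law error: {mu_err:.6f}, linear error: {lin_err:.6f}")
```

With the same 256 codes, the mu-law scheme spends more of them near zero, so quiet samples are reconstructed more accurately than with uniform steps, at the cost of coarser steps near full scale. This only illustrates the mechanism; it does not reproduce the reconstruction comparison reported in the paper.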