Transcription of Distributed Representations of Sentences and Documents
Distributed Representations of Sentences and Documents
Quoc Le, Tomas Mikolov
Google Inc, 1600 Amphitheatre Parkway, Mountain View, CA 94043

Abstract

Many machine learning algorithms require the input to be represented as a fixed-length feature vector. When it comes to texts, one of the most common fixed-length features is bag-of-words. Despite their popularity, bag-of-words features have two major weaknesses: they lose the ordering of the words and they also ignore semantics of the words. For example, “powerful,” “strong” and “Paris” are equally distant. In this paper, we propose Paragraph Vector, an unsupervised algorithm that learns fixed-length feature representations from variable-length pieces of texts, such as sentences, paragraphs, and documents.
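The two bag-of-words weaknesses named in the abstract can be illustrated with a minimal sketch. It is not code from the paper: the toy vocabulary, the helper names `bag_of_words` and `euclidean`, and the choice of Euclidean distance are all assumptions made for illustration.

```python
from collections import Counter
import math

def bag_of_words(text, vocab):
    """Return a fixed-length count vector for `text` over `vocab` (hypothetical helper)."""
    counts = Counter(text.lower().split())
    return [counts[word] for word in vocab]

def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

# Toy vocabulary, assumed for this illustration only.
vocab = ["dog", "bites", "man", "powerful", "strong", "paris"]

# Weakness 1: word order is lost. Two sentences with opposite
# meanings map to the same count vector.
a = bag_of_words("dog bites man", vocab)
b = bag_of_words("man bites dog", vocab)
print(a == b)  # True

# Weakness 2: word semantics are ignored. Every pair of distinct
# single words is the same distance apart, related or not.
powerful = bag_of_words("powerful", vocab)
strong = bag_of_words("strong", vocab)
paris = bag_of_words("paris", vocab)
d_powerful_strong = euclidean(powerful, strong)
d_powerful_paris = euclidean(powerful, paris)
print(d_powerful_strong == d_powerful_paris)  # True
```

Each single word becomes a one-hot vector here, so any two distinct words differ in exactly two coordinates and their distance is always the same, whether or not the words are related in meaning.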
[...] For example, “powerful” and “strong” are close to each other, whereas “powerful” and “Paris” are more distant.