Domain Adaptation for Large-Scale Sentiment Classification ...

Domain Adaptation for Large-Scale Sentiment Classification :A deep learning ApproachXavier IRO, Universit e de Montr eal. Montr eal (QC), H3C 3J7, Canada(2)Heudiasyc, UMR CNRS 6599, Universit e de Technologie de Compi`egne, 60205 Compi`egne, FranceAbstractThe exponential increase in the availability ofonline reviews and recommendations makessentiment Classification an interesting topicin academic and industrial can span so many different domainsthat it is difficult to gather annotated train-ing data for all of them. Hence, this pa-per studies the problem of Domain adapta-tion for Sentiment classifiers, hereby a sys-tem is trained on labeled reviews from onesource Domain but is meant to be deployedon another.

We propose a deep learning ap-proach which learns to extract a meaningfulrepresentation for each review in an unsuper-vised fashion. Sentiment classifiers trainedwith this high-level feature representationclearly outperform state-of-the-art methodson a benchmark composed of reviews of 4types of Amazon products. Furthermore, thismethod scales well and allowed us to success-fully perform Domain Adaptation on a largerindustrial-strength dataset of 22 IntroductionWith the rise of social media such as blogs and so-cial networks, reviews, ratings and recommendationsare rapidly proliferating; being able to automaticallyfilter them is a current key challenge for businesseslooking to sell their wares and identify new marketopportunities.

This has created a surge of research insentiment Classification (or Sentiment analysis), whichaims to determine the judgment of a writer with re-Appearing inProceedings of the 28thInternational Con-ference on Machine learning , Bellevue, WA, USA, 2011 by the author(s)/owner(s).spect to a given topic based on a given textual com-ment. Sentiment analysis is now a mature machinelearning research topic, as illustrated with this review(Pang and Lee, 2008). Applications to many differ-ent domains have been presented, ranging from moviereviews (Panget al., 2002) and congressional floor de-bates (Thomaset al., 2006) to product recommenda-tions (Snyder and Barzilay, 2007; Blitzeret al.)

, 2007).This large variety of data sources makes it difficult andcostly to design a robust Sentiment classifier. Indeed,reviews deal with various kinds of products or servicesfor which vocabularies are different. For instance, con-sider the simple case of training a system analyzingreviews about only two sorts of products:kitchen ap-pliancesandDVDs. One set of reviews would con-tain adjectives such as malfunctioning , reliable or sturdy , and the other thrilling , horrific or hi-larious , etc. Therefore, data distributions are differ-ent across domains. One solution could be to learna different system for each Domain . However, thiswould imply a huge cost to annotate training data fora large number of domains and prevent us from ex-ploiting the information shared across domains.

Analternative strategy, evaluated here, consists in learn-ing a single system from the set of domains for whichlabeled and unlabeled data are available and then ap-ply it to any target Domain (labeled or unlabeled).This only makes sense if the system is able to discoverintermediate abstractions that are shared and mean-ingful across domains. This problem of training andtesting models on different distributions is known asdomain Adaptation (Daum e III and Marcu, 2006).In this paper, we propose a deep learning approachfor the problem of Domain Adaptation of sentimentclassifiers. The promising new area of deep Learn-ing has emerged recently; see (Bengio, 2009) for a re-view.

deep learning is based on algorithms fordis-covering intermediate representationsbuilt in aDomain Adaptation for Sentiment Classification with deep Learninghierarchical manner. deep learning relies on the dis-covery that unsupervised learning could be used to seteach level of a hierarchy of features, one level at atime, based on the features discovered at the previouslevel. These features have successfully been used toinitialize deep neural networks (Hinton and Salakhut-dinov, 2006; Hintonet al., 2006; Bengioet al., 2006).Imagine a probabilistic graphical model in which we in-troduce latent variables which correspond to the trueexplanatory factors of the observed data. It is likelythat answering questions and learning dependencies inthe space of these latent variables would be easier thananswering questions about the raw input.

A simple lin-ear classifier or non-parametric predictor trained fromas few as one or a few examples might be able to do thejob. The key to achieving this is learning better rep-resentations, mostly from unlabeled data: how this isdone is what differentiates deep learning deep learning system we introduce in Section 3is designed to use unlabeled data to extract high-levelfeatures from reviews. We show in Section 4 that senti-ment classifiers trained with these learnt features can:(i) surpass state-of-the-art performance on a bench-mark of 4 kinds of products and (ii) successfully per-form Domain Adaptation on a Large-Scale data set of 22domains, beating all of the baselines we Domain AdaptationDomain Adaptation considers the setting in which thetraining and testing data are sampled from differentdistributions.

Assume we have two sets of data: asourcedomainSproviding labeled training instancesand atargetdomainTproviding instances on whichthe classifier is meant to be deployed. We do not makethe assumption that these are drawn from the samedistribution, but rather thatSis drawn from a distri-butionpSandTfrom a distributionpT. The learningproblem consists in finding a function realizing a it is trained on data drawnfrompSand generalizes well on data drawn learning algorithms learns intermediate con-cepts between raw input and target. Our intuitionfor using it in this setting is that these intermediateconcepts could yield better transfer across for example that these intermediate conceptsindirectly capture things like product quality, productprice, customer service, etc.

Some of these conceptsare general enough to make sense across a wide rangeof domains (corresponding to products or services, inthe case of Sentiment analysis). Because the samewords or tuples of words may be used across domainsto indicate the presence of these higher-level concepts,it should be possible to discover them. Furthermore,because deep learning exploits unsupervised learningto discover these concepts, one can exploit the largeamounts of unlabeled data across all domains to learnthese intermediate representations. Here, as in manyother deep learning approaches, we do not engineerwhat these intermediate concepts should be, but in-stead use generic learning algorithms to discover Related WorkLearning setups relating to Domain Adaptation havebeen proposed before and published under differentnames.

Daum e III and Marcu (2006) formalized theproblem and proposed an approach based on a mix-ture model. A general way to address Domain adapta-tion is through instance weighting, in which instance-dependent weights are added to the loss function(Jiang and Zhai, 2007). Another solution to domainadaptation can be to transform the data representa-tions of the source and target domains so that theypresent the same joint distribution of observations andlabels. Ben-Davidet al.(2007) formally analyze theeffect of representation change for Domain adaptationwhile Blitzeret al.(2006) propose the Structural Cor-respondence learning (SCL) algorithm that makes useof the unlabeled data from the target Domain to finda low-rank joint representation of the , Domain Adaptation can be simply treated as astandard semi-supervised problem by ignoring the do-main difference and considering the source instancesas labeled data and the target ones as unlabeled data(Daiet al.)

Domain Adaptation for Large-Scale Sentiment Classification ...

Tags:

Information

Transcription of Domain Adaptation for Large-Scale Sentiment Classification ...

Related search queries

Domain Adaptation for Large-Scale Sentiment Classification ...

Tags:

Information

Related documents

Sentribute: Image Sentiment Analysis from a Mid-level ...

Predicting Croatian Phrase Sentiment Using a Deep Matrix ...

Dhrystone Benchmark - Dr. John S. Loomis

Countering the Adversary: Effective Policies or a DIME a ...

Published by: Institute for Massachusetts Studies and ...

Related search queries