Transcription of Graph Convolutional Matrix Completion
1 Graph Convolutional Matrix CompletionRianne van den BergUniversity of AmsterdamAmsterdam, The N. KipfUniversity of AmsterdamAmsterdam, The Welling University of AmsterdamAmsterdam, The consider Matrix Completion for recommender systems from thepoint of view of link prediction on graphs. Interaction data suchas movie ratings can be represented by a bipartite user-item graphwith labeled edges denoting observed ratings. This representationis especially useful in the setting where additional Graph -based sideinformation is present. Building on recent progress in deep learn-ing on Graph -structured data, we propose a Graph auto-encoderframework based on differentiable message passing on the bipar-tite interaction Graph .
2 In settings where complimentary featureinformation or structured data such as a social network is avail-able, our framework outperforms recent state-of-the-art , to validate the proposed message passing scheme, wetest our model on standard collaborative filtering tasks and showcompetitive CONCEPTS Computing methodologies Learning latent representations;Machine learning;Neural networks;Factorization methods; Human-centered computing Collaborative filtering;Social networks;KEYWORDSG raph neural networks, Matrix Completion , collaborative filtering1 INTRODUCTIONWith the explosive growth of e-commerce and social media plat-forms, recommendation algorithms have become indispensabletools for many important subtask of recommender systems is Matrix comple-tion.
3 In this work, we view Matrix Completion as a link predictionproblem on graphs: the interaction data between users and itemscan be represented by a bipartite Graph between user and itemnodes, with observed ratings/purchases represented by links. Pre-dicting ratings then corresponds to predicting labeled links in thebipartite user-item accordance with this point of view, we propose Graph convo-lutional Matrix Completion (GC-MC): a Graph -based auto-encoderframework for Matrix Completion , which builds on recent progressin deep learning on Graph -structured data [1,4,5,7,13,17]. Theauto-encoder produces latent features of user and item nodes through also affiliated with the Canadian Institute for Advanced Research (CIFAR)Permission to make digital or hard copies of part or all of this work for personal orclassroom use is granted without fee provided that copies are not made or distributedfor profit or commercial advantage and that copies bear this notice and the full citationon the first page.
4 Copyrights for third-party components of this work must be all other uses, contact the owner/author(s).KDD 18 deep Learning Day, August 2018, London, UK 2018 Copyright held by the owner/author(s).a form of message passing on the bipartite interaction Graph . Theselatent user and item representations are used to reconstruct therating links through a bilinear benefit of formulating Matrix Completion as a link predic-tion task on a bipartite Graph becomes especially apparent whenrecommender graphs are accompanied with structured externalinformation such as social networks. Combining such external infor-mation with interaction data can alleviate performance bottlenecksrelated to the cold start problem. We demonstrate that our graphauto-encoder model efficiently combines interaction data with sideinformation, without resorting to recurrent frameworks as in [20].
5 We furthermore show that in the pure collaborative filtering setting,our methods is able to compete with recent state of the art main contributions are as follows: (1) we apply Graph neuralnetworks to the task of Matrix Completion with structured side-information, and show that our simple message passing modeloutperforms more complicated Graph -based approaches such as[20]. (2) we introduce node dropout, an effective regularizationtechnique that drops out the entire set of all outgoing messages ofa single node with a fixed open source implementation of this work can be found Matrix Completion AS LINKPREDICTION IN BIPARTITE GRAPHSC onsider a rating matrixMof dimensionsNu Nv, whereNuisthe number of users andNvis the number of items.
6 A nonzeroentryMi jin this Matrix represents an observed rating from userifor j=0reflects an unobserved rating. See Figure 1 foran illustration. The task of Matrix Completion consists of predictingthe value of unobserved entries an equivalent picture, Matrix Completion can be cast as a linkprediction problem on a bipartite user-item interaction Graph . Morespecifically, the interaction data can be represented by an undirectedgraphG=(W,E,R)with entities consisting of a collection ofuser nodesui Wuwithi {1,..,Nu}, and item nodesvj Wvwithj {1,..,Nv}, such thatWu Wv=W. The edges(ui,r,vj) Ecarry labels that represent ordinal rating levels, suchasr {1,..,R}=R. This connection was previously exploredin [16] and led to the development of Graph -based methods Graph -based approaches for recommender systems (see[16] for an overview) typically employ a multi-stage pipeline, con-sisting of a Graph feature extraction model and a link predictionmodel, all of which are trained separately.
7 Recent results, however,have shown that results can often be significantly improved by mod-eling Graph -structured data with end-to-end learning techniques[1,4,5,13,17,19] and specifically with Graph auto-encoders [12,26]KDD 18 deep Learning Day, August 2018, London, UKR. van den Berg, Kipf, M. WellingUsersItems00200400500103005000 UsersItemsRating matrixBipartite graph13524 UsersItemsLink prediction??GAEG raphAuto-Encoder???M2 Figure 1:Left: Rating matrixMwith entries that correspond to user-item interactions (ratings between 1-5) or missing ob-servations (0).Right: User-item interaction Graph with bipartite structure. Edges correspond to interaction events, numberson edges denote the rating a user has given to a particular item.
8 The Matrix Completion task ( predictions for unobservedinteractions) can be cast as a link prediction problem and modeled using an end-to-end trainable Graph unsupervised learning and link prediction. In what follows, weintroduce a specific variant of Graph auto-encoders for the task ofmatrix Completion . We will show how Graph -based side informationcan be incorporated Revisiting Graph auto-encodersWe revisit Graph auto-encoders, which were originally introduced in[12,26] as an end-to-end model for unsupervised learning [26] andlink prediction [12] on undirected graphs. We consider the setupintroduced in [12], where the Graph encoder modelZ=f(X,A)takes as input anN Dfeature matrixXand a Graph adjacencymatrixA, and produces anN Hnode embedding matrixZ=[z1.]
9 ,zN]T. The decoder model is a pairwise decoder A=д(Z),which takes pairs of node embeddings(zi,zj)and predicts entries Ai jin the adjacency Matrix . Note thatNdenotes the total numberof nodes,Dthe number of input features, andHthe bipartite recommender graphsG=(W,E,R), we can refor-mulate the encoder as[Zu,Zv]=f(Xu,Xv,M1,..,MR), whereMr {0,1}Nu Nvis the adjacency Matrix associated with ratingtyper R, such thatMrcontains1 s for those elements for whichthe original rating matrixMcontains observed ratings with now matrices of user and item embeddings withdimensionsNu HandNv H, respectively. A single user (item)embedding takes the form of a real-valued vectorzui(zvj) for useri(itemj).Analogously, we can reformulate the decoder as M=д(Zu,Zv), as a function acting on the user and item embeddings and return-ing a (reconstructed) rating Matrix Mof dimensionsNu Nv.
10 Wecan train this Graph auto-encoder by minimizing the reconstructionerror between the predicted ratings in Mand the observed ground-truth ratings inM. Examples of metrics for the reconstruction errorare the root mean square error, or the cross entropy when treatingthe rating levels as different recent state-of-the-art models for Matrix Completion [6,15,23,28] can be cast into this framework and understood as aspecial case of our model. An overview of these models is providedin Section Graph Convolutional encoderIn what follows, we propose a particular choice of encoder modelthat makes efficient use of weight sharing across locations in thegraph and that assigns separate processing channels for each edgetype (or rating type)r R.