
Time-series Generative Adversarial Networks


Jinsung Yoon (University of California, Los Angeles, USA), Daniel Jarrett (University of Cambridge, UK), Mihaela van der Schaar (University of Cambridge, UK; University of California, Los Angeles, USA; Alan Turing Institute, UK)

Abstract

A good generative model for time-series data should preserve temporal dynamics, in the sense that new sequences respect the original relationships between variables across time. Existing methods that bring generative adversarial networks (GANs) into the sequential setting do not adequately attend to the temporal correlations unique to time-series data. At the same time, supervised models for sequence prediction, which allow finer control over network dynamics, are inherently deterministic.

We propose a novel framework for generating realistic time-series data that combines the flexibility of the unsupervised paradigm with the control afforded by supervised training. Through a learned embedding space jointly optimized with both supervised and adversarial objectives, we encourage the network to adhere to the dynamics of the training data during sampling. Empirically, we evaluate the ability of our method to generate realistic samples using a variety of real and synthetic time-series datasets. Qualitatively and quantitatively, we find that the proposed framework consistently and significantly outperforms state-of-the-art benchmarks with respect to measures of similarity and predictive ability.

Introduction

What is a good generative model for time-series data?

The temporal setting poses a unique challenge to generative modeling. A model is not only tasked with capturing the distributions of features within each time point; it should also capture the potentially complex dynamics of those variables across time. Specifically, in modeling multivariate sequential data $x_{1:T} = (x_1, \dots, x_T)$, we wish to accurately capture the conditional distribution $p(x_t \mid x_{1:t-1})$ of temporal transitions as well.

On the one hand, a great deal of work has focused on improving the temporal dynamics of autoregressive models for sequence prediction. These primarily tackle the problem of compounding errors during multi-step sampling, introducing various training-time modifications to more accurately reflect testing-time conditions [1, 2, 3].

Autoregressive models explicitly factor the distribution of sequences into a product of conditionals $\prod_t p(x_t \mid x_{1:t-1})$. However, while useful in the context of forecasting, this approach is fundamentally deterministic, and is not truly generative in the sense that new sequences can be randomly sampled from them without external conditioning. On the other hand, a separate line of work has focused on directly applying the generative adversarial network (GAN) framework to sequential data, primarily by instantiating recurrent networks for the roles of generator and discriminator [4, 5, 6]. While straightforward, the adversarial objective seeks to model $p(x_{1:T})$ directly, without leveraging the autoregressive prior. Importantly, simply summing the standard GAN loss over sequences of vectors may not be sufficient to ensure that the dynamics of the network efficiently capture the stepwise dependencies present in the training data.
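To make the deterministic nature of the autoregressive factorization concrete, here is a minimal sketch (PyTorch; all module names and dimensions are our own illustrative choices, not the paper's code) of a recurrent model trained with teacher forcing to predict each $x_t$ from the prefix $x_{1:t-1}$. Sampling from such a model means feeding its own point predictions back as input, so without an external noise source it cannot produce diverse new sequences.

```python
import torch
import torch.nn as nn

class AutoregressiveRNN(nn.Module):
    """Minimal autoregressive sequence model: a GRU summarizes x_{1:t-1}
    and a linear head predicts x_t. A sketch for illustration only."""
    def __init__(self, n_features, hidden_dim=64):
        super().__init__()
        self.rnn = nn.GRU(n_features, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, n_features)

    def forward(self, x):
        # Teacher forcing: condition every step on the ground-truth prefix.
        h, _ = self.rnn(x[:, :-1])   # hidden state at step t summarizes x_{1:t}
        return self.head(h)          # predictions for x_2, ..., x_T

model = AutoregressiveRNN(n_features=5)
x = torch.randn(32, 24, 5)                 # (batch, time, features) toy data
loss = nn.MSELoss()(model(x), x[:, 1:])    # stepwise prediction loss
loss.backward()
```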

In this paper, we propose a novel mechanism to tie together both threads of research, giving rise to a generative model explicitly trained to preserve temporal dynamics. We present Time-series Generative Adversarial Networks (TimeGAN), a natural framework for generating realistic time-series data in various domains. First, in addition to the unsupervised adversarial loss on both real and synthetic sequences, we introduce a stepwise supervised loss using the original data as supervision, thereby explicitly encouraging the model to capture the stepwise conditional distributions in the data. This takes advantage of the fact that there is more information in the training data than simply whether each datum is real or synthetic; we can expressly learn from the transition dynamics of real data. Second, we introduce an embedding network to provide a reversible mapping between features and latent representations, thereby reducing the high dimensionality of the adversarial learning space. This capitalizes on the fact that the temporal dynamics of even complex systems are often driven by fewer and lower-dimensional factors of variation.

Importantly, the supervised loss is minimized by jointly training both the embedding and generator networks, such that the latent space not only serves to promote parameter efficiency; it is specifically conditioned to facilitate the generator in learning temporal relationships. Finally, we generalize our framework to handle the mixed-data setting, where both static and time-series data can be generated at the same time.
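The interplay of these signals can be made concrete with a schematic sketch. This is our own simplification, not the authors' released code: the module names (embed, recover, gen, superv, disc), shapes, and loss weights below are all illustrative assumptions, and the paper's actual discriminator and supervisor are recurrent networks trained in several phases.

```python
import torch
import torch.nn as nn

B, T, F, H = 32, 24, 5, 8                  # batch, length, features, latent dim

# Hypothetical stand-ins for the framework's components (names are ours).
embed   = nn.GRU(F, H, batch_first=True)   # features -> latent codes
recover = nn.Linear(H, F)                  # latent   -> features (reversible map)
gen     = nn.GRU(F, H, batch_first=True)   # noise    -> synthetic latent codes
superv  = nn.GRU(H, H, batch_first=True)   # h_{1:t-1} -> predicted next code
disc    = nn.Linear(H, 1)                  # latent   -> real/fake logit per step

x = torch.randn(B, T, F)                   # a batch of real sequences (toy data)
z = torch.randn(B, T, F)                   # noise sequences for the generator

h_real, _ = embed(x)                       # embed real data into latent space
h_fake, _ = gen(z)                         # generate synthetic latent sequences

# (1) Reconstruction loss: the embedding must be reversible.
loss_rec = nn.MSELoss()(recover(h_real), x)

# (2) Stepwise supervised loss on *real* latents: predict h_t from h_{1:t-1},
#     explicitly teaching the transition dynamics of the training data.
h_pred, _ = superv(h_real[:, :-1])
loss_sup = nn.MSELoss()(h_pred, h_real[:, 1:])

# (3) Unsupervised adversarial loss on synthetic latents (generator side).
loss_adv = nn.BCEWithLogitsLoss()(disc(h_fake), torch.ones(B, T, 1))

# Joint objective; the weights are illustrative, not the paper's values.
loss_g = loss_adv + 10.0 * loss_sup + 10.0 * loss_rec
loss_g.backward()
```

Because the supervised loss backpropagates into the embedding network as well as the generator, the latent space itself is shaped to make the stepwise dynamics easy to learn, which is the joint-training property emphasized above.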

Our approach is the first to combine the flexibility of the unsupervised GAN framework with the control afforded by supervised training in autoregressive models. We demonstrate the advantages in a series of experiments on multiple real-world and synthetic datasets. Qualitatively, we conduct t-SNE [7] and PCA [8] analyses to visualize how well the generated distributions resemble the original distributions. Quantitatively, we examine how well a post-hoc classifier can distinguish between real and generated sequences. Furthermore, by applying the "train on synthetic, test on real (TSTR)" framework [5, 9] to the sequence prediction task, we evaluate how well the generated data preserves the predictive characteristics of the original. We find that TimeGAN achieves consistent and significant improvements over state-of-the-art benchmarks in generating realistic time-series data.
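The TSTR protocol is simple enough to state in a few lines. The sketch below uses a linear (ridge) predictor purely for brevity, whereas the cited papers use recurrent predictors; the function and variable names are our own placeholders.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_absolute_error

def tstr_score(synthetic, real):
    """Train on Synthetic, Test on Real: fit a predictor on generated
    sequences only, then measure its error on held-out real sequences.
    Arrays are (n_sequences, seq_len, n_features); a simplified sketch."""
    def to_xy(seqs):
        X = seqs[:, :-1].reshape(len(seqs), -1)   # flattened history
        y = seqs[:, -1, 0]                        # last step of first feature
        return X, y

    X_syn, y_syn = to_xy(synthetic)
    X_real, y_real = to_xy(real)
    model = Ridge().fit(X_syn, y_syn)             # never sees real data
    return mean_absolute_error(y_real, model.predict(X_real))

rng = np.random.default_rng(0)
synthetic = rng.normal(size=(100, 24, 5))         # stand-in for generated data
real = rng.normal(size=(100, 24, 5))              # stand-in for a real test set
print(tstr_score(synthetic, real))                # lower error = more useful data
```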

Related Work

TimeGAN is a generative time-series model, trained adversarially and jointly via a learned embedding space with both supervised and unsupervised losses. As such, our approach straddles the intersection of multiple strands of research, combining themes from autoregressive models for sequence prediction, GAN-based methods for sequence generation, and time-series representation learning.

Autoregressive recurrent networks trained via the maximum likelihood principle [10] are prone to potentially large prediction errors when performing multi-step sampling, due to the discrepancy between closed-loop training (i.e. conditioned on ground truths) and open-loop inference (i.e. conditioned on previous guesses). Based on curriculum learning [11], Scheduled Sampling was first proposed as a remedy, whereby models are trained to generate output conditioned on a mix of both previous guesses and ground-truth data [1]. Inspired by adversarial domain adaptation [12], Professor Forcing involved training an auxiliary discriminator to distinguish between free-running and teacher-forced hidden states, thus encouraging the network's training and sampling dynamics to converge [2]. Actor-critic methods [13] have also been proposed, introducing a critic conditioned on target outputs, trained to estimate next-token value functions that guide the actor's free-running predictions [3]. However, while the motivation for these methods is similar to ours in accounting for stepwise transition dynamics, they are inherently deterministic, and do not accommodate explicitly sampling from a learned distribution, which is central to our goal of synthetic data generation.
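As an illustration of the Scheduled Sampling idea from [1], the following minimal sketch mixes teacher-forced inputs with the model's own previous predictions at a rate epsilon; the function and variable names are ours, and the decay schedule for epsilon is left out for brevity.

```python
import random
import torch
import torch.nn as nn

def scheduled_sampling_step(cell, head, x, epsilon):
    """One training pass in the spirit of Scheduled Sampling [1]: at each
    step, feed the ground truth with probability epsilon, otherwise feed
    the model's own previous prediction. In practice epsilon decays over
    training so conditions approach open-loop inference. A sketch only."""
    batch, T, _ = x.shape
    h = torch.zeros(batch, cell.hidden_size)
    inp, loss = x[:, 0], 0.0
    for t in range(1, T):
        h = cell(inp, h)
        pred = head(h)
        loss = loss + nn.functional.mse_loss(pred, x[:, t])
        # Coin flip: teacher-forced input vs. the model's own guess.
        inp = x[:, t] if random.random() < epsilon else pred.detach()
    return loss / (T - 1)

cell, head = nn.GRUCell(5, 32), nn.Linear(32, 5)
loss = scheduled_sampling_step(cell, head, torch.randn(16, 24, 5), epsilon=0.75)
loss.backward()
```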

On the other hand, multiple studies have straightforwardly inherited the GAN framework within the temporal setting. The first, C-RNN-GAN [4], directly applied the GAN architecture to sequential data, using LSTM networks for the generator and discriminator. Data is generated recurrently, taking as inputs a noise vector and the data generated at the previous time step. Recurrent Conditional GAN (RCGAN) [5] took a similar approach, introducing minor architectural differences such as dropping the dependence on the previous output while conditioning on additional input [14]. A multitude of applied studies have since utilized these frameworks to generate synthetic sequences in such diverse domains as text [15], finance [16], biosignals [17], sensor [18] and smart grid data [19], as well as renewable scenarios [20]. Recent work [6] has proposed conditioning on timestamp information to handle irregular sampling.
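A minimal rendering of this recurrent generation scheme is sketched below (our own PyTorch simplification of the scheme described in [4], not the authors' code; the LSTM cell follows the original, but every dimension and name here is an assumption).

```python
import torch
import torch.nn as nn

class RecurrentGenerator(nn.Module):
    """Sampling in the style of C-RNN-GAN [4]: each step consumes a fresh
    noise vector concatenated with the previous output. A sketch of the
    published scheme, not the authors' code; dimensions are arbitrary."""
    def __init__(self, noise_dim=8, n_features=5, hidden_dim=32):
        super().__init__()
        self.noise_dim = noise_dim
        self.n_features = n_features
        self.cell = nn.LSTMCell(noise_dim + n_features, hidden_dim)
        self.head = nn.Linear(hidden_dim, n_features)

    def forward(self, batch, seq_len):
        h = torch.zeros(batch, self.cell.hidden_size)
        c = torch.zeros_like(h)
        prev = torch.zeros(batch, self.n_features)   # no output before t=1
        steps = []
        for _ in range(seq_len):
            z = torch.randn(batch, self.noise_dim)   # fresh noise per step
            h, c = self.cell(torch.cat([z, prev], dim=-1), (h, c))
            prev = self.head(h)
            steps.append(prev)
        return torch.stack(steps, dim=1)             # (batch, time, features)

fake = RecurrentGenerator()(batch=16, seq_len=24)    # synthetic sequences
```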

However, unlike our proposed technique, these approaches rely only on the binary adversarial feedback for learning, which by itself may not be sufficient to guarantee specifically that the network efficiently captures the temporal dynamics in the training data.

Finally, representation learning in the time-series setting primarily deals with the benefits of learning compact encodings for downstream tasks such as prediction [21], forecasting [22], and classification [23]. Other works have studied the utility of learning latent representations for purposes of pre-training [24], disentanglement [25], and interpretability [26]. Meanwhile, in the static setting, several works have explored the benefit of combining autoencoders with adversarial training, with objectives such as learning similarity measures [27], enabling efficient inference [28], as well as improving generative capability [29], an approach that has subsequently been applied to generating discrete structures by encoding and generating entire sequences for discrimination [30].

