Machine Learning Applied to Weather Forecasting

Mark Holmstrom, Dylan Liu, Christopher Vo
Stanford University
(Dated: December 15, 2016)

Weather forecasting has traditionally been done by physical models of the atmosphere, which are unstable to perturbations, and thus are inaccurate for large periods of time. Since machine learning techniques are more robust to perturbations, in this paper we explore their application to weather forecasting to potentially generate more accurate weather forecasts for large periods of time. The scope of this paper was restricted to forecasting the maximum temperature and the minimum temperature for seven days, given weather data for the past two days. A linear regression model and a variation on a functional regression model were used, with the latter able to capture trends in the weather.

Both of our models were outperformed by professional weather forecasting services, although the discrepancy between our models and the professional ones diminished rapidly for forecasts of later days, and perhaps for even longer time scales our models could outperform professional ones. The linear regression model outperformed the functional regression model, suggesting that two days were too short for the latter to capture significant weather trends, and perhaps basing our forecasts on weather data for four or five days would allow the functional regression model to outperform the linear regression model.

I. INTRODUCTION

Weather forecasting is the task of predicting the state of the atmosphere at a future time and a specified location. Traditionally, this has been done through physical simulations in which the atmosphere is modeled as a fluid.

The present state of the atmosphere is sampled, and the future state is computed by numerically solving the equations of fluid dynamics and thermodynamics. However, the system of ordinary differential equations that governs this physical model is unstable under perturbations, and uncertainties in the initial measurements of the atmospheric conditions and an incomplete understanding of complex atmospheric processes restrict the extent of accurate weather forecasting to a 10 day period, beyond which weather forecasts are significantly unreliable. Machine learning, on the contrary, is relatively robust to perturbations and does not require a complete understanding of the physical processes that govern the atmosphere. Therefore, machine learning may represent a viable alternative to physical models in weather forecasting.

Two machine learning algorithms were implemented: linear regression and a variation of functional regression. A corpus of historical weather data for Stanford, CA was obtained and used to train these algorithms.

The input to these algorithms was the weather data of the past two days, which include the maximum temperature, minimum temperature, mean humidity, mean atmospheric pressure, and weather classification for each day. The output was then the maximum and minimum temperatures for each of the next seven days.

II. RELATED WORK

Related works included many different and interesting techniques to try to perform weather forecasts. While much of current forecasting technology involves simulations based on physics and differential equations, many new approaches from artificial intelligence used mainly machine learning techniques, mostly neural networks, while some drew on probabilistic models such as Bayesian networks. Out of the three papers on machine learning for weather prediction we examined, two of them used neural networks while one used support vector machines. Neural networks seem to be the popular machine learning model choice for weather forecasting because of their ability to capture the non-linear dependencies of past weather trends and future weather conditions, unlike the linear regression and functional regression models that we used.

This provides the advantage of not assuming simple linear dependencies of all features over our models. Of the two neural network approaches, one [3] used a hybrid model in which neural networks modeled the physics behind weather forecasting, while the other [4] applied learning more directly to predicting weather. Similarly, the approach using support vector machines [6] also applied the classifier directly for weather prediction but was more limited in scope than the neural network approaches. Other approaches for weather forecasting included using Bayesian networks. One interesting model [2] used Bayesian networks to model and make weather predictions, but used a machine learning algorithm to find the most optimal Bayesian networks and parameters, which was quite computationally expensive because of the large number of different dependencies, but performed well. Another approach [1] focused on a more specific case of predicting severe weather for a specific geographical location, which limited the need for fine-tuning Bayesian network dependencies but was limited in scope.

III. DATA AND FEATURES

The maximum temperature, minimum temperature, mean humidity, mean atmospheric pressure, and weather classification for each day in the years 2011-2015 for Stanford, CA were obtained from Weather Underground [7]. Originally, there were nine weather classifications: clear, scattered clouds, partly cloudy, mostly cloudy, fog, overcast, rain, thunderstorm, and snow. Since many of these classifications are similar and some are sparsely populated, these were reduced to four weather classifications by combining scattered clouds and partly cloudy into moderately cloudy; mostly cloudy, foggy, and overcast into very cloudy; and rain, thunderstorm, and snow into precipitation.
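The grouping described above amounts to a simple lookup table. Below is a minimal sketch, assuming the raw labels arrive as the lowercase strings listed in the previous paragraph; the dictionary and function names are illustrative and do not come from the paper.

```python
# Illustrative mapping from the nine raw Weather Underground labels to the
# four grouped classes. The exact label strings are an assumption.
CLASS_GROUPS = {
    "clear": "clear",
    "scattered clouds": "moderately cloudy",
    "partly cloudy": "moderately cloudy",
    "mostly cloudy": "very cloudy",
    "fog": "very cloudy",
    "overcast": "very cloudy",
    "rain": "precipitation",
    "thunderstorm": "precipitation",
    "snow": "precipitation",
}

def group_classification(raw_label: str) -> str:
    """Reduce a raw classification label to one of the four grouped classes."""
    return CLASS_GROUPS[raw_label.strip().lower()]
```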

The data from the first four years were used to train the algorithms, and the data from the last year was used as a test set. Sample data for January 1, 2015 are shown in Table I.

Number  Name                             Value
1       Classification                   Clear
2       Maximum Temperature (F)          57
3       Minimum Temperature (F)          33
4       Mean Humidity                    49
5       Mean Atmospheric Pressure (in)

TABLE I. Sample data from January 1, 2015, with the number, name, and value of each of the five features.

IV. METHODS

The first algorithm that was used was linear regression, which seeks to predict the high and low temperatures as a linear combination of the features. Since linear regression cannot be used with classification data, this algorithm did not use the weather classification of each day. As a result, only eight features were used: the maximum temperature, minimum temperature, mean humidity, and mean atmospheric pressure for each of the past two days. Thus, for the i-th pair of consecutive days, x^(i) ∈ R^9 is a nine-dimensional feature vector, where x_0 = 1 is defined as the intercept term.
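To make the dimensions concrete, the following sketch assembles these feature vectors, together with the 14-dimensional target vectors defined formally in the next paragraph, into a design matrix and a target matrix. The array layout and function name are assumptions for illustration, not code from the paper.

```python
import numpy as np

# `daily` is assumed to be an (n_days, 4) array in chronological order with
# columns [max_temp, min_temp, mean_humidity, mean_pressure].
def build_design_matrices(daily: np.ndarray):
    """Return X (m x 9) and Y (m x 14): each row of X is an intercept term
    followed by the four features of two consecutive days, and the matching
    row of Y holds the max/min temperatures of the next seven days."""
    X_rows, Y_rows = [], []
    for i in range(daily.shape[0] - 8):                       # 2 input days + 7 target days
        x = np.concatenate(([1.0], daily[i], daily[i + 1]))   # 1 + 4 + 4 = 9 features
        y = daily[i + 2:i + 9, :2].reshape(-1)                # 7 days x (max, min) = 14 targets
        X_rows.append(x)
        Y_rows.append(y)
    return np.array(X_rows), np.array(Y_rows)
```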

There are 14 quantities to be predicted for each pair of consecutive days: the high and low temperatures for each of the next seven days. Let y^(i) ∈ R^14 denote the 14-dimensional vector that contains these quantities for the i-th pair of consecutive days. The prediction of y^(i) given x^(i) is h_θ(x^(i)) = θ^T x^(i), where θ ∈ R^{9×14}. The cost function that linear regression seeks to minimize is

J(\theta) = \frac{1}{2} \sum_{i=1}^{m} \left\| h_\theta(x^{(i)}) - y^{(i)} \right\|^2,    (1)

where m is the number of training examples. Letting X ∈ R^{m×9} be defined such that X_ij = x^(i)_j and Y ∈ R^{m×14} be defined such that Y_ij = y^(i)_j, the value of θ that minimizes the cost in equation (1) is

\theta = (X^T X)^{-1} X^T Y.    (2)
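Equation (2) can be computed in a few lines; the sketch below uses a least-squares solver, which returns the same minimizer as the explicit normal equation but avoids forming (X^T X)^{-1} directly. The function names are illustrative, not from the paper.

```python
import numpy as np

def fit_linear_regression(X: np.ndarray, Y: np.ndarray) -> np.ndarray:
    """Least-squares fit of theta in equation (2); lstsq computes the same
    minimizer as (X^T X)^{-1} X^T Y with better numerical stability."""
    theta, *_ = np.linalg.lstsq(X, Y, rcond=None)   # theta has shape (9, 14)
    return theta

def predict_temperatures(theta: np.ndarray, x: np.ndarray) -> np.ndarray:
    """h_theta(x) = theta^T x: a 14-vector of max/min temperatures for the next seven days."""
    return theta.T @ x
```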

The second algorithm that was used was a variation of functional regression, which searches for historical weather patterns that are most similar to the current weather patterns, then predicts the weather based upon these historical patterns. Given a sequence of nine consecutive days, define its spectrum f as follows: let f(1), f(2) ∈ R^5 be the feature vectors for the first day and the second day, respectively, and for i in the range 3 to 9, let f(i) ∈ R^2 be a vector containing the maximum temperature and the minimum temperature for the i-th day in the sequence. Then define a metric on the space of spectra

d(f_1, f_2) = \sum_{j=1}^{2} \left[ w_1 \mathbf{1}\{ f_1(j)_1 \neq f_2(j)_1 \} + \sum_{k=2}^{5} w_k \left( f_1(j)_k - f_2(j)_k \right)^2 \right],    (3)

where w is a weight vector that assigns weights to each feature. Since the first feature is the weather classification and the difference between classifications is meaningless, the squared difference has been replaced by an indicator function of whether the classifications are different. Define a kernel

\mathrm{ker}(t) = \max\{1 - t, 0\}.    (4)
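The metric (3) and kernel (4) translate directly into code. The sketch below assumes each day's feature vector stores the classification as an integer id followed by the four numeric features; this encoding, like the function names, is an assumption rather than the authors' implementation.

```python
import numpy as np

def spectrum_distance(f1_first_two, f2_first_two, w):
    """Equation (3): distance between two spectra based on their first two days.
    Each day is assumed to be a 5-vector [classification_id, max_temp, min_temp,
    humidity, pressure]; w is a length-5 weight vector, with w[0] weighting the
    classification-mismatch indicator."""
    total = 0.0
    for j in range(2):
        a, b = np.asarray(f1_first_two[j], dtype=float), np.asarray(f2_first_two[j], dtype=float)
        total += w[0] * float(a[0] != b[0])                  # indicator replaces squared difference
        total += float(np.sum(w[1:] * (a[1:] - b[1:]) ** 2))
    return total

def ker(t: float) -> float:
    """Equation (4): ker(t) = max(1 - t, 0)."""
    return max(1.0 - t, 0.0)
```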

Let neigh_k(f) denote the k indices i ∈ {1, ..., m} of the k spectra in the training set that are the closest to f with respect to the metric d; that is,

d(f^{(i)}, f) < d(f^{(j)}, f)    (5)

for all i ∈ neigh_k(f) and j ∉ neigh_k(f), and |neigh_k(f)| = k. Furthermore, define

h = \max_{i \in \{1, \ldots, m\}} d(f^{(i)}, f).    (6)

Then, given the values f(1), f(2) of the first two days of a spectrum f, the remainder of the spectrum f(i) for i in the range 3 to 9 can be predicted as

\hat{f}(i) = \frac{ \sum_{j \in \mathrm{neigh}_k(f)} \mathrm{ker}\left( d(f^{(j)}, f)/h \right) f^{(j)}(i) }{ \sum_{j \in \mathrm{neigh}_k(f)} \mathrm{ker}\left( d(f^{(j)}, f)/h \right) }.    (7)

The error of the estimator \hat{f} is defined to be

\mathrm{Error} = \sum_{i=3}^{9} \left\| \hat{f}(i) - f(i) \right\|^2.    (8)

A more useful error that will be used in lieu of this is the root mean square (rms) error, which is defined to be

\mathrm{Error}_{\mathrm{rms}} = \sqrt{ \frac{1}{14} \sum_{i=3}^{9} \left\| \hat{f}(i) - f(i) \right\|^2 },    (9)

and provides the standard deviation of the individual errors. Since weather forecasting inherently involves time series, k-fold cross-validation is a poor technique to analyze whether our model will generalize to an independent test set.

Training Set Year(s)    Test Set Year
2011                    2012
2011-2012               2013
2011-2013               2014
2011-2014               2015

TABLE II. The four training sets and test sets used in the 4-fold forward chaining time-series cross validation.
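Putting the pieces together, the sketch below (reusing spectrum_distance and ker from the previous sketch) computes the kernel-weighted prediction of equation (7) and enumerates the forward-chaining train/test splits of Table II. The data structures are assumptions chosen for illustration, and degenerate cases (e.g. all distances being equal) are ignored.

```python
import numpy as np

def predict_spectrum(query_first_two, train_spectra, w, k):
    """Sketch of equation (7): kernel-weighted average of the k nearest training
    spectra. `train_spectra` is assumed to be a list of (first_two_days, rest)
    pairs, where `rest` is a (7, 2) array of max/min temperatures for days 3-9."""
    dists = np.array([spectrum_distance(query_first_two, first_two, w)
                      for first_two, _ in train_spectra])
    h = dists.max()                                   # normalizer as in equation (6)
    neigh = np.argsort(dists)[:k]                     # indices of the k closest spectra
    weights = np.array([ker(d / h) for d in dists[neigh]])
    rests = np.array([train_spectra[j][1] for j in neigh])        # shape (k, 7, 2)
    return (weights[:, None, None] * rests).sum(axis=0) / weights.sum()

def forward_chaining_splits(years=(2011, 2012, 2013, 2014, 2015)):
    """The four train/test splits of Table II: train on all years up to a given
    year and test on the following year."""
    for i in range(1, len(years)):
        yield list(years[:i]), years[i]               # e.g. ([2011, 2012], 2013)
```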

