Transcription of Adv. Complex Systems - Machine Learning
Adv. Complex Systems

Forecasting price increments using an artificial Neural Network

FILIPPO CASTIGLIONE
Center for Advanced Computer Science, University of Cologne, ZPR/ZAIK, Weyertal [...], D-[...] Köln, Germany
filippo@zpr.uni-koeln.de

ABSTRACT
Financial forecasting is a difficult task due to the intrinsic complexity of the financial system. A simplified approach in forecasting is given by "black box" methods like neural networks that assume little about the structure of the economy. In the present paper we relate our experience using neural nets as financial time series forecast method. In particular we show that a neural net able to forecast the sign of the price increments with a success rate slightly above [...] percent can be found. Target series are the daily closing price of different assets and indexes during the period from about January [...] to February [...].

KEYWORDS: Forecasting, Neural Networks, Financial Time Series, Detrending Analysis

Introduction

Forecasting future values of an asset gives, besides the straightforward profit opportunities, indications to compute various interesting quantities such as the price of derivatives (complex financial products) or the probability for an adverse mode, which is the essential information when assessing and managing the risk associated with a portfolio investment. Forecasting the price of a certain asset (stock, index, foreign currency, etc.)
on the ground of available historical data corresponds to the well-known problem in science and engineering of time series prediction. While many time series may be approximated with a high degree of confidence, financial time series are found among the most difficult to be analyzed and predicted. This is not surprising, since the dynamics of the markets, following at least the semi-strong EMH, should destroy any easy method to estimate future activities using past information.

Among the methods developed in Econometrics as well as other disciplines†, the artificial Neural Networks (NN) are being used by "non-orthodox" scientists as non-parametric regression methods (Campbell, Lo and MacKinlay; Moody and Neuneier, Zimmermann). They constitute an alternative to nonparametric regression methods like kernel regression (Campbell, Lo and MacKinlay). The advantage of using a neural network as nonlinear function approximator is that it appears to be well suited in areas where the mathematical knowledge of the stochastic process underlying the analyzed time series is unknown and quite difficult to be rationalized. Besides, it is important to note that the lack of linear correlations in the financial price series and the already accepted evidence of an underlying process different from i.i.d. noise point to the existence of higher-order correlations or non-linearities. It is this non-linear correlation that the neural net may eventually catch during its learning phase. If some macroscopic regularities, arising from the apparently chaotic behaviour of the large amount of components, are present, then a well trained net could identify and "store" them in its distributed knowledge representation system made by units and synaptic weights (Moody and Neuneier, Zimmermann; Refenes, Burgess and Bentz).

In the following we will see that a well suited NN for each of a set of price time series, showing a "surprising" rate of success in predicting the sign of the price change on a daily base, can be found. Not less interesting, we will see that the foretold regularities in the time series seem to be more present on larger time scale than on high frequency data, as the performance of the net degrades if we go from monthly to minutes data.

Multi-layer Perceptron

Multi-layer perceptrons (MLP) are the neural nets usually referred to as function approximators. A MLP is a generalization of Rosenblatt's perceptron with n_i input units, n_h hidden and n_o output units, with all feed-forward connections between adjacent layers (no intra-layer connections or loops). Such net's topology is specified as n_i-n_h-n_o. A NN may perform various tasks connected to classification problems.
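As an illustration of the n_i-n_h-n_o structure just described, the following is a minimal sketch of a forward pass through such a net. It is not taken from the paper: the function names (`init_mlp`, `forward`), the 3-7-1 sizes, the random initial weights, the linear output unit and the omission of bias terms are all assumptions made for the example.

```python
import numpy as np

def sigmoid(x):
    """Logistic sigmoid, used here as the nonlinear activation."""
    return 1.0 / (1.0 + np.exp(-x))

def init_mlp(n_i, n_h, n_o, seed=0):
    """Random weights for an n_i-n_h-n_o net: one matrix per pair of adjacent layers."""
    rng = np.random.default_rng(seed)
    return {
        "W_ih": rng.normal(scale=0.5, size=(n_h, n_i)),  # input -> hidden
        "W_ho": rng.normal(scale=0.5, size=(n_o, n_h)),  # hidden -> output
    }

def forward(weights, x):
    """Propagate one input vector through the net (no intra-layer links, no loops)."""
    hidden = sigmoid(weights["W_ih"] @ x)
    return weights["W_ho"] @ hidden  # linear output unit (an assumption)

# Illustrative sizes only: 3 inputs, 7 hidden units, 1 output.
net = init_mlp(n_i=3, n_h=7, n_o=1)
y = forward(net, np.array([0.1, -0.2, 0.05]))
n_connections = sum(w.size for w in net.values())
```

Because only adjacent layers are connected, the whole net is described by two weight matrices, and the number of connections is the total number of matrix entries.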
Here we are mainly interested in exploiting what is called the universal approximation property, that is, the ability to approximate any nonlinear function to any arbitrary degree of accuracy with a suitable number of hidden units (White; Cybenko). The approximation is performed by finding the set of weights connecting the units. This can be done with one of the available methods of non-parametric estimation techniques like nonlinear least-squares. In particular we choose error back-propagation, which is probably the most used algorithm to train MLPs (Rumelhart, Hinton and Williams). It is basically a gradient descent algorithm of the error computed on a suitable learning set. A variation of it uses bias terms and momentum as characteristic parameters. Moreover we fixed the learning rate [...], the momentum [...] and the usual sigmoidal as nonlinear activation function.

† See the vast bibliography with more than [...] entries at www.stern.nyu.edu/~aweigend/Time-Series/Biblio/SFIbib.html, reported from Weigend and Gershenfeld.

[Figure: learning set and forecast on the test set; x-axis: day; curves: P_day, G_day; data sets marked: Learning, Validation, Check, Test.]
Figure [...]: Each time series is divided in four data sets: learning, validation, checking and testing (see text for explanation). A difficulty arises from the fact that the oscillations in the test set are much more pronounced than in the learning set. In figure: daily closing price of Intel Corp.

Detrending analysis

We have trained the neural nets on "detrended" time series. The detrending analysis was performed to mitigate the unbalance between the learning set and the test set. In fact, subdividing the available data in learning set and testing set as specified in the following section (have a look at figure [...]),
we train the nets on a data set corresponding to a period much back in time, while we test the nets on a data set corresponding to the most recent period of time. This problem is known in the literature as the noise/nonstationarity tradeoff (Moody and Neuneier, Zimmermann). It is known that during the last ten years the American market has noticeably changed, in that almost all the titles connected to information technology have not only jumped to record values, but also the fluctuations of price today are much stronger than before†. Ignoring this fact would lead to a mistake, because the net would not learn the characteristics of the "actual" situation.

To detrend a time series we performed a nonlinear least squares fit using the Marquardt-Levenberg algorithm (Campbell, Lo and MacKinlay; Press, Teukolsky, Vetterling and Flannery) with a polynomial of sixth degree. Then we just computed the difference of the series with the fitting curve. For each time series considered we ended up with a detrended series composed by [...] points, corresponding to the period from about January [...] to February [...]. For example, the plot in figure [...] shows the detrended time series of the index S&P [...] along with the original series and the polynomial fit. We choose daily closing for [...] indexes and [...] assets historical series on the NYSE and Nasdaq. In particular the assets were chosen among the most active companies in the field of information technology.

† P_t is what we use to train our nets. Considering log(P_t) instead of P_t would mitigate the problem but it would introduce further nonlinearities.

[Figure: Detrend analysis; x-axis: days; curves: original time series, fit, detrended.]
Figure [...]: S&P [...] detrended time series. The plot shows the original series, the polynomial fit and the resulting detrended time series obtained just by difference between the original and the fitting curve. The detrended time series consist of [...] points.

Determining the net topology

One of the primary goals in training neural networks is to ensure that the network will perform well on data that it has not been trained on (called "generalization"). The standard method of ensuring good generalization is to divide our training data into multiple data sets. The most common data sets are the learning L, cross validation V and testing T data sets. While the learning data set is the data that is actually used to train the network, the usage of the other two may need some explanation. Like the learning data set, the cross validation data set is also used by the network during training. Periodically, while training on the learning data set,
the network is tested for performance on the cross validation set. During this testing the weights are not trained, but the performance of the network on the cross validation set is saved and compared to past values. If the network is starting to overtrain on the training data, the cross validation performance will begin to degrade. Thus, the cross validation data set is used to determine when the network has been trained as well as possible without overtraining (i.e., maximum generalization). Although the network is not trained with the cross validation set, it uses the cross validation set to choose a "best" set of weights. Therefore, it is not truly an out-of-sample test of the network. For a true test of the performance of the network the testing data set T is used. This data set is used to provide a true indication of how the network will perform on new data.

Figure [...]: A three layer perceptron with three inputs, seven hidden and one output units.

In figure [...] an example of MLP with n_i = 3, n_h = 7 and one output unit takes P_{t-2}, P_{t-1}, P_t in input and gives the successive value P_{t+1} as forecast. The number of free parameters is given by the number of connections between units, (n_i + n_o) n_h. While the choice of one output unit comes from the straightforward definition of the problem, a crucial question is: how many input and hidden units should we choose?
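The role of the three data sets can be made concrete with a small sketch. The code below is not the author's implementation; it assumes a synthetic noisy sine series in place of real detrended prices, a chronological 60/20/20 split into L, V and T, a linear output unit, no bias terms, and arbitrary values for the learning rate and momentum (the values fixed in the paper are not legible in this copy). The names `make_windows` and `train_early_stopping` are hypothetical.

```python
import numpy as np

def make_windows(series, n_i):
    """Inputs are n_i consecutive values; the target is the value that follows."""
    X = np.array([series[k:k + n_i] for k in range(len(series) - n_i)])
    y = np.array(series[n_i:])
    return X, y

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def mse(W1, W2, X, y):
    """Mean squared error of the net (sigmoid hidden layer, linear output)."""
    return float(np.mean((sigmoid(X @ W1) @ W2 - y) ** 2))

def train_early_stopping(XL, yL, XV, yV, n_h, epochs=500, lr=0.05, mom=0.5, seed=0):
    """Back-propagation with momentum on L; keep the weights that score best on V."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(scale=0.5, size=(XL.shape[1], n_h))
    W2 = rng.normal(scale=0.5, size=n_h)
    dW1, dW2 = np.zeros_like(W1), np.zeros_like(W2)
    best = (mse(W1, W2, XV, yV), W1.copy(), W2.copy())
    for _ in range(epochs):
        H = sigmoid(XL @ W1)                   # hidden activations
        err = H @ W2 - yL                      # output error on the learning set
        gW2 = H.T @ err / len(yL)
        gH = np.outer(err, W2) * H * (1 - H)   # error propagated back to hidden units
        gW1 = XL.T @ gH / len(yL)
        dW1 = mom * dW1 - lr * gW1             # momentum update
        dW2 = mom * dW2 - lr * gW2
        W1 += dW1
        W2 += dW2
        v = mse(W1, W2, XV, yV)                # V is only evaluated, never trained on
        if v < best[0]:
            best = (v, W1.copy(), W2.copy())
    return best

# Synthetic stand-in for a detrended price series (an assumption).
rng = np.random.default_rng(1)
series = np.sin(np.arange(400) / 10.0) + 0.1 * rng.normal(size=400)
X, y = make_windows(series, n_i=3)
nL, nV = int(0.6 * len(y)), int(0.8 * len(y))
XL, yL = X[:nL], y[:nL]          # learning set L (oldest data)
XV, yV = X[nL:nV], y[nL:nV]      # cross validation set V
XT, yT = X[nV:], y[nV:]          # testing set T (most recent data)

best_val, W1, W2 = train_early_stopping(XL, yL, XV, yV, n_h=7)
test_err = mse(W1, W2, XT, yT)   # the only out-of-sample figure
n_free = (3 + 1) * 7             # (n_i + n_o) n_h connections
```

Only `test_err` is an out-of-sample measure here, since V has already been used to select the weights.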
In general there is no way to determine a priori a good network topology.
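Since no a priori rule is available, one possible way to proceed (not necessarily the one followed in the paper) is to list candidate topologies and compare them empirically, for instance by their cross validation error after training. The sketch below covers only the first step: it enumerates candidate n_i-n_h-1 nets and counts their connections with the formula (n_i + n_o) n_h. The ranges for n_i and n_h and the helper names are assumptions.

```python
from itertools import product

def n_connections(n_i, n_h, n_o=1):
    """Connections of a fully connected n_i-n_h-n_o net without bias terms."""
    return (n_i + n_o) * n_h

def candidate_topologies(max_inputs, max_hidden, n_o=1):
    """All (n_i, n_h, n_o) with 1..max_inputs inputs and 1..max_hidden hidden units."""
    return [(n_i, n_h, n_o)
            for n_i, n_h in product(range(1, max_inputs + 1),
                                    range(1, max_hidden + 1))]

# Hypothetical search ranges; each candidate would then be trained and compared.
candidates = candidate_topologies(max_inputs=5, max_hidden=10)
sizes = {topo: n_connections(*topo) for topo in candidates}
```

Each entry of `sizes` gives the number of free parameters that a candidate net would have to fit on the learning set.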