Transcription of M.Tech. DATA ANALYTICS
(Applicable from 2017-18 onwards)
Department of Computer Applications
National Institute of Technology, Tiruchirappalli 620015, Tamil Nadu

SYLLABUS

Semester  Subject Code  Subject Name                                Credits
I         CA601         Statistical Computing                       3
          CA603         Big Data Analytics                          3
          CA605         Machine Learning Techniques                 3
          **            Elective-1                                  3
          **            Elective-2                                  3
          **            Elective-3                                  3
          CA609         Big Data Management and Data Analytics Lab  2
II        CS618         Real Time Systems                           3
          CA602         Next Generation Databases                   3
          CA604         High Performance Computing                  3
          **            Elective-4                                  3
          **            Elective-5                                  3
          **            Elective-6                                  3
          CA610         Machine Learning Lab                        2
III       CA647         Project work - Phase I                      12
IV        CA648         Project work - Phase II                     12
                        Total Credits                               64

LIST OF ELECTIVES

Semester  Subject Code  Subject Name                                Credits
I         CS655         Digital Forensics                           3
          CA611         Cyber Security and Information Assurance    3
          CA612         Natural Language Computing                  3
          CA613         Massive Graph Analysis                      3
          CA614         Bioinformatics                              3
          CA615         Parallel and Distributed Computing          3
          CA616         Data Acquisition and Productization         3
          CA617         Essentials of Human Resource Analytics      3
          CA618         Customer Relationship and Management        3
II        CA619         Principles of Deep Learning                 3
          CA620         Image and Video Analytics                   3
          CA621         Social Networking and Mining                3
          CA622         Web Intelligence                            3
          CA623         Internet of Things                          3
          CA624         Healthcare Data Analytics                   3
          CA625         Linked Open Data and Semantic Web           3
          CA626         Financial Risk Analytics and Management     3
          CA627         Logistics and Supply Chain Management       3

SEMESTER-I

CA601 STATISTICAL COMPUTING

Objectives:
To learn the probability distributions and density estimations to perform analysis of various kinds of data.
To explore the statistical analysis techniques using Python and R programming languages.

Sample Spaces - Events - Axioms - Counting - Conditional Probability and Bayes' Theorem - The Binomial Theorem

The Central Limit Theorem, distributions of the sample mean and the sample variance for a normal population, Sampling distributions (Chi-Square, t, F, z). Test of Hypothesis - Testing for Attributes - Mean of Normal Population - One-tailed and two-tailed tests, F-test and Chi-Square test - Analysis of variance (ANOVA)

References:
Learning R, O'Reilly.
Peter, Introductory Statistics with R, Springer Science & Business Media.
A Handbook of Statistical Analysis Using R, Second Edition, LLC.
Mastering Python for Data Science, Packt.
Introduction to Probability and Statistics for Engineers and Scientists, 4th edition, Academic Press.
R Cookbook, O'Reilly.
Learning Python, O'Reilly, 5th Edition, 2013.

Outcomes:
Students will be able to:
Implement statistical analysis techniques for solving practical problems.
Perform statistical analysis on a variety of data.

CA603 BIG DATA ANALYTICS

Objectives:
To optimize business decisions and create competitive advantage with Big Data analytics.
To explore the fundamental concepts of big data analytics.
To learn to analyze big data using intelligent techniques.
To understand the various search methods and visualization techniques.
To learn to use various techniques for mining data streams.
To understand the applications using MapReduce concepts.
To introduce programming tools PIG &

Introduction to Big Data Platform - Challenges of Conventional Systems - Intelligent data analysis

Introduction to Streams Concepts - Stream Data Model and Architecture - Stream Computing - Sampling Data in a Stream - Filtering Streams - Counting Distinct Elements in a Stream - Estimating Moments - Counting Oneness in a Window - Decaying Window - Real time Analytics Platform (RTAP)

History of Hadoop - The Hadoop Distributed File System - Components of Hadoop - Analysing the Data with Hadoop - Scaling Out - Hadoop Streaming - Design of HDFS - Java interfaces to HDFS Basics - Developing a MapReduce Application - How MapReduce Works - Anatomy of a MapReduce Job run - Failures - Job Scheduling - Shuffle and Sort

Applications on Big Data Using Pig and Hive - Data processing operators in Pig - Hive services - HiveQL

References:
Intelligent Data Analysis, Springer.
Hadoop: The Definitive Guide, Third Edition, O'Reilly Media.
Dirk DeRoos, Tom Deutsch, George Lapis, Paul Zikopoulos, Understanding Big Data: Analytics for Enterprise Class Hadoop and Streaming Data, McGraw Hill Publishing.
Mining of Massive Datasets, CUP.
Taming the Big Data Tidal Wave: Finding Opportunities in Huge Data Streams with Advanced Analytics, John Wiley & Sons.
Making Sense of Data, John Wiley & Sons.
Big Data Glossary, O'Reilly.
Micheline Kamber, Data Mining Concepts and Techniques, 2nd Edition, Elsevier.
Guoquing Chen, Geert Wets, Intelligent Data Mining, Springer.
Dirk de Roos, Krishnan Parasuraman, Thomas Deutsch, James Giles, David Corrigan, Harness the Power of Big Data: The IBM Big Data Platform, Tata McGraw Hill Publications.
Vijay Madisetti, Big Data Science & Analytics: A Hands-On Approach, VPT.
Analytics in a Big Data World: The Essential Guide to Data Science and its Applications (WILEY Big Data Series), John Wiley & Sons, 2014.

Outcomes:
Students will be able to:
Work with a big data platform and explore big data analytics techniques for business applications.
Design efficient algorithms for mining the data from large volumes.
Analyze the Hadoop and MapReduce technologies associated with big data analytics.
Explore Big Data applications using Pig and Hive.
Understand the fundamentals of various big data analytics techniques.
Build a complete business data analytics solution.

CA605 MACHINE LEARNING TECHNIQUES

Objectives:
To introduce the basic concepts and techniques of Machine Learning.
To develop the skills in using recent machine learning software for solving practical problems.
To be familiar with a set of well-known supervised,

Hierarchical Clustering - Agglomerative - Divisive - Distance measures; Density based Clustering - DBSCAN

References:
Elements of Statistical Learning, Springer.
Machine Learning, MIT Press.
Machine Learning: A Probabilistic Perspective, MIT Press.
Pattern Recognition and Machine Learning, Springer.
Shai Ben-David, Understanding Machine Learning: From Theory to Algorithms, Cambridge University Press.
Machine Learning For Dummies, John Wiley & Sons.

Outcomes:
Students will be able to:
Select real-world applications that need machine learning based solutions.
Implement and apply machine learning algorithms.
Select appropriate algorithms for solving a particular group of real-world problems.

CA609 BIG DATA MANAGEMENT AND DATA ANALYTICS LAB

Objectives:
Optimize business decisions and create competitive advantage with Big Data analytics.
Imparting the architectural concepts of Hadoop and introducing the MapReduce paradigm.
Introducing Java concepts required for developing MapReduce programs.
Derive business benefit from unstructured data.
Introduce programming tools PIG & HIVE in the Hadoop ecosystem.
Developing Big Data applications for streaming data using Apache Spark.

Lab Exercises:
1. (i) Perform setting up and installing Hadoop in its two operating modes: Pseudo distributed, Fully distributed.
(ii)
(i) Implement the following file management tasks in Hadoop:
o Adding files and directories
o Retrieving files
o Deleting files
(ii)
Find the number of occurrences of each word appearing in the input file(s).
Performing a MapReduce job for word search count (look for specific keywords in a file).
Input:
o A large textual file containing one sentence per line
o A small file containing a set of stop words (one stop word per line)
, which is a good candidate for analysis with MapReduce,
Find the average, max and min temperature for each year in the NCDC dataset.
Filter the readings of a set based on value of the measurement,
Instead of breaking the sales down by store, give us a sales breakdown by product category across all of our stores
o What is the value of total sales for the following categories?
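
As an illustration of the word-count exercise listed above (counting the occurrences of each word, with a stop-word file as a second input), the map, shuffle and reduce steps can be sketched in plain Python. This is only a local simulation under stated assumptions: it does not use Hadoop or its Java API, and the function names (map_phase, shuffle_phase, reduce_phase, word_count) as well as the sample sentences are hypothetical and are not taken from the syllabus.

```python
from collections import defaultdict

# Punctuation stripped from the ends of each token (an assumption of this sketch).
PUNCTUATION = ".,;:!?\"'()"


def map_phase(lines, stop_words):
    """Map step: emit a (word, 1) pair for every word that is not a stop word."""
    for line in lines:
        for token in line.lower().split():
            word = token.strip(PUNCTUATION)
            if word and word not in stop_words:
                yield word, 1


def shuffle_phase(pairs):
    """Shuffle step: group all emitted values by their key (the word)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce step: sum the grouped values to get one count per word."""
    return {word: sum(values) for word, values in groups.items()}


def word_count(lines, stop_words=frozenset()):
    """Run the three steps in sequence on an iterable of text lines."""
    return reduce_phase(shuffle_phase(map_phase(lines, stop_words)))


if __name__ == "__main__":
    # Hypothetical input: one sentence per line, plus a small stop-word set.
    sentences = ["The quick brown fox", "the lazy dog and the fox"]
    print(word_count(sentences, stop_words={"the", "and"}))
```

On a real cluster the grouping is done by the framework's shuffle-and-sort stage and the mapper and reducer run as separate distributed tasks; the sketch only mirrors that data flow on a single machine.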