
Inferring user traits via unsupervised methods


Characterizing the Ethereum address space
Inferring user traits via unsupervised methods
James Payette (1), Samuel Schwager (2), Joseph Murphy (3)
(1) Department of Computer Science, jpayette@stanford.edu
(2) Department of MCS, sams95@stanford.edu
(3) Department of Physics, murphyjm@stanford.edu


Transcription of Inferring user traits via unsupervised methods

Sections: Data Acquisition, Data and Feature Set, Models and Analysis, Results and Discussion, Ongoing Investigations, References

Data Acquisition
Successful, efficient data acquisition was a major milestone for our project. [...], we recursively scraped data from the publicly available blockchain, eventually aggregating a dataset of 250,000 unique addresses. We queried the Etherscan API for an address's Ethereum balance and all of its transactions ( [...] ).

Data and Feature Set
We tried to select features that, when aggregated, [...]: total Ether, number of transactions, transactions per month, average Ether transaction, [...]

[...], yet anonymous ledgers, or blockchains, [...], known only by their addresses, would have enormous security implications [1]. We examine the blockchain of Ethereum with the objective of clustering addresses into distinct behavior groups.

Figure: example transaction on the Ethereum blockchain [2].

The Ethereum address space
The main objective of our quantitative analysis was to use clustering evaluation metrics and Principal Component Analysis (PCA) to determine an informed estimate for the optimal number of clusters to examine as behavior groups.
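The feature set named above (total Ether, number of transactions, transactions per month, average Ether transaction) could be aggregated per address roughly as follows. This is a minimal sketch, not the authors' code: the function name `address_features`, the record fields `value_wei` and `timestamp`, the 30-day month, and the choice to read "total Ether" as the address balance are all assumptions made for illustration.

```python
WEI_PER_ETHER = 10**18  # 1 Ether = 10^18 wei
SECONDS_PER_MONTH = 30 * 24 * 3600  # assumption: a month is 30 days


def address_features(balance_wei, transactions):
    """Aggregate per-address features from already-fetched transaction records.

    Each record is assumed to be a dict with hypothetical keys
    "value_wei" (int) and "timestamp" (Unix seconds).
    """
    n = len(transactions)
    features = {
        "total_ether": balance_wei / WEI_PER_ETHER,  # assumption: balance
        "n_transactions": n,
        "tx_per_month": 0.0,
        "avg_ether_per_tx": 0.0,
    }
    if n == 0:
        return features

    times = [tx["timestamp"] for tx in transactions]
    # Active span in months, floored at one month to avoid dividing by zero.
    span_months = max((max(times) - min(times)) / SECONDS_PER_MONTH, 1.0)
    total_moved = sum(tx["value_wei"] for tx in transactions) / WEI_PER_ETHER

    features["tx_per_month"] = n / span_months
    features["avg_ether_per_tx"] = total_moved / n
    return features


# Toy example: two transactions, 60 days apart, for an address holding 5 Ether.
example_txs = [
    {"value_wei": 2 * 10**18, "timestamp": 0},
    {"value_wei": 4 * 10**18, "timestamp": 60 * 24 * 3600},
]
features = address_features(5 * 10**18, example_txs)
```

Before clustering, feature vectors like this would normally be put on comparable scales, since Ether amounts and transaction counts differ by orders of magnitude; the text does not say which scaling, if any, was used.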

References
[1] Monaco, John V. "Identifying bitcoin users by transaction behavior." SPIE Defense + [...], 2015.
[2] Wood, Gavin. "Ethereum: A secure decentralised generalised transaction ledger." Ethereum Project Yellow Paper 151 (2014).
[3] Kodinariya, Trupti M., [...] "Review on determining number of Cluster in K-Means Clustering." [...] (2013): 90-95.
[4] Tibshirani, Robert, Guenther Walther, and Trevor Hastie. "[...]" Journal of the Royal Statistical Society: Series B (Statistical Methodology) [...] (2001): 411-423.

Acknowledgements [...]

PCA finds that only 33% of the variance is explained by the first two components.
K-means clustering used over other methods for its scalability, versatility.
Use unsupervised metric Calinski-Harabaz Score as measure of cluster definition.
Elbow of Calinski-Harabaz plot gives insight on optimal number of clusters [3].
Further investigate optimal number of clusters via Silhouette Scores.

Determining the optimal number of K-means clusters is not always a well-defined problem [3], [4]. Employing various evaluation techniques, [...]
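To make the Calinski-Harabaz (also spelled Calinski-Harabasz) step concrete, the sketch below runs K-means for several values of k on synthetic 2-D data and computes the score for each. It is an illustration under stated assumptions, not the authors' pipeline: the pure-Python K-means, the deterministic farthest-first initialization, the helper names, and the three-blob toy data are all invented here. The score is the ratio of between-cluster to within-cluster dispersion, each divided by its degrees of freedom.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    return [sum(col) / len(points) for col in zip(*points)]


def farthest_first_centers(points, k):
    """Deterministic init (an assumption): start at the first point, then
    repeatedly add the point farthest from all centers chosen so far."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    return [list(c) for c in centers]


def kmeans(points, k, n_iter=100):
    """Plain Lloyd iterations; returns (labels, centers)."""
    centers = farthest_first_centers(points, k)
    labels = None
    for _ in range(n_iter):
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centers[j])) for p in points]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster goes empty
                centers[j] = mean(members)
    return labels, centers


def calinski_harabasz(points, labels, centers):
    """[B / (k - 1)] / [W / (n - k)] with B, W the between- and
    within-cluster sums of squares. Undefined when W is zero."""
    n, k = len(points), len(centers)
    overall = mean(points)
    within = sum(sq_dist(p, centers[lab]) for p, lab in zip(points, labels))
    between = sum(labels.count(j) * sq_dist(centers[j], overall) for j in range(k))
    return (between / (k - 1)) / (within / (n - k))


def grid_blob(cx, cy, half=2, step=0.25):
    """A small square grid of points centered at (cx, cy); toy data only."""
    return [(cx + i * step, cy + j * step)
            for i in range(-half, half + 1) for j in range(-half, half + 1)]


# Three well-separated toy clusters of 25 points each.
points = grid_blob(0, 0) + grid_blob(10, 0) + grid_blob(0, 10)
scores = {k: calinski_harabasz(points, *kmeans(points, k)) for k in range(2, 6)}
best_k = max(scores, key=scores.get)
```

On this clean toy data the score simply peaks at the true number of clusters. On real address data the curve is rarely that clear, which is why the text reads an elbow off the plot and cross-checks it with Silhouette scores.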

[5] Meiklejohn, Sarah, et al. "A fistful of bitcoins: characterizing payments among men with no names." [...]

Silhouette scores range from -1 to 1; negative scores indicate likely misclassification. Scores closer to 1 indicate a confident cluster mapping ( [...], far from neighbors).
Figure, left: Silhouette scores of clusters with size > 100; the dotted red line marks the average score.

[...], we will qualitatively analyze the clusters based on their locations in feature space to characterize their traits [5]. The qualitative analysis will be included in our final report (in preparation). Long-term applications of this work include exploring generative models to learn specific behavior group characteristics in order to impersonate [...]

[...] is the sum of squared distances of samples to their closest cluster center. Elbow similar to CH [...], elbows [...], considering the Silhouette analysis, [...], as there is likely a biased distribution of users.
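The two quantities discussed above can be computed directly from a labeled point set. The following is a minimal pure-Python sketch on a four-point toy example; the function names are hypothetical, and calling the sum of squared distances "inertia" is an assumption about the term lost from the extracted text. For each sample, the silhouette is (b - a) / max(a, b), where a is the mean distance to the other members of its own cluster and b is the mean distance to the nearest other cluster.

```python
import math


def silhouette_samples(points, labels):
    """Per-sample silhouette scores; requires at least two clusters."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)

    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # common convention for singleton clusters
            continue
        # dist(p, p) is 0, so summing over the whole cluster is safe.
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for other_lab, other in clusters.items() if other_lab != lab)
        scores.append((b - a) / max(a, b))
    return scores


def inertia(points, centers):
    """Sum of squared distances of samples to their closest center."""
    return sum(min(math.dist(p, c) ** 2 for c in centers) for p in points)


# Toy data: two tight pairs of points, far apart.
points = [(0, 0), (0, 1), (10, 0), (10, 1)]

good = silhouette_samples(points, [0, 0, 1, 1])  # sensible assignment
bad = silhouette_samples(points, [0, 0, 0, 1])   # (10, 0) put in the wrong cluster
mean_good = sum(good) / len(good)

total_inertia = inertia(points, [(0, 0.5), (10, 0.5)])
```

With the sensible assignment every score is close to 1, while the deliberately misassigned point `(10, 0)` gets a strongly negative score, which is the "misclassification" end of the range.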

