
Inferring user traits via unsupervised methods

Characterizing the Ethereum address space: Inferring user traits via unsupervised methods

James Payette (1), Samuel Schwager (2), Joseph Murphy (3)
(1) Department of Computer Science, (2) Department of MCS, (3) Department of Physics

... yet anonymous ledgers, or blockchains ... known only by their addresses, would have enormous security implications [1]. We examine the blockchain of Ethereum with the objective of clustering addresses into distinct behavior groups.

Figure: Example transaction on the Ethereum blockchain [2].
Figure: The Ethereum address space.

Acquisition

Successful, efficient data acquisition was a major milestone for our project. ... we recursively scraped data from the publicly available blockchain, eventually aggregating a dataset of 250,000 unique addresses. We queried the Etherscan API for each address's Ethereum balance and all of its transactions (...).

Data and Feature Set

We tried to select features that, when aggregated, ...: total Ether, number of transactions, transactions per month, average Ether transaction, ...

Models and Analysis

The main objective of our quantitative analysis was to use clustering evaluation metrics and Principal Component Analysis (PCA) to determine an informed estimate of the optimal number of clusters to examine as behavior groups.
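As an illustration of the clustering evaluation metrics mentioned above, the following is a minimal sketch in plain Python, not the authors' code. It computes the within-cluster sum of squares, the Calinski-Harabasz score, and the mean Silhouette score for a given labeling. The toy points and the two labelings are made up for the example; in practice a library such as scikit-learn would likely be used on the real feature matrix.

```python
import math


def centroid(points):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def group_by_label(X, labels):
    """Map each cluster label to the list of points carrying that label."""
    clusters = {}
    for x, lab in zip(X, labels):
        clusters.setdefault(lab, []).append(x)
    return clusters


def inertia(X, labels):
    """Sum of squared distances of points to the centroid of their cluster."""
    total = 0.0
    for pts in group_by_label(X, labels).values():
        c = centroid(pts)
        total += sum(sq_dist(p, c) for p in pts)
    return total


def calinski_harabasz(X, labels):
    """Between-cluster dispersion over within-cluster dispersion,
    each divided by its degrees of freedom. Needs 2 <= k < n."""
    clusters = group_by_label(X, labels)
    n, k = len(X), len(clusters)
    overall = centroid(X)
    between = sum(len(pts) * sq_dist(centroid(pts), overall)
                  for pts in clusters.values())
    within = inertia(X, labels)
    return (between / (k - 1)) / (within / (n - k))


def mean_silhouette(X, labels):
    """Average of (b - a) / max(a, b) over all points, where a is the mean
    distance to the point's own cluster and b to the nearest other cluster."""
    clusters = group_by_label(X, labels)
    scores = []
    for x, lab in zip(X, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # common convention for singleton clusters
            continue
        # The distance from x to itself is 0, so divide by len(own) - 1.
        a = sum(math.dist(x, p) for p in own) / (len(own) - 1)
        b = min(sum(math.dist(x, p) for p in pts) / len(pts)
                for other, pts in clusters.items() if other != lab)
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


# Toy data (hypothetical): two well-separated groups of three points each.
X = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
     [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]]
good = [0, 0, 0, 1, 1, 1]  # labels that match the visible groups
bad = [0, 1, 0, 1, 0, 1]   # labels that mix the groups

for name, labels in (("good", good), ("bad", bad)):
    print(name, inertia(X, labels), calinski_harabasz(X, labels),
          mean_silhouette(X, labels))
```

A labeling that matches the structure of the data gives a lower within-cluster sum of squares and higher Calinski-Harabasz and Silhouette scores. Sweeping the number of clusters and plotting these scores is what produces the elbow plots used to pick a cluster count.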

... feature vector for a single Ethereum address and each column to a single feature. The dataset is normalized to the sample ...
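To make the feature-matrix description concrete, here is a hedged sketch of building one feature vector per address and normalizing each column. Several things are assumptions, not taken from the source: the transaction record layout (`timestamp` in Unix seconds, `value_ether`), the reading of "total Ether" as the address balance, a 30-day month, and z-score standardization (the source text is truncated where it says what the dataset is normalized to).

```python
from statistics import mean, pstdev

SECONDS_PER_MONTH = 30 * 24 * 3600  # assumption: a "month" is 30 days


def address_features(balance_ether, transactions):
    """Return [total Ether, tx count, tx per month, average Ether per tx].

    `transactions` is a list of dicts with hypothetical keys
    'timestamp' (Unix seconds) and 'value_ether'.
    """
    n = len(transactions)
    if n == 0:
        return [balance_ether, 0.0, 0.0, 0.0]
    times = [tx["timestamp"] for tx in transactions]
    # Treat any activity window shorter than one month as one month.
    months_active = max((max(times) - min(times)) / SECONDS_PER_MONTH, 1.0)
    avg_value = mean(tx["value_ether"] for tx in transactions)
    return [balance_ether, float(n), n / months_active, avg_value]


def standardize(rows):
    """Z-score each column: subtract its mean, divide by its std deviation."""
    columns = list(zip(*rows))
    means = [mean(col) for col in columns]
    # A constant column has zero deviation; divide by 1.0 so it maps to 0.
    stds = [pstdev(col) or 1.0 for col in columns]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)]
            for row in rows]


# Hypothetical sample: address -> (balance in Ether, list of transactions).
accounts = {
    "0xaaa": (5.0, [{"timestamp": 0, "value_ether": 1.0},
                    {"timestamp": 2 * SECONDS_PER_MONTH, "value_ether": 3.0}]),
    "0xbbb": (1.0, [{"timestamp": 0, "value_ether": 0.5}]),
    "0xccc": (0.0, []),
}

# One row per address, one column per feature.
matrix = [address_features(bal, txs) for bal, txs in accounts.values()]
normalized = standardize(matrix)
```

Standardizing the columns keeps a feature with a large numeric range, such as total Ether, from dominating the Euclidean distances that K-means and PCA rely on.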



PCA finds that only 33% of the variance is explained by the first two components. K-means clustering was used over other methods for its scalability and versatility. The unsupervised Calinski-Harabasz (CH) score is used as a measure of cluster definition. The elbow of the Calinski-Harabasz plot gives insight into the optimal number of clusters [3]. The optimal number of clusters is investigated further via Silhouette scores.

... is the sum of squared distances of samples to their closest cluster center (the quantity K-means minimizes, often called inertia). Its elbow is similar to that of the CH plot.

Silhouette scores range from -1 to 1; negative scores indicate likely misclassification. Scores closer to 1 indicate a confident cluster mapping (..., far from neighbors).

Figure (left): Silhouette scores of clusters with size > 100; the average score is shown as a dotted red line.

Results and Discussion

Determining the optimal number of K-means clusters is not always a well-defined problem [3], [4]. Employing various evaluation techniques, ... elbows ..., considering the Silhouette analysis, ..., as there is likely a biased distribution of users.

Ongoing Investigations

..., we will qualitatively analyze the clusters based on their locations in feature space to characterize their traits [5]. The qualitative analysis will be included in our final report (in preparation). Long-term applications of this work include exploring generative models to learn specific behavior group characteristics in order to impersonate ...

Acknowledgements

References

[1] Monaco, John V. "Identifying bitcoin users by transaction behavior." SPIE Defense+ ..., 2015.
[2] Wood, Gavin. "Ethereum: A secure decentralised generalised transaction ledger." Ethereum Project Yellow Paper 151 (2014).
[3] Kodinariya, Trupti M., and Prashant R. Makwana. "Review on determining number of Cluster in K-Means Clustering." ... (2013): 90-95.
[4] Tibshirani, Robert, Guenther Walther, and Trevor Hastie. "..." Journal of the Royal Statistical Society: Series B (Statistical Methodology) ... (2001): 411-423.
[5] Meiklejohn, Sarah, et al. "A fistful of bitcoins: characterizing payments among men with no names." ...

