Transcription of The igraph software package for complex network …
1 Theigraphsoftwarepackageforcomplexnetwor kresearchG abor Cs ardiCenterforComplexSystemsStudies,Kalam azoo College,Kalamazoo, MI,USAandDepartment of Biophysics,KFKIR esearch InstituteforParticleandNuclearPhysicsof theHungarianAcademy of as NepuszDepartment of Biophysics,KFKIR esearch InstituteforParticleandNuclearPhysicsof theHungarianAcademy of Sciences,Budapest,HungaryandDepartment of Measurement andInformationSystems,BudapestUniversity of is anopensourceportablelibrarycapableof handlinghugegraphswithmillionsof verticesandedgesandit is alsosuitableto containsroutinesforcreating,manipulating andvisualizingnetworks,calculatingvariou sstruc-turalproperties,importingfromande xportingto various leformatsandmany high-level languageslike GNUR andPythonit supportsrapiddevelopment does notpresent resultsof scienti cresearch.
2 Butintroducesa soft-warepackagewhich gives handytoolsinto thehandsof thatthetoolsscientistsuseareimportantbec ausetheycanincreaseproductivity by severalfactorsandthereby enhancescienti anothernetworkanalysispackage?Theigraphl ibrarywas developed becauseof thelack of networkanalysissoftwarewhich (1)canhandlelargegraphse ciently, (2)canbe embeddedinto a higherlevel programor programminglanguage(like Python,Perlor GNUR)and(3)canbe of handlinglargegraphswas important becausetheauthorswereconfrontedwithgraph swithmillionsof Pythonor GNUR createsa veryproductive researchenvironment, ofGNUR (orotherhigherlevel languange)
3 Is readilyavailablein a convenientintegratedenvironment forgenerating,manipulatingandmeasuringgr aphs, meansof softwareusageis nowadays consideredas superiortonon-interactive interfaces,which is di erent though{ if it takes threemonthsto calculatethediameterof a graph,nobodywants thatto be theadditionto thethreegoalfeaturesin theprevioussection,othersshowedupas a side-e open source,it is freefornon-commercialor commer-cialuseanddistributedaccordingto additionto thebinaryformatof theprogram,theusercanalways a veryimportant ,youcanaddnewfunctionality andcorrectde cienciesor hiresomebodyto cient cient datastruc-turesandimplements thecurrent ledto writtenin ANSIC,it is is testedondi erent Linux avors,MacOSX, andPythoninterfacesarealsoportableto many layeredarchitecture,thethreelayersarecon nectedthroughwellde layer , anopensystem,itcanbe embeddedinto higherlevel languagesor distri-butioncontainsinterfacesto two highlevel languages.}
4 GNUR , libraryis verywell documented,thedocumentationis availablein each functionitstimerequirements functionality in someareascomparedto areais graphvisualization,anotheroneisvariousso cialnetworkanalysismethodslike block-modeling,p methods, softwareis heavilyunderdevelopment, so expectmuchmorefunctionality in nothave a graphicaluserinterface,buta Python-basedGUIis underdevelopment andwillbe availablefordownloadsoon andtheR interfacealsoprovidesa facility forvisualmanipulationof thissectionwe give an to calculatethediameterof grantedbetween1963and2000andtwo patents areconnectedif of thenetworkcontainsmorethan3 (undirected)diameterof a networkis thelargestundirectedshortestpathconnecti ngtwo calculatingthediameterof a graphyouneedto calculatethelengthoftheshortestpathbetwe enallpossiblepairsof nodes,so thisis computationallyveryexpensive.
5 We usedthefollowingapproach .. : Thearchitectureof thesystemusedforcalculatingthediameterof a worker node (1)downloadsthenetworkdatafromthedataweb -server,then(2)it requestsa sourcevertexid fromthetaskweb-server,(3)calculatesthesh ortestpathsfromthatsourcevertexand(4) newsourcevertexid is requested, wrotea simpleprogramin C which downloadsthedatasetfromawebserver andthenstartscalculatingtheshortestpaths froma givensourcenodeto allothernodesin thenetworkby usingDijkstra'salgorithm[5] imple-mentedin theigraphlibrary. We downloadstheid of thesourcenodefroma server simplygives a di erent sourcenode id everytimeoneis requestedby theworker has nishedwiththecalculationof theshortestpathsto allnodesit storestheresultona thirdweb server andasksthesecondweb-server fora newsourcenode id, thesystemcanbe seenin veryrobustin thesensethatthereis no singlepoint of runin any grid-basedenvironment fromwhich runondi erent platformsas.
6 RapiddevelopmentNewman'scommunity ndingalgorithmThesecondexamplewe presentis verydi erent fromthe willusetheGNUR interfaceto theigraphlibraryto implement andapplyNewman'sspectralcommunity ndingalgorithm[10].Firstwe loadtheigraphpackageintoR anddownloadtheZacharyKarate-clubnetworkd ata[17] ( igraph )2g <- (" ~ ",format="pajek")Now we implement thecommunity <- function(g){4deg<- degree(g)5ec <- ecount(g)6B <- (g)- outer(deg,deg,function(x,y)x*y/2/ec)7dia g(B)<- 08eigen(B)$vectors[,1]9}Thisalgorithmcre atesa modularity matrixwhich is thedi erenceof theadja-cencymatrixof nodesareconnectedin a randomgraphif dividedinto two communitiesbasedontheeigenvectorassociat edwiththelargestpositive eigenvalueof themodularitymatrix.
7 Allverticeshavingthesamesignin thiseigenvectorbelongto we arereadyto applythisalgorithmto theKarate-clubdataandsetthecolorof <- (g)11V(g)$color<- ifelse(mem< 0, "grey","green")We alsosetthesizeof theverticesbasedonthe rsteigenvector,thefartherthisvalueis fromzerothemorethegiven vertexis in thecoreof alsosetthecolorof theedgesacrossthetwo communitiesto <- function(v,a, b) {13v <- v-min(v); v <- v/max(v); v <- v * (b-a); v+a14}15V(g)$size<- scale(abs(mem),15,25)16E(g)$color<- "grey"17E(g)[V(g)[color=="grey"]%--%V(g)[color=="green"]]$color<- "red"18plot(g,layout= , "a:color", "a:size", "a:color")Seetheresultingplotin 19 linesUsingthePythoninterfaceof igraph ,onecaneasilycreatea prototype of theoriginalPageRankalgorithmin only19 linesof code (notcountingempty lines):1fromigraphimport*2fromcopyimportcopy34defpagerank(g,damping= ,epsilon= ,iters=100):5pageranks= [1-damping]* ()6outlinks= (type=OUT)7mindiff= epsilon8newprs= [0]* ()910whilemindiff>= epsilonanditers> 0.
8 Thetwo communitiesidenti edcorrectlyin theverticesis proportionalto theabsolutevalueof theircoordinatein the rsteigenvectorandexpresseshow stronglytheybelongto a community. Alledgesacrossthetwo iters- 112forn in range( ()):13neis= (n,IN)14pr = len(neis)> 0:16forn2 in neis:pr = pr + pageranks[n2]/ outlinks[n2]17pr = pr*damping18newprs[n]= pr+1-damping1920mindiff= min([abs(newprs[n]-pageranks[n])forn in range( ())])21pageranks= copy(newprs) containstheverybasicoperationsonly, andis implementedin C. It is onlythislayer which canbe easilyreplacedwithandalternategraphrepre sentationif (Clibrary)Basicgraphoperations(Clibrary) : Thearchitectureof containsalmostallnetworkanalysisfunction s,thisis alsoimplementedin containsthehigherlevel interfaces,so farinterfacesto GNUR functionalityPleasenotethatnewfunctional ity is addedto thelibraryeveryweek,so check youcannotseeherethealgorithmsor measuresyou' : regularstructures.
9 Star,ringandfullgraphs,circularandnon-ci rcularlatticeswithany number of dimensions,regulartrees graphsbasedonBarab asi'spreferentialattachment model[1], alsowithnonlinearattachment exponent andvariousvariations Random(Erd}os-R enyi)graphs,bothG(n; p) andG(n; m)types [6],directedandundirectedones graphshavinga given degreesequence,directedorundirectedones[ 11] growingrandomgraphs,alsoformodelingcitat ionnetworks[3] growingrandomgraphswheretheconnectionpro babilitydependsonsomevertexproperties graphsfromtheGraphAtlas[13] allnon-isomorphicgraphsof a given measuresThefollowingcentrality measures[7] canbe calculated: degree closeness vertexandedgebetweenness eigenvectorcentrality pagerank[12].
10 PathlengthbasedpropertiesOneor allshortestpathsbetweenverticescanbe calculated,andalsobasedonthisthediameter andtheaveragepathlengthof canbe cal-culated,andalsotheminimumspanningfor estof a threeor fourcomponents canbe calculated,bothundirectedanddirectedmoti fs[16].RandomrewiringExistinggraphscanbe rewiredrandomlywhilepreservingtheirdegre edistribution,allowingtheuserto generatean arbitrarysetof simpleway to manipulatesubsetsof verticesand/oredgesof a graph, non-numericattributescanbe as-signedto theverticesandedgesof a graph,andqueriedandsetby usinga simplenotation, lesandalsoPajek[4]andGraphML[2] igraph .