Maximizing the Spread of Inﬂuence through a Social Network

MaximizingtheSpreadofInfluencethrougha SocialNetworkDavidKempe ComputerScienceCornellUniversity, Eva ComputerScienceCornellUniversity, socialnetworkhave beenstudiedina numberofdo-mains,includingthediffusionof medicalandtechnologicalinnova-tions,thes uddenandwidespreadadoptionofvariousstrat egiesingame-theoreticsettings,andtheeffe ctsof wordofmouth , motivatedbythedesignofviralmarketingstra tegies,DomingosandRichardsonposeda fun-damentalalgorithmicproblemforsuchsoc ialnetworkprocesses:if wecantrytoconvincea subsetofindividualstoadopta newproductorinnovation,andthegoalis totriggera largecascadeoffurtheradoptions,whichseto findividualsshouldwetarget?We NP-hardhere, ,weshow thata naturalgreedystrategyobtainsa solutionthatisprovablywithin63%ofoptimal forseveralclassesofmodels;ourframeworksu ggestsa alsoprovidecomputationalexperimentsonlar gecollabora-tionnetworks,showingthatinad ditiontotheirprovableguaran-tees, [AnalysisofAlgorithmsandProblemComplexit y].

Non-numericalAlgorithmsandProblems DavidandLucilePackardFoundationFel- partbyNSFITR grantCCR-011337, digitalorhardcopiesofallorpartofthiswork forpersonalorclassroomuseis copy otherwise,torepublish,topostonserversort oredistributetolists,requirespriorspecif icpermissionand/ora 03 Washington,DC,USAC opyright2003 ACM1-58113-737-0/03 $ ,socialnetworks,viralmarketing, thegraphofrelationshipsandinteractionswi thina groupofindividuals playsa fundamentalroleasamediumforthespreadofin formation,ideas, forexample,theuseofcellphonesamongcolleg estudents,theadoptionofa newdrugwithinthemedicalprofession,orther iseofa politicalmove-mentinanunstablesociety andit caneitherdieoutquicklyormake wewanttoun-derstandtheextenttowhichsuchi deasareadopted,it canbeim-portanttounderstandhowthedynamic sofadoptionarelikelytounfoldwithintheund erlyingsocialnetwork:theextenttowhichpeo plearelikelytobeaffectedbydecisionsofthe irfriendsandcolleagues,ortheextenttowhic h word-of-mouth effectswilltake a [8,27,29];inothercontexts,researchhasinv es-tigateddiffusionprocessesfor word-of-mouth and viralmarket-ing effectsin thesuccessofnew products[4,7, 10,13,14,20,26],thesuddenandwidespreadad optionofvariousstrategiesingame-theoreti csettings[6,12,21,32,33],andtheproblemof cascadingfailuresinpowersystems[2,3].

Inrecentwork,motivatedbyapplicationstoma rketing,Domin-gosandRichardsonposeda fundamentalalgorithmicproblemforsuchsyst ems[10,26].Supposethatwehave dataona socialnetwork,withestimatesfortheextentt owhichindividualsinflu-enceoneanother, andwewouldlike tomarketa newproductthatwehopewillbeadoptedbya thatbyinitiallytargetinga few influ-ential membersofthenetwork say, givingthemfreesamplesoftheproduct wecantriggera cascadeofinfluencebywhichfriendswillreco mmendtheproducttootherfriends,andmany individualstouseforseedingthisprocess?In [10,26],thisquestionwasconsideredina probabilisticmodelofinteraction;heuristi csweregivenforchoosingcustomerswitha largeoveralleffectonthenetwork,andmethod swerealsodevelopedto , weconsidertheissueofchoosinginfluentials etsofindividualsasa NP-hardformostmodelsthathave beenstudied,includingthemodelof[10].Thef rameworkproposedin[26],ontheotherhand,is basedona simplelinearmodelwherethesolutiontotheop timizationproblemcanbeobtainedbysolvinga collectionofrelated,NP-hardmod-elsthatha ve beenextensivelystudiedinthesocialnetwork scom-munity, andobtainthefirstprovableapproximationgu aranteesforefficientalgorithmsina [26]andtheverygeneralmodelof[10], beginbydepartingsomewhatfromtheDomingos- Richardsonframeworkinthefollowingsense:w heretheirmodelsareessen-tiallydescriptiv e, specifyinga jointdistributionoverallnodes be-haviorina globalsense,wefocusonmoreoperationalmode lsfrommathematicalsociology[15,28]andint eractingparticlesys-tems[11,17] show thatapproximationalgorithmsformaximiz-in gthespreadofinfluenceinthesemodelscanbed evelopedina generalframeworkbasedonsubmodularfunctio ns[9,23].

Wealsoprovidecomputationalexperimentsonl argecollaborationnet-works,showingthatin additionto theirprovableguarantees,oural-gorithmssi gnificantlyout-performnode-selectionheur isticsbasedonthewell-studiednotionsofdeg reecentralityanddistancecen-trality[30] socialnetworkG, representedbya directedgraph,wewillspeakofeachindi-vidu alnodeasbeingeitheractive(anadopterofthe innovation)orinactive. We willfocusonsettings,guidedbythemotivatio ndiscussedabove,inwhicheachnode s tendency ,wewillfocusfornow ontheprogressivecaseinwhichnodescanswitc hfrombeinginactive tobeingactive, butdonotswitchintheotherdirection;it turnsoutthatthisassumptioncaneasilybelif tedlater. Thus,theprocesswilllookroughlyasfollowsf romtheperspective ofaninitiallyinactive nodev: astimeunfolds,moreandmoreofv s neighborsbecomeactive;atsomepoint,thisma ycausevtobecomeactive, andv s decisionmayinturntriggerfur-therdecision sbynodestowhichvis process;theirapproachwasbasedontheuseofn ode-specificthresholds[15,28].

Many modelsofthisflavorhave sincebeeninvestigated( [5,15,18,19,21,25,28,29,31,32,33]) ,a nodevisinfluencedbyeachneighborwaccordin gto aweightbv;wsuchthatXwneighbor ofvbv;w 1. vuniformlyatran-domfromtheinterval[0;1]; thisrepresentstheweightedfractionofv s neighborsthatmustbecomeactive randomchoiceofthresholds,andaninitialset ofactive nodesA0(withallothernodesinactive),thedi ffusionpro-cessunfoldsdeterministicallyi ndiscretesteps: instept, allnodesthatwereactive instept 1remainactive,andweactivateanynodevforwh ichthetotalweightofitsactive neighborsis atleast v:Xwactive neighbor ofvbv;w v:Thus,thethresholds vintuitivelyrepresentthedifferentlatentt en-denciesofnodestoadopttheinnovationwhe ntheirneighborsdo;thefactthattheseareran domlyselectedis intendedtomodelourlackofknowledgeoftheir values weareineffectaveragingoverpossiblethresh oldvaluesforallthenodes.(Anotherclassofa pproacheshard-wiresallthresholdsat a knownvaluelike1=2; seeforexampleworkbyBerger[5],Morris[21], andPeleg [25].)

Basedonworkin interactingparticlesystems[11,17]frompro b-abilitytheory, , investi-gatedrecentlyinthecontextofmarke tingbyGoldenberg,Libai,andMuller[13,14]. We againstartwithaninitialsetofactivenodesA 0, , it is givena singlechancetoactivateeachcurrentlyinac- tive neighborw; it succeedswitha probabilitypv;w a parameterofthesystem independentlyofthehistorythusfar. (Ifwhasmultiplenewlyactivatedneighbors,t heirattemptsaresequencedinanarbitraryord er.) Ifvsucceeds,thenwwillbecomeactive in stept+ 1; butwhetherornotvsucceeds,it cannotmake any , ,but ofcoursemany willturntothisissuelaterinthepaper, proposinga ofcon-cretenessintheintroduction,wewilld iscussourresultsintermsofthesetwo arenowina positiontoformallyexpresstheDomingos-Ric hardsonstyleofoptimizationproblem choosinga goodinitialsetofnodestotarget inthecontextoftheabove (aswellasthegeneralizationstofollow)invo lve aninitialsetofactive definetheinfluenceofa setofnodesA, denoted (A), tobetheexpectednumberofactivenodesatthee ndoftheprocess,giventhatAisthisinitialac tivesetA0.

Theinfluencemaximizationproblemasks,fora parameterk, tofindak-nodesetofmaximuminfluence.(When dealingwithalgorithmsforthisproblem,wewi llsaythatthechosensetAofkinitialactive nodeshasbeentargetedforactivationbytheal go-rithm.)Forthemodelsweconsider, it is NP-hardtodeterminetheoptimumforinfluence maximization,aswewillshow factorof(1 1=e "), inboththeLinearThresholdandIndependentCa scademodels;hereeisthebaseofthenaturallo garithmand"isany positive realnumber. (Thus,thisisa performanceguar-anteeslightlybetterthan6 3%.)Thealgorithmthatachievesthisperforma nceguaranteeisa naturalgreedyhill-climbingstrategyrelate dtotheapproachconsideredin[10],andsothem aincon-tentofthisresultis theanalysisframeworkneededforobtainingap rovableperformanceguarantee,andthefairly surprisingfactthathill-climbingis alwayswithina factorofatleast63% prove thisresultinSection2 usingtechniquesfromthetheoryofsubmodular functions[9,23],whichwedescribeindetailb elow, andwhichturnouttoprovidea ,thisanalysisframeworkallowsustodesignan dproveguaranteesforapproximationalgorith msinmuchricherandmorerealisticmodelsofth eprocessesbywhichwemarket to highlysimplifiedmodel.

Anissuealsoconsideredin[10,26]is thatwemayinrealityhave a largenumberofdifferentmarketingactionsav ailable,eachofwhichmayinfluencenodesin show howto extendtheanalysisto thata generalizationofthehill-climbingal-gorit hmstillprovidesapproximationguaranteesar bitrarilycloseto(1 1=e).It is ,theinfluencemax-imizationproblemis NP-complete,but it [26],ontheotherhand,boththepropagationof influenceaswellastheeffectoftheinitialta rgetingarelinear. Initialmarketingdecisionsherearethuslimi tedintheireffectonnodeactivations;eachno de s probabilityofactivationis obtainedasa ,theinfluencecanbemaximizedbysolvinga ,wecanshowthatgeneralmodelslike thatofDomingosandRichardson[10],andevens implemodelsthatbuildina fixedthreshold(like1=2) at allnodes[5,21,25],leadtoinflu-encemaximi zationproblemsthatcannotbeapproximatedto withinany non-trivialfactor, assumingP6= wayoftracingouta moredelicatebound-aryoftractabilitythrou ghthesetofpossiblemodels,byhelpingtodist inguishamongthoseforwhichsimpleheuristic sprovidestrongperformanceguaranteesandth oseforwhichthey ,andthedesignofaccuratemodelsthatsimulta neouslyallow ,wede-scribeinSection3 theresultsofcomputationalexperimentswith boththeLinearThresholdandIndependentCasc adeModels,show-ingthatthehill-climbingal gorithmsignificantlyout-performsstrate-g iesbasedontargetinghigh-degreeor central nodes[30].

InSec-tion4 wethendevelopa generalmodelofdiffusionprocessesinsocial networksthatsimultaneouslygeneralizesthe LinearThresh-oldandIndependentCascadeMod els,aswellasa numberofothernaturalcases,andweshowhowto obtainapproximationguaran-teesfora and6,wealsoconsiderextensionsofourapprox imationalgorithmstomod-elswithmorerealis ticscenariosinmind:morecomplex market-ingactionsasdiscussedabove,andnon -progressiveprocesses,inwhichactive nodesmaybecomeinactive ( )thatmapssubsetsofa finitegroundsetUtonon-negative saythatfissubmodularif it satisfiesa natural di-minishingreturns property:themarginalgainfromaddinganele- menttoa setSis at leastashighasthemarginalgainfromadding1 Notethattheinfluencefunction ( )definedabove hasthisform;it mapseachsubsetAofthenodesofthesocialnetw orktoa realnumberdenotingtheexpectedsizeoftheac tivatedsetifAis supersetofS. Formally, a submodularfunc-tionsatisfiesf(S[fvg) f(S) f(T[fvg) f(T);forallelementsvandallpairsofsetsS a numberofverynicetractabilityproperties;t heonethatis relevanttoushereis a functionfthatissubmodular, takesonlynon-negative values,andismonotoneinthesensethatadding anele-menttoa setcannotcauseftodecrease:f(S[fvg) f(S)forallelementsvandsetsS.]]]

We wishtofindak-elementsetSforwhichf(S) (itcanbeshowntocontaintheHittingSetprobl emasasimplespecialcase),buta resultofNemhauser, Wolsey, andFisher[9,23]showsthatthefollowinggree dyhill-climbingalgorithmap-proximatesthe optimumtowithina factorof(1 1=e)(whereeis thebaseofthenaturallogarithm):startwitht heemptyset, [9,23]Fora non-negative, monotonesubmod-ularfunctionf, letSbea setofsizekobtainedbyselectingele-mentson eata time, each timechoosinganelementthatprovidesthelarg estmarginalincreaseinthefunctionvalue. LetS (S) (1 1=e) f(S ); inotherwords,Sprovidesa(1 1=e) , thisresulthasfoundapplicationsina num-berofareasofdiscreteoptimization( [22]);theonlydirectuseofit thatweareawareofinthedatabasesanddatamin inglit-eratureis ina contextverydifferentfromours,fortheprobl emofselectingdatabaseviewstomaterialize[ 16].Ourstrategywillbetoshowthatforthemod elsweareconsid-ering,theresultinginfluen cefunction ( )is submodular.

Maximizing the Spread of Inﬂuence through a Social Network

Tags:

Information

Transcription of Maximizing the Spread of Inﬂuence through a Social Network

Related search queries

Maximizing the Spread of Inﬂuence through a Social Network

Tags:

Information

Documents from same domain

Related documents

Related search queries