
SPSS Tutorial - Multivariate Solutions



Transcription of SPSS Tutorial - Multivariate Solutions

SPSS Tutorial
AEB 37 / AE 802 Marketing Research Methods
Week 7: Cluster analysis

Lecture / Tutorial outline
• Cluster analysis
• Example of cluster analysis
• Work on the assignment

Cluster Analysis
Cluster analysis is a class of techniques used to classify cases into groups that are relatively homogeneous within themselves and heterogeneous between each other, on the basis of a defined set of variables. These groups are called clusters.

Cluster Analysis and marketing research
• Market segmentation: clustering of consumers according to their attribute preferences.
• Understanding buyers' behaviours: consumers with similar behaviours/characteristics are clustered.
• Identifying new product opportunities.

Clusters of similar brands/products can help identify competitors / market opportunities.
• Reducing data, e.g. in preference mapping

Steps to conduct a Cluster Analysis
1. Select a distance measure
2. Select a clustering procedure
3. Define the number of clusters
4. Validate the analysis

[Figure: scatter plot of REGR factor score 1 and REGR factor score 2 for analysis 1]

Defining distance: the Euclidean distance

$$D_{ij} = \sqrt{\sum_{k=1}^{n} (x_{ki} - x_{kj})^2}$$

where D_ij is the distance between cases i and j, x_ki is the value of variable X_k for case i, x_kj is its value for case j, and n is the number of variables.

Problems:
• Different measures = different weights
• Correlation between variables (double counting)
Solution: Principal component analysis

Clustering procedures
• Hierarchical procedures
    • Agglomerative (start from n clusters, to get to 1 cluster)
    • Divisive (start from 1 cluster, to get to n clusters)
• Non hierarchical procedures
    • K-means clustering

Agglomerative clustering
• Linkage methods
    • Single linkage (minimum distance)
    • Complete linkage (maximum distance)
    • Average linkage
• Ward's method
    1. Compute the sum of squared distances within clusters
    2. Aggregate clusters with the minimum increase in the overall sum of squares
• Centroid method
    • The distance between two clusters is defined as the difference between the centroids (cluster averages)

K-means clustering
1. The number k of clusters is fixed
2. An initial set of k seeds (aggregation centres) is provided
    • First k elements
    • Other seeds
3. Given a certain threshold, all units are assigned to the nearest cluster seed
4. New seeds are computed
5. Go back to step 3 until no reclassification is necessary
Units can be reassigned in successive steps (optimising partitioning).

Hierarchical vs Non hierarchical methods
Hierarchical clustering
• No decision about the number of clusters
• Problems when data contain a high level of error
• Can be very slow
• Initial decisions are more influential (one-step only)
Non hierarchical clustering
• Faster, more reliable
• Need to specify the number of clusters (arbitrary)
• Need to set the initial seeds (arbitrary)

Suggested approach
1. First perform a hierarchical method to define the number of clusters
2. Then use the k-means procedure to actually form the clusters
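The k-means procedure described above can be sketched in plain Python. This is only an illustrative sketch under stated assumptions, not the routine SPSS runs: the names `euclidean` and `k_means` and the sample cases are hypothetical, the first k cases are used as initial seeds, and the threshold mentioned in the slides is left out for simplicity.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two cases given as equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def k_means(cases, k, max_iter=100):
    """Illustrative k-means; returns (labels, seeds)."""
    if k > len(cases):
        raise ValueError("k cannot exceed the number of cases")
    # Initial seeds: the first k cases (one of the options named in the slides).
    seeds = [tuple(c) for c in cases[:k]]
    labels = None
    for _ in range(max_iter):
        # Assign every case to its nearest seed.
        new_labels = [
            min(range(k), key=lambda j: euclidean(c, seeds[j])) for c in cases
        ]
        # Stop as soon as no case is reclassified.
        if new_labels == labels:
            break
        labels = new_labels
        # Recompute each seed as the mean of the cases assigned to it.
        for j in range(k):
            members = [c for c, lab in zip(cases, labels) if lab == j]
            if members:  # keep the old seed if a cluster is empty
                seeds[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, seeds

# Hypothetical data: six cases measured on two variables.
cases = [(1.0, 1.0), (5.0, 5.0), (1.2, 0.8), (5.1, 4.9), (0.9, 1.1), (4.8, 5.2)]
labels, seeds = k_means(cases, k=2)
print(labels)
```

Because the seeds are simply the first k cases, reordering the cases can change the result, which is why the impact of the initial seeds is worth checking.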

Defining the number of clusters: elbow rule (1)
Agglomeration schedule. The SPSS table has the columns Stage, Cluster Combined (Cluster 1, Cluster 2), Coefficients, Stage Cluster First Appears (Cluster 1, Cluster 2) and Next Stage. For an example with 12 cases, the number of clusters at each stage is:

Stage    Number of clusters
0        12
1        11
2        10
3        9
4        8
5        7
6        6
7        5
8        4
9        3
10       2
11       1

Elbow rule (2): the scree diagram
[Figure: scree diagram, distance against number of clusters]

Validating the analysis
• Impact of initial seeds / order of cases
• Impact of the selected method
• Consider the relevance of the chosen set of variables

SPSS Example
[Figure: agglomeration schedule and plot for an example with 10 cases (Lucy, Julia, Fred, Arthur, Jennifer, Thomas, Matthew, Nicole, Pamela, John), labelled by cluster number of case (1 to 4)]
Number of clusters: 10 - 6 = 4

Open the dataset
• In your N: directory (if you saved it there last time)
• Or download it from: ~aes02
• Open it in SPSS

The Principal Components Analysis
Run Principal Components Analysis and save scores
• Select the variables to perform the analysis
• Set the rule to extract principal components
• Give instruction to save the principal components as new variables
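The elbow rule described earlier in this part can be sketched in Python as follows. The sketch assumes that the "elbow" is the stage just before the largest increase in the agglomeration coefficient, which is a simplification of reading the scree diagram by eye. The function name `elbow_number_of_clusters` and the coefficients are hypothetical and are not taken from the tutorial data.

```python
def elbow_number_of_clusters(coefficients, n_cases):
    """Return (elbow_stage, number_of_clusters).

    coefficients[i] is the agglomeration coefficient at stage i + 1,
    so a schedule for n_cases cases has n_cases - 1 entries.
    """
    if len(coefficients) != n_cases - 1:
        raise ValueError("an agglomeration schedule has n_cases - 1 stages")
    # Increase in the coefficient when moving from each stage to the next.
    jumps = [b - a for a, b in zip(coefficients, coefficients[1:])]
    # Stage just before the largest jump (stages are numbered from 1).
    elbow_stage = jumps.index(max(jumps)) + 1
    # After stage s of an agglomerative procedure, n_cases - s clusters remain.
    return elbow_stage, n_cases - elbow_stage

# Hypothetical coefficients for 10 cases (9 stages).
coefficients = [1.0, 2.1, 3.0, 4.2, 5.1, 6.3, 14.0, 15.5, 17.0]
stage, clusters = elbow_number_of_clusters(coefficients, n_cases=10)
print(stage, clusters)
```

With Ward's method the coefficients often keep growing faster towards the last stages, so the largest absolute jump may simply be the final merge; in that case the scree diagram should still be inspected visually.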

Cluster analysis: basic steps
• Apply Ward's method on the principal component scores
• Check the agglomeration schedule
• Decide the number of clusters
• Apply the k-means method

Analyse / Classify
[Screenshot annotations: Select the component scores; Select from here; Untick this; Select Ward's algorithm; Click here first; Select method here]

Output: Agglomeration schedule

Number of clusters
Identify the step where the distance coefficient makes a bigger jump.

The scree diagram (Excel needed)
[Figure: scree diagram, distance (0 to 800) against step (118 to 148)]

Number of clusters
Number of cases: 150
Step of elbow: 144
Number of clusters: 6 (150 - 144)

Now repeat the analysis
• Choose the k-means technique
• Set 6 as the number of clusters
• Save cluster number for each case
• Run the analysis

K-means
K-means dialog box
[Screenshot annotations: Specify number of clusters; Save cluster membership; Click here first; Tick here]

Final output
Cluster membership

[Table: component matrix for components 1 to 5; values not recoverable. Variables: amount spent, Meat expenditure, Fish expenditure, Vegetables expenditure, % spent in own-brand product, Own a car, % spent in organic food, Vegetarian, Household Size, Number of kids, Weekly TV watching (hours), Weekly Radio listening (hours), Surf the web, Yearly household income, Age of respondent. Extraction Method: Principal Component Analysis.]

Component meaning (Tutorial week 5)
1. Old Rich Big Spender
2. Family shopper
3. Vegetarian TV lover
4. Organic radio listener
5. Vegetarian TV and web hater

[Table: final cluster centres; rows REGR factor score 1 to 5 for analysis 1, columns Cluster 1 to 6; values not recoverable]

Cluster interpretation through mean component values
• Cluster 1 is very far from profile 1 ( ) and more similar to profile 2 ( )
• Cluster 2 is very far from profile 5 ( ) and not particularly similar to any profile
• Cluster 3 is extremely similar to profiles 3 and 5 and very far from profile 2
• Cluster 4 is similar to profiles 2 and 4
• Cluster 5 is very similar to profile 3 and very far from profile 4
• Cluster 6 is very similar to profile 5 and very far from profile 3

Which cluster to target?
• Objective: target the organic consumer
• Which is the cluster that looks more "organic"?
• Compute the descriptive statistics on the original variables for that cluster

Representation of factors 1 and 4 (and cluster membership)
[Figure: scatter plot of REGR factor score 1 and REGR factor score 4 for analysis 1, with cases marked by cluster number of case (1 to 6)]
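The interpretation of clusters through mean component values can also be reproduced outside SPSS. The sketch below is a minimal illustration with hypothetical data: the function name `cluster_means`, the scores and the memberships are invented, and picking the cluster with the highest mean on one component is only one possible way to choose a target.

```python
from collections import defaultdict

def cluster_means(scores, membership):
    """Mean of each component score within each cluster."""
    grouped = defaultdict(list)
    for row, cluster in zip(scores, membership):
        grouped[cluster].append(row)
    return {
        cluster: [sum(col) / len(rows) for col in zip(*rows)]
        for cluster, rows in grouped.items()
    }

# Hypothetical scores on two components for six cases (not the tutorial data).
scores = [(1.5, -0.5), (0.5, -1.5), (-1.0, 1.0), (-2.0, 2.0), (0.0, 0.5), (0.0, -0.5)]
# Hypothetical saved cluster membership for the same six cases.
membership = [1, 1, 2, 2, 3, 3]

means = cluster_means(scores, membership)
# Cluster whose mean on the second component (index 1) is highest.
target = max(means, key=lambda c: means[c][1])
print(means, target)
```

Once a target cluster has been chosen this way, descriptive statistics on the original variables for its members give a more concrete picture than the component means alone.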

