About this Course

Duration 2 Days

The course looks at the theoretical and practical implications of a wide array of clustering techniques currently available in SAS. The techniques considered include cluster pre-processing, variable clustering, k-means clustering, and hierarchical clustering.


Before attending this course, you should

  • be able to execute SAS programs and create SAS data sets. You can gain this experience by completing the SAS Programming 1: Essentials course.
  • have completed a graduate-level course in statistics or the Introduction to Statistics Using SAS 9.2: ANOVA, Linear Regression and Logistic Regression course.
  • have an understanding of matrix algebra.

Who should attend

Intermediate or senior level statisticians, data analysts, and data miners

Delegates will learn how to

  • prepare and explore data for a cluster analysis
  • distinguish among many different clustering techniques, making informed choices about which to use
  • evaluate the results of a cluster analysis
  • determine the appropriate number of clusters to retain
  • profile and describe clustered observations
  • score observations into clusters.


Introduction to Clustering

  • identifying types of clustering
  • measuring similarity
  • assessing multivariate normality
  • using classification matrices

Preparation for Clustering

  • preparing data for cluster analysis
  • using variable clustering for variable selection
  • using graphical clustering aids
  • making elongated clusters more spherical
  • viewing the impact of input standardization

Partitive Clustering

  • k-means clustering for segmentation
  • outlining the advantages of nonparametric clustering

Hierarchical Clustering

  • comparing hierarchical clustering methods

Assessing Clustering Results

  • determining the number of clusters
  • profiling a cluster solution
  • scoring new observations

Cluster Analysis Case Study

  • variable selection
  • graphical exploration of selected variables
  • hierarchical clustering and determining the number of clusters
  • profiling the seven-cluster solution
  • modelling cluster membership
  • scoring the customer database

Canonical Discriminant Analysis (CDA) Plots

  • canonical discriminant plots

Fuzzy Clustering

  • Q-methodology

Assessing Multivariate Normality

  • assessing multivariate normality

This course addresses SAS/STAT software.


This is a QA approved partner course

Delivery Method


Face-to-face learning in the comfort of our quality nationwide centres, with free refreshments and Wi-Fi.

Find dates and prices

Online booking is currently not available for this course. To find out more, please call us on 0113 220 7150 or email us at info@qa.com to discuss how we can help.

All third party trademark rights acknowledged.