About this course

Course type: Performance Plus
Course code: QAIML
Duration: 4 Days

Machine Learning is a well understood process. We typically start with some existing data and pass it through an algorithm. The algorithm ‘learns’ from that specific data and produces a ‘data model’. This model has learnt from the data and now encapsulates information derived from the raw data. We then have to test the model (to see how good it is) and try to incrementally improve it. Finally, we evaluate the finished model and deploy it.
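
The process described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data is invented, and the 'algorithm' is a hypothetical one-rule learner (a single threshold) chosen for brevity, not a method prescribed by the course.

```python
# Toy data: (hours_studied, passed_exam). All values are invented for illustration.
training_data = [(1.0, 0), (2.0, 0), (3.0, 0), (4.0, 1), (5.0, 1), (6.0, 1)]
test_data = [(1.5, 0), (2.5, 0), (4.5, 1), (5.5, 1)]


def accuracy(threshold, rows):
    """Fraction of rows where the rule 'x >= threshold means class 1' is correct."""
    correct = sum(1 for x, label in rows if int(x >= threshold) == label)
    return correct / len(rows)


def learn_threshold(rows):
    """The 'algorithm': try the midpoint between each pair of neighbouring
    values and keep the one that classifies the training rows best."""
    xs = sorted(x for x, _ in rows)
    candidates = [(a + b) / 2 for a, b in zip(xs, xs[1:])]
    return max(candidates, key=lambda t: accuracy(t, rows))


model = learn_threshold(training_data)      # the learned 'data model'
test_accuracy = accuracy(model, test_data)  # test on data the model has not seen
print(f"threshold={model}, test accuracy={test_accuracy:.2f}")
```

Here the 'data model' is just one number, the learned threshold. Real models are richer, but the same learn, test and evaluate cycle applies.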


Prerequisites
  • An understanding of data
  • A good logical mind
  • No background in mathematics is expected


Module 1: Introduction
This module introduces the background to Machine Learning.
  • Definition of Machine Learning (ML)
  • Origins of ML
  • Rule deduction (Expert Systems) vs induction (ML)
  • Why do we want machines to learn?
  • Case studies
  • Regression as a classic example of ML
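
As an illustration of regression as induction from data, the following minimal sketch fits a straight line by ordinary least squares. The data points and the function name `fit_line` are assumptions made for this example only.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums (the 1/n factors cancel out).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Invented observations that lie roughly on a line.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.1, 4.9, 7.2, 9.0, 10.8]

slope, intercept = fit_line(xs, ys)
prediction = slope * 6.0 + intercept  # use the learned model on an unseen input
print(f"slope={slope:.2f}, intercept={intercept:.2f}, prediction at x=6: {prediction:.2f}")
```

The rule (the line) is not written by hand as in an expert system. It is induced from the examples.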
Module 2: Data collection and preparation
Collecting the correct data for the training and testing phases is crucial. The data is often ‘dirty’ and needs to be cleaned. But more than that, the way in which the data is pre-processed is often the difference between poor and highly effective ML.
  • Types of Data
  • Data understanding
  • Data selection
  • Data sampling
  • Data volume reduction
  • Removing ambiguities
  • Normalisation
  • Discretisation
  • Cleansing
  • Missing values
  • Outliers
  • Data and dimensional reduction
  • Principal Component Analysis (PCA)
  • Generalisation of hierarchies
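
Three of the preparation steps listed above (missing values, outliers and normalisation) can be sketched as follows. The readings are invented, and the specific choices (median imputation, the 1.5 × IQR outlier rule, min-max scaling) are common conventions assumed here for illustration, not the only options.

```python
import statistics

# Invented readings: one missing value (None) and one suspicious spike (95.0).
raw = [12.0, 15.0, None, 14.0, 13.0, 95.0, 16.0]


def impute_missing(values):
    """Replace each missing value with the median of the known values."""
    known = [v for v in values if v is not None]
    fill = statistics.median(known)
    return [fill if v is None else v for v in values]


def find_outliers(values, k=1.5):
    """Flag values more than k * IQR below the first or above the third quartile."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


def min_max_normalise(values):
    """Rescale values linearly into the range 0..1."""
    lo, hi = min(values), max(values)
    if hi == lo:  # all values identical: avoid dividing by zero
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


filled = impute_missing(raw)
outliers = find_outliers(filled)
cleaned = [v for v in filled if v not in outliers]
normalised = min_max_normalise(cleaned)
print("outliers:", outliers)
print("normalised:", normalised)
```

The order matters in this sketch: outliers are removed before scaling, because a single extreme value would otherwise squash every other value towards zero.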
Module 3: Creating or choosing an algorithm
Building a new algorithm for the data modelling (or, as is often done, choosing an existing one) is a vital part of the process. We also look at the common ML algorithms.
  • Examples of creating algorithms
  • The use of data mining algorithms
  • Classes and examples of data mining
  • Regression
  • Clustering
  • Decision trees
  • Support Vector Machines (SVMs)
  • Classification
  • Segmentation
  • Association
  • Sequence analysis
  • Neural nets
  • Deep Learning
  • KNN (K Nearest Neighbour)
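
Of the algorithms listed above, KNN is among the simplest to write from scratch. The sketch below is a minimal version assuming Euclidean distance and a simple majority vote. The points, labels and the name `knn_predict` are invented for this example.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    nearest = sorted(train, key=lambda row: math.dist(row[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Invented 2-D points in two well-separated groups.
train = [
    ((1.0, 1.0), "A"), ((1.5, 2.0), "A"), ((2.0, 1.0), "A"),
    ((6.0, 6.0), "B"), ((7.0, 6.5), "B"), ((6.5, 7.0), "B"),
]

label_near_a = knn_predict(train, (1.2, 1.4))
label_near_b = knn_predict(train, (6.4, 6.2))
print(label_near_a, label_near_b)
```

KNN has no separate training step: the 'model' is the stored training data itself, and all of the work happens at prediction time.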
Module 4: Training and Test data
This module covers data models and how to create training and test data.
  • Selecting the training data
  • Ratio of training to test data
  • How to make an unbiased selection
  • How to use training data to create the model
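
One common way to make an unbiased selection is to shuffle the rows randomly before splitting them. The sketch below assumes a 75:25 ratio and a fixed seed for repeatability. Both values, and the function name `train_test_split`, are choices made for this example.

```python
import random


def train_test_split(rows, test_fraction=0.25, seed=42):
    """Shuffle a copy of the rows, then cut it into training and test sets."""
    shuffled = list(rows)                  # copy, so the caller's list is untouched
    random.Random(seed).shuffle(shuffled)  # seeded generator gives a repeatable split
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


rows = list(range(20))  # stand-in for 20 data records
train, test = train_test_split(rows)
print(len(train), "training rows,", len(test), "test rows")
```

Shuffling guards against any ordering in the source data (for example, records sorted by date or by class) leaking into the split. No row appears in both sets, so the test data stays unseen during training.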
Module 5: Testing and improving the data model
Testing is a vital (and complex) part of the process.
  • Type 1, 2 and 3 errors
  • False positives vs False negatives
  • PCC
  • Classification models
  • Confusion matrices
  • Measuring efficiency
  • ROC curves
  • More about efficiency
  • Overfitting and bias
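
For a two-class model, false positives, false negatives and the confusion matrix all come from the same four counts. The following minimal sketch assumes labels coded as 0 and 1, with invented actual and predicted values.

```python
def confusion_counts(actual, predicted):
    """Return (tp, fp, tn, fn) for binary labels coded as 0 and 1."""
    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if p == 1 and a == 1:
            tp += 1  # true positive
        elif p == 1 and a == 0:
            fp += 1  # false positive (Type 1 error)
        elif p == 0 and a == 0:
            tn += 1  # true negative
        else:
            fn += 1  # false negative (Type 2 error), assuming labels are only 0 or 1
    return tp, fp, tn, fn


# Invented labels: 4 actual positives and 6 actual negatives.
actual = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

tp, fp, tn, fn = confusion_counts(actual, predicted)
accuracy = (tp + tn) / len(actual)
true_positive_rate = tp / (tp + fn)   # also called recall or sensitivity
false_positive_rate = fp / (fp + tn)
print(f"TP={tp} FP={fp} TN={tn} FN={fn}")
print(f"accuracy={accuracy:.2f} TPR={true_positive_rate:.2f} FPR={false_positive_rate:.2f}")
```

An ROC curve is built from these same two rates: each decision threshold gives one (false positive rate, true positive rate) point, and the curve joins the points across all thresholds.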
Module 6: Introduction to ML in R
R is a well-established open source language with many built-in ML algorithms. This module introduces the language and provides some practical ML work.
  • Introduction to R
  • Lab: ML with R
Module 7: Combining data models
Any single data model that we build will have a certain level of efficiency. However, we can build a number of different data models and combine them in various ways so that the combined model is more effective than any of the individual models on its own.
  • Ensemble
  • Boosting
  • Gradient boosting
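
The simplest form of ensemble is a majority vote across several models. In the sketch below the three sets of predictions are invented, and are deliberately arranged so that the models make their mistakes on different records. It illustrates voting only, not boosting or gradient boosting.

```python
from collections import Counter

actual = [1, 0, 1, 1, 0, 1, 0, 0]

# Invented predictions from three hypothetical models, each wrong on two records.
model_a = [1, 0, 1, 1, 0, 0, 1, 0]
model_b = [0, 0, 1, 1, 0, 1, 0, 1]
model_c = [1, 1, 0, 1, 0, 1, 0, 0]


def majority_vote(*prediction_lists):
    """For each record, return the label predicted by most of the models."""
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*prediction_lists)]


def accuracy(actual, predicted):
    """Fraction of records predicted correctly."""
    return sum(a == p for a, p in zip(actual, predicted)) / len(actual)


ensemble = majority_vote(model_a, model_b, model_c)
for name, preds in [("A", model_a), ("B", model_b), ("C", model_c), ("ensemble", ensemble)]:
    print(f"{name}: accuracy={accuracy(actual, preds):.2f}")
```

The vote only helps because the individual errors do not coincide. If all three models were wrong on the same records, the ensemble would be wrong there too.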


This course is authored by QA

Delivery Method


Face-to-face learning in the comfort of our quality nationwide centres, with free refreshments and Wi-Fi.
