Overview

R is one of the most popular environments and languages for statistical analysis, data mining, and machine learning. Managed and scalable versions of R run in SQL Server, Power BI, and Azure ML. The main topic of this 4-day course is the R language. However, the course also shows how to use the other languages and tools available in the Microsoft BI suite for data science applications, including Python, T-SQL, Power BI, Azure ML, and Excel. The labs focus on R; the demos also show the code in other languages.

Prerequisites

Attendees should have a basic understanding of data analysis and basic familiarity with SQL Server tools.

Course Format

This seminar consists of instructor presentations and individual work during labs. In the labs, attendees mainly use the R language.

Every attendee gets a PDF printout of all slides, plus all code and solutions for the demos and the lab exercises.

In addition, every attendee gets an electronic version of the Data Science with SQL Server Quick Start Guide book by Dejan Sarka, Packt, 2018.

Each attendee works on a virtual machine prepared in advance, with the following software pre-installed:

  • SQL Server 2017 or 2019 Database Engine with ML Services (In-Database)
  • AdventureWorksDW2017 demo database
  • Microsoft R Client
  • RStudio IDE
  • SQL Server Management Studio

Learning Outcomes

Attendees of this course learn to program with R from scratch. Basic R code is introduced using the free R engine and the RStudio IDE. The lifecycle of a data science project is explained in detail. The attendees learn how to perform the data overview and how to handle the most tedious task in a project, data preparation. After data overview and preparation, the analytical part begins with intermediate statistics for analyzing associations between pairs of variables. Then the course introduces more advanced methods for researching linear dependencies.
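
As a rough illustration of what analyzing associations between pairs of variables can look like, the sketch below computes a Pearson correlation coefficient in plain Python, one of the languages shown in the course demos. It is not taken from the course material: the function name `pearson_correlation` and the sample data are hypothetical and chosen only for illustration.

```python
import math


def pearson_correlation(x, y):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products of deviations (numerator of the covariance).
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Sums of squared deviations for each variable.
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return cross / math.sqrt(ss_x * ss_y)


# Hypothetical data: customer age and number of purchases (illustrative only).
age = [25, 32, 41, 47, 53, 60]
purchases = [2, 3, 5, 4, 6, 7]
r = pearson_correlation(age, purchases)
print(f"Pearson r = {r:.3f}")
```

A value close to 1 or -1 suggests a strong linear association, while a value near 0 suggests little or no linear association. In practice a library routine would normally be used instead of a hand-written function.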

Too many variables in a model can become a problem in itself. The course shows how to do feature selection, starting with the basics of matrix calculations. Then the course moves on to more advanced data mining and machine learning analyses, including supervised and unsupervised learning. The course also introduces current topics, including forecasting, text mining, and reinforcement learning.

Finally, the attendees also learn through labs how to use R code in SQL Server, Azure ML, and Power BI, and through demos how to use Python inside all of the tools mentioned.

Course Outline

Following an introduction, the modules are as follows:

  1. Introducing data science and R
    1. What are statistics, data mining, machine learning…
    2. Data science projects and their lifetime
    3. Introducing R
    4. R tools
    5. R data structures

Lab 1

  2. Introducing Python
    1. Basic syntax and objects
    2. Data manipulation with NumPy and Pandas
    3. Visualizations with matplotlib and seaborn libraries
    4. Data science with Scikit-Learn

Discussion: R vs Python

  3. Data overview
    1. Datasets, cases and variables
    2. Types of variables
    3. Introductory statistics for discrete variables
    4. Descriptive statistics for continuous variables
    5. Basic graphs
    6. Sampling, confidence level, confidence interval

Lab 2

  4. Data preparation
    1. Derived variables
    2. Missing values and outliers
    3. Smoothing and normalization
    4. Time series
    5. Training and test sets

Lab 3

  5. Associations between two variables and visualizations of associations
    1. Covariance and correlation
    2. Contingency tables and chi-squared test
    3. T-test and analysis of variance
    4. Bayesian inference
    5. Linear models

Lab 4

  6. Feature selection and matrix operations
    1. Feature selection in linear models
    2. Basic matrix algebra
    3. Principal component analysis
    4. Exploratory factor analysis

Lab 5

  7. Unsupervised learning
    1. Hierarchical clustering
    2. K-means clustering
    3. Association rules

Lab 6

  8. Supervised learning
    1. Neural networks
    2. Logistic regression
    3. Decision and regression trees
    4. Random forests
    5. Gradient boosting trees
    6. K-nearest neighbors

Lab 7

  9. Modern topics
    1. Support vector machines
    2. Time series
    3. Text mining
    4. Deep learning
    5. Reinforcement learning

Lab 8

  10. R in SQL Server and MS BI
    1. ML Services (In-Database) structure
    2. Executing external scripts in SQL Server
    3. Storing a model and performing native predictions
    4. R in Azure ML and Power BI

Lab 9

Frequently asked questions

How can I create an account on myQA.com?

There are a number of ways to create an account. If you are a self-funder, simply select the "Create account" option on the login page.

If you have been booked onto a course by your company, you will receive a confirmation email. From this email, select "Sign into myQA" and you will be taken to the "Create account" page. Complete all of the details and select "Create account".

If you have the booking number, you can also go here and select the "I have a booking number" option. Enter the booking reference and your surname. If the details match, you will be taken to the "Create account" page, where you can enter your details and confirm your account.

Find more answers to frequently asked questions in our FAQs: Bookings & Cancellations page.

How do QA’s virtual classroom courses work?

Our virtual classroom courses allow you to access award-winning classroom training, without leaving your home or office. Our learning professionals are specially trained on how to interact with remote attendees and our remote labs ensure all participants can take part in hands-on exercises wherever they are.

We use the WebEx video conferencing platform by Cisco. Before you book, check that you meet the WebEx system requirements and run a test meeting (more details in the link below) to ensure the software is compatible with your firewall settings. If it doesn’t work, try adjusting your settings or contact your IT department about permitting the website.

Learn more about our Virtual Classrooms.

How do QA’s online courses work?

QA online courses, also commonly known as distance learning courses or e-learning courses, take the form of interactive software designed for individual learning. You will also have access to full support from our subject-matter experts for the duration of your course. When you book a QA online learning course, you will receive immediate access to it through our e-learning platform and you can start to learn straight away, from any compatible device. Access to the online learning platform is valid for one year from the booking date.

All courses are built around case studies and presented in an engaging format, which includes storytelling elements, video, audio and humour. Every case study is supported by sample documents and a collection of Knowledge Nuggets that provide more in-depth detail on the wider processes.

Learn more about QA’s online courses.

When will I receive my joining instructions?

Joining instructions for QA courses are sent two weeks prior to the course start date, or immediately if the booking is confirmed within this timeframe. For course bookings made via QA but delivered by a third-party supplier, joining instructions are sent to attendees prior to the training course, but timescales vary depending on each supplier’s terms.

When will I receive my certificate?

Certificates of Achievement are issued at the end of the course, either as a hard copy or via email.
