This course shows business analysts and data scientists how to profile, integrate, cleanse, and move big data in a Hadoop environment through an intuitive web-based interface, without writing code.
Working with SAS® Data Loader 2.3 for Hadoop
There are currently no prerequisites for this course.
Who should attend
Business Users who interact with data, perform data discovery, query data, and ensure that data is in the proper place and format for other users; Data Analysts, Data Scientists, and Statisticians who review results of data discovery activities, create new tables, create new data elements, change the format/structure of data tables to view the data in a variety of ways, manipulate and score data elements, and load data for use by other users; and Data Management Specialists who apply enterprise standa
- move data in and out of Hadoop
- interrogate and profile data for quality issues
- transform, transpose, and join data so that it is fit for purpose
- cleanse and integrate data suitable for analysis and reporting
- load data into the SAS In-Memory Analytics Server for analytics and exploration
- execute custom SAS and HiveQL code inside the Hadoop cluster.
- why SAS?
- why SAS and Hadoop?
- why SAS Data Loader for Hadoop?
- introduction to the Big Data Era
- why Hadoop?
SAS Data Loader Overview
- introduction to virtual applications
- introduction to SAS Data Loader (vApp)
SAS Data Loader functionality
- navigating in SAS Data Loader interface
- steps common to most directives
Methodology and Course Flow
- SAS Data Loader use cases
- preparing data for analytics methodology
- course overview and demo/exercise logistics
Acquiring and Discovering Data
- copy tables to Hadoop
- import text files into Hadoop
- profile data in Hadoop for data quality issues
- query data in Hadoop to understand structure and content
Transforming and Transposing Data
- transform data in Hadoop
- transpose data in Hadoop
- parse data into meaningful subsets to provide a basis for analysis
- standardize data into consistent format and structure
- generate match codes to support fuzzy matches for joining tables
- identify and categorize data in Hadoop
- filter data rows using business rules or Hive expressions
- create queries to select and join tables using inner, outer, left, and right join types
- join tables using generated match codes for dissimilar tables
- sort, de-duplicate, and manage columns and data
- execute SAS programs in Hadoop using ultra-efficient SAS DS2 language elements
- run a Hive program using an expression builder or copy in your code
- load data to LASR
- copy data from Hadoop
- SAS Data Loader vApp settings
- SAS Data Loader configurations
- SAS and Hadoop data processing
- SAS DS2 programs
- debugging Hadoop jobs
This is a QA approved partner course
Face-to-face learning in the comfort of our quality nationwide centres, with free refreshments and Wi-Fi.
Online booking is currently not available for this course. To find out more, please call us on 0113 220 7150 or email us at firstname.lastname@example.org to discuss how we can help.
Fully accredited to ensure we provide the highest possible standards in learning