Hadoop is more popular than ever and is generating data-driven business value across every industry. This course gives attendees the essential skills to build Big Data applications using Hadoop technologies such as HDFS, YARN, Apache Kafka, Apache Hive and Apache Spark, in an analytical ecosystem with Teradata components such as Teradata Database, Teradata Viewpoint, and Teradata QueryGrid.
In this course, students will have access to their own cluster to gain hands-on experience. Students will use the Hadoop Distributed File System (HDFS) and process distributed datasets with Hive. In addition, students will develop Spark applications in Scala and Python using both RDDs and DataFrames.
Students will write applications using Hive and Spark and learn about common issues encountered when processing vast datasets in distributed systems.
A discussion of additional tools and Hadoop distributions, along with the opportunity to ask questions of Hadoop technology experts, makes this popular course an essential grounding for companies looking to implement Hadoop effectively within their enterprise.
This course is intended for Hive Developers, Spark Developers, Hadoop Developers, Data Scientists, Business Analysts/Data Analysts, and Data Engineers.