The Hands-On Guide to Hadoop and Big Data
February 28, 2018
This 18-part hands-on course introduces you to the basics of Hadoop and Big Data through code samples and walkthroughs.
The primary objectives of this course are:
- Set up a real Hadoop installation using Docker, either on your local machine or in the DigitalOcean cloud, and develop against it.
- Set up the backbone of your own big data cluster using HDFS and MapReduce.
- Analyze large data sets by writing programs in Pig and Spark.
- Store and query your data using Sqoop, Hive, HBase, Cassandra, MongoDB, Drill, Phoenix, and Presto.
- Manage your cluster and workflows using Oozie, YARN, Mesos, Zookeeper, and Hue.
- Stream real-time data using Kafka, Flume, Spark Streaming, Flink, and Storm.
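Before diving into the tools above, it helps to see the MapReduce model that underpins HDFS-based processing. Below is a minimal sketch in plain Python: a word count that simulates the map, shuffle, and reduce phases in memory, without a cluster. The function names are illustrative, not Hadoop APIs.

```python
from collections import defaultdict

def map_phase(lines):
    """Map: emit a (word, 1) pair for every word in every input line."""
    for line in lines:
        for word in line.lower().split():
            yield word, 1

def shuffle(pairs):
    """Shuffle: group all emitted values by key, as Hadoop does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Reduce: sum the counts for each word."""
    return {word: sum(counts) for word, counts in groups.items()}

lines = ["big data big ideas", "hadoop handles big data"]
counts = reduce_phase(shuffle(map_phase(lines)))
print(counts["big"])  # → 3
```

Each phase is a pure function over key-value pairs, which is exactly what lets a real Hadoop cluster run the map and reduce steps in parallel across machines.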
Everything in this guide is 100% free. You can think of it as a “Free Online Nano Book”. The tutorials are easy to follow and designed to make you productive with Big Data concepts quickly.
Pro-tip: If you already know some of the concepts below, you can skip them by marking them as completed.
Introduction to Big Data
Introduction to Hadoop
Writing Programs on Hadoop
Storing and Querying Data
- Hive Tutorial
- Sqoop Tutorial
- HBase Tutorial — Hadoop and NoSQL Part 1
- Cassandra Tutorial — Hadoop and NoSQL Part 2
- MongoDB Tutorial — Hadoop and NoSQL Part 3
- Data Querying Tools Tutorial — Zeppelin, Drill, Phoenix, and Presto
Cluster and Workflow Management
- Oozie Tutorial — Workflow Management
- Cluster Management Tools Tutorial — YARN, Tez, Mesos, Zookeeper, and Hue
Streaming Data
- Kafka, Flume, and Flafka Tutorial
- Streaming Tools Tutorial — Spark Streaming, Apache Flink, and Storm