Section 1 : Introduction

Lecture 1 You, this course and Us 00:01:52 Duration

Section 2 : Why is Big Data a Big Deal

Lecture 1 The Big Data Paradigm 00:14:18 Duration
Lecture 2 Serial vs Distributed Computing 00:08:32 Duration
Lecture 3 What is Hadoop
Lecture 4 HDFS or the Hadoop Distributed File System 00:10:51 Duration
Lecture 5 MapReduce Introduced 00:11:33 Duration
Lecture 6 YARN or Yet Another Resource Negotiator 00:03:58 Duration

Section 3 : Installing Hadoop in a Local Environment

Lecture 1 Hadoop Install Modes 00:08:22 Duration
Lecture 2 Hadoop Standalone mode Install 00:15:39 Duration
Lecture 3 Hadoop Pseudo-Distributed mode Install

Section 4 : The MapReduce Hello World

Lecture 1 The basic philosophy underlying MapReduce 00:08:45 Duration
Lecture 2 MapReduce - Visualized And Explained 00:09:00 Duration
Lecture 3 MapReduce - Digging a little deeper at every step 00:10:17 Duration
Lecture 4 Hello World in MapReduce 00:10:23 Duration
Lecture 5 The Mapper 00:09:46 Duration
Lecture 6 The Reducer 00:07:44 Duration
Lecture 7 The Job 00:12:21 Duration

Section 5 : Run a MapReduce Job

Lecture 1 Get comfortable with HDFS 00:10:45 Duration
Lecture 2 Run your first MapReduce Job 00:14:22 Duration

Section 6 : Juicing your MapReduce - Combiners, Shuffle and Sort and The Streaming API

Lecture 1 Parallelize the reduce phase - use the Combiner 00:14:30 Duration
Lecture 2 Not all Reducers are Combiners 00:13:30 Duration
Lecture 3 How many mappers and reducers does your MapReduce have
Lecture 4 Parallelizing reduce using Shuffle And Sort 00:14:32 Duration
Lecture 5 MapReduce is not limited to the Java language - Introducing the Streaming API 00:05:02 Duration
Lecture 6 Python for MapReduce 00:12:13 Duration

Section 7 : HDFS and Yarn

Lecture 1 HDFS - Protecting against data loss using replication 00:15:30 Duration
Lecture 3 HDFS - Checkpointing to backup name node information 00:11:10 Duration
Lecture 4 Yarn - Basic components 00:08:32 Duration
Lecture 5 Yarn - Submitting a job to Yarn
Lecture 6 Yarn - Plug in scheduling policies 00:14:11 Duration
Lecture 7 Yarn - Configure the scheduler 00:12:29 Duration

Section 8 : MapReduce Customizations For Finer Grained Control

Lecture 1 Setting up your MapReduce to accept command line arguments 00:13:43 Duration
Lecture 2 The Tool, ToolRunner and GenericOptionsParser 00:12:30 Duration
Lecture 3 Configuring properties of the Job object 00:10:39 Duration

Section 9 : The Inverted Index, Custom Data Types for Keys, Bigram Counts and Unit Tests!

Lecture 2 Generating the inverted index using MapReduce 00:10:29 Duration
Lecture 3 Custom data types for keys - The Writable Interface
Lecture 4 Represent a Bigram using a WritableComparable 00:13:15 Duration
Lecture 5 MapReduce to count the Bigrams in input text 00:08:27 Duration
Lecture 6 Setting up your Hadoop project
Lecture 7 Test your MapReduce job using MRUnit 00:13:42 Duration

Section 10 : Input and Output Formats and Customized Partitioning

Lecture 1 Introducing the File Input Format 00:12:19 Duration
Lecture 2 Text And Sequence File Formats 00:10:19 Duration
Lecture 3 Data partitioning using a custom partitioner 00:06:56 Duration
Lecture 4 Make the custom partitioner real in code 00:10:22 Duration
Lecture 5 Total Order Partitioning 00:10:07 Duration
Lecture 6 Input Sampling, Distribution, Partitioning and configuring these 00:09:02 Duration
Lecture 7 Secondary Sort 00:14:24 Duration

Section 11 : Recommendation Systems using Collaborative Filtering

Lecture 1 Introduction to Collaborative Filtering 00:07:20 Duration
Lecture 2 Friend recommendations using chained MR jobs 00:17:12 Duration
Lecture 3 Get common friends for every pair of users - the first MapReduce 00:14:43 Duration
Lecture 4 Top 10 friend recommendation for every user - the second MapReduce 00:13:39 Duration

Section 12 : Hadoop as a Database

Lecture 1 Structured data in Hadoop 00:14:00 Duration
Lecture 2 Running an SQL Select with MapReduce 00:15:27 Duration
Lecture 3 Running an SQL Group By with MapReduce 00:13:55 Duration
Lecture 4 A MapReduce Join - The Map Side 00:14:13 Duration
Lecture 5 A MapReduce Join - The Reduce Side 00:13:01 Duration
Lecture 6 A MapReduce Join - Sorting and Partitioning 00:08:44 Duration
Lecture 7 A MapReduce Join - Putting it all together 00:13:41 Duration

Section 13 : K-Means Clustering

Lecture 1 What is K-Means Clustering 00:13:59 Duration
Lecture 2 A MapReduce job for K-Means Clustering 00:16:28 Duration
Lecture 3 K-Means Clustering - Measuring the distance between points 00:13:22 Duration
Lecture 4 K-Means Clustering - Custom Writables for Input/Output 00:08:13 Duration
Lecture 5 K-Means Clustering - Configuring the Job 00:10:45 Duration
Lecture 6 K-Means Clustering - The Mapper and Reducer 00:11:15 Duration
Lecture 7 K-Means Clustering - The Iterative MapReduce Job 00:03:35 Duration

Section 14 : Setting up a Hadoop Cluster

Lecture 1 Manually configuring a Hadoop cluster (Linux VMs) 00:12:52 Duration
Lecture 2 Getting started with Amazon Web Services 00:06:20 Duration
Lecture 3 Start a Hadoop Cluster with Cloudera Manager on AWS 00:12:59 Duration

Section 15 : Appendix

Lecture 1 Set up a Virtual Linux Instance (For Windows users) 00:15:50 Duration
Lecture 2 [For Linux/Mac OS Shell Newbies] Path and other Environment Variables 00:08:21 Duration