Section 1 : Introduction

Lecture 1 INTRODUCTION TO BRAINMEASURES PROCTOR SYSTEM Pdf
Lecture 2 Using labs for preparation 8:55
Lecture 3 Setup Development Environment (Windows 10) - Introduction 2:26
Lecture 4 Setup Development Environment - Python and Spark - Pre-requisites 4:12
Lecture 5 Setup Development Environment - Python Setup on Windows 3:08
Lecture 6 Setup Development Environment - Configure Environment Variables 2:32
Lecture 7 Setup Development Environment - Setup PyCharm for developing Python applications 5:29
Lecture 8 Setup Development Environment - Pass run time arguments or parameters 2:32
Lecture 9 Setup Development Environment - Download Spark compressed tar ball 1:38
Lecture 10 Setup Development Environment - Install 7z to uncompress and untar on Windows 1:00
Lecture 11 Setup Development Environment - Setup Spark 2:27
Lecture 12 Setup Development Environment - Install JDK 6:05
Lecture 13 Setup Development Environment - Configure environment variables for Spark 3:47
Lecture 14 Setup Development Environment - Install WinUtils - integrate Windows and HDFS 6:31
Lecture 15 Setup Development Environment - Integrate PyCharm and Spark on Windows 10 7:06

Section 2 : Python Fundamentals

Lecture 16 Introduction and Setting up Python 9:34
Lecture 17 Basic Programming Constructs 13:16
Lecture 18 Functions in Python 14:05
Lecture 19 Python Collections 16:12
Lecture 20 Map Reduce operations on Python Collections 12:52
Lecture 21 Setting up Data Sets for Basic IO Operations 4:24
Lecture 22 Basic IO operations and processing data using Collections 16:36

Section 3 : Getting Started

Lecture 23 Get revenue for given order id - as application 12:21
Lecture 24 About Certification Pdf
Lecture 25 Setup Environment - Locally 2:04
Lecture 26 Setup Environment - using Cloudera Quickstart VM 7:23
Lecture 27 Using ITVersity platforms - Big Data Developer labs and forum 7:20
Lecture 28 Using ITVersity's big data labs 8:37
Lecture 29 Using Windows - Putty and WinSCP 10:34
Lecture 30 Using Windows - Cygwin 14:46
Lecture 31 HDFS Quick Preview 20:25
Lecture 32 YARN Quick Preview 9:53
Lecture 33 Setup Data Sets 7:54

Section 4 : Apache Spark 1

Lecture 34 Introduction 6:05
Lecture 35 Introduction to Spark 2:22
Lecture 36 Setup Spark on Windows 23:15
Lecture 37 Quick overview about Spark documentation 4:38
Lecture 38 Connecting to the environment 3:49
Lecture 39 Initializing Spark job using pyspark 4:54
Lecture 40 Create RDD from HDFS files 18:28
Lecture 41 Create RDD from collection - using parallelize 4:53
Lecture 42 Read data from different file formats - using sqlContext 8:06
Lecture 43 Row Level Transformations - String Manipulation 11:00
Lecture 44 Row Level Transformations - map 12:25
Lecture 45 Row Level Transformations - flatMap 5:50
Lecture 46 Filtering data using filter 10:09
Lecture 47 Joining Data Sets - Introduction 5:17
Lecture 48 Joining Data Sets - Inner Join 10:34
Lecture 49 Joining Data Sets - Outer Join 14:39
Lecture 50 Aggregations - Introduction 3:01
Lecture 51 Aggregations - count and reduce - Get revenue for order id 12:52
Lecture 52 Aggregations - reduce - Get order item with minimum subtotal for order id 5:47
Lecture 53 Aggregations - countByKey - Get order count by status 5:58
Lecture 54 Aggregations - understanding combiner 6:51
Lecture 55 Aggregations - groupByKey - Get revenue for each order id 8:17
Lecture 56 groupByKey - Get order items sorted by order_item_subtotal for each order id 11:59
Lecture 57 Aggregations - reduceByKey - Get revenue for each order id 10:26
Lecture 58 Aggregations - aggregateByKey - Get revenue and count of items for each order id 14:30
Lecture 59 Sorting - sortByKey - Sort data by product price 9:59
Lecture 60 Sorting - sortByKey - Sort data by category id and then by price descending 10:48
Lecture 61 Ranking - Introduction 1:18
Lecture 62 Ranking - Global Ranking using sortByKey and take 2:49
Lecture 63 Ranking - Global using takeOrdered or top 7:29
Lecture 64 Ranking - By Key - Get top N products by price per category - Introduction 3:54
Lecture 65 Ranking - By Key - Get top N products by price per category - Python collections 4:41
Lecture 66 Ranking - By Key - Get top N products by price per category - using flatMap 3:06
Lecture 67 Ranking - By Key - Get top N priced products - Introduction 3:00
Lecture 68 Ranking - By Key - Get top N priced products - using Python collections API 13:06
Lecture 69 Ranking - By Key - Get top N priced products - Create Function 5:03
Lecture 70 Ranking - By Key - Get top N priced products - integrate with flatMap 4:16
Lecture 71 Set Operations - Introduction 1:05
Lecture 72 Set Operations - Prepare data 8:22
Lecture 73 Set Operations - union and distinct 5:14
Lecture 74 Set Operations - intersect and minus 8:04
Lecture 75 Saving data into HDFS - text file format 11:46
Lecture 76 Saving data into HDFS - text file format with compression 5:52
Lecture 77 Saving data into HDFS using Data Frames - json 11:18

Section 5 : Apache Spark 1

Lecture 78 Problem Statement 1:54
Lecture 79 Launching pyspark 11:45
Lecture 80 Reading data from HDFS and filtering 8:14
Lecture 81 Joining orders and order_items 7:44
Lecture 82 Aggregate to get daily revenue per product id 6:53
Lecture 83 Load products and convert into RDD 10:01
Lecture 84 Join and sort the data 11:39
Lecture 85 Save to HDFS and validate in text file format 7:24
Lecture 86 Saving data in avro file format 11:58
Lecture 87 Get data to local file system using get or copyToLocal 4:51
Lecture 88 Develop as application to get daily revenue per product 7:27
Lecture 89 Run as application on the cluster 5:08

Section 6 : Apache Spark 1

Lecture 90 Different interfaces to run SQL - Hive, Spark SQL 9:26
Lecture 91 Create database and tables of text file format - orders and order_items 25:00
Lecture 92 Create database and tables of ORC file format - orders and order_items 10:19
Lecture 93 Running SQL/Hive Commands using pyspark 5:17
Lecture 94 Functions - Getting Started 5:11
Lecture 95 Functions - String Manipulation 22:23
Lecture 96 Functions - Date Manipulation 13:44
Lecture 97 Functions - Aggregate Functions in brief 5:49
Lecture 98 Functions - case and nvl 14:10
Lecture 99 Row level transformations 8:31
Lecture 100 Joining data between multiple tables 18:10
Lecture 101 Group by and aggregations 11:41
Lecture 102 Sorting the data 7:27
Lecture 103 Set operations - union and union all 5:39
Lecture 104 Analytics functions - aggregations 15:54
Lecture 105 Analytics functions - ranking 8:40
Lecture 106 Windowing functions 7:49
Lecture 107 Creating Data Frames and register as temp tables 18:46
Lecture 108 Write Spark Application - Processing Data using Spark SQL 9:14
Lecture 109 Write Spark Application - Saving Data Frame to Hive tables 9:35
Lecture 110 Data Frame Operations 13:42

Section 7 : Setup Hadoop and Spark Environment for Practice

Lecture 111 About Proctor Testing Pdf
Lecture 112 Overview of ITVersity Boxes GitHub Repository 3:11
Lecture 113 Creating Virtual Machine 10:31
Lecture 114 Starting HDFS and YARN 4:29
Lecture 115 Gracefully Stopping Virtual Machine 5:42
Lecture 116 Understanding Datasets provided in Virtual Machine 5:39
Lecture 117 Using GitHub Content for the practice 5:12
Lecture 118 Using Resources for Practice 3:55

Section 8 : Apache Spark 2

Lecture 119 Introduction 2:10
Lecture 120 Review of Setup Steps for Spark Environment 8:40
Lecture 121 Using ITVersity labs 3:20
Lecture 122 Apache Spark Official Documentation (Very Important) 7:21
Lecture 123 Quick Review of Spark APIs 12:30
Lecture 124 Spark Modules 5:02
Lecture 125 Spark Data Structures - RDDs and Data Frames 14:49
Lecture 126 Develop Simple Application 14:26
Lecture 127 Apache Spark - Framework 22:20

Section 9 : Apache Spark 2

Lecture 128 Introduction 1:43
Lecture 129 Data Frames - Overview 12:22
Lecture 130 Create Data Frames from Text Files 16:18
Lecture 131 Create Data Frames from Hive Tables 5:50
Lecture 132 Create Data Frames using JDBC 17:14
Lecture 133 Data Frame Operations - Overview
Lecture 134 Spark SQL - Overview 4:00
Lecture 135 Overview of Functions to manipulate data in Data Frame fields or columns 5:52

Section 10 : Apache Spark 2

Lecture 136 Define Problem Statement - Get Daily Product Revenue 6:54
Lecture 137 Selection or Projection of Data in Data Frames 10:27
Lecture 138 Filtering Data from Data Frames 16:33
Lecture 139 Joining multiple Data Frames
Lecture 140 Perform Aggregations using Data Frames 12:24
Lecture 141 Sorting Data in Data Frames 10:24
Lecture 142 Development Life Cycle using Data Frames 14:36
Lecture 143 Run applications using Spark Submit 8:58

Section 11 : Apache Spark 2

Lecture 144 Data Frame Operations - Window Functions - Overview 4:35
Lecture 145 Data Frames - Window Functions APIs - Overview 5:02
Lecture 146 Define Problem Statement - Get Top N Daily Products 2:58
Lecture 147 Data Frame Operations - Creating Window Spec 4:21
Lecture 148 Data Frame Operations - Performing Aggregations using sum, avg etc 11:41
Lecture 149 Data Frame Operations - Time Series Functions such as Lead, Lag etc 15:05
Lecture 150 Data Frame Operations - Ranking Functions - rank, dense_rank, row_number etc 8:45

Section 12 : Apache Spark using SQL - Getting Started

Lecture 151 Getting Started - Overview 2:01
Lecture 152 Overview of Spark Documentation 2:29
Lecture 153 Launching and using Spark SQL CLI 4:08
Lecture 154 Overview of Spark SQL Properties 8:51
Lecture 155 Running OS Commands using Spark SQL 3:19
Lecture 156 Understanding Warehouse Directory 4:13
Lecture 157 Managing Spark Metastore Databases 10:02
Lecture 158 Managing Spark Metastore Tables 3:21
Lecture 159 Retrieve Metadata of Tables 2:19
Lecture 160 Role of Spark Metastore or Hive Metastore 5:01
Lecture 161 Exercise - Getting Started with Spark SQL 8:57

Section 13 : Apache Spark using SQL - Basic Transformations using Spark SQL

Lecture 162 Basic Transformations using Spark SQL - Introduction 3:20
Lecture 163 Spark SQL - Overview 6:42
Lecture 164 Define Problem Statement 3:20
Lecture 165 Prepare Tables 5:06
Lecture 166 Projecting Data 4:01
Lecture 167 Filtering Data
Lecture 168 Joining Tables - Inner 7:30
Lecture 169 Joining Tables - Outer 7:22
Lecture 170 Aggregating Data 11:06
Lecture 171 Sorting Data 4:49
Lecture 172 Conclusion - Final Solution 4:22

Section 14 : Apache Spark using SQL - Basic DDL and DML

Lecture 173 Introduction 2:48
Lecture 174 Create Spark Metastore Tables 10:33
Lecture 175 Overview of Data Types 9:51
Lecture 176 Adding Comments 2:02
Lecture 177 Loading Data Into Tables - Local 4:17
Lecture 178 Loading Data Into Tables - HDFS 6:10
Lecture 179 Loading Data - Append and Overwrite 2:41
Lecture 180 Creating External Tables 3:06
Lecture 181 Managed Tables vs External Tables 4:39
Lecture 182 Overview of File Formats 8:01
Lecture 183 Drop Tables and Databases 4:17
Lecture 184 Truncating Tables 2:17
Lecture 185 Exercise - Managed Tables 7:10

Section 15 : Apache Spark using SQL - DML and Partitioning

Lecture 186 Introduction 3:27
Lecture 187 Introduction to Partitioning 1:22
Lecture 188 Creating Tables using Parquet 4:41
Lecture 189 Load vs 4:25
Lecture 190 Inserting Data using Stage Table 4:52
Lecture 191 Creating Partitioned Tables
Lecture 192 Adding Partitions to Tables 4:01
Lecture 193 Loading Data into Partitioned Tables 8:01
Lecture 194 Inserting Data into Partitions 3:19
Lecture 195 Using Dynamic Partition Mode 4:52
Lecture 196 Exercise - Partitioned Tables 3:34

Section 16 : Apache Spark using SQL - Pre-defined Functions

Lecture 197 Introduction - Overview of Spark SQL Functions 1:45
Lecture 198 Overview of Functions 2:48
Lecture 199 Validating Functions
Lecture 200 String Manipulation Functions 11:03
Lecture 201 Date Manipulation Functions 16:48
Lecture 202 Overview of Numeric Functions 9:24
Lecture 203 Data Type Conversion 4:02
Lecture 204 Dealing with Nulls 7:52
Lecture 205 Using CASE and WHEN 7:32
Lecture 206 Query Example - Word Count 7:16

Section 17 : Sample Scenarios with Solutions

Lecture 207 INTRODUCTION TO BRAINMEASURES PROCTOR SYSTEM Pdf
Lecture 208 Problem Statements - General Guidelines 5:53
Lecture 209 Initializing the job - General Guidelines 13:32
Lecture 210 Exercise 01 - Get Monthly Crime Count By Type - Understanding Problem Statement 3:41
Lecture 211 Exercise 01 - Get Monthly Crime Count By Type - Core APIs - Design 4:02
Lecture 212 Exercise 01 - Get Monthly Crime Count By Type - Core APIs - Read Data into RDD 8:45
Lecture 213 Exercise 01 - Get Monthly Crime Count By Type - Core APIs - Perform Aggregation 9:49
Lecture 214 Exercise 01 - Get Monthly Crime Count By Type - Core APIs - Sort and Save output 11:16