Section 1 : Introduction
Section 2 : Python Fundamentals
|
Lecture 1 | Introduction and Setting up Python | 00:09:34 Duration |
|
Lecture 2 | Basic Programming Constructs | 00:13:16 Duration |
|
Lecture 3 | Functions in Python | 00:14:05 Duration |
|
Lecture 4 | Python Collections | 00:16:12 Duration |
|
Lecture 5 | Map Reduce operations on Python Collections | 00:12:52 Duration |
|
Lecture 6 | Setting up Data Sets for Basic IO Operations | 00:04:24 Duration |
|
Lecture 7 | Basic IO operations and processing data using Collections | 00:16:36 Duration |
Section 3 : Getting Started
|
Lecture 1 | Get revenue for given order id - as application | 00:12:21 Duration |
|
Lecture 2 | About Certification | |
|
Lecture 3 | Setup Environment - Locally | 00:02:04 Duration |
|
Lecture 4 | Setup Environment - using Cloudera Quickstart VM | 00:07:23 Duration |
|
Lecture 5 | Using ITVersity platforms - Big Data Developer labs and forum | 00:07:20 Duration |
|
Lecture 6 | Using ITVersity's big data labs | 00:08:37 Duration |
|
Lecture 7 | Using Windows - Putty and WinSCP | 00:10:34 Duration |
|
Lecture 8 | Using Windows - Cygwin | 00:14:46 Duration |
|
Lecture 9 | HDFS Quick Preview | 00:20:25 Duration |
|
Lecture 10 | YARN Quick Preview | 00:09:53 Duration |
|
Lecture 11 | Setup Data Sets | 00:07:54 Duration |
Section 4 : Apache Spark 1
Section 5 : Apache Spark 1
|
Lecture 1 | Problem Statement | 00:01:54 Duration |
|
Lecture 2 | Launching pyspark | 00:11:45 Duration |
|
Lecture 3 | Reading data from HDFS and filtering | 00:08:14 Duration |
|
Lecture 4 | Joining orders and order_items | 00:07:44 Duration |
|
Lecture 5 | Aggregate to get daily revenue per product id | 00:06:53 Duration |
|
Lecture 6 | Load products and convert into RDD | 00:10:01 Duration |
|
Lecture 7 | Join and sort the data | 00:11:39 Duration |
|
Lecture 8 | Save to HDFS and validate in text file format | 00:07:24 Duration |
|
Lecture 9 | Saving data in avro file format | 00:11:58 Duration |
|
Lecture 10 | Get data to local file system using get or copyToLocal | 00:04:51 Duration |
|
Lecture 11 | Develop as application to get daily revenue per product | 00:07:27 Duration |
|
Lecture 12 | Run as application on the cluster | 00:05:08 Duration |
Section 6 : Apache Spark 1
Section 7 : Setup Hadoop and Spark Environment for Practice
|
Lecture 1 | About Proctor Testing | |
|
Lecture 2 | Overview of ITVersity Boxes GitHub Repository | 00:03:11 Duration |
|
Lecture 3 | Creating Virtual Machine | 00:10:31 Duration |
|
Lecture 4 | Starting HDFS and YARN | 00:04:29 Duration |
|
Lecture 5 | Gracefully Stopping Virtual Machine | 00:05:42 Duration |
|
Lecture 6 | Understanding Datasets provided in Virtual Machine | 00:05:39 Duration |
|
Lecture 7 | Using GitHub Content for the practice | 00:05:12 Duration |
|
Lecture 8 | Using Resources for Practice | 00:03:55 Duration |
Section 8 : Apache Spark 2
|
Lecture 1 | Introduction | 00:02:10 Duration |
|
Lecture 2 | Review of Setup Steps for Spark Environment | 00:08:40 Duration |
|
Lecture 3 | Using ITVersity labs | 00:03:20 Duration |
|
Lecture 4 | Apache Spark Official Documentation (Very Important) | 00:07:21 Duration |
|
Lecture 5 | Quick Review of Spark APIs | 00:12:30 Duration |
|
Lecture 6 | Spark Modules | 00:05:02 Duration |
|
Lecture 7 | Spark Data Structures - RDDs and Data Frames | 00:14:49 Duration |
|
Lecture 8 | Develop Simple Application | 00:14:26 Duration |
|
Lecture 9 | Apache Spark - Framework | 00:22:20 Duration |
Section 9 : Apache Spark 2
|
Lecture 1 | Introduction | 00:01:43 Duration |
|
Lecture 2 | Data Frames - Overview | 00:12:22 Duration |
|
Lecture 3 | Create Data Frames from Text Files | 00:16:18 Duration |
|
Lecture 4 | Create Data Frames from Hive Tables | 00:05:50 Duration |
|
Lecture 5 | Create Data Frames using JDBC | 00:17:14 Duration |
|
Lecture 6 | Data Frame Operations - Overview | |
|
Lecture 7 | Spark SQL - Overview | 00:04:00 Duration |
|
Lecture 8 | Overview of Functions to manipulate data in Data Frame fields or columns | 00:05:52 Duration |
Section 10 : Apache Spark 2
|
Lecture 1 | Define Problem Statement - Get Daily Product Revenue | 00:06:54 Duration |
|
Lecture 2 | Selection or Projection of Data in Data Frames | 00:10:27 Duration |
|
Lecture 3 | Filtering Data from Data Frames | 00:16:33 Duration |
|
Lecture 4 | Joining multiple Data Frames | |
|
Lecture 5 | Perform Aggregations using Data Frames | 00:12:24 Duration |
|
Lecture 6 | Sorting Data in Data Frames | 00:10:24 Duration |
|
Lecture 7 | Development Life Cycle using Data Frames | 00:14:36 Duration |
|
Lecture 8 | Run applications using Spark Submit | 00:08:58 Duration |
Section 11 : Apache Spark 2
|
Lecture 1 | Data Frame Operations - Window Functions - Overview | 00:04:35 Duration |
|
Lecture 2 | Data Frames - Window Functions APIs - Overview | 00:05:02 Duration |
|
Lecture 3 | Define Problem Statement - Get Top N Daily Products | 00:02:58 Duration |
|
Lecture 4 | Data Frame Operations - Creating Window Spec | 00:04:21 Duration |
|
Lecture 5 | Data Frame Operations - Performing Aggregations using sum, avg etc | 00:11:41 Duration |
|
Lecture 6 | Data Frame Operations - Time Series Functions such as Lead, Lag etc | 00:15:05 Duration |
|
Lecture 7 | Data Frame Operations - Ranking Functions - rank, dense_rank, row_number etc | 00:08:45 Duration |
Section 12 : Apache Spark using SQL - Getting Started
|
Lecture 1 | Getting Started - Overview | 00:02:01 Duration |
|
Lecture 2 | Overview of Spark Documentation | 00:02:29 Duration |
|
Lecture 3 | Launching and using Spark SQL CLI | 00:04:08 Duration |
|
Lecture 4 | Overview of Spark SQL Properties | 00:08:51 Duration |
|
Lecture 5 | Running OS Commands using Spark SQL | 00:03:19 Duration |
|
Lecture 6 | Understanding Warehouse Directory | 00:04:13 Duration |
|
Lecture 7 | Managing Spark Metastore Databases | 00:10:02 Duration |
|
Lecture 8 | Managing Spark Metastore Tables | 00:03:21 Duration |
|
Lecture 9 | Retrieve Metadata of Tables | 00:02:19 Duration |
|
Lecture 10 | Role of Spark Metastore or Hive Metastore | 00:05:01 Duration |
|
Lecture 11 | Exercise - Getting Started with Spark SQL | 00:08:57 Duration |
Section 13 : Apache Spark using SQL - Basic Transformations using Spark SQL
|
Lecture 1 | Basic Transformations using Spark SQL - Introduction | 00:03:20 Duration |
|
Lecture 2 | Spark SQL - Overview | 00:06:42 Duration |
|
Lecture 3 | Define Problem Statement | 00:03:20 Duration |
|
Lecture 4 | Prepare Tables | 00:05:06 Duration |
|
Lecture 5 | Projecting Data | 00:04:01 Duration |
|
Lecture 6 | Filtering Data | |
|
Lecture 7 | Joining Tables - Inner | 00:07:30 Duration |
|
Lecture 8 | Joining Tables - Outer | 00:07:22 Duration |
|
Lecture 9 | Aggregating Data | 00:11:06 Duration |
|
Lecture 10 | Sorting Data | 00:04:49 Duration |
|
Lecture 11 | Conclusion - Final Solution | 00:04:22 Duration |
Section 14 : Apache Spark using SQL - Basic DDL and DML
|
Lecture 1 | Introduction | 00:02:48 Duration |
|
Lecture 2 | Create Spark Metastore Tables | 00:10:33 Duration |
|
Lecture 3 | Overview of Data Types | 00:09:51 Duration |
|
Lecture 4 | Adding Comments | 00:02:02 Duration |
|
Lecture 5 | Loading Data Into Tables - Local | 00:04:17 Duration |
|
Lecture 6 | Loading Data Into Tables - HDFS | 00:06:10 Duration |
|
Lecture 7 | Loading Data - Append and Overwrite | 00:02:41 Duration |
|
Lecture 8 | Creating External Tables | 00:03:06 Duration |
|
Lecture 9 | Managed Tables vs External Tables | 00:04:39 Duration |
|
Lecture 10 | Overview of File Formats | 00:08:01 Duration |
|
Lecture 11 | Drop Tables and Databases | 00:04:17 Duration |
|
Lecture 12 | Truncating Tables | 00:02:17 Duration |
|
Lecture 13 | Exercise - Managed Tables | 00:07:10 Duration |
Section 15 : Apache Spark using SQL - DML and Partitioning
|
Lecture 1 | Introduction | 00:03:27 Duration |
|
Lecture 2 | Introduction to Partitioning | 00:01:22 Duration |
|
Lecture 3 | Creating Tables using Parquet | 00:04:41 Duration |
|
Lecture 4 | Load vs | 00:04:25 Duration |
|
Lecture 5 | Inserting Data using Stage Table | 00:04:52 Duration |
|
Lecture 6 | Creating Partitioned Tables | |
|
Lecture 7 | Adding Partitions to Tables | 00:04:01 Duration |
|
Lecture 8 | Loading Data into Partitioned Tables | 00:08:01 Duration |
|
Lecture 9 | Inserting Data into Partitions | 00:03:19 Duration |
|
Lecture 10 | Using Dynamic Partition Mode | 00:04:52 Duration |
|
Lecture 11 | Exercise - Partitioned Tables | 00:03:34 Duration |
Section 16 : Apache Spark using SQL - Pre-defined Functions
|
Lecture 1 | Introduction - Overview of Spark SQL Functions | 00:01:45 Duration |
|
Lecture 2 | Overview of Functions | 00:02:48 Duration |
|
Lecture 3 | Validating Functions | |
|
Lecture 4 | String Manipulation Functions | 00:11:03 Duration |
|
Lecture 5 | Date Manipulation Functions | 00:16:48 Duration |
|
Lecture 6 | Overview of Numeric Functions | 00:09:24 Duration |
|
Lecture 7 | Data Type Conversion | 00:04:02 Duration |
|
Lecture 8 | Dealing with Nulls | 00:07:52 Duration |
|
Lecture 9 | Using CASE and WHEN | 00:07:32 Duration |
|
Lecture 10 | Query Example - Word Count | 00:07:16 Duration |
Section 17 : Sample Scenarios with Solutions
|
Lecture 1 | Remove - INTRODUCTION TO BRAINMEASURES PROCTOR SYSTEM | |
|
Lecture 2 | Problem Statements - General Guidelines | 00:05:53 Duration |
|
Lecture 3 | Initializing the job - General Guidelines | 00:13:32 Duration |
|
Lecture 4 | Exercise 01 - Get Monthly Crime Count By Type - Understanding Problem Statement | 00:03:41 Duration |
|
Lecture 5 | Exercise 01 - Get Monthly Crime Count By Type - Core APIs - Design | 00:04:02 Duration |
|
Lecture 6 | Exercise 01 - Get Monthly Crime Count By Type - Core APIs - Read Data into RDD | 00:08:45 Duration |
|
Lecture 7 | Exercise 01 - Get Monthly Crime Count By Type - Core APIs - Perform Aggregation | 00:09:49 Duration |
|
Lecture 8 | Exercise 01 - Get Monthly Crime Count By Type - Core APIs - Sort and Save output | 00:11:16 Duration |