Skillatwill

Big Data

What is Big Data?
Why Big Data came into the picture
What is Hadoop?
Why Spark came into the picture
Limitations of Hadoop and RDBMS

Hadoop & Spark installation on Ubuntu (hands-on)

Create a Hadoop & Scala environment in IntelliJ
Run a sample Scala program in IntelliJ

Scala Basics

Variables, Strings & Numbers
Arrays, lists, tuples, type hierarchy
Scala: expressions and conditionals
For loops, match, if/else
Functions & Objects, class methods
HDFS: responsibilities of the NameNode and DataNode
How HDFS replicates data
Read/write data from HDFS/local
NameNode, Application manager internals
Power of YARN
How the ResourceManager functions
NodeManager responsibilities
How the ApplicationMaster works
How YARN communicates with HDFS
Spark on YARN
How Spark runs on YARN
What is Mesos?
Power of containers & executors
In-memory concept

AWS Intro

EC2 creation
Hortonworks installation on EC2
Cloudera installation on EC2
Images, Windows and Linux servers
Auto Scaling EC2

AWS RDS

Create and insert data in Oracle, MySQL, MSSQL, and PostgreSQL databases
Sqoop import/export examples

AWS IAM

Users
Groups
Roles
Policies
S3 CLI commands
S3 bucket privileges
EMR
Create a multi-node cluster

Redshift & Data Pipeline

Create and process large amounts of data
Get data from Oracle to Redshift
Get data from S3 to Redshift

AWS Glue and Athena

PySpark script in Glue
Scala Spark script in Glue
Hive script in Athena

Sqoop Introduction

Import data from Oracle, MySQL, MSSQL
Store data in Hive
Delimiter change
Incremental data load
Performance tuning
Automate Sqoop using shell scripts

Sqoop export

Problems with export
Clean data

Hive Introduction

Create tables for CSV and JSON data. SerDes; process ORC and Parquet datasets.
Hive partitioning, bucketing, advanced techniques

Introduction: Why Spark?

What is RDD?
RDD properties
Spark Architecture
Why is Spark fast?
Key-Value Pair RDDs
DAG
RDD operations (transformations & actions)
RDD advanced topics (debugging, web UI)
Most frequently used Spark functions
Process data easily with RDDs
RDD to DataFrame
Spark SQL:
Ways to create DataFrames
Case classes
Process CSV data using RDD
Process sample JSON & complex JSON data
Process XML, Avro, Parquet, ORC, and Hive data using Spark

Process different types of datasets

Create a JAR and submit API
DataFrame operations
Memory management and Catalyst optimizer internals
Spark Cassandra integration
In dev and AWS environments

Introduction to AWS

IAM & EMR: how to create and practice Spark in EMR
How to submit a job in EMR
Spark HBase Phoenix integration
Dataset API
Power of Decoder
Serialization concept in Dataset
DAG scheduler
Memory management in Spark
Web UI & debugging
Spark streaming Architecture
DStreams & micro batching
Batch vs Streaming
Kafka introduction
How Kafka works
Spark & Kafka integration
Spark Kafka NiFi integration
Spark Structured Streaming introduction
Spark Structured Streaming with Kafka
Optional Training
Flink introduction
Flink table API
Flink streaming
Spark Overview Training Curriculum - Confidential
Cloudera certification and AWS certification tips
How to practice Hortonworks, Cloudera, and Databricks

Commercials

Batch	Start Date	End Date	Timings	Batch Type
Batch 1	04-01-2021	05-03-2021	Mon-Fri 11:00 AM-1:00 PM	Weekday
Batch 2	01-02-2021	02-04-2021	Mon-Fri 1:00 PM-3:00 PM	Weekday
Batch 3	01-03-2021	30-04-2021	Mon-Fri 8:00 AM-10:00 AM	Weekday

Spark using AWS

Course Provider

SAW Freelance Trainer
