Data Science with R and Python
- Statistical Learning
- Measures of central tendency, Measures of dispersion, Probability theory, Hypothesis testing, ANOVA, Types of graphs and plots.
- Python Environment Setup and Essentials
- Hands-on Exercise – Installing the Anaconda Python distribution on Windows, Linux, and Mac.
- R Environment Setup and Essentials
- Hands-on Exercise – Installing R on Windows, Linux, and Mac, Exploratory data analysis, Basic operators in R, Data manipulation, Data visualisation.
- Python language Basic Constructs
- OOP concepts in Python
- Hands-on Exercise – Working with important OOP concepts such as polymorphism, inheritance, and encapsulation, Python functions, return types, and parameters, Lambda expressions.
- NumPy for mathematical computing
- Hands-on Exercise – Importing the NumPy module, creating arrays with ndarray, calculating the standard deviation of an array of numbers, calculating the correlation between two variables.
- SciPy for scientific computing
- Hands-on Exercise – Importing SciPy, applying Bayes' theorem to a given dataset.
- Matplotlib for data visualization
- Hands-on Exercise – Using Matplotlib to create pie, scatter, line, and histogram plots.
- Pandas for data analysis and machine learning
- Hands-on Exercise – Importing data files, selecting records by group, applying filters, viewing records, analyzing with linear regression, and creating time series.
- Introduction to Machine Learning with R and Python
- The need for Machine Learning, Introduction to Machine Learning, types of Machine Learning (supervised, unsupervised, and reinforcement learning), why use Python and R for Machine Learning, and applications of Machine Learning.
- Supervised Learning and Linear Regression
- Hands-on Exercise – Implementing linear regression from scratch with R and Python, Using Python library Scikit-learn to perform simple linear regression and multiple linear regression, Implementing train–test split and predicting the values on the test set.
- Classification and Logistic Regression
- Hands-on Exercise – Implementing logistic regression from scratch with R and Python, Using Python library Scikit-learn to perform simple logistic regression and multiple logistic regression, Building a confusion matrix to find out the accuracy, true positive rate, and false-positive rate.
- Decision Tree and Random Forest
- Hands-on Exercise – Implementing a decision tree from scratch in R and Python, Using Python library Scikit-learn to build a decision tree and a random forest, Visualizing the tree and changing the hyperparameters in the random forest.
- Naïve Bayes and Support Vector Machine
- Hands-on Exercise – Using Python library Scikit-learn to build a Naïve Bayes classifier and a support vector classifier.
- Unsupervised Learning
- Hands-on Exercise – Using Python library Scikit-learn to implement K-means clustering, Implementing PCA (principal component analysis) on a dataset.
- Natural Language Processing and Text Mining
- Project
- Time Series Analysis
- Hands-on Exercise – Analyzing time series data (a sequence of measurements that follows a non-random order) to recognize the nature of the phenomenon, and forecasting future values in the series.
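
As a rough illustration of the forecasting step in the time series exercise above, the sketch below fits a straight-line trend to a short series by ordinary least squares and extrapolates it a few steps ahead. It uses only the Python standard library. The function names fit_linear_trend and forecast and the sample measurements are hypothetical, chosen for illustration and not taken from the course material. A linear trend is only one of the simplest possible models and ignores seasonality and noise structure.

```python
from statistics import mean


def fit_linear_trend(values):
    """Least-squares fit of values[t] ~ intercept + slope * t for t = 0, 1, 2, ..."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations to fit a trend")
    times = range(n)
    t_mean = mean(times)
    v_mean = mean(values)
    # slope = covariance(t, v) / variance(t); the 1/n factors cancel
    cov = sum((t - t_mean) * (v - v_mean) for t, v in zip(times, values))
    var = sum((t - t_mean) ** 2 for t in times)
    slope = cov / var
    intercept = v_mean - slope * t_mean
    return intercept, slope


def forecast(values, steps):
    """Extrapolate the fitted trend for the next `steps` time points."""
    intercept, slope = fit_linear_trend(values)
    n = len(values)
    return [intercept + slope * (n + k) for k in range(steps)]


# Hypothetical evenly spaced measurements (illustrative data only).
series = [10.0, 12.0, 14.0, 16.0, 18.0]
future = forecast(series, steps=3)
print(future)
```

Because the sample series rises by exactly 2 per step, the extrapolated values continue that pattern. On real data the fit would only approximate the trend, and the residuals would need to be inspected before trusting the forecast.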
Certification Project
- The learner will have to submit a certification project and will be awarded a certificate once this project is completed.
- Multiple Choice Questions & Answers:
- Learners will be asked multiple-choice questions during the training sessions, and points will be awarded.
- Scenario-based Questions & Answers:
- Learners will have to submit answers to scenario-based questions, and points will be awarded.
- Sample Project:
- A sample project will be discussed and shown to the learners to help them start working on a project.