Why Data Science?

The New York Times called it one of the sexiest careers, and numerous surveys point to both growing demand and an upward career trajectory for data scientists. Every industry needs data scientists to help answer its most challenging questions.

Job postings for data scientists have grown rapidly over the last five years, and businesses across sectors are looking to fill positions for data analysts, data scientists, and statistical analysts.

The gap between the demand for and the supply of data scientists keeps widening.

That’s where we step in, with a vision to create skilled, well-trained data scientists for the world!

About our Diploma in Data Science

  • Post Graduate Diploma in Data Science certification from Sankhya leading to global opportunities.
  • Course design based on 12+ years of industry experience, guided by highly skilled and experienced trainers.
  • Exposure to real-world analytics through business case examples from domains such as BFSI, FMCG, Retail, HR, Telecom, and Pharmaceuticals.
  • Hands-on work with statistical software such as R and Python.
  • A unique course that blends classroom sessions, self-learning, assignments, and project work, designed to train candidates to become complete data scientists!
  • A personal mentor is assigned to each student for assistance throughout the course.
  • The entire course will be available on a mobile app as a quick reference.

Eligibility Criteria

  • Must have studied Mathematics at 10+2 level (or during Graduation years)
  • Final Year Graduation/Graduate/Post Graduate/Ph.D. students from any background
  • Currently Pursuing Post Graduation/Ph.D. from any background
  • Working Professionals from any background

Course Duration: 6 Months (weekly session every Saturday)

This course trains candidates professionally, starting from the very basics of statistics and programming.
We provide placement assistance to all students completing the course successfully.




Program Content

Module 0: Introduction
  • Introduction to Data Science 
  • Business Analytics success stories
  • Exposure to tools like R, RStudio, MS Excel, MySQL 
  • Scope and opportunities
  • Fundamentals of Statistics
Module I: Exploratory Data Analysis
  • Critical for successful analytics implementation
  • Good data management helps to
    • Assess the quality of data
    • Improve the quality of data
    • Make data analysis-ready
  • Provides data insights
  • Guides the way toward solving business research problems using advanced analytics
  • Data Management
    • Import, sort, merge, aggregate, subset, derive
    • Introduction to MySQL
  • Descriptive Statistics
    • Central tendency
    • Variation
    • Shape
  • Visualization
    • Bar charts/Histograms
    • Box-and-whisker plot
    • Contour plot
    • Motion Chart
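
As a taste of the descriptive statistics topics in this module (central tendency, variation, shape), here is a minimal Python sketch. It is only an illustration, not course material: the `describe` helper and the sales figures are hypothetical, and skewness is computed with the simple moment-based formula.

```python
import statistics


def describe(values):
    """Return central tendency, variation, and shape measures for a sample."""
    n = len(values)
    mean = statistics.mean(values)
    median = statistics.median(values)
    stdev = statistics.stdev(values)  # sample standard deviation (n - 1 divisor)

    # Moment-based skewness: third central moment / (second central moment ** 1.5)
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    skewness = m3 / m2 ** 1.5

    return {"mean": mean, "median": median, "stdev": stdev, "skewness": skewness}


# Hypothetical monthly sales figures; one large value stretches the right tail
sales = [12, 15, 14, 13, 16, 15, 14, 40]
summary = describe(sales)
print(summary)
```

In this made-up sample the mean sits above the median and the skewness is positive, which is the usual signature of a right-skewed distribution.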
Module II: Statistical Inference
  • Powerful tool for testing a researcher’s claim in a planned experiment
  • Wide application in clinical, market and social research
  • Marketing campaigns can be designed and tested before full-fledged implementation
  • Distribution Theory & Hypothesis Testing
    • Discrete Distributions
    • Continuous Distributions
    • Parametric Tests
    • Non-parametric Tests
    • Analysis of Variance
    • Analysis of Covariance
Module III: Predictive Modeling - Fundamentals
  • Growing area in Risk Management & Marketing
  • Cross-selling/up-selling can be done scientifically
  • Financial institutions can predict “BAD” customers
  • Huge scope in e-commerce business
  • Basics of Modeling
    • Modeling framework
    • Best practices
  • Multiple Linear Regression
    • Mathematical model
    • Validating assumptions
    • Residual analysis
    • Multicollinearity problem
    • Out-of-sample validation
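
The idea behind out-of-sample validation can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the course's own material: it uses a single predictor instead of a full multiple regression, the data are made up, and the helpers `fit_line` and `rmse` are hypothetical names. The model is fitted on one part of the data and its error is then measured on observations it has never seen.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one predictor: returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def rmse(xs, ys, intercept, slope):
    """Root mean squared error of the fitted line on the given observations."""
    squared_errors = [(y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys)]
    return (sum(squared_errors) / len(squared_errors)) ** 0.5


# Hypothetical data: advertising spend (x) and sales (y)
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1, 18.0, 19.9]

# Fit on the first seven observations, hold out the last three.
# A fixed split keeps the sketch deterministic; random splits are more common.
train_x, test_x = x[:7], x[7:]
train_y, test_y = y[:7], y[7:]

intercept, slope = fit_line(train_x, train_y)
train_rmse = rmse(train_x, train_y, intercept, slope)
test_rmse = rmse(test_x, test_y, intercept, slope)

print(f"intercept={intercept:.3f}, slope={slope:.3f}")
print(f"train RMSE={train_rmse:.3f}, test RMSE={test_rmse:.3f}")
```

A test error much larger than the training error would suggest the model does not generalize well beyond the data it was fitted on.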
Module IV: Predictive Modeling - Advanced
  • Categorical response variables occur frequently in real-world scenarios
  • Most widely used class of predictive modeling
  • Response to offer – Yes/No
  • Brand preference – iPhone/Samsung/Sony
  • Categorical Response Variable
    • Binary Logistic Regression
    • Multinomial Logistic Regression
    • Ordinal Logistic Regression
    • Poisson Regression (modeling a count response variable)
    • Cox Regression
Module V: Time Series Analysis
  • Set of models for forecasting sales, financial indices, and economic indices
  • Inflation rate and GDP are predicted using time series modeling
  • Nifty/Sensex future values can be estimated
  • Complex financial models are developed using ARCH & GARCH
  • Time Series Modeling
    • AR Models
    • ARIMA
    • ARCH
    • GARCH
    • Time Series Regression
    • Exponential Smoothing
Module VI: Unsupervised Multivariate Methods
  • Provides exploratory segments of customers, stores & agents
  • PCA/Factor Analysis are powerful techniques for dimension reduction and scoring models
  • PCA is used to resolve the multicollinearity problem in regression models
  • Segmentation & Data Reduction
    • k-means clustering (algorithm and selection of best cluster solution)
    • Principal Component Analysis and Principal Component Regression
    • Factor Analysis
    • Multidimensional Scaling
Module VII: Data Mining
  • New generation algorithms
  • Multiple methods can be used to decide the best predictive model
  • Discovers hidden patterns that may not be revealed by classical methods
  • Machine Learning Algorithms
    • Naïve Bayes
    • Support Vector Machines
    • Decision Tree
    • Random Forest Algorithm
    • Neural Networks
    • Association Rules
Module VIII: Big Data Analytics
  • Volume & velocity of the data are enormous
  • Platform for analytics implementation in a cloud environment
  • Combines unstructured data with structured data
  • Big Data Analytics
    • Hadoop Introduction
    • R-Hadoop Integration
  • Sentiment Analysis of Facebook and Twitter Data (Text Mining)