MTECH_Handbook

Master of Technology
Data Science and Machine Learning
(Weekend Classroom Mode)
*Syllabus is subject to revision

Table of Contents
Program Overview
Program Educational Objectives
Program Outcomes
Syllabus Structure
UE20CS901 Python for Data Science (2-0-0-4-3)
UE20CS902 Statistical Methods for Decision Making (2-0-0-4-3)
UE20CS903 Databases and SQL (2-0-0-4-3)
UE20CS904 Mathematical Foundation (1.5-0-0-2-2)
UE20CS905 Machine Learning - I (2-0-0-4-3)
UE20CS906 Data Visualization using Tableau (1.5-0-0-2-2)
UE20CS907 Structuring Visualization and Analytical Problems (SVAP) (0.5-0-0-2-1)
UE20CS931 Machine Learning - II (2-0-0-4-3)
UE20CS932 Machine Learning - III (1.5-0-0-2-2)
UE20CS933 Natural Language Processing (1.5-0-0-2-2)
UE20CS934 Time series Forecasting (1.5-0-0-2-2)
UE20CS935 Introduction to Deep Learning and Applications (3-0-0-4-4)
UE20CS936 Introduction to Big Data (2-0-0-4-3)
UE20CS937 Business Analysis and Communication (0.5-0-0-2-1)
UE20CS960 Capstone Project
UE20CS970 MTech Project
Assessment Structure- ISA/ ESA
Teaching Methodology
Pedagogy
Program Support
Technology enabled Learning
Continuous monitoring and Evaluation
1. Program Overview
The Master of Technology (M.Tech) in Data Science and Machine Learning is a 2.5-year program
offered in Weekend Classroom Mode. It comprises exclusive weekend classroom teaching, a group
capstone project and an individual project.
The program is primarily for working professionals, so classes run for 8 hours on each weekend day
(Saturday and Sunday). Classes are conducted on alternate weeks so that students have adequate
time for assessments and mini projects. Academic counselors also hold learning intervention
sessions based on the needs of each batch.
● Classes on Saturday & Sunday of alternate weeks, 8 hours per day (32 hrs/month)
● Self-study via online content: 3-4 hours of online content released every week, with
submissions of daily assignments, mini-projects and reports
● Online industry sessions and faculty sessions conducted from time to time
2. Program Educational Objectives

The program is aimed at equipping students with advanced knowledge and understanding of data
science and engineering tools & techniques widely sought after by organizations across different
industries.
The Program Educational Objectives are to:
● Train and prepare students to be Data Science and Engineering professionals with sound
fundamentals in mathematics and programming, fostering innovative skills and
strategies to help solve problems of industry and society
● Prepare students to be professionals with hands-on design and development skills on data-driven
projects
● Develop an in-depth understanding of the complete lifecycle of Artificial Intelligence projects,
including model deployment strategies
● Teach Big Data architectures and real-time data processing
● Enable students to present themselves as ideal candidates for data analyst, data engineer and
data scientist roles within leading analytics companies
● Prepare students to secure valuable placements in global industries and be competent both as
employees and entrepreneurs
3. Program Outcomes
● Critical Thinking : Leverage the latest tools and techniques in the Data Science
field. Be able to structure business problems in machine learning and data science frameworks
● Major Technology and Tools : Be well versed in analytics tools and technologies such as
Python, SQL, Scikit-learn, TensorFlow, Tableau, Hadoop, Kafka, MongoDB, Spark, Databricks
● Machine Learning : Work on real-world problems using industry-grade machine
learning techniques such as Predictive Modelling, Forecasting, Recommender Systems
● Deep Learning and Natural Language Processing : Become proficient in
Computer Vision, Convolutional Neural Networks, Transfer learning, GANs, LSTMs, Text
classification and analytics, Information retrieval, Sentiment analysis
● Big Data Processing : Learn the concepts of unstructured data management and real-time data
processing techniques, and build industry-grade Extract-Transform-Load data pipelines
● Research Skills : Build research skills through in-depth study and literature
review of real-world problems, and learn the ethics behind writing
● Presentation and Communication Skills : Learn the best practices of project management and
presentation skills, the ethics behind presentation, and the do's and don'ts of communication
● Project Management : Develop industrial acumen by learning the best practices of data
science and engineering project management
4. Syllabus Structure
UE20CS901 Python for Data Science (2-0-0-4-3)

Course Objectives
● To introduce the students to Python programming
● To introduce the students to various programming practices
● To introduce the students to Python libraries for data science
● To expose the students to basic data visualization using Python
Course Outcome
The students by the end of the course will be able to:
● Write Python programs for data science and engineering
● Perform basic operations related to data analysis
● Work with advanced Python functions and learn data visualization
Course Content
1. Python Overview : Advantages and Disadvantages of Python, Basics of Python, Installation
and IDE Overview, Programming Basics
2. Data types : String, float, numbers, List, Tuples, Sets & Dictionaries
3. Conditional execution : The Simple if Statement, The if/else, Nested Conditionals, Conditional
Expressions, Errors in Conditional Statements, The while Statement, Loops, for Statement,
Nested Loops
4. User defined functions : Function Basics, Writing Functions, Using Functions, Main Function,
Variable scope, Lambda and recursive function
5. Working with Numpy : Create array, attributes, indexing and slicing, iterating through array
6. Working with Pandas: Data frame introduction, Components of Pandas, Arithmetic operations,
Data frame creation
7. Advanced functions : Concatenating, Join and merge, Pivot tables
8. Visualization : Python Libraries, Plotly, Seaborn & Matplotlib packages, Distribution plots,
Scatter plots, Heatmaps
Pre-requisite Courses : None
Reference books
1. Grus, J. (2019). Data Science from Scratch, 2nd Edition. O'Reilly Media, Inc.
2. McKinney, W. (2018). Python for data analysis: Data wrangling with pandas, NumPy, and
IPython.
3. Vanderplas, J. T. (2017). Python data science handbook: Essential tools for working with data.
UE20CS902 Statistical Methods for Decision Making (2-0-0-4-3)
Course Objectives
● To introduce the students to the basics of statistics
● To introduce the students to probability and hypothesis testing
● To enable the students to perform statistical inferential analysis on data sets
● To expose the students to various statistical techniques
Course Outcome
● Understand probability distributions and key terminologies
● Build hypothesis tests for various real-world scenarios
● Select the appropriate statistical test for various data sets
● Present the statistical inference
Course content:
1. Introduction to Statistics : Types of Statistics - Descriptive and Inferential Statistics, Data
Sources and Types of Datasets, Measures of Central Tendency, Measures of Dispersion -
Range, IQR, Standard Deviation, Coefficient of variation, Five number summary, Visualization
plots and Correlation analysis
2. Probability and Bayes' Theorem : Probability and Distributions, Probability - Meaning and
concepts, Marginal Probability, Bayes' theorem
3. Probability Distributions and Sampling : Introduction to Probability distributions, Discrete and
continuous distributions, Introduction to the Normal distribution, Sampling distribution and
population parameters, Comparing sample and population - The concept of testing hypotheses,
Sampling error, Defining a confidence interval
4. Hypothesis Testing : Introduction to hypothesis testing, Hypothesis testing applications, One
sample & two sample hypothesis testing for large samples (Z test), Hypothesis testing for
small samples - the t distribution, Two sample t-test - unpaired and paired t-test, Hands-on
with any ML data set
5. ANOVA & Chi Square : Intro to the Chi Square test. Tests of independence. Goodness of fit.
Introduction to ANOVA. Analysing variance between and within groups. Simple one-way
ANOVA. Post hoc analysis
6. Simulation Techniques : Uncertainty and the Monte Carlo Simulation, Random Variables: The
Key to the Monte Carlo Simulation, Monty Hall problem, Stock market portfolio optimisation
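As an illustration of the Monte Carlo and Monty Hall topics listed above, the following is a minimal sketch rather than course material. It estimates the win rate of the "stay" and "switch" strategies by repeated random play. The function names, the trial count and the fixed seed are assumptions chosen for this example.

```python
import random


def play_monty_hall(switch, rng):
    """Play one round and return True if the contestant wins the car."""
    doors = [0, 1, 2]
    car = rng.choice(doors)
    pick = rng.choice(doors)
    # The host opens a door that is neither the contestant's pick nor the car.
    opened = rng.choice([d for d in doors if d != pick and d != car])
    if switch:
        # Move to the only door that is still closed and not yet picked.
        pick = next(d for d in doors if d != pick and d != opened)
    return pick == car


def estimate_win_rate(switch, trials=20_000, seed=0):
    """Monte Carlo estimate: the fraction of simulated rounds that are won."""
    rng = random.Random(seed)
    wins = sum(play_monty_hall(switch, rng) for _ in range(trials))
    return wins / trials


if __name__ == "__main__":
    print("stay  :", estimate_win_rate(switch=False))
    print("switch:", estimate_win_rate(switch=True))
```

With enough trials the estimates should settle near the theoretical values of 1/3 for staying and 2/3 for switching.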
Pre-requisite Courses : None.
Reference books
1. Kathryn A. Szabat, David M. Levine, P. K. Viswanathan, David Stephan (2017). Business Statistics:
A First Course, 7th Edition
2. Haslwanter, T. (2018). Introduction to Statistics with Python: With
applications in the life sciences. Springer.
3. Downey, A. (2012). Think Bayes: Bayesian statistics made simple. Needham,
Massachusetts: Green Tea Press.
UE20CS903 Databases and SQL (2-0-0-4-3)
Course Objectives
● To provide a comprehensive introduction to SQL from several perspectives
● To introduce the methods for designing a database and learn various query methods.
Course Outcome
● Write SQL queries for data warehousing and analytics
● Navigate the MySQL Workbench environment
● Perform basic operations related to databases and tables
● Apply data modeling concepts and their applications in design
● Construct a typical enterprise database
Course Content
1. Introduction to DBMS : What is a Database, RDBMS Basics, SQL Commands: DDL, DML, DQL, Key
Attributes: Primary, Foreign, Candidate, Data Types, SELECT with predicates: Where clause,
Comparison Operator, Missing data. More operations on DDL.
2. Constraints and Functions: Constraints with Alter, Check, Default, Not Null, Unique, Where
clause predicates: Pattern Matching, Between, IN, Not Null, Set Operations: Unique, Intersect,
Minus, Duplicates, Operator Precedence, SQL Built-in functions: String, Numeric, Date, Bin,
Cast, Coalesce.
3. Aggregate Functions: Count, Sum, Average, Min, Max, Aggregate with Group by clause:
Multiple grouping columns, Null Values , Aggregate with having clause, Having without Group
by.
4. Multiple Table queries: Introduction to Joins, Introduction to ER Diagram, Types of Joins:
Simple, Natural, Equi, Non Equi, Self , Left, Right, Inner and Outer joins.
5. Subqueries & Query Expressions : Introduction to subquery, Subqueries: types and its
Properties, benefits, subqueries in where clause, subqueries using: where clause, outer
reference, subquery search condition: comparison test, Membership test, Existence test,
quantified test, from clause, Nested sub queries: various test using subqueries
6. Subqueries with Joins and Advanced Aggregate Functions: Subqueries with joins: Row value
expressions, query expressions, Window functions: Rank(), Dense_Rank(), Percent_Rank(), Lead(),
Lag(), First_value(), Last_value(), NTile(), Cume_Dist(), window functions with aggregate functions,
Recursive query expressions
7. Data Integrity : What is Data Integrity, Classification: Row level, Column level, Referential, What
are ACID properties, Normalisation, Constraints on single relation, Column check constraints,
Referential Integrity problem, Delete and update rules, Cascading, Transaction processing
8. Transaction Processing & Views: What is a Transaction, Transaction Model, Isolation:
Classification of Isolation, Save point, Release point, Commit and Rollback, Locking: Different
levels of locking, Introduction to views, Types of Views: Horizontal, Vertical, Grouped views,
Joined views, Materialised views
Pre-requisite Courses : None

Reference books
1. James, J. (2018). Advanced applied SQL for business intelligence and analytics.
2. Nield, T. (2019). SQL for Analytics. Safari, an O'Reilly Media Company.
3. Sullivan, D. (2017). Advanced SQL for Data Scientists. linkedin.com.
UE20CS904 Mathematical Foundation (1.5-0-0-2-2)
Course Objectives
● To provide a comprehensive introduction to the mathematical foundations of machine learning
● To introduce the basic concepts of linear algebra, functions and optimization techniques.
Course Outcome
● Have a solid foundation in the basic mathematical concepts useful for machine learning
● Learn the basics of optimization techniques
● Solve problems and visualize fundamental principles.
Course Content
1. Vectors and Matrices: Vector operations and transformations, Matrix operations and
transformations, Eigenvalues and eigenvectors, Eigenbasis and transformations.
2. Linear Algebra for ML : Mathematical modelling of a Machine Learning problem, Linear
Regression Overview, ML problem formulation, Explanation through an example, Image
Pre-processing
3. PCA and SVD : Concept of decomposition, Dimensionality reduction, Principal components,
eigenvectors and singular vectors, eigenvalues and singular values. Applications of PCA and SVD
4. Optimization methods : Introduction to Optimization - Function Differentiation and
Integration, Rules of Derivatives - Maxima & Minima, First order optimization algorithms,
Gradient Descent and Stochastic Gradient Descent
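To make the last topic concrete, here is a small sketch of gradient descent and stochastic gradient descent in plain Python. It is only an illustration under stated assumptions: the function names, the learning rates, the toy objective f(w) = (w - 3)^2 and the noise-free data from y = 2x are all chosen for this example and do not come from the syllabus.

```python
import random


def gradient_descent(grad, w0, lr=0.1, steps=100):
    """Repeatedly step against the gradient: w <- w - lr * grad(w)."""
    w = w0
    for _ in range(steps):
        w -= lr * grad(w)
    return w


def sgd_slope(xs, ys, lr=0.01, epochs=200, seed=0):
    """Fit y ~ w * x by stochastic gradient descent on squared error,
    updating w from one sample at a time in shuffled order."""
    rng = random.Random(seed)
    data = list(zip(xs, ys))
    w = 0.0
    for _ in range(epochs):
        rng.shuffle(data)
        for x, y in data:
            # d/dw (w*x - y)^2 = 2 * (w*x - y) * x
            w -= lr * 2 * (w * x - y) * x
    return w


# f(w) = (w - 3)^2 has derivative 2 * (w - 3) and its minimum at w = 3.
w_min = gradient_descent(lambda w: 2 * (w - 3), w0=0.0)

# Toy data generated from y = 2x, so the best-fitting slope is 2.
w_fit = sgd_slope([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0])

print(w_min, w_fit)
```

The only difference between the two routines is where the gradient comes from: the full objective in the first case, a single sample's error in the second.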
Pre-requisite Courses : Elementary matrix properties, Basic calculus
Reference books
1. Deisenroth, M., Faisal, A., & Ong, C. (2020). Mathematics for Machine Learning. Cambridge:
Cambridge University Press. doi:10.1017/9781108679930
2. G. Strang (2016). Introduction to Linear Algebra, Wellesley-Cambridge Press, Fifth edition, USA.
3. David G. Luenberger (1969). Optimization by Vector Space Methods, John Wiley & Sons (NY)
UE20CS905 Machine Learning- I (2-0-0-4-3)
Course Objectives
● Offer an in-depth overview of machine learning regression topics, including working with
real-world data
● Develop algorithms using supervised learning methods
● Learn Python libraries for exploratory data analysis and regression algorithms for machine
learning
Course Outcome
● Explore the data using various Python libraries.
● Understand Machine Learning and its applications.
● Build regression models and analyze model metrics/performance.
● Learn model deployment strategies and create an API endpoint.
Course Content
1. Machine learning overview : Concept of ML, Types of ML, Applications of ML
2. Data processing : Missing Value Treatment, Outlier Treatment, Scaling, Encoding and
Transformation, Feature Engineering, Train-Test Split
3. Regression Analysis : Bivariate and Multiple Linear Regression, Ordinary Least Squares
Method, Measures of Variation, Inferences about slope, Assumptions of Linear Regression,
Model Evaluation Metrics, Model Performance Evaluation, Optimization Algorithm – SGD.
4. Feature Selection and Regularization: Backward Elimination, Forward Selection, Recursive
Feature Elimination, Under Fitting Vs Over-Fitting, RIDGE, LASSO and ELASTIC NET
Regularization.
5. Model Deployment : End to End case study, Model serialization, Introduction to Flask and
model deployment
Pre-requisite Courses : Python for data science, Statistics and Mathematical foundation.
Reference books
1. Mitchell, T. M. (2017). Machine learning. New York: McGraw Hill.
2. Geron, A. (2017). Hands-on Machine Learning with Scikit-Learn & TensorFlow: Concepts, Tools,
and Techniques to build Intelligent Systems.
3. Hastie, T., Friedman, J., & Tibshirani, R. (2018). The Elements of statistical learning: Data
mining, inference, and prediction. New York: Springer.
UE20CS906 Data Visualization using Tableau (1.5-0-0-2-2)
Course Objectives
● To prepare students to use various tools for data visualization
● To teach best practices in data visualization so that students become expert in visualizing
knowledge from data.
Course Outcome
● Use business intelligence (BI) and analytics software like Tableau and PowerBI
● Explore the user-friendly drag-and-drop functionality to visualize data.
● Build impactful data dashboards and present the data.
Course Content
1. Introduction to Tableau: Line charts and trends, Cascading and filters, A custom action in
dashboard
2. KPI chart: Bubble, dual-axis charts, Creating a storyboard
3. Statistical analysis: Pareto charts, Custom graphs, Maps, RFM analysis, Data interpreter and
transformation
Pre-requisite Courses: None
Reference books
1. Ryan Sleeper (2018). Practical Tableau: 100 Tips, Tutorials, and Strategies from a Tableau Zen
Master
2. Rajpurohit, M. (2018). Hands-on data visualization with Microsoft Power BI.
UE20CS907 Structuring Visualization and Analytical Problems (SVAP) (0.5-0-0-2-1)
Course Objectives
● To prepare students to critically analyze the data using visual representation methods and
analytical techniques.
● The course will teach best design practices for data visualization.
Course Outcome

● Critically infer data using various analytical frameworks
● Become data storytellers using visualization principles
Course Content
1. Data visualization principles : Problem structuring, Root cause analysis, Cause and effect
modelling, Storytelling arc
2. Analytical frameworks : BCG Matrix, Porter's Five Forces model, SWOT analysis, Ansoff Growth
Matrix, Business case study
Pre-requisite Courses: None
Reference books
1. Business Information Visualization by Tegarden, D. P. Communications of the AIS, 1(4): 1-38. 1999.
2. Visual Representation: Implications for Decision Making by Lurie, N.H. and C.H. Mason. Journal
of Marketing, 71(1): 160-177. 2007
UE20CS931 Machine Learning - II (2-0-0-4-3)
Course Objectives
● Offer an in-depth overview of machine learning classification topics, including working with
real-world data
● Develop algorithms using supervised learning methods
● Learn Python libraries for classification models and algorithms for machine learning
Course Outcome

● Explore the data using various python libraries.
● Understand Machine Learning and its applications.
● Build classification models and analyze model metrics/performance.
Course Content
1. Supervised Learning - Classification - Logistic Regression : Introduction to the classification
model, Binomial Logistic Regression, Assumptions of Logistic Regression, Significance of
Coefficients, Model Evaluation Metrics, Model Performance Measures, Imbalanced Data
2. K - Nearest Neighbours and Naïve Bayes : Fitting the K-NN algorithm to the Training set, Test
accuracy of the result, Visualizing the result, Proximity Measures, Visiting Bayesian Basics,
Naïve Bayes for classification
3. Decision trees : Understanding Terminologies, Decision Trees for Classification, The measure of
Purity of Node, Construction of Decision Tree , Decision Tree Algorithms, Model Evaluation,
Model Performance Measures , Overfitting in a Decision Tree
4. Ensemble Learning: Random Forest Classifier, Feature Importance, Bagging
5. Boosting Algorithms : Ada Boost, Gradient Boosting, AdaBoost Vs Gradient Boosting, XGBoost
6. Machine Learning Case study : Hands-on with classification dataset, Build Model and Evaluate
the test score, Create a deployable model
Reference books
1. Müller, A. C., Guido, S., & O'Reilly Media. (2018). Introduction to machine learning with Python:
A guide for data scientists. Sebastopol: O'Reilly Media
2. Brunton, S. L., & Kutz, J. N. (2019). Data-driven science and engineering: Machine learning,
dynamical systems, and control.
3. Shan, C., Wang, H. H., Chen, W., Song, M., & Klamka, J. (2015). The data science handbook:
Advice and insights from 25 amazing data scientists.
UE20CS932 Machine Learning - III (1.5-0-0-2-2)
Course Objectives
● Offer an in-depth overview of unsupervised machine learning topics, including working with
real-world data
● Formulate a well-defined unsupervised machine learning problem with clear metrics
● Become familiar with clustering techniques, dimensionality reduction and recommendation systems
Course Outcome
● Understand business applications for unsupervised learning. Learn to cluster and classify data.
● Extract rules and associations and provide impactful recommendations from data.
● Decide on the data that matters for the learning problem at hand.
Course Content
1. Overview of Unsupervised learning : Concept of clustering, Applications of clustering, Types of
Unsupervised models
2. Clustering algorithms : Measures of distance - Euclidean, Manhattan and Minkowski distance,
K-means clustering, Hierarchical clustering, Density-based clustering, Clustering metrics - WCSS,
Silhouette score
3. Dimensionality Reduction methods: Principal Component Analysis, Singular Value
Decomposition, LDA
4. Recommender Systems : Types of Recommender Systems, Popularity based recommendation,
Content Based Filtering, Collaborative based recommendation engine - Evaluating
Recommender and model refinement, Market basket analysis
5. Hands-on case study: Case study on clustering and recommendation systems
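The three distance measures named in the clustering topic are closely related: the Minkowski distance of order p reduces to the Manhattan distance at p = 1 and to the Euclidean distance at p = 2. The sketch below shows this in plain Python. The function name and the sample points are assumptions made for illustration.

```python
def minkowski(a, b, p):
    """Minkowski distance of order p between two equal-length points."""
    if len(a) != len(b):
        raise ValueError("points must have the same number of coordinates")
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1 / p)


a, b = (0.0, 0.0), (3.0, 4.0)

manhattan = minkowski(a, b, p=1)  # |3| + |4|
euclidean = minkowski(a, b, p=2)  # sqrt(3^2 + 4^2)

print(manhattan, euclidean)
```

Larger values of p give more weight to the coordinate with the biggest difference, which changes which points a clustering algorithm treats as neighbours.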
Reference books
1. Müller, A. C., Guido, S., & O'Reilly Media. (2018). Introduction to machine learning with Python:
A guide for data scientists. Sebastopol: O'Reilly Media
2. Brunton, S. L., & Kutz, J. N. (2019). Data-driven science and engineering: Machine learning,
dynamical systems, and control.
3. Shan, C., Wang, H. H., Chen, W., Song, M., & Klamka, J. (2015). The data science handbook:
Advice and insights from 25 amazing data scientists.
UE20CS933 Natural Language Processing (1.5-0-0-2-2)
Course Objectives
● Offer an in-depth overview of text analytics topics, including working with real-world data
● Expose students to Information Extraction problems and end-to-end Natural Language
Generation problems as applications.
● Introduce students to the various Neural Network methods for Natural Language Processing.
Course Outcome
● Perform text analysis involving classification and clustering.
● Gain technical insight into state-of-the-art frameworks in Natural Language Processing
● Understand business applications for Natural Language Processing
● Implement meaningful course or research projects using current Natural Language Processing
technology.
Course Content
1. Introduction to Text analytics and Text classification: Text analytics framework, Lexical and
syntactic processing, Word Cloud, Stemming, Stop words List, N-gram, Bag of Words, Part-of-
Speech tagging (POS), Named Entity Recognition (NER), TF-IDF, Parsing, Word2vec,
Vectorization
2. Sequential Models : Recurrent Neural Network and LSTM based text analytics model
3. Transformer based Models: Encoder – Decoder, Attention model, BERT, Neural Machine
Translation
4. NLP Hands-on case studies : End to end practical case studies using the above concepts
Reference books
1. Mugan, J. (2017). Natural language text processing with Python: Hands-on NLP in Python using
NLTK, spaCy, gensim, and scikit-learn.
2. Loonycorn (Firm) & Packt Publishing (2017). From 0 to 1: Machine learning, NLP & Python:
cut to the chase.
UE20CS934 Time series Forecasting (1.5-0-0-2-2)
Course Objectives
● To provide a sound theoretical understanding of, and empirical grounding in, the
fundamentals of time series forecasting
● To build time series models and use them for prediction
Course Outcome
● Develop expertise in deterministic, forecasting and classification models using
large cross-section and time series data.
● Differentiate between the mathematical models to be used for various scenarios
● Understand business applications for time series forecasting
Course Content
1. Time Series and Forecasting: Introduction to Time Series, Data and Time Series Analysis based
Forecasting, Exponential smoothing, and Forecasting Accuracy
2. Fundamental concepts : Components of Time Series, Stationarity, Auto-correlation, Partial
Autocorrelation
3. Time Series Econometrics: Key concepts - Ergodic and stochastic processes, Stationary
processes, Non-stationary processes, Unit root, Tests of stationarity, Autocorrelation function
4. Various models in Time series forecasting : Auto Regressive models , Naive Model, Moving
Average models, Auto Regressive Moving Average Model, ARIMA, SARIMA
5. Forecasting and Decision Analysis: Components of a Time Series, Moving averages and
exponential smoothing, trend projection, seasonality and trend
6. Hands-on business case study : Understand Predictive maintenance and energy consumption
business use cases. Build models for Machine monitoring and Energy analytics.
Reference books
1. Pal, A. (2017). Practical time series analysis: Master time series data processing, visualization,
and modeling using Python.
2. Yang, K. (2020). Time Series Analysis with Python 3.x. Safari, an O'Reilly Media Company.
UE20CS935 Introduction to Deep Learning and Applications (3-0-0-4-4)
Course Objectives
● Offer an in-depth overview of deep learning topics, including working with real-world data
● Introduce students to Deep Learning techniques such as CNNs, various CNN models and image
segmentation
● Introduce the technology behind video analytics, image recommendation, self-driving
cars, chatbots, visual intelligence and many more
● Implement Machine Learning techniques with TensorFlow and Keras.
● Learn how to work with different deep learning algorithms for image classification, image
segmentation and object detection
● Understand the basics of Transfer learning and Generative Adversarial Networks
● Understand business applications for deep learning and deploy deep learning models
Course Content
1. Introduction to Neural Network and Tensorflow: Building blocks of NN, Activation functions,
Loss functions, Learning rate, Single layer and multi-layer perceptron.
2. Deep Neural Network Modelling and Tuning : Layers of Deep Neural networks, Building DNN
models for tabular and image data, tuning the model for better performance , Regularization.
3. Convolution Neural Network : Introduction to Image processing, Architecture of CNNs, Filters,
Feature Maps, Max-Pool Layers, Other Pooling Types, Model compilation, Case Study: Image
Classification Using CNN – Hands-On
4. Transfer learning : Introduction to Transfer learning, Prediction using pretrained ConvNet
models like MobileNet, ResNet etc., Tuning the transfer models
5. Object Detection and Segmentation: Object localization as a regression problem, Region-based
object detection, Object detection using the TensorFlow API and YOLO, Object segmentation,
Encoder-Decoder, U-Net.
6. Siamese Network : Twin network for image matching, Concept of image pairs, Loss function
and metrics.
7. Deep learning Case study : Build a complete AI project for use cases in Healthcare, Biometrics,
Face recognition etc.
Pre-requisite Courses : Python for data science, Statistics, Mathematical foundation, Machine
Learning- I
Reference books
1. Chollet, F. (2019). Deep learning with Python.

2. Vasilev, I., Slater, D., Spacagna, G., Roelants, P., & Zocca, V. (2019). Python Deep Learning:
Exploring Deep Learning Techniques and Neural Network Architectures with Pytorch, Keras,
and TensorFlow, 2nd Edition. Birmingham: Packt Publishing Ltd.
3. Bakker, I. (2018). Python deep learning solutions.
UE20CS936 Introduction to Big Data (2-0-0-4-3)
Course Objectives
● Offer an in-depth overview of big data handling topics, including working with real-world data
● Use tools and techniques to analyze a large data corpus
● Understand database architecture and differentiate unstructured data from structured data
● Learn to build industry-grade data processing pipelines
● Explain trade-offs in the design and analysis of big data processing techniques
● Understand business applications for big data analytics
● Learn and differentiate the various architectures and tools used in big data processing
applications.
● Communicate the design through a presentation and build a prototype to showcase the design.
● Learn to build industry-grade ETL or ELT data processing architectures and develop an
application for a real-life case study.
Course Content
1. Big data programming models: Introduction to Big Data, HDFS, MapReduce programming
model, YARN architecture, Kafka for streaming
2. Big Data Algorithms and tradeoffs: Relational operators, Matrix multiplication, Computational
complexity of Hadoop, Introduction to HIVE and HBase and their programming model, Sqoop
and Flume.
3. NoSQL : Types of NoSQL databases, CAP Theorem, ACID vs BASE,
Cassandra, MongoDB, Learn PyMongo (Python driver)
4. Introduction to Spark : Learn Spark framework and architecture, Learn Pyspark (Python driver),
Use Azure Databricks to build end-to-end data models
5. Building ETL/ELT Pipelines : Learn data ingestion and data processing pipeline, Build Industry
grade data pipelines for data streaming and real time analytics.
Pre-requisite Courses : Databases and SQL
Reference books
1. Simon, P., & Dexter, S. (2018). Too big to ignore: The business case for big data.
2. Baesens, B. (2014). Analytics in a big data world: The essential guide to data science and its
applications.
3. Manoochehri, M. (2014). Data just right: Introduction to large-scale data & analytics.
UE20CS937 Business Analysis and Communication (0.5-0-0-2-1)
Course Objectives
● To prepare students to understand the need for business communication and project
documentation.
● To strategically assess how data science and engineering can enable process transformation and
business value in an organization.
● To teach the best documentation practices for handling the complete life cycle of data-driven
projects.
Course Outcome
● Understand the role played by data scientists and engineers in an organisation.
● Critically build documentation for the project and follow best writing practices
● Evaluate how data engineering, Artificial Intelligence and Big Data can deliver business agility.
● Be effective in analysing and presenting the project case study
Course Content
1. Life cycle of data driven projects : Need of project design, Important concepts relating to
project design and methods, Selection of technical tools, Cost analysis and ROI calculation
2. Business writing - Report writing for any data analytics problem- Problem explanation, Data
exploration, Modelling insights, Providing Recommendations, Effective Technical writing-
Formal research proposal, Introduction to IP designs
3. Research methodology : Thesis writing, best practices in writing journals and conference
papers, understanding the ethics behind research writing
Pre-requisite Courses : Structuring Visualization and Analytical Problems
Reference books
1. Kothari, C. R. (2004). Research Methodology: Methods & Techniques, Second Edition. New Age
International (P) Limited
2. Kirill Dubovikov (2019). Managing Data Science. Packt Publishing
3. Ivan Valiela (2009). Doing Science: Design, Analysis and Communication of Scientific Research.
Oxford University Press
6. Assessment Structure- ISA/ ESA
Internal assessment carries 60% weightage, spread across quizzes, assessments,
assignments and projects. The End Semester Assessment carries the remaining 40%. The first-year
program evaluation combines attendance/class participation, quizzes,
assessments, projects and semester exams. The evaluation scheme is as below:
Assessment                   Particulars            Percentage Split
Internal Assessment (60%)    Class participation    10%
                             Quiz                   20%
                             Project                20%
                             Assessment/ Project    10%
Semester (40%)               Semester Exam          40%
Total                                               100%
Class Participation: Marking scheme shall be as per attendance and participation of students
in lecture hours
Quiz: Quiz on the topics handled during the week
Assessment: The assessments shall be conducted in class / proctored environment
Project: The projects have to be completed by students as take home assignments and
submitted for evaluation.
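The split above can be read as a weighted sum. The sketch below is a hypothetical illustration, not the official grading procedure. It assumes each component is marked as a percentage (0-100) and combined linearly using the stated weights. The component names and the sample marks are invented for the example.

```python
# Weights taken from the evaluation scheme: 60% internal, 40% semester exam.
WEIGHTS = {
    "class_participation": 0.10,
    "quiz": 0.20,
    "project": 0.20,
    "assessment_or_project": 0.10,
    "semester_exam": 0.40,
}


def final_score(marks):
    """Combine per-component percentages (0-100) into a final percentage.

    Assumes a simple linear weighting; the handbook does not spell out
    the exact computation.
    """
    missing = WEIGHTS.keys() - marks.keys()
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    return sum(WEIGHTS[name] * marks[name] for name in WEIGHTS)


# Hypothetical student marks, each out of 100.
example = {
    "class_participation": 90,
    "quiz": 70,
    "project": 80,
    "assessment_or_project": 60,
    "semester_exam": 75,
}

# 9 + 14 + 16 + 6 + 30 = 75
print(final_score(example))
```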
7. Teaching Methodology
7.1 Pedagogy
The program follows a pedagogy in which real-life examples and code implementations are built
alongside the concepts covered in the course. Every lecture session combines theory, an example
case study and implementation code. This gives students a holistic view and helps them link the
theory, use cases and coding/implementation aspects. Further, students are provided with
assignments specific to each lecture session, so that they can practice the coding elements in
parallel with the session.
The faculty for the program, comprising a mix of academicians and industry practitioners, places
strong emphasis on relating concepts to applications in various domains. While delivering a course,
they collect regular feedback with the help of the academic counsellor to check the understanding of
the batch. They also provide inputs on class participation, based on their observation of how each
student engages in class.
We also place strong emphasis on peer learning. Participants are encouraged to help each other with
the in-class assignments and to provide perspectives from their respective domains. There is also a
presentation session in the theory subjects, where students collaborate in teams and present their
work in the specific subject.
7.2 Program Support
There is a dedicated academic counsellor for each batch to observe participants' engagement in class
and with assignments. Counsellors and faculty mentor each student through the assignments so that
every student is on track on a weekly basis. Our learning management system (LMS) is also designed to
analyse each participant's activity data and highlight students who might be at risk, so that
personalized help can be provided to those learners. Each student's progress and feedback are
reviewed every week to enable early intervention and ensure that students' concerns are addressed.
7.3 Technology enabled Learning
a) Our LMS applies analytics to 360-degree evaluation data to highlight students who are at risk of
falling off the learning curve. With daily interventions and measurements, we understand how each
student is learning and provide personalized interventions to ensure that all students are on track.
b) Lecture sessions recorded: The classroom sessions are recorded and the recordings are made
available in the LMS to all students of the batch, so that they can revisit and revise the curriculum
covered in class. Although we encourage students to attend all sessions, the recordings also help
those who unavoidably miss a session to understand the concepts covered in it.
c) Video content for each course: For each session, there is pre-class and post-class video content
made available to each student. With basic topics covered in pre-class content, each participant is set
up well for the classroom session and hence, each classroom session is designed to maximize the
effectiveness of physical classroom interaction. Post-class content builds upon what’s covered in class
and gives additional context.
d) Detailed Track Record : For each student, we maintain a detailed record of attendance, internal
assessments, program requests and overall progress, and this record is monitored regularly. This
provides a reliable spot-check for each participant.
e) Program requests: Students are encouraged to clarify their doubts with the faculty and academic
counsellors while in class. Additionally, if they have further queries or doubts while studying on
their own, they can raise requests through the LMS, which are addressed through the ticketing system.
f) Feedback: We collate feedback after each session and at regular intervals of the program through the
LMS. This gives us a sense of how effective the faculty was in delivering the concepts. By keeping track
of the feedback, we address the concerns of the students and also take proactive action on the way
forward.
7.4 Continuous monitoring and Evaluation
There is a continuous 360-degree assessment process measuring day-to-day comprehension as well as
medium and long term learning goals. Assessments focus on ensuring deep theoretical understanding
as well as application of concepts in real-world scenarios. In addition to measurement, the assessments
are designed to understand the unique learning curve of each participant.
Assessments consist of quizzes and in-class participation on a daily basis to measure comprehension,
assignments and case studies/projects on a weekly basis, periodic hackathons and semester exams.
Further, the project work, comprising the M.Tech thesis and a capstone project, is designed for
participants to pick meaningful, live industry problems and solve them by applying their learning
from the program. We believe that each learner has a unique learning curve, and the measures are
designed to understand where each participant stands in their learning journey.
In addition to the above, classroom recordings, faculty learning aids, additional video content and
practice exercises are provided for each course.
