BDBC Content

Executive M.Tech in Blockchain & Big Data

SEMESTER  COURSE NAME                               L-T-P-C

FIRST     Foundations of Computer Systems           3-0-0-6
          Algorithm Design & implementation + Lab   2-0-2-6
          Foundations of Blockchain                 3-0-0-6
          Database system & Design + Lab            2-0-2-6
          BigData framework                         2-0-2-6

SECOND    Blockchain components & architecture      3-0-0-6
          Data virtualization & dashboards          3-0-0-6
          Cryptocurrency & cyber security           3-0-0-6
          Elective 1                                3-0-0-6
          Elective 2                                3-0-0-6
          Capstone Project                          0-0-6-6

THIRD     Advanced statistical Methods              3-0-0-6
          Elective 3                                3-0-0-6
          Minor Project                             0-0-20-20

FOURTH    Major Project                             0-0-24-24

BLOCKCHAIN ELECTIVES

COURSE CODE  COURSE TITLE                                            CREDITS

             Blockchain technologies – platform & Applications       3-0-0-6
             Web development for blockchain applications             3-0-0-6
             Blockchain policy – Legal, social and economic impact   3-0-0-6
             Modern Cryptography                                     3-0-0-6
             Smart contracts and solidity programming                3-0-0-6

BIG DATA ELECTIVES

COURSE CODE  COURSE TITLE                                            CREDITS

             Security and privacy for big data                       3-0-0-6
             Time series analysis                                    3-0-0-6
             Data engineering                                        3-0-0-6
             Exploratory data analytics                              3-0-0-6


SEMESTER 1

Course No.: Name: Foundation of Computer Systems Credits: 3-0-0-6 Prerequisites: NIL

Course Objectives:
1. To provide an understanding of computer architecture, operating systems, and
computer networks
2. To develop skills in assembly language programming, control unit design, and
network configuration
3. To explore advanced concepts of distributed networked systems, such as
virtualization and fault tolerance

Course Outcomes:

1. Understand the components and functions of a computer system, including CPU,
memory, and I/O devices
2. Write and debug assembly language programs and design control units using
hardwired and microprogrammed methods
3. Understand the basics of computer networking and network protocols, and be able
to configure a simple network

Module 1: Computer Architecture (13 HOURS)

 Study of an existing CPU: architecture, instruction set, and addressing modes
 Control unit design: instruction interpretation, hardwired and microprogrammed
methods
 Pipelining and parallel processing, RISC, and CISC paradigms
 I/O transfer techniques: programmed, interrupt-driven, and DMA
 Memory organization: hierarchical memory systems, cache memories, cache
coherence, virtual memory

Module 2: Operating Systems (13 HOURS)

 Processes & threads: process creation & termination, scheduling, synchronization,
IPC, thread creation & termination, and thread synchronization.
 Concurrency: critical sections & mutual exclusion, semaphores and monitors,
deadlocks and livelocks, parallelism and concurrency control.
 Memory management: allocation & deallocation, paging & segmentation, VMM,
memory protection and access control.
 File management: file organization and access methods, directory structures, file
system implementation, input/output (I/O) operations, file sharing and protection.

Module 3: Computer Networks (16 HOURS)

 Link layer protocols, local area networks (Ethernet and variants)
 Interconnecting networks with IP, routing, transport layer protocols
 Advanced concepts of distributed networked systems: Virtualization, distributed file
systems, mass storage systems, recovery, and fault tolerance, content networking
including multimedia delivery

TEXTBOOKS :

1. A. Silberschatz, P. B. Galvin and G. Gagne, Operating System Concepts, 7th
Ed, John Wiley and Sons, 2004.
2. J. Kurose and K. W. Ross, Computer Networking: A Top-Down Approach, 3rd
Ed, Pearson India, 2004.
3. John L. Hennessy and David A. Patterson, Computer Architecture: A Quantitative
Approach, 5th Edition, Morgan Kaufmann Publishers.

Course No.: Name: Algorithms: Design & Implementation Credits: 2-0-2-6 Prerequisites: NIL

Course Objectives:

1. Develop a deeper understanding of the theoretical foundations of algorithm design
and analysis, including key concepts such as asymptotic analysis and NP-hardness.
2. Learn to apply various algorithmic design paradigms, including backtracking,
dynamic programming, and greedy algorithms, to solve a range of real-world
problems.
3. Understand the basic principles of approximation algorithms and learn to apply
them to a variety of optimization problems.
4. Gain practical experience in implementing and analyzing algorithms through
programming assignments and projects.

Course Outcomes:

1. Students will be able to analyze the complexity of an algorithm and make decisions
about which algorithms are best suited for a given problem.
2. Students will be able to design algorithms using a variety of techniques, including
backtracking, dynamic programming, and greedy algorithms, and apply them to
solve problems in various domains.
3. Students will be able to apply approximation algorithms to optimization problems,
and understand the tradeoffs between approximation quality and runtime.
4. Students will be able to implement and analyze algorithms using programming
languages such as Python and Java, and use this knowledge to develop and
evaluate algorithms for real-world applications.

Module 1: Introduction and Analysis of Algorithms (6 hours)


 Basic concepts of algorithms
 Asymptotic analysis and Big-O notation
 Solving recurrence relations using recursion tree and Master Theorem
 Randomization as an algorithm design technique

Module 2: Backtracking and Dynamic Programming (6 hours)

 Backtracking algorithms: N queens, Game Trees, Subset sum
 Dynamic programming principles: memoization or iteration over subproblems
 Dynamic programming algorithms: Longest increasing subsequence, Optimal
binary search trees, Shortest paths in a graph, Negative cycles in a graph

Module 3: Greedy Algorithms and Approximation Algorithms (8 hours)

 Greedy algorithms: Scheduling classes, Huffman codes, Stable matching,
Minimum spanning tree problem
 Introduction to approximation techniques
 Deterministic rounding algorithm and rounding a dual solution
 Constructing a dual solution (the primal-dual method) and the randomized rounding
algorithm

Module 4: NP-Hardness, Network Flow, and Randomized Algorithms (6 hours)

 P vs NP, NP-hard, and NP-complete
 Reduction and SAT: 3 SAT, Clique and Vertex Cover, Graph coloring
 Network flow: The maximum flow problem, Ford-Fulkerson algorithm, Max flow and
Min cut in a Network
 Randomized algorithms: Randomized min cut algorithm, Randomized find,
Birthday paradox

Module 5: Advanced Approximation Algorithm and Applications (4 hours)

 Advanced approximation techniques: LP relaxation, randomized rounding,
primal-dual algorithm
 Applications of approximation algorithm: Scheduling jobs with deadlines on a
single machine, the k-center problem, the traveling salesman problem, scheduling
jobs on identical parallel machines, minimizing the sum of completion times on a
single machine.

List of Experiments : (12 Hours)

Experiment 1: Implementing and Analyzing the Merge Sort Algorithm
Experiment 2: Backtracking - Solving N-Queens Problem
Experiment 3: Implementing Longest Increasing Subsequence using Dynamic
Programming
Experiment 4: Greedy Algorithm - Implementing Huffman Coding
Experiment 5: Dijkstra's Shortest Path Algorithm
Experiment 6: Implementing the Ford-Fulkerson Algorithm for Maximum Flow
Experiment 7: Deterministic Rounding Approximation Algorithm
Experiment 8: Randomized Algorithm - Implementing QuickSort
Experiment 9: Implementing the Traveling Salesman Problem using Dynamic
Programming
Experiment 10: Graph Coloring using Greedy Algorithm
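
Illustrative sketch for Experiment 1: the syllabus does not prescribe an implementation, so the following is only one possible starting point. The function name merge_sort and the comparison counter are assumptions introduced here so that the O(n log n) behaviour from Module 1 can be checked empirically.

```python
def merge_sort(items):
    """Return (sorted_list, comparisons); the input list is not modified.

    Hypothetical helper for Experiment 1, not part of the syllabus.
    """
    if len(items) <= 1:
        return list(items), 0

    mid = len(items) // 2
    left, c_left = merge_sort(items[:mid])
    right, c_right = merge_sort(items[mid:])

    merged = []
    comparisons = c_left + c_right
    i = j = 0
    # Merge step: repeatedly take the smaller head element.
    while i < len(left) and j < len(right):
        comparisons += 1
        if left[i] <= right[j]:  # "<=" keeps equal elements in order (stable)
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    # One side is exhausted; the rest of the other side is already sorted.
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged, comparisons
```

Running it on inputs of growing size and plotting the comparison count against n log2(n) is one way to carry out the "analyzing" part of the experiment.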

TEXTBOOKS :
1. Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest and Clifford
Stein, Introduction to Algorithms, MIT Press, 2009.
2. Jon Kleinberg and Éva Tardos, Algorithm Design, Pearson, 2005.
3. David P. Williamson and David B. Shmoys, The Design of Approximation
Algorithms, Cambridge University Press, 2010.
4. Jeff Erickson, Algorithms, 2019.

Course No.: Name: Foundations of Blockchain Credits: 3-0-0-6 Prerequisites: NIL

Course Objectives:

1. Understand the fundamentals of blockchain technology, including its architecture,
functionality, and various cryptographic elements.
2. Explore the potential applications of blockchain technology in different sectors such
as finance, e-governance, and healthcare.
3. Analyze the challenges and constraints of implementing blockchain technology and
evaluate its implications on privacy, security, and trust.
4. Develop a foundational knowledge of the consensus problem and different
consensus mechanisms used in blockchain technology.

Course Outcomes:

1. Demonstrate a working knowledge of blockchain technology, including its
components, operation, and cryptographic elements.
2. Evaluate the potential applications and limitations of blockchain technology in
different sectors and industries.
3. Analyze the challenges and constraints of implementing blockchain technology,
including its impact on privacy, security, and trust, and propose potential solutions.
4. Develop the ability to critically evaluate and compare different consensus
mechanisms used in blockchain technology and analyze their advantages and
disadvantages.

Module 1: Introduction to Blockchain and Hash Functions (12 hours)

 Overview of blockchain technology
 Introduction to Distributed Ledger Technology (DLT)
 Functionality, Applications, and Use Cases of DLT
 Challenges and Constraints in implementing DLT
 Philosophy and Implications of DLT
 Introduction to Hashes and their Significance
 Hash Functions and Digital Signatures
 Introduction to Transactions, Blocks, and Blockchain
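
Illustrative sketch for the last three topics above: how hashes link blocks of transactions into a chain. This is a teaching toy under stated assumptions, not any real blockchain's data format. The names block_hash, append_block and is_valid and the dictionary layout are hypothetical, and signatures, consensus and networking are deliberately left out.

```python
import hashlib
import json


def block_hash(block):
    """SHA-256 digest of a block's canonical JSON form (toy serialization)."""
    payload = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def append_block(chain, transactions):
    """Append a block that stores the hash of its predecessor."""
    prev_hash = block_hash(chain[-1]) if chain else "0" * 64  # genesis marker
    block = {
        "index": len(chain),
        "prev_hash": prev_hash,
        "transactions": transactions,
    }
    chain.append(block)
    return block


def is_valid(chain):
    """Check that every block still points at the hash of the block before it."""
    return all(
        chain[i]["prev_hash"] == block_hash(chain[i - 1])
        for i in range(1, len(chain))
    )
```

Changing any transaction in an earlier block changes that block's hash, so the next block's prev_hash no longer matches and is_valid returns False. That tamper-evidence is the property the module's "Significance" topic refers to.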

Module 2: Decentralization and Security in Blockchain (12 Hours)

 Technological and Cryptographic Elements in Blockchain
 Need for a Decentralized Ledger System
 Advantages and Disadvantages of Centralized Trusted Parties
 Security, Integrity, and Privacy Issues of a Decentralized System
 Trust and Coordination in Blockchain
 Barriers to Blockchain Adoption

Module 3: Bitcoin and Cryptocurrencies (9 Hours)

 Introduction to Bitcoin
 Bitcoin Protocol and Architecture
 Byzantine Generals Problem and Fault Tolerance
 Mining Mechanism and Incentives
 Distributed Consensus and Merkle Trees
 Transactions, Fees, and Anonymity in Bitcoin
 Public and Private Blockchains
 Double Spending Problem and Solutions
 Privacy in Blockchains
 Legal Considerations for Bitcoin and Cryptocurrencies

Module 4: Consensus Mechanisms in Blockchain (9 Hours)

 Introduction to Consensus Problem
 Distributed Consensus and its Challenges
 Nakamoto Consensus and Proof of Work
 Proof of Stake, Delegated Proof of Stake, and Leased Proof of Stake
 Proof of Elapsed Time, Tangle, and Proof of Burn
 Difficulty Level and Energy Utilization in Blockchain
 Consensus in Ethereum

Text Books

1. Arvind Narayanan, Joseph Bonneau, Edward Felten, Andrew Miller and Steven
Goldfeder, Bitcoin and Cryptocurrency Technologies: A Comprehensive
Introduction, Princeton University Press, July 2016, ISBN 9780691171692.
2. Imran Bashir, “Mastering Blockchain: Distributed Ledger Technology,
decentralization, and smart contracts explained”, 2nd Edition, Packt Publishing Ltd,
March 2018.

Course No.: Name: DBMS, Design & Implementation Credits: 2-0-2-6 Prerequisites: NIL

Course Objectives:
1. To emphasize the underlying principles of Relational Database Management
System.
2. To model and design advanced data models to handle threat issues and
countermeasures.
3. To implement and maintain the structured, semi-structured and unstructured data
in an efficient database system using emerging trends.

Course outcomes:
1. Design and implement a database depending on the business requirements and
considering various design issues.
2. Select and construct appropriate parallel and distributed database architecture and
formulate the cost of queries accordingly.
3. Understand the requirements of data and transaction management in mobile and
spatial databases and differentiate them from RDBMS.
4. Categorize and design the structured, semi-structured and unstructured
databases.
5. Characterize the database threats and their countermeasures.
6. Review cloud, streaming and graph databases.
7. Comprehend, design and query the database management system.

Module 1: Relational Model (8 hours)


 Introduction to Database System Architecture
 EER Modeling
 Indexing
 Normalization
 Query processing and optimization
 Transaction Processing

Module 2: Parallel Databases (8 hours)

 Architecture and Data Partitioning Strategies
 Interquery and Intraquery Parallelism
 Parallel Query Optimization

Module 3: Distributed Databases (8 hours)

 Features of Distributed Databases
 Distributed Database Architecture
 Fragmentation and Replication
 Distributed Query Processing
 Distributed Transactions Processing

Module 4: Spatial and Mobile Databases (4 hours)

 Introduction to Spatial Databases
 Types of Spatial Data
 Indexing in Spatial Databases
 Mobile Databases
 Transaction Model in MDS

Lab (14 hours)

Experiment 1: Modeling a scenario into ER/EER Model using ERDPlus, ERwin, or
Oracle SQL Developer.
Experiment 2: Creating applications with RDBMS
Experiment 3: Partitioning a database and comparing execution speed with/without
parallelism.
Experiment 4: Creating an XML document and validating it against an XML Schema/DTD.
Experiment 5: Representing football games results in XML, DTD, and XQuery.
Experiment 6: Implementing parallel join and parallel sort algorithms.
Experiment 7: Creating a distributed database scenario and fragmenting the database.
Experiment 8: Importing a spatial dataset into PostgreSQL (PostGIS) and querying the
database.
Experiment 9: Investigation of spatial analysis techniques using Toxic Release Inventory
data.
Experiment 10: Visualizing and interpreting results of sample datasets from the healthcare
domain.

TEXT BOOKS:
1. Avi Silberschatz, Hank Korth, and S. Sudarshan, “Database System Concepts”, 6th
Ed., McGraw Hill, 2010.
2. Ramez Elmasri and Shamkant B. Navathe, “Fundamentals of Database Systems”, 7th
edition, Addison Wesley, 2014.

REFERENCE BOOKS:
1. S. K. Singh, “Database Systems: Concepts, Design Applications”, 2nd
edition, Pearson Education, 2011.
2. Joe Fawcett, Danny Ayers, Liam R. E. Quin, “Beginning XML”, 5th Edition, Wiley
India Private Limited, 2012.
3. Thomas M. Connolly and Carolyn Begg, “Database Systems: A Practical Approach
to Design, Implementation, and Management”, 6th edition, Pearson India, 2015.

Course No.: Name: BigData Frameworks Credits: 2-0-2-6 Prerequisites: NIL

Course objectives:

1. To provide a comprehensive understanding of Big Data and its ecosystem.
2. To familiarize students with the design and implementation of Big Data frameworks
and tools.
3. To introduce techniques for distributed data processing, storage, and analysis.

Course outcomes:

1. Understand the concepts and principles of Big Data and its ecosystem.
2. Design and implement Big Data frameworks using distributed processing systems.
3. Apply various data storage and processing techniques for handling large-scale
datasets.

Module 1: Introduction to Big Data (8 hours)

 Overview of Big Data
 Big Data challenges and opportunities
 Big Data ecosystem and architecture
 Data storage and management techniques
 Introduction to Hadoop and MapReduce

Module 2: Big Data Processing Frameworks (10 hours)

 Hadoop and its components: HDFS, MapReduce, YARN
 Apache Spark: RDD, DataFrames, and Datasets
 Apache Flink and Stream Processing
 Apache HBase and NoSQL databases
 Apache Cassandra: Data Model, Distribution, and Architecture

Module 3: Data Storage and Processing Techniques (10 hours)

 Understanding the principles of cloud-native development
 Building and deploying cloud-native Java applications using popular frameworks
such as Spring Boot and Quarkus

Experiments

1. Install and configure Hadoop, and develop a simple MapReduce program for word
count analysis.
2. Set up Apache Spark, create Resilient Distributed Datasets (RDDs), and write a
Spark application for data processing tasks.
3. Utilize Spark SQL and DataFrames to connect to databases, manipulate data, and
run SQL queries.
4. Explore Spark MLlib, implement a basic machine learning algorithm, and evaluate
its performance on a dataset.
5. Integrate Spark with visualization libraries to create insightful data visualizations.
6. Set up a cloud-based Hadoop or Spark environment, and deploy a sample big data
application.
7. Monitor and manage the deployed application using the cloud provider's
management console.
8. Gain proficiency in various aspects of big data processing, analysis, and
visualization using Hadoop and Spark frameworks.
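
Illustrative sketch for Experiment 1: before installing Hadoop, the word-count job can be prototyped as plain Python functions that mimic the map, shuffle and reduce phases. Nothing below uses the Hadoop or Spark APIs. The function names map_phase, shuffle, reduce_phase and word_count are assumptions made for illustration, and lower-casing on whitespace-split tokens is a simplification.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every whitespace-separated token."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key, as the framework would."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: sum the counts collected for one word."""
    return key, sum(values)


def word_count(lines):
    """Run the three phases in sequence on an in-memory list of lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())
```

In a real cluster the map and reduce functions run on different nodes and the shuffle moves data across the network. The single-process version only shows the data flow.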

REFERENCE BOOKS:
1. Mike Frampton, “Mastering Apache Spark”, Packt Publishing, 2015.
2. Tom White, “Hadoop: The Definitive Guide”, O’Reilly, 4th Edition, 2015.
3. Nick Pentreath, Machine Learning with Spark, Packt Publishing, 2015.
4. Mohammed Guller, Big Data Analytics with Spark, Apress, 2015
5. Donald Miner, Adam Shook, “MapReduce Design Patterns”, O’Reilly, 2012

SEMESTER 2

Course No.: Name: Blockchain Components & Architecture Credits: 3-0-0-6 Prerequisites: Fundamentals of Blockchain (Sem 1)

Course Objectives:

1. To provide an in-depth understanding of the key concepts and components of
blockchain technology.
2. To explore the different types of blockchain architectures and design
considerations, including security and consensus protocols.
3. To examine the use of blockchain in various sectors, such as financial software
and systems, government, and trade supply chains.
4. To provide students with the knowledge and skills to develop secure cryptographic
protocols on blockchain and analyze existing blockchain ecosystems.

Course Outcomes:
1. Students will be able to explain the core concepts and components of blockchain
technology.
2. Students will be able to design and implement basic blockchain architectures and
understand the security and consensus mechanisms required for their
development.
3. Students will be able to analyze the use of blockchain in various sectors and
identify opportunities for its implementation.
4. Students will be able to develop secure cryptographic protocols on blockchain and
compare and contrast different blockchain ecosystems, such as Bitcoin,
Hyperledger, and Ethereum.

Module 1: Blockchain Fundamentals (6 hours)

 Basic crypto primitives: hash, signature, hashchain to blockchain
 Basic consensus mechanisms
 Blockchain architecture and design considerations
 Requirements for consensus protocols
 Scalability aspects of blockchain consensus protocols

Module 2: Consensus Mechanisms (9 hours)

 Proof of Work (PoW) consensus mechanism
 Alternative consensus mechanisms: Proof of Stake (PoS), Delegated Proof of
Stake (DPoS), Byzantine Fault Tolerance (BFT), and more
 Decomposing the consensus process
 Consensus protocols for permissioned blockchains

Module 3: Permissioned Blockchains and Applications (9 hours)

 Design goals for permissioned blockchains
 Introduction to Hyperledger Fabric
 Hyperledger Fabric components
 Chaincode design and implementation
 Beyond chaincode: Fabric SDK and front end, Hyperledger Composer tool
 Settlements, KYC, and capital markets on blockchain
 Blockchain in insurance

Module 4: Blockchain for Supply Chain and Government (10 hours)

 Use case: Blockchain in trade supply chain
 Provenance of goods and visibility on blockchain
 Trade supply chain finance on blockchain
 Invoice management and discounting on blockchain
 Digital identity and records on blockchain
 Record keeping between government entities on blockchain
 Public distribution system and social welfare systems on blockchain

Module 5: Blockchain Cryptography, Privacy, and Security (8 hours)

 Overview of blockchain cryptography and security
 Privacy on blockchain
 Recent works on scalability
 Secured multi-party computation on blockchain
 Blockchain for science: making better use of the data-mining network
 Case studies: comparing ecosystems - Bitcoin, Hyperledger, Ethereum, and more

TEXT BOOKS:
1. "Blockchain Basics: A Non-Technical Introduction in 25 Steps" by Daniel Drescher,
Apress.
2. "Blockchain Revolution: How the Technology Behind Bitcoin Is Changing Money,
Business, and the World" by Don Tapscott and Alex Tapscott, Portfolio.
3. "The Basics of Bitcoins and Blockchains" by Antony Lewis, Mango Publishing.

Course No.: Name: Data Virtualization & Dashboards Credits: 3-0-0-6 Prerequisites: NIL

Course Objectives:

1. To introduce students to the concept of data virtualization and its applications in
the field of big data and blockchain.
2. To provide students with hands-on experience using popular data virtualization
tools to create a unified view of data from multiple sources.
3. To teach students how to design effective dashboards that provide meaningful
insights into complex data sets.
4. To explore advanced topics in data virtualization and dashboards, such as
real-time data integration, self-service analytics, and integration with big data
platforms and blockchain.

Course Outcomes:

1. Students will be able to describe the benefits and challenges of data virtualization
and how it differs from traditional data integration approaches.
2. Students will be able to create a virtual data layer using a popular data
virtualization tool and connect to various data sources, including relational
databases, big data systems, and cloud applications.
3. Students will be able to design effective dashboards using popular dashboard tools
and connect virtual data sources to create interactive visualizations.
4. Students will be able to identify and apply advanced techniques in data
virtualization and dashboard design, such as real-time data processing,
self-service analytics, and integration with big data platforms and blockchain.

Module 1: Introduction to Data Virtualization (9 hours)

 Overview of data virtualization and its benefits
 Understanding data integration and how it differs from data virtualization
 Use cases for data virtualization
 Challenges and limitations of data virtualization
 Introduction to popular data virtualization tools and their architectures

Module 2: Data Virtualization in Action (12 hours)

 Building a virtual data layer with a popular data virtualization tool
 Connecting to various data sources (relational databases, big data systems, cloud
applications, etc.)
 Creating views and queries using the selected data virtualization tool
 Handling complex data transformations with the selected tool
 Managing metadata and security in a virtual environment

Module 3: Data Visualization and Dashboards (12 hours)

 Introduction to data visualization and dashboard design
 Key principles of effective data visualization
 Overview of popular dashboard tools (e.g. Tableau, Power BI, QlikView)
 Best practices for designing interactive dashboards
 Connecting virtual data sources to dashboards

Module 4: Advanced Topics in Data Virtualization and Dashboards (9 hours)

 Using data virtualization to support self-service analytics
 Real-time data integration and processing with data virtualization
 Integrating data virtualization with big data platforms and blockchain
 Best practices for performance tuning and optimization in data virtualization
 Future trends in data virtualization and dashboard design

TEXTBOOKS:

1. Data Virtualization for Business Intelligence Systems: Revolutionizing Data
Integration for Data Warehouses (Rick van der Lans)
2. Data Visualization: A Practical Introduction (Kieran Healy)
3. The Big Book of Dashboards: Visualizing Your Data Using Real-World Business
Scenarios (Steve Wexler, Jeffrey Shaffer, and Andy Cotgreave)
4. Building a Modern Data Center: Principles and Strategies of Design (Scott D. Lowe
and David M. Davis)

Course No.: Name: Cryptocurrency And Cyber Security Credits: 3-0-0-6 Prerequisites: NIL

COURSE OBJECTIVES:
1. To understand the fundamentals of network security and symmetric ciphers.
2. To apply asymmetric ciphers and data integrity algorithms.
3. To explore the basics of cryptocurrencies and use Ethereum programming.

COURSE OUTCOMES:

1. Recall the network security fundamentals.
2. Employ various symmetric ciphers.
3. Apply asymmetric ciphers and data integrity algorithms.
4. Explore the basics of cryptocurrencies.
5. Use Ethereum programming.

Module 1: Introduction to Cybersecurity and Cryptography (10 hours)

 Need for cybersecurity, concept of cyberspace, cyber crimes and cyber-attacks
 Fundamental security principles, key security triad, key components of
cybersecurity network architecture, basic security management and policies
 Cryptography, private key cryptography, classical encryption
techniques, substitution techniques, transposition techniques, rotor machines,
steganography
 Data Encryption Standard, Advanced Encryption Standard, multiple encryption and
Triple DES

Module 2: Asymmetric Cryptography and Hash Functions (10 hours)

 Public-key cryptography, RSA algorithm, Diffie-Hellman key exchange, Elgamal
cryptographic system
 Elliptic curve arithmetic, elliptic curve cryptography, MD5 message digest algorithm
 Secure hash algorithm (SHA), digital signatures
 Authentication protocols, Digital Signature Standard (DSS)

Module 3: Blockchain Security and Privacy Issues (8 hours)

 Transaction security, client security and privacy, pseudo-anonymity vs anonymity
 Zcash and zk-SNARKs for anonymity preservation
 Network layer attacks
 Security and privacy issues with scalability solutions, balance privacy
 Wormhole attack

Module 4: Cybersecurity Infrastructure using Blockchain (8 hours)

 Blockchain-based PKI
 2-Factor authentication using blockchain
 Blockchain-based DNS
 Identity management
 Blockchain-based DDoS protection

Module 5: Security Aspects of Blockchain Applications (6 hours)

 Blockchain for cybersecurity and privacy in IoT
 IoT
 Payment system applications

TEXT BOOKS
1. William Stallings, “Cryptography and Network security Principles and Practices”,
Pearson/PHI,2017.
2. Arvind Narayanan, Joseph Bonneau, Edward Felten, Andrew Miller and Steven
Goldfeder, “Bitcoin and Cryptocurrency Technologies: A Comprehensive
Introduction”, Princeton University Press, July, 2016.

REFERENCE BOOKS
1. William Stallings, Network Security Essentials (Applications and Standards),
Pearson Education, India, 2017.
2. Imran Bashir, “Mastering Blockchain: Distributed Ledger Technology,
Decentralization and Smart Contracts Explained”, Second Edition, Packt Publishing,
2018.

E BOOKS
1. https://www.pearson.com/us/higher-education/product/Stallings-Cryptography-and-Network-Security-Principles-and-Practice-5th-Edition/9780136097044.html
2. https://www.lopp.net/pdf/princeton_bitcoin_book.pdf
3. https://www.blockchainexpert.uk/book/blockchain-book.pdf

MOOC
1. http://nptel.ac.in/courses/106105031/

SEMESTER 3

Course No.: Name: Advanced Statistical Methods Credits: 3-0-0-6 Prerequisites: NIL

Course Objectives:

1. Understand the fundamental concepts and principles of advanced statistical
methods.
2. Apply statistical tools and techniques to analyze data and make informed
decisions.
3. Develop skills in modeling and forecasting using different approaches and
techniques.
4. Design and execute experiments to test hypotheses and draw conclusions.

Course Outcomes:

1. Develop a working knowledge of summary statistics, correlation, regression, and
inference methods for statistical analysis.
2. Use statistical software to analyze data, interpret results, and draw conclusions.
3. Apply modeling and forecasting techniques to real-world scenarios and make
accurate predictions.
4. Design and execute experiments, analyze data, and draw valid conclusions based
on statistical evidence.

Module 1 - BASIC STATISTICAL TOOLS FOR ANALYSIS (9 hours)

 Summary Statistics
 Correlation and Regression
 Concept of R² and Adjusted R² and Partial and Multiple Correlation
 Fitting of simple and Multiple Linear regression, Explanation and Assumptions of
Regression Diagnostics
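
For reference, the two goodness-of-fit measures named in this module are usually defined as below. The notation is an assumption introduced here and does not come from the syllabus: n is the number of observations, p the number of predictors excluding the intercept, y_i the observed values, \hat{y}_i the fitted values and \bar{y} the sample mean.

```latex
R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2},
\qquad
\bar{R}^2 = 1 - \left(1 - R^2\right)\frac{n - 1}{n - p - 1}
```

Adjusted R² penalizes additional predictors, so it can fall when a variable that adds little explanatory power is included, whereas plain R² never decreases when a predictor is added.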

Module 2 - STATISTICAL INFERENCE (9 hours)

 Basic Concepts
 Normal distribution: area properties
 Steps in tests of significance: large-sample tests, Z tests for means and
proportions
 Small-sample tests: t-test for means, F test for equality of variances, Chi-square
test for independence of attributes

Module 3 - MODELING AND FORECASTING METHODS (12 hours)

 Introduction: Concept of Linear and Nonlinear Forecasting model
 Concepts of Trend, Exponential Smoothing, Linear and Compound Growth model
 Fitting of Logistic curve and their Applications
 Moving Averages, Forecasting accuracy tests
 Probability models for time series: Concepts of AR, ARMA and ARIMA models

Module 4 - DESIGN OF EXPERIMENTS (12 hours)

 Analysis of variance – one and two-way classifications
 Principle of design of experiments: CRD, RBD, LSD
 Concepts of 2^2 and 2^3 factorial experiments

TEXT BOOKS:
1. Douglas C. Montgomery and George C. Runger, Applied Statistics and Probability
for Engineers, 6th Ed., John Wiley & Sons, 2016.
2. Robert H. Shumway and David S. Stoffer, Time Series Analysis and Its Applications:
With R Examples, Springer, 2017.

REFERENCE BOOKS
1. Trevor Hastie and Robert Tibshirani, “The Elements of Statistical Learning: Data
Mining, Inference, and Prediction”, Second Edition, Springer Series in Statistics, 2017.
2. Applications for Engineering and the Computing Sciences”, McGraw Hill Education,
2017.

SOURCE: https://chennai.vit.ac.in/files/M.Tech(CSE)-BigData_2021_2022.pdf

BLOCKCHAIN ELECTIVES:

Course No.: Name: Blockchain Technologies: Platforms & Applications Credits: 3-0-0-6 Prerequisites: NIL

Course Objectives:

1. Articulate blockchain platforms that show promise in solving complex business
problems
2. Examine the life cycle of a chain code and its components
3. Implement various blockchain-based enterprise applications

Course Outcomes:

1. Demonstrate an understanding of various blockchain platforms and their potential
use cases in business
2. Develop and deploy smart contracts on the Ethereum platform using Solidity
programming language
3. Configure and deploy a production network on the Hyperledger Fabric platform

Module 1 - INTRODUCTION TO BLOCKCHAIN TECHNOLOGIES (6 hours)

 Introduction to Blockchain Technologies
 Overview of Blockchain Platforms: Ethereum, Hyperledger Project, IBM
Blockchain, Multichain, Hydrachain, Ripple, R3 Corda, BigChainDB, IPFS

Module 2 - ETHEREUM SMART CONTRACTS (12 hours)

 Introduction to Smart Contracts
 Solidity Programming Language
 Contract Creation and Deployment
 Web3.js and RPC Protocols
 Miners, Transactions, and Blocks in Ethereum
 Front-End Development with React and Web3

Module 3 - HYPERLEDGER FABRIC (12 hours)

 Introduction to Hyperledger Fabric
 Fabric Model
 Identity Management in Fabric: Membership Service Provider (MSP)
 Policies in Fabric
 Ledgers in Fabric: World State and Transaction Log
 Chaincode in Fabric: Writing and Deploying Smart Contracts
 Endorsement Peers and Endorsement Policies in Fabric

Module 4 - ADVANCED TOPICS IN BLOCKCHAIN TECHNOLOGIES (12 hours)

 Ordering Nodes in Hyperledger Fabric: Solo Ordering Service, Kafka
 Committing Peers and Anchor Peers in Hyperledger Fabric
 Private Data Sharing in Hyperledger Fabric: Sharing Private Data, Private Data
Sharing Patterns
 Key-level Transaction Access Control and Endorsement in Hyperledger Fabric
 Setting up a Production Network on Hyperledger Fabric

TEXTBOOKS/LEARNING RESOURCES:
1. Tom Serres, Bill Wagner and Bettina Warburg, Basics of Blockchain (1 ed.),
missing, 2019. ISBN 9781089919441.

REFERENCE BOOKS/LEARNING RESOURCES:

a) Nitin Gaur et al., Hands-On Blockchain with Hyperledger: Building decentralized
applications with Hyperledger Fabric and Composer (1st ed.), Packt Publishing
Ltd, 2018. ISBN 978-17889945

Course No.: Name: Web Development For Blockchain Applications Credits: 3-0-0-6 Prerequisites: NIL

Course Objectives:

1. Understand the basics of Blockchain Technology and its integration with Web
Development
2. Gain hands-on experience in developing blockchain-based web applications using
JavaScript and Python
3. Explore different server-side options and databases for building blockchain
applications
4. Learn about web security, continuous integration, and deployment of blockchain
applications on a production server.

Course Outcomes:
1. Ability to build blockchain-based web applications using JavaScript and Python
2. Understanding of server-side options and databases for building blockchain
applications
3. Proficiency in web security and deployment of blockchain applications on a
production server
4. Acquiring skills in using various web development tools and technologies for
building blockchain applications.

Module 1: Introduction to Blockchain Web Development (10 hours)

• Blockchain Technology and its integration with Web Development
• Technology stacks for blockchain-based web development
• HTML5 & CSS for blockchain-based web development
• Chrome DevTools for web development
• Functional programming paradigm for JavaScript inside a browser
• Python data types and basics
• Building client and server for blockchain applications
• Miner and wallet for blockchain applications
• Building a socket communication utility for blockchain applications
• Use of low-code and no-code tools in development
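
As an illustration of the "miner" topic in this module, the following is a minimal proof-of-work sketch in Python. The function name `mine`, the way block data and nonce are concatenated, and the difficulty encoding (a number of leading zero hex digits) are assumptions made for this example; real networks define their own block formats and difficulty rules.

```python
import hashlib


def mine(block_data: str, difficulty: int = 2) -> tuple[int, str]:
    """Search for a nonce whose SHA-256 digest starts with `difficulty` zeros.

    Hypothetical toy scheme: the digest is taken over block_data followed by
    the decimal nonce. This is not the block format of any real blockchain.
    """
    prefix = "0" * difficulty
    nonce = 0
    while True:
        digest = hashlib.sha256(f"{block_data}{nonce}".encode()).hexdigest()
        if digest.startswith(prefix):
            return nonce, digest
        nonce += 1


if __name__ == "__main__":
    nonce, digest = mine("alice pays bob 5", difficulty=2)
    print(nonce, digest)
```

Each extra zero digit of difficulty multiplies the expected search work by 16, while checking a claimed nonce still takes a single hash.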

Module 2: JavaScript for Blockchain Web Development (10 hours)

• JavaScript-enabled blockchain applications
• Transpiling modern JavaScript to older versions with webpack
• Better CSS with webpack
• Code organization in a project
• Asynchronous JavaScript code for developing smart contracts
• APIs for blockchain solutions
• Building a simple blockchain application

Module 3: Server-side Development for Blockchain Applications (11 hours)

• Overview of server-side options for blockchain applications
• Node.js environment for blockchain and its ecosystem
• JSON REST API for blockchain applications
• Using Postman to debug APIs
• Managing server-side application state for blockchain applications
• Web3.js for blockchain web applications
• Databases and SQL (SQLite, PostgreSQL) for blockchain applications
• Data normalization for blockchain applications
• User authorization and authentication for blockchain applications
• Allowing users to interact with blockchain applications

Module 4: Web Security and Development Organization for Blockchain
Applications (11 hours)

• Web security basics for blockchain applications; Not trusting your clients for
blockchain applications
• Why use HTTPS for blockchain applications; Integrating other software with the
server for blockchain applications
• Developing frontend with React for blockchain applications
• Concept of single-page applications for blockchain applications; Managing
client-side application state (Redux) for blockchain applications; Overview of
other client JS frameworks for blockchain applications
• Development organization for blockchain applications
• Using Git for blockchain application development
• Concept of continuous integration for blockchain application development
• Configuring a production web server with Ubuntu for blockchain applications

TEXTBOOKS/LEARNING RESOURCES:

1. "Building Blockchain Projects: Building Decentralized Blockchain Applications with
Ethereum and Solidity" by Narayan Prusty, published by Packt Publishing.
2. "Blockchain Basics: A Non-Technical Introduction in 25 Steps" by Daniel Drescher,
published by Apress.
3. "Mastering Blockchain: Distributed Ledger Technology, Decentralization, and
Smart Contracts Explained" by Imran Bashir, published by Packt Publishing.

Course No.: Name: Blockchain Policy: Legal, Social And Economic Impact Credits: 3-0-0-6 Prerequisites: NIL

Course Objectives:

1. Understand the importance and impact of blockchain policies, regulations, and
guidelines.
2. Analyze the different stakeholders and communities affected by blockchain policies
and their implications.
3. Develop skills for drafting and implementing blockchain policies to ensure
sustainable infrastructure investment and international trade.
4. Evaluate the potential unintended consequences of blockchain and apply effective
strategies for mitigating them.

Course Outcomes:

1. Develop a comprehensive understanding of blockchain policies, regulations, and
guidelines.
2. Assess the impact of blockchain policies on different stakeholders and
communities.
3. Draft and implement effective blockchain policies to enable sustainable
infrastructure investment and international trade.
4. Analyze and mitigate the potential unintended consequences of blockchain for
successful policy implementation.

Module 1: Blockchain Policy and Guidelines (12 hours)

• Introduction to blockchain policies and their importance; Guidelines for blockchain
applications and infrastructures; International laws and regulations related to
blockchain
• Dialogue on distributed ledger technology (DLT); Policies for preventing money
laundering and terrorism financing; FATF standards on virtual assets
• Stablecoins and their policy implications; Issues related to trust and
framework; Challenges and business impact of blockchain
• Resources for blockchain policies; Smart securities and derivatives

Module 2: Impact of Blockchain on Different Stakeholders (12 hours)

• Tokenization and securities for physical assets; Impact of blockchain on different
stakeholders; Shareholder engagement and investor privacy
• Blockchain industry bodies around the world; Corporate governance on the
chain; Impact on specific communities
• Problem of equality and blockchain
• Role of blockchain in the ecosystem for persons with disabilities; Impact of
blockchain on women; Tax administration to transparency; Tax treatment of digital
financial assets

Module 3: Enabling Sustainable Infrastructure Investment (10 hours)

• Digital financial marketplaces and track and trace; Provenance to countering fraud
• Agricultural supply chains and policy makers; Material supply chains
• Facilitating international trade; Trade finance to customs
• How government can support blockchain innovation; Blockchain adoption

Module 4: Unintended Consequences and Technical Assistance (8 hours)

• Blockchain and the environment; Steering blockchain through the energy transition
• Reducing the cost of remittances with blockchain
• Potential unintended consequences of blockchain
• Addressing criminal activities, inequality, privacy, security, and data
protection; Intellectual property regulations

TEXTBOOKS:
1. "Blockchain and the Law: The Rule of Code" by Primavera De Filippi and Aaron
Wright, published by Harvard University Press.
2. "The Age of Cryptocurrency: How Bitcoin and Digital Money are Challenging the
Global Economic Order" by Paul Vigna and Michael Casey (St. Martin's Press,
2015)
3. "Blockchain Revolution: How the Technology Behind Bitcoin is Changing Money,
Business, and the World" by Don Tapscott and Alex Tapscott (Portfolio, 2016)

Course No.: Name: Modern Cryptography Credits: 3-0-0-6 Prerequisites: NIL

Course Objectives:
1. To understand the fundamentals of modern cryptography, including symmetric and
asymmetric ciphers, hash functions, and digital signatures.
2. To explore the mathematics behind modern cryptography, including modular
arithmetic, prime numbers, and finite fields.
3. To gain knowledge of widely-used cryptographic algorithms, including RSA, AES,
and SHA.
4. To learn about the practical application of cryptography in information security,
authentication, and data protection.

Course Outcomes:

1. Understand the fundamental principles of modern cryptography and its
mathematical foundations.
2. Evaluate the security of cryptographic algorithms and design secure systems
based on modern cryptographic techniques.
3. Design and implement secure data encryption, authentication, and signature
mechanisms using cryptographic tools and algorithms.
4. Apply cryptography in various fields, including computer science, finance, and
government, to achieve secure and confidential communication.

Module 1: Fundamentals of Cryptography (15 hours)

• Modular arithmetic, polynomial arithmetic, and finite fields
• Symmetric ciphers and their types
• Asymmetric ciphers and their types
• Hash functions and message authentication codes
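
A few of the building blocks listed in this module can be tried out with the Python standard library alone. The sketch below is illustrative only: the helper names `mod_inverse`, `make_tag`, and `verify_tag` are made up for this example, and the key and message in the usage section are placeholders.

```python
import hashlib
import hmac


def mod_inverse(a: int, m: int) -> int:
    """Return x such that (a * x) % m == 1, via the extended Euclidean algorithm."""
    old_r, r = a % m, m
    old_s, s = 1, 0
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_s, s = s, old_s - q * s
    if old_r != 1:
        raise ValueError("a has no inverse modulo m")
    return old_s % m


def make_tag(key: bytes, message: bytes) -> str:
    """Compute an HMAC-SHA256 message authentication code as a hex string."""
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def verify_tag(key: bytes, message: bytes, tag: str) -> bool:
    """Check a tag using a constant-time comparison."""
    return hmac.compare_digest(make_tag(key, message), tag)


if __name__ == "__main__":
    print(pow(4, 13, 497))                      # modular exponentiation
    print(mod_inverse(3, 7))                    # modular inverse
    print(hashlib.sha256(b"abc").hexdigest())   # hash function
    tag = make_tag(b"placeholder-key", b"hello")
    print(verify_tag(b"placeholder-key", b"hello", tag))
```

The three-argument form of `pow` performs modular exponentiation without building the full power, which is what makes RSA-sized exponents practical.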

Module 2: Cryptographic Algorithms (15 hours)

• Advanced Encryption Standard (AES) and Data Encryption Standard (DES)
• RSA algorithm and Diffie-Hellman key exchange
• Elliptic curve cryptography
• Digital signatures and authentication mechanisms
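
To make the Diffie-Hellman key exchange concrete, here is a toy sketch with deliberately tiny numbers. The parameters (p = 23, g = 5) and the private exponents are classroom-sized assumptions chosen so the arithmetic can be checked by hand; they offer no security, and real deployments use standardized large groups or elliptic curves.

```python
def public_value(g: int, private: int, p: int) -> int:
    """Compute g^private mod p, the value a party sends over the open channel."""
    return pow(g, private, p)


def shared_secret(other_public: int, private: int, p: int) -> int:
    """Combine the peer's public value with our own private exponent."""
    return pow(other_public, private, p)


if __name__ == "__main__":
    p, g = 23, 5              # toy public parameters (assumed for illustration)
    alice_private = 4         # toy private exponents, kept secret by each party
    bob_private = 3

    alice_public = public_value(g, alice_private, p)
    bob_public = public_value(g, bob_private, p)

    # Both sides arrive at the same value without ever sending it.
    print(shared_secret(bob_public, alice_private, p))
    print(shared_secret(alice_public, bob_private, p))
```

Both parties compute g^(ab) mod p, while an eavesdropper sees only g^a mod p and g^b mod p.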

Module 3: Cryptographic Applications and Tools (12 hours)

• Cryptographic tools and libraries
• Authentication and key establishment
• Cryptographic protocols and standards
• Cryptography and information security

Reference Textbooks:

1. "Applied Cryptography: Protocols, Algorithms, and Source Code in C" by Bruce
Schneier, published by Wiley.
2. "Cryptography and Network Security: Principles and Practice" by William Stallings,
published by Prentice Hall.
3. "Introduction to Modern Cryptography" by Jonathan Katz and Yehuda Lindell,
published by CRC Press.
4. "Serious Cryptography: A Practical Introduction to Modern Encryption" by
Jean-Philippe Aumasson, published by No Starch Press.

Course No.: Name: Smart Contracts And Solidity Programming Credits: 3-0-0-6 Prerequisites: NIL

Course Objectives:

1. To provide an introduction to the concept of smart contracts and their applications.
2. To familiarize students with the Solidity programming language and its constructs.
3. To enable students to design, implement, and deploy smart contracts on the
Ethereum blockchain.
4. To teach students best practices for secure smart contract development and
auditing.

Course Outcomes:

1. Students will be able to understand the purpose and potential of smart contracts in
various industries.
2. Students will be able to write smart contracts in Solidity and deploy them on the
Ethereum blockchain.
3. Students will be able to design and implement secure smart contracts, and avoid
common security issues.
4. Students will be able to apply best practices for auditing and testing smart
contracts.

Module 1: Introduction to Smart Contracts and Solidity (6 hours)

• Definition and brief history of smart contracts; Applications of smart contracts
• Introduction to the Ethereum blockchain
• Solidity programming language and its syntax
• Structure of a smart contract; Global variables in Solidity

Module 2: Ethereum Development (12 hours)

• Life cycle of a Solidity contract; Interfaces and inheritance in Solidity; External
function calls
• Fallback functions; Payable functions and transactions; Revert, assert, and require
statements
• Decentralized Autonomous Organizations (DAOs)
• Introduction to MakerDAO

Module 3: Advanced Solidity Development (12 hours)

• Token-based membership; Share-based membership; Automated immutable
systems
• Pure functions and view functions; Ethereum Virtual Machine (EVM)
• Bytecode interpretation
• Ethereum mining reward scheme; Gas pricing

Module 4: Security and Auditing of Smart Contracts (12 hours)

• Security issues in smart contracts; Common attacks on smart contracts; Error
handling in smart contracts
• Best practices for secure smart contract development; Modifiers
• Mutex pattern and balance limit pattern; Smart contract security tools, including
SmartInspect, GASTAP, SmartCheck, and Solgraph
• Advanced research topics in smart contracts

TEXTBOOKS:

1. "Mastering Blockchain: Distributed Ledger Technology, Decentralization, and
Smart Contracts Explained" by Imran Bashir. Packt Publishing, 2018.
2. "Building Ethereum Dapps: Decentralized Applications on the Ethereum
Blockchain" by Roberto Infante. Manning Publications, 2019.
3. "Solidity Programming Essentials: A Beginner's Guide to Build Smart Contracts for
Ethereum and Blockchain" by Ritesh Modi. Packt Publishing, 2018.

BIG DATA ELECTIVES

Course No.: Name: Security And Privacy For Big Data Analytics Credits: 3-0-0-6 Prerequisites: NIL

Course Objectives:

1. Understand the basic concepts of cryptography.
2. Learn methods and tools for securing big data and how to apply them in practice.
3. Understand differential privacy and its impact on big data.
4. Be familiar with the laws and regulations regarding data protection in big data
environments.

Course Outcomes:

1. Understand the basic concepts of cryptography.
2. Develop skills and knowledge to apply different methods and tools to secure big
data.
3. Be able to analyze the impact of differential privacy and malware on big data.
4. Understand the data protection laws and regulations for big data and apply them in
practice.

Module 1: Cryptography for Big Data Security (12 hours)

• Introduction to cryptography and its relevance to big data; Symmetric and
asymmetric encryption techniques
• Hash functions and message authentication codes (MACs)
• Public Key Infrastructure (PKI) and digital certificates
• Cryptographic protocols for secure communication in big data; Cryptographic tools
and libraries for big data security

Module 2: Security and Privacy in Big Data (20 hours)

• Threat modeling and risk assessment for big data; Access control and
authentication mechanisms for big data systems
• Data anonymization and privacy-preserving techniques for big data
• Network security and data protection in distributed big data systems
• Intrusion detection and prevention in big data environments; Best practices for
securing big data and compliance with data protection laws

Module 3: Big Data Modeling for Security Analysis (10 hours)

• Data modeling and schema design for security analysis in big data; Machine
learning and data mining techniques for security analysis in big data
• Visualization and analytics tools for security analysis in big data; Data fusion and
correlation for security intelligence in big data
• Case studies of security analysis in big data environments
• Big data security testing and evaluation methodologies

Textbooks:

1. "Big Data: Storage, Sharing, and Security" edited by Fei Hu, published by CRC Press.
2. "Privacy and Big Data" by Terence Craig and Mary E. Ludloff, published by
O'Reilly Media, September 2011.

Course No.: Name: Time Series Analysis Credits: 3-0-0-6 Prerequisites: NIL

Course Objectives:

• To provide an introduction to the fundamental concepts of time series analysis and
their applications.
• To teach the students how to analyze and interpret time series data using different
techniques.
• To enable the students to apply various time series methods for forecasting and
modeling.
• To equip the students with the necessary statistical knowledge and tools to
evaluate time series models.

Course Outcomes:

• Students will be able to understand the fundamental concepts of time series
analysis and its applications.
• Students will be able to analyze and interpret time series data using different
techniques.
• Students will be able to apply various time series methods for forecasting and
modeling.
• Students will be able to evaluate time series models using statistical tests for
stationarity and model selection.

Module 1: Introduction to Time Series Analysis (9 hours)

• Purpose of Time Series Analysis, Descriptive Techniques, Time Series
Plots, Visualizing Multidimensional Time Series
• Visualizing Multiple Time Series, Histograms, Seasonal Effects and Trend
Identification, Transformations
• Sample Autocorrelation, Correlogram, Time Series Filtering, Probability Models
• Stochastic Processes, Stationarity, Second-order Stationarity, Autocorrelation
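
The sample autocorrelation listed above is simple enough to compute directly. The sketch below uses the common estimator r_k = c_k / c_0, where c_k is the sum over t of (x_t - mean)(x_{t+k} - mean); the function name `sample_autocorrelation` is an assumption for this example, and statistical packages may use slightly different normalizations.

```python
def sample_autocorrelation(series: list[float], max_lag: int) -> list[float]:
    """Return [r_0, r_1, ..., r_max_lag] for a time series.

    r_k is the lag-k sum of products of deviations from the mean, divided by
    the lag-0 sum (the total squared deviation).
    """
    n = len(series)
    if not 0 <= max_lag < n:
        raise ValueError("max_lag must satisfy 0 <= max_lag < len(series)")
    mean = sum(series) / n
    deviations = [x - mean for x in series]
    c0 = sum(d * d for d in deviations)
    if c0 == 0:
        raise ValueError("autocorrelation is undefined for a constant series")
    return [
        sum(deviations[t] * deviations[t + k] for t in range(n - k)) / c0
        for k in range(max_lag + 1)
    ]


if __name__ == "__main__":
    # The values returned here are what a correlogram plots against the lag.
    print(sample_autocorrelation([1.0, 2.0, 3.0, 4.0, 5.0], max_lag=2))
```

For a short trending series such as 1, 2, 3, 4, 5 the lag-1 value is positive and later lags decay, the pattern a correlogram shows for data with a trend.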

Module 2: Time Series Forecasting Techniques (12 hours)

• White Noise Model, Random Walks, Moving Average, Invertibility, ARIMA Models
• Autoregressive Processes, Fitting an AR Process, Yule–Walker Equations, General
Linear Process, Wold Decomposition Theorem
• Exponential Smoothing, Holt-Winters, Box-Jenkins Forecasting
• Optimality Models for Exponential Smoothing, Model Selection for Time Series
Forecasting

Module 3: Spectral Analysis (9 hours)

• Sinusoidal Model, Wiener-Khintchine Theory, Cramer Representation, Periodogram
Analysis
• Statistical Properties of Periodogram, Consistent Estimators of Spectral Density
• Bivariate Processes, Cross-covariance, Cross-correlation
• ARCH
• GARCH

Module 4: Advanced Topics in Time Series Analysis (12 hours)

• Gaussian Process, Gaussian Regression
• Vector Autoregression Models (VAR)
• Structural Form, Reduced Form, Parameter Estimation
• Kernel Methods for Forecasting, Adaptive Filtering Mechanism for
Forecasting, Statistical Testing for Stationarity, Augmented Dickey-Fuller
• Kwiatkowski–Phillips–Schmidt–Shin Test, Goodness of Estimation

Textbooks:

1. "Time Series Analysis and Its Applications: With R Examples" by Robert H.
Shumway and David S. Stoffer, Springer Publication
2. "Forecasting: Principles and Practice" by Rob J. Hyndman and George
Athanasopoulos, OTexts Publication
3. "Time Series Analysis: With Applications in R" by Jonathan D. Cryer and Kung-S

Course No.: Name: Exploratory Data Analytics Credits: 3-0-0-6 Prerequisites: Basic understanding of programming and statistics

Course Objectives:

1. To understand the principles of exploratory data analysis and its importance in
data-driven decision-making.
2. To learn various techniques of data cleaning, data wrangling, and data
visualization.
3. To understand the statistical concepts required for EDA, including probability
distributions, central limit theorem, and hypothesis testing.
4. To learn how to use popular tools and platforms such as R, Python, and Tableau
for EDA.

Course Outcomes:

1. Students will be able to understand the importance of exploratory data analysis
and its role in data-driven decision-making.
2. Students will be able to clean, wrangle, and visualize different types of data to
make better data-driven decisions.
3. Students will be able to apply statistical concepts to analyze data and make
inferences from it.
4. Students will be able to use popular tools and platforms such as R, Python, and
Tableau to perform exploratory data analysis.

Course Outline:

Module 1: Introduction to EDA (9 hours)

• Introduction to EDA
• Importance of EDA in data-driven decision-making
• Types of data and data quality issues
• Data cleaning techniques
• Data wrangling techniques

Module 2: Data Visualization (12 hours)

• Introduction to data visualization
• Types of data visualization
• Visualization tools: R, Python, Tableau
• Data visualization techniques
• Best practices in data visualization

Module 3: Statistical Inference (12 hours)

• Introduction to statistical inference
• Probability distributions
• Central limit theorem
• Hypothesis testing
• Regression analysis
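
The central limit theorem in this module lends itself to a short simulation. The sketch below draws repeated samples from a uniform distribution on [0, 1) and collects their means; the function name `sample_means`, the sample size of 30, and the fixed seed are assumptions chosen for illustration.

```python
import random
import statistics


def sample_means(sample_size: int, num_samples: int, seed: int = 0) -> list[float]:
    """Draw num_samples samples of uniform(0, 1) values and return each sample's mean."""
    rng = random.Random(seed)  # fixed seed so the experiment is repeatable
    return [
        statistics.fmean(rng.random() for _ in range(sample_size))
        for _ in range(num_samples)
    ]


if __name__ == "__main__":
    means = sample_means(sample_size=30, num_samples=2000)
    # Theory: the means cluster around 0.5 with standard deviation
    # sqrt(1/12) / sqrt(30), even though the underlying data are uniform.
    print(statistics.fmean(means), statistics.stdev(means))
    print((1 / 12) ** 0.5 / 30 ** 0.5)
```

A histogram of `means` looks approximately bell-shaped even though the individual draws are uniform, which is the effect the theorem describes.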

Module 4: Advanced Topics in EDA (9 hours)

• Introduction to machine learning for EDA
• Time series analysis
• Text analytics
• Big data and EDA

Reference Textbooks:

1. "Exploratory Data Analysis with R" by Roger D. Peng, published by Springer.
2. "Python for Data Analysis" by Wes McKinney, published by O'Reilly Media.
3. "Data Visualization: A Practical Introduction" by Kieran Healy, published by
Princeton University Press.
4. "Data Science for Business: What You Need to Know about Data Mining and
Data-Analytic Thinking" by Foster Provost and Tom Fawcett, published by O'Reilly
Media.

Course No.: Name: Data Engineering Credits: 3-0-0-6 Prerequisites: NIL

Course Objectives:

1. To understand data engineering principles and practices, including data modeling,
database design, and data warehousing.
2. To develop skills in building efficient and scalable data pipelines for data
processing and storage.
3. To learn how to manage and optimize data systems for performance and reliability.
4. To gain practical experience with data engineering tools and technologies,
including SQL, ETL, and data warehousing.

Course Outcomes:

1. Demonstrate an understanding of data engineering concepts and principles.
2. Design and implement efficient and scalable data pipelines for data processing and
storage.
3. Manage and optimize data systems for performance and reliability.
4. Apply data engineering tools and technologies to real-world data problems.

Module 1: Introduction to Data Engineering (6 hours)

• Overview of Data Engineering
• Key Concepts in Data Modeling
• Relational Database Design Principles
• Data Warehousing Concepts

Module 2: Data Processing and Storage (12 hours)

• Data Pipelines and ETL (Extract, Transform, Load)
• Distributed Systems and Parallel Computing
• Data Storage Technologies, including NoSQL databases
• Data Quality and Validation
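
As a small illustration of the ETL and data-validation topics in this module, the sketch below runs an extract, transform, load pipeline against an in-memory SQLite database. The sample CSV, the `orders` table, and all function names are hypothetical and exist only for this example.

```python
import csv
import io
import sqlite3

# Hypothetical raw input: one row has a missing amount, one has inconsistent casing.
RAW_CSV = """order_id,amount,currency
1,10.50,usd
2,,usd
3,7.25,USD
"""


def extract(text: str) -> list[dict]:
    """Extract: parse CSV text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows: list[dict]) -> list[tuple]:
    """Transform: drop rows without an amount, cast types, normalize currency codes."""
    clean = []
    for row in rows:
        if not row["amount"]:
            continue  # basic data-quality rule: skip incomplete records
        clean.append((int(row["order_id"]), float(row["amount"]), row["currency"].upper()))
    return clean


def load(conn: sqlite3.Connection, records: list[tuple]) -> None:
    """Load: write the cleaned records into the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(order_id INTEGER PRIMARY KEY, amount REAL, currency TEXT)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", records)
    conn.commit()


def run_pipeline(text: str) -> sqlite3.Connection:
    """Run extract, transform, and load in order and return the database connection."""
    conn = sqlite3.connect(":memory:")
    load(conn, transform(extract(text)))
    return conn


if __name__ == "__main__":
    connection = run_pipeline(RAW_CSV)
    print(connection.execute("SELECT * FROM orders ORDER BY order_id").fetchall())
```

Keeping the three stages as separate functions makes each one testable on its own, and the validation rule in `transform` can grow without touching the loader.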

Module 3: Managing and Optimizing Data Systems (9 hours)

• Performance Tuning and Optimization
• Data Security and Privacy
• Scalability and Availability
• Disaster Recovery and Backup

Module 4: Data Engineering Tools and Technologies (9 hours)

• SQL and Relational Database Management Systems
• Big Data Frameworks, including Hadoop and Spark
• Cloud-Based Data Warehousing, including Amazon Redshift and Google BigQuery
• Data Visualization and Reporting Tools

Module 5: Capstone Project (6 hours)

• Students will apply data engineering concepts, principles, and tools to design and
implement a data pipeline and data warehousing solution for a real-world data
problem.

Textbook References:

1. Designing Data-Intensive Applications by Martin Kleppmann (O'Reilly Media)
2. Data Warehousing in the Age of Big Data by Krish Krishnan (Morgan Kaufmann)
3. The Data Warehouse Toolkit by Ralph Kimball and Margy Ross (Wiley)
