BIG_DATA_Unit 4
(KCS061)
Unit-4
3rd YEAR (6th Sem)
(2023-24)
HDFS High Availability & Federation
High Availability
HDFS Federation
• HDFS federation enhances the existing HDFS architecture. In Hadoop 1, the entire cluster used a single namespace, and a single NameNode managed that namespace. If that NameNode failed, the whole Hadoop cluster went down.
• The Hadoop cluster remained unavailable until that NameNode came back, causing a loss of resources and time.
• HDFS federation in Hadoop 2 overcomes this limitation by allowing the use of more than one NameNode and thus more than one namespace.
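A federated cluster is configured by declaring each nameservice in hdfs-site.xml; a minimal sketch (the nameservice IDs ns1/ns2 and the hostnames below are placeholder examples, not from the slides):

```xml
<!-- hdfs-site.xml: minimal federation sketch with two NameNodes,
     each owning its own namespace (IDs and hosts are placeholders) -->
<configuration>
  <property>
    <name>dfs.nameservices</name>
    <value>ns1,ns2</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.ns1</name>
    <value>namenode1.example.com:8020</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.ns2</name>
    <value>namenode2.example.com:8020</value>
  </property>
</configuration>
```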
HDFS Federation Benefits
• Isolation
• Namespace Scalability
• Enhanced Performance
8
Introduction to NoSQL
What is RDBMS
(Figure: a relation shown as a table, with columns as attributes and rows as tuples)
Issues with RDBMS - Scalability
Sharding
Scaling RDBMS

Master-Slave:
• All writes are written to the master. All reads are performed against the replicated slave databases.
• Critical reads may be incorrect as writes may not have been propagated down.
• Large data sets can pose problems as the master needs to duplicate data to the slaves.

Sharding:
• Scales well for both reads and writes.
• Not transparent; the application needs to be partition-aware.
• Can no longer have relationships or joins across partitions.
• Loss of referential integrity across shards.
12
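The "partition-aware application" cost of sharding can be seen in a small sketch: the client code, not the database, routes each key to its shard. This is plain Python; the hash-modulo scheme and shard count are illustrative assumptions, not from the slides.

```python
# Minimal sketch of application-level sharding: the application,
# not the database, decides which shard holds each key.
NUM_SHARDS = 4
shards = [dict() for _ in range(NUM_SHARDS)]  # each dict stands in for one database

def shard_for(key: str) -> int:
    # Hash-based routing: every read and write must go through this.
    return hash(key) % NUM_SHARDS

def put(key, value):
    shards[shard_for(key)][key] = value

def get(key):
    return shards[shard_for(key)].get(key)

put("user:42", {"name": "Asha"})
print(get("user:42"))  # comes back from whichever shard owns the key

# A join across users on different shards is no longer a single query:
# the application would have to fetch from each shard and combine results itself.
```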
What is NoSQL
5
• Stands for Not Only SQL. The term was coined by Carlo Strozzi and later redefined by Eric Evans.
• Class of non-relational data storage systems.
• Do not require a fixed table schema, nor do they use the concept of joins.
• Relax one or more of the ACID properties (Atomicity, Consistency, Isolation, Durability), as explained by the CAP theorem.
13
Need of NoSQL
• Explosion of social media sites (Facebook, Twitter, Google, etc.) with large data needs.
15
Key Value Pair Based
8
16
Column based
It stores data as column families
19
CAP Theorem
According to Eric Brewer, a distributed system has three properties:
• Consistency
• Availability
• Partition tolerance
We can have at most two of these three properties for any shared-data system. NoSQL systems typically relax consistency; as a result they are easy to distribute and do not require a strict schema.
21
What is not provided by NoSQL
14
Joins
Group by
ACID transactions
SQL
Integration with applications that are based on SQL
22
Where to use NoSQL
15
23
MongoDB
What is MongoDB?
• MongoDB is a document-oriented NoSQL
database used for high-volume data storage.
• Its document data model allows you to represent hierarchical relationships.
• It uses JSON-like documents with optional
schema instead of using tables and rows in
traditional relational databases.
• Documents containing key-value pairs are the
basic units of data in MongoDB.
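A JSON-like document with embedded (hierarchical) data can be sketched in plain Python; the field names below are illustrative, not from the slides.

```python
import json

# A MongoDB-style document: key-value pairs, with an embedded
# document ("address") and an embedded array ("langs") -- the
# hierarchy lives inside one record instead of joined tables.
comment = {
    "name": "Harry",
    "lang": "JavaScript",
    "member_since": 5,
    "address": {"city": "Delhi", "pin": "110001"},  # embedded document
    "langs": ["JavaScript", "Python"],              # embedded array
}

print(json.dumps(comment, indent=2))
```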
25
MongoDB Characteristics
• Document Oriented
• Support for ad hoc queries
• Replication
• Indexing
• Load balancing
• Schema-less database
• Sharding
• High performance
• GridFS
26
MongoDB Applications
27
MongoDB v/s RDBMS
MongoDB:
• Document-oriented and non-relational database
• Document based
• Field based
• Collection based and key–value pair
• Embedded documents
• Has dynamic schema and ideal for hierarchical data storage
• 100 times faster and horizontally scalable through sharding

RDBMS:
• Relational database
• Row based
• Column based
• Table based
• Supports joins
• Has predefined schema and not good for hierarchical data storage
• By increasing RAM, vertical scaling can happen
28
Database Commands
• Show Collections
> show collections
• Create a collection named 'comments'
> db.createCollection('comments')
• Delete a collection named 'comments'
> db.comments.drop()
30
Row(Document) Commands
• Show all Rows in a Collection
> db.comments.find()
• Show all Rows in a Collection (Prettified)
> db.comments.find().pretty()
• Find the first row matching the object
> db.comments.findOne({name: 'Harry'})
31
Row(Document) Commands
• db.list.find() prints out all the documents in the list collection.
32
Row(Document) Commands
• If you call the pretty() method, the result is printed in a formatted, indented form.
33
Row(Document) Commands
• Insert One Row
• db.comments.insert({ 'name': 'Harry', 'lang': 'JavaScript',
'member_since': 5 })
• Insert many Rows
• db.comments.insertMany([
{ 'name': 'Harry', 'lang': 'JavaScript', 'member_since': 5 },
{'name': 'Rohan', 'lang': 'Python', 'member_since': 3 },
{'name': 'Lovish', 'lang': 'Java', 'member_since': 4 }
])
34
Row(Document) Commands
• Search in a MongoDb Database
> db.comments.find({lang:'Python'})
• Limit the number of rows in output
> db.comments.find().limit(2)
• Count the number of rows in the output
> db.comments.find().count()
35
Row(Document) Commands
• Update a row
• db.comments.updateOne( {name: 'John'},
{$set: {'name': 'Harry', 'lang': 'JavaScript',
'member_since': 51 } } )
36
Row(Document) Commands
• Mongodb Rename Operator
• db.comments.update({name: 'Rohan'}, {$rename:
{ member_since: 'member' }})
• Delete Row
• db.comments.remove({name: 'Harry'})
• db.comments.remove({name: 'Harry'},
{justOne: true})
• db.comments.deleteOne()
• db.comments.deleteMany()
37
Advantages of MongoDB
• Flexible Database
• Sharding
• High Speed
• High Availability
• Scalability
• Ad-hoc Query Support
• Easy Environment Setup
40
Disadvantages of MongoDB
• Joins not Supported
• High Memory Usage
• Limited Data Size
• Limited Nesting
41
Spark
Apache Spark Overview
43
DAG
• Stands for Directed Acyclic Graph
• For every Spark job, a DAG of tasks is created
44
Spark APIs
• APIs for
• Java
• Python
• Scala
• R
• Spark itself is written in Scala
45
46
Spark Libraries
• Spark SQL
• For working with structured data. Allows you to
seamlessly mix SQL queries with Spark programs
• Spark Streaming
• Allows you to build scalable fault-tolerant streaming
applications
• MLlib
• Implements common machine learning algorithms
• GraphX
• For graphs and graph-parallel computation
47
Spark Architecture
48
What is Spark Used For?
• Stream processing
• log files
• sensor data
• financial transactions
• Machine learning
• store data in memory and rapidly run repeated queries
• Interactive analytics
• business analysts and data scientists increasingly want to
explore their data by asking a question, viewing the result, and
then either altering the initial question slightly or
drilling deeper into results.
This interactive query process requires systems such as
Spark that are able to respond and adapt quickly
• Data integration
• Extract, transform, and load (ETL) processes
49
Reasons to choose Spark
• Simplicity
• All capabilities are accessible via a set of rich APIs
• designed for interacting quickly and easily with data at scale
• well documented
• Speed
• designed for speed, operating both in memory and on disk
• Support
• supports a range of programming languages
• includes native support for tight integration with a number of leading
storage solutions in the Hadoop ecosystem and beyond
• large, active, and international community
• A growing set of commercial providers
• including Databricks, IBM, and all of the main Hadoop vendors deliver
comprehensive support for Spark-based solutions
50
Spark execution model
• Application
• Driver
• Executer
• Job
• Stage
51
Spark execution model
52
Spark execution model
• At runtime, a Spark application maps to a single driver process and a set of executor processes distributed across the hosts in a cluster.
• The driver process manages the job flow and schedules tasks, and is available the entire time the application is running.
• Typically, this driver process is the same as the client process used to initiate the job.
• In interactive mode, the shell itself is the driver process.
• The executors are responsible for executing work, in the form of tasks, as well as for storing any data that you cache.
• Invoking an action inside a Spark application triggers the launch of a job to fulfill it.
• Spark examines the dataset on which that action depends and formulates an execution plan.
• The execution plan assembles the dataset transformations into stages. A stage is a collection of tasks that run the same code, each on a different subset of the data.
53
Cluster Managers
• Yarn
• Spark Standalone
• Mesos
54
Basic Programming Model
• Spark’s basic data model is called a Resilient
Distributed Dataset (RDD)
• It is designed to support in-memory data storage,
distributed across a cluster
• fault-tolerant
• tracking the lineage of transformations applied to data
• Efficient
• parallelization of processing across multiple nodes in the cluster
• Immutable
• Partitioned
55
RDDs
• Two basic types of operations on RDDs
• Transformations
• Transform an RDD into another RDD, such as mapping,
filtering, and more
• Actions
• Process an RDD into a result, such as count, collect, save, …
• The original RDD remains unchanged throughout
• The chain of transformations from RDD1 to RDDn
are logged
• and can be repeated in the event of data loss or the
failure of a cluster node
56
RDDs
• Data does not have to fit on a single machine
• Data is separated into partitions
• RDDs remain in memory
• greatly increasing the performance of the cluster,
particularly in use cases with a requirement for
iterative queries or processes
57
RDDs
58
Transformations
• Transformations are lazily processed, only upon an action
• Transformations create a new RDD from an existing one
• Transformations might trigger an RDD repartitioning, called a
shuffle
• Intermediate results can be manually cached in memory/on
disk
• Spill to disk can be handled automatically
59
RDDs
• Transformation
• There are two kinds of transformations:
a. Narrow Transformations
It is the result of map, filter and similar operations where the data comes from a single partition only, i.e. it is self-sufficient. An output RDD has partitions with records that originate from a single partition in the parent RDD. Only a limited subset of partitions is used to calculate the result.
60
RDDs
b. Wide Transformations
It is the result of groupByKey() and reduceByKey() like functions. The data required to compute the records in a single partition may live in many partitions of the parent RDD. Wide transformations are also known as shuffle transformations because they trigger a shuffle of data across partitions.
61
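The narrow/wide distinction can be mimicked in plain Python (a conceptual sketch, not Spark code; the partition contents below are illustrative):

```python
from collections import defaultdict

# Pretend each inner list is one partition of an RDD.
partitions = [[1, 2, 3], [4, 5], [6]]

# Narrow transformation (like map): each output partition is computed
# from exactly one parent partition -- no data movement needed.
mapped = [[x * 10 for x in part] for part in partitions]

# Wide transformation (like reduceByKey): records with the same key may
# live in different partitions, so values must be shuffled together first.
pair_partitions = [[("even", 2), ("odd", 1)], [("even", 4)], [("odd", 3)]]
shuffled = defaultdict(list)
for part in pair_partitions:
    for key, value in part:
        shuffled[key].append(value)          # the "shuffle" step
reduced = {key: sum(vals) for key, vals in shuffled.items()}

print(mapped)   # [[10, 20, 30], [40, 50], [60]]
print(reduced)  # {'even': 6, 'odd': 4}
```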
Spark transformations
62
Spark transformations
63
Spark actions
64
Spark actions
65
RDD and cache
• Spark can persist (or cache) a dataset in memory
across operations
• Each node stores in memory any slices of it that it
computes and reuses them in other actions on that
dataset – often making future actions more than 10x
faster
• The cache is fault-tolerant: if any partition of an RDD
is lost, it will automatically be recomputed using
the transformations that originally created it
• You can mark an RDD to be persisted using the
persist() or cache() methods on it
66
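Lineage-based caching and recovery can be sketched with a toy object in plain Python (a conceptual illustration only; ToyRDD and its methods are invented for this sketch, not a Spark API):

```python
# A toy RDD-like object: it remembers its lineage (the function that
# produces it) and only caches the result after persist() + first use.
class ToyRDD:
    def __init__(self, compute):
        self.compute = compute      # lineage: how to (re)build the data
        self.persisted = False
        self._cache = None

    def persist(self):
        self.persisted = True
        return self

    def collect(self):
        if self._cache is not None:
            return self._cache              # served from memory
        data = self.compute()               # computed from lineage
        if self.persisted:
            self._cache = data
        return data

    def drop_partition(self):
        # Simulate losing the cached copy; lineage allows recomputation.
        self._cache = None

rdd = ToyRDD(lambda: [x * x for x in range(5)]).persist()
print(rdd.collect())      # computed once: [0, 1, 4, 9, 16]
rdd.drop_partition()
print(rdd.collect())      # recomputed transparently from lineage
```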
Installing Spark
• Step 1: Installing Java
• We need Java 8 or later
• Step 2: Install Scala
• We will use Spark Scala Shell
• Step 3: Install Spark
67
For Cluster installations
• Each machine will need Spark in the same folder, and key-based passwordless SSH access from the master for the user running Spark
• Slave machines will need to be listed in the slaves file
• See spark/conf/
68
Spark features
• Fast processing
• In memory computing
• Flexible
• Fault tolerance
• Better analytics
69
Advantages of Spark
• Runs Programs up to 100x faster than Hadoop MapReduce
• Does the processing in the main memory of the worker
nodes
• Prevents unnecessary I/O operations with the disks
• Ability to chain the tasks at an application programming
level
• Minimizes the number of writes to the disks
• Uses Directed Acyclic Graph (DAG) data processing engine
• MapReduce is just one set of supported constructs
70
WordCount in Java
71
WordCount In Scala
• Map
• In this step, using Spark context variable, sc, we read a text file.
var map = sc.textFile("/path/to/text/file")
• then we split each line using space " " as separator.
var split = map.flatMap(line => line.split(" "))
• and we map each word to a tuple (word, 1), 1 being the number of occurrences of the word.
var mapf = split.map(word => (word, 1))
• We use the tuple (word,1) as (key, value) in reduce stage.
• Reduce
• We reduce all the words based on key
var counts = mapf.reduceByKey(_ + _);
• Save counts to local file
• The counts could be saved to a local file.
var reducef = counts.saveAsTextFile("/path/to/output/")
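The same map/flatMap/reduceByKey pipeline can be traced in plain Python (a conceptual mirror of the Scala job above, not PySpark; the sample lines are made up):

```python
from collections import Counter

# Stand-in for sc.textFile(...): a few lines of text.
lines = ["to be or not to be", "to see or not to see"]

# flatMap(line => line.split(" ")): one flat list of words
words = [word for line in lines for word in line.split(" ")]

# map(word => (word, 1)) followed by reduceByKey(_ + _):
# summing the 1s per key is exactly what Counter does.
counts = Counter()
for word in words:
    counts[word] += 1

print(dict(counts))
```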
Scala
What is Scala?
74
Features of Scala
76
Operators
• Arithmetic operators
• Assignment operators
• Logical operators
• Bitwise operators
• Relational operators
77
Data Types
78
Casting
• asInstanceOf
79
Operations
scala> var a = 955
a: Int = 955
// Import all definitions from scala.math. This only means you don't have to call math.sin(x), you can call sin(x) instead. To do this, begin your session with
scala> import scala.math._
import scala.math._
scala> print(4 - 3 / 5.0)
3.4
// at compile time these will change to primitives, not objects, to run faster
scala> println(math.sin(4*Pi/3).abs)
// 1) you can import classes  2) => math.abs(math.sin(4*Pi/3))
0.8660254037844385
80
If Expression
scala> var m="wood"
m: String = wood
81
Arrays
scala> var arr = Array(5, 4, 47, 7, 8, 7)
arr: Array[Int] = Array(5, 4, 47, 7, 8, 7)
scala> println(arr(1)); println(arr(2)); println(arr(3));
4
47
7
scala> var array = new Array[String](3);
array: Array[String] = Array(null, null, null)
82
Tuples
scala> print(tuples._1)
10
scala> print(tuples._2)
ten
scala> print(tuples._6._2) // 2nd element of the 6th element in the tuple
NINE
83
Lists
84
While
• while (Boolean Expression) { Expression }
• do { Expression } while (Boolean
Expression)
scala> var i = 0 ;
i : Int = 0
scala> while(i<10){
i+=1;
print(i+" ")
}
1 2 3 4 5 6 7 8 9 10
85
Functions
• Syntax
def 'method_name'('parameter': 'parameter_type'): 'return_type_of_method' = {
  'method_body'
  return 'value'
}
• Note: if you use "return" you must specify the return type; otherwise you are not obligated to, and a one-line function needs no braces.
86
Functions
object FuncEx {
def main(args: Array[String]): Unit = {
// Calling the function
println("Sum is: " + functionToAdd(5,3));
}
// declaration and definition of function
def functionToAdd(a:Int, b:Int) : Int =
{
var sum:Int = 0
sum = a + b
// returning the value of sum
return sum
}
}
Output:
Sum is: 8
87
Lazy val & Eager Evaluation
• lazy val vs. val
The difference between them is that a val is evaluated when it is defined, whereas a lazy val is evaluated when it is accessed for the first time.
In contrast to a method (defined with def), a lazy val is evaluated once and then never again. This can be useful when an operation takes a long time to complete and when it is not certain whether it will be used later. Languages like Scala are strict (eager) by default, but lazy if explicitly specified for given variables or parameters.
E.g.
scala> val x = 15
x: Int = 15
scala> lazy val y = 13
y: Int = <lazy>
scala> x
res0: Int = 15
scala> y
res1: Int = 13
88
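The val vs. lazy val behaviour can be mimicked in Python, which is eager by default (a conceptual sketch; the Lazy class is an illustrative stand-in, not a library API):

```python
class Lazy:
    """Compute a value on first access, then reuse it (like Scala's lazy val)."""
    def __init__(self, thunk):
        self.thunk = thunk
        self.evaluated = False
        self.value = None

    def get(self):
        if not self.evaluated:          # the body runs at most once
            self.value = self.thunk()
            self.evaluated = True
        return self.value

log = []

x = (log.append("eager"), 15)[1]        # plain val: evaluated immediately
y = Lazy(lambda: (log.append("lazy"), 13)[1])

print(log)        # ['eager']  -- y's body has not run yet
print(y.get())    # 13         -- first access triggers evaluation
print(y.get())    # 13         -- cached, not re-executed
print(log)        # ['eager', 'lazy']  (appended only once)
```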
Closures
89
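A closure is a function that captures variables from its enclosing scope; a plain-Python sketch (the counter example is illustrative, not from the slides):

```python
def make_counter(start=0):
    count = start                      # free variable captured by the closure

    def increment():
        nonlocal count                 # the inner function owns no copy;
        count += 1                     # it updates the captured variable
        return count

    return increment

counter = make_counter(10)
print(counter())  # 11
print(counter())  # 12  -- state survives between calls via the closure
```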
Classes
• Scala classes are blueprints or templates for creating objects. They contain information about fields, methods, constructors, super classes, etc. With the help of the class keyword we can define a class.
• To access the members of a class we need to create an object of the class. With the help of the new keyword we can create an object of the class.
90
Classes
class Smartphone
{
// Class variables
var number: Int = 16
var company: String = "ABC"
// Class method
def Display()
{
println("Name of the company : " + company);
println("Total number of Smartphone generation: " + number);
}
}
91
Objects
• We call the elements of a class type objects.
• We create an object by prefixing an application of the constructor of
the class with the operator new.
• It is a basic unit of Object Oriented Programming and represents the
real-life entities.
• In Scala use the object keyword, in place of class, to define an object.
92
Objects
object Main
{
// Main method
def main(args: Array[String]) :Unit = {
// Class object
var obj = new Smartphone();
obj.Display();
}
}
93
Scala Inheritance
class Employee {
  var salary: Float = 10000
}
// Programmer subclass (reconstructed from the output below)
class Programmer extends Employee {
  var bonus: Int = 5000
  println("Salary = " + salary)
  println("Bonus = " + bonus)
}
object MainObject {
  def main(args: Array[String]) {
    new Programmer()
  }
}
Output:
Salary = 10000.0
Bonus = 5000
95
Scala Inheritance
// Vehicle and Bike classes (reconstructed from the output below)
class Vehicle {
  def run() {
    println("Vehicle is running")
  }
}
class Bike extends Vehicle {
  override def run() {
    println("Bike is running")
  }
}
object MainObject {
  def main(args: Array[String]) {
    var b = new Bike()
    b.run()
  }
}
Output:
Bike is running
97
Thank You!
99