LPI DevOps
In 2013, while browsing LinkedIn, I stumbled across an article talking about something called “DevOps”. As I read it, I recognized the problems it described.
The company I’d worked at for over a decade was struggling to deliver software quickly enough. Provisioning environments was a costly, time-consuming, manual, and inelegant affair. Continuous integration was barely existent, and setting up development environments was an exercise in patience. As my job title included the words…
In 2018, after some years of practicing DevOps, I decided to take the LPI-701 DevOps Tools Engineer exam. I checked the LPI site, but unfortunately I could not find documentation for the exam. So I used the official documentation of each tool covered by the exam and, of course, I took notes. After a lot of hard work I earned the certification in November 2018, and today I am sharing all of my notes in this book, in the hope that it helps everyone pass the exam.
About Me:
My name is Radhouen Assakra: DevOps engineer, full-stack engineer, and author. I am an LPI-701 certified DevOps Tools Engineer and hold badges in multiple technologies such as Git, Kubernetes, Docker, and others.
• My LinkedIn: https://www.linkedin.com/in/assakraradhouen/
• My Twitter: @assakraradhouen
• My Github: https://github.com/radhouen
Plan
• Once the software is developed and released, the agile team will not continue to care for its further development.
Kanban
• It is a client-focused process, so it makes sure that the client is continuously involved during every stage.
• Agile teams are extremely motivated and self-organized, so they are likely to deliver better results from development projects.
• The agile software development method ensures that the quality of the development is maintained.
• The process is completely based on incremental progress. Therefore, the client and the team know exactly what is complete and what is not. This reduces risk in the development process.
Limitation of the Agile Model
• The cost of implementing an agile method is a little higher compared to other development methodologies.
• The project can easily go off track if the project manager is not clear about what outcome he/she wants.
Limitation of the Waterfall Model
• The testing process starts once development is over. Hence, there is a high chance of bugs being found late in development, when they are expensive to fix.
Agile and Waterfall are very different software development methodologies, and each is good in its respective way. There are certain major differences, highlighted below:
• The Waterfall model is ideal for projects which have defined requirements and where no changes are expected. Agile, on the other hand, is best suited where there is a higher probability of frequent requirement changes.
• In the Agile process, requirements can change frequently. In a Waterfall model, however, they are defined only once by the business analyst.
• In Agile, the description of project details can be altered anytime during the software development life cycle (SDLC).
When it comes to improving IT performance in order to give organizations competitive advantages, we need a
new way of thinking, a new way of working that improves all the production and management processes and
operations from the team or project level to the organizational level while encouraging collaboration between
all the individuals involved for fast delivery of valuable products and services.
For this reason, a new culture, corporate philosophy and way of working is emerging. This way of working
integrates agile methods, lean principles and practices, social psychological beliefs for motivating workers,
systems thinking for building complex systems, continuous integration and continuous improvement of IT
products and services for satisfying both customers and production and development teams. This new way of
working is DevOps.
What’s DevOps
Adam Jacob, in a presentation, defined DevOps as “a cultural and professional movement, focused on how we build and operate high velocity organizations, born from the experiences of its practitioners”. This guru of DevOps also states that DevOps is reinventing the way we run our businesses. Moreover, he argues that DevOps is not the same everywhere but unique to the people who have practiced it (Jacob, 2015).
Gartner analysts declare that DevOps “… is a culture shift designed to improve quality of solutions that are
business-oriented and rapidly evolving and can be easily molded to today’s needs” (Wurster, et al., 2013).
Thus, DevOps is a movement that integrates different ways of thinking and different ways of working for the fast delivery of valuable products and services.
We cannot talk about DevOps in a corporate environment without integrating a set of principles and practices
that make development and operations teams work together. For this reason, Gartner analysts hold that DevOps takes into account several commonly agreed practices which form the fundamentals of DevOps:
• Continuous delivery: DevOps strives for deadlines and benchmarks with major releases. The ideal goal is to deliver code changes to production as quickly and safely as possible.
• It's essential for the operational team to fully understand the software release and its hardware/network implications.
2. Collaborative Development
This starts with development sketch plan and programming.
3. Continuous Testing
Unit and integration testing help increase the efficiency and speed of the development.
5. Continuous Monitoring
This is needed to monitor changes and address errors and mistakes spontaneously whenever they happen.
Taking care of these six stages will make you a good DevOps organization. This is not a must-have model, but it is one of the more sophisticated ones, and it will give you a fair idea of the tools to use at the different stages.
CD pipelines, CI tools, and containers make things easy. When you want to practice DevOps, having a good toolchain for each stage helps.
• Agile
A software development method with emphasis on iterative, incremental, and evolutionary development.
An iterative approach which focuses on collaboration, customer feedback, and small, rapid releases.
Gives priority to the working system over complete documentation.
• DevOps
A software development method that focuses on communication, integration, and collaboration among IT professionals.
The practice of bringing development and operations teams together.
Process documentation is foremost, since the software is handed to the operations team for deployment.
DevOps is a culture; it is an extension of Agile.
What is TDD ?
Test-driven development is a software development process that relies on the repetition of a very short
development cycle: requirements are turned into very specific test cases, then the software is improved so that
the tests pass.
It refers to a style of programming in which three activities are nested:
Coding.
Testing (in the form of writing unit tests).
Refactoring
TDD cycles
Application architecture
• Service-based architecture
• All the services would then work with an aggregation layer that can be termed as a bus.
• As the SOA bus got bigger and bigger, with more and more services attached to it, it became a bottleneck.
• All the communication between the services is over REST over HTTP.
• https://microservices.io/
• https://rubygarage.org/blog/monolith-soa-microservices-serverless
Micro-services vs Monolith
Restful API
What is an API ?
What is REST ?
REST Methods:
https://www.restapitutorial.com/lessons/httpmethods.html
REST Endpoint :
The URI/URL where API/service can be accessed by a client application.
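To make these terms concrete, here is a hedged sketch of calling a REST endpoint with the common HTTP methods using curl (the host, path, and payload are hypothetical):
$ curl -X GET https://api.example.com/v1/users            # read the collection
$ curl -X GET https://api.example.com/v1/users/42         # read one resource
$ curl -X POST https://api.example.com/v1/users -H "Content-Type: application/json" -d '{"name":"bob"}'    # create
$ curl -X PUT https://api.example.com/v1/users/42 -H "Content-Type: application/json" -d '{"name":"bob"}'  # replace
$ curl -X DELETE https://api.example.com/v1/users/42      # delete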
Authentication
Some APIs require authentication to use their service. This could be free or paid.
Broken authentication
Etc ...
Application security risks
Authentication without passwords ( cryptography private keys, bio-metrics, smart card, etc ...)
Using the Cross-Origin Resource Sharing (CORS) headers to prevent Cross-site request forgery “CSRF”
Avoid using redirects and forwards whenever possible. At least prevent users from affecting the destination.
Application security risks
Commands are sent from the user's browser to a web site or a web application.
CORS handles this vulnerability well: it disallows the retrieval and inspection of data from another origin.
It prevents third-party JavaScript from reading data out of the image, and causes such AJAX requests to fail.
deployment
Plan
Relational database:
Each row in a table has its own unique key (primary key).
Rows in a table can be linked to rows in other tables by adding a foreign key.
NoSQL database:
Mechanism for storage and retrieval of data other than the tabular relations used in relational
databases.
Properties :
Simplicity of design
Object storage:
CAP theorem:
It is impossible for a distributed data store to simultaneously provide more than two out of the three
guarantees :
Consistency: all clients receive the same information, regardless of the node that processes the request.
Availability: the system provides answers for all requests it receives, even if one or more nodes are down.
Partition tolerance: the system still works even though it has been divided by a network failure.
Data platforms and concepts
ACID properties :
Set of properties of database transactions intended to guarantee validity even in the event of errors, power failures, and other mishaps.
Atomicity : each transaction is treated as a single "unit", which either succeeds completely, or fails
completely.
Consistency (integrity): ensures that a transaction can only bring the database from one valid state to another valid state.
Isolation: two or more transactions made at the same time must be independent and do not affect each
other.
Durability: If a transaction is successful, it will persist in the system (recorded in non-volatile memory)
Data platforms and concepts
Message brokers
A message broker acts as an intermediary platform when it comes to processing communication between two
applications.
Take incoming messages from applications and perform some action on them :
Route messages
Hub-and-spoke
Message bus.
AWS SQS, RabbitMQ, Apache Kafka, ActiveMQ, Openstack Zaqar, Jboss Messaging, ...
Message brokers and queues
Message brokers
Perform message aggregation, decomposing messages into multiple messages and sending them to their
destination, then recomposing the responses into one message to return to the user.
AWS Lambda
Plesk
OpenShift
Cloud Foundry
Etc ...
PaaS Platforms : CloudFoundry
Promoted for continuous delivery: supports the full application development life cycle (from initial development, through all testing stages, to deployment).
Container-based architecture : runs apps in any programming language over a variety of cloud service
providers.
The platform is available from either the Cloud Foundry Foundation as open-source software or from a variety of commercial providers.
In a platform, all external dependencies (databases,messaging systems, files systems, etc ...) are considered
services.
PaaS Platforms : OpenShift
• Used to create, test, and run applications, and finally deploy them on cloud.
• Capable of managing applications written in different languages (Node.js, Ruby, Python, Perl, and Java).
• It is extensible: it helps users support applications written in other languages.
OpenStack is a free and open-source software platform for cloud computing, mostly deployed as IaaS. It consists of interrelated components that control diverse, multi-vendor hardware pools of processing, storage, and networking resources throughout a data center.
OpenStack Components
PaaS Platforms : Openstack Architecture
Cloud-init: what's cloud-init?
Cloud-init allows you to customize a new server installation during its deployment using data supplied in the form of cloud-config (user-data) files or scripts.
Etc ...
Disk configuration
Command execution
Package management
Bootstrapping Chef/Puppet/Ansible
Etc ...
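As a hedged illustration (the package, command, user, and SSH key are placeholders), a minimal cloud-config user-data file covering some of these tasks might look like this:
#cloud-config
package_update: true
packages:
  - nginx
runcmd:
  - systemctl enable --now nginx
users:
  - name: devops
    ssh_authorized_keys:
      - ssh-rsa AAAA...PLACEHOLDER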
Module 3
Source code management
Plan
• Collaboration: allows changes to be made and merged from multiple sources.
Documents
Computer programs
• Each revision is associated with a timestamp and the person making the change.
• Revisions can be compared, restored, and with some types of files, merged
Why use SCM ???
1. Version Control
2. Working in teams
6. Show off
SCM solutions: SCM types
Ability to work offline (Allows users to work productively when not connected to a network)
• The concept of a centralized system is that it works on a client-server relationship. The repository is located in one place and provides access to many clients.
• In a distributed system, by contrast, every user has a local copy of the repository in addition to the central repository on the server side.
• Licensed under the GNU GPL version 2.
• Advantages :
Implicit backup
• Git provides the git config tool, which allows you to set configuration variables. Git stores all global configuration in the .gitconfig file, which is located in your home directory. To set these configuration values as global, add the --global option; if you omit the --global option, then your configuration is specific to the current Git repository.
• You can also set up system-wide configuration. Git stores these values in the /etc/gitconfig file, which contains the configuration for every user and repository on the system. To set these values, you must have root rights and use the --system option.
• Color highlighting: see the example below.
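A minimal sketch of the corresponding commands (name and email are placeholders):
$ git config --global user.name "Your Name"
$ git config --global user.email "you@example.com"
$ git config --global color.ui auto        # enable color highlighting
$ git config --list                        # show the resulting configuration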
• Git and Github are two different things. Git is the version control system, while GitHub is a service for hosting Git repos that
helps people collaborate on writing software. However, they are often confused because of their similar names, because of
the fact that GitHub builds on top of Git, and because many websites and articles don’t make the difference between them
clear enough.
Git service hosting
Git Life Cycle
We will discuss the life cycle of Git here. In later chapters, we will cover the Git commands for each operation.
If necessary, you also update the working copy by taking other developers' changes.
You commit changes. If everything is fine, then you push the changes to the repository.
After committing, if you realize something is wrong, then you correct the last commit and push the changes to the repository.
• Git Add : git add <fileName> , git add -A or git add . , git add -u(Modified and deleted files), git add *.go
• Git Commit : git commit -a , git commit -m , git commit -am , git commit --amend
• Git Clone
• Git Stash
• Git Ignore
• Git Fork: A fork is a copy of a repository. Forking a repository allows you to freely experiment with changes without affecting the original project.
• Git Repository
• Git Index : When you create a commit, what is committed is what is currently in the index, not what is in your working
directory.
• Git Head : HEAD is a reference to the last commit in the currently check-out branch.
• Git Remote
• Git Tags : Git has the ability to tag specific points in a repository’s history as being important. Typically, people use this functionality to mark release points (v1.0, v2.0, and so on).
• Git Checkout : the act of switching between different versions of a target entity. The git checkout command operates upon three distinct entities: files, commits, and branches. In the context of undoing changes, checkout can also be used to view old commits without altering the current state of the repository.
• Git Revert : git revert <commit-hash> (get the hash from git log) creates a new commit that undoes the changes introduced by the given commit.
• Git Reset : git reset HEAD~ --hard removes the last commit from the master branch.
• Git Rm : Remove files from the index, or from the working tree and the index
• Git Cherry-pick : Sometimes you don't want to merge a whole branch into another and only need to pick one or two specific commits; cherry-picking applies the changes from those commits onto your current branch.
• Git Diff : Show changes between commits, commit and working tree
• Git Status : displays the state of the working directory and the staging area
• Git Blame : Show what revision and author last modified each line of a file
Git work flow :Recovering from mistakes in Git
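As a hedged sketch of typical recovery commands (the commit hash is a placeholder):
$ git log --oneline                        # find the hash of the offending commit
$ git revert a1b2c3d                       # safe: adds a new commit that undoes a1b2c3d
$ git commit --amend -m "Better message"   # fix the message or content of the last commit
$ git reset HEAD~ --hard                   # destructive: drops the last local commit entirely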
Git Branching and Merging
• Git Branch
• Git Rebase
• Git Squash: With git it’s possible to squash previous commits into one. This is a great way to group related changes together into a single commit before sharing them.
• Git Fetch: Is a primary command used to download contents from a remote repository.
• Git Pull: Fetch from and integrate with another repository or a local branch.
• Git Push: Command used to upload local repository content to a remote repository. Pushing is how you transfer commits from your local repository to a remote repository.
A local .gitignore file is usually placed in the root directory of a project. You can also create a global .gitignore file, and any entries in that file will be ignored in all of your Git repositories.
Branches are used to develop features isolated from each other. The master branch is the "default" branch when you create a
repository. Use other branches for development and merge them back to the master branch upon completion.
Git Branching: Rename and Delete
Git Branching: Push branch to remote repository
Git work flow : Rebase vs Merge
• Rebasing and merging are both designed to integrate changes from one branch into another branch but in
different ways.
• For ex. let’s say we have commits like below, the merge will result as a combination of commits, whereas
rebase will add all the changes in feature branch starting from the last commit of the master branch:
• When you do rebase a feature branch onto master, you move the base of the feature branch to master
• Merging takes the contents of the feature branch and integrates it with the master branch. As a result, only
the master branch is changed. The feature branch history remains same.
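A hedged sketch of both approaches on a branch named feature (branch names are illustrative):
# Merge: keeps the feature history and adds a merge commit on master
$ git checkout master
$ git merge feature
# Rebase: replays the feature commits on top of master, giving a linear history
$ git checkout feature
$ git rebase master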
• Develop: not ready for public consumption, but compiles and passes all tests
• Feature branches :
• Release branches
• Hotfix
• Bugfix
git commit -a
git push
5) Bring it up to date with develop (to minimize big changes on the ensuing pull request)
6) Finish the feature branch (don’t use git flow feature finish)
Source: https://danielkummer.github.io/git-flow-cheatsheet/
Module 4
System image creation and VM
Deployment
Plan
• Vagrant
• Vagrantfile
• Vagrantbox
• Packer
Vagrant
• Wrapper around configuration management software such as Ansible, Chef, Salt, and Puppet.
$ mkdir centos
$ cd centos
$ vagrant up
• OR :
$ vagrant ssh
Vagrant Command
• Creating a VM :
• vagrant init -- Initialize Vagrant with a Vagrantfile and ./.vagrant directory, using no specified base image.
Before you can do vagrant up, you'll need to specify a base image in the Vagrantfile.
vagrant init -f : Create a new Vagrantfile, overwriting the one at the current path.
vagrant init --box-version : Create a Vagrantfile, locking the box to a version constraint.
• vagrant init <boxpath> -- Initialize Vagrant with a specific box. To find a box, go to the public Vagrant box catalog. When you find one you like, just replace its name with boxpath; a minimal Vagrantfile sketch follows below.
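A hedged, minimal Vagrantfile sketch (box name, forwarded port, and memory size are illustrative):
Vagrant.configure("2") do |config|
  config.vm.box = "centos/7"                               # base image from the public box catalog
  config.vm.network "forwarded_port", guest: 80, host: 8080
  config.vm.provider "virtualbox" do |vb|
    vb.memory = "1024"
  end
end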
• Starting a VM :
vagrant up -- starts vagrant environment (also provisions only on the FIRST vagrant up)
vagrant resume -- resume a suspended machine (vagrant up works just fine for this as well)
vagrant reload --provision -- restart the virtual machine and force provisioning
vagrant ssh <boxname> -- If you give your box a name in your Vagrantfile, you can ssh into it with that name instead of default.
• Stopping a VM:
• Saving Progress:
vagrant snapshot save [options] [vm-name] <name> -- vm-name is often default. Allows us to save the VM state so that it can be restored later.
• Tips:
vagrant provision --debug -- use the debug flag to increase the verbosity of the output
vagrant up --provision | tee provision.log -- Runs vagrant up, forces provisioning and logs all output to a
file
Vagrant provisioners
• Alright, so we have a virtual machine running a basic copy of Ubuntu and we can edit files from our
machine and have them synced into the virtual machine. Let us now serve those files using a webserver.
• We could just SSH in and install a webserver and be on our way, but then every person who used Vagrant
would have to do the same thing. Instead, Vagrant has built-in support for automated provisioning. Using
this feature, Vagrant will automatically install software when you vagrant up so that the guest machine can be repeatably created and ready-to-use.
https://www.vagrantup.com/intro/getting-started/provisioning.html
https://docs.ansible.com/ansible/latest/scenario_guides/guide_vagrant.html
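A hedged sketch of a shell provisioner in the Vagrantfile (the box and script names are illustrative); an Ansible provisioner is wired in the same way with config.vm.provision "ansible":
Vagrant.configure("2") do |config|
  config.vm.box = "ubuntu/bionic64"
  config.vm.provision "shell", path: "bootstrap.sh"   # runs automatically on the first vagrant up
end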
Vagrant Box contents
Vagrantfile : The information from this will be merged into your Vagrantfile that is created when you
run vagrant init boxname in a folder.
• Boxes commands :
vagrant box list -- see a list of all installed boxes on your computer
vagrant box add <name> <url> -- download a box image to your computer
Multi-provider portability
Stability
Identicality
Packer Use Cases
• Continuous Delivery:
Generate new machine images for multiple platforms on every change to Ansible, Puppet or Chef repositories
• Environment Parity:
• Auto-Scaling acceleration:
Launch completely provisioned and configured instances in seconds, rather than minutes or even hours.
Packer Terminology:
variables (optional) : Variables allow you to set API keys and other variable settings without changing the configuration file.
provisioners (optional) : Tools that install software after the initial OS install.
• This varies depending on which builder you use. The following is an example of the workflow for the QEMU builder:
4. Using VNC, type in commands in the installer to start an automated install via kickstart/preseed/etc
9. Packer Shuts down VM and then runs the post processor (if set)
10. PROFIT!
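For reference, a hedged sketch of a minimal Packer template for this flow (the ISO URL, checksum, and credentials are placeholders, and a real QEMU build needs more builder options):
{
  "builders": [
    {
      "type": "qemu",
      "iso_url": "http://example.com/centos7.iso",
      "iso_checksum": "sha256:PLACEHOLDER",
      "ssh_username": "root",
      "ssh_password": "PLACEHOLDER",
      "shutdown_command": "shutdown -P now"
    }
  ],
  "provisioners": [
    { "type": "shell", "inline": ["yum -y update"] }
  ]
}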
• Usage: packer [--version] [--help] <command> [<args>]
• Advantages of Virtualization:
Conserve power
Easier automation.
• Problems of Virtualization:
• Containers Solution :
Containers provide a standard way to package your application's code, configurations, and dependencies into a single object.
Containers share an operating system installed on the server and run as resource-isolated processes, ensuring quick, reliable, and consistent deployments.
• Linux Containers :
cgroup: Control Groups provide a mechanism for aggregating/partitioning sets of tasks, and all their
future children, into hierarchical groups with specialized behaviour.
namespace: wraps a global system resource in an abstraction that makes it appear to the processes
within the namespace that they have their own isolated instance of the global resource.
• In short:
Memory
CPU
Block I/O
Network
Multiple namespaces:
Written in: Go
Subscription-based, commercially
Intended for:
infrastructure density
Container Solutions & Landscape
Image
Container
The image when it is ‘running.’ The standard unit for app service
Engine
The software that executes commands for containers. Networking and volumes are part of Engine.
Registry
Control Plane
Dockerfile
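A Dockerfile is the build recipe for an image. A hedged, minimal sketch (base image, content path, and port are illustrative):
FROM nginx:alpine                      # start from a small base image
COPY ./site/ /usr/share/nginx/html/    # add the application content
EXPOSE 80                              # document the listening port
CMD ["nginx", "-g", "daemon off;"]     # default command when the container starts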
• Lifecycle
Commands
docker create creates a container but does not start it.
docker rename allows the container to be renamed.
docker run creates and starts a container in one operation.
docker rm deletes a container.
docker update updates a container's resource limits.
• Starting and Stopping
docker start starts a container so it is running.
docker stop stops a running container.
docker restart stops and starts a container.
docker pause pauses a running container, "freezing" it in place.
Commands
docker unpause will unpause a running container.
docker wait blocks until running container stops.
docker kill sends a SIGKILL to a running container.
docker attach will connect to a running container.
• Images : Images are just templates for docker containers.
• Life cycle :
docker images shows all images.
docker import creates an image from a tarball.
docker build creates image from Dockerfile.
docker commit creates image from a container, pausing it
temporarily if it is running.
docker rmi removes an image.
docker load loads an image from a tar archive as STDIN,
including images and tags (as of 0.7).
docker save saves an image to a tar archive stream to STDOUT
with all parent layers, tags & versions (as of 0.7).
• Info :
docker history shows history of image.
docker tag tags an image to a name (local or registry).
Docker Network
host: For standalone containers, remove network isolation between the container and the Docker host, and use the host's networking directly.
overlay: Connect multiple Docker daemons together and enable swarm services to communicate with
each other.
macvlan : Allows you to assign a MAC address to a container, making it appear as a physical device on your network.
none: Disable all networking. Usually used in conjunction with a custom network driver.
• Containers can be attached and detached from user-defined networks on the fly.
• Network commands:
docker network create my-net : create a new network named my-net
docker network ls : list all Docker networks
docker network rm my-net : delete the network my-net
docker create --name my-nginx --network my-net --publish 8080:80 nginx:latest : create a new container attached to the network my-net
docker network connect my-net my-nginx : connect a running container to the network my-net
docker network disconnect my-net my-nginx : disconnect the container from the network my-net
Module 6
Container Infrastructure
Plan
local Mac
Windows box
Company network
Data center
Create a machine. Requires the --driver flag to indicate which provider (VirtualBox, DigitalOcean, AWS, etc.) to create the machine on.
• Here is an example of using the virtualbox driver to create a machine called dev.
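A hedged sketch of that command, plus the follow-up to point your Docker client at the new machine (the machine name dev is arbitrary):
$ docker-machine create --driver virtualbox dev
$ docker-machine env dev                 # print the environment variables for targeting dev
$ eval "$(docker-machine env dev)"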
• Machine drivers:
Microsoft Azure
Digital Ocean
Microsoft Hyper-V
OpenStack
Rackspace
IBM Softlayer
Oracle VirtualBox
VMware Fusion
• Docker machine :
active Print which machine is active
config Print the connection config for machine
create Create a machine
env Display the commands to set up the env for the Docker client
inspect Inspect information about a machine
ip Get the IP address of a machine
kill Kill a machine
ls List machines
provision Re-provision existing machines
regenerate-certs Regenerate TLS Certificates for a machine
restart Restart a machine
rm Remove a machine
ssh Log into or run a command on a machine with SSH.
scp Copy files between machines
mount Mount or unmount a directory from a machine with SSHFS.
start Start a machine
status Get the status of a machine
stop Stop a machine
upgrade Upgrade a machine to the latest version of Docker
url Get the URL of a machine
version Show the Docker Machine version or a machine docker version
help Shows a list of commands or help for one command
Flocker
• Flocker is an open-source Container Data Volume Manager for your Dockerized applications.
• By providing tools for data migrations, Flocker gives ops teams the tools they need to run containerized stateful services like databases in production.
• Unlike a Docker data volume which is tied to a single server, a Flocker data volume, called a dataset,
is portable and can be used with any container, no matter where that container is running.
• Flocker manages Docker containers and data volumes together. When you use Flocker to manage your
stateful microservice, your volumes will follow your containers when they move between different hosts
in your cluster.
• You can also use Flocker to manage only your volumes, while continuing to manage your containers however you choose.
• Docker-compose
• Docker Swarm
• Kubernetes
Docker compose
• A tool for defining and running multi-container Docker applications.
• With Compose, you use a YAML file to configure your application's services.
• Then, with a single command, you create and start all the services from your configuration.
• Compose works in all environments:
Production,
Staging,
Testing,
As well as CI workflows.
Docker-compose
1- Development environments :
• Create and start one or more containers for each dependency (databases, queues, caches, web services) with a single command.
3- Cluster deployments :
• The Docker Engine may be a single instance provisioned with Docker Machine or an entire Docker
Swarm cluster
Docker-compose
Define the services that make up your app in docker-compose.yml so they can be run together in
an isolated environment.
Lastly, run docker-compose up and Compose will start and run your entire app.
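A hedged sketch of a two-service docker-compose.yml and the commands to run it (image names, ports, and the password are illustrative):
version: "3"
services:
  web:
    image: nginx:alpine
    ports:
      - "8080:80"
    depends_on:
      - db
  db:
    image: postgres:12
    environment:
      POSTGRES_PASSWORD: example

$ docker-compose up -d     # create and start all services in the background
$ docker-compose ps        # list the running containers
$ docker-compose down      # stop and remove the containers and networks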
• Docker Compose commands:
build Build or rebuild services
bundle Generate a Docker bundle from the Compose file
config Validate and view the Compose file
create Create services
down Stop and remove containers, networks, images, and volumes
events Receive real time events from containers
exec Execute a command in a running container
help Get help on a command
images List images
kill Kill containers
logs View output from containers
pause Pause services
port Print the public port for a port binding
ps List containers
pull Pull service images
push Push service images
restart Restart services
rm Remove stopped containers
run Run a one-off command
scale Set number of containers for a service
start Start services
stop Stop services
top Display the running processes
unpause Unpause services
up Create and start containers
version Show the Docker-Compose version information
Containers have become popular thanks to their focus on consistency across platforms from development to production. The rise in interest in containers has in turn brought higher demands for their deployment and management.
The need for better control attracted a number of software options as solutions for container orchestration, which allow for abstraction of individual hosts into a cluster.
Two of the major players developing container orchestration are Kubernetes and Docker Swarm. In this section, we will take a look at Docker Swarm, which turns a pool of Docker hosts into a single, virtual Docker host.
Docker swarm
Docker swarm
features highlights
1- Cluster management integrated with Docker Engine: Use the Docker Engine CLI to create a swarm of Docker Engines
where you can deploy application services. You don’t need additional orchestration software to create or manage a swarm.
2- Decentralized design: Instead of handling differentiation between node roles at deployment time, the Docker Engine
handles any specialization at runtime. You can deploy both kinds of nodes, managers and workers, using the Docker Engine.
This means you can build an entire swarm from a single disk image.
3- Declarative service model: Docker Engine uses a declarative approach to let you define the desired state of the various
services in your application stack. For example, you might describe an application comprised of a web front end service with message queueing services and a database backend.
manager automatically adapts by adding or removing tasks to maintain the desired state.
5- Desired state reconciliation: The swarm manager node constantly monitors the cluster state and reconciles any differences
between the actual state and your expressed desired state. For example, if you set up a service to run 10 replicas of a
container, and a worker machine hosting two of those replicas crashes, the manager creates two new replicas to replace the
replicas that crashed. The swarm manager assigns the new replicas to workers that are running and available.
Docker swarm
features highlights
6- Multi-host networking: You can specify an overlay network for your services. The swarm manager automatically assigns
addresses to the containers on the overlay network when it initializes or updates the application.
7- Service discovery: Swarm manager nodes assign each service in the swarm a unique DNS name and load balances
running containers. You can query every container running in the swarm through a DNS server embedded in the swarm.
8- Load balancing: You can expose the ports for services to an external load balancer. Internally, the swarm lets you specify
how to distribute service containers between nodes.
9- Secure by default: Each node in the swarm enforces TLS mutual authentication and encryption to secure communications
between itself and all other nodes. You have the option to use self-signed root certificates or certificates from a custom root
CA.
10- Rolling updates: At roll-out time you can apply service updates to nodes incrementally. The swarm manager lets you control the delay between service deployment to different sets of nodes. If anything goes wrong, you can roll back to a previous version of the service.
Node :
• A node is an instance of the Docker engine participating in the swarm. You can also think of this as a Docker node. You
can run one or more nodes on a single physical computer or cloud server, but production swarm deployments typically
include Docker nodes distributed across multiple physical and cloud machines.
• To deploy your application to a swarm, you submit a service definition to a manager node. The manager node dispatches units of work called tasks to worker nodes.
Manager nodes also perform the orchestration and cluster management functions required to maintain the desired state of
the swarm. Manager nodes elect a single leader to conduct orchestration tasks.
These tasks include maintaining cluster state, scheduling services, and serving swarm mode HTTP API endpoints.
Worker nodes receive and execute tasks dispatched from manager nodes. By default manager nodes also run services as
worker nodes, but you can configure them to run manager tasks exclusively and be manager-only nodes. An agent runs on
each worker node and reports on the tasks assigned to it. The worker node notifies the manager node of the current state
of its assigned tasks so that the manager can maintain the desired state of each worker.
Docker swarm
• A service is the definition of the tasks to execute on the manager or worker nodes. It is the central structure of the
swarm system and the primary root of user interaction with the swarm.
• When you create a service, you specify which container image to use and which commands to execute inside running
containers.
• In the replicated services model, the swarm manager distributes a specific number of replica tasks among the nodes based upon the scale you set in the desired state.
• For global services, the swarm runs one task for the service on every available node in the cluster.
• A task carries a Docker container and the commands to run inside the container. It is the atomic scheduling unit of
swarm. Manager nodes assign tasks to worker nodes according to the number of replicas set in the service scale. Once a
task is assigned to a node, it cannot move to another node. It can only run on the assigned node or fail.
Docker swarm
Load Balancing:
• The swarm manager uses ingress load balancing to expose the services you want to make available externally to the
swarm. The swarm manager can automatically assign the service a PublishedPort or you can configure a PublishedPort for
the service. You can specify any unused port. If you do not specify a port, the swarm manager assigns the service a port in the 30000-32767 range.
• External components, such as cloud load balancers, can access the service on the PublishedPort of any node in the cluster
whether or not the node is currently running the task for the service. All nodes in the swarm route ingress connections to a running task instance.
• Swarm mode has an internal DNS component that automatically assigns each service in the swarm a DNS entry. The swarm manager uses internal load balancing to distribute requests among services within the cluster based upon the DNS name of the service.
Commands
• Initialize A Swarm:
1. Make sure the Docker Engine daemon is started on the host machines.
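A hedged sketch of the basic swarm commands (the IP address and join token are placeholders):
$ docker swarm init --advertise-addr 192.168.99.100              # run on the first manager node
$ docker swarm join --token <worker-token> 192.168.99.100:2377   # run on each worker node
$ docker node ls                                                 # list the nodes in the swarm
$ docker service create --replicas 3 --name web -p 8080:80 nginx # deploy a replicated service
$ docker service ls                                              # check the service and its replicas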
Use Cases
kubernetes
What’s kubernetes ?
Sometimes called:
Kube
application architectures
4. Kubernetes does NOT and will not expose all of the 'features' of the
• Master
• Minion/Node
• Pod
• Deployment
• Service
• Label
• Namespace
Kubernetes Overview
Kubernetes Primitives and key words:
Kubernetes Master
kube-apiserver
kube-scheduler
kube-controller-manager
Etcd
• Might contain:
kube-proxy
• Kube-apiserver : Component on the master that exposes the Kubernetes API. It is the front-end for the Kubernetes control
plane.It is designed to scale horizontally – that is, it scales by deploying more instances.
• Etcd : Consistent and highly-available key value store used as Kubernetes’ backing store for all cluster data.
• Scheduler : Component on the master that watches newly created pods that have no node assigned, and selects a node for them to run on.
• Controller-manager: Component on the master that runs controllers . Logically, each controller is a separate process, but
to reduce complexity, they are all compiled into a single binary and run in a single process:
Node Controller : For checking the cloud provider to determine if a node has been deleted in the cloud after it stops
responding
Service Controller : For creating, updating and deleting cloud provider load balancers
Volume Controller : For creating, attaching, and mounting volumes, and interacting with the cloud provider to
orchestrate volume
Kubernetes Minion-node
kubelet
kube-proxy
cAdvisor
• Might contain:
• A Pod is the basic execution unit of a Kubernetes application–the smallest and simplest unit in the Kubernetes object
model that you create or deploy. A Pod represents processes running on your Cluster .
apiVersion: v1
kind: Pod
metadata:
name: myapp-pod
labels:
app: myapp
spec:
containers:
- name: myapp-container
image: busybox
A ReplicationController ensures that a specified number of pod replicas are running at any one time. In other words, a
ReplicationController makes sure that a pod or a homogeneous set of pods is always up and available.
Consists of :
Pod template
Count
Label Selector
• Kube will try to keep $count copies of pods matching the label selector running.
• If too few copies are running the replication controller will start a new pod somewhere in the cluster
A ReplicaSet’s purpose is to maintain a stable set of replica Pods running at any given time. As such, it is often used to
guarantee the availability of a specified number of identical Pods.
Replica Sets are declared in essentially the same way as Replication Controllers, except that they have more options for the
selector.
Kubernetes Replication controller ,replicaset
Replication controller
You describe a desired state in a Deployment, and the Deployment Controller changes the actual state to the desired state at a controlled rate. You can define Deployments to create new ReplicaSets, or to remove existing Deployments and adopt all their resources with new Deployments.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  replicas: 3            # make 3 replicas of the nginx service, selected using matchLabels
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.12.2
        ports:
        - containerPort: 80
Kubernetes Services
With Kubernetes you don’t need to modify your application to use an unfamiliar service discovery mechanism. Kubernetes
gives Pods their own IP addresses and a single DNS name for a set of Pods, and can load-balance across them.
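As a hedged sketch that pairs with the nginx Deployment above (the Service name and type are illustrative choices):
apiVersion: v1
kind: Service
metadata:
  name: nginx-service
spec:
  selector:
    app: nginx          # route traffic to Pods carrying this label
  ports:
    - protocol: TCP
      port: 80          # port exposed by the Service
      targetPort: 80    # port the container listens on
  type: ClusterIP       # cluster-internal IP; use NodePort or LoadBalancer to expose externally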
Module 7
Continuous Integration and Jenkins
Plan
• Continuous Integration (CI)
• What is it?
• What are the benefits?
• Continuous Build Systems
• Jenkins
• What is it?
• Where does it fit in?
• Why should I use it?
• What can it do?
• How does it work?
• Where is it used?
• How can I get started?
• Putting it all together
• Conclusion
• References
CI- Defined
“Continuous Integration is a software development practice where members of a team integrate their work frequently; usually each person integrates at least daily, leading to multiple integrations per day. Each integration is verified by an automated build (including test) to detect integration errors as quickly as possible.”
– Martin Fowler
CI- What does it really mean ?
• Integrated
All changes up until that point are combined into the project
• Built
• Tested
• Archived
• Deployed
Simply, you have two identical environments (infrastructure) with the “green” environment hosting the current production apps
Now, when you’re ready to make a change to app2 for example and upgrade it to v2, you’d do so in the “blue environment”. In
that environment you deploy the new version of the app, run smoke tests, and any other tests (including those to exercise/prime
the OS, cache, CPU, etc). When things look good, you change the loadbalancer/reverse proxy/router to point to the blue
environment:
Canary deployment
Canary is about deploying an application in small, incremental steps, and only to a small group of people. There are a few
possible approaches, with the simplest being to serve only some percentage of the traffic to the new application, to a more
complicated solutions, such as a feature toggle. A feature toggle allows you to gate access to certain features based on specific
criteria (e.g., gender, age, country of origin). The most advanced feature toggle I am aware of, gatekeeper, is implemented at
Facebook.
CI- Workflow
[CI workflow diagram: the developer commits source and tests to the code repository; at a regular interval the continuous build system checks out the code, builds and tests it, publishes the executable/package to an artifact repository, and feeds testing results, test reports, and deployments back to the developer.]
Improving Your Productivity
• Code Repositories :
• Artifact Repositories:
• Jenkins (http://jenkins-ci.org/) is
Easy to install
Easy to use
Multi-technology
Multi-platform
Widely used
Extensible
Free
Jenkins for a Developer
• Easy to install
• Easy to use
• Multi-technology
Jenkins User Interface
Actions
Nodes
Jobs
• Languages
Java
C
Python
More Power – Jenkins Plugins
Builders
Test Frameworks
Notifiers
Static Analyzers
Jenkins Plugins - SCM
Accurev
Bazaar
BitKeeper
ClearCase
Darcs
Dimensions
Git
Harvest
MKS Integrity
PVCS
StarTeam
Subversion
Visual SourceSafe
Jenkins Plugins – Build & Test
Build tools: Ant, Maven, MSBuild, CMake, Gradle, Grails, SCons, Groovy
Test frameworks: JUnit, NUnit, MSTest, Selenium, Fitnesse
Jenkins Plugins – Analyzers
Static analyzers: Checkstyle, CodeScanner, DRY, Crap4j, FindBugs, PMD, Fortify, Sonar, FxCop
Code coverage: Emma, Cobertura, Clover, GCC/GCOV
Jenkins Plugins – Other Tools
• Notification: Campfire, IM, IRC, Sounds, Speak
• Authorization: LDAP
• Cloud/virtualization: Amazon EC2, VMWare, Xen, Libvirt
Jenkins – Integration for You
Faster
Safer
Easier
Smarter
Declarative Pipelines
Stages
Tools
Post-build actions
Notifications
Environment
All wrapped up in a pipeline { … } step, with syntactic and semantic validation available.
Declarative Pipelines
Step syntax is valid within the pipeline block and outside it.
Notifications and postBuild actions are run at the end of your build even if the build has failed.
pipeline {
    agent any
    stages {
        stage('build') {
            steps {
                sh 'go version'
            }
        }
    }
}
Declarative Pipelines
Current sections:
Stages
Agent
Environment
Tools
Post Build
Notifications
Declarative Pipelines
Stages:
Stage blocks look the same as the new block-scoped stage step.
Think of each stage block as an individual Build Step in a Freestyle job.
Example:
stages {
    stage("build") {
        sh './run-some-script.sh'
    }
    stage("deploy") {
        sh "./deploy-something.sh"
    }
}
Declarative Pipelines
Agent:
Agent docker:’ubuntu’ - Run on any node within a Docker container of the “ubuntu” image
Agent docker:’ubuntu’, label:’foo’ - Run on a node with the label “foo” within a Docker container of the “ubuntu” image
Agent none - Don’t run on a node at all - manage node blocks yourself within your stages.
Tools:
The tools section allows you to define tools to auto-install and add to the PATH.
The tools section takes a block of tool name/tool version pairs, where the tool versions are those configured on your Jenkins master.
Example:
tools {
}
Declarative Pipelines
Implementations provide:
A condition name
A method to check whether the condition has been satisfied with the current build status.
success {
}
failure {
    mail to: "me@example.com",
         subject: "Build failed",
         body: "Fix me please!"
}
----------------------------------------------
postBuild {
    always {
        archive "target/**/*"
        junit 'path/to/*.xml'
    }
    failure {
        sh './cleanup-failure.sh'
    }
}
Declarative Pipelines
A real-world example with tools, postBuild and notifications
pipeline {
    tools {
    }
    stages {
        stage("build") {
        }
    }
    postBuild {
        always {
            archive "target/**/*"
            junit 'target/surefire-reports/*.xml'
        }
    }
    notification {
    }
}
Declarative Pipelines
Master/slave architecture:
Declarative Pipelines
Master/slave architecture:
• Jenkins Master:
Jenkins Slave:
The job of a Slave is to do as they are told to, which involves executing build jobs dispatched by the Master.
We can configure a project to always run on a particular Slave machine or a particular type of Slave machine, or simply let Jenkins pick the next available Slave.
pipeline {
    agent none
    stages {
        stage("distribute") {
            parallel (
                "windows": {
                    node('windows') {
                    }
                },
                "mac": {
                    node('osx') {
                    }
                },
                "linux": {
                    node('linux') {
                    }
                }
            )
        }
    }
}
Module 9
Ansible and configuration
management tools
Plan
• Configuration management tools
• Ansible
• Inventory
• Playbook
• Variables
• Template module (Jinja2)
• Roles
• ansible-vault
• Puppet
• Chef
Configuration management tools
DevOps is evolving and gaining traction as organizations discover how it enables them to produce better applications and services faster.
DevOps' core values are Culture, Automation, Measurement, and Sharing (CAMS), and an organization's adherence to them is a good indicator of how far its DevOps adoption has progressed.
Another DevOps concept is the idea that almost everything can be managed in code: servers, databases, networks, log files, application configuration, and more.
• Ansible Components
Inventory files
Ansible modules
Playbooks
Ansible Configuration File
• Default configuration
/etc/ansible/ansible.cfg
ANSIBLE_CONFIG ENV
Groups enclosed in []
• Example: Instructing Ansible to use the active Python interpreter when working inside a Python virtual environment.
-a argument: passes module arguments on the ad-hoc command line.
YAML Overview
Multiple YAML documents are separated by ---
YAML uses spacing to nest data structures
Ansible Terms
Ansible Playbooks
• Written in YAML
• One or more plays that contain hosts
and tasks
• Tasks have a name & module keys.
• Modules have parameters
• Variables referenced with {{name}}
Ansible gathers “facts”
Create your own by register-ing the output from another task
• http://docs.ansible.com/ansible/latest/YAMLSyntax.html
Ansible Playbooks
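As a hedged, minimal illustration of these ideas (the host group, package, and variable names are assumptions, not from the original notes):
---
- name: Install and start a web server
  hosts: webservers
  become: yes
  vars:
    http_port: 80
  tasks:
    - name: Install nginx
      yum:
        name: nginx
        state: present
    - name: Start and enable nginx
      service:
        name: nginx
        state: started
        enabled: yes
    - name: Show the configured port
      debug:
        msg: "Serving on port {{ http_port }}"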
Using Variable Files and Loops with Ansible
• vars_files: filename.yaml
• Attributes
PLAY RECAP *****************************************************************
localhost : ok=1 changed=1 unreachable=0 failed=0
• Ansible has an extensive module library capable of operating compute, storage and networking devices
http://docs.ansible.com/ansible/modules_by_category.html
Loops
Conditionals
Many more!
http://docs.ansible.com/ansible/playbooks.html
https://galaxy.ansible.com/
• Ansible use cases
• Setting up Ansible infrastructure
• Using the Ansible ad-hoc CLI
• Creating and running Ansible playbooks
Puppet
• Puppet is used to manage the configuration of target machines. It is an open-source software configuration management tool developed using Ruby.
• In Puppet, the first thing the Puppet master does is collect the details of the target machine. Using Facter, which is present on all Puppet nodes (similar to Ohai in Chef), it gets all the machine-level configuration details. These details are collected and sent back to the Puppet master.
• The Puppet master then compares the retrieved configuration with the defined configuration details, creates a catalog from the defined configuration, and sends it to the targeted Puppet agents.
• The Puppet agent then applies those configurations to get the system into a desired state.
• Finally, once one has the target node in a desired state, it sends a report back to the Puppet master,
which helps the Puppet master in understanding where the current state of the system is, as defined in
the catalog
Chef
• Chef is a configuration management tool for dealing with machine setup on physical servers, virtual
machines and in the cloud. Many companies use Chef software to control and manage their
• As shown in the diagram below, there are three major Chef components:
Workstation
Server
Nodes
Chef
Chef
Workstation
• The Workstation is the location from which all of Chef configurations are managed. This machine holds
all the configuration data that can later be pushed to the central Chef Server. These configurations are
tested in the workstation before being pushed to the central Chef Server. A workstation consists of a command-line tool called Knife, which is used to interact with the Chef Server. There can be multiple Workstations that together manage the central Chef Server.
1. Writing Cookbooks and Recipes that will later be pushed to the central Chef Server
• Recipes: A Recipe describes everything that is required to configure part of a system. The user writes Recipes that describe how Chef manages applications and utilities (such as Apache HTTP Server, MySQL, or Hadoop).
• These Recipes describe a series of resources that should be in a particular state, i.e. Packages that
should be installed, services that should be running, or files that should be written.
• Below is an example of how to write a Recipe to install the apache2 package on Chef Nodes:
# Apache-recipe/apache.rb
package "apache2" do
  action :install
end

service "apache2" do
  action [:enable, :start]
end
Chef workstation : Cookbooks
Cookbooks: Multiple Recipes can be grouped together to form a Cookbook. A Cookbook defines a scenario and contains everything that is required to support that scenario:
Recipes, which specify the resources to use and the order in which they are to be applied
Attribute values
File distributions
Templates
The Workstation system will have the required command line utilities, to control and manage every aspect
of the central Chef Server. Things like adding a new Node to the central Chef Server, deleting a Node
from the central Chef Server, modifying Node configurations etc can all be managed from the Workstation
itself.
Chef workstation – Components
• Knife utility: This command line tool can be used to communicate with the central Chef Server from
Workstation. Adding, removing, changing configurations of Nodes in a central Chef Server will be
carried out by using this Knife utility. Using the Knife utility, Cookbooks can be uploaded to a central
Chef Server, and Roles and environments can also be managed. Basically, every aspect of the central Chef Server can be managed from the Workstation through the Knife utility.
• A local Chef repository: This is the place where every configuration component of central Chef Server
is stored. This Chef repository can be synchronized with the central Chef Server (again using the knife
utility itself).
Chef Server
• The Chef Server acts as a hub for configuration data. The Chef Server stores Cookbooks, the policies
that are applied to Nodes, and metadata that describes each registered Node that is being managed by
the Chef-Client.
• Nodes use the Chef-Client to ask the Chef Server for configuration details, such as Recipes, Templates,
and file distributions. The Chef-Client then does as much of the configuration work as possible on the
Nodes themselves (and not on the Chef Server). Each Node has the Chef-Client software installed, which will pull down the configuration from the central Chef Server that is applicable to that Node.
• Nodes can be a cloud based virtual server or a physical server in your own data center, that is
managed using central Chef Server. The main component that needs to be present on the Node is an
agent that will establish communication with the central Chef Server. This is called Chef Client.
It manages the initial registration of the Node to the central Chef Server.
It pulls down Cookbooks, and applies them on the Node, to configure it.
Periodic polling of the central Chef Server to fetch new configuration items, if any.
Chef : Chef-solo , Chef-knife
• Chef-Solo is an open source tool that runs locally and allows you to provision guest machines using Chef cookbooks without the complication of any Chef client and server configuration. It helps to execute cookbooks on the local machine itself.
• Knife is Chef’s command-line tool to interact with the Chef server. One uses it for uploading cookbooks
and managing other aspects of Chef. It provides an interface between the ChefDK (chef-repo) on the local workstation and the central Chef Server.
• It manages Chef nodes, Cookbooks, Recipes, Environments, Cloud Resources, Cloud Provisioning and installation.
https://www.tutorialspoint.com/chef/chef_knife_setup.html
Prometheus
• Prometheus is an open-source systems monitoring and alerting toolkit.
• Mainly written in Go. Publicly launched in early 2015; version 1.0 was released in July 2016.
• It is a Cloud Native Computing Foundation (CNCF) project.
Prometheus Community
• Over 200 articles, talks and blog posts have been written about it.
• We provide precompiled binaries for most official Prometheus components. Check out the download section for a list of all available versions.
• For building Prometheus components from source, see the Makefile targets in the respective repository.
Using Docker
• All Prometheus services are available as Docker images on Quay.io or Docker Hub.
• Running Prometheus on Docker is as simple as docker run -p 9090:9090 prom/prometheus. This starts Prometheus with a sample configuration and exposes it on port 9090.
• The main Prometheus server scrapes and stores time series data.
• A node_exporter exposes hardware and OS metrics of the host it runs on.
Most Prometheus components are written in Go, making them easy to build and deploy as static binaries.
Prometheus configuration file
Prometheus configuration file
Our default configuration file has four YAML blocks defined: global, alerting, rule_files, and scrape_configs.
Global: The first block, global, contains global settings for controlling the Prometheus server's behavior.
• The first setting, the scrape_interval parameter, specifies the interval between scrapes of any application or service (in our case, 15 seconds). This value will be the resolution of your time series: the period in time that each data point in the series covers.
• The evaluation_interval tells Prometheus how often to evaluate its rules. Rules come in two major flavors:
Recording rules - Allow you to precompute frequent and expensive expressions and to save their result as a new set of time series.
Alerting rules - Allow you to define alert conditions. With this parameter, Prometheus will (re-)evaluate these rules every 15 seconds. We'll see more about rules in subsequent chapters.
Prometheus configuration file
In our default configuration, the alerting block contains the alerting configuration for our server. The alertmanagers block lists each Alertmanager used by this Prometheus server. The static_configs block indicates we're going to specify any Alertmanagers manually, which we have done in the targets array.
• Rule files: The third block, rule_files, specifies a list of files that can contain recording or alerting rules.
• Scrape configuration: The last block, scrape_configs, specifies all of the targets that Prometheus will scrape.
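Putting these four blocks together, here is a hedged sketch of such a default prometheus.yml (the Alertmanager and target addresses are placeholders for your own environment):
global:
  scrape_interval: 15s
  evaluation_interval: 15s

alerting:
  alertmanagers:
    - static_configs:
        - targets: ['localhost:9093']

rule_files:
  - "rules/*.yml"

scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']
  - job_name: 'node'
    static_configs:
      - targets: ['localhost:9100']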
Features and components
• a multi-dimensional data model with time series data identified by metric name and key/value pairs
• https://medium.com/@valyala/promql-tutorial-for-beginners-9ab455142085
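For illustration, two PromQL queries over metrics that the Prometheus server and node_exporter expose by default (assumed here, not taken from the original notes):
rate(prometheus_http_requests_total[5m])
    (per-second HTTP request rate to the Prometheus server over the last 5 minutes)
100 - avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100
    (approximate CPU usage per instance, as a percentage of non-idle time)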
The Prometheus client libraries offer four core metric types. These are currently only differentiated in the
client libraries (to enable APIs tailored to the usage of the specific types) and in the wire protocol.
• Counter: A counter is a cumulative metric that represents a single monotonically increasing counter whose value can only increase or be reset to zero on restart.
Note: “Do not use a counter to expose a value that can decrease.”
• Gauge: A gauge is a metric that represents a single numerical value that can arbitrarily go up and
down.
• Histogram: A histogram samples observations (usually things like request durations or response sizes)
and counts them in configurable buckets. It also provides a sum of all observed values.
• Summary: Similar to a histogram, a summary samples observations (usually things like request durations and response sizes). While it also provides a total count of observations and a sum of all observed values, it calculates configurable quantiles over a sliding time window.
There are a number of libraries and servers which help in exporting existing metrics from third-party
systems as Prometheus metrics. This is useful for cases where it is not feasible to instrument a given
system with Prometheus metrics directly (for example, HAProxy or Linux system stats).
Third-party exporters:
Some of these exporters are maintained as part of the official Prometheus GitHub organization; those are marked as official, others are externally contributed and maintained.
We encourage the creation of more exporters but cannot vet all of them for best practices. Commonly, those exporters are hosted outside of the Prometheus GitHub organization.
The exporter default port wiki page has become another catalog of exporters, and may include exporters not listed here due to overlapping functionality or still being in development.
The JMX exporter can export from a wide variety of JVM-based applications, for example Kafka and
Cassandra.
Features and components
Prometheus scrapes metrics from instrumented jobs, either directly or via an intermediary push gateway for short-lived jobs. It stores all scraped samples locally and runs rules over this data to either aggregate and record new time series from existing data or generate alerts. Grafana or other API consumers can be used to visualize the collected data.
• Fully customizable
• Supports templates
• Lots of users
• Faculty & staff & students more than 40000 users on campus
• Lots of systems
• Lots of logs
• Log management platform can monitor all above-given issues as well as process operating system logs,
NGINX, IIS server log for web traffic analysis, application logs, and logs on cloud.
• Log management helps DevOps engineers, system admin to make better business decisions.
• The performance of virtual machines in the cloud may vary based on the specific loads, environments, and number of active users.
E stands for Elasticsearch: used for storing logs.
L stands for Logstash: used for both shipping as well as processing and storing logs.
K stands for Kibana: a visualization tool (a web interface) which is hosted through Nginx or Apache.
• Designed to take data from any source, in any format, and to search, analyze, and visualize it in real time.
• Provides centralized logging that can be useful when attempting to identify problems with servers or applications.
• It offers advanced queries to perform detail analysis and stores all the data centrally.
• Also allows you to store, search and analyze big volume of data.
Features :
• Full-Text Search
Advantages
• Provides horizontal scalability, reliability, and multitenant capability for real-time indexing and search.
• Cluster : A collection of nodes which together holds data and provides joined indexing and search
capabilities.
• It is very useful while performing indexing, search, update, and delete operations.
• Document : The basic unit of information which can be indexed. It is expressed in JSON, e.g. '{"user": "nullcon"}'. Every single Document is associated with a type and a unique id.
Logstash
• It gathers all types of data from the different source and makes it available for further use.
• Logstash can unify data from disparate sources and normalize the data into your desired destinations.
Inputs
You use inputs to get data into Logstash. Some of the more commonly-used inputs are:
• file: reads from a file on the filesystem, much like the UNIX command tail -0F
• syslog: listens on the well-known port 514 for syslog messages and parses according to the RFC3164 format
• redis: reads from a redis server, using both redis channels and redis lists. Redis is often used as a "broker"
in a centralized Logstash installation, which queues Logstash events from remote Logstash "shippers".
• For more information about the available inputs, see Input Plugins.
Logstash Inputs : File , MySql log example
Logstash.conf
input {
  file {
    path => "/var/log/mysql/mysql.log"   # example path; adjust to your MySQL log location
    start_position => "beginning"
    type => "mysql"
  }
}
Logstash Inputs : syslog
Logstash.conf
input {
  syslog {
    port => 514            # the well-known syslog port
    type => "syslog"
  }
}
Logstash Inputs : Redis
Logstash.conf
input {
  redis {
    id => "my_plugin_id"
    host => "localhost"    # example Redis host
    data_type => "list"
    key => "logstash"
  }
}
Logstash : Filters
• Filters
Filters are intermediary processing devices in the Logstash pipeline. You can combine filters with conditionals
to perform an action on an event if it meets certain criteria. Some useful filters include:
Grok: parse and structure arbitrary text. Grok is currently the best way in Logstash to parse
unstructured log data into something structured and queryable. With 120 patterns built-in to Logstash,
it’s more than likely you’ll find one that meets your needs!
Mutate: perform general transformations on event fields. You can rename, remove, replace, and modify fields in your events.
Geoip: add information about geographical location of IP addresses (also displays amazing charts in
Kibana!)
For more information about the available filters, see Filter Plugins.
Logstash Filters: GROK
Logstash.conf
filter {
  grok {
    match => { "message" => "%{IP:client} %{WORD:method} %{URIPATHPARAM:request} %{NUMBER:bytes} %{NUMBER:duration}" }
  }
}
Why:
A filter within Logstash used to parse unstructured data into something structured (e.g. JSON).
A library of terms that wrap regular expressions which match text patterns and lines in log files (e.g. INT (?:[+-]?(?:[0-9]+)) ).
Grok Syntax:
“%{GROK1:semantic1}%{GROK2:semantic2}%{GROK3:semantic3}”
Logstash Filters: GROK
https://grokdebug.herokuapp.com/
https://logz.io/blog/logstash-grok/
Grok patterns:
https://github.com/elastic/logstash/blob/v1.4.2/patterns/grok-patterns
Logstash Filters: Mutate
Logstash.conf
filter {
  mutate {
    rename => { "hostname" => "host" }     # example: rename a field
  }
  mutate {
    convert => { "bytes" => "integer" }    # example: change a field's type
  }
}
Logstash : Outputs
Outputs are the final phase of the Logstash pipeline. An event can pass through multiple outputs, but once all
output processing is complete, the event has finished its execution. Some commonly used outputs include:
Elasticsearch: send event data to Elasticsearch. If you’re planning to save your data in an efficient, convenient,
and easily queryable format… Elasticsearch is the way to go. Period. Yes, we’re biased :)
Graphite: send event data to graphite, a popular open source tool for storing and graphing metrics.
http://graphite.readthedocs.io/en/latest/
Statsd: send event data to statsd, a service that "listens for statistics, like counters and timers, sent over UDP and sends aggregates to one or more pluggable backend services". If you're already using statsd, this could be useful for you.
For more information about the available outputs, see Output Plugins.
Logstash : Outputs file example
Logstash.conf
input {
file {
path => "C:/Program Files/Apache Software Foundation/Tomcat 7.0/logs/*access*"
type => "apache"
}
}
filter {
grok {
match => { "message" => "%{COMBINEDAPACHELOG}" }
}
date {
match => [ "timestamp" , "dd/MMM/yyyy:HH:mm:ss Z" ]
}
}
output {
file {
path => "C:/tpwork/logstash/bin/log/output.log"
}
}
Features
Advantages
• Offers plugins to connect with various types of input sources and platforms
Kibana: what’s Kibana ?
• The dashboard offers various interactive diagrams, geospatial data, and graphs to visualize complex queries.
• It can be used for search, view, and interact with data stored in Elasticsearch directories.
• It helps users to perform advanced data analysis and visualize their data in a variety of tables, charts, and
maps.
Features :
• Users can search, View, and interact with data stored in Elasticsearch.
• Execute queries on data & visualize results in charts, tables, and maps.
Advantages :
• Easy visualizing
Filebeat : is a log shipper belonging to the Beats family , a group of lightweight shippers installed on hosts for
shipping different kinds of data into the ELK Stack for analysis.
Filebeat.yml example
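A hedged sketch of a minimal filebeat.yml (the log path and Logstash host are placeholders):
filebeat.inputs:
  - type: log
    enabled: true
    paths:
      - /var/log/*.log          # files to tail on this host

output.logstash:
  hosts: ["localhost:5044"]     # forward events to Logstash for further processing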
• In an ELK-based logging pipeline, Filebeat plays the role of the logging agent: installed on the machine generating the log files, tailing them, and forwarding the data to either Logstash for more advanced processing or directly into Elasticsearch for indexing.
• Centralized logging can be useful when attempting to identify problems with servers or applications
• ELK stack is a collection of three open source tools Elasticsearch, Logstash Kibana
• In ELK stack processing speed is strictly limited whereas Splunk offers accurate and speedy processes
• Netflix, LinkedIn, Tripwire, and Medium are all using the ELK stack for their business.
• ELK works best when logs from various Apps of an enterprise converge into a single ELK instance
• The different components in the stack can become difficult to handle when you move on to a complex setup.