
VERSION CONTROL SYSTEM:

A version control system allows you to track the history of a collection of files and includes the functionality to revert the collection of files to another version. Each version captures a snapshot of the file system at a certain point in time. The collection of files is usually source code for a programming language but a typical version control system can put any type of file under version control. The collection of files and their complete history are stored in a repository.

DISTRIBUTED VERSION CONTROL SYSTEM:


A distributed version control system does not necessarily have a central server that stores the data. The user can copy an existing repository; in a distributed version control system this copying process is typically called cloning. Typically there is a central server that keeps a repository, but each cloned repository is a full copy of it. Which of the copies is considered the central server repository is purely a matter of convention and is not tied to the capabilities of the distributed version control system itself. Every local copy contains the full history of the collection of files, and a cloned repository has the same functionality as the original repository. Every repository can exchange versions of the files with other repositories by transporting these changes. This is typically done via the selected central server repository.
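As a minimal sketch of cloning, the following script builds a small repository in a temporary directory and clones it, then compares the number of commits in both copies. The directory names, file name and placeholder identity are assumptions made for illustration only.

```shell
set -eu
# Placeholder identity so that commits work without global Git configuration (assumption).
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"
tmp=$(mktemp -d)

# Create a small repository with two commits.
git init -q "$tmp/original"
cd "$tmp/original"
echo "first" > notes.txt
git add notes.txt
git commit -q -m "first version"
echo "second" >> notes.txt
git commit -q -a -m "second version"

# Clone it; the clone receives the complete history, not just the latest files.
git clone -q "$tmp/original" "$tmp/copy"

orig_count=$(git -C "$tmp/original" rev-list --count HEAD)
copy_count=$(git -C "$tmp/copy" rev-list --count HEAD)
orig_head=$(git -C "$tmp/original" rev-parse HEAD)
copy_head=$(git -C "$tmp/copy" rev-parse HEAD)
echo "original: $orig_count commits, clone: $copy_count commits"
```

Both repositories report the same number of commits and the same latest commit id, which is what "full copy" means here.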

GIT:
Git is a distributed revision control and source code management (SCM) system with an emphasis on speed. It was initially designed and developed by Linus Torvalds for Linux kernel development and has since been adopted by many popular Open Source projects, e.g. the Android and Eclipse projects, as well as by many commercial organizations. The core of Git was originally written in the programming language C, but Git has also been re-implemented in other languages, e.g. Java and Python. Every Git working directory is a full-fledged repository with complete history and full revision tracking capabilities, not dependent on network access or a central server. Git is free software distributed under the terms of the GNU General Public License version 2.

Characteristics:
Git's design is a synthesis of Torvalds's experience with Linux in maintaining a large distributed development project, along with his intimate knowledge of file system performance gained from the same project and the urgent need to produce a working system in short order. These influences led to the following implementation choices:

Strong support for non-linear development
Git supports rapid branching and merging, and includes specific tools for visualizing and navigating a non-linear development history. A core assumption in Git is that a change will be merged more often than it is written, as it is passed around various reviewers. Branches in Git are very lightweight: a branch is only a reference to a single commit. With its parental commits, the full branch structure can be constructed.

Distributed development
Like Darcs, BitKeeper, Mercurial, SVK, Bazaar and Monotone, Git gives each developer a local copy of the entire development history, and changes are copied from one such repository to another. These changes are imported as additional development branches, and can be merged in the same way as a locally developed branch.

Compatibility with existing systems/protocols
Repositories can be published via HTTP, FTP, rsync, or a Git protocol over either a plain socket, ssh or HTTP. Git also has a CVS server emulation, which enables the use of existing CVS clients and IDE plugins to access Git repositories. Subversion and svk repositories can be used directly with git-svn.

Efficient handling of large projects
Torvalds has described Git as being very fast and scalable, and performance tests done by Mozilla showed it was an order of magnitude faster than some revision control systems. Fetching revision history from a locally stored repository can be one hundred times faster than fetching it from the remote server. In particular, Git does not get slower as the project history grows larger.

Cryptographic authentication of history
The Git history is stored in such a way that the id of a particular revision (a commit in Git terms) depends upon the complete development history leading up to that commit. Once it is published, it is not possible to change the old versions without it being noticed. The structure is similar to a hash tree, but with additional data at the nodes as well as the leaves.
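The way a commit id depends on the history before it can be observed in the commit object itself, which records the id of its parent commit. The following sketch prints a raw commit object and checks that its parent line matches the id of the previous commit. The repository location, file name and identity are illustrative assumptions.

```shell
set -eu
# Placeholder identity so that commits work without global Git configuration (assumption).
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"
tmp=$(mktemp -d)

git init -q "$tmp/repo"
cd "$tmp/repo"
echo "one" > file.txt
git add file.txt
git commit -q -m "first commit"
echo "two" >> file.txt
git commit -q -a -m "second commit"

# Print the raw commit object: it names its tree and its parent commit.
git cat-file -p HEAD

# The parent id stored inside the newest commit is the id of the first commit,
# so changing the first commit would change the id of every later commit.
parent_in_object=$(git cat-file -p HEAD | sed -n 's/^parent //p')
first_commit=$(git rev-parse HEAD~1)
```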
Toolkit-based design
Git was designed as a set of programs written in C, and a number of shell scripts that provide wrappers around those programs. Although most of those scripts have since been rewritten in C for speed and portability, the design remains, and it is easy to chain the components together.

Pluggable merge strategies
As part of its toolkit design, Git has a well-defined model of an incomplete merge, and it has multiple algorithms for completing it, culminating in telling the user that it is unable to complete the merge automatically and manual editing is required.

Garbage accumulates unless collected
Aborting operations or backing out changes will leave useless dangling objects in the database. These are generally a small fraction of the continuously growing history of wanted objects. Git will automatically perform garbage collection when enough loose objects have been created in the repository. Garbage collection can be called explicitly using git gc --prune.

Periodic explicit object packing
Git stores each newly created object as a separate file. Although individually compressed, this takes a great deal of space and is inefficient. This is solved by the use of packs that store a large number of objects in a single file (or network byte stream) called a packfile, delta-compressed among themselves.

Git implements several merging strategies; a non-default one can be selected at merge time:

resolve: the traditional three-way merge algorithm.

recursive: This is the default when pulling or merging one branch, and is a variant of the three-way merge algorithm. When there is more than one common ancestor that can be used for the three-way merge, it creates a merged tree of the common ancestors and uses that as the reference tree for the three-way merge. This has been reported to result in fewer merge conflicts without causing mis-merges by tests done on actual merge commits taken from Linux 2.6 kernel development history. Additionally this can detect and handle merges involving renames.

octopus: This is the default when merging more than two heads.
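The effect of packing described above can be sketched with git count-objects, which reports how many objects are stored loose and how many are in packs. The script below runs an explicit git gc on a tiny throwaway repository; the paths and identity are assumptions for illustration.

```shell
set -eu
# Placeholder identity so that commits work without global Git configuration (assumption).
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"
tmp=$(mktemp -d)

git init -q "$tmp/repo"
cd "$tmp/repo"
echo "some content" > file.txt
git add file.txt
git commit -q -m "initial commit"

# Newly created objects are stored as individual (loose) files.
echo "Before packing:"
git count-objects -v
loose_before=$(git count-objects -v | sed -n 's/^count: //p')

# Explicit garbage collection moves the reachable objects into a packfile.
git gc -q

echo "After packing:"
git count-objects -v
loose_after=$(git count-objects -v | sed -n 's/^count: //p')
packed_after=$(git count-objects -v | sed -n 's/^in-pack: //p')
```

In the output, the "count" line is the number of loose objects and the "in-pack" line is the number of objects stored in packfiles.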

Data structures:
Git's primitives are not inherently a source code management (SCM) system. Torvalds explains,

"In many ways you can just see git as a filesystem - it's content-addressable, and it has a notion of versioning, but I really really designed it coming at the problem from the viewpoint of a filesystem person (hey, kernels is what I do), and I actually have absolutely zero interest in creating a traditional SCM system."
From this initial design approach, Git has developed the full set of features expected of a traditional SCM, with features mostly being created as needed, then refined and extended over time. Git has two data structures: a mutable index (also called stage or cache) that caches information about the working directory and the next revision to be committed; and an immutable, append-only object database.

The object database contains four types of objects:

A blob object is the content of a file. Blob objects have no file name, time stamps, or other metadata.

A tree object is the equivalent of a directory. It contains a list of file names, each with some type bits and the name of a blob or tree object that is that file, symbolic link, or directory's contents. This object describes a snapshot of the source tree.

A commit object links tree objects together into a history. It contains the name of a tree object (of the top-level source directory), a time stamp, a log message, and the names of zero or more parent commit objects.

A tag object is a container that contains a reference to another object and can hold additional meta-data related to that object. Most commonly, it is used to store a digital signature of a commit object corresponding to a particular release of the data being tracked by Git.

Git stores each revision of a file as a unique blob object. The relationships between the blobs can be found by examining the tree and commit objects. Newly added objects are stored in their entirety using zlib compression. This can consume a large amount of disk space quickly, so objects can be combined into packs, which use delta compression to save space, storing blobs as their changes relative to other blobs. Git servers typically listen on TCP port 9418.
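The object types listed above can be inspected with git cat-file. The sketch below commits two files with identical content and then asks Git for the type of the commit, its tree and one file; it also shows that both files resolve to the same blob, since a blob carries no file name. File names, paths and the identity are illustrative assumptions.

```shell
set -eu
# Placeholder identity so that commits work without global Git configuration (assumption).
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"
tmp=$(mktemp -d)

git init -q "$tmp/repo"
cd "$tmp/repo"
echo "same content" > a.txt
cp a.txt b.txt
git add a.txt b.txt
git commit -q -m "add two identical files"

# Ask Git for the type of each kind of object.
commit_type=$(git cat-file -t HEAD)
tree_type=$(git cat-file -t 'HEAD^{tree}')
blob_type=$(git cat-file -t HEAD:a.txt)
echo "HEAD is a $commit_type, its snapshot is a $tree_type, a.txt is a $blob_type"

# The tree maps file names to object ids.
git cat-file -p 'HEAD^{tree}'

# Identical content is stored once: both names point to the same blob.
blob_a=$(git rev-parse HEAD:a.txt)
blob_b=$(git rev-parse HEAD:b.txt)
```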

LOCAL REPOSITORY AND OPERATIONS:


After cloning or creating a repository the user has a complete copy of the repository. The user performs version control operations against this local repository, e.g. create new versions, revert changes, etc. There are two types of Git repositories:

bare repositories, used on servers to share changes coming from different developers

working repositories, which allow you to create new changes through modification of files and to create new versions in the repository

If you want to delete a Git repository, you can simply delete the folder which contains the repository.

REMOTE REPOSITORIES:
Git allows the user to synchronize the local repository with other (remote) repositories. Users with sufficient authorization can push changes from their local repository to remote repositories. They can also fetch or pull changes from other repositories to their local Git repository.

BRANCHING AND MERGING:


Git supports branching, which means that you can work on different versions of your collection of files in parallel. For example, if you want to develop a new feature, you can create a branch and make the changes in this branch without affecting the state of your files in another branch. Branches in Git are local. A branch created in a local repository, which was cloned from another repository, does not need to have a counterpart in the remote repository. Local branches can be compared with remote tracking branches, which proxy the state of branches in another remote repository.

Git supports combining changes from different branches. This allows the developer, for example, to work independently on a branch called production for bug fixes and another branch called feature_123 for implementing a new feature. The developer can use Git commands to combine the changes at a later point in time. For example, the Linux kernel community used to share code corrections (patches) via mailing lists to combine changes coming from different developers. Git is a system which allows developers to automate such a process.
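A minimal sketch of this scenario, assuming a throwaway repository and illustrative file names: a bug fix is committed on production, a new feature on feature_123, and the two are then combined with git merge.

```shell
set -eu
# Placeholder identity so that commits work without global Git configuration (assumption).
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"
tmp=$(mktemp -d)

git init -q "$tmp/repo"
cd "$tmp/repo"
echo "base" > README
git add README
git commit -q -m "initial commit"

# Both branches start from the same commit.
git branch production
git branch feature_123

# Bug fix on production.
git checkout -q production
echo "fix" > bugfix.txt
git add bugfix.txt
git commit -q -m "Fix a bug"

# Independent feature work on feature_123.
git checkout -q feature_123
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "Implement feature 123"

# Combine the feature into production; --no-edit accepts the default merge message.
git checkout -q production
git merge -q --no-edit feature_123

# The merge commit has two parents, one from each branch.
parents=$(git cat-file -p HEAD | grep -c '^parent ')
ls
```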

WORKING TREE:
The user works on a collection of files which may originate from a certain point in time of the repository. The user may also create new files or change and delete existing ones. The current collection of files is called the working tree. A standard Git repository contains the working tree (single checkout of one version of the project) and the full history of the repository. You can work in this working tree by modifying content and committing the changes to the Git repository.

EXAMPLES:
Create directory: The following commands create an empty directory which you will use as Git repository.

# switch to home
cd ~/
# create a directory and switch into it
mkdir ~/repo01
cd repo01
# create a new directory
mkdir datafiles

Create Git repository
Every Git repository is stored in the .git folder of the directory in which the Git repository has been created. This directory contains the complete history of the repository. The .git/config file contains the local configuration for the repository. The following command creates a Git repository in the current directory.

# Initialize the Git repository
# for the current directory
git init

Create content
The following commands create some files with some content that will be placed under version control.

# switch to your new repository
cd ~/repo01
# create a few files
touch test01
touch test02
touch test03
touch datafiles/data.txt
# put a little text into the first file
ls > test01

Remote repositories:
Remote repositories are repositories that are hosted on the Internet or a network. Such remote repositories can be used to synchronize the changes of several Git repositories. A local Git repository can be connected to multiple remote repositories, and you can synchronize your local repository with them via Git operations. It is possible for users to connect their individual repositories directly, but a typical Git workflow involves one or more remote repositories which are used to synchronize the individual repositories.

Bare repositories:
A remote repository on a server typically does not require a working tree. A Git repository without a working tree is called a bare repository. You can create such a repository with the --bare option. The command to create a new empty bare remote repository is displayed below.

# create a bare repository
git init --bare


By convention the directory name of a bare repository should end with the .git extension.
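As a sketch of how a bare repository is used for sharing, the script below creates a bare repository, clones it into a working repository, commits a file there and pushes the commit back. The directory names shared.git and work, the file name and the identity are assumptions for illustration.

```shell
set -eu
# Placeholder identity so that commits work without global Git configuration (assumption).
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"
tmp=$(mktemp -d)

# The bare repository plays the role of the server-side repository.
git init -q --bare "$tmp/shared.git"
is_bare=$(git --git-dir="$tmp/shared.git" rev-parse --is-bare-repository)

# Cloning an empty repository prints a warning, which is silenced here.
git clone -q "$tmp/shared.git" "$tmp/work" 2>/dev/null
cd "$tmp/work"
echo "hello" > test01
git add test01
git commit -q -m "first commit"

# Push the current branch to a branch of the same name in the bare repository.
git push -q origin HEAD

# The bare repository now holds the history but no checked-out files.
pushed=$(git --git-dir="$tmp/shared.git" rev-list --count --all)
echo "bare: $is_bare, commits in shared.git: $pushed"
```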

Tags in Git:
Git has the option to tag a commit in the repository history so that you can find it more easily at a later point in time. Most commonly, this is used to tag a certain version which has been released.

Lightweight and annotated tags:

Git supports two different types of tags, lightweight and annotated tags. A lightweight tag is a pointer to a commit, without any additional information about the tag. An annotated tag contains additional information about the tag, e.g. the name and email of the person who created the tag, a tagging message and the date of the tagging. Annotated tags can also be signed and verified with GNU Privacy Guard (GPG).
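The difference between the two kinds of tag can be sketched by asking Git what each tag name points to. In the script below the tag names light-1.0 and annotated-1.0, the file name and the identity are illustrative assumptions.

```shell
set -eu
# Placeholder identity so that commits and tags work without global Git configuration (assumption).
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"
tmp=$(mktemp -d)

git init -q "$tmp/repo"
cd "$tmp/repo"
echo "release content" > file.txt
git add file.txt
git commit -q -m "release commit"

# A lightweight tag is just a name for the commit.
git tag light-1.0
# An annotated tag is a separate tag object with tagger, date and message.
git tag -a annotated-1.0 -m "Release 1.0"

light_type=$(git cat-file -t light-1.0)          # commit
annotated_type=$(git cat-file -t annotated-1.0)  # tag
echo "light-1.0 points to a $light_type, annotated-1.0 points to a $annotated_type"

# Show the extra information stored in the annotated tag.
git cat-file -p annotated-1.0
```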

List existing tags


You can list the available tags via the following command:

git tag

Creating annotated tags


You can create a new annotated tag via the git tag -a command. An annotated tag is also created if you use the -m parameter, with which you specify the description of the tag. The following command tags the commit that HEAD currently points to.

# create tag
git tag 1.6.1 -m 'Release 1.6.1'
# See the tag
git show 1.6.1
You can also create tags for a certain commit id.

git tag 1.5.1 -m 'version 1.5.1' [commit id]

What are branches?


Git allows you to create branches, i.e. named pointers to commits. You can work on different branches independently from each other. The default branch is called master. Git allows you to create branches very fast and cheaply in terms of resource consumption. Developers are encouraged to use local branches frequently. If you decide to work on a branch, you check out this branch. This means that Git moves the HEAD pointer to the new branch, which points to a commit, and populates the working tree with the content of this commit. Untracked files remain unchanged and are available in the new branch. This allows you to create a branch for unstaged and uncommitted changes at any point in time.

Dirty files also remain unchanged and stay dirty. If Git would need to modify a dirty file on checkout, the checkout fails with a "checkout conflict" error. This prevents you from losing any changes. In this case the changes must be committed, reverted or stashed.
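The protection described above can be sketched as follows: a file differs between two branches and also has an uncommitted edit, so the first checkout is refused; after stashing the edit the checkout succeeds. The exact wording of Git's error message varies between versions, so the script only records whether the checkout succeeded. Paths, file content and identity are assumptions for illustration.

```shell
set -eu
# Placeholder identity so that commits and stashes work without global Git configuration (assumption).
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"
tmp=$(mktemp -d)

git init -q "$tmp/repo"
cd "$tmp/repo"
echo "version 1" > test01
git add test01
git commit -q -m "initial commit"
start_branch=$(git symbolic-ref --short HEAD)

# Commit a different version of the file on a second branch.
git checkout -q -b testing
echo "version 2" > test01
git commit -q -a -m "change test01 on testing"
git checkout -q "$start_branch"

# Make the file dirty, then try to switch to the branch that would overwrite it.
echo "uncommitted edit" > test01
if git checkout -q testing 2>/dev/null; then
    first_try="switched"
else
    first_try="refused"
fi
echo "checkout with dirty file: $first_try"

# Stash the local edit; now the checkout can proceed.
git stash -q
git checkout -q testing
content_after=$(cat test01)
```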

BRANCHES:

List available branches


The git branch command lists all locally available branches. The currently active branch is marked with *.

# lists available branches
git branch


If you want to see all branches (including remote tracking branches), use the -a option of the git branch command.

# lists all branches including the remote branches
git branch -a

Create new branch
You can create a new branch via the git branch [newname] command. This command optionally allows you to specify the starting point (commit id, tag, remote or local branch). If no starting point is specified, the currently checked out commit is used to create the branch.

# Syntax: git branch <name> <hash>
# <hash> in the above is optional
git branch testing
# Switch to your new branch
git checkout testing
# Some changes
echo "Cool new feature in this branch" > test01
git commit -a -m "new feature"
# Switch to the master branch
git checkout master
# Check that the content of test01 is the old one
cat test01

To create a branch and switch to it at the same time, you can use the git checkout command with the -b parameter.

# Create branch and switch to it
git checkout -b bugreport12
# Creates a new branch based on the master branch
# without the last commit
git checkout -b mybranch master~1

TYPICAL GIT WORKFLOW USING SEPARATE REPOSITORIES
The following description highlights typical Git workflows.

Providing a patch
Git emphasizes the creation of branches for feature development or to create bug fixes. The following description lists a typical Git workflow for fixing a bug in your source code (files) and providing a patch for it. This patch contains the changes and can be used by another person to apply the changes to their local Git repository. This description assumes that the person who creates the changes cannot push changes directly to the remote repository. For example, you may solve an issue in the source code of an Open Source project and want the maintainer of the project to integrate the fix.

1. Clone the repository, in case you have not done that.
2. Create a new branch for the bug fix.
3. Modify the files (source code).
4. Commit changes to your branch.
5. Create a patch.
6. Send the patch to another person or attach it to a bug report, so that it can be applied to the other Git repository.

You may also want to commit several times during steps 3 and 4 and rebase your commits afterwards. Even if you have commit rights, creating a local branch for every feature or bug fix is a good practice. Once your development is finished, you merge your changes into your master branch and push the changes from master to your remote Git repository.
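The six steps above can be sketched end to end with git format-patch and git am. This is only one possible way to create and apply a patch; the repository names upstream and contributor, the branch name fix-typo, the file content and the identity are assumptions made for the example, and both repositories live in a temporary directory instead of on different machines.

```shell
set -eu
# Placeholder identity so that commits work without global Git configuration (assumption).
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"
tmp=$(mktemp -d)

# The maintainer's repository, containing a typo.
git init -q "$tmp/upstream"
cd "$tmp/upstream"
echo "Helo world" > greeting.txt
git add greeting.txt
git commit -q -m "initial commit"

# 1. Clone the repository.
git clone -q "$tmp/upstream" "$tmp/contributor"
cd "$tmp/contributor"
# 2. Create a new branch for the bug fix.
git checkout -q -b fix-typo
# 3. Modify the files.
echo "Hello world" > greeting.txt
# 4. Commit changes to your branch.
git commit -q -a -m "Fix typo in greeting"
# 5. Create a patch file for the last commit.
git format-patch -q -1 -o "$tmp/patches"
ls "$tmp/patches"

# 6. The maintainer receives the patch file and applies it as a commit.
cd "$tmp/upstream"
git am -q "$tmp"/patches/*.patch
applied_subject=$(git log -1 --format=%s)
applied_content=$(cat greeting.txt)
```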

Working with two repositories


Sometimes you want to add another remote repository to your local Git repo and pull and push from and to both repositories. The following example describes how to add another remote repository and to pull and fetch from both repositories. You can add another remote repository called remote_name via the following command.

# add remote
git remote add <remote_name> <url_of_gitrepo>
# see all repos
git remote -v
For merging the changes in remote_name create a new branch called newbranch.

# create a new branch which will be used
# to merge changes in repository 1
git checkout -b <newbranch>


Afterwards you can pull from your new repository called remote_name and push to your original repository.

# reminder: your active branch is newbranch
# pull remote_name and merge
git pull <remote_name>
# or fetch and merge in two steps
git fetch <remote_name>
git merge <remote_name>/<newbranch>
# afterwards push to first repository
git push -u origin master

Using pull requests
Another very common Git workflow is the usage of pull requests. In this workflow a developer clones a repository and, once they have something useful for another clone or the origin repository, sends the owner a pull request asking to merge the changes. This workflow is also actively promoted by the GitHub hosting platform.

TYPICAL GIT WORKFLOWS WITH SHARED REPOSITORIES

Working with a shared remote
Git emphasizes the creation of branches for feature development or to create bug fixes. The following description contains a typical Git workflow for developing a new feature or bug fix and pushing it to the remote repository which is shared with other developers. After cloning the repository the developer:

1. creates a new branch for the development
2. changes content in the working tree, then adds and commits the changes
3. if required, switches to other branches to do other work
4. once the development in the branch is complete, rebases (or merges) the commit history onto the relevant remote tracking branch to allow a fast-forward merge for this development
5. pushes the changes to the remote repository, which results in a fast-forward merge in the remote repository

During this development the developer may fetch and merge or rebase the changes from the remote repository at any point in time. The developer may use the pull command instead of the fetch command.
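The steps above can be sketched with two clones of one shared bare repository. The developer names alice and bob, the branch name feature, the file names and the identity are assumptions for illustration; the shared branch name is read from the repository because the default differs between Git installations.

```shell
set -eu
# Placeholder identity so that commits work without global Git configuration (assumption).
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"
tmp=$(mktemp -d)

# Shared remote repository, seeded by the first developer.
git init -q --bare "$tmp/shared.git"
git clone -q "$tmp/shared.git" "$tmp/alice" 2>/dev/null
cd "$tmp/alice"
echo "base" > base.txt
git add base.txt
git commit -q -m "initial commit"
branch=$(git symbolic-ref --short HEAD)
git push -q origin "$branch"

# Steps 1 and 2: the second developer clones, branches and commits.
git clone -q "$tmp/shared.git" "$tmp/bob"
cd "$tmp/bob"
git checkout -q -b feature
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "add feature"

# Meanwhile the shared branch moves on.
cd "$tmp/alice"
echo "other" > other.txt
git add other.txt
git commit -q -m "other work"
git push -q origin "$branch"

# Step 4: fetch and rebase the feature onto the remote tracking branch.
cd "$tmp/bob"
git fetch -q origin
git rebase -q "origin/$branch"
# Step 5: push; because of the rebase this is a fast-forward in the shared repository.
git push -q origin "feature:$branch"

total=$(git --git-dir="$tmp/shared.git" rev-list --count "$branch")
merges=$(git --git-dir="$tmp/shared.git" rev-list --merges --count "$branch")
echo "shared history: $total commits, $merges merge commits"
```

Because the feature commits were rebased before the push, the shared history stays linear and contains no merge commit.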
