BUT WHAT THE HECK are Git and Github?

A beginner's guide to version control with Git

Have you always wondered what git is and why everyone is using it? Or maybe, you have used git but couldn’t quite grasp what it was actually doing, right?

If yes, then read along as I tell you exactly what it is, why developers can’t live without it and how it can help make your life a lot easier!

Introduction

Whether you’re a Software Engineer, a Computer Science (CS) Student, or just someone interested in getting into IT, you’ll be expected to be familiar with git.

Learning a new technology can be daunting, especially if it involves the satanic, god-forbidden terminal (aka. command line).

Git can be quite tricky to learn because it is one of the first tools that Developers/CS students are exposed to. I understand that because I too had a hard time getting around git initially.

But it became way easier when I realised what its purpose is and how it manages to do it. I’ll try to explain in a way that I wish someone else had explained to me, when I was starting out.

We’re going to look at git from a top-down perspective because that is easier to understand and uses less jargon.

Some of you might be familiar with Github Desktop, a desktop GUI application that serves as a git client. But I strongly suggest you stick to the terminal because:

• It gives you more control
• It does not limit you to only Github as a git hosting provider
• It makes you feel like a hackerman

It is important to note that there are many programs and tools out there only available via the terminal. Getting used to it has some clear benefits, even if it feels scary at first.

But why is git so important? Why does almost every software related job post have it as a requirement? What the heck is git in the first place goddammit?!

Calm down. Grab your mango juice and Odin’s terminal and let me explain.

The Problems

Before we look at git, we should understand what problems it addresses.

1. Changes can break a working application

Imagine you’re making a web app. You have worked on it for a long time, installed the necessary dependencies and written the necessary code. It works fine.

You then decide to add a cool new feature. After updating your code, your app doesn’t work anymore. You try to fix it, making more changes in the hope that you can at least get back to your previous working state.

But unfortunately, it doesn’t seem to go back to working.

What are you supposed to do now other than crying in the corner?

2. Working in a team project

Suppose you’re working on a team project. You and your team members are working from different locations. If two or more people are editing the same file on the same version, you will eventually run into conflicts. How can you guys work on the same source code in real time?

Serious problem, right?

The Solution

Suppose you’re smarter than the crying cat. Every time you make a change to your code, you make a copy of that version. For example: my-app-v1, my-app-v2, my-app-v3, …, my-app-vN.

What you are doing in fact is Manual Version Control.

Version Control refers to the management of changes to files over time, so that a particular version can be referred to later.

This works, but,

• manual maintenance of the versions is a headache
• files are not stored efficiently, as many unchanged files can be duplicated
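If you want to feel the pain for yourself, here is a rough sketch of manual version control in a throwaway folder. The folder and file names are made up for illustration.

```shell
# Throwaway sandbox; nothing here touches your real projects.
sandbox="$(mktemp -d)"
cd "$sandbox"
mkdir my-app
echo "console.log('v1')" > my-app/index.js
echo "lots of docs" > my-app/README.txt

# "Save a version" by copying the whole folder.
cp -r my-app my-app-v1

# Change one file, then copy everything again.
echo "console.log('v2')" > my-app/index.js
cp -r my-app my-app-v2

# README.txt never changed, yet it now exists three times.
ls -d my-app*
```

Every new version means another full copy, and remembering which copy is which is entirely up to you.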

Fortunately, we’ve had brilliant minds before us who have solved this problem. Version Control Systems to the rescue!

A Version Control System (VCS) is a program that helps to keep track of changes to files over time.

A VCS allows us to track changes to files efficiently. It does this by taking snapshots of the state of files at particular points in time, and those snapshots can be referred to later, whenever needed.

There have been many forms of VCSs since the 1970s,

• from the ones that only operated on a single machine,
• to the ones with client-server architectures (CVS, SVN etc.),
• to the distributed ones we have today (Git, Mercurial, etc.).

That brings us to git.

Git is a free and open source distributed VCS originally developed in 2005 by Linus Torvalds.

Linus Torvalds, if you’re unfamiliar, is also the creator of the Linux kernel. One of the legends of Computer Science.

Git is the de facto standard VCS in software development today. So, let’s git right into it…

Git Terminology

For storing version history, git makes use of repositories. For our intents and purposes:

A Git Repository is simply a folder that git uses to maintain version history for a project.

The repository folder, also called the git directory, is used to store objects (like the contents of files), references and other metadata needed by git to carry out its job. By default, it is a .git folder that lives in the root folder of your project.
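If you are curious what that folder looks like, here is a minimal sketch that creates an empty repository in a throwaway directory and peeks inside. The exact listing may vary a little between git versions.

```shell
# Throwaway sandbox so nothing on your machine is touched.
sandbox="$(mktemp -d)"
cd "$sandbox"

# Turn the folder into a repository; -q keeps the output quiet.
git init -q

# The git directory is a hidden folder at the project root.
ls -a

# Inside it: the object store, the references and the HEAD file.
ls .git
```

You rarely need to touch anything in there by hand; git manages it for you.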

So, how do we measure a change? In git, the unit of change is a commit.

A Commit records changes made to files and folders in the repository.

I like to think of a commit as a snapshot or record of the current state of the project. It is important to note that the changes that are committed are taken from the index, not the working directory.

The Working directory consists of files and folders that you can directly edit and work with.

Files and folders in the working directory are often removed and replaced by git, when you switch to other (older or newer) commits.

The Index is the staging area for a commit.

Meaning, it acts as a middle-man between the working directory and the repository. When you want to save some changes to some files, you add them to the index first before committing to the repository.
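To see the middle-man in action, here is a small sketch in a throwaway repo. Two files are created but only one is added to the index, so only that one ends up in the commit. The file names and the identity values are placeholders.

```shell
sandbox="$(mktemp -d)"
cd "$sandbox"
git init -q
# Identity for this sandbox repo only (placeholder values).
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "milk" > staged.txt
echo "eggs" > not_staged.txt

# Only staged.txt goes into the index.
git add staged.txt
git commit -q -m "commit only what was staged"

# The commit contains staged.txt; not_staged.txt is still untracked.
git ls-tree --name-only HEAD
git status --short
```

The commit is built from the index, which is why the second file is left out even though it sits right there in the working directory.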

In order to keep track of commits, git makes use of branches.

A Branch is just a (moveable) pointer to a commit.

By default, git creates a branch named master (some newer setups name it main instead). Every time you commit on the master branch, the pointer automatically moves forward to the new commit.

We won’t get into branching much in this tutorial. But to put it visually, a branch represents a sequence of commits, with the tip of the branch pointing to a single commit.
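Here is a rough sketch of that moving pointer, again in a throwaway repo with placeholder file contents. It records which commit master names before and after a second commit.

```shell
sandbox="$(mktemp -d)"
cd "$sandbox"
git init -q
# Make sure the branch is called master, whatever the local default is.
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "one" > notes.txt
git add notes.txt
git commit -q -m "first commit"
first="$(git rev-parse master)"

echo "two" >> notes.txt
git add notes.txt
git commit -q -m "second commit"
second="$(git rev-parse master)"

# master now names the second commit; the first one is its parent.
echo "master was: $first"
echo "master is:  $second"
git rev-parse master~1
```

The last command prints the parent of the commit master points to, which should match the id recorded after the first commit.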

Phew! That was a lot of jargon. Now comes the best part …

The High Level View of Git

Here’s a diagram that really helps me visualize git.

The text on each arrow is the command for getting from point A to point B. In git, there are many different ways to achieve the same outcome, but we’re going to focus on the standard practices.

We’ll dive more into the commands in a moment.

And Github?

So where does Github come into play?

So far, the repositories discussed are limited to your computer only.

• What if you want to access your project from another computer?

• What if more people want to contribute to your project?

All of these problems (and more) could be solved - only if there was a way to share repositories through the internet…

*Github has entered the chat*

“I’m a friggin git hosting provider.” - Github

Like some other *hubs you may be familiar with, Github instead is a huge collection of git repositories, hosted by countless poor servers running in data centers connected to the internet (aka. “cloud”).

You can create new repositories in Github through an account, push changes from your local repository to a remote repository, and likewise pull changes in the other direction as well.

Here’s the git diagram with a remote repo.

If you notice, the statement says “a” git hosting provider, meaning, there are more options out there. Two more popular providers are Gitlab and Bitbucket, both having their pros and cons. But we’ll keep our focus on Github for now.

Hands on - Git

It’s about time we get our hands dirty. Open up the mighty terminal and let’s get started…

Installation

First we need to install git in our operating system.

For Mac, run:

brew install git


For Ubuntu, Mint and other Debian-based distros, run:

sudo add-apt-repository ppa:git-core/ppa
sudo apt-get update
sudo apt-get install git


For other Linux distros, further instructions here.

Configuration

After installation, we’ll set the global username and email, which git uses to record who committed what.

git config --global user.name "your username"
git config --global user.email "your.email@example.com"
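To double check what git will record, you can read a setting back with --get. The sketch below also shows a variation that is not part of this tutorial's workflow: leaving out --global stores the value for a single repository only. It runs in a throwaway repo, and the email address is a placeholder.

```shell
sandbox="$(mktemp -d)"
cd "$sandbox"
git init -q

# Without --global the value is stored in this repo's .git/config only.
git config user.email "work.email@example.com"

# Read the value back to check what git will record for this repo.
git config --get user.email
```

A repository-level value takes precedence over the global one, which is handy if you use different emails for work and personal projects.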


A simple Git exercise

Alright, we’re all set. Our project is going to be stupidly simple - a simple grocery list.

So how do we get a repository? We have two options:

• either we create a new one
• or we clone an existing one

Since it’s a new project, we’ll create a new one. But first let’s make a folder for our project.

mkdir groceries
cd groceries


If you’re unfamiliar with these commands,

• mkdir groceries: makes a new directory (i.e. folder) named groceries
• cd groceries: takes the terminal to the groceries directory i.e. changes directory

After making the folder and navigating our terminal into it, run this to initialize a repository:

git init


Open up your favourite editor. I’ll go with my trusty VS Code, which you can download from here.

code .


Make a new text file named to_buy.txt, add two/three items and save it. Here’s mine:

Now check the status of the repo:

git status


It should show the changes made (the text file) and that you’re in the master branch.

Let’s add the untracked file to the index, as it says:

git add to_buy.txt


Now commit, with option -m which allows you to add a message describing the commit:

git commit -m "add initial groceries"


Well done!

Now if you run git status, you’ll see that it says “working tree clean”, meaning there are no uncommitted changes in the working directory.

To check the commit you just made, run:

git log


The long crazy string after “commit” is the commit id which uniquely identifies your commit. It is generated by a hash function, but you don’t have to worry about it now.
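If you are wondering what "generated by a hash function" means in practice, here is a small sketch in a throwaway repo. It prints the full id of a commit and then hashes the same content twice to show that the result is deterministic. The file content is a placeholder.

```shell
sandbox="$(mktemp -d)"
cd "$sandbox"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "rice" > to_buy.txt
git add to_buy.txt
git commit -q -m "add initial groceries"

# The full id of the latest commit, as shown by git log.
commit_id="$(git rev-parse HEAD)"
echo "$commit_id"

# Hashing the same content twice gives the same id both times.
id_a="$(echo "rice" | git hash-object --stdin)"
id_b="$(echo "rice" | git hash-object --stdin)"
echo "$id_a"
echo "$id_b"
```

Your commit id will differ from mine, because the commit also records the author and the time it was made.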

Notice that the HEAD points to master, denoting you’re still on master branch.

HEAD is just a reference to the last commit you switched to.

Let’s add a new item to our file:

Add the changes to the index and commit. Tip: You can add all changes made (to any files) to the index with git add .

git add .
git commit -m "add more groceries"


Now you can view both commits made with:

git log


Notice that the master branch has automatically moved, as we had discussed.

Let’s say we want to go back to our initial commit for some reason. Copy the first commit id, and run checkout.

git checkout your_commit_id


Tip: The first few characters of the id will suffice as well, as long as they identify the commit unambiguously. In my case the id is 606a3e3c25022f8a89a2f895c4d8465292363446, so I can run git checkout 606a3.
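Rather than counting characters by hand, you can ask git for a short id. Here is a minimal sketch in a throwaway repo with placeholder groceries. It looks up the short id of the first commit, checks it out, and then returns to master.

```shell
sandbox="$(mktemp -d)"
cd "$sandbox"
git init -q
# Make sure the branch is called master, whatever the local default is.
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "rice" > to_buy.txt
git add to_buy.txt
git commit -q -m "add initial groceries"
echo "lentils" >> to_buy.txt
git add to_buy.txt
git commit -q -m "add lentils"

# Ask git for a short, unambiguous form of the first commit's id.
short_id="$(git rev-parse --short master~1)"
echo "$short_id"

# Check out the first commit by its short id and look at the old file.
git checkout -q "$short_id"
old_list="$(cat to_buy.txt)"
echo "$old_list"

# Back to the latest commit.
git checkout -q master
```

While the first commit is checked out, the file only contains the item from that commit; the second item comes back once you return to master.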

Congratulations! You can now travel to the past.

Not really but, at least you can go back to meet/restore/fix an old state of your file(s).

You’ll notice that it says you’re in detached HEAD state. That basically means HEAD is pointing directly at a commit instead of at a branch (master, the only branch, is still pointing at the latest commit).

You can now view and edit files here, and create a new branch which has different commits than the one that was your last commit.
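Here is a rough sketch of that idea in a throwaway repo: go back to the first commit, start a new branch there, and commit something different. The branch name old-list and the groceries are made up for illustration.

```shell
sandbox="$(mktemp -d)"
cd "$sandbox"
git init -q
# Make sure the branch is called master, whatever the local default is.
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "rice" > to_buy.txt
git add to_buy.txt
git commit -q -m "add initial groceries"
echo "lentils" >> to_buy.txt
git add to_buy.txt
git commit -q -m "add lentils"

# Go back to the first commit; HEAD is now detached.
git checkout -q master~1

# Create a branch at this commit and switch to it (hypothetical name).
git checkout -q -b old-list

# Commit a different change on top of the old state.
echo "tea" >> to_buy.txt
git commit -q -am "add tea on the old list"

# Both branches share the first commit and then go separate ways.
git log --oneline --all --graph
```

Once the branch exists, HEAD is attached again and the new commit is safe to come back to later.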

Before moving on, let’s go back to the latest commit, which is pointed to by master.

git checkout master


First make a Github account, if you don’t have one already.

After logging in, click the plus button at the top right and click on New repository.

Name your repository. I’ll name mine git-demo-groceries. Now scroll to the bottom and click Create repository.

Great! You now have a repo. Don’t be turned off by all the commands on screen.

Click on the “HTTPS” button under “Quick setup”. Then scroll below and find the section saying “or push an existing repository from the command line”. Copy the two commands below it and run them.

Mine looks like this:

So I can run:

git remote add origin https://github.com/AluBhorta/git-demo-groceries.git
git push -u origin master


Let’s understand the two commands.

• The first one adds a new remote repository named origin.
• The second one pushes your changes (commits) to master branch of origin.
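If you would like to try the same two commands without touching Github, here is a sketch that uses a bare repository in a local folder as a stand-in for the remote. The paths and groceries are placeholders; with Github, only the URL would be different.

```shell
sandbox="$(mktemp -d)"
cd "$sandbox"

# A bare repository plays the role of the remote (Github, in our case).
git init -q --bare remote.git

# Our local repo with one commit.
mkdir groceries
cd groceries
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"
git config user.email "demo@example.com"
echo "rice" > to_buy.txt
git add to_buy.txt
git commit -q -m "add initial groceries"

# Register the remote under the name origin, then push master to it.
git remote add origin "$sandbox/remote.git"
git push -q -u origin master

# List the remotes, then compare the local and remote branch tips.
git remote -v
git rev-parse master origin/master
```

The -u option makes the local master remember origin/master as its upstream, so later a plain git push or git pull knows where to go.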

After push is completed successfully, if you refresh your new Github repository page, Voila! Your groceries are now on Github.

Still with me? Great! We’re almost done.

Let’s add a new file from Github and pull the changes to our local repo.

Go to your Github repository and click on the “Add a README” button at the bottom right. This will automatically generate a README.md file for you to edit.

It is customary for repositories to keep a README.md file at the root, which provides an overview (or installation instructions etc.) about the repository.

Then scroll to the bottom and click Commit new file. The commit message is automatically filled in as “Create README.md”, which you can overwrite.

After the README is created, go back to your terminal and run:

git pull


If everything went according to plan, you now have the README.md file in your local repo. Nice! You can now run git log to view all the commits, including the latest one that added the README.md file.
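The same round trip can be rehearsed entirely on your machine. In the sketch below, a bare repository stands in for Github and a second clone stands in for the Github web editor that created the README. All paths and names are placeholders.

```shell
sandbox="$(mktemp -d)"
cd "$sandbox"
git init -q --bare remote.git

# Our local repo, pushed once so that master tracks origin/master.
mkdir groceries
cd groceries
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"
git config user.email "demo@example.com"
echo "rice" > to_buy.txt
git add to_buy.txt
git commit -q -m "add initial groceries"
git remote add origin "$sandbox/remote.git"
git push -q -u origin master

# Someone else (here: a second clone) adds a README and pushes it.
git clone -q -b master "$sandbox/remote.git" "$sandbox/other"
cd "$sandbox/other"
git config user.name "Other User"
git config user.email "other@example.com"
echo "# groceries" > README.md
git add README.md
git commit -q -m "Create README.md"
git push -q origin master

# Back in our repo, pull brings the new commit and file down.
cd "$sandbox/groceries"
git pull -q
ls
git log --oneline
```

After the pull, the groceries folder should contain both to_buy.txt and README.md, and the log should list two commits.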

That concludes our session, finally! If you need reference, here’s the link to my remote repo from our exercise.

Conclusion

If you have followed along and done the exercises, congratulations! You now know git.

Of course, there is a lot more to learn about git than a single tutorial can tell. But these are the basics of a standard git workflow.

If you want to learn git in depth, you could look into the official git documentation or the Pro Git book by Scott Chacon.

But the best way to learn is, of course, by doing it yourself. So, go out there and git good!