Since the early nineties, the Internet hasn't stopped evolving, and early protocols like Gopher now seem hopelessly outdated. Today, we appear to be moving towards a more decentralized web, with IPFS being a prime example of this trend. The Internet has also boosted cooperation between open-source developers, making tools like Git essential. In this article, we will take a quick look at these three technologies.

Gopher

The first one is a protocol that was released in 1991, the same year as the World Wide Web, and that was once an actual competitor to the Web. Gopher is an internet protocol developed by Mark P. McCahill's team at the University of Minnesota, with the goal of providing a single point of access to all of the university's digital resources, from files and phone directories to Telnet connections to various services.
Gopherspace is essentially composed of a series of linked menus, which are lists of pointers to other internet resources. The simplicity of the protocol is one of its main features: a typical Gopher exchange consists of the client establishing a TCP connection to the server, the server accepting it, the client requesting a file, and the server sending it and closing the connection.
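To give a sense of how little machinery is involved, here is a minimal Gopher client sketch in Python. It assumes a reachable public server (gopher.floodgap.com is used here purely as an example, and may be unreachable or rate-limited at any given time) and the protocol's standard port 70.

import socket

# Minimal Gopher client sketch: open a TCP connection, send a selector
# followed by CRLF, then read until the server closes the connection.
def gopher_fetch(host: str, selector: str = "", port: int = 70) -> bytes:
    with socket.create_connection((host, port)) as sock:
        sock.sendall(selector.encode("ascii") + b"\r\n")
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server signals completion by closing
                break
            chunks.append(data)
    return b"".join(chunks)

# An empty selector requests the server's root menu.
print(gopher_fetch("gopher.floodgap.com").decode("latin-1"))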
Thanks to this simplicity, both in terms of implementation and usage, Gopher rapidly gained a lot of attention and began being adopted by many other universities and institutions around the world, while the WWW remained far less used.
However, things changed dramatically in 1993, when the University of Minnesota started charging licensing fees for its implementation of the Gopher server, prompting many users to move to the Web instead. Then, as the Internet started being used for commercial purposes, the graphical features of the Web (including the <img> tag and CSS stylesheets) as well as HTML's inline hyperlinks proved more useful than ever, and the WWW quickly overtook Gopher, which is now only used by a few hobbyists.

Git

When Git was released in 2005, Version Control Systems had already been around for a while. This type of software keeps a record of every change made to a group of files (generally source code) over time, and makes it possible to roll changes back in case of a problem. Most VCSs were centralized: a server stored the files once and then kept a list of all the changes made to them. Distributed VCSs, in which every client holds a full copy of the repository including its entire history, also appeared; one of them was BitKeeper, a proprietary tool that was used for the development of the Linux kernel.
But in 2005, BitKeeper's free version was withdrawn, which pushed Linus Torvalds to develop his own tool. He used this event as an opportunity to redefine the way VCSs work: instead of keeping a list of modifications to files, Git takes a snapshot of the repository every time a change is committed. This approach relies heavily on checksums to identify content, which also provides strong protection against data corruption, while making the creation and merging of branches easier and more efficient. Finally, because the Linux kernel is a big project, Torvalds needed Git to be fast, which is why, like BitKeeper, it is distributed: most operations are local and need no network round trip.
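To illustrate the checksum idea, the snippet below reproduces how Git derives the identifier of a blob (file content) object: a SHA-1 hash over a small header plus the file's bytes. This is only a sketch of the object-ID scheme, not of Git's full storage model.

import hashlib

# Git stores a file's content as a "blob" object and identifies it by
# the SHA-1 of the header "blob <size>\0" followed by the raw bytes.
# Identical content always produces the same ID, which lets snapshots
# share unchanged files and makes silent corruption detectable.
def git_blob_id(content: bytes) -> str:
    header = f"blob {len(content)}\0".encode()
    return hashlib.sha1(header + content).hexdigest()

print(git_blob_id(b"hello\n"))  # ce013625030ba8dba906f756967f9e9ca394464a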
Today, Git is the most widely used Version Control System, and together with user interfaces and hosting platforms like GitHub and GitLab, it plays a huge role in modern software development workflows.

IPFS

The Web as we know it today, like many internet services, uses a client-server model. But since 2015, IPFS (the InterPlanetary File System) has offered an alternative to this model. It is a distributed protocol that can be used to store and access various types of data, including websites and applications, with the aim of building a more resilient, censorship-resistant and faster Internet.
By nature, IPFS is very different from what we are used to, the most visible change being that data is not addressed by its location (e.g. https://en.wikipedia.org/wiki/Aardvark) but by its content, using hashes called content identifiers, or CIDs (for example: /ipfs/QmXoypizjW3WknFiJnKLwHCnL72vedxjQkDDP1mXWo6uco/wiki/Aardvark.html).
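Content addressing means the same CID can be served by anyone who has the data. As a small sketch, the Python snippet below fetches the example CID above through two public HTTP gateways; ipfs.io and dweb.link are real gateways, but their availability and rate limits are an assumption here.

from urllib.request import urlopen

# The same content-addressed path, fetched from two different gateways:
# the location changes, the content (and its CID) does not.
cid_path = "/ipfs/QmXoypizjW3WknFiJnKLwHCnL72vedxjQkDDP1mXWo6uco/wiki/Aardvark.html"
for gateway in ("https://ipfs.io", "https://dweb.link"):
    with urlopen(gateway + cid_path) as resp:
        page = resp.read()
    print(gateway, len(page), "bytes")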
In order to retrieve a file, a node looks up in a DHT (Distributed Hash Table) which nodes hold the needed data, and asks them to send it. After downloading the file, the node will itself share it for a certain period of time; nodes can also 'pin' content to keep sharing it indefinitely.
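Pinning can also be done programmatically. The sketch below assumes a local IPFS daemon (such as kubo) running with its HTTP RPC API on the default port 5001; the /api/v0/pin/add endpoint is part of that API and requires POST requests.

from urllib.request import Request, urlopen

# Ask the local node to pin a CID so it is kept and shared indefinitely
# instead of being garbage-collected later. Assumes a daemon listening
# on 127.0.0.1:5001 (kubo's default).
cid = "QmXoypizjW3WknFiJnKLwHCnL72vedxjQkDDP1mXWo6uco"
req = Request(f"http://127.0.0.1:5001/api/v0/pin/add?arg={cid}", method="POST")
with urlopen(req) as resp:
    print(resp.read().decode())  # JSON listing the pinned CID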
Because a CID is derived from the data itself, the protocol can easily verify that the data has not been tampered with. And since large files are split into smaller blocks, if two very similar files are stored on IPFS, the system can detect their common parts and avoid storing them twice. The decentralized nature of the system also speeds up transfers, for two reasons: first, you can download from several peers at once (much like BitTorrent); second, you can download from peers close to you instead of from a distant server (on the same principle as CDNs).
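IPFS's actual chunking and Merkle-DAG layout are more elaborate, but the principle of block-level deduplication can be shown with a toy sketch: address fixed-size blocks by their hash, and identical blocks are stored only once.

import hashlib

# Toy illustration (not IPFS's real chunker): split data into fixed-size
# blocks and key each block by its hash, so duplicate blocks collapse.
def blocks(data: bytes, size: int = 4) -> dict:
    return {hashlib.sha256(data[i:i + size]).hexdigest(): data[i:i + size]
            for i in range(0, len(data), size)}

store = {}
store.update(blocks(b"aaaabbbbcccc"))
store.update(blocks(b"aaaabbbbdddd"))  # shares its first two blocks
print(len(store))  # 4 unique blocks stored instead of 6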

Sources

https://flylib.com/books/en/3.223.1.302/1/
https://www.w3.org/People/Bos/PROSA/rep-protocols.html#gopher
(In French) https://connect.ed-diamond.com/GNU-Linux-Magazine/glmf-136/gopher-a-la-recherche-du-protocole-perdu
https://www.howtogeek.com/661871/the-web-before-the-web-a-look-back-at-gopher/
https://git-scm.com/book/en/v2 (chap. 1)
https://www.welcometothejungle.com/en/articles/btc-history-git
https://docs.ipfs.io/concepts/#ipfs-101
https://academy.moralis.io/blog/interplanetary-file-system-explained-what-is-ipfs
