About the IPFS
What is IPFS?
IPFS stands for InterPlanetary File System.
A definition by ipfs.io: "The InterPlanetary File System (IPFS) is a protocol and peer-to-peer network for storing and sharing data in a distributed file system. IPFS uses content-addressing to uniquely identify each file in a global namespace connecting all computing devices."
Example:
- When you want to access a web page like https://en.wikipedia.org/wiki/Aardvark, your browser sends an HTTP request to the server that hosts the page and waits for its response. Usually this works well, but when the server is unavailable or the page has been moved, you may face delays or get an error such as 404 File Not Found. This model is called client-server.
- With IPFS, you can access the same page through a link like https://ipfs.gateway.name/ipfs/QmXoypizjW3WknFiJnKLwHCnL72vedxjQkDDP1mXWo6uco/wiki/Aardvark.html. When you use this link, IPFS does not ask Wikipedia's server for the page. Instead, your computer asks many computers around you for the pieces of the page, collects them, and combines them to build the page and show it to you. This model is called peer-to-peer.
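The structure of the IPFS link above can be sketched in code. This is a minimal illustration, assuming the path-style layout seen in that link (gateway, then /ipfs/, then the CID, then the path inside the content). The helper name gateway_url is hypothetical, and ipfs.gateway.name is the placeholder host from the example, not a real gateway.

```python
from urllib.parse import quote


def gateway_url(gateway: str, cid: str, path: str = "") -> str:
    """Build a path-style gateway URL: <gateway>/ipfs/<cid>/<path>."""
    url = f"{gateway.rstrip('/')}/ipfs/{cid}"
    if path:
        # quote() keeps "/" by default, so nested paths stay readable.
        url += "/" + quote(path.lstrip("/"))
    return url


# CID and path taken from the Aardvark example in the text.
CID = "QmXoypizjW3WknFiJnKLwHCnL72vedxjQkDDP1mXWo6uco"
print(gateway_url("https://ipfs.gateway.name", CID, "wiki/Aardvark.html"))
```

Note that the host part only selects which gateway you talk to; the content itself is identified by the CID, so swapping the gateway does not change which page is requested.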
As the example shows, the two protocols differ in several ways, and these differences are exactly the advantages of IPFS over HTTP.
| | HTTP | IPFS |
|---|---|---|
| Architecture | Centralized | Decentralized |
| Addressing | Location-based | Content-based |
| Participation | More users make it slower | More users make it faster |
Architecture: With decentralization, IPFS is more reliable, harder to censor, and faster:
- Decentralization creates a more resilient internet because files are not stored on only one or a few servers but on many ordinary computers. Therefore, when one or several servers are attacked or turned off for any reason, the files remain accessible from other computers. This makes IPFS more reliable than HTTP.
- IPFS makes it harder to censor content. Since files on IPFS are stored on many computers, it is harder for anyone to change, block, or remove content on the IPFS network.
- This decentralized architecture can also speed up the web. For example, if you access a site over HTTP and are far away from its server, the page may load very slowly or the connection may even drop. With IPFS, you can load the page faster by retrieving it from someone near you instead of from the faraway server.
Addressing:
- In the above example, HTTP uses en.wikipedia.org/wiki/Aardvark to identify the page. To reach it, you go to en.wikipedia.org, then to wiki, and arrive at the Aardvark page. This is a location-based identifier: it says where the page is stored. With IPFS, you access the page through QmXoypizjW3WknFiJnKLwHCnL72vedxjQkDDP1mXWo6uco. This is a CID (content identifier), which is derived from a hash of the page's content, so it says what the content is rather than where it lives.
- Because IPFS stores content on many computers, it cannot keep track of every location of every file on every computer. This is one of the reasons IPFS uses content-based addressing.
Participation:
- On the traditional internet, which uses a client-server architecture, the more users access a site, the slower it gets, because the server's bandwidth is limited.
- On the IPFS network, however, the more users access a piece of content, the faster it can be served, because it is replicated on more computers.
How does IPFS work?
To understand how IPFS works, you need to understand its three fundamental principles:
- Unique identification via content addressing
- Content linking via directed acyclic graphs (DAGs)
- Content discovery via distributed hash tables (DHTs)
Content addressing:
- IPFS uses a content identifier (CID) to address content on its network. The CID of a piece of content is derived from a cryptographic hash of that content. The hash function is SHA-256 by default, but others can be used.
- A CID is used not only to identify content but also to link pieces of content together. The InterPlanetary Linked Data (IPLD) project was created to handle this linking.
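To make content addressing concrete, here is a simplified sketch of how an identifier can be derived from content alone. It hashes the raw bytes with SHA-256, prefixes the digest with the multihash header for SHA-256 (0x12, 0x20), and encodes the result in base58, which is why the output begins with "Qm" like the CID in the example. The function names are hypothetical, and this is an assumption-laden toy: a real IPFS CID is computed over an encoded DAG node rather than the raw bytes, so the values below will not match what an IPFS node would produce for the same data.

```python
import hashlib

# Bitcoin-style base58 alphabet (no 0, O, I, or l).
B58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def base58_encode(data: bytes) -> str:
    """Encode bytes as base58 text."""
    number = int.from_bytes(data, "big")
    encoded = ""
    while number > 0:
        number, remainder = divmod(number, 58)
        encoded = B58_ALPHABET[remainder] + encoded
    # Each leading zero byte is represented by a leading "1".
    leading_zeros = len(data) - len(data.lstrip(b"\x00"))
    return "1" * leading_zeros + encoded


def toy_content_id(content: bytes) -> str:
    """Derive a CID-like identifier from the content itself (simplified)."""
    digest = hashlib.sha256(content).digest()
    multihash = bytes([0x12, 0x20]) + digest  # 0x12 = sha2-256, 0x20 = 32 bytes
    return base58_encode(multihash)


same_1 = toy_content_id(b"hello IPFS")
same_2 = toy_content_id(b"hello IPFS")
other = toy_content_id(b"hello IPFS!")
print(same_1 == same_2)  # identical content gives an identical identifier
print(same_1 == other)   # any change in content gives a different identifier
```

The key property is that the identifier depends only on the bytes, not on who stores them or where.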
Directed acyclic graphs (DAGs):
- IPFS uses a Merkle DAG structure to link CIDs together.
- IPFS splits a file into many small blocks. The block size is configurable; the tiny sizes used in this example, such as 32 bytes, are for illustration only, and real nodes typically use much larger blocks (256 KiB is a common default). Each block has its own CID, and a number of blocks are linked together to form a parent node. In the picture above, assume that a 481-byte file is added to IPFS with a block size of 32 bytes: the file is split into 16 blocks, the first 15 holding 32 bytes each (480 bytes in total) and the last one holding just 1 byte. The first eleven blocks are linked together to form a node, whose CID is calculated from a hash over those 11 blocks, and so on.
- Folders have their own CIDs. A folder's CID is a hash computed over the CIDs of the files in the folder.
- One advantage of this design is that two similar files can share the blocks they have in common, so those blocks are stored only once.
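The chunking arithmetic and the block sharing described above can be checked with a small sketch. It is a toy model under stated assumptions: block identifiers are plain SHA-256 hex digests rather than real CIDs, all blocks of a file are linked under a single root node instead of intermediate nodes, and the names split_blocks, add_file, and store are hypothetical.

```python
import hashlib

BLOCK_SIZE = 32  # tiny block size, matching the example in the text


def split_blocks(data: bytes, block_size: int = BLOCK_SIZE) -> list:
    """Cut data into consecutive blocks; the last one may be shorter."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def block_id(block: bytes) -> str:
    """Identify a block by the hash of its content."""
    return hashlib.sha256(block).hexdigest()


def node_id(child_ids: list) -> str:
    """Identify a parent node by hashing the identifiers of its children."""
    return hashlib.sha256("".join(child_ids).encode("ascii")).hexdigest()


def add_file(data: bytes, store: dict) -> str:
    """Store every block under its identifier and return the root identifier."""
    child_ids = []
    for block in split_blocks(data):
        bid = block_id(block)
        store[bid] = block  # an identical block overwrites itself: deduplication
        child_ids.append(bid)
    return node_id(child_ids)


# A 481-byte file made of non-repeating content, and a copy with one byte changed.
file_a = b"".join(i.to_bytes(2, "big") for i in range(240)) + b"!"
file_b = file_a[:-1] + b"?"

store = {}
root_a = add_file(file_a, store)
root_b = add_file(file_b, store)

blocks = split_blocks(file_a)
print(len(file_a), len(blocks), len(blocks[-1]))  # 481 16 1
print(len(store))  # 17 blocks stored instead of 32: 15 are shared by both files
print(root_a == root_b)  # False: changing one block changes the root identifier
```

Because a parent's identifier depends on its children's identifiers, any change in a block propagates up to the root, while unchanged blocks keep their identifiers and can be reused.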
Distributed hash tables (DHTs):
- With content identifiers you can address content on the IPFS network, and with the Merkle DAG you can link content together. What remains is how to discover a file on the network. IPFS uses distributed hash tables to find nodes and content.
- A hash table is simply a mapping from keys to values, and a distributed hash table is one where the table is split across all the peers in a distributed network. To find content, you ask these peers.
- The libp2p project acts as the "search engine" of the IPFS network. It uses DHTs to discover nodes and content on the network.
- You use the DHT to find which peers have the blocks you need (routing), and then you connect to those peers and fetch the blocks (exchange).
- To request blocks from and send blocks to other peers, IPFS currently uses a module called Bitswap.
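The idea of a hash table split across peers can be sketched as follows. This is a toy model, not libp2p's implementation: it runs in a single process, assumes every peer is known up front, and places each record on the one peer whose hashed name is closest to the hashed content identifier by XOR distance (the distance metric used by Kademlia-style DHTs). The class ToyDHT, its methods, and the peer names are all hypothetical.

```python
import hashlib


def key_of(name: str) -> int:
    """Map a peer name or content identifier into one shared key space."""
    return int.from_bytes(hashlib.sha256(name.encode("utf-8")).digest(), "big")


class ToyDHT:
    """Provider records (content id -> peers holding it), split across peers."""

    def __init__(self, peer_names):
        self.peers = {key_of(name): name for name in peer_names}
        # Each peer holds only its own slice of the overall table.
        self.tables = {name: {} for name in peer_names}

    def closest_peer(self, content_id: str) -> str:
        """Return the peer whose key has the smallest XOR distance to the content key."""
        target = key_of(content_id)
        return self.peers[min(self.peers, key=lambda peer_key: peer_key ^ target)]

    def provide(self, content_id: str, provider: str) -> None:
        """Announce that provider holds content_id (routing information only)."""
        table = self.tables[self.closest_peer(content_id)]
        table.setdefault(content_id, set()).add(provider)

    def find_providers(self, content_id: str) -> set:
        """Ask the responsible peer which peers hold content_id."""
        return self.tables[self.closest_peer(content_id)].get(content_id, set())


dht = ToyDHT(["peer-a", "peer-b", "peer-c", "peer-d"])
dht.provide("example-cid", "peer-b")
print(dht.find_providers("example-cid"))  # {'peer-b'}
```

The lookup only answers the routing question (who has the blocks); fetching the blocks from those peers is the separate exchange step that Bitswap handles.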
Getting started with IPFS:
- To access content on the network, you first need at least one gateway, such as "https://ipfs.gateway.name" in the above example.
- You can find some public and free gateways at this link: https://ipfs.github.io/public-gateway-checker/
- Or, if you are willing to pay for higher quality of service, you can use paid gateways such as https://infura.io/pricing or https://www.pinata.cloud/pricing
After that, you can learn how to build a private IPFS network and use an IPFS gateway to join public IPFS networks through this link: https://www.geekdecoder.com/setting-up-a-private-ipfs-network-with-ipfs-and-ipfs-cluster/
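Since any gateway can serve the same CID, a client can try several gateways in turn. The following is a minimal sketch under stated assumptions: it uses only Python's standard library, assumes path-style gateway URLs, and takes the gateway list from the caller. The function name fetch_from_gateways and its opener parameter are hypothetical, and the hosts in the usage comment are placeholders rather than real gateways.

```python
from urllib.request import urlopen


def fetch_from_gateways(cid, gateways, opener=urlopen, timeout=10):
    """Return the bytes for cid from the first gateway that answers."""
    last_error = None
    for gateway in gateways:
        url = f"{gateway.rstrip('/')}/ipfs/{cid}"
        try:
            with opener(url, timeout=timeout) as response:
                return response.read()
        except OSError as err:
            # URLError, HTTPError, and timeouts are all subclasses of OSError.
            last_error = err
    raise RuntimeError(f"no gateway could serve {cid}") from last_error


# Usage (placeholder hosts; replace them with gateways you trust):
# data = fetch_from_gateways(
#     "QmXoypizjW3WknFiJnKLwHCnL72vedxjQkDDP1mXWo6uco",
#     ["https://ipfs.gateway.name", "https://another.gateway.example"],
# )
```

The opener parameter exists so the function can be exercised without network access, for example by passing a stub in tests.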
SWOT analysis of IPFS
Strengths
- IPFS is a highly available system with no single point of attack, thanks to its decentralized, peer-to-peer design.
- It minimizes bottlenecks, especially when traffic spikes, because more users make content delivery faster.
- With IPFS, you can compare the data you receive from many sources, so you no longer have to trust any central party.
- IPFS builds a network that is less affected by outages.
- A big company like Facebook has to pay millions of dollars to build its servers, but with IPFS there is no need to do so. Therefore, IPFS can reduce costs for data providers.
- Bandwidth is no longer wasted on reaching an overseas website when you can get the same content from your neighbors on the IPFS network.
- One of the key advantages of IPFS is that it makes content almost impossible to censor. Some governments may not like this, but it is still an advantage of IPFS.
- IPFS is safer than HTTP in the sense that you can verify the data you receive with the hash function.
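The verification mentioned in the last point can be sketched in a few lines. This is a simplified illustration that assumes the expected value is a plain SHA-256 hex digest of the raw bytes; a real IPFS client verifies blocks against the hash embedded in the CID. The function name verify_block is hypothetical.

```python
import hashlib
import hmac


def verify_block(data: bytes, expected_sha256_hex: str) -> bool:
    """Check that received data matches the digest it was requested by."""
    actual = hashlib.sha256(data).hexdigest()
    # compare_digest performs a constant-time comparison of the two strings.
    return hmac.compare_digest(actual, expected_sha256_hex.lower())


original = b"the page you asked for"
expected = hashlib.sha256(original).hexdigest()

print(verify_block(original, expected))              # True
print(verify_block(b"a tampered page", expected))    # False
```

Because the check depends only on the data and the identifier you already hold, it works no matter which peer sent the data.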
Weaknesses
However, IPFS still has some weaknesses:
- Firstly, content addressing in IPFS is currently not user-friendly, so the user experience will most likely deteriorate. People are used to visiting a site through a link with a domain name, like https://en.wikipedia.org/wiki/Aardvark, so moving to CIDs will be hard for end users.
- On IPFS, there is also little incentive for nodes to maintain long-term backups of data on the network. This is a big problem for IPFS because non-technical users don't want to waste their storage; they will delete cached data and any data they no longer need.
- The costs for maintaining the network would skyrocket.
- Barriers to adopting IPFS remain because WWW/HTTP is so widely used; evolving the existing protocols incrementally seems more likely than changing paradigms.
Opportunities
IPFS has a significant opportunity to replace HTTP and build a better web for all of us. Whether that will happen remains an open question.
IPFS is a foundational technology, so many applications and use cases can be built on top of it:
- The global marketplace for data storage: https://filecoin.io/
- Query the DWeb across blockchains: https://thegraph.com
- Share files: https://github.com/ipfs/ipfs-desktop or https://enzypt.io/
- Collaborate: https://peerpad.net/
- P2P video streaming platform: http://www.blust.tv/
- IPFS as infrastructure: https://cluster.ipfs.io/ or https://www.npmjs.com/
- Decentralized database: https://github.com/orbitdb/orbit-db
- Deploy your website on IPFS: https://fleek.co/
- Build a dApp: https://fission.codes/; https://fleek.co/ or https://linktr.ee/textileio
Threats
- Sensitive information such as banking details, accounts, and messages can leak.
- Data published on IPFS networks is hard to remove.
- Storage is wasted because of the resident data mechanism.
- Not enough nodes may participate in the network.
Conclusion
With its superior features, IPFS has great potential to start a new revolution that could complement or even replace HTTP, completely changing the way people access and use the internet. Many applications are already being built on top of IPFS; they attract many users and demonstrate the strengths of this network. For example, at the time of writing, Filecoin with its FIL token is ranked 39 on coingecko.com, and The Graph with GRT is ranked 50. However, IPFS still has many unresolved weaknesses. Therefore, using IPFS as an infrastructure to build and develop applications needs to be considered carefully.