This is the fourth and last chapter of a series about web decentralization. As we saw in the previous chapter, the blockchain and smart contracts are great for storing records in some specific use cases (like financial transactions), but they are not going to replace your whole database. Here we are going to focus on distributed and decentralized databases: what the current candidates are and how they compare with each other. Then we will conclude this series and take a step back to review what we have learned.
For once, let’s start with a theorem — the CAP theorem:
The CAP theorem states that it is impossible for a distributed data store to simultaneously provide more than two out of the following three guarantees:
– Consistency: Every read receives the most recent write or an error
– Availability: Every request receives a (non-error) response — without the guarantee that it contains the most recent write
– Partition tolerance: The system continues to operate despite an arbitrary number of messages being dropped (or delayed) by the network between nodes
This theorem is not perfect, but it is simple and gives you an idea of the design trade-offs at a glance. It will help us classify the different candidates.
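To make the trade-off more concrete, here is a minimal sketch of a toy two-replica store that can run in either "CP" or "AP" mode. Everything in it (`TwoNodeStore`, `PartitionError`, the `partitioned` flag, `demo`) is hypothetical and only meant to illustrate the definitions above; real databases are far more subtle.

```python
class PartitionError(Exception):
    """Raised by the CP store when it cannot reach its peer."""


class TwoNodeStore:
    """Toy store with two replicas and a switchable network partition."""

    def __init__(self, mode):
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.values = [None, None]  # one value per replica
        self.partitioned = False    # True when the replicas cannot talk

    def write(self, node, value):
        if self.partitioned:
            if self.mode == "CP":
                # Consistency first: refuse rather than let replicas diverge.
                raise PartitionError("cannot replicate write during partition")
            # Availability first: accept locally, replicas may now disagree.
            self.values[node] = value
            return
        # No partition: the write reaches both replicas.
        self.values = [value, value]

    def read(self, node):
        if self.partitioned and self.mode == "CP":
            raise PartitionError("cannot confirm latest value during partition")
        return self.values[node]


def demo(mode):
    """Write, cut the network, write again on node 0, then read from node 1."""
    store = TwoNodeStore(mode)
    store.write(0, "v1")
    store.partitioned = True
    try:
        store.write(0, "v2")
    except PartitionError:
        return "error"
    return store.read(1)


print(demo("CP"))  # error
print(demo("AP"))  # v1 (a stale value: node 1 never saw "v2")
```

In CP mode the store answers with an error during the partition, while in AP mode it keeps answering but may return an outdated value.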
Also, if you have no idea of what we are talking about, here is a good introduction:
All good? Now let’s have a look at these 5 solutions:
- GUN is a real-time, decentralized, offline-first graph database.
- AP: it guarantees availability and partition tolerance, but only eventual consistency. This makes it particularly interesting for real-time applications (chat, online games, live sensor visualization for IoT, …), but you should probably avoid it when consistency is critical, as in banking. Read more here.
- Storage-agnostic. Current adapters include File Storage, Local Storage (in the browser), IPFS and S3.
- Highly scalable: capable of doing 20M+ API ops/sec.
- VC-backed and raised $1.5M in 2018.
The roadmap is available here.
Note that GUN can be used without any server (except for peer discovery), but the data will then only persist in the browser's local storage. So if you need data persistence, you should always keep a few nodes running.
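To give an intuition of what "eventual consistency" means for peers that edit data offline and sync later, here is a minimal sketch of a last-write-wins merge. This is a deliberate simplification and an assumption on my side: it is not GUN's actual conflict resolution algorithm, and the names `Versioned` and `merge` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Versioned:
    """A value tagged with the (logical) time at which it was written."""
    value: str
    timestamp: int


def merge(local, remote):
    """Merge two peers' states key by key, keeping the most recent write.

    Ties on the timestamp are broken by comparing the values, so that every
    peer picks the same winner regardless of the order in which it syncs.
    """
    merged = dict(local)
    for key, incoming in remote.items():
        current = merged.get(key)
        if current is None or (
            (incoming.timestamp, incoming.value)
            > (current.timestamp, current.value)
        ):
            merged[key] = incoming
    return merged


# Two peers edit the same record while offline, then exchange their states.
alice = {"status": Versioned("online", 1)}
bob = {"status": Versioned("away", 2), "nick": Versioned("bob", 1)}

# Whatever the sync order, both peers converge to the same state.
print(merge(alice, bob) == merge(bob, alice))  # True
```

Between two syncs, Alice and Bob can read different values for `status`. That is the price paid for staying available while offline.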
The Universe is not Strongly Consistent
GUN made an interesting trade-off by focusing on real-time and offline capability. If you want to understand why, this presentation explains the pros and cons in a simple and fun way. GUN may not suit every application, but it is definitely worth a look!
Fluence started as a decentralized database network and recently shifted to become a trust-less query layer over the Ethereum blockchain.
Fluence creates an independent network with its own decentralized consensus for each query, security model and crypto-economics design. This network is optimized for retrieving and processing data from decentralized sources without sacrificing the security and performance. Databases, query languages and arbitrary algorithms can be created and deployed to the Fluence network and can run completely decentralized on the Fluence virtual machine.
With Fluence, you will be able to query data from the blockchain and from cold storage (Ethereum Swarm, IPFS) using SQL, GraphQL or a custom query language.
In the same vein, you might be interested in EthQL, a GraphQL interface to Ethereum (which I’m not going to detail here because it relies on a centralized server).
Bluzelle provides a decentralized NoSQL key-value database with a CRUD API. As with many file storage solutions, nodes can be rewarded with BLZ tokens for hosting data. If you are a consumer, you use the same cryptocurrency to rent data storage in the swarm. Bluzelle is also open source.
Bluzelle is still in beta and you can try it for free on the TestNet.
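If you have never used a key-value store with a CRUD API, the following sketch shows the general shape of such an interface with a plain in-memory dictionary. It is only an illustration: the class and method names are hypothetical and are not taken from Bluzelle's client library, and nothing here is decentralized.

```python
class KeyValueStore:
    """In-memory stand-in for a key-value database exposing CRUD operations."""

    def __init__(self):
        self._data = {}

    def create(self, key, value):
        # Create only succeeds for keys that do not exist yet.
        if key in self._data:
            raise KeyError(f"key already exists: {key}")
        self._data[key] = value

    def read(self, key):
        # Raises KeyError if the key is missing.
        return self._data[key]

    def update(self, key, value):
        # Update only succeeds for keys that already exist.
        if key not in self._data:
            raise KeyError(f"no such key: {key}")
        self._data[key] = value

    def delete(self, key):
        # Raises KeyError if the key is missing.
        del self._data[key]


db = KeyValueStore()
db.create("user:42", "Ada")
db.update("user:42", "Ada Lovelace")
print(db.read("user:42"))  # Ada Lovelace
db.delete("user:42")
```

In a decentralized store, each of these calls would be sent to the swarm instead of a local dictionary, but the mental model for the developer stays the same.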
With high throughput, low latency, powerful query functionality, decentralized control, immutable data storage and built-in asset support, BigchainDB is like a database with blockchain characteristics.
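BigchainDB introduces a new paradigm in databases. It is asset-focused (instead of table or document focused) and works with 2 kinds of transactions, CREATE and TRANSFER: a CREATE transaction registers a new asset, and a TRANSFER transaction hands an existing asset over to a new owner.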
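To illustrate the asset-focused model, here is a minimal sketch of an append-only ledger with the two operations. It is a toy model under my own assumptions, not BigchainDB's driver API or transaction format: `AssetLedger`, its methods and the transaction fields are all hypothetical, and there are no signatures or consensus involved.

```python
import hashlib
import json


class AssetLedger:
    """Append-only list of CREATE and TRANSFER transactions (toy model)."""

    def __init__(self):
        self.transactions = []

    def _append(self, tx):
        # The sequence number keeps ids unique even for identical payloads.
        tx = {**tx, "seq": len(self.transactions)}
        payload = json.dumps(tx, sort_keys=True).encode()
        tx_id = hashlib.sha256(payload).hexdigest()
        self.transactions.append({**tx, "id": tx_id})
        return tx_id

    def create(self, data, owner):
        """Register a new asset; the id of this transaction is the asset id."""
        return self._append({"operation": "CREATE", "data": data, "owner": owner})

    def owner_of(self, asset_id):
        """Replay the history of an asset to find its current owner."""
        owner = None
        for tx in self.transactions:
            if tx["id"] == asset_id or tx.get("asset_id") == asset_id:
                owner = tx["owner"]
        if owner is None:
            raise KeyError(f"unknown asset: {asset_id}")
        return owner

    def transfer(self, asset_id, sender, recipient):
        """Hand an asset over; only its current owner is allowed to do so."""
        if self.owner_of(asset_id) != sender:
            raise PermissionError("only the current owner can transfer an asset")
        return self._append(
            {"operation": "TRANSFER", "asset_id": asset_id, "owner": recipient}
        )


ledger = AssetLedger()
bike = ledger.create({"type": "bicycle", "serial": "abc123"}, owner="alice")
ledger.transfer(bike, sender="alice", recipient="bob")
print(ledger.owner_of(bike))  # bob
```

Nothing is ever updated or deleted: the current state of an asset is derived from its history, which is what makes this model a good fit for traceability.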
BigchainDB is now managed by the IPDB Foundation, which provides a TestNet and maintains the open source code base. Drivers are available in several languages, as well as a GraphQL interface.
Ties.DB is a public, distributed and decentralized database with one common thread: trust. This is enforced by integrated fault tolerance, incentive schemes and smart contracts.
Like Bluzelle, Ties.DB incentivizes nodes to process queries and host data by paying them in TIE tokens. For now, the network is integrated with Ethereum, but they are planning to integrate with other blockchains in the future. Ties.DB is AP. You can give the alpha version a try here.
Bonus point for offering a user-friendly GUI (something similar to phpMyAdmin for MySQL).
We have had a look at the 3 most important bricks of the decentralized web:
- How to store files
- How to get a consensus
- How to store relational data
So what’s the current state of these technologies? Should you start your new project in a decentralized way?
Well in my opinion… it depends on your project. We are still at a very early stage of web decentralization, and it is a totally new paradigm compared to the current web.
If you are working on something experimental, then go for it and choose any of those technologies. You will learn A LOT. Even though some tools are still unstable for the moment, they should get better with time and you will be part of it!
If your project does not aim to generate revenue but needs to store a large amount of data, I would recommend decentralizing the storage layer on IPFS. You will save a lot of money and still have something extremely scalable. You could ask your community to host some IPFS nodes. In addition, if your project dies for some reason, users could still have access to their data on IPFS.
If your project needs real-time data but does not require strong consistency (like for a chat), you should consider using GUN. It is really nice to use, looks quite stable and is already being used by large dApps.
Concerning the blockchain, 2018 has seen lots of really interesting projects die. Most of the time, the question that killed them was not “Is blockchain a good fit for that purpose?” but “Is it worth taking on all the complexity the blockchain brings for that purpose?”. I strongly think that blockchain will keep evolving and getting better in the next few years. However, it was probably a few years too early to start using it for every purpose. There are still some really good reasons to start a project using the blockchain, and it is up to everyone to weigh the pros and cons. But if you want to use blockchain just for traceability purposes, something like BigchainDB can probably provide what you need in a simpler way.
Also, I recommend keeping an eye on a few projects which are developing an abstraction layer that makes it much easier for developers to integrate with decentralized technologies. They are still young, but if they can deliver what they promise, building a dApp will be as easy as developing a normal app! Here are my 3 favorites:
- DADI: The decentralized cloud provider. Their roadmap includes storage, compute layer, CDN, API, queues, well everything you need.
- Blockstack: Like DADI, but limited to identity and storage.
- Textile: While building their Photos dApp on top of IPFS, they faced many issues such as identity management, pinning files on IPFS, notifications and getting it to work on mobile. Solutions to all of these are available in their go-textile framework. They also provide a React Native SDK.
That’s all for me, thanks for reading! Feel free to share your thoughts and experiments about web decentralization in the comments.