I asked a question about Ripple’s database implementation, and received this response:
The Ripple server uses SQLite for structured data and a configurable “back end” for unstructured “bulk” storage.
The structured data consists of things like transactions indexed by which accounts they affected. The unstructured data consists of “chunks” of data indexed by hash that constitute portions of network history.
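For concreteness, here is a minimal sketch of those two shapes of data. Every table, column, and variable name below is a hypothetical illustration, not rippled's actual schema:

```python
import sqlite3

# Structured side: relational rows, e.g. transactions indexed by the
# accounts they affected. (Hypothetical schema for illustration only.)
db = sqlite3.connect(":memory:")
db.execute("""
    CREATE TABLE account_transactions (
        tx_hash TEXT    NOT NULL,   -- transaction identifier
        account TEXT    NOT NULL,   -- an account the transaction affected
        ledger  INTEGER NOT NULL,   -- ledger sequence, for ordering
        PRIMARY KEY (account, ledger, tx_hash)
    )
""")

# Unstructured side: the bulk back end (RocksDB in rippled's case) is
# conceptually just a key-value map from content hash to an opaque blob.
bulk_store: dict[bytes, bytes] = {}
bulk_store[b"\x9f" * 32] = b"<serialized chunk of network history>"
```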
The preferred back end for bulk storage is currently RocksDB on Linux platforms.
This strikes me as strange, since Ripple’s architecture allows the developers to place almost any demand they wish on server operators. In other words, why not require a full database server, specifically PostgreSQL?
I found this interesting breakdown of PostgreSQL vs SQLite, and this explanation:
It comes down to how they implement snapshot isolation.
SQLite uses file locking to isolate transactions, allowing a write to proceed only once all reads are finished.
Postgres, in contrast, uses a more sophisticated approach called multiversion concurrency control (MVCC), which allows multiple writes to occur in parallel with multiple reads.
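The SQLite half of that claim is easy to demonstrate with the standard-library sqlite3 module. This is a minimal sketch of the default rollback-journal behavior (note that SQLite’s WAL mode relaxes it: readers no longer block the single writer):

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.db")

init = sqlite3.connect(path)
init.execute("CREATE TABLE txs (hash TEXT PRIMARY KEY, account TEXT)")
init.execute("INSERT INTO txs VALUES ('abc123', 'rAlice')")
init.commit()
init.close()

# isolation_level=None gives us manual BEGIN/COMMIT control;
# timeout=0 makes lock contention fail immediately instead of retrying.
reader = sqlite3.connect(path, isolation_level=None, timeout=0)
writer = sqlite3.connect(path, isolation_level=None, timeout=0)

reader.execute("BEGIN")                          # open a read transaction
reader.execute("SELECT * FROM txs").fetchall()   # now holds a SHARED lock

try:
    # A committing writer needs an EXCLUSIVE lock on the whole file,
    # which it cannot get while the reader's lock is outstanding.
    writer.execute("BEGIN EXCLUSIVE")
except sqlite3.OperationalError as e:
    print("writer blocked while reader is open:", e)  # "database is locked"

reader.execute("COMMIT")               # reader releases its lock...
writer.execute("BEGIN EXCLUSIVE")      # ...and the writer can proceed
writer.execute("INSERT INTO txs VALUES ('def456', 'rBob')")
writer.execute("COMMIT")
```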
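The MVCC half can be seen with two PostgreSQL connections: a reader holding a snapshot does not block a concurrent writer, and the writer’s commit does not disturb the reader’s snapshot. A sketch, assuming a local PostgreSQL server, the psycopg2 driver, and a hypothetical DSN:

```python
import psycopg2

DSN = "dbname=demo user=postgres host=localhost"  # hypothetical; adjust locally

a = psycopg2.connect(DSN)   # will act as the reader
b = psycopg2.connect(DSN)   # will act as the writer

with a, a.cursor() as cur:  # set up a throwaway table, committed on exit
    cur.execute("DROP TABLE IF EXISTS txs")
    cur.execute("CREATE TABLE txs (hash TEXT PRIMARY KEY, account TEXT)")
    cur.execute("INSERT INTO txs VALUES ('abc123', 'rAlice')")

# Reader: a REPEATABLE READ transaction works against a fixed snapshot.
a.set_session(isolation_level="REPEATABLE READ")
ra = a.cursor()
ra.execute("SELECT count(*) FROM txs")
print("reader's snapshot:", ra.fetchone()[0])    # 1

# Writer: inserts and commits concurrently -- MVCC means no blocking.
with b.cursor() as wb:
    wb.execute("INSERT INTO txs VALUES ('def456', 'rBob')")
b.commit()

ra.execute("SELECT count(*) FROM txs")
print("reader still sees:", ra.fetchone()[0])    # still 1: old snapshot
a.commit()                                       # end the reader's transaction
ra.execute("SELECT count(*) FROM txs")
print("after a new transaction:", ra.fetchone()[0])  # now 2
```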
First, is it true that an embedded, file-based database is the ideal implementation for bulk storage?
Second, is it true that PostgreSQL vastly outperforms a file-based database for concurrent reads and writes?
Lastly, when tables grow to billions of rows, which is superior for concurrent performance: a file-based database or PostgreSQL?
Please assume both alternatives are ideally tuned.