Flume Gets an Overhaul

The Rivet team announced Flume in 2020 after the ETH Denver hackathon, and indeed at the time Flume was a hackathon project — dashed together in 36 hours, accelerating log delivery and offering some enhanced endpoints.

In the years since, Flume has grown in scope. It originally handled log data exclusively, but it now covers blocks, transactions, and transaction receipts as well, and has become an essential part of the Cardinal Architecture.

But under the hood it was still pretty much a hackathon project. It lacked good unit testing, and its codebase felt pretty slapped together. Until now.

Flume has been almost completely rewritten with code quality and testability in mind. Its database architecture has changed, and it even comes with a plugin framework now.

Cardinal RPC

At the heart of the Flume rewrite is the Cardinal RPC framework. In Flume’s original codebase, each RPC method started with an HTTP request and had to parse out its input parameters and carefully construct its response, including any error responses. Cardinal RPC tucks all of that under the hood, allowing each method to focus exclusively on its own functionality.

Cardinal RPC also greatly simplifies testing. Rather than having to construct a mock HTTP object, the Flume services can be invoked as Go functions, making comprehensive unit tests a much simpler proposition.
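To make that separation concrete, here is a minimal sketch of the general pattern in Go (1.18+ for generics). It is not Cardinal RPC's actual API: the `Dispatcher` type, the `Wrap1` adapter, and the `demo_parseQuantity` method are all hypothetical names invented for illustration. The point is that the method itself is a plain Go function, while parameter decoding and error responses live in one shared wrapper.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"strconv"
	"strings"
)

// request and response mirror the JSON-RPC 2.0 envelope.
type request struct {
	ID     json.RawMessage   `json:"id"`
	Method string            `json:"method"`
	Params []json.RawMessage `json:"params"`
}

type rpcError struct {
	Code    int    `json:"code"`
	Message string `json:"message"`
}

type response struct {
	Version string          `json:"jsonrpc"`
	ID      json.RawMessage `json:"id"`
	Result  any             `json:"result,omitempty"`
	Error   *rpcError       `json:"error,omitempty"`
}

// handler is what the dispatcher stores: raw params in, Go value or error out.
type handler func(params []json.RawMessage) (any, error)

// Wrap1 (hypothetical) adapts a one-argument Go function into a handler,
// so the function never sees JSON or HTTP.
func Wrap1[A, R any](fn func(A) (R, error)) handler {
	return func(params []json.RawMessage) (any, error) {
		if len(params) != 1 {
			return nil, errors.New("expected exactly 1 parameter")
		}
		var arg A
		if err := json.Unmarshal(params[0], &arg); err != nil {
			return nil, err
		}
		res, err := fn(arg)
		if err != nil {
			return nil, err
		}
		return res, nil
	}
}

// Dispatcher (hypothetical) owns request parsing and response construction.
type Dispatcher struct{ methods map[string]handler }

func NewDispatcher() *Dispatcher { return &Dispatcher{methods: map[string]handler{}} }

func (d *Dispatcher) Register(name string, h handler) { d.methods[name] = h }

// Handle turns a raw JSON-RPC request into a raw JSON-RPC response.
func (d *Dispatcher) Handle(raw []byte) []byte {
	resp := response{Version: "2.0"}
	var req request
	if err := json.Unmarshal(raw, &req); err != nil {
		resp.Error = &rpcError{Code: -32700, Message: "parse error"}
		return encode(resp)
	}
	resp.ID = req.ID
	h, ok := d.methods[req.Method]
	if !ok {
		resp.Error = &rpcError{Code: -32601, Message: "method not found"}
		return encode(resp)
	}
	result, err := h(req.Params)
	if err != nil {
		resp.Error = &rpcError{Code: -32602, Message: err.Error()}
		return encode(resp)
	}
	resp.Result = result
	return encode(resp)
}

// encode ignores marshalling errors for brevity; real code should not.
func encode(resp response) []byte {
	out, _ := json.Marshal(resp)
	return out
}

// parseQuantity is the "service": a plain function with no transport concerns.
func parseQuantity(hex string) (uint64, error) {
	return strconv.ParseUint(strings.TrimPrefix(hex, "0x"), 16, 64)
}

func main() {
	d := NewDispatcher()
	d.Register("demo_parseQuantity", Wrap1(parseQuantity))
	req := `{"jsonrpc":"2.0","id":1,"method":"demo_parseQuantity","params":["0x10"]}`
	fmt.Println(string(d.Handle([]byte(req))))
}
```

With this shape, a unit test can call `parseQuantity("0x10")` directly and assert on the returned value, with no mock HTTP request or response writer involved.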

Database Redesign

In the original Flume architecture, we used one SQLite database for blocks, transactions, and logs.

Did you say SQLite?

We often get weird looks when we tell people we’re using SQLite for Flume. Because SQLite is an embedded database (as opposed to a networked database), it’s often used in desktop and mobile applications and very rarely in high availability, high throughput server applications. It has a reputation for being small scale.

But the reality is that SQLite is as performant as any commercial-grade database. With Polygon, the Flume logs database is over 4TB, and most queries are measured in low double-digit milliseconds. In the benchmarking we’ve done, SQLite has been competitive with, if not faster than, the other databases we tested for a given query.

And with Rivet’s streaming replication architecture, embedded databases are easy to work with. All of the information that feeds into Flume comes from our PluGeth master servers, either via Kafka or websockets. Cardinal unpacks this data into a BadgerDB database, while Flume uses a SQLite database. Using an embedded database in Flume allows us to operate Flume using the same patterns we use with Cardinal, Geth, and the other services we manage at Rivet.

In the new version of Flume, we use a separate SQLite database for each of these data types: blocks, transactions, and logs. Operationally at Rivet, this allows us to tune the performance of each database’s underlying volume to balance cost against performance. But it also has another side effect.

A standard Flume config file might look like this:

networkName: sepolia
brokers:
  - url: ws://localhost:8555
loggingLevel: info
pluginPath: /var/lib/flume/
databases:
  blocks: /var/lib/flume/blocks/blocks.sqlite
  transactions: /var/lib/flume/tx/transactions.sqlite
  logs: /var/lib/flume/logs/logs.sqlite

This config designates where the blocks, transactions, and logs databases should go. But what if you only want Flume for its original purpose (faster eth_getLogs) and don’t care about blocks or transactions?

Well, if you just leave those out of your config, you’ll only populate the logs database (and in turn only be able to serve log-specific queries).
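Going by that description, a logs-only config would presumably be the standard config with the blocks and transactions entries dropped. The fragment below is a sketch derived from the example above, not a tested configuration; the paths and broker URL are simply carried over from it.

```yaml
networkName: sepolia
brokers:
  - url: ws://localhost:8555
loggingLevel: info
pluginPath: /var/lib/flume/
databases:
  logs: /var/lib/flume/logs/logs.sqlite
```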

For our open source users, this makes Flume more flexible, allowing people to use it for narrower purposes.

Plugin Support

Out of the box, Flume supports Ethereum, Ethereum Classic, and a handful of Ethereum testnets. But Polygon makes some additional demands of Flume. Specifically, Polygon has concepts of Bor transactions and Bor receipts, as well as information about Validators, Proposers, and Snapshots.

We didn’t want to bake a bunch of Polygon-specific functionality into the core of Flume, especially as we look forward to a future where we expect to add support for other EVM-based chains that will each have their own subtle nuances. Instead, we built in a Plugin framework to allow network-specific extensions to Flume. Flume plugins can:

Index additional data
Add new RPC Methods
Alter the responses to existing RPC Methods
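The three capabilities above can be sketched as a small hook registry in Go. This is an illustration of the idea only: the `Registry` type, the hook signatures, and the `example_*` method names are hypothetical, and nothing here reflects Flume's real hook names or how it loads plugins from `pluginPath`.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical hook signatures, one per capability.
type Indexer func(blockNumber uint64) []string // index additional data
type Method func(params []any) (any, error)    // add a new RPC method
type Filter func(result any) any               // alter an existing response

// Registry (hypothetical) is where the core and plugins register behavior.
type Registry struct {
	indexers []Indexer
	methods  map[string]Method
	filters  map[string][]Filter
}

func NewRegistry() *Registry {
	return &Registry{methods: map[string]Method{}, filters: map[string][]Filter{}}
}

func (r *Registry) AddIndexer(ix Indexer)           { r.indexers = append(r.indexers, ix) }
func (r *Registry) AddMethod(name string, m Method) { r.methods[name] = m }
func (r *Registry) AddFilter(name string, f Filter) {
	r.filters[name] = append(r.filters[name], f)
}

// Index runs every registered indexer and collects the rows they produce.
func (r *Registry) Index(blockNumber uint64) []string {
	var rows []string
	for _, ix := range r.indexers {
		rows = append(rows, ix(blockNumber)...)
	}
	return rows
}

// Call invokes a method, then passes its result through any filters.
func (r *Registry) Call(name string, params ...any) (any, error) {
	m, ok := r.methods[name]
	if !ok {
		return nil, errors.New("method not found: " + name)
	}
	result, err := m(params)
	if err != nil {
		return nil, err
	}
	for _, f := range r.filters[name] {
		result = f(result)
	}
	return result, nil
}

// registerCore stands in for the network-agnostic core.
func registerCore(r *Registry) {
	r.AddMethod("example_getBlock", func(params []any) (any, error) {
		if len(params) != 1 {
			return nil, errors.New("expected exactly 1 parameter")
		}
		return map[string]any{"number": params[0]}, nil
	})
}

// examplePlugin stands in for a network-specific extension.
func examplePlugin(r *Registry) {
	r.AddIndexer(func(n uint64) []string {
		return []string{fmt.Sprintf("extra-receipt:%d", n)}
	})
	r.AddMethod("example_getSnapshot", func(params []any) (any, error) {
		return "snapshot-placeholder", nil
	})
	r.AddFilter("example_getBlock", func(result any) any {
		if block, ok := result.(map[string]any); ok {
			block["pluginField"] = true
		}
		return result
	})
}

func main() {
	r := NewRegistry()
	registerCore(r)
	examplePlugin(r)
	fmt.Println(r.Index(7))
	fmt.Println(r.Call("example_getBlock", 7))
	fmt.Println(r.Call("example_getSnapshot"))
}
```

The design choice this illustrates is that the core never names a specific network: anything chain-specific arrives through a registration call, so supporting another chain means writing another plugin instead of modifying the core.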

As with PluGeth, we’re adopting a philosophy of adding plugin hooks on an as-needed basis, so if you have a use case for a new Flume plugin hook, come talk to us on the #flume channel on our Discord.
