How Indexers Work

This chapter describes how a typical indexer and its components work. It then takes the Blockwatch indexer, used by the TzStats explorer, as an example.

Typical Blockchain Explorer Backends

Indexers are node operators that extract, transform, and load blockchain data into a database. They map the data onto a predefined schema of tables with referential integrity, then provide indexing and query-processing services through an API.

Indexing is a data-structure technique for efficiently retrieving records from a database based on the attributes that have been indexed.
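As a minimal sketch of this idea, the Python snippet below compares a full scan with a lookup through an index built on one attribute. The records and field names (`hash`, `sender`, `amount`) are hypothetical and do not follow any real Tezos schema.

```python
# Hypothetical records; the field names are illustrative only.
operations = [
    {"hash": "op1", "sender": "tz1alice", "amount": 5},
    {"hash": "op2", "sender": "tz1bob", "amount": 3},
    {"hash": "op3", "sender": "tz1alice", "amount": 7},
]


def scan_by_sender(records, sender):
    """Without an index: inspect every record."""
    return [r for r in records if r["sender"] == sender]


def build_index(records, attribute):
    """Map each value of `attribute` to the positions of the matching records."""
    index = {}
    for position, record in enumerate(records):
        index.setdefault(record[attribute], []).append(position)
    return index


# The index is built once, then reused for every query on `sender`.
sender_index = build_index(operations, "sender")


def lookup_by_sender(sender):
    """With an index: jump straight to the matching positions."""
    return [operations[i] for i in sender_index.get(sender, [])]
```

Both functions return the same records. The difference is that the scan reads the whole data set on every query, while the indexed lookup only touches the matching entries, at the cost of building and storing the index.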

  • A Tezos node is the heart of the blockchain: it runs the protocol. Here, the archive node is responsible for fetching from the network all the data that the indexer/explorer will use and make available.
  • ETL stands for extract, transform, and load.
  • API is the acronym for Application Programming Interface, which is a software intermediary that allows two applications to talk to each other.

An ETL process extracts data from a Tezos archive node and stores it in a database. The database then serves offline queries and an API.
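The pipeline above can be sketched in a few lines of Python using the standard `sqlite3` module. Everything here is an assumption made for illustration: `RAW_BLOCKS` is hand-written sample data standing in for what a node would return, and the `blocks` and `operations` tables are a hypothetical schema, not the one used by any real indexer.

```python
import sqlite3

# Hypothetical, simplified block data standing in for an archive node's output.
RAW_BLOCKS = [
    {"level": 1, "hash": "BLaaa", "operations": [
        {"hash": "oo1", "source": "tz1alice", "destination": "tz1bob",
         "amount": "1500000"},
    ]},
    {"level": 2, "hash": "BLbbb", "operations": [
        {"hash": "oo2", "source": "tz1bob", "destination": "tz1carol",
         "amount": "250000"},
        {"hash": "oo3", "source": "tz1alice", "destination": "tz1carol",
         "amount": "100000"},
    ]},
]


def extract():
    """Extract: yield raw blocks one by one (here, from the sample data)."""
    yield from RAW_BLOCKS


def transform(block):
    """Transform: flatten a nested block into typed rows for each table."""
    block_row = (block["level"], block["hash"])
    op_rows = [
        (op["hash"], block["level"], op["source"], op["destination"],
         int(op["amount"]))
        for op in block["operations"]
    ]
    return block_row, op_rows


def load(conn, block_row, op_rows):
    """Load: insert the rows; the foreign key enforces referential integrity."""
    conn.execute("INSERT INTO blocks VALUES (?, ?)", block_row)
    conn.executemany("INSERT INTO operations VALUES (?, ?, ?, ?, ?)", op_rows)


def run_etl():
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute(
        "CREATE TABLE blocks (level INTEGER PRIMARY KEY, hash TEXT NOT NULL)")
    conn.execute("""
        CREATE TABLE operations (
            hash TEXT PRIMARY KEY,
            level INTEGER NOT NULL REFERENCES blocks(level),
            source TEXT NOT NULL,
            destination TEXT NOT NULL,
            amount INTEGER NOT NULL
        )""")
    # Index the attribute that queries will filter on.
    conn.execute("CREATE INDEX idx_ops_source ON operations(source)")
    for block in extract():
        load(conn, *transform(block))
    conn.commit()
    return conn


def sent_by(conn, address):
    """The kind of query an API endpoint could run against the database."""
    cursor = conn.execute(
        "SELECT hash, amount FROM operations WHERE source = ? ORDER BY level",
        (address,))
    return cursor.fetchall()
```

Calling `sent_by(run_etl(), "tz1alice")` returns the two operations sent by that address. A production indexer would read from a node instead of a constant and would handle errors, batching, and restarts, all of which this sketch leaves out.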

FIGURE 1: Typical Blockchain Explorer Backends


Focus on Blockwatch Indexer (TzIndex)

The Blockwatch indexer, TzIndex, is used for the TzStats explorer.

The Blockwatch indexer uses a high-performance columnar database that allows for extremely fast analytical queries.

A columnar database stores data column by column rather than row by row. It is optimized for fast retrieval of individual columns, which suits analytical applications. Because a query reads only the columns it needs, the amount of data loaded from disk, and therefore the overall disk I/O, drops significantly.
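To illustrate the difference in layout, here is a small Python sketch that stores the same hypothetical block data row by row and column by column. It only models the idea in memory and says nothing about how the Blockwatch datastore is actually implemented; the field names are made up.

```python
# Row-oriented layout: one complete record per block (hypothetical fields).
rows = [
    {"level": 1, "baker": "tz1alice", "fees": 10},
    {"level": 2, "baker": "tz1bob", "fees": 20},
    {"level": 3, "baker": "tz1alice", "fees": 30},
]


def to_columns(records):
    """Column-oriented layout: one list of values per field."""
    columns = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns


columns = to_columns(rows)


def total_fees_row_store(records):
    # Every full record is visited, even though only `fees` is needed.
    return sum(record["fees"] for record in records)


def total_fees_column_store(cols):
    # Only the `fees` column is read; `level` and `baker` are never touched.
    return sum(cols["fees"])
```

Both functions compute the same total. On disk, the column layout means an aggregate over `fees` can skip the bytes of every other field, which is where the I/O savings come from.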

The Blockwatch datastore is custom-made for blockchain analytics. Avoiding the storage bottleneck allows for more complex data processing.

A storage bottleneck is a situation in which the flow of data is slowed or stopped completely because of poor storage performance or a lack of resources.

State updates happen at each block, so all balance updates are always verified and the indexer follows chain reorganizations in real time.
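One way to picture per-block state updates and reorganization handling is the toy Python class below. It is a sketch under simplifying assumptions, not a description of TzIndex internals: `MiniIndexer`, the block dictionaries, and the `updates` field are all hypothetical, and each block is assumed to carry its own list of balance changes.

```python
class MiniIndexer:
    """Toy indexer that tracks balances and can undo blocks on a reorg."""

    def __init__(self):
        self.chain = []      # applied blocks, oldest first
        self.balances = {}   # address -> current balance

    def _apply(self, block):
        for address, delta in block["updates"]:
            self.balances[address] = self.balances.get(address, 0) + delta
        self.chain.append(block)

    def _rollback(self):
        # Undo the most recent block by reversing its balance updates.
        block = self.chain.pop()
        for address, delta in block["updates"]:
            self.balances[address] -= delta

    def on_block(self, block):
        """Apply a new block, rolling back first if it forks off an ancestor."""
        known_hashes = {b["hash"] for b in self.chain}
        if self.chain and block["predecessor"] not in known_hashes:
            raise ValueError("unknown predecessor: " + str(block["predecessor"]))
        # Roll back until the tip is the new block's predecessor.
        while self.chain and self.chain[-1]["hash"] != block["predecessor"]:
            self._rollback()
        self._apply(block)


indexer = MiniIndexer()
indexer.on_block({"hash": "A", "predecessor": None,
                  "updates": [("tz1alice", 100)]})
indexer.on_block({"hash": "B", "predecessor": "A",
                  "updates": [("tz1alice", -30), ("tz1bob", 30)]})
# A competing block replaces B: B is rolled back, then B2 is applied.
indexer.on_block({"hash": "B2", "predecessor": "A",
                  "updates": [("tz1alice", -10), ("tz1carol", 10)]})
```

After the last call the indexed chain is A followed by B2, and the transfer made in block B no longer counts toward any balance. Keeping the updates of each applied block is what makes the rollback possible.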

Tezos Archive Node => RPC Proxy Cache => Blockwatch Indexer <=> Embedded / External Columnar Datastore. Blockwatch Indexer => Explorer API, Tables API, Time-Series API

FIGURE 2: Blockwatch Indexer

To go further

To learn more on the subject, please refer to the official TzStats blog post [1] and the video [2] that illustrates the inner workings of an indexer.

References

[1] https://tzstats.com/blog/next-gen-blockchain-indexing-for-tezos/

[2] https://www.youtube.com/watch?v=2I9mmA0GzMk