This chapter describes how a typical indexer and its components work, then takes the Blockwatch indexer, used by the TzStats explorer, as an example.
Indexers are node operators that extract, transform and load (ETL) blockchain data into a database. They map the data onto a predefined schema of tables with referential integrity, so that they can provide indexing and query-processing services through an API.
Indexing is a data-structure technique for efficiently retrieving records from a database based on the attributes that have been indexed.
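To make the idea concrete, here is a minimal sketch of an index built over one attribute of a small record set. The records, the addresses and the helper names `build_index` and `lookup` are all hypothetical and chosen for illustration. A real database would typically use a B-tree or a similar on-disk structure rather than an in-memory dictionary.

```python
from collections import defaultdict

# Hypothetical operation records; addresses and amounts are illustrative only.
operations = [
    {"id": 1, "sender": "tz1-alice", "amount": 10},
    {"id": 2, "sender": "tz1-bob", "amount": 5},
    {"id": 3, "sender": "tz1-alice", "amount": 7},
]


def build_index(records, attribute):
    """Map each value of `attribute` to the positions of the records holding it."""
    index = defaultdict(list)
    for position, record in enumerate(records):
        index[record[attribute]].append(position)
    return index


def lookup(records, index, value):
    """Fetch matching records through the index instead of scanning every record."""
    return [records[position] for position in index.get(value, [])]


sender_index = build_index(operations, "sender")
alice_ops = lookup(operations, sender_index, "tz1-alice")
print([op["id"] for op in alice_ops])  # prints [1, 3]
```

The index is built once, at the cost of extra memory. After that, each lookup goes straight to the matching positions without reading unrelated records.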
- A Tezos node is the heart of the blockchain: it manages the protocol. Here, an archive node is responsible for fetching all the data from the network that the indexer/explorer will use and make available.
- ETL stands for extract, transform, and load.
- API stands for Application Programming Interface, a software intermediary that allows two applications to talk to each other.
FIGURE 1: Typical Blockchain Explorer Backends
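The three components above can be sketched end to end as follows. This is a toy illustration under stated assumptions, not how any particular explorer is implemented. `RAW_BLOCKS` stands in for data that would come from an archive node, an in-memory SQLite database stands in for the indexer's store, and the function `operations_by_sender` stands in for an API endpoint. All table, column and function names are hypothetical.

```python
import sqlite3

# Hypothetical raw blocks, standing in for what an archive node would return.
RAW_BLOCKS = [
    {"level": 1, "hash": "BLa", "operations": [
        {"sender": "tz1-alice", "receiver": "tz1-bob", "amount": 10}]},
    {"level": 2, "hash": "BLb", "operations": [
        {"sender": "tz1-bob", "receiver": "tz1-carol", "amount": 4},
        {"sender": "tz1-alice", "receiver": "tz1-carol", "amount": 1}]},
]


def create_schema(conn):
    """Predefined schema: operations reference blocks through a foreign key."""
    conn.execute("PRAGMA foreign_keys = ON")  # enforce referential integrity
    conn.execute("CREATE TABLE blocks (level INTEGER PRIMARY KEY, hash TEXT NOT NULL)")
    conn.execute("""CREATE TABLE operations (
        id INTEGER PRIMARY KEY,
        block_level INTEGER NOT NULL REFERENCES blocks(level),
        sender TEXT NOT NULL, receiver TEXT NOT NULL, amount INTEGER NOT NULL)""")
    conn.execute("CREATE INDEX idx_ops_sender ON operations (sender)")


def extract():
    """Extract: yield raw blocks in chain order (assumed to come from a node)."""
    yield from RAW_BLOCKS


def transform(block):
    """Transform: split one nested block into flat rows matching the schema."""
    block_row = (block["level"], block["hash"])
    op_rows = [(block["level"], op["sender"], op["receiver"], op["amount"])
               for op in block["operations"]]
    return block_row, op_rows


def load(conn, block_row, op_rows):
    """Load: insert the block first so the operations' foreign key resolves."""
    conn.execute("INSERT INTO blocks (level, hash) VALUES (?, ?)", block_row)
    conn.executemany(
        "INSERT INTO operations (block_level, sender, receiver, amount) "
        "VALUES (?, ?, ?, ?)", op_rows)


def operations_by_sender(conn, sender):
    """Stand-in for an API endpoint: answer a query from the indexed store."""
    cursor = conn.execute(
        "SELECT block_level, receiver, amount FROM operations "
        "WHERE sender = ? ORDER BY block_level", (sender,))
    return cursor.fetchall()


conn = sqlite3.connect(":memory:")
create_schema(conn)
for raw_block in extract():
    load(conn, *transform(raw_block))
conn.commit()
print(operations_by_sender(conn, "tz1-alice"))
# prints [(1, 'tz1-bob', 10), (2, 'tz1-carol', 1)]
```

Because foreign keys are enforced, an operation row that points at a block level missing from `blocks` is rejected. This is the kind of guarantee meant by referential integrity.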
The Blockwatch indexer uses a high-performance columnar database that allows for extremely fast analytical queries.
A columnar database stores data column by column rather than row by row. It is optimized for fast retrieval of columns, for example in analytical applications. Because a query reads only the columns it needs, this layout significantly reduces overall disk I/O and limits the amount of data loaded from disk.
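The difference between the two layouts can be illustrated with plain Python lists. This sketch shows the general idea only and does not reflect the internals of the Blockwatch database. The sample rows and the helper `to_columns` are hypothetical.

```python
# Hypothetical rows; field names and values are illustrative only.
rows = [
    {"level": 1, "sender": "tz1-alice", "amount": 10},
    {"level": 2, "sender": "tz1-bob", "amount": 4},
    {"level": 3, "sender": "tz1-alice", "amount": 1},
]


def to_columns(records):
    """Pivot row-oriented records into one contiguous list per column."""
    columns = {name: [] for name in records[0]}
    for record in records:
        for name, value in record.items():
            columns[name].append(value)
    return columns


columns = to_columns(rows)

# Row layout: every record, with all its fields, is visited to sum one field.
total_from_rows = sum(record["amount"] for record in rows)

# Column layout: only the "amount" column is read; the others are never touched.
total_from_columns = sum(columns["amount"])

print(total_from_rows, total_from_columns)  # prints 15 15
```

On disk the same principle applies at a larger scale. An aggregate over one column reads only that column's storage, and values of the same type stored together tend to compress well.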
The Blockwatch database is custom-built for blockchain analytics. Avoiding the storage bottleneck allows for more complex data processing.
A storage bottleneck is a situation in which the flow of data is impaired or stops completely because of poor performance or a lack of resources.
State updates happen at every block, so all balance updates are always verified and the indexer follows chain reorganizations in real time.
FIGURE 2: Blockwatch Indexer
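Per-block state updates and reorganization handling can be sketched as follows. This is a simplified model under stated assumptions, not the Blockwatch implementation. The `Indexer` class, the block layout with `hash`, `predecessor` and `transfers`, and all addresses are hypothetical. Balances may go negative here because the toy `faucet` account is not funded.

```python
class Indexer:
    """Toy indexer that updates balances per block and rolls back on a reorg."""

    def __init__(self):
        self.chain = []      # applied blocks, oldest first
        self.balances = {}   # address -> balance

    def _apply(self, block, sign):
        """Apply (sign=+1) or revert (sign=-1) every transfer in a block."""
        for sender, receiver, amount in block["transfers"]:
            self.balances[sender] = self.balances.get(sender, 0) - sign * amount
            self.balances[receiver] = self.balances.get(receiver, 0) + sign * amount

    def connect(self, block):
        """Attach a block, first reverting any blocks that are no longer on its branch."""
        known = {applied["hash"] for applied in self.chain}
        if self.chain and block["predecessor"] not in known:
            # A real indexer would fetch the missing branch from the node instead.
            raise ValueError("unknown predecessor")
        # Reorganization: pop blocks until the new block's predecessor is the tip.
        while self.chain and self.chain[-1]["hash"] != block["predecessor"]:
            self._apply(self.chain.pop(), sign=-1)
        self._apply(block, sign=+1)
        self.chain.append(block)


indexer = Indexer()
indexer.connect({"hash": "A", "predecessor": None,
                 "transfers": [("faucet", "tz1-alice", 100)]})
indexer.connect({"hash": "B", "predecessor": "A",
                 "transfers": [("tz1-alice", "tz1-bob", 30)]})
# A competing block with the same predecessor replaces block B.
indexer.connect({"hash": "B2", "predecessor": "A",
                 "transfers": [("tz1-alice", "tz1-carol", 5)]})

print([block["hash"] for block in indexer.chain])  # prints ['A', 'B2']
print(indexer.balances["tz1-alice"], indexer.balances["tz1-bob"],
      indexer.balances["tz1-carol"])  # prints 95 0 5
```

Reverting block B restores the 30 units to `tz1-alice` before block B2 is applied. The indexed state therefore always matches the branch the node currently considers canonical.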