Sui Indexing Framework Enables Onchain Data Ingestion

Customizable access to Sui’s onchain data through the Sui Indexing Framework.

The Sui Indexing Framework offers customizable access to Sui’s onchain data through a powerful data ingestion framework. It enables the collection of both raw onchain data and derived datasets by any relevant software, whether operating onchain or offchain.

Customizable data feeds built with the Sui Indexing Framework let developers build software and products that respond to onchain events.

The power of onchain data feeds

Blockchain data structures are designed to ensure the integrity of transactions, which often means they are not optimized for random data access across their entire history. However, customizable data feeds built with the Sui Indexing Framework overcome this limitation, empowering developers to harness onchain data more effectively for real-time analytics and responsive applications. 

Imagine a musician who wants to use NFTs to distribute music to their fans. They create a non-transferable NFT collection in which each NFT, once minted, grants automatic access to an audio file stored in an offchain database. Using the Sui Indexing Framework, a custom indexer can track the minting transactions associated with these specific NFTs on Sui. A separate offchain service can then perform actions, such as transferring audio files, that are triggered by the events the custom indexer monitors.
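The indexer side of this scenario can be sketched as a filter over the events carried by each checkpoint. The snippet below is only an illustration: Event and Checkpoint are simplified stand-in types rather than the real Sui structs, and the event type name 0xmusic::album::Minted is hypothetical.

```rust
/// Simplified stand-in for an onchain event (not the real Sui type).
struct Event {
    /// Fully qualified event type, e.g. "0xmusic::album::Minted" (hypothetical).
    event_type: String,
    /// Address that received the newly minted NFT.
    recipient: String,
}

/// Simplified stand-in for the data carried by one checkpoint.
struct Checkpoint {
    sequence_number: u64,
    events: Vec<Event>,
}

/// Hypothetical type of the event emitted when one of the musician's NFTs is minted.
const MINT_EVENT_TYPE: &str = "0xmusic::album::Minted";

/// Returns one (checkpoint number, recipient) pair per matching mint event.
/// An offchain service could use these pairs to decide who gets an audio file.
fn mint_deliveries(checkpoint: &Checkpoint) -> Vec<(u64, String)> {
    checkpoint
        .events
        .iter()
        .filter(|event| event.event_type == MINT_EVENT_TYPE)
        .map(|event| (checkpoint.sequence_number, event.recipient.clone()))
        .collect()
}
```

Every event that does not match the mint type is ignored, so the offchain service only sees the mints it cares about.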

The Sui Indexing Framework is particularly useful for those who want a leaner Full node setup. Without an indexing solution, Full nodes typically retain the history of every transaction. With the Sui Indexing Framework, a custom indexer can feed checkpoint data into storage that is separate from the Full node. Many apps that rely on Full nodes do not need the node itself to hold recent checkpoint data, as long as that data is stored elsewhere in real time. Full nodes can then be pruned aggressively, which makes for leaner and more efficient infrastructure setups.

Additionally, the Sui Indexing Framework is a key piece needed for the development of onchain data dashboards. While a data analytics platform requires many more elements, the Sui Indexing Framework is a foundational piece for data ingestion that these apps rely on.

How it works

Data ingestion with the Sui Indexing Framework begins with subscribing to the checkpoint stream from Sui in order to receive the most recent data. The most straightforward approach is to subscribe to a remote store of checkpoint data, like the ones Mysten Labs provides:

  • Testnet - https://checkpoints.testnet.sui.io
  • Mainnet - https://checkpoints.mainnet.sui.io

To do this, create a worker that implements the Worker trait and processes the checkpoint data in its process_checkpoint function. The main app then calls that function for each new checkpoint it detects in the remote store.

use anyhow::Result;
use async_trait::async_trait;
use sui_data_ingestion_core::{setup_single_workflow, Worker};
use sui_types::full_checkpoint_content::CheckpointData;

struct CustomWorker;

#[async_trait]
impl Worker for CustomWorker {
    async fn process_checkpoint(&self, checkpoint: CheckpointData) -> Result<()> {
        println!(
            "processing checkpoint {}",
            checkpoint.checkpoint_summary.sequence_number
        );
        // custom processing logic goes here
        Ok(())
    }
}

#[tokio::main]
async fn main() -> Result<()> {
    let (executor, _term_sender) = setup_single_workflow(
        CustomWorker,
        "https://checkpoints.mainnet.sui.io".to_string(),
        0, /* initial checkpoint number */
        5, /* concurrency */
        None, /* extra reader options */
    ).await?;
    executor.await?;
    Ok(())
}
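With a concurrency greater than one, checkpoints may finish processing out of order, so an indexer that wants to resume correctly after a restart needs to know the point below which every checkpoint is done. The following is a minimal, standalone sketch of that bookkeeping. The Watermark type is hypothetical and is not part of sui_data_ingestion_core, which may track progress differently.

```rust
use std::collections::BTreeSet;

/// Hypothetical progress tracker for checkpoints that complete out of order.
struct Watermark {
    /// Lowest checkpoint number that has not finished yet.
    next: u64,
    /// Checkpoints that finished ahead of `next`.
    pending: BTreeSet<u64>,
}

impl Watermark {
    /// Starts tracking at the initial checkpoint number.
    fn new(initial: u64) -> Self {
        Watermark { next: initial, pending: BTreeSet::new() }
    }

    /// Records that `seq` finished and returns the checkpoint number
    /// from which processing should resume after a restart.
    fn mark_done(&mut self, seq: u64) -> u64 {
        if seq >= self.next {
            self.pending.insert(seq);
        }
        // Advance over every contiguous checkpoint that is already done.
        while self.pending.remove(&self.next) {
            self.next += 1;
        }
        self.next
    }
}
```

In this sketch, finishing checkpoints 1 and 2 before checkpoint 0 leaves the resume point at 0; once checkpoint 0 finishes, the resume point jumps to 3.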

Those operating their own Full node can opt in to create their own checkpoint stream. To enable it, add the following checkpoint-executor-config section to the Full node configuration file:

checkpoint-executor-config:
  data-ingestion-dir: <path to a local directory>

Once the configuration is set, the Full node dumps checkpoint data into the local directory. The indexer daemon listens for checkpoint events and processes the data as new checkpoints arrive. The checkpoint data is returned as a CheckpointData struct, the same type that existing apps are likely already familiar with. To consume the stream, point the indexer to the data-ingestion-dir directory and process the data in the same manner as with the hosted subscriptions.
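To make the local-directory flow more concrete, here is a small sketch of how a reader might pick up checkpoint files in order. It assumes the Full node writes one file per checkpoint named <sequence_number>.chk; the function name pending_checkpoint_files is hypothetical, and decoding the file contents into CheckpointData is left out.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Lists checkpoint files in `dir` whose sequence number is at least `from`,
/// sorted by sequence number.
/// Assumption: files are named `<sequence_number>.chk`.
fn pending_checkpoint_files(dir: &Path, from: u64) -> io::Result<Vec<(u64, PathBuf)>> {
    let mut files = Vec::new();
    for entry in fs::read_dir(dir)? {
        let path = entry?.path();
        // Skip anything that is not a `.chk` file.
        if path.extension().and_then(|ext| ext.to_str()) != Some("chk") {
            continue;
        }
        // Parse the sequence number from the file stem; skip unparsable names.
        let seq = path
            .file_stem()
            .and_then(|stem| stem.to_str())
            .and_then(|stem| stem.parse::<u64>().ok());
        if let Some(seq) = seq {
            if seq >= from {
                files.push((seq, path));
            }
        }
    }
    // Sort numerically so that 10.chk comes after 2.chk.
    files.sort_by_key(|entry| entry.0);
    Ok(files)
}
```

Sorting by the parsed number rather than by file name matters here, because a plain string sort would place 10.chk before 2.chk.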

The Sui Indexing Framework supports both pull-based and push-based processing, so developers can choose between a more straightforward implementation and reduced latency. This flexibility matters for apps that prioritize real-time data access and responsiveness.
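The difference between the two models can be illustrated with a generic sketch that uses only the Rust standard library. It does not use the framework's own API: pull_new and push_collect are hypothetical helpers, and checkpoints are reduced to bare sequence numbers.

```rust
use std::sync::mpsc;
use std::thread;

/// Pull-based: the consumer asks the store for everything newer than `last_seen`.
/// Simple to implement, but new data is only noticed when the consumer asks.
fn pull_new(store: &[u64], last_seen: Option<u64>) -> Vec<u64> {
    store
        .iter()
        .copied()
        .filter(|seq| last_seen.map_or(true, |last| *seq > last))
        .collect()
}

/// Push-based: the producer sends each checkpoint as soon as it exists,
/// and the consumer blocks on the channel until the next one arrives.
fn push_collect(checkpoints: Vec<u64>) -> Vec<u64> {
    let (tx, rx) = mpsc::channel();
    let producer = thread::spawn(move || {
        for seq in checkpoints {
            tx.send(seq).expect("receiver should still be alive");
        }
        // `tx` is dropped here, which closes the channel.
    });
    // The iterator ends once the channel is closed and drained.
    let received: Vec<u64> = rx.iter().collect();
    producer.join().expect("producer thread panicked");
    received
}
```

In the pull model the latency is bounded by how often the consumer polls, while in the push model the consumer is woken as soon as data is sent, at the cost of managing a channel and a producer.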

Dive deeper

Whether the goal is to build apps that respond to real-time blockchain events or to manage data and infrastructure more generally, the Sui Indexing Framework offers the flexibility and reliability needed. For detailed implementation guidance, explore the Sui Custom Indexer documentation. To see the Sui Indexing Framework in action, explore the specialized indexing pipelines used by Mysten Labs, SuiNS, and the Sui Bridge.