Indexer Runtime and Performance
Proper configuration and resource monitoring deliver the most performant custom indexer possible. For example:
- Runtime configuration options for ingestion, database connections, and pipeline selection, as well as purposeful use of debugging tools like tokio_console, help dial in your indexer performance.
- A sensible strategy targeting efficient data pruning for your tables keeps them performant over time.
- Following best practices for exposing and extending Prometheus metrics helps you keep track of indexer performance.
Together, these techniques help you run indexers that are fast, resource-efficient, and easier to monitor in both development and production.
Fine-tuning configurations
The indexing framework provides multiple levels of configuration to optimize performance for different use cases. This section covers basic configuration options, while complex pipeline-specific tuning is covered in Indexer Pipeline Architecture.
Ingestion layer configuration
Control how checkpoint data is fetched and distributed:
use sui_indexer_alt_framework::config::ConcurrencyConfig;
let ingestion_config = IngestionConfig {
// Buffer size across all downstream workers (default: 5000)
checkpoint_buffer_size: 10000,
// Concurrency for checkpoint fetches.
// Adaptive by default: starts at 1 and scales up to 500 based on
// downstream channel pressure.
ingest_concurrency: ConcurrencyConfig::Adaptive {
initial: 1,
min: 1,
max: 500,
},
// Or use fixed concurrency:
// ingest_concurrency: ConcurrencyConfig::Fixed { value: 200 },
// Retry interval for missing checkpoints in ms (default: 200)
retry_interval_ms: 100,
// gRPC streaming configuration (applies when --streaming-url is provided)
// Initial batch size after streaming connection failure (default: 10)
streaming_backoff_initial_batch_size: 20,
// Maximum batch size after repeated streaming failures (default: 10000)
streaming_backoff_max_batch_size: 20000,
// Timeout for streaming connection in ms (default: 5000)
streaming_connection_timeout_ms: 10000,
// Timeout for streaming operations (peek/next) in ms (default: 5000)
streaming_statement_timeout_ms: 10000,
};
Tuning guidelines:
- checkpoint_buffer_size: Increase for high-throughput scenarios, decrease to reduce memory usage.
- ingest_concurrency: Controls how many checkpoint fetches run concurrently. By default this is adaptive: it starts at 1 and scales up to 500 based on how full the downstream broadcast channel is. The controller uses a dead band (60%–85% fill) to avoid oscillation, increasing concurrency when the channel is draining fast and decreasing when it backs up. You can override with ConcurrencyConfig::Fixed for a static limit or customize the adaptive bounds. The adaptive controller also exposes a dead_band parameter to override the fill-fraction thresholds, but the defaults should work well for most workloads.
- retry_interval_ms: Lower values reduce latency for live data, higher values reduce unnecessary retries.
- streaming_backoff_initial_batch_size: Number of checkpoints to process via polling after the initial streaming failure. Lower values restore streaming faster, higher values reduce connection attempts.
- streaming_backoff_max_batch_size: Maximum checkpoints to process via polling after repeated failures. The batch size increases exponentially from the initial size up to this maximum. Higher values reduce connection attempts during prolonged outages.
- streaming_connection_timeout_ms: Time to wait for streaming connection establishment. Increase for slower networks.
- streaming_statement_timeout_ms: Time to wait for streaming data operations. Increase if checkpoints are large or network is slow.
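The exponential streaming backoff described above can be sketched as a pure function. The function name and the doubling growth factor below are assumptions for illustration, not the framework's exact implementation:

```rust
/// Hypothetical sketch of the polling-batch backoff: after each consecutive
/// streaming failure, the batch size grows exponentially (doubling is an
/// assumed growth factor) from `initial` up to the `max` cap.
fn backoff_batch_size(initial: u64, max: u64, consecutive_failures: u32) -> u64 {
    initial
        .saturating_mul(1u64 << consecutive_failures.min(63))
        .min(max)
}

fn main() {
    // With the defaults (initial 10, max 10000):
    assert_eq!(backoff_batch_size(10, 10_000, 0), 10); // first failure: poll 10
    assert_eq!(backoff_batch_size(10, 10_000, 3), 80); // grows with each failure
    assert_eq!(backoff_batch_size(10, 10_000, 12), 10_000); // capped at max
    println!("ok");
}
```

Lower initial values mean streaming is retried sooner after a blip; a higher max reduces churn during prolonged outages.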
Database connection configuration
let db_args = DbArgs {
// Connection pool size (default: 100)
db_connection_pool_size: 200,
// Connection timeout in ms (default: 60,000)
db_connection_timeout_ms: 30000,
// Statement timeout in ms (default: None)
db_statement_timeout_ms: Some(120000),
};
Tuning guidelines:
- db_connection_pool_size: Size based on write_concurrency across all pipelines.
- db_connection_timeout_ms: Reduce for faster failure detection in high-load scenarios.
- db_statement_timeout_ms: Set based on expected query complexity and database performance.
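One way to apply the first guideline is to sum write_concurrency across pipelines and add headroom for watermark updates and ad-hoc queries. The helper and the numbers below are illustrative assumptions, not a framework API:

```rust
/// Illustrative only: derive a connection pool size from each pipeline's
/// write_concurrency plus spare connections for watermark updates and
/// ad-hoc queries.
fn estimate_pool_size(write_concurrency_per_pipeline: &[u32], headroom: u32) -> u32 {
    write_concurrency_per_pipeline.iter().sum::<u32>() + headroom
}

fn main() {
    // e.g. three pipelines with write_concurrency 25, 25, and 50,
    // plus 20 spare connections
    let pool = estimate_pool_size(&[25, 25, 50], 20);
    assert_eq!(pool, 120);
    println!("{pool}");
}
```

A pool sized this way would then be passed as db_connection_pool_size.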
Command-line arguments
Include the following command-line arguments to help focus processing. These values are for demonstration; use values that make sense for your environment and goals.
# Checkpoint range control
--first-checkpoint 1000000 # Start from specific checkpoint
--last-checkpoint 2000000 # Stop at specific checkpoint
# Pipeline selection
--pipeline "tx_counts" # Run specific pipeline only
--pipeline "events" # Can specify multiple pipelines
# Watermark control
--skip-watermark           # Don't update watermarks (useful for backfills)
Use cases:
- Checkpoint range: Essential for backfills and historical data processing.
- Pipeline selection: Useful for selective reprocessing or testing.
- Skip watermark: Enables faster backfills when watermark consistency isn't required.
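Putting these together, a backfill over a fixed checkpoint range that runs only two pipelines might look like the following. The binary name my-indexer is a placeholder for your own indexer built on the framework:

```shell
# Hypothetical backfill invocation (binary name is a placeholder)
my-indexer \
  --first-checkpoint 1000000 \
  --last-checkpoint  2000000 \
  --pipeline "tx_counts" \
  --pipeline "events"
```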
Pipeline-specific advanced tuning
For complex configuration scenarios that require a deep understanding of pipeline internals, see Indexer Pipeline Architecture.
Tokio runtime debugging
For performance-sensitive pipelines or when troubleshooting async runtime issues, the sui-indexer-alt-framework integrates with tokio-console, a powerful debugger for async Rust applications. This tool provides real-time insights into task execution, helping identify performance bottlenecks, stuck tasks, and memory issues.
When to use Tokio console
The Tokio console is particularly useful for:
- Performance debugging: Identifying slow or blocking tasks.
- Memory analysis: Finding tasks consuming excessive memory.
- Concurrency issues: Detecting tasks that never yield or wake themselves excessively.
- Runtime behavior: Understanding task scheduling and execution patterns.
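As a starting point, a minimal setup sketch is shown below. It assumes you add the console-subscriber crate and build with the tokio_unstable cfg flag; whether your indexer binary wires this up differently may vary:

```rust
// Sketch only: requires the `console-subscriber` and `tokio` crates, and
// building with RUSTFLAGS="--cfg tokio_unstable" so task instrumentation
// is compiled in.
#[tokio::main]
async fn main() {
    // Start the console's instrumentation server (binds to a local port
    // by default), then run the indexer as usual.
    console_subscriber::init();

    // ... start the indexer here ...
}
```

With this in place, run the tokio-console CLI in another terminal to attach and inspect live task state.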