Product Updates | May 10, 2026 | 4 min read | by Klara

Getting Started with BLXBench: Benchmark AI Models in 5 Minutes

Learn how to install, configure, and run your first benchmark with BLXBench - the interactive AI model benchmarking platform from Bitslix.

Tags: blxbench, getting-started, ai-benchmark, tui, cli

BLXBench makes AI model benchmarking accessible, interactive, and community-driven. Whether you're evaluating models for production use, researching AI performance, or just curious about different LLMs, BLXBench provides a streamlined way to run standardized tests and compare results.

This guide walks you through installing BLXBench, running your first benchmark, and sharing results to the public leaderboard - all in under five minutes.

Installation

BLXBench is available as an npm package, so you can install it globally with whichever JavaScript package manager you prefer.

Via npm

npm install -g @bitslix/blxbench

Via pnpm

pnpm add -g @bitslix/blxbench

Via Bun

bun add -g @bitslix/blxbench

After installation, verify BLXBench is available:

blxbench --version
# Output: @bitslix/blxbench/1.0.0

First Run: Quick Start

The fastest way to experience BLXBench is through the interactive Terminal User Interface (TUI), which provides real-time feedback during benchmark runs.

1. Start BLXBench

blxbench

2. Follow the Setup Wizard

On first launch, BLXBench guides you through:

  • API Key Configuration: Enter API keys for model providers you want to test (OpenAI, Anthropic, etc.)
  • Benchmark Suite Selection: Choose between available test suites (currently v1–Nutrition and v2–Resilience)
  • Concurrency Settings: Adjust how many simultaneous requests BLXBench makes
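
To illustrate what a concurrency setting controls, here is a minimal Python sketch of a request cap. It is not BLXBench's implementation: `run_with_limit`, `call_model`, and `max_concurrency` are hypothetical names chosen for this example, and the sketch assumes each model call is an async function that takes a prompt and returns text.

```python
import asyncio
from typing import Awaitable, Callable, Sequence


async def run_with_limit(
    prompts: Sequence[str],
    call_model: Callable[[str], Awaitable[str]],  # hypothetical provider call
    max_concurrency: int,
) -> list[str]:
    """Send every prompt, keeping at most max_concurrency requests in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def guarded(prompt: str) -> str:
        # Each request waits here until one of the slots is free.
        async with semaphore:
            return await call_model(prompt)

    # gather returns results in input order, even if requests finish out of order.
    return await asyncio.gather(*(guarded(p) for p in prompts))
```

A higher cap usually shortens the run but makes provider rate limits more likely to trigger, which is why a setting like this is worth tuning per provider.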

3. Run Your First Benchmark

Once configured, simply type:

/run

BLXBench will:

  • Download the selected test fixtures
  • Execute prompts against your configured models
  • Display live metrics including:
    • Cumulative cost (updated in real time)
    • Pass rate
    • Average latency
    • Tokens per second
  • Show a progress bar for each test category
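
The live metrics listed above can be understood as simple aggregates over per-test records. The Python sketch below shows one plausible way to compute them. The `FixtureResult` fields and the definition of tokens per second (total output tokens divided by total request time) are assumptions made for illustration; BLXBench's actual data model and formulas may differ.

```python
from dataclasses import dataclass


@dataclass
class FixtureResult:
    # Hypothetical per-test record; not BLXBench's real schema.
    passed: bool
    latency_s: float    # wall-clock seconds for the request
    output_tokens: int  # tokens generated by the model
    cost_usd: float     # API cost of this single request


def summarize(results: list[FixtureResult]) -> dict[str, float]:
    """Aggregate per-test records into the four live metrics."""
    if not results:
        return {
            "cumulative_cost": 0.0,
            "pass_rate": 0.0,
            "avg_latency_s": 0.0,
            "tokens_per_second": 0.0,
        }
    total_latency = sum(r.latency_s for r in results)
    total_tokens = sum(r.output_tokens for r in results)
    return {
        "cumulative_cost": sum(r.cost_usd for r in results),
        "pass_rate": sum(r.passed for r in results) / len(results),
        "avg_latency_s": total_latency / len(results),
        # Assumed definition: generated tokens per second of request time.
        "tokens_per_second": total_tokens / total_latency if total_latency > 0 else 0.0,
    }
```

For example, two tests where one passes in 2 s with 100 tokens and one fails in 4 s with 200 tokens give a pass rate of 0.5, an average latency of 3 s, and 50 tokens per second under this definition.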

Understanding Your Results

As tests complete, BLXBench provides immediate feedback in the TUI:

Live Dashboard

The main screen shows:

  • Top Score: Current leading model in your run
  • Executed Tests: Number of completed fixtures
  • Est. API Spend: Running total cost
  • Top Decode Speed: Fastest token generation rate
  • Categories: Breakdown by test type (Coding, Reasoning, etc.)

Detailed Model View

Press Enter on any model in the leaderboard to see:

  • Pass rate breakdown by category
  • Latency and cost metrics
  • Test suite participation history
  • Option to save/load this model's results

Sharing to the Community Leaderboard

BLXBench can publish your results directly to the public leaderboard at blxbench.com.

Publishing Results

After a run completes, use:

/publish

This will:

  1. Aggregate your results into a standardized format
  2. Prompt you to confirm publication
  3. Upload anonymized data to the Bitslix server
  4. Make your results appear on the public leaderboard within moments

What Gets Published

The public leaderboard shows:

  • Model name and provider
  • Pass rate and score
  • Average latency and cost
  • Test suite version used
  • Timestamp of the run

No API keys, prompts, or raw responses are ever shared - only aggregated performance metrics.
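
One way to picture this guarantee is an allowlist: only named aggregate fields ever leave the machine, and everything else is dropped by default. The Python sketch below illustrates the idea. The field names and the `to_public_record` helper are hypothetical, modeled on the list above, and do not describe BLXBench's actual upload format.

```python
# Hypothetical field names modeled on the "What Gets Published" list.
PUBLIC_FIELDS = (
    "model",
    "provider",
    "pass_rate",
    "score",
    "avg_latency_s",
    "avg_cost_usd",
    "suite_version",
    "timestamp",
)


def to_public_record(run: dict) -> dict:
    """Keep only allowlisted aggregate fields; drop everything else."""
    return {key: run[key] for key in PUBLIC_FIELDS if key in run}
```

The design point of an allowlist over a blocklist is that a newly added private field, such as a stored prompt, stays private unless someone explicitly adds it to the public set.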

Beyond the Basics

Headless Mode for CI/CD

For automated testing in pipelines:

blxbench --headless --output results.json

This is useful for nightly benchmarks or PR validation.
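
A pipeline typically needs a pass/fail decision after the headless run. The Python sketch below shows one way to gate a build on results.json. The file layout it reads (a top-level `models` list whose entries have `name` and `pass_rate`) is an assumption made for this example, as are the helper names and the 0.8 threshold; check the real output file before relying on it.

```python
import json
from pathlib import Path


def failing_models(report: dict, min_pass_rate: float) -> list[str]:
    """Return the names of models whose pass rate is below the threshold."""
    # Assumed schema: {"models": [{"name": str, "pass_rate": float}, ...]}
    return [
        model["name"]
        for model in report.get("models", [])
        if model["pass_rate"] < min_pass_rate
    ]


def check_results(path: str = "results.json", min_pass_rate: float = 0.8) -> int:
    """Return a process exit code: 0 if every model clears the gate, else 1."""
    report = json.loads(Path(path).read_text(encoding="utf-8"))
    failed = failing_models(report, min_pass_rate)
    for name in failed:
        print(f"FAIL {name}: pass rate below {min_pass_rate:.0%}")
    return 1 if failed else 0
```

In a CI step that runs after the headless command, calling `sys.exit(check_results())` would mark the job as failed whenever any model drops below the threshold.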

Arcade Mode

Keep your mind sharp during long runs:

/arcade

Access minigames directly from the TUI without interrupting benchmarks.

Report Browser

Examine past runs locally:

/report list
/report open <run-id>

View detailed HTML reports of previous benchmark sessions.

Next Steps

  1. Try Different Models: Experiment with API keys from various providers to compare performance
  2. Join the Community: Publish your results with /publish and see how your favorite models stack up
  3. Explore Test Suites: The v2–Resilience suite adds stress tests for hallucination and recovery
  4. Provide Feedback: Join the Bitslix Discord to suggest features or report issues

BLXBench is designed to evolve with community input. Every published run helps build a more comprehensive picture of AI model performance across real-world workloads.

Ready to benchmark? Install BLXBench today and see how your preferred models perform: https://blxbench.com