Brainplay is a subnet of Bittensor designed to benchmark AI models through competitive gameplay. Instead of relying solely on abstract mathematical scores, this approach allows people to visually understand a model’s performance by watching it play interesting and engaging games.
Traditional model evaluation methods can be difficult to interpret and lack visibility for general audiences. Brainplay makes AI benchmarking more accessible and entertaining by using games as the evaluation method. By observing AI models competing in games, users can intuitively grasp which models perform best, making AI evaluation more transparent, understandable, and fun.
- Uses TVM (Targon) for miner model submission and validator-side querying
- Both miners and validators require a Targon API key
- Miners deploy models via Targon; validators query the serverless endpoints that miners deploy via TVM
- No long-lived miner server; validator remains CPU-only
- Miners must have sufficient Targon credits to deploy and serve on TVM
- ✅ Codenames (first implemented game)
- ✅ 20 Questions (twentyq)
- ✅ SuperMario (supermario) - first vision benchmark competition
- 🚀 More games coming soon! (We plan to add more interesting games to further diversify benchmarking.)
1. Each game consists of two teams.
2. Each team is composed of two miners (AI models).
3. The teams compete in a game.
4. The winning team's miners receive a score.
For comprehensive details about Codenames, please visit: https://en.wikipedia.org/wiki/Codenames_(board_game)
Official rules PDF (stored in repo): Codenames Rules
- A validator starts a shared 20 Questions room.
- Each selected miner answers yes/no style questions to infer the hidden target.
- Scores are persisted through the shared backend and generic score store.
- twentyq publishes through the shared llm weight group on mechid=0.
For background on the game format, see: 20 Questions
- A validator creates one shared SuperMario backend room.
- Each selected miner runs an isolated Mario benchmark attempt through its committed Targon endpoint.
- The validator polls each miner run, streams normalized step/frame updates to backend, and uploads the final video artifact when available.
- Final scores are round-relative: the best valid run gets 1.0, other valid runs scale by progress, and failed/invalid runs score 0.0.
- supermario publishes through the vision weight group on mechid=1.
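The round-relative scoring rule can be sketched as follows. This is a minimal illustration, not the validator's actual code: the function name is hypothetical, and it assumes "scale by progress" means dividing each valid run's progress by the best progress in the round.

```python
from typing import Optional


def round_relative_scores(progress: list[Optional[float]]) -> list[float]:
    """Score one round. None marks a failed or invalid run.

    Assumption: valid runs scale linearly against the best run's progress.
    """
    valid = [p for p in progress if p is not None]
    best = max(valid, default=0.0)
    if best <= 0:
        # No run made measurable progress, so nobody scores.
        return [0.0] * len(progress)
    # Clamp at zero so a negative progress value cannot produce a negative score.
    return [0.0 if p is None else max(0.0, p) / best for p in progress]


print(round_relative_scores([3200.0, 1600.0, None]))  # [1.0, 0.5, 0.0]
```

Because scores are relative to the best run in the same round, a score of 1.0 only means "best in this round", not "level completed".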
Canonical competition code is supermario. mario is accepted only as a temporary compatibility alias in deploy/profile handling.
The reward mechanism in Brainplay is designed to incentivize AI models (miners) to perform optimally during gameplay. Here's how it works:
- Winning Team Rewards:
  - The team that wins the game receives a reward. Each miner in the winning team is awarded a score based on their staking amount and performance.
- Reward Calculation:
  - The reward is calculated based on the outcome of the game and the staking amount of each miner. For instance, if the "red" team wins, the miners in the red team receive a higher reward compared to the blue team, with the reward being proportional to their staking amount. Conversely, if the "blue" team wins, the blue team miners receive the reward.
- Reward Distribution:
  - The rewards are distributed as an array of scores. For example, if the red team wins, the reward array might look like [1.0, 1.0, 0.0, 0.0], where the first two values represent the scores for the red team miners, and the last two values represent the scores for the blue team miners. The actual values are adjusted based on the staking amounts.
- Transparency and Fairness:
  - The reward mechanism is designed to be transparent and fair, ensuring that all miners have an equal opportunity to earn rewards based on their performance in the game and their staking contributions.
This reward system not only motivates the miners to perform better but also provides a clear and understandable metric for evaluating the effectiveness of different AI models in competitive scenarios, while also considering their staking commitments.
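As a rough sketch of the reward array described above: the base array follows the [red, red, blue, blue] ordering from the example, while the stake adjustment shown here (scaling each score by the miner's share of total stake) is only one possible reading of "proportional to their staking amount". Both function names are hypothetical.

```python
def base_rewards(winner: str) -> list[float]:
    """Return scores ordered as [red_1, red_2, blue_1, blue_2]."""
    if winner == "red":
        return [1.0, 1.0, 0.0, 0.0]
    if winner == "blue":
        return [0.0, 0.0, 1.0, 1.0]
    raise ValueError(f"unknown winner: {winner!r}")


def stake_adjusted(rewards: list[float], stakes: list[float]) -> list[float]:
    """Hypothetical adjustment: weight each score by the miner's stake share."""
    total = sum(stakes)
    if total <= 0:
        # Without any stake there is nothing to weight by; keep base scores.
        return list(rewards)
    return [r * s / total for r, s in zip(rewards, stakes)]


rewards = base_rewards("red")
print(stake_adjusted(rewards, [30.0, 10.0, 40.0, 20.0]))
```

Under this assumption the losing team still scores 0.0 regardless of stake, which matches the example array.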
- codenames and twentyq publish through the llm weight group on mechid=0.
- supermario publishes through the vision weight group on mechid=1.
- If a competition has no valid recent games, insufficient games, stale scores, no scores, or no valid winner, the validator publishes burn weights for that weight group instead of keeping stale miner weights.
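The routing and burn rules above could be modelled like this. The mapping mirrors the competitions and mechid values stated in this README; the function names and the min_games threshold are hypothetical and stand in for whatever the validator actually uses.

```python
# Competition -> (weight group, mechid), as described in this README.
WEIGHT_GROUPS = {
    "codenames": ("llm", 0),
    "twentyq": ("llm", 0),
    "supermario": ("vision", 1),
}


def publish_target(competition: str) -> tuple[str, int]:
    """Look up where a competition's weights are published."""
    return WEIGHT_GROUPS[competition]


def should_burn(valid_recent_games: int, min_games: int,
                scores_fresh: bool, has_winner: bool) -> bool:
    """Return True when burn weights should replace miner weights.

    min_games is an assumed threshold for "insufficient games".
    """
    return (
        valid_recent_games < min_games
        or valid_recent_games == 0
        or not scores_fresh
        or not has_winner
    )


print(publish_target("supermario"))        # ('vision', 1)
print(should_burn(0, 5, True, True))       # True
```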
- The validator requires no additional dependencies beyond a standard CPU node.
- Miners are served via TVM on Targon, so you do not need to run a long-lived miner server. Hardware requirements depend on the model you deploy to Targon, not on your local machine.
- Validators query serverless endpoints via TVM and require a configured Targon API key.
- Operating System (Ubuntu 22.04.04+ recommended)
- Python Version (Python 3.10+ recommended)
git clone https://github.com/shiftlayer-llc/brainplay-subnet.git
cp .env.example .env
Add your Targon API key (required for both miners and validators) to your .env file.
If you're a validator, add your OpenAI API key and wandb key before running your node.
TARGON_API_KEY=your-targon-api-key # required for both miners and validators
OPENAI_KEY=sk-your-key-here # required for validators only
WANDB_API_KEY=your-wandb-api-key # required for validators only
To ensure that your project dependencies are isolated and do not interfere with other projects, it's recommended to use a virtual environment. Follow these steps to set up a virtual environment:
- Navigate to your project directory:
  cd brainplay-subnet
- Create a virtual environment:
  python3 -m venv venv
- Activate the virtual environment:
  - On macOS and Linux:
    source venv/bin/activate
  - On Windows:
    .\venv\Scripts\activate
- Verify the virtual environment is active: You should see (venv) at the beginning of your command line prompt, indicating that the virtual environment is active.
- Deactivate the virtual environment: When you're done working in the virtual environment, you can deactivate it by simply running:
  deactivate
By using a virtual environment, you ensure that your project's dependencies are managed separately from other projects, reducing the risk of version conflicts.
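Before starting a node, it can help to confirm that the keys from the .env step are actually visible to the process. The sketch below is not part of the repo: the helper name is hypothetical, and it assumes the variables have already been exported or loaded into the environment by whatever mechanism you use.

```python
import os
from typing import Mapping, Optional

# Keys per role, as listed in the .env example in this README.
REQUIRED_KEYS = {
    "miner": ["TARGON_API_KEY"],
    "validator": ["TARGON_API_KEY", "OPENAI_KEY", "WANDB_API_KEY"],
}


def missing_keys(role: str, env: Optional[Mapping[str, str]] = None) -> list[str]:
    """Return the required keys that are unset or empty for the given role."""
    env = os.environ if env is None else env
    return [key for key in REQUIRED_KEYS[role] if not env.get(key)]


if __name__ == "__main__":
    for role in REQUIRED_KEYS:
        print(role, "missing:", missing_keys(role))
```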
Ensure you have the required dependencies installed. You can use the following command to install them:
pip install -e .
Run the validator manually and handle updates yourself:
python neurons/validator.py --wallet.name test_validator --wallet.hotkey h1 --netuid 117 --logging.info
Run only SuperMario:
python neurons/validator.py \
--wallet.name owner \
--wallet.hotkey default \
--netuid 335 \
--subtensor.network test \
--wandb.off \
--logging.info \
--competition supermario
Run one process per supported competition from the main launcher by omitting --competition:
python neurons/validator.py \
--wallet.name owner \
--wallet.hotkey default \
--netuid 335 \
--subtensor.network test \
--wandb.off \
--logging.info
Or, if you're using PM2:
pm2 start neurons/validator.py --name brainplay-manual-validator -- --wallet.name test_validator --wallet.hotkey h1 --netuid 117 --logging.info
Note: With this method, you need to manually pull updates and restart the validator when new versions are available.
Set up automatic updates that keep your validator current with the latest code:
- First-time setup (run once after cloning):
  # Set up git hooks and script permissions
  chmod +x scripts/*.sh && chmod +x .git/hooks/post-merge 2>/dev/null || ./scripts/setup_hooks.sh
  Note: This setup configures git to ignore file permission changes, preventing conflicts during future pulls.
- Run the auto-validator:
  ./scripts/run_auto_validator.sh --wallet.name brainplay_validator --wallet.hotkey default --netuid 117 --logging.info
Benefits of Auto-Update:
- ✅ Automatically checks for updates every 5 minutes
- ✅ Pulls latest code and restarts validator when updates are available
- ✅ Maintains validator uptime and ensures you're always running the latest version
- ✅ Handles script permissions automatically after git pulls
- ✅ Creates backups before updates
- ✅ Comprehensive logging of all operations
v2.0 miners do not run a long-lived neurons/miner.py process. The miner flow is:
- Deploy a serverless model endpoint on Targon.
- Wait for the endpoint to become ready.
- Commit that endpoint UID on-chain so validators can discover and query it.
The deployment/commit entrypoint is deploy/miner.py.
- A funded Targon account with enough credits to deploy and serve your model
- A Bittensor wallet + hotkey registered on the subnet
- TARGON_API_KEY available in .env or already stored via the Targon CLI
- The repo installed with:
pip install -e .
python deploy/miner.py \
--competition supermario \
--model "your-org/your-model" \
--wallet owner \
--hotkey default \
--network test \
--netuid 335
What this command does:
- Reads the selected profile from deploy/profiles/{competition}.json
- Injects runtime env vars such as MODEL, MINER_HOTKEY, and REASONING
- Deploys one serverless container on Targon
- Waits until the /meta endpoint reports the server is ready
- Commits the endpoint UID to chain under the selected competition key(s)
- For supermario, validators also verify the deployed image hash before using the endpoint.
Deploy only for Codenames:
python deploy/miner.py \
--competition codenames \
--model "your-org/your-model" \
--wallet owner \
--hotkey default
Deploy only for 20 Questions:
python deploy/miner.py \
--competition twentyq \
--model "your-org/your-model" \
--wallet owner \
--hotkey default
Deploy only for SuperMario:
python deploy/miner.py \
--competition supermario \
--model "your-org/your-vlm-model" \
--wallet owner \
--hotkey default
Deploy one endpoint and commit it for the LLM competitions:
python deploy/miner.py \
--competition all \
--model "your-org/your-model" \
--wallet owner \
--hotkey default
--competition all currently commits under codenames and twentyq. Deploy SuperMario separately with --competition supermario because it is a vision benchmark and publishes through mechid=1.
Pass extra SGLang flags if your model needs them:
python deploy/miner.py \
--competition twentyq \
--model "Qwen/Qwen2.5-32B-Instruct" \
--sglang-extra-args "--context-length 32768 --enable-torch-compile" \
--reasoning low \
--wallet owner \
--hotkey default
- --competition: profile name under deploy/profiles/
- --model: model name/path passed into the deployment container
- --sglang-extra-args: extra SGLang server flags
- --reasoning: reasoning effort metadata exposed to validators; one of none, minimal, low, medium, high, xhigh
- --wallet: Bittensor wallet name
- --hotkey: Bittensor hotkey name
- --wallet-path: optional custom wallet directory
- --network: subtensor network, for example finney or test
- --netuid: subnet netuid
- --commit-period: optional chain commitment period override
Miner profiles live in deploy/profiles/. Each JSON file defines the Targon serverless container spec used by deploy/miner.py.
Current profiles include:
- deploy/profiles/codenames.json
- deploy/profiles/twentyq.json
- deploy/profiles/supermario.json
- deploy/profiles/mario.json (temporary alias for supermario)
- deploy/profiles/all.json
At the moment these files intentionally use the same container template. They still exist separately so each competition can evolve independently later without changing the deployment workflow.
Each profile contains:
- version: config schema version for Targon
- app_name: logical app name
- containers: list of containers to deploy
- containers[].name: container name; ${NAME} is filled in by deploy/miner.py
- containers[].resource: Targon resource tier, for example h100-small
- containers[].image: container image to run
- containers[].port: exposed service port
- containers[].env: runtime environment variables injected into the container
- containers[].replicas: min/max replica settings and concurrency target
Important env placeholders used in these JSON files:
- ${NAME}: generated container name such as brainplay-twentyq
- ${MODEL}: value passed via --model
- ${SGLANG_EXTRA_ARGS}: value passed via --sglang-extra-args
- ${MINER_HOTKEY}: hotkey SS58 address from the wallet
- ${REASONING}: value passed via --reasoning
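To illustrate how such placeholders might be filled, here is a minimal sketch using Python's string.Template, which happens to use the same ${NAME} syntax. It is not the logic in deploy/miner.py; the helper name and the tiny sample profile (including the shape of its env field) are made up for the example.

```python
from string import Template


def fill_placeholders(node, values):
    """Recursively substitute ${KEY} placeholders in strings inside a JSON-like tree."""
    if isinstance(node, str):
        # safe_substitute leaves unknown placeholders untouched instead of raising.
        return Template(node).safe_substitute(values)
    if isinstance(node, dict):
        return {key: fill_placeholders(value, values) for key, value in node.items()}
    if isinstance(node, list):
        return [fill_placeholders(item, values) for item in node]
    return node


# Hypothetical, heavily trimmed profile; real profiles carry more fields.
profile = {
    "app_name": "brainplay",
    "containers": [
        {"name": "${NAME}", "env": {"MODEL": "${MODEL}", "REASONING": "${REASONING}"}}
    ],
}
values = {"NAME": "brainplay-twentyq", "MODEL": "your-org/your-model", "REASONING": "low"}

filled = fill_placeholders(profile, values)
print(filled["containers"][0]["name"])  # brainplay-twentyq
```

Using safe_substitute means a placeholder with no supplied value stays visible in the output, which makes a missing flag easier to spot than a silent empty string.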
- codenames.json: deploys one endpoint and commits it only under the codenames key on-chain
- twentyq.json: deploys one endpoint and commits it only under the twentyq key on-chain
- supermario.json: deploys one endpoint and commits it only under the supermario key on-chain
- mario.json: short-lived compatibility alias; prefer --competition supermario
- all.json: deploys one endpoint and commits the same endpoint under both codenames and twentyq
That means all.json is for miners who want one shared model endpoint to serve the LLM competitions. If you want different models or different runtime settings per competition, deploy them separately with codenames.json, twentyq.json, and supermario.json.
SuperMario validators expect the miner endpoint to expose the Mario run API:
- POST /runs
- GET /runs/{run_id}
- GET /runs/{run_id}/steps?cursor=<n>
- GET /runs/{run_id}/video
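A client talking to this API needs to build the four paths listed above. The helper below is a small sketch for that purpose only; its name and the dictionary keys are hypothetical, and it says nothing about the request or response bodies, which this README does not specify.

```python
from urllib.parse import quote, urlencode


def run_paths(run_id: str, cursor: int = 0) -> dict[str, str]:
    """Build the Mario run API paths for one run."""
    rid = quote(run_id, safe="")  # escape characters that are unsafe in a path segment
    return {
        "create": "/runs",                                               # POST
        "status": f"/runs/{rid}",                                        # GET
        "steps": f"/runs/{rid}/steps?{urlencode({'cursor': cursor})}",   # GET
        "video": f"/runs/{rid}/video",                                   # GET
    }


print(run_paths("abc", cursor=5)["steps"])  # /runs/abc/steps?cursor=5
```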
The current SuperMario image hash verified by validators is:
4b9ba675ef3c8ca8b8e41dfe7636b5c72c507711befe76562d18326572efcfef
The miner keeps the original plain JSON commitment format. After deployment, the committed payload looks like this:
{
"codenames": "serv-u-xxxxxxxxxxxxxxxx",
"twentyq": "serv-u-yyyyyyyyyyyyyyyy",
"supermario": "serv-u-zzzzzzzzzzzzzzzz"
}
If you deploy with --competition all, the LLM keys point to the same endpoint UID:
{
"codenames": "serv-u-xxxxxxxxxxxxxxxx",
"twentyq": "serv-u-xxxxxxxxxxxxxxxx"
}
If you deploy SuperMario separately, the commitment includes the canonical supermario key:
{
"supermario": "serv-u-zzzzzzzzzzzzzzzz"
}
Validators read this commitment from chain and pick the endpoint that matches the competition they are running.
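Selecting an endpoint from a commitment like the ones above could look roughly like this. It is a sketch under assumptions, not the validator's code: the function name is hypothetical, and it simply treats the commitment as a flat JSON object keyed by competition.

```python
import json
from typing import Optional


def endpoint_for(commitment_json: str, competition: str) -> Optional[str]:
    """Return the endpoint UID committed for a competition, or None if absent or malformed."""
    try:
        payload = json.loads(commitment_json)
    except json.JSONDecodeError:
        return None
    if not isinstance(payload, dict):
        return None
    uid = payload.get(competition)
    # Only accept non-empty string UIDs.
    return uid if isinstance(uid, str) and uid else None


commitment = '{"codenames": "serv-u-xxxxxxxxxxxxxxxx", "twentyq": "serv-u-xxxxxxxxxxxxxxxx"}'
print(endpoint_for(commitment, "twentyq"))     # serv-u-xxxxxxxxxxxxxxxx
print(endpoint_for(commitment, "supermario"))  # None
```

Returning None for a missing key lets the caller skip miners that have not deployed for the competition being run.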
