shiin

Chutes

Emission: 12.30%

Chutes is a serverless AI compute platform that allows developers to run AI models and custom code without managing any infrastructure. The platform automatically handles all backend operations including servers, GPUs, and scaling, making AI deployment as simple as choosing a model or uploading code.

Developers often don't want compute; they want to use AI models. However, current platforms force them to configure GPUs, manage servers, and handle scaling. Traditional cloud providers like AWS rent raw compute that requires extensive setup. Even specialized platforms like Replicate cost significantly more than Chutes. They also limit developers to AI models only, so they do not support arbitrary computational workloads.

This creates massive overhead. Developers spend most of their time on infrastructure instead of building AI applications. Meanwhile, centralized API providers like OpenAI control model access, pricing, and availability. This limits developer choice and creates dependency risks.

Chutes' Solution

Chutes solves this through its "serverless compute" platform. This means users don't have to worry about managing any hardware. Instead, they simply choose an existing AI model or bring their own code. Chutes handles everything else: servers, GPUs, load balancing, scaling, and infrastructure.

This approach works like "Squarespace for AI." Squarespace lets users create websites without technical complexity. Similarly, Chutes lets developers deploy and use AI models without managing servers. Costs are up to 90% lower than other specialized platforms. The distributed infrastructure reduces single-point-of-failure risks and scales with the subnet's demand.

Chutes currently processes nearly 160 billion tokens daily across 400,000+ users.

Understanding the Miner's Work

Miners work by running packages called "chutes." A chute is essentially code that can be anything - AI models like LLMs, data processing applications, or any computational task that runs as a job. Miners choose which chutes to run based on their hardware and profit potential.

They must decide which models to keep "hot" based on demand patterns. Hot models stay loaded in memory and ready to run immediately. Cold models are unloaded to save compute. This creates a strategic challenge. Miners want to keep popular models hot to serve requests instantly. However, they can't keep running everything due to hardware limitations.
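
The hot/cold trade-off can be illustrated with a minimal sketch. It assumes a miner ranks chutes by recent demand per gigabyte of GPU memory and greedily fills a fixed memory budget. The `Chute` fields, the ranking rule, and the example numbers are all hypothetical; the source does not describe how miners actually make this choice.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chute:
    name: str
    vram_gb: float            # memory needed to keep the chute loaded (assumed)
    requests_per_hour: float  # recent demand seen by the miner (assumed)


def choose_hot_chutes(chutes, vram_budget_gb):
    """Greedy heuristic: keep the chutes with the highest demand per GB hot."""
    ranked = sorted(
        chutes,
        key=lambda c: c.requests_per_hour / c.vram_gb,
        reverse=True,
    )
    hot, used = [], 0.0
    for chute in ranked:
        # Skip anything that no longer fits; it stays cold.
        if used + chute.vram_gb <= vram_budget_gb:
            hot.append(chute.name)
            used += chute.vram_gb
    return hot


# Hypothetical catalog, for illustration only.
catalog = [
    Chute("large-llm", vram_gb=140, requests_per_hour=900),
    Chute("small-llm", vram_gb=24, requests_per_hour=400),
    Chute("image-gen", vram_gb=40, requests_per_hour=150),
    Chute("batch-job", vram_gb=16, requests_per_hour=5),
]
hot_set = choose_hot_chutes(catalog, vram_budget_gb=80)
print(hot_set)
```

With an 80 GB budget the largest model does not fit, so it stays cold even though it has the most raw demand. This is the kind of hardware constraint the paragraph above describes.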

When users request cold models, miners compete in a race to launch them first. They do this through the bounty mechanism. The first miner to successfully start a cold model earns the bounty. These bounties increase over time until someone responds. This creates competition among miners to optimize their startup speeds.
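
As a rough illustration of the race, the sketch below assumes the bounty grows linearly with waiting time up to a cap and goes to the earliest miner whose startup succeeded. The growth rate, cap, and function names are hypothetical; the source only states that bounties increase over time and that the first successful miner earns them.

```python
def bounty_value(seconds_waiting, base=1.0, growth_per_second=0.1, cap=50.0):
    """Bounty grows while nobody has started the cold chute (linear growth is assumed)."""
    return min(cap, base + growth_per_second * seconds_waiting)


def award_bounty(request_time, startups):
    """Pick the first miner that successfully started the chute.

    startups: list of (miner_id, ready_time, succeeded) tuples.
    Returns (winner, bounty) or None if no startup succeeded.
    """
    successful = [
        (ready_time, miner_id)
        for miner_id, ready_time, succeeded in startups
        if succeeded and ready_time >= request_time
    ]
    if not successful:
        return None
    ready_time, winner = min(successful)  # earliest successful startup wins
    return winner, bounty_value(ready_time - request_time)


# Hypothetical race: miner-b is fastest but fails, so miner-c wins.
result = award_bounty(
    request_time=100.0,
    startups=[
        ("miner-a", 130.0, True),
        ("miner-b", 120.0, False),
        ("miner-c", 125.0, True),
    ],
)
print(result)
```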

How Validation Works

Chutes introduced a different approach to validation built around "auditors". This approach ensures fairness while lowering barriers to entry. One main validator, operated by Rayon Labs, runs the Chutes platform itself. This validator uses approximately 16 H200 GPUs to manage and coordinate the platform.

Both the main validator and miners continuously create detailed activity reports. These reports include every request processed, response times, and work completed. Each report gets a unique digital fingerprint. This fingerprint is recorded permanently on the blockchain. This creates a tamper-proof record of all activity.

Other validators participate as "auditors" who can verify fairness without needing expensive hardware. They download reports from the main validator. Then they check that scores are being set fairly and accurately. They also verify that rewards match actual work performed. The mathematical proofs make falsifying data nearly impossible. If someone changed a report after creating it, the digital fingerprint would no longer match.
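
The fingerprint check can be sketched as a hash comparison. This example assumes reports are JSON-serializable and that the fingerprint is a SHA-256 digest of a canonical encoding. The source does not name the hash function or report format, so both are assumptions, as are the field names.

```python
import hashlib
import json


def fingerprint(report: dict) -> str:
    """Hash a canonical JSON encoding so equal reports always hash equally."""
    canonical = json.dumps(report, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def matches_recorded(report: dict, recorded_fingerprint: str) -> bool:
    """An auditor recomputes the hash and compares it with the on-chain value."""
    return fingerprint(report) == recorded_fingerprint


# Hypothetical report fields, for illustration only.
report = {"miner": "miner-a", "requests": 1200, "compute_units": 340.5}
recorded = fingerprint(report)  # stands in for the value stored on-chain

tampered = dict(report, compute_units=999.0)
print(matches_recorded(report, recorded))    # True
print(matches_recorded(tampered, recorded))  # False
```

Any change to the report, however small, produces a different digest, which is why an edited report no longer matches its recorded fingerprint.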

Auditors also send test requests to verify the system works properly. They track which miners handle each test. Then they confirm these appear in the validator's reports. This proves no requests are hidden or manipulated.
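
A minimal sketch of this cross-check follows. It assumes the auditor keeps a mapping from each test request ID to the miner that served it, and that the validator's report exposes the same mapping. The data shapes and names are hypothetical.

```python
def find_discrepancies(sent, reported):
    """Compare the auditor's own log of test requests with the validator's report.

    sent:     {request_id: miner_id} as observed by the auditor
    reported: {request_id: miner_id} as listed in the validator's report
    """
    problems = {}
    for request_id, miner_id in sent.items():
        if request_id not in reported:
            problems[request_id] = "missing from report"
        elif reported[request_id] != miner_id:
            problems[request_id] = "attributed to a different miner"
    return problems


# Hypothetical audit data, for illustration only.
sent = {"req-1": "miner-a", "req-2": "miner-b", "req-3": "miner-c"}
reported = {"req-1": "miner-a", "req-2": "miner-x"}
problems = find_discrepancies(sent, reported)
print(problems)
```

An empty result means every test request the auditor sent shows up in the report with the expected miner.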

Most importantly, auditors can recreate the entire reward calculation using public data. They download seven days of reports. Next, they calculate what each miner should earn. Then they compare this to actual rewards distributed. This ensures the main validator pays miners fairly. It also allows multiple independent parties to verify correct distribution.

How Miners Get Rewarded

The main validator assigns scores using four metrics measured over seven days. This longer timeframe encourages stable, reliable service. It also prevents gaming through short bursts of activity.

Compute units make up 55% of the score. They measure total productive work completed. This calculation uses compute multiplier times processing time plus any bounties earned.
Invocation count makes up 25% and tracks successful user requests handled.
Unique chute count makes up 15%. This measures the average number of unique chutes that were active during the scoring period.
Bounty count makes up 5%. This recognizes miners who complete high-priority tasks like cold starts.
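
The weighting above can be sketched as follows. The weights and the compute-unit formula come from the text; normalizing each metric by the network-wide total is an assumption made for illustration, since the source does not say how metrics are scaled before weighting. All names and sample values are hypothetical.

```python
# Weights as stated in the text.
WEIGHTS = {
    "compute_units": 0.55,
    "invocation_count": 0.25,
    "unique_chute_count": 0.15,
    "bounty_count": 0.05,
}


def compute_units(jobs, bounties_earned=0.0):
    """Compute multiplier times processing time, plus any bounties earned.

    jobs: iterable of (compute_multiplier, processing_seconds) pairs.
    """
    return sum(mult * seconds for mult, seconds in jobs) + bounties_earned


def score_miners(metrics):
    """Weighted sum of each miner's share of every metric (normalization assumed)."""
    totals = {key: sum(m[key] for m in metrics.values()) for key in WEIGHTS}
    scores = {}
    for miner, m in metrics.items():
        scores[miner] = sum(
            weight * (m[key] / totals[key] if totals[key] else 0.0)
            for key, weight in WEIGHTS.items()
        )
    return scores


# Hypothetical seven-day metrics for two miners.
metrics = {
    "miner-a": {
        "compute_units": compute_units([(2.0, 100), (1.0, 50)], bounties_earned=10),
        "invocation_count": 300,
        "unique_chute_count": 4,
        "bounty_count": 1,
    },
    "miner-b": {
        "compute_units": 140.0,
        "invocation_count": 100,
        "unique_chute_count": 6,
        "bounty_count": 3,
    },
}
scores = score_miners(metrics)
print(scores)
```

Under this normalization the scores sum to 1, so each score can be read as a miner's share of the reward pool.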

Security and Hardware Validation

Chutes addresses the challenge of verifying that miners own the hardware they claim. Without proper verification, dishonest miners could game the subnet while providing no real compute.

Chutes prevents this through GraVal (Graphics Validation). When miners register in the subnet, they must prove their GPUs are real through intensive tests. The system pushes GPUs to their limits with mathematical operations. Each GPU gets a unique "fingerprint" that cannot be copied or faked.

The validation process happens continuously, not just during initial setup. Every time miners restart their servers, they must pass new GraVal challenges to confirm that their hardware has not changed.

Beyond hardware verification, Chutes also protects miners from external attacks. Chutes miners don't announce their internet addresses publicly. This shields them from DDoS attacks that overwhelm servers with fake traffic. Instead, all communication flows through secure proxy servers. This makes it impossible for attackers to target individual miners directly.

Chutes' Performance and Results

Chutes has grown considerably since launching in late 2024. The subnet achieved 100 billion tokens processed in a single day within just 5 months of launch. It has processed over 9.1 trillion tokens across all of its models and currently processes nearly 160 billion tokens daily. These serve users and queries worldwide.

The subnet has served over 400,000 users across Chutes and Squad. This represents only direct platform usage. It doesn't include third-party integrations such as OpenRouter.

Revenue generation began with the fiat payment integration launched in June 2025. The platform operates a base tier of 200 requests per day, followed by pay-as-you-go pricing. Revenue currently reaches approximately $61,000 weekly. Usage retention has remained at 62% since payment requirements were implemented.

Chutes operates over 10,000 GPU nodes. These provide 5 million watts of computing capacity. This is comparable to a small power plant. The subnet includes thousands of NVIDIA H200s as backbone infrastructure. It also has over 1,000 L40s and hundreds of A6000s and other enterprise GPUs.

Below is a table showing the current count of the 5 most provisioned GPUs on Chutes:

GPU Model	GPU Count	Max Power (per GPU)
NVIDIA H200	3921	700 W
NVIDIA GeForce RTX 4090	1446	450 W
NVIDIA L40	1128	300 W
NVIDIA RTX A6000	891	300 W
NVIDIA GeForce RTX 3090	618	350 W
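
As a quick check on the table, the short calculation below totals the GPU counts and multiplies each count by its maximum rated power. It uses only the figures in the table; actual power draw depends on load and is not something the source reports.

```python
# (GPU model, count, max power per GPU in watts), copied from the table above.
gpu_table = [
    ("NVIDIA H200", 3921, 700),
    ("NVIDIA GeForce RTX 4090", 1446, 450),
    ("NVIDIA L40", 1128, 300),
    ("NVIDIA RTX A6000", 891, 300),
    ("NVIDIA GeForce RTX 3090", 618, 350),
]

total_gpus = sum(count for _, count, _ in gpu_table)
total_watts = sum(count * watts for _, count, watts in gpu_table)

print(f"{total_gpus} GPUs, {total_watts / 1e6:.2f} MW at maximum rated power")
```

The five most provisioned models account for 8,004 GPUs and about 4.2 MW of maximum rated power, which covers most of the roughly 5 million watts cited for the whole network.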

According to the team, the network operates at approximately 20-30% capacity. This means it could handle roughly 3 to 5 times its current traffic without adding more hardware.
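
The headroom figure follows from simple arithmetic, sketched here under the assumption that traffic scales linearly with utilization. The function name is hypothetical.

```python
def traffic_multiple(utilization):
    """How many times current traffic fits at full capacity (linear scaling assumed)."""
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    return 1.0 / utilization


low = traffic_multiple(0.30)   # about 3.3x current traffic
high = traffic_multiple(0.20)  # about 5x current traffic
print(f"{low:.1f}x to {high:.1f}x current traffic")
```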

Roadmap and Future Plans

Chutes is expanding beyond basic AI usage. It aims to become a comprehensive platform for AI. The roadmap shows plans across multiple quarters. Major features are already in development.

The second half of 2025 will introduce the planned "Long Jobs" capability, which supports extended computational tasks. These include training large models, running complex simulations, and other compute-intensive work that can take hours or days to complete.

The team is implementing Trusted Execution Environments (TEEs). These are hardware-guaranteed security systems designed for enterprise security requirements. TEEs provide complete isolation of user computations through hardware-level protection. This protects sensitive computational tasks. The open-source architecture allows transparent security verification. This makes Chutes suitable for sensitive enterprise work where businesses need assurance of data privacy and security.