Gradients

Gradients is a competitive AutoML platform where miners compete to fine-tune AI models. Multiple miners independently find optimal configurations for each dataset, rather than using a single predetermined strategy. The platform supports Instruct, DPO, GRPO, and image fine-tuning at $100-500, compared to Google Cloud's $10,000+ pricing.

Imagine running a wine business and wanting an AI assistant that understands a specific product catalog. Today, there are two expensive options: hire machine learning engineers at $100,000+ each, or pay services like Google Cloud up to $10,000 for a single 70B model fine-tuning session. Most businesses simply can't afford either option, leaving them locked out of fine-tuned AI models for their use case.

The challenge goes deeper than cost. Current AutoML platforms face a fundamental limitation in how they approach model fine-tuning. When these platforms fine-tune an AI model for specific needs, they use a single predetermined strategy to find the best fine-tuning "configuration". Google Cloud AutoML, HuggingFace AutoTrain, and TogetherAI all work this way, and while this single approach works well for general cases, it often misses the specific combinations that would work best for a particular dataset and task.

Gradients solves this by creating an AutoML platform where multiple miners compete to fine-tune the best model for specific needs. Instead of relying on a single predetermined approach, Gradients has multiple miners work independently to find the optimal fine-tuning configuration for each user's dataset and model.

The platform offers competitive total training costs, with smaller models costing around $100, 20B models around $250, and 70B models around $500. This compares to Google's pricing of over $10,000 for fine-tuning a 70B model on their Vertex AI platform.

Users simply upload their data, select a model, and start training. The entire process takes just a few clicks, making AI fine-tuning accessible to businesses without technical expertise.

To meet different needs, Gradients supports four different types of model fine-tuning, each designed for specific use cases.

Instruct fine-tuning is the standard method that teaches AI models to follow instructions and answer questions correctly. A wine business might fine-tune a model to answer "What wines pair well with dinner?" with "Our Chardonnay and Pinot Noir are excellent choices" rather than generic responses about wine categories.
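
To make this concrete, an instruct-style training example is essentially a prompt paired with the desired response. The record below is a minimal sketch; the field names are illustrative placeholders, not Gradients' actual data schema.

    import json

    # Hypothetical instruct-format record (field names are illustrative only).
    example = {
        "instruction": "What wines pair well with dinner?",
        "response": "Our Chardonnay and Pinot Noir are excellent choices.",
    }

    # Instruct datasets are commonly stored as JSON Lines, one example per line.
    with open("instruct_dataset.jsonl", "a", encoding="utf-8") as f:
        f.write(json.dumps(example) + "\n")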

Direct Preference Optimization (DPO) fine-tuning goes beyond teaching what to reply by focusing on how to reply. This method trains models to prefer certain response styles by showing them pairs of responses (one preferred and one rejected) for the same prompt. A wine business AI might learn to respond with "Based on your taste for lighter wines, I'd recommend our Pinot Grigio" instead of "Try our white wine selection" when customers ask for suggestions.
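
In the same spirit, a DPO example ties a single prompt to one preferred and one rejected response; the trainer learns to rank the preferred answer higher. Again, the field names below are assumptions for illustration, not the platform's schema.

    # Hypothetical DPO preference pair (illustrative field names).
    dpo_example = {
        "prompt": "Can you suggest a wine for me?",
        "chosen": "Based on your taste for lighter wines, I'd recommend our Pinot Grigio.",
        "rejected": "Try our white wine selection.",
    }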

Group Relative Policy Optimization (GRPO) fine-tuning represents a different approach, teaching models to maximize specific goals through reward functions that score different responses. A wine business might define reward functions that reward both product knowledge accuracy and sales conversion.
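
A reward function for GRPO is simply code that scores a model's completion. The sketch below shows what a toy reward for the wine example could look like; the heuristics are invented for illustration and are not part of Gradients' codebase.

    def wine_reward(prompt: str, completion: str) -> float:
        """Toy reward: favor completions that name a real product and nudge toward a sale."""
        catalog = {"chardonnay", "pinot noir", "pinot grigio", "riesling"}
        text = completion.lower()
        knowledge = 1.0 if any(product in text for product in catalog) else 0.0
        conversion = 0.5 if any(cue in text for cue in ("recommend", "try our", "pairs well")) else 0.0
        return knowledge + conversion

    # GRPO samples a group of completions per prompt, scores each one,
    # and pushes the policy toward the higher-scoring responses.
    completions = [
        "I'd recommend our Pinot Noir with red meat.",
        "Wine is made from grapes.",
    ]
    print([wine_reward("Suggest a wine.", c) for c in completions])  # [1.5, 0.0]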

Beyond text, Diffusion fine-tuning (image) teaches AI models to create images in specific styles, subjects, or artistic looks. The model studies the patterns, colors, and compositions in training images, then applies these learned characteristics when generating new images. A content creator could fine-tune a model this way to generate images in the "Studio Ghibli" animation style, for a product of their own.

Under the legacy mining mechanism, when the validator created a fine-tuning task, it assigned the task to a pool of miners, typically a larger pool for image tasks than for text tasks. The validator used a rotation system that prioritized miners who hadn't worked recently and those with strong performance records.

Each miner received the same starting model and training data. However, they had to determine their own optimal approach. Each miner analyzed the task requirements independently and configured their fine-tuning strategy based on the specific dataset and model. This included choosing the best training techniques, learning rates, and optimizations.
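
As a simplified sketch of that decision process, a miner might derive its hyperparameters from basic properties of the task. The heuristics and field names below are assumptions, not any real miner's strategy.

    def choose_config(num_examples: int, model_params_b: float) -> dict:
        """Toy heuristic: pick adapter rank, learning rate, and epochs from dataset and model size."""
        return {
            "lora_rank": 16 if model_params_b >= 20 else 8,
            "learning_rate": 1e-4 if num_examples < 50_000 else 5e-5,
            "epochs": 3 if num_examples < 10_000 else 1,
            "batch_size": 8,
        }

    print(choose_config(num_examples=25_000, model_params_b=70))
    # {'lora_rank': 16, 'learning_rate': 0.0001, 'epochs': 1, 'batch_size': 8}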

The fine-tuning process ran within strict time limits that varied based on dataset size. Text tasks typically got 3-10 hours while image tasks got 1-2 hours. Miners had to optimize their fine-tuning methods and compute resources to achieve the best results within these constraints. Once fine-tuning completed, miners submitted their fine-tuned models to validators.

Gradients uses a single main validator operated by Rayon Labs, with independent auditors verifying fairness. This approach ensures fairness while lowering barriers to entry.

The main validator coordinates the platform by creating both organic tasks from real users and synthetic tasks that keep miners active when user demand is low. When miners complete their fine-tuning tasks, the validator tests their submitted models in isolated environments against "secret" test datasets that miners never see during training. For text models, this means measuring how well they handle held-out conversations. For image models, it tests how accurately they can restore images after adding controlled noise.
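
For text models, that evaluation boils down to measuring cross-entropy loss on held-out conversations. The sketch below uses Hugging Face Transformers with a placeholder model and a single placeholder example; the real validator evaluates the miner's submitted checkpoint against its secret test set.

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Placeholder model; the validator would load the miner's submitted checkpoint instead.
    tokenizer = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    held_out = [
        "User: What wines pair well with fish?\nAssistant: Our Riesling is a great match.",
    ]

    losses = []
    for text in held_out:
        inputs = tokenizer(text, return_tensors="pt")
        with torch.no_grad():
            out = model(**inputs, labels=inputs["input_ids"])
        losses.append(out.loss.item())  # mean cross-entropy over the sequence

    print(sum(losses) / len(losses))  # lower held-out loss indicates a better fine-tune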

Other validators can participate as "auditors" to verify the main validator's work without expensive hardware. Auditors download task results from the past 7 days and independently recalculate what the weights should be. They compare these calculated weights to the actual weights set on-chain, verifying that weights are being set fairly and accurately.
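
Conceptually, the audit is a recomputation followed by a comparison. The sketch below assumes simple data shapes and a naive normalization; the real auditor tooling, scoring, and tolerances may differ.

    def audit(task_scores: dict[str, float], onchain_weights: dict[str, float], tol: float = 1e-3) -> bool:
        """Recompute normalized weights from recent task scores and compare them to the chain."""
        total = sum(task_scores.values()) or 1.0           # toy normalization over 7 days of scores
        recalculated = {m: s / total for m, s in task_scores.items()}
        return all(abs(recalculated[m] - onchain_weights.get(m, 0.0)) <= tol for m in recalculated)

    print(audit({"miner_a": 6.0, "miner_b": 4.0}, {"miner_a": 0.6, "miner_b": 0.4}))  # True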

Within each individual task, the top-performing miner received 3 points, the bottom 25% of participating miners received a penalty of -1 point each, and all miners in between (middle performers) received 0 points. This created an incentive to produce the best fine-tuned results and discouraged miners from accepting tasks they couldn't complete properly.
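
That scoring rule maps directly to a small function, sketched here under the assumption that miners are ranked by evaluation loss (lower is better).

    def score_task(losses: dict[str, float]) -> dict[str, int]:
        """Top miner gets 3 points, bottom 25% get -1, everyone in between gets 0."""
        ranked = sorted(losses, key=losses.get)        # best (lowest loss) first
        bottom = max(1, len(ranked) // 4)              # size of the bottom quartile
        scores = {miner: 0 for miner in ranked}
        for miner in ranked[-bottom:]:
            scores[miner] = -1
        scores[ranked[0]] = 3
        return scores

    print(score_task({"m1": 0.9, "m2": 1.1, "m3": 1.4, "m4": 2.0}))
    # {'m1': 3, 'm2': 0, 'm3': 0, 'm4': -1}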

The subnet maintained specific distributions for both task frequency and reward weight to balance miner incentives across different training types. When the validator generated synthetic tasks to keep miners active during low user demand, it followed a predetermined distribution (sketched in code after the list):

  • Instruct tasks made up 35% of synthetic tasks and received 35% of total reward weight.
  • DPO tasks made up 20% of synthetic tasks and received 10% of total reward weight.
  • GRPO tasks made up 20% of synthetic tasks and received 35% of total reward weight.
  • Image tasks made up 25% of synthetic tasks and received 20% of reward weight.
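
The published mix can be expressed as a simple weighted sampler. This is a minimal sketch assuming the validator draws each synthetic task type independently; the actual scheduling logic may differ.

    import random

    # Synthetic task mix and reward weights from the list above.
    TASK_MIX = {"instruct": 0.35, "dpo": 0.20, "grpo": 0.20, "image": 0.25}
    REWARD_WEIGHT = {"instruct": 0.35, "dpo": 0.10, "grpo": 0.35, "image": 0.20}

    def sample_synthetic_task() -> str:
        """Draw the next synthetic task type according to the published distribution."""
        return random.choices(list(TASK_MIX), weights=list(TASK_MIX.values()), k=1)[0]

    print(sample_synthetic_task())  # e.g. "instruct"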

Building on this, Gradients introduced a new incentive mechanism in July 2025 that pits miners against each other in a tournament-like design. Instead of returning finished models, miners now submit a link to their fine-tuning code for everyone to see. The actual fine-tuning then runs on the main validator's compute in isolated, secure environments with no internet access, resolving a privacy concern with the legacy incentive mechanism.

Tournaments run for about a week, with a day's break between competitions to prepare for the next round. Up to 32 miners participate in each tournament, selected based on their ALPHA stakes in Subnet 56 (Gradients). Miners who have participated in previous tournaments receive an ALPHA stake bonus that helps them qualify more easily. The tournaments encourage competition with miners earning exponentially more points for each round they advance through.

All participating miners first compete together in one large group on a single challenge, with the top eight performers advancing to knockout rounds where miners face off directly against each other until one "challenger" remains. That challenger must then face the current defending champion in a special "boss round" consisting of multiple tasks. To claim the title, the challenger needs to win by a specific margin (5-10%), which depends on how many times the champion has already defended their title.
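
The title-defense rule can be sketched as a simple margin check, assuming a higher aggregate score across the boss-round tasks is better. The exact schedule below (how the margin grows with each defense within the 5-10% band) is an assumption, since only the range is stated.

    def challenger_wins(challenger_score: float, champion_score: float, defenses: int) -> bool:
        """Challenger must beat the champion by a 5-10% margin that widens with each prior defense."""
        required_margin = min(0.05 + 0.01 * defenses, 0.10)   # assumed schedule within the stated band
        return challenger_score >= champion_score * (1 + required_margin)

    print(challenger_wins(challenger_score=1.07, champion_score=1.00, defenses=1))  # True: 7% beats the 6% bar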

At the end of each tournament, the fine-tuning code of the first and second-place miners is permanently archived in Gradients' open-source repositories. The result is a curated, permanent library of verified winning methods that ensures new miners don't have to start from scratch. This creates constant pressure for defending champions to keep improving their techniques, as other miners can study and adapt their code within hours.

The team conducted 180 controlled experiments to compare Gradients' performance against traditional AutoML platforms: HuggingFace AutoTrain, TogetherAI, Google Cloud Vertex AI, and Databricks. Performance was measured by loss scores on held-out test data that miners never accessed during training:

Gradients performed strongest on RAG tasks (44.4% above median) and translation (36.8% above median). HuggingFace showed comparable performance on reasoning tasks (15.3% vs 15.6%) but struggled significantly with code generation (-32%). TogetherAI consistently performed below median across all task types, particularly on RAG (-73.9%) and reasoning (-35%). Across all experiments, Gradients achieved an 82.8% win rate against HuggingFace AutoTrain and 100% win rates against TogetherAI, Databricks, and Google Cloud.

MEDIAN AND TASK TYPES

The median represents "middle" performance across all platforms, with positive scores showing above-average performance and negative scores showing below-average performance. RAG (Retrieval-Augmented Generation) tests models on answering questions using external documents. Reasoning evaluates logical thinking. Translation measures accuracy in converting text between languages. Code tests the ability to generate and understand code. Math measures mathematical problem-solving. These results suggest that Gradients' competitive fine-tuning discovers configurations that traditional AutoML platforms consistently miss.
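
One plausible reading of those percentages (the exact formula is not stated in the source) is the relative improvement of a platform's test loss over the cross-platform median, since lower loss means better performance. The numbers below are invented placeholders used only to show the arithmetic.

    from statistics import median

    def relative_to_median(platform_loss: float, all_losses: list[float]) -> float:
        """Percent above (+) or below (-) the median, assuming lower loss is better."""
        m = median(all_losses)
        return 100 * (m - platform_loss) / m

    losses = {"gradients": 0.50, "huggingface": 0.80, "togetherai": 1.40, "vertex": 0.95, "databricks": 1.05}
    print({k: round(relative_to_median(v, list(losses.values())), 1) for k, v in losses.items()})
    # {'gradients': 47.4, 'huggingface': 15.8, 'togetherai': -47.4, 'vertex': 0.0, 'databricks': -10.5}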

Gradients transitioned exclusively to tournament-based mining in October 2025. The legacy mining system was retired, with all customer fine-tuning requests now served through tournaments. Tournament participation was opened to all miners, with an approximately $80 buy-in per tournament.

A new emission structure has launched alongside this transition. Base emissions were set at 10% for image tasks and 15% for text tasks. Miners who demonstrate strong performance improvements can now earn up to 40% for image tasks and 60% for text tasks.

Privacy-focused compute hosting will go live during October-November 2025, allowing the platform to host training infrastructure for enterprise customers. A TAO payment integration launches simultaneously, enabling customers to pay for fine-tuning services with TAO tokens.

Within six months, Gradients will add support for multi-dataset training. Users will be able to upload or select multiple datasets that are trained on sequentially, with each resulting model merged and used as the input for the next training run. This allows businesses to build specialized models by combining different types of training data. The team is also training a 33 billion parameter model following the same methodology used for the 8 billion parameter model that outperformed Qwen 3 Instruct (the base model).

Within 12 months, Trusted Execution Environment (TEE) implementation will provide enterprise-grade privacy guarantees for sensitive training data. TEEs create hardware-isolated environments where computations remain invisible even to system administrators, ensuring that customer data never leaves the secure environment in readable form. Gradients may also expand to support video model training depending on market demand.