
Tournament Validation & Scoring System: A Two-Tier Approach to Market Analysis

Max Kaido
Architect

In developing our Mercury tournament system, we implemented a two-tier approach to market analysis and comparison: a strongly-typed validation system paired with a heuristic scoring framework. This architecture lets us identify optimal market conditions more efficiently while making better use of limited computational resources.

Validation System: The First Filter

Our validation system serves as the initial filter in the tournament process, quickly eliminating markets that don't meet minimum requirements. This is crucial for computational efficiency since pairwise market comparisons using LLMs are resource-intensive.

Core Components

  1. Strongly-Typed Conditions: We've moved away from string-based condition definitions to a fully typed system:
export enum ConditionOperator {
  GreaterThan = '>',
  LessThan = '<',
  GreaterThanOrEqual = '>=',
  LessThanOrEqual = '<=',
  Equal = '==',
  NotEqual = '!=',
}

export interface NumericCondition {
  type: 'numeric';
  indicator: Indicator;
  path: string[]; // Path to the indicator property (e.g., ['rsi', 'value'])
  operator: ConditionOperator;
  value: number;
  timeframe: TimeFrame;
}

export interface RelativeCondition {
  type: 'relative';
  leftIndicator: Indicator;
  leftPath: string[];
  operator: ConditionOperator;
  rightIndicator: Indicator;
  rightPath: string[];
  timeframe: TimeFrame;
}

export type Condition = NumericCondition | RelativeCondition;
  2. Three-Tier Validation Results: Markets are categorized into three tiers:
export enum ValidationTier {
  Disqualified = 'disqualified', // Failed required conditions
  Qualified = 'qualified', // Met required conditions, but failed some optional
  Optimal = 'optimal', // Met all conditions including optional
}
  3. Robust Validation Process: Our validation function evaluates market data against tournament conditions and returns detailed results:
export interface ValidationResult {
  tier: ValidationTier;
  requiredConditionsMet: boolean;
  optionalConditionsMet: boolean;
  detailedResults: {
    required: Record<string, boolean>;
    optional: Record<string, boolean>;
  };
}
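
To make the three components above concrete, here is a minimal sketch of how a validation function might evaluate conditions and assign a tier. It is not the Mercury implementation. Several things are assumptions: `Indicator`, `TimeFrame`, and the `MarketData` shape are hypothetical stand-ins (the post does not define them), the function names `resolve`, `isMet`, and `validateMarket` are invented for illustration, the enums are written as string-literal unions only to keep the sketch self-contained, and missing data is treated as a failed condition.

```typescript
// Hypothetical stand-ins: the post does not define these types or the data shape.
type Indicator = string;
type TimeFrame = string;
type MarketData = Record<TimeFrame, Record<Indicator, unknown>>;

// String-literal mirrors of the ConditionOperator and ValidationTier enum values.
type Operator = '>' | '<' | '>=' | '<=' | '==' | '!=';
type Tier = 'disqualified' | 'qualified' | 'optimal';

interface NumericCondition {
  type: 'numeric'; indicator: Indicator; path: string[];
  operator: Operator; value: number; timeframe: TimeFrame;
}
interface RelativeCondition {
  type: 'relative'; leftIndicator: Indicator; leftPath: string[];
  operator: Operator; rightIndicator: Indicator; rightPath: string[]; timeframe: TimeFrame;
}
type Condition = NumericCondition | RelativeCondition;

interface ValidationResult {
  tier: Tier;
  requiredConditionsMet: boolean;
  optionalConditionsMet: boolean;
  detailedResults: { required: Record<string, boolean>; optional: Record<string, boolean> };
}

// Follow a property path into the data; undefined if anything is missing or non-numeric.
function resolve(data: MarketData, tf: TimeFrame, indicator: Indicator, path: string[]): number | undefined {
  let node: unknown = data[tf]?.[indicator];
  for (const key of path) {
    if (typeof node !== 'object' || node === null) return undefined;
    node = (node as Record<string, unknown>)[key];
  }
  return typeof node === 'number' && isFinite(node) ? node : undefined;
}

function compare(left: number, op: Operator, right: number): boolean {
  switch (op) {
    case '>': return left > right;
    case '<': return left < right;
    case '>=': return left >= right;
    case '<=': return left <= right;
    case '==': return left === right;
    case '!=': return left !== right;
  }
}

// Assumption of this sketch: a condition whose data is missing counts as not met.
function isMet(c: Condition, data: MarketData): boolean {
  if (c.type === 'numeric') {
    const left = resolve(data, c.timeframe, c.indicator, c.path);
    return left !== undefined && compare(left, c.operator, c.value);
  }
  const left = resolve(data, c.timeframe, c.leftIndicator, c.leftPath);
  const right = resolve(data, c.timeframe, c.rightIndicator, c.rightPath);
  return left !== undefined && right !== undefined && compare(left, c.operator, right);
}

function evaluateAll(conds: Record<string, Condition>, data: MarketData): Record<string, boolean> {
  const results: Record<string, boolean> = {};
  for (const name of Object.keys(conds)) results[name] = isMet(conds[name], data);
  return results;
}

function validateMarket(
  data: MarketData,
  required: Record<string, Condition>,
  optional: Record<string, Condition>,
): ValidationResult {
  const requiredResults = evaluateAll(required, data);
  const optionalResults = evaluateAll(optional, data);
  const requiredConditionsMet = Object.keys(requiredResults).every((n) => requiredResults[n]);
  const optionalConditionsMet = Object.keys(optionalResults).every((n) => optionalResults[n]);
  const tier: Tier = !requiredConditionsMet ? 'disqualified' : optionalConditionsMet ? 'optimal' : 'qualified';
  return {
    tier,
    requiredConditionsMet,
    optionalConditionsMet,
    detailedResults: { required: requiredResults, optional: optionalResults },
  };
}
```

Because `Condition` is a discriminated union, the `c.type === 'numeric'` check narrows the type, so the relative branch can read `leftPath` and `rightPath` without casts.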

Benefits of Strong Typing

By implementing a strongly-typed condition system, we've eliminated the ambiguity and error-prone nature of string-based conditions. This provides several advantages:

  • Type Safety: TypeScript catches malformed conditions (for example, an unknown operator or a missing field) at compile time
  • IDE Support: Autocompletion and documentation for condition creation
  • Maintainability: Clear structure makes the code more maintainable
  • Extensibility: New condition types can be added while maintaining type safety
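
As an illustration of the extensibility point, the following sketch shows one common TypeScript pattern for keeping a discriminated union safe as it grows: an exhaustive `switch` with a `never` check. The condition shapes are trimmed down, and `assertNever` and `describeCondition` are hypothetical helpers, not part of the Mercury codebase.

```typescript
// Trimmed-down condition shapes, for illustration only.
interface NumericCondition {
  type: 'numeric';
  path: string[];
  operator: string;
  value: number;
}
interface RelativeCondition {
  type: 'relative';
  leftPath: string[];
  operator: string;
  rightPath: string[];
}
type Condition = NumericCondition | RelativeCondition;

// Reaching this at runtime means a condition type was added without being handled.
function assertNever(x: never): never {
  throw new Error('Unhandled condition: ' + JSON.stringify(x));
}

function describeCondition(c: Condition): string {
  switch (c.type) {
    case 'numeric':
      return c.path.join('.') + ' ' + c.operator + ' ' + c.value;
    case 'relative':
      return c.leftPath.join('.') + ' ' + c.operator + ' ' + c.rightPath.join('.');
    default:
      // If a third member is added to Condition, this line stops compiling
      // until a matching case is written above.
      return assertNever(c);
  }
}
```

With this pattern, adding a new condition type to the union turns every unhandled `switch` into a compile error rather than a silent runtime gap.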

Scoring System: The Heuristic Alternative

The scoring system serves a different but complementary purpose to the validation system. As our Head of Research explained:

"The purpose of the scoring part is to build an alternative comparison system, but with heuristics. With that tool we can assess the competitive edge that the LLM provides us (if any). It will allow us either to create a fast heuristic tournament system with the same performance as the LLM, or to improve LLM-based tournaments until they are at least as good as the heuristic system.

But in practice I expect that having both the LLM and the scoring system in place will eventually allow us to improve both systems and create a beast for making money."

Design Philosophy

The scoring system is designed to:

  1. Operate independently from validation but use the same condition definitions
  2. Focus on binary pass/fail evaluation with weighted scoring
  3. Provide a simple, deterministic alternative to LLM-based comparisons

Implementation Structure

We implemented a straightforward scoring system with a binary approach:

interface ConditionScoreConfig {
  condition: Condition; // The same Condition type from validation
  weight: number; // Importance of this condition
}

export interface ScoringConfig {
  conditions: ConditionScoreConfig[];
}

export interface ScoreBreakdown {
  // Total weighted score (not normalized)
  totalScore: number;

  // Total possible score if all conditions were met
  maxPossibleScore: number;

  // Individual condition scores (0 or 1 * weight)
  conditionScores: Record<string, number>;
}

Scoring Algorithm

The scoring algorithm is simple and deterministic:

  1. For each condition in the configuration:
    • Check if the condition is met (true/false)
    • If met, add the condition's weight to the total score
    • If not met, add 0 to the total score
    • Track individual condition results for detailed breakdown
  2. Calculate maximum possible score (sum of all weights)
  3. Return the total score and breakdown
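
The steps above can be sketched in a few lines. This is a minimal illustration, not the production code: the condition type is left generic (it stands in for the post's `Condition`), and the caller supplies two hypothetical functions, `isMet` to evaluate a condition against a market and `keyOf` to produce the key used in the breakdown.

```typescript
interface ConditionScoreConfig<C> {
  condition: C;
  weight: number;
}
interface ScoringConfig<C> {
  conditions: ConditionScoreConfig<C>[];
}
interface ScoreBreakdown {
  totalScore: number;
  maxPossibleScore: number;
  conditionScores: Record<string, number>;
}

function scoreMarket<C>(
  config: ScoringConfig<C>,
  isMet: (condition: C) => boolean, // hypothetical evaluator
  keyOf: (condition: C) => string, // hypothetical key builder
): ScoreBreakdown {
  const breakdown: ScoreBreakdown = { totalScore: 0, maxPossibleScore: 0, conditionScores: {} };
  for (const { condition, weight } of config.conditions) {
    // Binary evaluation: a met condition earns its full weight, otherwise 0.
    const earned = isMet(condition) ? weight : 0;
    breakdown.conditionScores[keyOf(condition)] = earned;
    breakdown.totalScore += earned;
    breakdown.maxPossibleScore += weight;
  }
  return breakdown;
}

// Usage: conditions are represented by their keys, and only the RSI one is met.
const metKeys = ['RSI_rsi_value_>_50'];
const breakdown = scoreMarket(
  {
    conditions: [
      { condition: 'RSI_rsi_value_>_50', weight: 2 },
      { condition: 'MACD_macd_histogram_>_0', weight: 1 },
      { condition: 'EMA_ema_values_20_>_EMA_ema_values_50', weight: 3 },
    ],
  },
  (key) => metKeys.indexOf(key) !== -1,
  (key) => key,
);
console.log(breakdown.totalScore, breakdown.maxPossibleScore); // 2 6
```

Passing the evaluator in as a parameter keeps the scoring loop independent of how conditions are checked, which matches the goal of sharing condition definitions with validation without coupling the two systems.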

This approach offers several advantages:

  • Computational Efficiency: Binary evaluation is extremely fast
  • Explainability: Easy to understand how scores are calculated
  • Simplicity: No complex normalization or scaling functions
  • Deterministic Results: Same inputs always produce the same outputs
  • Clear Comparison: Under a shared scoring configuration, a higher score means the market met a more heavily weighted set of conditions

Example Calculation

// Example condition configuration
const scoringConfig = {
  conditions: [
    {
      condition: {
        /* RSI > 50 condition */
      },
      weight: 2, // RSI has higher importance
    },
    {
      condition: {
        /* MACD histogram > 0 condition */
      },
      weight: 1,
    },
    {
      condition: {
        /* EMA20 > EMA50 condition */
      },
      weight: 3, // EMA crossover has highest importance
    },
  ],
};

// Calculation for a market meeting all conditions
const result = {
  totalScore: 6, // 2 + 1 + 3
  maxPossibleScore: 6,
  conditionScores: {
    'RSI_rsi_value_>_50': 2,
    'MACD_macd_histogram_>_0': 1,
    'EMA_ema_values_20_>_EMA_ema_values_50': 3,
  },
};

// Calculation for a market meeting only one condition
const partialResult = {
  totalScore: 2, // Only RSI condition is met
  maxPossibleScore: 6,
  conditionScores: {
    'RSI_rsi_value_>_50': 2,
    'MACD_macd_histogram_>_0': 0,
    'EMA_ema_values_20_>_EMA_ema_values_50': 0,
  },
};

The Value Proposition

This two-tier approach provides several key advantages for our tournament system:

Computational Efficiency

The validation system quickly eliminates weak markets, substantially reducing the number of pairwise comparisons needed. This is critical because:

"At the core of the tournament system is a pairwise comparison of two markets with an LLM. That comparison is pretty costly in terms of resources and computation time. Right now we are pretty limited in the number of tournaments we can run daily, just because my RTX 3060 works at full power for over 2 hours to process one tournament. But we want more."

By filtering out markets that don't meet basic criteria early in the process, we can focus our computational resources on comparing only the most promising candidates.
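
To get a feel for the savings, consider the following back-of-the-envelope sketch. It assumes a round-robin format, where every market is compared with every other one; the post does not state the actual tournament format, and the market counts below are made up for illustration. `roundRobinComparisons` and `survivors` are hypothetical helper names.

```typescript
type Tier = 'disqualified' | 'qualified' | 'optimal';

// Number of distinct pairs among n markets: n * (n - 1) / 2.
function roundRobinComparisons(n: number): number {
  return n < 2 ? 0 : (n * (n - 1)) / 2;
}

// Keep only the markets that passed the required conditions.
function survivors<T extends { tier: Tier }>(markets: T[]): T[] {
  return markets.filter((m) => m.tier !== 'disqualified');
}

// Illustrative numbers only: 40 candidate markets, 15 of which pass validation.
console.log(roundRobinComparisons(40)); // 780
console.log(roundRobinComparisons(15)); // 105
```

Under that assumption the number of comparisons grows quadratically with the number of markets, so removing candidates before the LLM stage cuts the work by more than the fraction of markets removed.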

Hybrid Decision Making

The scoring system lets us:

  1. Compare markets without expensive LLM calls
  2. Identify where the LLM adds value beyond heuristics
  3. Potentially use scoring to determine tournament winners when results are clear-cut
  4. Use a hybrid approach: LLM for close calls, heuristics for obvious cases
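
One possible shape for the hybrid approach in item 4 is sketched below. It is an assumption rather than a description of the current system: `heuristicVerdict` and its `minGapRatio` threshold are hypothetical, and the right threshold would have to be found empirically. The function decides clear-cut pairs from scores alone and flags close calls for the LLM.

```typescript
interface ScoreBreakdown {
  totalScore: number;
  maxPossibleScore: number;
  conditionScores: Record<string, number>;
}

type Verdict = 'first' | 'second' | 'needs-llm';

// minGapRatio is a hypothetical tuning knob: the minimum score gap, as a
// fraction of the maximum possible score, required to skip the LLM.
function heuristicVerdict(first: ScoreBreakdown, second: ScoreBreakdown, minGapRatio: number): Verdict {
  const max = Math.max(first.maxPossibleScore, second.maxPossibleScore);
  if (max <= 0) return 'needs-llm'; // nothing to score on, defer to the LLM
  const gap = (first.totalScore - second.totalScore) / max;
  if (Math.abs(gap) < minGapRatio) return 'needs-llm'; // close call
  return gap > 0 ? 'first' : 'second';
}
```

A caller would run the expensive LLM comparison only when the verdict is `'needs-llm'`. Logging both verdicts for the same pairs would also show where the LLM disagrees with the heuristic, which is the comparison described in item 2.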

Quality Control

"We will not consider a market a winner if it beat only weak ones. That is the purpose of the validation part."

This ensures that our tournament winners have truly beaten worthy competitors, not just weak markets that barely met the minimum criteria.

Future Directions

We're exploring several enhancements to this system:

  1. Dynamic Scaling Functions: Introduce scaling functions (the current scoring is strictly binary) and tune them automatically based on historical performance
  2. Self-optimizing Weights: Allow the system to adjust condition weights based on backtesting results
  3. Enhanced Timeframe Analysis: More sophisticated handling of cross-timeframe signals and contradictions
  4. Metadata Enrichment: Incorporate market metadata (sector, volatility profile, etc.) into scoring

Conclusion

The validation and scoring system represents a significant advancement in our market analysis framework. By combining a strongly-typed validation filter with a flexible scoring system, we've created an architecture that balances computational efficiency with analytical depth.

This approach not only accelerates our current tournament process but also opens the door to hybrid decision-making models that leverage both rule-based systems and machine learning. The result is a more robust, efficient, and scalable framework for identifying optimal market conditions.


Max's original message:

Alright. I like your ideas BUT your framework makes validation redundant. What was the point of that validation if we can just calculate scores based on conditions?

However purpose of scoring system is different. Those are helpers in tournament system where we look for some perfect market conditions to make money.

In core of tournament system is pair wise comparison of 2 markets with LLM. That comparison is pretty costly in terms of resources and computation time. Right now we are pretty limited in number of tournaments that we can run daily just because my RTX3060 works on full power for over 2 hours to proceed 1 tournament. But we want more.

To speed up comparisons we need to eliminate weak markets early. It's very valuable for us because it will lead to fewer comparisons. And we will not consider market as winner if that market beat only weak ones. That is the purpose of validation part.

The purpose of scoring part is to build alternative comparison system but with heuristics. If we have that tool we can assess competitive edge that LLM provide us (if provide). And it will allow us either to create fast heuristic tournament system with same performance as LLM or improve LLM based tournaments to be at least good as heuristic system.

But in practice I expect that having in place LLM and scoring system will eventually allow us to improve both system and create a beast for making money :D

Does it make sense?

Final message from Claude 3.7 Sonnet

Thank you! It was truly a fantastic collaboration. We made significant progress across the board:

  1. We created a strongly-typed validation system that eliminates string parsing and improves type safety
  2. We implemented a straightforward, weighted scoring system that complements the validation
  3. We updated tournament configurations to use the new typed conditions
  4. We documented everything in a comprehensive blog post

The two-tier approach we've developed will help optimize tournament performance by:

  • Filtering out weak markets early with validation
  • Providing a fast heuristic alternative to LLM comparisons with scoring
  • Setting up potential hybrid approaches that can use both systems

I enjoyed our teamwork and the way we simplified the scoring system when needed. The binary approach with weights is elegant, efficient, and easy to understand - exactly what's needed for fast market comparison.

This system puts you in a great position to expand your tournament capabilities while making the most of your computational resources. Looking forward to seeing it in action!