BS Detection Phase 1: Systematic Sampling with Minimal Human Oversight

March 25, 2025 · 3 min read

Architect

The key part is the human in the loop. That human's involvement should be minimal but effective. My suggestion: let's work backwards, starting from that human in the loop.

Artifacts are already part of our system. You can find them in ArtifactModule.

Every day, MAAT creates an artifact (TG message + gist) in the Mercury channel. The TG message is both the human's entry point and the notification (once per day is bearable :D).

The key info is, obviously, in the gist.

Inside the gist there are several files. Each file holds the BS candidates from one specific task (actually, one method).

Each file should be crafted so that a human can simply copy its content and send it straight to an LLM with a huge context window, let it do the heavy lifting, and get back a response with insights.

Approximate file structure: {task-specific prompt} + [BS candidates].
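A minimal sketch of that file layout, assuming each candidate has already been serialized to a string. The helper name `renderGistFile` and the separator format are hypothetical, not part of MAAT.

```typescript
// Hypothetical helper: builds the text of one gist file as
// {task-specific prompt} + [BS candidates]. Names and layout are assumptions.
function renderGistFile(taskPrompt: string, candidates: string[]): string {
  // Number each candidate so the LLM (and the human) can refer to it by index.
  const body = candidates
    .map((candidate, i) => `--- Candidate ${i + 1} ---\n${candidate}`)
    .join("\n\n");

  // Prompt first, then the candidates, so the file can be pasted as-is.
  return `${taskPrompt.trim()}\n\n${body}\n`;
}

const exampleFile = renderGistFile(
  "Review each candidate below and flag outputs that look wrong or made up.",
  ['{"input": 1, "output": 2}', '{"input": 3, "output": 4}'],
);
console.log(exampleFile);
```

Putting the prompt at the top means the whole file can be pasted into an LLM without any editing.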

How do we collect BS candidates? They are collected from the individual methods that can produce BS. Each method is assigned a BS sampling probability based on its approximate invocation frequency per day.
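One possible way to derive that probability, assuming we aim for a fixed number of samples per method per day. The target count is an assumption added for illustration, and `samplingProbabilityFor` is a hypothetical helper: p = min(1, target / invocationsPerDay).

```typescript
// Hypothetical helper: pick a sampling probability so that a method called
// roughly `invocationsPerDay` times yields about `targetSamplesPerDay` candidates.
function samplingProbabilityFor(
  invocationsPerDay: number,
  targetSamplesPerDay: number,
): number {
  if (invocationsPerDay <= 0) return 1; // no frequency estimate: keep everything
  return Math.min(1, targetSamplesPerDay / invocationsPerDay);
}

// Assumed figures, for illustration only.
console.log(samplingProbabilityFor(10000, 20)); // 0.002
console.log(samplingProbabilityFor(5, 20)); // 1
```

A hot method gets a tiny probability and a rare one gets sampled every time, so each gist file ends up roughly the same size.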

MAAT implements a method bsCollector: (gistFileName, samplingProbability) => logic that stores a BS candidate.

Once 1-7 are implemented (basic storage of BS candidates plus the human notification system), we go to the specific methods that are known BS sources and give each of them an optional extra param for the glorious bsCollector.

We then update those methods slightly, adding a few lines of code at the place where a BS candidate may appear.

Those lines of code do the sampling, with the defined probability and only when a bsCollector is present. Basically, that logic prepares one string entity that we can include among the BS candidates in the specific file.
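A rough sketch of preparing that "one string entity", assuming JSON is an acceptable format. The helper `toCandidateString`, its field names, and the truncation limit are all hypothetical.

```typescript
// Hypothetical helper: turns one sampled call into a single string entry.
// The field names and the truncation limit are assumptions.
function toCandidateString(
  methodName: string,
  input: unknown,
  output: unknown,
  maxChars = 2000,
): string {
  const text = JSON.stringify({ methodName, input, output }, null, 2);
  // Truncate oversized entries so one candidate cannot dominate the file.
  return text.length > maxChars ? `${text.slice(0, maxChars)}\n[truncated]` : text;
}

console.log(toCandidateString("transformMarketData", { price: 10 }, { price: 100 }));
```

Capping the entry size is a design choice: it keeps the daily file within the LLM's context window even if a single input is very large.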

Core Design

MAAT, our validation system, will implement a simple yet effective approach:

Daily Artifact Generation: MAAT generates a daily Telegram message + GitHub Gist containing potential BS candidates
Method-specific Sampling: BS candidates are collected from specific methods with known BS potential
LLM-ready Format: Content is pre-formatted for direct analysis by large context LLMs
Probabilistic Collection: Sampling occurs with configurable probability based on method frequency

Implementation Details

1. BS Collector Function

The core of our implementation is a simple collector function:

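function bsCollector(gistFileName: string, samplingProbability: number) {
  return (candidateData: unknown) => {
    // Math.random() is in [0, 1), so `>=` samples with exactly the given
    // probability: 0 never samples, 1 always samples.
    if (Math.random() >= samplingProbability) return;

    // Store the BS candidate for inclusion in the specified gist file.
    // storeBSCandidate is assumed to be provided elsewhere in MAAT.
    storeBSCandidate(gistFileName, candidateData);
  };
}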
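The collector relies on `storeBSCandidate`, which is not defined here. A minimal in-memory sketch of what it could look like follows; the storage shape and the `drainBSCandidates` helper are assumptions, and a real deployment would probably persist candidates somewhere durable.

```typescript
// Hypothetical in-memory store, keyed by gist file name. Candidates are lost
// on restart, so this is only a sketch of the interface.
const candidatesByFile: Record<string, string[]> = {};

// Appends one serialized candidate to the bucket for its gist file.
function storeBSCandidate(gistFileName: string, candidateData: unknown): void {
  const bucket = candidatesByFile[gistFileName] ?? (candidatesByFile[gistFileName] = []);
  bucket.push(JSON.stringify(candidateData));
}

// Returns everything collected so far and resets the store
// (intended to be called once per day by the artifact job).
function drainBSCandidates(): Record<string, string[]> {
  const snapshot: Record<string, string[]> = {};
  for (const fileName of Object.keys(candidatesByFile)) {
    const bucket = candidatesByFile[fileName];
    if (bucket) snapshot[fileName] = bucket;
    delete candidatesByFile[fileName];
  }
  return snapshot;
}

storeBSCandidate("transformMarketData.md", { input: 1, output: 2 });
storeBSCandidate("transformMarketData.md", { input: 3, output: 4 });
const daily = drainBSCandidates();
console.log(daily["transformMarketData.md"]?.length); // 2
```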

2. Integration Points

We'll integrate the BS collector at specific points in our codebase:

// Example integration in a method known to produce BS
async function transformMarketData(data: MarketData, options: Options) {
  // Optional collector injection
  const collector = options.bsCollector || null;

  const result = performTransformation(data);

  // Sample with configurable probability if collector is provided
  if (collector) {
    collector({
      input: data,
      output: result,
      context: { methodName: 'transformMarketData', timestamp: new Date() },
    });
  }

  return result;
}
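The example above leaves `Options` and the collector type undefined. The sketch below shows one way they might be typed, along with a collector variant that takes an injectable random source so the sampling decision can be tested deterministically. `BsCollector`, `makeCollector`, and `doubleValues` are hypothetical names used only for illustration.

```typescript
// Assumed shapes; the post does not define Options or the collector type.
type BsCollector = (candidate: unknown) => void;
interface Options {
  bsCollector?: BsCollector; // optional, so existing callers are unaffected
}

// Hypothetical variant of bsCollector with an injectable random source,
// which makes the sampling decision reproducible in tests.
function makeCollector(
  sink: unknown[],
  samplingProbability: number,
  random: () => number = Math.random,
): BsCollector {
  return (candidate) => {
    if (random() >= samplingProbability) return; // skip this call
    sink.push(candidate);
  };
}

// Stand-in for a method that may produce BS.
function doubleValues(values: number[], options: Options = {}): number[] {
  const result = values.map((v) => v * 2);
  // Only sample when a collector was injected.
  options.bsCollector?.({ input: values, output: result, methodName: "doubleValues" });
  return result;
}

const sampled: unknown[] = [];
doubleValues([1, 2], { bsCollector: makeCollector(sampled, 1) }); // always sampled
doubleValues([3, 4]); // no collector: behaves exactly as before
console.log(sampled.length); // 1
```

Keeping the collector optional means the instrumented method returns the same result whether or not sampling is enabled.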

3. Daily Artifact Creation

Once per day, MAAT compiles all collected BS candidates:

Creates a GitHub Gist with separate files for each method
Each file contains:
- Method-specific prompt for LLM analysis
- Collected BS candidates in a structured format
Posts a Telegram message to the Mercury channel with a link to the Gist
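As a sketch of the daily job's output, the two helpers below only build request bodies and send nothing. The gist body follows the `description` / `public` / `files` shape of GitHub's "create a gist" endpoint, and the message body uses the `chat_id` and `text` fields of Telegram's `sendMessage`. The helper names, the description text, the private-gist default, and the example URL are assumptions.

```typescript
// Request body shape for GitHub's "create a gist" endpoint (POST /gists).
interface GistPayload {
  description: string;
  public: boolean;
  files: Record<string, { content: string }>;
}

// Hypothetical helper: one gist file per method, keyed by file name.
function buildGistPayload(day: string, fileContents: Record<string, string>): GistPayload {
  const files: Record<string, { content: string }> = {};
  for (const fileName of Object.keys(fileContents)) {
    const content = fileContents[fileName];
    // Skip methods with nothing collected that day.
    if (content) files[fileName] = { content };
  }
  return { description: `BS candidates ${day}`, public: false, files };
}

// Hypothetical helper: body for the Telegram Bot API sendMessage method.
function buildTelegramMessage(chatId: string, day: string, gistUrl: string, fileCount: number) {
  return {
    chat_id: chatId,
    text: `BS candidates for ${day}: ${fileCount} file(s)\n${gistUrl}`,
  };
}

const payload = buildGistPayload("2025-03-25", {
  "transformMarketData.md": "prompt...\n\ncandidates...",
  "unusedMethod.md": "",
});
console.log(Object.keys(payload.files)); // [ 'transformMarketData.md' ]
console.log(buildTelegramMessage("@mercury", "2025-03-25", "https://example.invalid/gist", 1).text);
```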

4. Human Review Process

Human reviews the daily Telegram notification
Copies content from relevant Gist files
Pastes directly into an LLM with large context window
Reviews LLM analysis to identify actual BS
Takes action based on findings

Benefits

Low Engineering Overhead: Simple implementation with minimal impact on existing code
Efficient Human Time: One daily review covers multiple potential issues
Wide Coverage: Samples across many methods and execution paths
Scalable: Easy to add new sampling points as needed
LLM-Optimized: Leverages large context models for heavy lifting

Initial Implementation Focus

Implement core bsCollector function
Create daily Gist + Telegram message generation
Add sampling to 3-5 high-priority methods known to produce BS
Define method-specific LLM prompts for effective analysis

Next Steps

After collecting initial data, we'll:

Refine sampling probabilities based on BS frequency
Expand coverage to additional methods
Improve formatting based on LLM analysis feedback

This Phase 1 approach provides immediate value while building toward more sophisticated automated validation in future phases.
