Skip to main content

When AI Learns to Be Mediocre: A Tale of Feedback Gone Wrong

· 3 min read
Max Kaido
Architect

The Discovery

It started with a simple observation: our fancy trading signal dashboard was showing 31,753 neutral signals, 0 sell signals, and a suspiciously uniform distribution. Something was clearly wrong with our "intelligent" signal generation system.

"That's odd," I thought, "the signals aren't diverse enough 🙂"

Little did I know I was about to uncover one of the most ironic implementations of machine learning I've ever seen.

The Investigation

Diving into the codebase, I found our signal generation system with its proudly implemented "feedback loop" - a mechanism designed to make the system smarter over time by adjusting the weights of different technical indicators.

The comments were promising:

/**
* The feedback loop that adjusts indicator weights.
* If an indicator's signal agrees with the final signal, its weight increases.
* Otherwise, its weight is reduced. The weights are then normalized.
*/

Sounds reasonable, right? Reward indicators that agree with the final signal, penalize those that don't. Classic reinforcement learning... or so I thought.

The Hilarious Truth

As I examined the implementation more closely, I realized what was actually happening:

const alignment = detail.signal === finalSignal ? 1 : -1;
// Update weight using a simple multiplicative factor.
this.indicatorWeights[detail.indicator] *=
1 + this.learningRate * alignment * detail.strength;

Wait a minute... the system was adjusting weights based on whether an indicator agreed with the final signal - which itself was just a weighted average of all the indicators!

This wasn't reinforcement learning. This was a popularity contest.

The Democracy of Dumb Indicators

Imagine you have 10 people in a room voting on whether it's raining outside:

  • 8 people have no windows and are just guessing
  • 2 people can actually see outside and confirm it's pouring rain

In a normal scenario, you'd listen to the 2 people who can see outside.

But our system was designed to:

  1. Take a vote from all 10 people
  2. Go with the majority opinion (8 votes for "not raining")
  3. Then increase the voting power of everyone who voted with the majority
  4. And decrease the voting power of the two people who actually saw the rain

The next time it rains, those two truthful observers have even less influence!

The Self-Reinforcing Cycle of Mediocrity

Over time, our system had:

  1. Boosted the weights of indicators that frequently output NEUTRAL signals
  2. Decimated the weights of any indicators brave enough to suggest a directional trade
  3. Created a system that was structurally incapable of producing diverse signals

The most absurd part? The system never checked if its signals were actually profitable or correct - it just assumed that whatever the majority of indicators said must be right.

Lessons Learned

This cautionary tale teaches us several important lessons about designing feedback systems:

  1. Validate Against Reality: A feedback loop needs to validate predictions against actual outcomes, not just internal consensus.

  2. Beware of Self-Reinforcing Biases: Systems that reward agreement with themselves will amplify any initial biases.

  3. Value Diversity of Opinion: Sometimes the minority view is correct, especially in markets where contrarian positions can be profitable.

  4. Check Your Assumptions: The comment describing the feedback loop sounded reasonable, but the implementation created the opposite of what was intended.

The Fix

The solution? Either:

  1. Disable the feedback loop entirely
  2. Reset the weights to their initial values
  3. Implement a proper feedback mechanism that validates signals against actual market outcomes

For now, we've gone with option 1, and suddenly our system is producing a much more diverse and useful set of signals.

Conclusion

Sometimes the most educational bugs are the ones that make you laugh. Our "learning" system had indeed learned something - it had learned to be as bland and non-committal as possible.

In the world of trading signals, that's about as useful as a weather forecaster who only ever predicts "partly cloudy with a chance of weather."

Remember, folks: when implementing machine learning, make sure your system is learning from reality, not just from itself!


Have you encountered similarly ironic implementations in your own systems? Share your stories in the comments below!