When AI Learns to Be Mediocre: A Tale of Feedback Gone Wrong
The Discovery
It started with a simple observation: our fancy trading signal dashboard was showing 31,753 neutral signals, 0 sell signals, and a suspiciously uniform distribution. Something was clearly wrong with our "intelligent" signal generation system.
"That's odd," I thought, "the signals aren't diverse enough 🙂"
Little did I know I was about to uncover one of the most ironic implementations of machine learning I've ever seen.
The Investigation
Diving into the codebase, I found our signal generation system with its proudly implemented "feedback loop" - a mechanism designed to make the system smarter over time by adjusting the weights of different technical indicators.
The comments were promising:
/**
* The feedback loop that adjusts indicator weights.
* If an indicator's signal agrees with the final signal, its weight increases.
* Otherwise, its weight is reduced. The weights are then normalized.
*/
Sounds reasonable, right? Reward indicators that agree with the final signal, penalize those that don't. Classic reinforcement learning... or so I thought.
The Hilarious Truth
As I examined the implementation more closely, I realized what was actually happening:
const alignment = detail.signal === finalSignal ? 1 : -1;
// Update weight using a simple multiplicative factor.
this.indicatorWeights[detail.indicator] *=
1 + this.learningRate * alignment * detail.strength;
Wait a minute... the system was adjusting weights based on whether an indicator agreed with the final signal - which itself was just a weighted average of all the indicators!
This wasn't reinforcement learning. This was a popularity contest.
The Democracy of Dumb Indicators
Imagine you have 10 people in a room voting on whether it's raining outside:
- 8 people have no windows and are just guessing
- 2 people can actually see outside and confirm it's pouring rain
In a normal scenario, you'd listen to the 2 people who can see outside.
But our system was designed to:
- Take a vote from all 10 people
- Go with the majority opinion (8 votes for "not raining")
- Then increase the voting power of everyone who voted with the majority
- And decrease the voting power of the two people who actually saw the rain
The next time it rains, those two truthful observers have even less influence!
The Self-Reinforcing Cycle of Mediocrity
Over time, our system had:
- Boosted the weights of indicators that frequently output NEUTRAL signals
- Decimated the weights of any indicators brave enough to suggest a directional trade
- Created a system that was structurally incapable of producing diverse signals
The most absurd part? The system never checked if its signals were actually profitable or correct - it just assumed that whatever the majority of indicators said must be right.
Lessons Learned
This cautionary tale teaches us several important lessons about designing feedback systems:
-
Validate Against Reality: A feedback loop needs to validate predictions against actual outcomes, not just internal consensus.
-
Beware of Self-Reinforcing Biases: Systems that reward agreement with themselves will amplify any initial biases.
-
Value Diversity of Opinion: Sometimes the minority view is correct, especially in markets where contrarian positions can be profitable.
-
Check Your Assumptions: The comment describing the feedback loop sounded reasonable, but the implementation created the opposite of what was intended.
The Fix
The solution? Either:
- Disable the feedback loop entirely
- Reset the weights to their initial values
- Implement a proper feedback mechanism that validates signals against actual market outcomes
For now, we've gone with option 1, and suddenly our system is producing a much more diverse and useful set of signals.
Conclusion
Sometimes the most educational bugs are the ones that make you laugh. Our "learning" system had indeed learned something - it had learned to be as bland and non-committal as possible.
In the world of trading signals, that's about as useful as a weather forecaster who only ever predicts "partly cloudy with a chance of weather."
Remember, folks: when implementing machine learning, make sure your system is learning from reality, not just from itself!
Have you encountered similarly ironic implementations in your own systems? Share your stories in the comments below!
