Start from the signal's failure modes
A context layer is worth testing when it describes why the base strategy works better or worse in certain environments. That is a different standard from simply asking whether the extra feature correlates with returns somewhere in history.
For example, a short-horizon mean-reversion sleeve often cares about volatility spikes, book quality, and crowding stress long before it cares about a slower macro narrative.