Short answer
You compare a backtest against a real baseline by turning the idea into a repeatable decision rule, attaching realistic turnover and risk constraints, and checking whether the result still holds up once the flattering assumptions are removed.
In a research workflow, the version worth keeping is the one that survives a clear benchmark, realistic execution assumptions, and a portfolio context that does not quietly change the rules after the backtest is done.
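As a rough illustration of removing one flattering assumption, the sketch below applies a fixed decision rule to a return series, charges a cost on every position change, and compares the result with a buy-and-hold baseline. The function names, the linear cost model, and all numbers are hypothetical and chosen only for illustration; a real study would substitute its own cost and benchmark definitions.

```python
from typing import List, Sequence


def net_returns(
    positions: Sequence[float],
    asset_returns: Sequence[float],
    cost_per_unit_turnover: float,
) -> List[float]:
    """Per-period strategy returns after a linear turnover cost.

    positions[t] is the exposure held during period t (for example 1.0 for
    long, 0.0 for flat), decided before that period's return is known.
    The cost model (cost proportional to the absolute change in position)
    is an assumption made for this sketch.
    """
    if len(positions) != len(asset_returns):
        raise ValueError("positions and asset_returns must have equal length")
    result = []
    previous = 0.0  # assume the strategy starts with no position
    for position, asset_return in zip(positions, asset_returns):
        turnover = abs(position - previous)
        result.append(position * asset_return - cost_per_unit_turnover * turnover)
        previous = position
    return result


def cumulative_return(returns: Sequence[float]) -> float:
    """Compound a sequence of simple per-period returns."""
    growth = 1.0
    for r in returns:
        growth *= 1.0 + r
    return growth - 1.0


if __name__ == "__main__":
    # Illustrative data only, not real market returns.
    asset = [0.01, -0.02, 0.03, -0.01]
    rule_positions = [1.0, 0.0, 1.0, 0.0]  # output of some fixed decision rule
    baseline_positions = [1.0] * len(asset)  # buy and hold as the baseline

    gross = net_returns(rule_positions, asset, cost_per_unit_turnover=0.0)
    net = net_returns(rule_positions, asset, cost_per_unit_turnover=0.005)
    baseline = net_returns(baseline_positions, asset, cost_per_unit_turnover=0.005)

    print("gross strategy:", cumulative_return(gross))
    print("net strategy:  ", cumulative_return(net))
    print("baseline:      ", cumulative_return(baseline))
```

The comparison that matters is the net strategy against the baseline under the same cost assumption, not the gross figure; a rule that trades every period can lose much of its apparent edge once turnover is charged.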