Short answer
A trading probability is calibrated when the numbers it outputs match the outcomes that matter downstream. If a model says a setup has a 70 percent chance of success, then trades in that bucket should behave roughly like that over a relevant validation window, not just in a polished in-sample chart.
In trading, calibration usually has to go one step further than generic machine learning. The useful question is not only whether the direction was right, but whether the forecast was reliable enough to support a threshold, a size decision, or an expected-value comparison after friction.