Improving performance evaluations using calibration