Accurate benchmarking is vital for identifying the best-performing computational models. However, benchmarks, especially in biology, are often ad hoc and seldom predefined or standardized. We developed the Evaluation Framework For predicting Efficiency of Cancer Treatment (EFFECT) benchmark suite, based on the DepMap and GDSC data sets, to facilitate the comparison of machine learning (ML) models that predict gene essentiality and/or drug sensitivity of in vitro cancer cell lines. We show that standard evaluation metrics such as Pearson correlation are easily misled by inherent biases in the data.
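To illustrate how a pooled Pearson correlation can be misled, the following minimal sketch uses purely synthetic data, not DepMap or GDSC, and is not the EFFECT evaluation procedure. It assumes one plausible form of bias: each drug has its own baseline sensitivity shared by all cell lines. The function names (`make_toy_data`, `mean_within_drug_pearson`) and all parameter values are hypothetical. The toy "model" knows only the per-drug baseline and nothing about individual cell lines.

```python
import numpy as np


def pearson(x, y):
    """Pearson correlation of two 1-D arrays; NaN if either is constant."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc = x - x.mean()
    yc = y - y.mean()
    denom = np.sqrt((xc ** 2).sum() * (yc ** 2).sum())
    return float((xc * yc).sum() / denom) if denom > 0 else float("nan")


def make_toy_data(n_drugs=50, n_lines=100, seed=0):
    """Synthetic drug x cell-line matrices (assumed setup, not real data).

    Observed values are a per-drug baseline plus cell-line-specific noise.
    Predictions are the same baseline plus *independent* noise, so the toy
    model carries no information about which cell lines respond.
    """
    rng = np.random.default_rng(seed)
    drug_means = rng.normal(0.0, 3.0, size=(n_drugs, 1))
    observed = drug_means + rng.normal(0.0, 1.0, size=(n_drugs, n_lines))
    predicted = drug_means + rng.normal(0.0, 1.0, size=(n_drugs, n_lines))
    return observed, predicted


def mean_within_drug_pearson(observed, predicted):
    """Average of the Pearson correlations computed separately per drug."""
    scores = [pearson(o, p) for o, p in zip(observed, predicted)]
    return float(np.nanmean(scores))


if __name__ == "__main__":
    obs, pred = make_toy_data()
    # Pooled over all (drug, cell line) pairs: dominated by drug baselines.
    print("pooled Pearson:     ", round(pearson(obs.ravel(), pred.ravel()), 3))
    # Per drug, then averaged: reflects cell-line-specific signal only.
    print("within-drug Pearson:", round(mean_within_drug_pearson(obs, pred), 3))
```

Under these assumptions the pooled correlation comes out high while the within-drug correlation stays close to zero, because the between-drug variance dominates the pooled statistic. Stratifying the metric by drug (or by gene) is one way to separate the two effects; other corrections are possible.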