Monte carlo cross validation for response surface benchmark
Andre Beschorner, Matthias Voigt and Konrad Vogeler
When a meta-model is fitted to data, it is essential to validate it with a trustworthy criterion. Many industry-relevant applications are characterized by a high-dimensional input parameter space and by time-consuming, cost-intensive deterministic computations. In these typical cases the number of samples is not much larger than the number of coefficients in the polynomial equation of the meta-model, so the Coefficient of Determination (CoD) overestimates the model quality.
Cross-validation (CV) offers an alternative benchmark criterion for response surfaces, one that rarely overestimates the meta-model quality.
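To make the overestimation concrete, the CoD can be written in its common textbook form. The symbols below ($N$ samples, $p$ polynomial coefficients, observations $y_i$, meta-model predictions $\hat{y}_i$, sample mean $\bar{y}$) are assumptions chosen for illustration; the paper's exact definition is not reproduced here.

```latex
\mathrm{CoD} \;=\; 1 - \frac{\sum_{i=1}^{N} \left( y_i - \hat{y}_i \right)^2}
                            {\sum_{i=1}^{N} \left( y_i - \bar{y} \right)^2}
```

When the same $N$ points are used both to fit and to evaluate the polynomial, the residuals $y_i - \hat{y}_i$ shrink as $N$ approaches $p$, and for $N = p$ with a full-rank design matrix a least-squares fit interpolates the data, giving $\mathrm{CoD} = 1$ regardless of how well the model predicts new points. In cross-validation the residuals are instead evaluated on points that were withheld from the fit, which removes this effect.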
This paper uses a published industry-relevant example [4] to compare the simple CoD with a Monte Carlo cross-validation CoD (CoD_MCCV). The MCCV method is investigated in detail using a fast-running test model with characteristics similar to those of the original deterministic model. The results show how the splitting ratio, the number of cross-validation runs, and the number of deterministic samples in the database influence the CoD_MCCV values and their variance. To predict this variance, we give correlations that allow the MCCV calculation parameters to be adjusted within the code. The discussion points out the importance of the sample-to-coefficients ratio (SCR) and summarizes the advantages and disadvantages of the tested method.
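The comparison described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the full quadratic polynomial basis, the synthetic test function, the noise level, and all names (`quadratic_features`, `cod_mccv`, `train_fraction`, `n_runs`) are assumptions made for illustration. The splitting ratio is interpreted here as the fraction of samples used for fitting in each random split.

```python
import numpy as np

def quadratic_features(X):
    """Full quadratic basis: 1, x_i, and x_i * x_j for i <= j (assumed model form)."""
    n_samples, n_inputs = X.shape
    cols = [np.ones(n_samples)]
    cols += [X[:, i] for i in range(n_inputs)]
    cols += [X[:, i] * X[:, j]
             for i in range(n_inputs) for j in range(i, n_inputs)]
    return np.column_stack(cols)

def fit(X, y):
    """Least-squares fit of the polynomial coefficients."""
    coeffs, *_ = np.linalg.lstsq(quadratic_features(X), y, rcond=None)
    return coeffs

def predict(coeffs, X):
    return quadratic_features(X) @ coeffs

def cod(y_true, y_pred):
    """Coefficient of Determination: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return 1.0 - ss_res / ss_tot

def cod_mccv(X, y, train_fraction=0.8, n_runs=200, seed=None):
    """Monte Carlo cross-validation: repeat random train/test splits and
    evaluate the CoD on the held-out samples only.
    Returns the mean and the sample standard deviation over all runs."""
    rng = np.random.default_rng(seed)
    n = len(y)
    n_train = int(round(train_fraction * n))
    scores = np.empty(n_runs)
    for r in range(n_runs):
        perm = rng.permutation(n)
        train, test = perm[:n_train], perm[n_train:]
        coeffs = fit(X[train], y[train])
        scores[r] = cod(y[test], predict(coeffs, X[test]))
    return scores.mean(), scores.std(ddof=1)

# Hypothetical test model: 5 inputs, linear response plus noise.
rng = np.random.default_rng(0)
n_samples, n_inputs = 30, 5
X = rng.uniform(-1.0, 1.0, size=(n_samples, n_inputs))
y = X @ np.array([1.0, -2.0, 0.5, 0.0, 3.0]) + 0.5 * rng.standard_normal(n_samples)

n_coeffs = quadratic_features(X).shape[1]   # 21 coefficients for 5 inputs
scr = n_samples / n_coeffs                  # sample-to-coefficients ratio

cod_plain = cod(y, predict(fit(X, y), X))   # evaluated on the fitting data
cod_cv_mean, cod_cv_std = cod_mccv(X, y, train_fraction=0.8, n_runs=200, seed=1)

print(f"SCR = {scr:.2f}")
print(f"CoD (fit data)  = {cod_plain:.3f}")
print(f"CoD_MCCV (mean) = {cod_cv_mean:.3f}, std = {cod_cv_std:.3f}")
```

Varying `n_samples`, `train_fraction`, and `n_runs` in this sketch gives a feel for the three influences the paper studies: with an SCR close to 1 the plain CoD stays near 1, while the cross-validated value drops and scatters widely between splits.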
Beschorner, A., Voigt, M., Vogeler, K., 2014.
"Monte carlo cross validation for response surface benchmark".
Proceedings of the 12th International Probabilistic Workshop, 2014.
(online version: doi: 10.1466/20141125.01)