The ALAMO approach to machine learning
Nick Sahinidis

A central problem in modern computational science is learning an algebraic model from data obtained from simulations or experiments. We present a methodology designed to learn models that are as accurate and as simple as possible from a small number of data points. The approach relies on integer programming techniques to build low-complexity models. These models are then improved systematically by using derivative-free optimization solvers to adaptively sample new simulation or experimental points. Physical constraints and insights are imposed on the model through the solution of semi-infinite optimization and global optimization subproblems. The methodology has been implemented in ALAMO, a software package for the automated learning of algebraic models. We present extensive computational results with ALAMO and compare it against a variety of sampling and machine learning techniques, including Latin hypercube sampling, simple least-squares regression, and the lasso.
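To make the idea of a model that is "as accurate and as simple as possible" concrete, the following is a minimal sketch of subset selection over a small library of candidate basis functions. It is not ALAMO's implementation: exhaustive enumeration stands in for the integer programming formulation, and the basis library, the error tolerance, the test function, and all names (`BASIS`, `fit_subset`, `simplest_accurate_model`) are assumptions made for illustration.

```python
from itertools import combinations

import numpy as np

# Hypothetical library of candidate basis functions (an illustrative choice).
BASIS = {
    "x": lambda x: x,
    "x^2": lambda x: x**2,
    "x^3": lambda x: x**3,
    "exp(x)": np.exp,
    "log(x)": np.log,
    "1/x": lambda x: 1.0 / x,
}


def fit_subset(x, y, names):
    """Least-squares fit of y on the chosen basis functions; returns (coef, rmse)."""
    A = np.column_stack([BASIS[n](x) for n in names])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    rmse = float(np.sqrt(np.mean((A @ coef - y) ** 2)))
    return coef, rmse


def simplest_accurate_model(x, y, max_terms=3, tol=1e-6):
    """Return the smallest subset whose RMSE is below tol.

    Subsets are enumerated by increasing size; within a size, the subset
    with the lowest RMSE is the one checked against the tolerance.
    """
    for k in range(1, max_terms + 1):
        fits = [(fit_subset(x, y, names), names) for names in combinations(BASIS, k)]
        (coef, rmse), names = min(fits, key=lambda f: f[0][1])
        if rmse < tol:
            return names, coef, rmse
    return None  # no subset of the allowed size met the tolerance


# Assumed noise-free "simulation" output: y = 2*x^2 + 1/x at 20 sample points.
x = np.linspace(0.5, 3.0, 20)
y = 2.0 * x**2 + 1.0 / x

names, coef, rmse = simplest_accurate_model(x, y)
print(names, coef, rmse)
```

Enumeration is only practical for small libraries, since the number of subsets grows combinatorially; that growth is the reason a formulation based on integer programming is attractive for this selection step. With noisy data, a fixed tolerance would typically be replaced by a criterion that trades fit against the number of terms.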