ML for ASO
ML, a branch of artificial intelligence that employs a variety of statistical, probabilistic, and
optimization techniques, allows computers to learn from past experience and to detect hard-to-
discern patterns in large, noisy, or complex data sets. The aim of ML is to develop general-purpose
algorithms that automatically detect patterns in complex data through a training process and then
use the discovered patterns to make predictions for future, unseen data. ML is therefore a powerful
tool that allows researchers to generalize from limited data rather than exhaustively examining
all the possibilities.
Cross-validation is used to test the accuracy of the models.
ML steps:
Around 10,000 profiles are generated, each defined by a distinct set of 12 PARSEC parameters.
XFOIL simulations were performed to obtain the aerodynamic coefficients (Cl max, Cd min, Cl/Cd max, and Cd at Cl max)
for these profiles.
A database is generated from these results.
A machine learning algorithm is used to perform a regression on the training data set to build the
model.
The model with the best R-squared value and the minimum RMSE, MAE, and error rate is chosen.
The model is tested on unseen data (the test data set) and its accuracy is checked.
The model is then tuned over a grid of tree depth, learning rate, and number of iterations, which
improves its accuracy.
The model is finalized on the basis of its performance on the whole, training, and test data sets.
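The evaluation metrics named in the steps above (R-squared, RMSE, and MAE) can be sketched in plain Python. This is a minimal, stdlib-only illustration with made-up numbers, not the actual XFOIL data or the CatBoost implementation:

```python
import math

def r_squared(y_true, y_pred):
    # Coefficient of determination: 1 - SS_res / SS_tot
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def rmse(y_true, y_pred):
    # Root mean squared error
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def mae(y_true, y_pred):
    # Mean absolute error
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

# Hypothetical Cl max targets vs. model predictions (illustrative values only)
y_true = [1.20, 1.35, 1.50, 1.10]
y_pred = [1.18, 1.38, 1.47, 1.12]
print(r_squared(y_true, y_pred), rmse(y_true, y_pred), mae(y_true, y_pred))
```

The model with the highest R-squared and the lowest RMSE and MAE on the test set is preferred, as described above.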
To avoid overfitting, we use the standard shuffled 10-fold cross-validation technique to get an
unbiased estimate of the model performance. This was implemented in Python using the
scikit-learn library and the associated CatBoost algorithm.
In each fold, 90% of the data (representing 9,000 profiles and their respective aero-coefficients)
was used for training purposes, while the remaining 10% was used for validation of the model.
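The shuffled 10-fold split described above can be sketched with the standard library alone; the fold logic mirrors what scikit-learn's `KFold(n_splits=10, shuffle=True)` produces, without requiring the library:

```python
import random

def shuffled_kfold(n_samples, n_folds=10, seed=0):
    """Yield (train_idx, test_idx) index pairs for shuffled k-fold CV."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)  # shuffle once, then slice into folds
    fold_size = n_samples // n_folds
    for k in range(n_folds):
        test = idx[k * fold_size:(k + 1) * fold_size]
        train = idx[:k * fold_size] + idx[(k + 1) * fold_size:]
        yield train, test

# For 10,000 profiles, each fold holds 1,000 validation profiles (10%)
# and 9,000 training profiles (90%), matching the split described above.
folds = list(shuffled_kfold(10_000))
print(len(folds), len(folds[0][0]), len(folds[0][1]))  # 10 9000 1000
```

Every profile appears in exactly one validation fold, which is what makes the averaged score an unbiased estimate of performance.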
The search space for the 3 hyperparameters of the CatBoost Regressor (CBR) was discretized.
For each choice of the 3 parameters, the CatBoost Regressor was trained on each of the 10
subsets of data; the average R-squared over all 10 subsets was calculated and used as the
measure of accuracy.
The optimal values of the 3 parameters were determined (learning rate = 0.15;
number of iterations = 1000; max depth of decision trees = 8).
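The grid search over the discretized space can be sketched as follows. The candidate values are illustrative, and `mean_cv_r2` is a hypothetical placeholder for training a CatBoost regressor on each of the 10 folds and averaging the R-squared; here it is a toy function constructed to peak at the optimum reported above:

```python
import itertools

# Illustrative discretized search space; the reported optimum
# (learning rate 0.15, 1000 iterations, depth 8) is included.
learning_rates = [0.05, 0.10, 0.15, 0.20]
iterations = [500, 1000, 1500]
depths = [4, 6, 8, 10]

def mean_cv_r2(lr, n_iter, depth):
    """Placeholder for: train the CatBoost Regressor with these parameters
    on each of the 10 folds and return the average R-squared. This toy
    stand-in simply peaks at the optimum reported in the text."""
    return -((lr - 0.15) ** 2 + ((n_iter - 1000) / 1000) ** 2 + ((depth - 8) / 10) ** 2)

# Exhaustively score every parameter combination and keep the best one.
best = max(itertools.product(learning_rates, iterations, depths),
           key=lambda p: mean_cv_r2(*p))
print(best)  # (0.15, 1000, 8)
```

In the real pipeline, each call to the scoring function involves 10 model trainings, so discretizing the grid coarsely keeps the tuning cost manageable.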
The CBR was trained one final time using the optimal set of parameters and the entire
dataset; again, 90% of the data (9,000 profiles and their respective aero-coefficients) was
used for training and the remaining 10% was used for validation.