Retention Index Prediction of Flavor and Fragrance by Multiple Linear Regression and the Genetic Algorithm



The Kovats retention indices of 51 flavor and fragrance compounds are predicted from an optimal set of molecular descriptors. A genetic algorithm, implemented in Perl, selects the optimal descriptors, and a multiple linear regression (MLR) model, built in R, uses them to predict the Kovats retention index. The descriptor values can be calculated efficiently with the free (open source) Online Chemical Database software; Perl and R are also free software. A total of 170 molecular descriptors are obtained for the 51 compounds. From these, the optimal six are selected based on 200 repetitions and used to build the MLR model. The best model has an R-Square value of 0.981, an Adjusted R-Square value of 0.978, and a root mean square error (RMSE) of 43.50. The resulting genetic algorithm-multiple linear regression (GA-MLR) model predicts the Kovats retention index with an average difference of 6.6%. These results demonstrate that the GA-MLR method can predict the Kovats retention index.
Copyright © 2019 Praise Worthy Prize - All rights reserved.


Genetic Algorithm; Molecular Descriptor; Multiple Linear Regression; Kovats Retention Index; Flavor and Fragrance
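
The GA-MLR workflow summarized in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (the abstract states that they used Perl for the genetic algorithm and R for the regression); it is a pure-Python illustration under stated assumptions. The function names, the truncation selection, the crossover and mutation operators, the population size, and the use of adjusted R-Square as the fitness value are all hypothetical choices made here for illustration, and RMSE is assumed to be computed as sqrt(SS_res / n).

```python
import random


def ols_fit(X, y):
    """Least-squares coefficients (intercept first) via the normal equations."""
    rows = [[1.0] + list(r) for r in X]
    p = len(rows[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(r[i] * r[j] for r in rows) for j in range(p)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting
    for c in range(p):
        piv = max(range(c, p), key=lambda r: abs(A[r][c]))
        A[c], A[piv] = A[piv], A[c]
        for r in range(p):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][p] / A[i][i] for i in range(p)]


def adjusted_r2(r2, n, p):
    """Adjusted R-Square for n observations and p predictors."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)


def fit_metrics(X, y):
    """Return (R-Square, adjusted R-Square, RMSE) of an MLR fit of y on X."""
    beta = ols_fit(X, y)
    n, p = len(y), len(X[0])
    pred = [beta[0] + sum(b * x for b, x in zip(beta[1:], row)) for row in X]
    ss_res = sum((a - b) ** 2 for a, b in zip(y, pred))
    mean_y = sum(y) / n
    ss_tot = sum((a - mean_y) ** 2 for a in y)
    r2 = 1 - ss_res / ss_tot
    return r2, adjusted_r2(r2, n, p), (ss_res / n) ** 0.5  # assumed RMSE form


def select_descriptors(X, y, k=6, pop_size=30, generations=40, seed=0):
    """Toy genetic algorithm: search for k descriptor columns with the best fit."""
    rng = random.Random(seed)
    n_desc = len(X[0])

    def score(subset):
        cols = [[row[j] for j in subset] for row in X]
        return fit_metrics(cols, y)[1]  # fitness = adjusted R-Square (assumption)

    pop = [tuple(sorted(rng.sample(range(n_desc), k))) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=score, reverse=True)
        parents = pop[: pop_size // 2]  # keep the better half (elitism)
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            pool = sorted(set(a) | set(b))
            child = set(rng.sample(pool, k))  # crossover: draw k from the union
            if rng.random() < 0.3:  # mutation: swap one descriptor
                child.remove(rng.choice(sorted(child)))
                child.add(rng.choice([j for j in range(n_desc) if j not in child]))
            children.append(tuple(sorted(child)))
        pop = parents + children
    return max(pop, key=score)


if __name__ == "__main__":
    # Synthetic stand-in data: 51 "compounds", 20 random "descriptors".
    rng = random.Random(42)
    X = [[rng.gauss(0, 1) for _ in range(20)] for _ in range(51)]
    y = [1000 + 80 * r[2] - 50 * r[7] + 30 * r[11] + rng.gauss(0, 5) for r in X]
    best = select_descriptors(X, y, k=6)
    print(best, fit_metrics([[row[j] for j in best] for row in X], y))
```

As a consistency check on the reported figures, `adjusted_r2(0.981, 51, 6)` evaluates to roughly 0.978, which agrees with the Adjusted R-Square value quoted in the abstract for 51 compounds and six descriptors. Because every candidate subset has the same size, ranking by adjusted R-Square is equivalent to ranking by R-Square in this sketch.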

