• Data-Augmented Machine Learning for In Vitro Starch Digestibility Prediction and SHAP-Based Interpretation of Structure–Rheology Features
  • Yeonsong Nam*,#, Sehyeon Jin**,#, Yerin Hyun*, Cheng Li***, Ji Hun Park****, *****, ******, Jongbin Lim*, *******,†, and Yun Am Seo**,†

  • *Department of Food Bioengineering, Jeju National University, Jeju 63243, Korea
    **Department of Data Science, Jeju National University, Jeju 63243, Korea
    ***Food & Nutritional Sciences Programme, School of Life Sciences, The Chinese University of Hong Kong, Shatin 999077, Hong Kong
    ****Department of Science Education, Ewha Womans University, Seoul 03760, Korea
    *****Institute for Multiscale Matter and Systems, Ewha Womans University, Seoul 03760, Korea
    ******Ecogear Inc. Jeju Factory, Jeju 63359, Korea
    *******Interdisciplinary Graduate Program in Advanced Convergence Technology and Science, Jeju National University, Jeju 63243, Korea

  • Reproduction, storage in a retrieval system, or transmission in any form of any part of this publication is permitted only with written permission from the Polymer Society of Korea.

  • Polymer (Korea)
  • Frequency: Bimonthly (odd)
    ISSN 2234-8077 (Online)
    Abbr. Polym. Korea
  • 2024 Impact Factor: 0.6
  • Indexed in SCIE

This Article

  • 2026; 50(3): 467-476

    Published online May 25, 2026

  • DOI: 10.7317/pk.2026.50.3.467
  • Received on Jan 19, 2026
  • Revised on Feb 12, 2026
  • Accepted on Feb 14, 2026

Correspondence to

  • Jongbin Lim*, *******, Yun Am Seo**
  • *Department of Food Bioengineering, Jeju National University, Jeju 63243, Korea
    **Department of Data Science, Jeju National University, Jeju 63243, Korea
    *******Interdisciplinary Graduate Program in Advanced Convergence Technology and Science, Jeju National University, Jeju 63243, Korea

  • E-mail: jongbinlim@jejunu.ac.kr, seoya@jejunu.ac.kr