Please use this identifier to cite or link to this item:
http://repositorio.yachaytech.edu.ec/handle/123456789/734
Title: | Deep learning neural network development for the classification of bacteriocin sequences produced by lactic acid bacteria |
Authors: | Cruz Varela, Jonathan David; Gudiño Gomezjurado, Marco Esteban; González Bohórquez, Lady Laura |
Keywords: | Neural networks; Bacteriocin; Embedding vectors; Deep learning |
Issue Date: | Mar-2024 |
Publisher: | Universidad de Investigación de Tecnología Experimental Yachay |
Abstract: | The emergence of antibiotic-resistant bacteria demands the exploration of natural compounds with innovative mechanisms to replace existing antibiotics. Bacteriocins offer promising alternatives for developing therapeutic and disease-prevention strategies in animal husbandry, aquaculture, and human health, especially those produced by lactic acid bacteria (LAB), which are recognized as GRAS (generally recognized as safe) and QPS (qualified presumption of safety). In this study, a deep learning neural network was used for the binary classification of bacteriocin sequences, distinguishing those produced by LAB. This type of network can learn complex patterns and representations from data. Features were extracted using the k-mer method and embedding vectors (EV). Ten different groups combining embedding vectors and k-mers were tested: 'EV', 'EV+3-mers', 'EV+5-mers', 'EV+7-mers', 'EV+15-mers', 'EV+20-mers', 'EV+3-mers+5-mers', 'EV+3-mers+7-mers', 'EV+5-mers+7-mers', and 'EV+15-mers+20-mers'. Five sets of 100 k-mers characteristic of bacteriocins produced by LAB were obtained, for k = 3, 5, 7, 15, and 20. A significant difference was observed between the 'EV' group and the 'EV+5-mers+7-mers' group, with the latter showing better accuracy and loss. Using k-fold cross-validation with k = 30, the average loss, accuracy, precision, recall, and F1 score were 9.900%, 90.143%, 90.300%, 90.100%, and 90.100%, respectively. Fold 22 stood out with a loss of 8.500%, an accuracy of 91.471%, and precision, recall, and F1 score of 91.000%. This performance agrees with the existing literature. |
Description: | The rise of antibiotic-resistant bacteria creates a pressing need to explore new natural compounds with innovative mechanisms to replace existing antibiotics. Bacteriocins offer promising alternatives for developing therapeutic and preventive strategies in livestock, aquaculture, and human health. In particular, those produced by lactic acid bacteria (LAB) are recognized as GRAS (generally recognized as safe) and QPS (qualified presumption of safety). This study used a deep learning neural network for binary classification of bacteriocin amino acid sequences, distinguishing those produced by LAB. This type of network can learn complex patterns and representations of data. Features were extracted using the k-mer method and vector embedding. Ten different groups combining embedding vectors (EV) and k-mers were tested: 'EV', 'EV+3-mers', 'EV+5-mers', 'EV+7-mers', 'EV+15-mers', 'EV+20-mers', 'EV+3-mers+5-mers', 'EV+3-mers+7-mers', 'EV+5-mers+7-mers', and 'EV+15-mers+20-mers'. As a result, five sets of 100 characteristic k-mers unique to bacteriocins produced by LAB were obtained, for k = 3, 5, 7, 15, and 20. A significant difference was observed between the 'EV' group and the 'EV+5-mers+7-mers' group, with the latter showing superior accuracy and loss. Using k-fold cross-validation with k = 30, the average loss, accuracy, precision, recall, and F1 score were 9.900%, 90.143%, 90.300%, 90.100%, and 90.100%, respectively. Fold 22 stood out with 8.500% loss, 91.471% accuracy, and 91.000% precision, recall, and F1 score. This performance agrees with the existing literature. |
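Illustrative note: the k-mer feature extraction mentioned in the description can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' implementation. The record does not say how the 100 characteristic k-mers per value of k were selected, so the ranking rule used here (k-mers that occur in LAB sequences and never in non-LAB sequences, ordered by how many LAB sequences contain them) is an assumption. The function names and the toy sequences are hypothetical.

```python
from collections import Counter


def kmers(seq, k):
    """Return all overlapping substrings of length k in seq."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def characteristic_kmers(positive, negative, k, top_n=100):
    """Select k-mers found in positive sequences but in no negative sequence.

    Assumed rule (not taken from the record): rank candidates by the number
    of positive sequences that contain them, breaking ties alphabetically.
    """
    neg_seen = {km for s in negative for km in kmers(s, k)}
    counts = Counter()
    for s in positive:
        # set() so each sequence contributes at most once per k-mer
        counts.update(set(kmers(s, k)) - neg_seen)
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [km for km, _ in ranked[:top_n]]


def presence_vector(seq, vocab):
    """Binary feature vector: 1 if the k-mer occurs in seq, else 0."""
    return [1 if km in seq else 0 for km in vocab]


# Hypothetical toy sequences, not real bacteriocins.
lab = ["MKKIEKLTEK", "MKKLEKLSEK"]
non_lab = ["MAGFLKVVQL", "MKKAGFLKVQ"]

vocab = characteristic_kmers(lab, non_lab, k=3, top_n=3)
features = [presence_vector(s, vocab) for s in lab + non_lab]
```

Because `presence_vector` only checks substring membership, vocabularies for different k can be concatenated, which is one plausible way to build a combined group such as 'EV+5-mers+7-mers' before appending the embedding vector.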
URI: | http://repositorio.yachaytech.edu.ec/handle/123456789/734 |
Appears in Collections: | Biomedicina |
Files in This Item:
There are no files associated with this item.