?url_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Adc&rft.title=Predicci%C3%B3n+de+la+variedad+del+vino+aplicando+t%C3%A9cnicas+de+Machine+Learning&rft.creator=Cachata+Huapaya%2C+Cristhian+Alexis&rft.contributor=G%C3%B3mez+Canaval%2C+Sandra&rft.subject=Food+Science+and+Technology&rft.subject=Computer+Science&rft.description=La+clasificaci%C3%B3n+de+texto+es+una+forma+de+extracci%C3%B3n+de+informaci%C3%B3n+a+partir+de+un+escrito+que+permite+catalogar+el+contenido+del+mismo+en+diversas+clases.+En+este+proyecto+se+presenta+el+desarrollo+de+un+sistema+que%2C+usando+t%C3%A9cnicas+de+Machine+Learning%2C+permite+la+clasificaci%C3%B3n+de+distintas+descripciones+de+vinos+seg%C3%BAn+su+variedad.+Dichas+descripciones+forman+parte+de+las+valoraciones+de+expertos+procedentes+de+una+p%C3%A1gina+web+de+los+Estados+Unidos+de+Am%C3%A9rica+cuyos+datos+est%C3%A1n+disponibles+en+el+sitio+web+Kaggle.+El+inter%C3%A9s+de+este+conjunto+de+datos+radica%2C+entre+otras+cosas%2C+en+los+comentarios+realizados+sobre+el+vino+respecto+a+la+variedad+de+uva+asociada+a+cada+una+de+las+descripciones.+En+particular%2C+este+Proyecto+realiza+un+proceso+de+pretratamiento+de+un+conjunto+de+textos+iniciales+que+consiste+en+la+eliminaci%C3%B3n+de+palabras+en+las+descripciones+que+coincidan+con+la+variedad%2C+as%C3%AD+como+tambi%C3%A9n+la+eliminaci%C3%B3n+de+cualquier+signo+ortogr%C3%A1fico.+A+continuaci%C3%B3n%2C+se+construyen+diversos+modelos%2C+se+eval%C3%BAan+y+se+optimizan+para+posteriormente+seleccionar+el+que+obtenga+los+mejores+resultados+a+la+luz+de+los+valores+exhibidos+para+las+m%C3%A9tricas+de+calidad+aplicadas.+Cabe+resaltar+que%2C+aunque+se+han+encontrado+diversos+problemas%2C+entre+los+que+se+encuentran+la+poca+cantidad+de+datos+que+se+dispone+y+el+gran+desbalanceo+entre+las+distintas+clases%2C+el+modelo+seleccionado+consigui%C3%B3+predecir+correctamente+el+75%25+de+las+muestras+de+comentarios+sobre+su+calidad.+Finalmente%2C+se+considera+que+los+resultados+obtenidos+por+este+Proyecto+son+interesantes+para+trasladar+el+conocimiento+adquirido+y+el+modelo+inicial+a+otros+conjuntos+de+datos+que+hayan+sido+correctamente+preprocesados.+Igualmente+ser%C3%ADa+interesante+trasladar+el+modelo+a+escenarios+de+uso+similares%2C+como+por+ejemplo+a+datasets+con+informaci%C3%B3n+de+la+cata+de+otros+productos+como+el+t%C3%A9+o+el+caf%C3%A9.%0D%0AAbstract%3A%0D%0AText+classification+is+a+way+of+information+extraction+from+a+document+that+allows+its+content+to+be+cataloged+in+various+classes.+This+project+presents+the+development+of+a+system+that%2C+using+Machine+Learning+techniques%2C+allows+the+classification+of+different+wine+descriptions+according+to+their+variety.+These+descriptions+are+part+of+the+evaluations+of+experts+from+a+website+in+the+United+States+of+America+whose+data+are+available+on+the+Kaggle+website.+The+interest+of+this+data+set+lies%2C+among+other+things%2C+in+the+comments+made+on+the+wine+with+regard+to+the+grape+variety+associated+with+each+of+the+descriptions.+In+particular%2C+this+Project+performs+a+pre-treatment+process+of+a+set+of+original+texts+consisting+of+the+removal+of+words+in+the+descriptions+that+match+the+variety%2C+as+well+as+the+removal+of+any+punctuation+marks.+Then%2C+various+models+are+constructed%2C+evaluated%2C+optimized%2C+and+selected+to+obtain+the+best+results+according+to+the+values+displayed+for+the+applied+quality+metrics.+It+is+worth+noting+that%2C+although+several+problems+were+found%2C+including+the+small+amount+of+data+available+and+the+large+imbalance+between+the+different+classes%2C+the+selected+model+managed+to+correctly+predict+75%25+of+the+sample+comments+on+its+quality.+Finally%2C+it+is+considered+that+the+results+obtained+by+this+Project+are+interesting+to+transfer+the+knowledge+acquired+and+the+initial+model+to+other+data+sets+that+have+been+correctly+pre-processed.+It+would+also+be+interesting+to+transfer+the+model+to+similar+use+scenarios%2C+such+as+datasets+with+tasting+information+on+other+products+
such+as+tea+or+coffee.&rft.publisher=E.T.S.I+de+Sistemas+Inform%C3%A1ticos+(UPM)&rft.rights=https%3A%2F%2Fcreativecommons.org%2Flicenses%2Fby-nc-nd%2F3.0%2Fes%2F&rft.date=2020-07&rft.type=info%3Aeu-repo%2Fsemantics%2FbachelorThesis&rft.type=Final+Project&rft.type=PeerReviewed&rft.format=application%2Fpdf&rft.language=spa&rft.format=application%2Fzip&rft.language=spa&rft.rights=info%3Aeu-repo%2Fsemantics%2FrestrictedAccess&rft.identifier=https%3A%2F%2Foa.upm.es%2F64362%2F