?url_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Adc&rft.title=Establecer+relaciones+entre+usuarios+y+sus+intereses+mediante+Web+Scraping&rft.creator=Castro+Blanco%2C+Antonio&rft.contributor=Lara+Cabrera%2C+Ra%C3%BAl&rft.subject=Computer+Science&rft.description=En+la+actualidad+se+puede+adquirir+una+gran+cantidad+de+datos+de+internet%2C+ya+sea+por+cookies+o+cualquier+informaci%C3%B3n+que+puedan+dar+los+usuarios+de+cualquier+p%C3%A1gina+web.+Esto+es+de+vital+importancia+para+fines+comerciales%2C+econ%C3%B3micos+o+de+car%C3%A1cter+personal%2C+debido+a+que+estos+datos+determinan+los+intereses+de+cada+uno+de+los+usuarios+que+navegan+internet+o+los+datos+establecidos+en+p%C3%A1ginas+web.+Un+rastreador+web+tiene+como+funci%C3%B3n+inspeccionar+las+p%C3%A1ginas+de+internet+de+forma+automatizada+para+guardar+copias+enteras+o+partes+de+esas+p%C3%A1ginas.+Esto+hace+que+se+pueda+obtener+continuamente+informaci%C3%B3n+clave+de+los+internautas+de+un+sitio+web%2C+para+su+uso+personal+o+comercial+de+cualquier+empresa+interesada.+Este+Trabajo+de+Fin+de+Grado+ten%C3%ADa+como+objetivo+el+uso+de+web+scraping+para+realizar+una+base+de+datos+centrada+en+relaciones+de+los+intereses+de+los+usuarios+dentro+del+que+era+el+segundo+mayor+foro+de+habla+hispana%2C+Meristation.+Durante+el+transcurso+de+este+proyecto+se+ha+cerrado+el+foro+de+Meristation%2C+por+lo+que+ser%C3%A1n+dos+partes%2C+la+primera+hablando+del+proyecto+de+Meristation+y+la+segunda+se+hablar%C3%A1+del+proyecto+de+3dJuegos.+Meristation+ten%C3%ADa+un+gran+inter%C3%A9s+para+el+uso+de+rastreadores+web%2C+porque+estaba+dedicado+en+su+gran+mayor%C3%ADa+a+los+videojuegos.+Los+usuarios+pod%C3%ADan+crear+un+tema+en+cualquiera+de+sus+subforos%2C+permitiendo+a+los+internautas+comentar+acerca+de+ello.+Esto+consegu%C3%ADa+que+se+pudiera+establecer+una+relaci%C3%B3n+entre+el+inter%C3%A9s+en+videojuegos+o+videoconsolas+y+los+usuarios+que+interactuaban+a+partir+de+eso%2C+resultando+en+informaci%C3%B3n+de+gran+importancia+para+las
+empresas+enfocadas+a+videojuegos+para+poder+realizar+anuncios+de+una+forma+m%C3%A1s+personalizada.+Esta+parte+del+trabajo+se+centraba+sobre+todo+en+la+relaci%C3%B3n+que+hay+entre+los+distintos+subforos+de+Meristation%2C+los+temas+que+puedan+crear+los+usuarios+en+esos+subforos+y+las+respuestas+de+los+dem%C3%A1s+internautas+a+esos+temas%2C+debido+a+que+se+puede+deducir+que+esos+usuarios+estaban+interesados+tanto+en+el+subforo+como+en+el+tema+en+cuesti%C3%B3n+dado+por+el+usuario+que+lo+ha+creado.+Al+conseguir+establecer+las+relaciones+entre+los+usuarios+y+los+temas+de+los+subforos+junto+a+los+comentarios+de+los+dem%C3%A1s+internautas+interesados%2C+se+guarda+en+un+fichero+JSON+para+despu%C3%A9s+poder+manipular+la+informaci%C3%B3n+mediante+una+base+de+datos+de+MongoDB.+En+la+realizaci%C3%B3n+de+este+Trabajo+de+Fin+de+Grado+se+ha+encontrado+un+grave+problema+debido+a+que+se+centraba+en+Meristation+y+dos+d%C3%ADas+despu%C3%A9s+de+haber+realizado+el+proyecto%2C+Meristation+ha+procedido+a+cerrar+el+foro%2C+siendo+el+segundo+m%C3%A1s+grande+de+Espa%C3%B1a+y+el+primero+en+videojuegos.+Esto+implica+que+no+se+podr%C3%ADa+demostrar+el+c%C3%B3digo+realizado+en+el+proyecto%2C+por+lo+que+ha+sido+obligatorio+usar+otra+p%C3%A1gina+web+para+complementar+la+defensa+de+este+TFG.+Por+tanto%2C+varios+apartados+se+han+dividido+en+dos+partes%2C+la+primera+realizada+en+Meristation+y+la+segunda+en+3dJuegos%2C+otro+de+los+foros+m%C3%A1s+importantes+de+videojuegos+en+Espa%C3%B1a.+Su+estructura+es+muy+parecida+a+la+que+exist%C3%ADa+en+la+zonaforo+de+Meristation%2C+por+lo+que+al+poseer+subforos+de+temas+tan+variados+sigue+siendo+un+reclamo+comercial+para+una+gran+cantidad+de+empresas%2C+adem%C3%A1s+de+poder+usar+un+sitio+web+con+el+crecimiento+exponencial+de+audiencia+de+3dJuegos+en+los+%C3%BAltimos+a%C3%B1os.+Para+realizar+este+TFG%2C+primero+se+har%C3%A1+una+breve+introducci%C3%B3n+sobre+crawling%2C+las+formas+de+realizarlo+y+los+beneficios+de+su+uso+ya+sea+a+corto+plazo+como+a+largo+plazo.+Desp
u%C3%A9s%2C+se+relata+acerca+de+los+lenguajes+en+los+que+se+puede+centrar+un+proyecto+de+web+scraping%2C+especialmente+el+lenguaje+usado+para+este+proyecto%2C+Python%2C+y+sus+formas+de+rastrear+webs%2C+detallando+las+m%C3%A1s+importantes.+Tras+explicar+los+distintos+m%C3%A9todos+de+web+Scraping+de+Python%2C+se+explica+por+qu%C3%A9+se+ha+usado+Scrapy.+Se+explica+c%C3%B3mo+funciona+y+c%C3%B3mo+se+ha+utilizado+en+relaci%C3%B3n+al+TFG%2C+ya+sea+en+la+parte+de+Meristation+o+en+la+de+3dJuegos.+Para+acabar+el+apartado+de+desarrollo%2C+se+explica+c%C3%B3mo+se+han+tratado+los+datos+de+rastreo+en+JSON+y+su+posterior+guardado+en+una+base+de+datos+de+MongoDB.+A+continuaci%C3%B3n+se+examinar%C3%A1n+los+resultados+del+proyecto+y+los+problemas+que+ha+podido+causar+en+su+realizaci%C3%B3n.+Por+%C3%BAltimo%2C+al+haber+terminado+se+comprueba+que+se+cumplen+los+objetivos+pactados+antes+de+empezar+el+proyecto+y+se+realiza+una+conclusi%C3%B3n%2C+explicando+los+motivos+y+usos+que+se+podr%C3%ADan+dar+al+haber+cumplido+el+objetivo%2C+adem%C3%A1s+de+una+reflexi%C3%B3n+sobre+el+futuro+del+web+scraping+en+el+mundo.%0D%0AAbstract%3A+%0D%0ANowadays%2C+a+large+amount+of+data+can+be+acquired+from+the+Internet%2C+whether+through+cookies+or+through+any+information+that+the+users+of+a+web+page+provide.+This+is+of+vital+importance+for+commercial%2C+economic+or+personal+purposes%2C+because+these+data+reveal+the+interests+of+each+user+who+browses+the+Internet+or+the+data+published+on+web+pages.+A+web+crawler+inspects+web+pages+in+an+automated+way+and+saves+entire+copies+or+parts+of+those+pages.+This+makes+it+possible+to+continuously+obtain+key+information+about+the+users+of+a+website%2C+for+personal+use+or+for+commercial+use+by+any+interested+company.+This+Final+Degree+Project+aimed+to+use+web+scraping+to+build+a+database+of+the+relationships+between+users+and+their+interests+within+what+was+the+second+largest+Spanish-speaking+forum%2C+Meristation.+During+the+course+of+this+proje
ct%2C+the+Meristation+forum+was+closed%2C+so+the+work+is+presented+in+two+parts%3A+the+first+covers+the+Meristation+project+and+the+second+covers+the+3dJuegos+project.+Meristation+was+of+great+interest+for+web+crawling+because+it+was+mostly+dedicated+to+video+games.+Users+could+create+a+topic+in+any+of+its+sub-forums%2C+allowing+other+users+to+comment+on+it.+This+made+it+possible+to+relate+interest+in+video+games+or+video+consoles+to+the+users+who+interacted+with+those+topics%2C+yielding+information+of+great+value+to+video+game+companies+that+want+to+personalize+their+advertising.+This+part+of+the+work+focused+mainly+on+the+relationship+between+the+different+sub-forums+of+Meristation%2C+the+topics+that+users+could+create+in+those+sub-forums+and+the+responses+of+other+users+to+those+topics%2C+since+it+can+be+inferred+that+those+users+were+interested+in+both+the+sub-forum+and+the+topic+posted+by+the+user+who+created+it.+Once+the+relationships+between+users%2C+the+topics+of+the+sub-forums+and+the+comments+of+other+interested+users+have+been+established%2C+they+are+saved+in+a+JSON+file+so+that+the+information+can+later+be+manipulated+through+a+MongoDB+database.+A+serious+problem+arose+during+this+Final+Degree+Project%3A+it+was+focused+on+Meristation+and%2C+two+days+after+the+project+was+finished%2C+Meristation+closed+its+forum%2C+which+was+the+second+largest+in+Spain+and+the+largest+devoted+to+video+games.+This+meant+that+the+code+written+for+the+project+could+no+longer+be+demonstrated%2C+so+another+website+had+to+be+used+to+support+the+defense+of+this+TFG.+Therefore%2C+several+sections+have+been+divided+into+two+parts%2C+the+first+carried+out+on+Meristation+and+the+second+on+3dJuegos%2C+another+of+the+most+important+video+game+forums+in+Spain.+Its+structure+is+very+similar+to+the+one+that+existed+in+the+M
eristation+forum%2C+so+its+sub-forums+on+such+varied+topics+remain+a+commercial+draw+for+a+large+number+of+companies%2C+with+the+added+benefit+that+the+audience+of+3dJuegos+has+grown+exponentially+in+recent+years.+This+TFG+first+gives+a+brief+introduction+to+crawling%2C+the+ways+of+doing+it+and+the+benefits+of+its+use%2C+both+in+the+short+term+and+in+the+long+term.+It+then+discusses+the+languages+on+which+a+web+scraping+project+can+be+based%2C+especially+the+language+used+for+this+project%2C+Python%2C+and+its+ways+of+crawling+websites%2C+detailing+the+most+important+ones.+After+describing+the+different+web+scraping+methods+available+in+Python%2C+it+explains+why+Scrapy+was+chosen%2C+how+it+works+and+how+it+has+been+used+in+the+TFG%2C+in+both+the+Meristation+part+and+the+3dJuegos+part.+The+development+section+ends+by+explaining+how+the+crawled+data+has+been+processed+in+JSON+and+subsequently+stored+in+a+MongoDB+database.+Next%2C+the+results+of+the+project+and+the+problems+encountered+during+its+implementation+are+examined.+Finally%2C+it+is+verified+that+the+objectives+agreed+before+starting+the+project+have+been+met%2C+and+a+conclusion+is+given%2C+explaining+the+motivation+and+the+uses+that+meeting+the+objective+makes+possible%2C+together+with+a+reflection+on+the+future+of+web+scraping+in+the+world.&rft.publisher=E.T.S.I+de+Sistemas+Inform%C3%A1ticos+(UPM)&rft.rights=https%3A%2F%2Fcreativecommons.org%2Flicenses%2Fby-nc-nd%2F3.0%2Fes%2F&rft.date=2021-07&rft.type=info%3Aeu-repo%2Fsemantics%2FbachelorThesis&rft.type=Final+Project&rft.type=PeerReviewed&rft.format=application%2Fpdf&rft.language=spa&rft.format=application%2Fzip&rft.language=spa&rft.rights=info%3Aeu-repo%2Fsemantics%2FrestrictedAccess&rft.identifier=https%3A%2F%2Foa.upm.es%2F68413%2F