Learning Recurring Concepts from Data Streams in Ubiquitous Environments

Bártolo Gomes, Joao Paulo (2011). Learning Recurring Concepts from Data Streams in Ubiquitous Environments. Thesis (Doctoral), Facultad de Informática (UPM) [former name]. https://doi.org/10.20868/UPM.thesis.9858.

Description

Title:	Learning Recurring Concepts from Data Streams in Ubiquitous Environments
Author(s):	Bártolo Gomes, Joao Paulo
Supervisor(s):	Menasalvas Ruiz, Ernestina (https://orcid.org/0000-0002-5615-6798); Sousa, Pedro
Document type:	Thesis (Doctoral)
Date of defense:	2011
Subjects:	Computer Science
SDG:	09. Industry, innovation and infrastructure
School:	Facultad de Informática (UPM) [former name]
Department:	Lenguajes y Sistemas Informáticos e Ingeniería del Software
Creative Commons license:	Attribution - NoDerivatives - NonCommercial

Full text

JoaoPaulo_Barloto_Gomes.pdf: PDF (Portable Document Format), 2MB

Abstract

Due to recent scientific and technological advances in information systems, it is now possible to continuously record data at high speeds on a wide range of devices. The need to make sense of such massive amounts of data opens an opportunity to create new data stream classification techniques to model and predict the behavior of streaming data. When learning from data streams, the problem of concept drift means that the underlying data distributions can change over time. This has a strong impact on classification techniques, as predictive models become invalid and have to be updated. Furthermore, these changes in concept are usually a consequence of changes in context, and this relationship could be exploited to handle concept drift.

Recurring concepts are a particular case of concept drift, in which concepts that have drifted can suddenly reoccur. In this situation it may be possible to avoid relearning these previously observed concepts. However, the few existing approaches that take advantage of concept recurrence are designed neither to take context into consideration nor to take into account the resources required to store representations of past concepts. Both issues are of particular significance for ubiquitous data stream mining, where the learning process is executed in dynamically changing environments using resource-constrained devices. Moreover, most existing techniques assume that the underlying data stream feature space is static. However, in many real-world applications the set of features and their relevance to the target concept may change over time. Despite its importance, this issue has received little attention, particularly regarding how it can be efficiently addressed when tracking recurring concepts.

Sharing knowledge among ubiquitous devices to collaboratively improve the modeling of local concepts is another interesting idea that has not been properly explored. This could improve the accuracy of the local model, as it would benefit from patterns similar to the local concept that were observed in other ubiquitous devices but not yet locally. In addition, the deployment of data stream classification as an autonomous and adaptive service to support the data analysis requirements of ubiquitous applications remains an open and under-researched issue in the field of ubiquitous data stream mining.

This PhD thesis addresses the aforementioned open issues, focusing on learning anytime, anywhere classification models from data streams in ubiquitous environments, where the underlying concepts may change over time, with special emphasis on recurring concepts. Four main contributions are presented:

- The MReC (Mining Recurring Concepts) approach, which integrates context with previously learned concepts to improve the adaptation to recurring concepts. Moreover, to deal with situations of resource constraints, an intelligent strategy to discard models is also proposed.
- The MReC-DFS (Mining Recurring Concepts in a Dynamic Feature Space) approach, which extends MReC to address the challenges of a dynamic feature space while simultaneously reducing the memory cost of storing past models. In addition, a novel incremental feature selection method is proposed that dynamically determines the threshold used to select the most relevant features for a certain concept.
- A Collaborative Data Stream Mining (Coll-Stream) approach that explores the knowledge available in the community to improve local classification accuracy. Coll-Stream integrates community knowledge using an ensemble method in which the classifiers are selected and weighted based on their local accuracy for different partitions of the instance space.
- A UDSM (Ubiquitous Data Stream Mining) Service to support the data analysis requirements of ubiquitous applications. As the basis for our service we describe a general mechanism, which autonomously adapts the execution of the data stream classification process to each situation, using context and resource awareness.

Finally, the experimental validation of the proposed contributions using synthetic and real datasets allows us to achieve the objectives and answer the research questions proposed for this dissertation.

More information

Record ID:	9858
DC identifier:	https://oa.upm.es/9858/
OAI identifier:	oai:oa.upm.es:9858
DOI:	10.20868/UPM.thesis.9858
Deposited by:	Archivo Digital UPM 2
Deposited on:	20 Dec 2011 07:29
Last modified:	10 Oct 2022 09:20
