An Adaptive Clustering Approach for Distributed Outlier Detection in Data Streams

Cafaro M.; Pulimeno M.; Epicoco I.
2023-01-01

Abstract

Many real-world problems involve collections of high-dimensional data, i.e., data with many different features. A dataset with a large number of features suffers from the so-called curse of dimensionality: as the dimensionality increases, the volume of the space grows rapidly and the data become sparse. This makes clustering high-dimensional data for outlier detection challenging. In this paper, we design and implement a distributed peer-to-peer version of an algorithm that addresses the curse of dimensionality by generating candidate subspaces from the high-dimensional space through Principal Component Analysis. The experimental results show that, if the parameters of the distributed algorithm are properly set, the distributed algorithm converges to the results provided by the sequential algorithm, which is a fundamental and highly desirable property.
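The abstract does not detail how candidate subspaces are built, so the following is only a minimal illustration of the general idea of using Principal Component Analysis to obtain a low-dimensional subspace. It is not the authors' algorithm: the function names `pca_subspace` and `project`, the synthetic data, and the choice of `k=3` are all assumptions made for this sketch.

```python
import numpy as np

def pca_subspace(X, k):
    """Return the mean and the top-k principal directions (rows) of X."""
    mean = X.mean(axis=0)
    # SVD of the centered data; rows of Vt are principal directions,
    # ordered by decreasing singular value.
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:k]

def project(X, mean, components):
    """Coordinates of X in the subspace spanned by the components."""
    return (X - mean) @ components.T

rng = np.random.default_rng(0)
# Hypothetical data: 200 points in 50 dimensions whose variance is
# concentrated in the first 3 coordinates.
scales = np.full(50, 0.1)
scales[:3] = [5.0, 3.0, 2.0]
X = rng.normal(size=(200, 50)) * scales

# Hypothetical choice of subspace dimension.
mean, components = pca_subspace(X, k=3)
Z = project(X, mean, components)  # shape (200, 3)
```

A clustering or outlier-scoring step would then operate on `Z` instead of `X`, where points are far less sparse than in the original 50-dimensional space.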
2023
ISBN: 978-3-031-20858-4
ISBN: 978-3-031-20859-1

Use this identifier to cite or link to this document: https://hdl.handle.net/11587/479964