Treffer: Benchmarking the continuous behavior of stream processing techniques on the problem of enumerating weakly connected components
Weitere Informationen
The problem of enumerating subgraphs satisfying some properties (e.g., patterns and mo- tifs) corresponds to listing such subgraphs. In general, because of the large/huge graphs re- lating objects in different domains (biology, mobile networks, social networks, genetics, etc.), waiting until achieving the whole enumeration of subgraphs satisfying some properties can be too expensive and unmanageable for some families of problems. Thus, the possibility of incrementally enumerating these subgraphs (i.e., a subgraph is returned as soon as they are identified) rises as a promising approach for some families of problems (e.g., network analysis and motif identification). In effect, following a search approach over the listed subgraphs allows stopping the enumeration when it is convenient. The implementation of incremental enumer- ation solutions suits well to stream processing paradigms and frameworks. In this work, as a proof of concept, we conduct a comparative study of the continuous behavior of two stream processing-based solutions for the problem of incrementally enumerating weakly connected components of (bounded) undirected graphs. One of these solutions is implemented using the Go programming language following the Dynamic Pipeline Paradigm, and the other solution is implemented on top of the DataSet API provided by the Apache Flink Framework. Finally, we analyze and report experiments against large graphs having up to 2, 746 connected compo- nents. Obtained results satisfy our expectations. To assess the incremental delivery of results, we measure the the diefficiency metrics, i.e., the continuous efficiency of the implementation of the algorithm for generating incremental results. ; El problema de l'enumeració dels subgrafs que satisfan determinades propietats (per exem- ple patrons i disenys) correspon a la mostra d'aquests subgrafs. En general, a causa dels grafs grans/enormes que relacionen objectes en diferents dominis (biologia, xarxes mòbils, xarxes socials, genètica, etc.), esperar fins a aconseguir ...