Graph Neural Networks

  • Graph Random Neural Features for Distance-Preserving Graph Representations.
    D. Zambon, C. Alippi, L. Livi International Conference on Machine Learning (2020)
    Graph Random Neural Features (GRNF) is a novel embedding method from graph-structured data to real vectors based on a family of graph neural networks. GRNF can be used within traditional processing methods or as a training-free input layer of a graph neural network. The theoretical guarantees that accompany GRNF ensure that the considered graph distance is metric, hence allowing to distinguish any pair of non-isomorphic graphs, and that GRNF approximately preserves its metric structure.
  • Spectral Clustering with Graph Neural Networks for Graph Pooling.
    F. M. Bianchi, D. Grattarola, C. Alippi International Conference on Machine Learning (2020)
    We propose a graph clustering approach that addresses some limitations of the spectral clustering algorithm. We formulate a continuous relaxation of the normalized minCUT problem and train a GNN to compute cluster assignments that minimize this objective. Our GNN-based implementation is differentiable, does not require to compute the spectral decomposition, and learns a clustering function that can be quickly evaluated on out-of-sample graphs. From the proposed clustering method, we design a graph pooling operator that overcomes some important limitations of state-of-the-art graph pooling techniques and achieves the best performance in several supervised and unsupervised tasks.
  • Hierarchical Representation Learning in Graph Neural Networks with Node Decimation Pooling.
    F. M. Bianchi, D. Grattarola, L. Livi, C. Alippi IEEE Transactions on Neural Networks and Learning Systems (2020)
    We propose Node Decimation Pooling (NDP), a pooling operator for GNNs that generates coarsened versions of a graph by leveraging on its topology only. During training, the GNN learns new representations for the vertices and fits them to a pyramid of coarsened graphs, which is computed in a pre-processing step. As theoretical contributions, we first demonstrate the equivalence between the MAXCUT partition and the node decimation procedure on which NDP is based. Then, we propose a procedure to sparsify the coarsened graphs for reducing the computational complexity in the GNN; we also demonstrate that it is possible to drop many edges without significantly altering the graph spectra of coarsened graphs.
  • Graph Neural Networks with Convolutional ARMA Filters.
    F. M. Bianchi, D. Grattarola, L. Livi, C. Alippi IEEE Transactions on Pattern Analysis and Machine Intelligence (2021)
    We propose a novel graph convolutional layer based on auto-regressive moving average (ARMA) filters that, compared to the polynomial ones, provide a more flexible response thanks to a rich transfer function that accounts for the concept of state. We implement the ARMA filter with a recursive and distributed formulation, obtaining a convolutional layer that is efficient to train, is localized in the node space and can be applied to graphs with different topologies.
  • Learning in Non-Stationary Environments

  • Graph Edit Networks.
    B. Paassen, D. Grattarola, D.Zambon, C. Alippi, B. Hammer International Conference on Learning Representations (2021)
    While graph neural networks have made impressive progress in classification and regression on graphs, few approaches to date perform time series prediction on graphs and those that do are mostly limited to edge changes. We suggest that graph edits are a more natural interface for graph-to-graph learning. In particular, graph edits are general enough to describe any graph-to-graph change, not only edge changes, they are sparse, making them easier to understand for humans and more efficient computationally, and they are local, avoiding the need for pooling layers in graph neural networks. In this paper, we propose a simple linear layer - the graph edit network - which takes node embeddings as input and generates a sequence of graph edits that transform the input graph to the output graph. Theoretically, we show that a mapping between the node sets of two graphs is sufficient to construct training data for a graph edit network and that an optimal mapping yields edit scripts that are almost as short as the graph edit distance between the graphs. We further provide a proof-of-concept.
  • Change-Point Methods on a Sequence of Graphs.
    D. Zambon, C. Alippi, L. Livi IEEE Transactions on Signal Processing (2019)
    Given a finite sequence of graphs, we propose a methodology to identify possible changes in stationarity in the stochastic process that generated such graphs. We consider a general family of attributed graphs for which both topology (vertices and edges) and associated attributes are allowed to change over time, without violating the stationarity hypothesis. Novel Change-Point Methods (CPMs) are proposed that map graphs onto vectors, apply a suitable statistical test in vector space and detect changes –if any– according to a user-defined confidence level; an estimate for the change point is provided as well. We ground our methods on theoretical results that show how the inference in the numerical vector space is related to the one in graph domain, and vice-versa.
  • Deep Learning for Time Series Forecasting: The Electric Load Case.
    Alberto Gasparin, Slobodan Lukovic, C. Alippi Preprint (2019)
    We review and experimentally evaluate on two real-world datasets the most recent trends in electric load forecasting, by contrasting deep learning architectures on short term forecast (one day ahead prediction). Specifically, we focus on feed-forward and recurrent neural networks, sequence to sequence models and temporal convolutional neural networks along with architectural variants.
  • Autoregressive Models for Sequences of Graphs.
    D. Zambon, D. Grattarola, L. Livi, C. Alippi International Joint Conference on Neural Networks (2019)
    This paper proposes an autoregressive (AR) model for sequences of graphs, which generalizes traditional AR models. A first novelty consists in formalizing the AR model for a very general family of graphs, characterized by a variable topology, and attributes associated with nodes and edges. A graph neural network is also proposed to learn the AR function associated with the graph-generating process, and subsequently predict the next graph in a sequence.
  • Change Detection in Graph Streams by Learning Graph Embeddings on Constant-Curvature Manifolds.
    D. Grattarola, D. Zambon, L. Livi, C. Alippi IEEE Transactions on Neural Networks and Learning Systems (2019)
    We focus on the problem of detecting changes in stationarity in a stream of attributed graphs. To this end, we introduce a novel change detection framework based on neural networks and Constant-curvature manifolds (CCMs), that takes into account the non-Euclidean nature of graphs. Our contribution in this work is twofold. First, via a novel approach based on adversarial learning, we compute graph embeddings by training an autoencoder to represent graphs on CCMs. Second, we introduce two novel change detection tests operating on CCMs.
  • Anomaly and Change Detection in Graph Streams through Constant-Curvature Manifold Embeddings.
    D. Zambon, L. Livi, C. Alippi IEEE International Joint Conference on Neural Networks (2018)
    We investigate how embedding graphs on constant-curvature manifolds (hyper-spherical and hyperbolic manifolds) impacts on the ability to detect changes in sequences of attributed graphs. The proposed methodology consists in embedding graphs into a geometric space and perform change detection there by means of conventional methods for numerical streams.
  • Concept Drift and Anomaly Detection in Graph Streams.
    D. Zambon, C. Alippi, L. Livi IEEE Transactions on Neural Networks and Learning Systems (2018)
    We consider stochastic processes generating graphs and propose a methodology for detecting changes in stationarity of such processes. The methodology acts by embedding every graph of the stream into a vector domain, where a conventional multivariate change detection procedure can be easily applied. We ground the soundness of our proposal by proving several theoretical results.
  • Detecting Changes in Sequences of Attributed Graphs.
    D. Zambon, L. Livi, C. Alippi IEEE Symposium Series on Computational Intelligence (2017)
    We consider a methodology for detecting changes in sequences of graphs. Changes are recognized by embedding each graph into a vector space, where conventional change detection procedures exist and can be easily applied. We introduce the methodology and focus on expanding experimental evaluations on controlled yet relevant examples involving geometric graphs and Markov chains.
  • Reinforcement Learning

  • Deep Reinforcement Learning with Weighted Q-Learning.
    A. Cini, C. D'Eramo, J. Peter, C. Alippi Preprint (2020)
    Overestimation of the maximum action-value is a well-known problem that hinders Q-Learning performance, leading to suboptimal policies and unstable learning. Among several Q-Learning variants proposed to address this issue, Weighted Q-Learning (WQL) effectively reduces the bias and shows remarkable results in stochastic environments. WQL uses a weighted sum of the estimated action-values, where the weights correspond to the probability of each action-value being the maximum; however, the computation of these probabilities is only practical in the tabular settings. In this work, we provide the methodological advances to benefit from the WQL properties in Deep Reinforcement Learning (DRL), by using neural networks with Dropout Variational Inference as an effective approximation of deep Gaussian processes. In particular, we adopt the Concrete Dropout variant to obtain calibrated estimates of epistemic uncertainty in DRL. We show that model uncertainty in DRL can be useful not only for action selection, but also action evaluation. We analyze how the novel Weighted Deep Q-Learning algorithm reduces the bias wrt relevant baselines and provide empirical evidence of its advantages on several representative benchmarks.
  • Dynamics of RNNs

  • Learn to Synchronize, Synchronize to Learn.
    P. Verzelli, C. Alippi, L. Livi Preprint (2020)
    In recent years, the machine learning community has seen a continuous growing interest in research aimed at investigating dynamical aspects of both training procedures and perfected models. Of particular interest among recurrent neural networks, we have the Reservoir Computing (RC) paradigm for its conceptual simplicity and fast training scheme. Yet, the guiding principles under which RC operates are only partially understood. In this work, we study the properties behind learning dynamical systems with RC and propose a new guiding principle based on Generalized Synchronization (GS) granting its feasibility. We show that the well-known Echo State Property (ESP) implies and is implied by GS, so that theoretical results derived from the ESP still hold when GS does. However, by using GS one can profitably study the RC learning procedure by linking the reservoir dynamics with the readout training. Notably, this allows us to shed light on the interplay between the input encoding performed by the reservoir and the output produced by the readout optimized for the task at hand. In addition, we show that - as opposed to the ESP - satisfaction of the GS can be measured by means of the Mutual False Nearest Neighbors index, which makes effective to practitioners theoretical derivations.
  • Input Representation in Recurrent Neural Networks Dynamics.
    P. Verzelli, C. Alippi, L. Livi, P. Tino Preprint (2020)
    Reservoir computing is a popular approach to design recurrent neural networks, due to its training simplicity and its approximation performance. The recurrent part of these networks is not trained (e.g. via gradient descent), making them appealing for analytical studies, raising the interest of a vast community of researcher spanning from dynamical systems to neuroscience. It emerges that, even in the simple linear case, the working principle of these networks is not fully understood and the applied research is usually driven by heuristics. A novel analysis of the dynamics of such networks is proposed, which allows one to express the state evolution using the controllability matrix. Such a matrix encodes salient characteristics of the network dynamics: in particular, its rank can be used as an input-indepedent measure of the memory of the network. Using the proposed approach, it is possible to compare different architectures and explain why a cyclic topology achieves favourable results.
  • Echo State Networks with Self-Normalizing Activations on the Hyper-Sphere.
    P. Verzelli, C. Alippi, L. Livi Nature Scientific Reports (2019)
    We propose a model of echo state networks that eliminates critical dependence on hyper-parameters, resulting in networks that provably cannot enter a chaotic regime and, at the same time, denotes nonlinear behavior in phase space characterized by a large memory of past inputs, comparable to the one of linear networks. Our contribution is supported by experiments corroborating our theoretical findings, showing that the proposed model displays dynamics that are rich-enough to approximate many common nonlinear systems used for benchmarking.
  • A Characterization of the Edge of Criticality in Binary Echo State Networks.
    P. Verzelli, L. Livi, C. Alippi IEEE International Workshop on Machine Learning for Signal Processing (2018)
    We propose binary echo state networks (ESNs), which are architecturally equivalent to standard ESNs but consider binary activation functions and binary recurrent weights. For these networks, we derive a closed-form expression for the edge of criticality (EoC) in the autonomous case and perform simulations in order to assess their behavior in the case of noisy neurons and in the presence of a signal. We propose a theoretical explanation for the fact that the variance of the input plays a major role in characterizing the EoC.
  • Others

  • Adversarial Autoencoders with Constant-Curvature Latent Manifolds.
    D. Grattarola, L. Livi, C. Alippi Applied Soft Computing (2019)
    We introduce the constant-curvature manifold adversarial autoencoder (CCM-AAE), a probabilistic generative model trained to represent a data distribution on a constant-curvature Riemannian manifold (CCM). Our method works by matching the aggregated posterior of the CCM-AAE with a probability distribution defined on a CCM, so that the encoder implicitly learns to represent data on the CCM to fool the discriminator network. The geometric constraint is also explicitly imposed by jointly training the CCM-AAE to maximize the membership degree of the embeddings to the CCM.