The Graphical Model Zoo

There are many ways to express complicated probability distributions. Some ways are more general than others.

This post collates the sources I used to make the figure depicting the hierarchy of probabilistic graphical models. Some excerpts have been pasted verbatim, but others are edited for readability. Please let me know if you find any errors in the hierarchy.

The Graphical Model Zoo

Energy-based models

Energy-Based Models (EBMs) describe dependencies between random variables by associating an energy (a scalar) with every possible configuration of the variables (LeCun et al., 2006). EBMs do not need to be expressed as a graph, but they often are, e.g. as Boltzmann Machines. Any joint probability distribution can be defined in terms of a suitably expressive energy function, including one whose structure is given by a graph of conditional dependencies between the variables. Hence, here we treat EBMs as the largest class of models.
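
As a concrete illustration (the standard Gibbs/Boltzmann form, not a quotation from LeCun et al.), an energy function $E(x)$ induces a joint distribution

$$ p(x) = \frac{\exp\big(-E(x)\big)}{Z}, \qquad Z = \sum_{x'} \exp\big(-E(x')\big), $$

and conversely any strictly positive distribution can be written this way by taking $E(x) = -\log p(x)$; configurations with zero probability correspond to infinite energy.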

Sources:

[Probabilistic] Graphical Models

equivalent to Structured Probabilistic Models.

According to Bishop (Chapter 8), in a probabilistic graphical model, each node represents a random variable (or group of random variables), and the links express probabilistic relationships between these variables. The graph then captures the way in which the joint distribution over all of the random variables can be decomposed into a product of factors each depending only on a subset of the variables.
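
For example (a standard textbook illustration rather than a quotation from Bishop), the directed chain $a \rightarrow b \rightarrow c$ encodes the factorisation

$$ p(a, b, c) = p(a)\, p(b \mid a)\, p(c \mid b), $$

where each factor depends only on a node and its parents.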

Sources:

Dependency Networks

Each node in a dependency network is associated with a conditional probability table, which determines the distribution of that node's random variable given its parents.

Neville and Jensen (2007): “The primary distinction between Bayesian networks, Markov networks, and dependency networks (DNs) is that dependency networks are an approximate representation. DNs approximate the joint distribution with a set of conditional probability distributions (CPDs) that are learned independently. This approach to learning results in significant efficiency gains over exact models. However, because the CPDs are learned independently, DNs are not guaranteed to specify a consistent joint distribution, where each CPD can be derived from the joint distribution using the rules of probability.”
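
To make the independently learned CPDs concrete, here is a minimal sketch of a dependency network over binary variables, with logistic regressions as the per-variable CPDs and ordered pseudo-Gibbs sampling for approximate inference. The variable names and the choice of scikit-learn logistic regression are my own illustration, not taken from Neville and Jensen.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def fit_dependency_network(X):
    """Fit one CPD per variable, each learned independently of the others.

    X: (n_samples, n_vars) array of 0/1 values.
    Returns a list of classifiers, one modelling P(X_i | all other variables).
    """
    cpds = []
    for i in range(X.shape[1]):
        others = np.delete(X, i, axis=1)          # condition on every other variable
        cpds.append(LogisticRegression().fit(others, X[:, i]))
    return cpds

def pseudo_gibbs_sample(cpds, x0, n_sweeps=100, seed=0):
    """Ordered pseudo-Gibbs sampling: repeatedly resample each variable
    from its own (independently learned) conditional distribution."""
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=int)
    for _ in range(n_sweeps):
        for i, cpd in enumerate(cpds):
            others = np.delete(x, i).reshape(1, -1)
            p_one = cpd.predict_proba(others)[0, 1]  # P(X_i = 1 | rest)
            x[i] = int(rng.random() < p_one)
    return x
```

Because each CPD is fit separately, these conditionals need not be consistent with any single joint distribution, which is exactly the approximation Neville and Jensen describe.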

Sources:

Relational Dependency Networks

Wikipedia: Relational dependency networks (RDNs) are graphical models which extend dependency networks to account for relational data. Relational data is data organized into one or more tables, which are cross-related through common fields. A relational database is the canonical example of a system that serves to maintain relational data. A relational dependency network can be used to characterize the knowledge contained in a database.

Sources:

Factor Graphs

Bayesian networks and Markov Random Fields can both be represented as factor graphs: a Bayesian network's conditional distributions become factors directly, while the Hammersley–Clifford theorem guarantees that a (positive) Markov Random Field factorises into clique potentials, which likewise become factors.
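
In factor-graph notation (standard notation, not specific to any one source), a distribution over variables $x_1, \dots, x_n$ is written as a product of factors, each attached to a subset $S_j$ of the variables:

$$ p(x_1, \dots, x_n) = \frac{1}{Z} \prod_{j} f_j(x_{S_j}), $$

with variable nodes connected to exactly those factor nodes in whose arguments they appear.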

Sources:

Markov Random Fields

equivalent to Undirected Graphical Models, Markov Networks, and Consistent Dependency Networks.

The Hammersley–Clifford theorem implies that (positive) Markov Random Fields factorise into clique potentials and can therefore be expressed as factor graphs. Loeliger (2004) demonstrates equivalent ways of expressing MRFs as factor graphs. Heckerman et al. (2000) show that they are equivalent to consistent dependency networks.
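
Explicitly (the standard clique-potential form), a Markov Random Field factorises over the cliques $C$ of its undirected graph:

$$ p(x) = \frac{1}{Z} \prod_{C} \psi_C(x_C), \qquad Z = \sum_{x} \prod_{C} \psi_C(x_C), $$

and reading each potential $\psi_C$ as a factor gives the corresponding factor graph.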

Sources:

Bayesian Networks

equivalent to Directed Graphical Models, Bayes Networks, Belief Networks, Decision Networks, Bayes(ian) Models and Probabilistic Directed Acyclic Graphical Models.

A Bayesian network can be expressed as a factor graph by introducing one factor per conditional distribution; Loeliger (2004) demonstrates equivalent ways to express Bayesian nets as factor graphs.
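
Written out (the standard form), a Bayesian network over $x_1, \dots, x_n$ with parent sets $\mathrm{pa}(i)$ factorises as

$$ p(x_1, \dots, x_n) = \prod_{i=1}^{n} p\big(x_i \mid x_{\mathrm{pa}(i)}\big), $$

and treating each conditional distribution as a factor yields a factor-graph representation directly.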

Sources:

Hidden Markov Models & Naive Bayes Classifiers & Variable-order Markov Models

Each of these is a special case of a Bayesian network; a worked HMM factorisation follows the sources below.

Sources:
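
As an illustration (standard HMM notation, with my own choice of symbols), a hidden Markov model with hidden states $z_{1:T}$ and observations $x_{1:T}$ is the Bayesian network

$$ p(x_{1:T}, z_{1:T}) = p(z_1)\, p(x_1 \mid z_1) \prod_{t=2}^{T} p(z_t \mid z_{t-1})\, p(x_t \mid z_t), $$

i.e. a chain-structured directed graph in which each observation depends only on its hidden state.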

Gibbs Random Field

A Markov Random Field whose joint probability density is positive everywhere is also referred to as a Gibbs random field because, by the Hammersley–Clifford theorem, it can then be represented by a Gibbs measure for an appropriate (locally defined) energy function.
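
In symbols (the standard statement), positivity lets us write each clique potential as $\psi_C(x_C) = \exp\big(-E_C(x_C)\big)$, so the joint distribution takes the Gibbs form

$$ p(x) = \frac{1}{Z} \exp\Big(-\sum_{C} E_C(x_C)\Big), $$

with the total energy decomposing into locally defined clique energies.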

Sources:

Constraint Graph

Factor graphs are generalisations of constraint graphs. As far as I can see, no other class of graphical model can be expressed in terms of a constraint graph, but I could easily be wrong; I couldn’t find any good, free sources on constraint graphs.
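
One way to see the relationship, offered here as my own illustration rather than from a cited source: a constraint on variables $x_{S_j}$ can be read as an indicator factor

$$ f_j(x_{S_j}) = \begin{cases} 1 & \text{if the constraint is satisfied,} \\ 0 & \text{otherwise,} \end{cases} $$

so the product $\prod_j f_j(x_{S_j})$ is nonzero exactly on the satisfying assignments; a factor graph reduces to a constraint graph when all of its factors are restricted to $\{0, 1\}$ values.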

Sources:

Conditional Random Field

A variant of a Markov Random Field in which each random variable may additionally be conditioned on a set of global observations, so that the model specifies a conditional distribution over the variables given those observations.
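
In the usual notation, a CRF defines a conditional distribution over outputs $y$ given observations $x$:

$$ p(y \mid x) = \frac{1}{Z(x)} \prod_{C} \psi_C(y_C, x), $$

where the normaliser $Z(x)$ depends on the observations, so no model of $p(x)$ itself is required.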

Sources:

Ising Model

The Ising model is the canonical Markov random field; indeed, the Markov random field was introduced as the general setting for the Ising model by Kindermann and Snell (1980).
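
Concretely (standard statistical-physics notation, in units where $k_B = 1$), the Ising model places spins $s_i \in \{-1, +1\}$ on the nodes of a graph, with energy and distribution

$$ E(s) = -\sum_{\langle i, j \rangle} J_{ij}\, s_i s_j - \sum_i h_i s_i, \qquad p(s) \propto \exp\big(-E(s)/T\big), $$

where the first sum runs over edges; each edge contributes a pairwise clique potential, which is what makes it a Markov random field.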

Sources:

Neural Random Field

Neural random fields are defined by using neural networks to implement the potential functions of a Markov random field; a sketch of this idea follows the sources below.

Sources:
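
Below is a minimal sketch of that idea, assuming PyTorch and a pairwise model in which a small MLP produces the (negative log-) potential for each edge; the architecture and names are my own illustrative choices, not a reference implementation.

```python
import torch
import torch.nn as nn

class NeuralPairwisePotential(nn.Module):
    """A small MLP that maps a pair of variable values to a scalar energy,
    i.e. it implements -log psi_ij(x_i, x_j) for one edge of the MRF."""
    def __init__(self, hidden=16):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2, hidden), nn.Tanh(), nn.Linear(hidden, 1)
        )

    def forward(self, xi, xj):
        return self.net(torch.stack([xi, xj], dim=-1)).squeeze(-1)

def total_energy(x, edges, potentials):
    """Unnormalised energy of a configuration x (shape: [n_vars]):
    the sum of the learned edge energies. exp(-energy) is proportional
    to the joint probability; the partition function is left implicit."""
    return sum(pot(x[i], x[j]) for (i, j), pot in zip(edges, potentials))

# Example: a 3-variable chain MRF with one neural potential per edge.
edges = [(0, 1), (1, 2)]
potentials = [NeuralPairwisePotential() for _ in edges]
x = torch.tensor([0.2, -1.3, 0.7])
print(total_energy(x, edges, potentials))
```

Any differentiable architecture could play the role of the potential; a pairwise MLP is just the simplest structured choice.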

Continuous Hopfield-like Networks

These include networks such as the EBM first studied by Bengio and Fischer (2015), where the neural network used to implement the conditional dependencies between variables is a linear network. The overall network is nevertheless nonlinear due to the saturating threshold functions that constrain the variables to [0,1].

Sources:

Boltzmann Machine

A Boltzmann machine of finite size is a Gibbs random field provided weights of infinite magnitude are disallowed: the energy of every configuration is then finite, so the joint probability density is positive everywhere.

They are the stochastic version of the original Hopfield Networks.
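
For completeness (the standard formulation, with my own symbol choices), a Boltzmann machine over binary units $s \in \{0, 1\}^n$ with symmetric weights $W$ (zero diagonal) and biases $b$ has

$$ E(s) = -\tfrac{1}{2} s^\top W s - b^\top s, \qquad p(s) = \frac{\exp\big(-E(s)\big)}{Z}, $$

and Gibbs sampling updates each unit with probability $p(s_i = 1 \mid s_{\setminus i}) = \sigma\big(\sum_j W_{ij} s_j + b_i\big)$, where $\sigma$ is the logistic function.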

Sources:

Written on September 28, 2021