My understanding of Graphical Models

What makes them unique?

  • They can deal with missing values, which is interesting: inference simply marginalizes over whatever isn't observed (see the first sketch after this list)
  • They can be used for "abduction", searching for "the best possible explanation" of an observed outcome - in inference terms, a MAP query (also in the first sketch)
  • They explicitly model every feature's effect on potentially every other feature (all features in the undirected, Markov-network case; at least the features a variable has a causal connection with, in the Bayesian-network case). This creates a combinatorial explosion of conditional probability distributions to estimate, which makes sophisticated-enough models either really slow or impossible to train (see the arithmetic after this list)
  • It also makes them more versatile models - since they learn a full joint distribution, they don't just learn what's strictly necessary for the specific task you're training them for, and you can query the same model in any direction. That makes them somewhat "portable"?
  • For continuous variables, you need to specify the distribution those values come from, or transform your data to be (roughly) Gaussian and then choose a Gaussian network
  • In the continuous case, it's not straightforward to train models that can capture non-linear relationships (I haven't found any implementation of that in Python)
  • Because of that, it's very common to quantize your data, which brings back the combinatorial explosion, as well as some potentially problematic distribution assumptions (see the binning example after this list)
  • So it makes a lot of sense that people use them in contexts with little data and/or a strong need for explainability
  • But I assume they'll always underperform (or are simply inappropriately costly to train) in contexts where lots of data are available
  • Unless someone finds the "backprop" learning method for Graphical Models
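
To make the first two bullets concrete, here is a minimal sketch using pgmpy (the rain/sprinkler structure and all CPD values are invented purely for illustration; the newest pgmpy releases may call the model class DiscreteBayesianNetwork): querying with partial evidence marginalizes over whatever is missing, and map_query performs the abduction.

    from pgmpy.models import BayesianNetwork
    from pgmpy.factors.discrete import TabularCPD
    from pgmpy.inference import VariableElimination

    model = BayesianNetwork([("Rain", "WetGrass"), ("Sprinkler", "WetGrass")])
    model.add_cpds(
        TabularCPD("Rain", 2, [[0.8], [0.2]]),
        TabularCPD("Sprinkler", 2, [[0.6], [0.4]]),
        # P(WetGrass | Rain, Sprinkler): one column per parent combination
        TabularCPD("WetGrass", 2,
                   [[1.0, 0.1, 0.2, 0.01],    # WetGrass = 0
                    [0.0, 0.9, 0.8, 0.99]],   # WetGrass = 1
                   evidence=["Rain", "Sprinkler"], evidence_card=[2, 2]),
    )

    infer = VariableElimination(model)
    # Missing value: Sprinkler is unobserved, so it simply gets marginalized out
    print(infer.query(["WetGrass"], evidence={"Rain": 1}))
    # Abduction: the most probable explanation of the observed outcome
    print(infer.map_query(["Rain", "Sprinkler"], evidence={"WetGrass": 1}))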
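
The combinatorial explosion is easy to put in numbers (the feature counts below are picked arbitrarily): a discrete model's parameter count is exponential in the number of parents per node, and the full joint is exponential in the number of features.

    # A full joint over 30 binary features vs. one node's CPT with 10 binary parents
    n_features, n_states, n_parents = 30, 2, 10
    print(n_states ** n_features - 1)              # joint table: ~1.07e9 free parameters
    print((n_states - 1) * n_states ** n_parents)  # that single CPT: 1024 free parameters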
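
As for the quantization workaround, it's usually just a binning step like the one below (bin count chosen arbitrarily). Note that every variable you cut into k states multiplies the affected tables' sizes by k, and the bin edges themselves smuggle in distributional assumptions.

    import numpy as np
    import pandas as pd

    x = np.random.default_rng(0).normal(size=1_000)
    binned = pd.cut(x, bins=8, labels=False)  # 8 equal-width bins -> a discrete variable with 8 states
    print(pd.Series(binned).value_counts().sort_index())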