From Deep Learning of Disentangled Representations to Higher-level Cognition

One of the main challenges for AI remains unsupervised learning, at which humans are much better than machines, and which we link to another challenge: bringing deep learning to higher-level cognition. We review earlier work on the notion of learning disentangled representations and deep generative models and propose research directions towards learning of high-level abstractions. This follows the ambitious objective of disentangling the underlying causal factors explaining the observed data. We argue that in order to efficiently capture these, a learning agent can acquire information by acting in the world, moving our research from traditional deep generative models of given datasets to autonomous learning or unsupervised reinforcement learning. We propose two priors which could be used by an agent acting in its environment in order to help discover such high-level disentangled representations of abstract concepts. The first one is based on the discovery of independently controllable factors, i.e., on jointly learning policies and representations, such that each of these policies can independently control one aspect of the world (a factor of interest) computed by the representation while keeping the other uncontrolled aspects mostly untouched. This idea naturally brings to the fore the notions of objects (which are controllable), agents (which control objects) and self. The second prior is called the consciousness prior and is based on the hypothesis that our conscious thoughts are low-dimensional objects with a strong predictive or explanatory power (or are very useful for planning). A conscious thought thus selects a few abstract factors (using the attention mechanism which brings these variables to consciousness) and combines them to make a useful statement or prediction.
In addition, the concepts brought to consciousness often correspond to words or short phrases and the thought itself can be transformed (in a lossy way) into a brief linguistic expression, like a sentence. Natural language could thus be used as an additional hint about the abstract representations and disentangled factors which humans have discovered to explain their world. Some conscious thoughts also correspond to the kind of small nugget of knowledge (like a fact or a rule) which have been the main building blocks of classical symbolic AI. This, therefore, raises the interesting possibility of addressing some of the objectives of classical symbolic AI focused on higher-level cognition using the deep learning machinery augmented by the architectural elements necessary to implement conscious thinking about disentangled causal factors.
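The "independently controllable factors" idea above — reward each policy for changing one factor of the representation while leaving the others untouched — can be illustrated with a toy selectivity score. The function name, array shapes, and exact formula below are illustrative assumptions, not the talk's precise formulation:

```python
import numpy as np

def selectivity(f_before, f_after, k, eps=1e-8):
    """Fraction of the total change in the representation that is
    concentrated in factor k (1.0 = policy moved only factor k)."""
    change = np.abs(f_after - f_before)
    return change[k] / (change.sum() + eps)

# A policy that moved only factor 1 scores near 1.0 ...
f0 = np.array([0.2, 0.5, 0.9])
f1 = np.array([0.2, 1.5, 0.9])
print(round(selectivity(f0, f1, k=1), 3))  # close to 1.0

# ... while a policy that perturbed every factor scores near 1/K.
f2 = np.array([0.7, 1.0, 1.4])
print(round(selectivity(f0, f2, k=1), 3))  # close to 1/3
```

In the jointly-trained setting the talk describes, a score like this would serve as an intrinsic reward encouraging policy k and factor k to align.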

Nguyễn Ngọc Ly 🍀 says:

You can turn artificial neural networks inside-out by using fixed dot products (weighted sums) and adjustable (parametric) activation functions. The fixed dot products can be computed very quickly using fast transforms like the FFT, and the overall number of parameters required is vastly reduced. The dot products of the transform act as statistical summary measures, ensuring good behaviour. See fast transform (fixed filter bank) neural networks.
The variance equation for linear combinations of random variables is very useful for understanding dot products in neural networks, especially in conjunction with the cosine of the angle between vectors.
Also, ReLU is a switch. The electricity in your house is a sine wave. Turn on a switch and the output is f(x)=x: the same sine wave as the input. Turn it off and f(x)=0. A ReLU neural network is then a switched composition of dot products. If the switch states are known, there is a linear mapping between the input vector and the output vector, which you can check with various metrics.
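The "ReLU is a switch" observation can be checked directly: once the on/off pattern of every ReLU is fixed, the network reduces to an exact linear map. A minimal sketch with a toy two-layer network and random weights (everything here is illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
W1 = rng.standard_normal((4, 3))  # first-layer weights
W2 = rng.standard_normal((2, 4))  # second-layer weights

def relu_net(x):
    """Ordinary forward pass: dot product, ReLU, dot product."""
    return W2 @ np.maximum(W1 @ x, 0.0)

x = rng.standard_normal(3)
gates = (W1 @ x > 0).astype(float)       # the switch states for this input
linear_map = W2 @ (gates[:, None] * W1)  # composition of dot products with gates frozen

# For this input (and any input sharing the same switch pattern),
# the ReLU network and the plain linear map agree exactly.
assert np.allclose(relu_net(x), linear_map @ x)
```

The assertion holds because, within a region of input space where no ReLU changes state, zeroing the gated rows of W1 and composing the two weight matrices reproduces the network exactly.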

Thomas Bingel says:

Thanks for the interesting talk! Please post the slides as well!

Jin Lin says:

I want to talk with the guy who was talking about barycentres and the Wasserstein distance!

Impolite Vegan says:

"We don't need to die a thousand deaths to know how to prevent dying"
I mean, what are millions of years of evolution if not dying thousands (millions, actually) of deaths?

Jon Chuang says:

51:00 I like the idea of a two-level system but disagree with the mutual information criterion.

robothales says:

46:00 re: attention as gating the conscious and unconscious thoughts – can you imagine a machine which can widen and narrow its aperture of attention to accomplish different tasks?

Jonathan Stray says:

Would be nice if the camera had been on the slides in this video, rather than mostly on the speaker. Does anyone know where the slides might be found? Sadly, the link posted below is dead. This post has most of the slides, though.

run vnc says:

Sounds right to me. But why do they assume that the traditional neural net and deep learning are the best or only possible fundamental structures and processes for a system with these capabilities of disentangled abstractions working together with granular representations?

Rahul Deora says:

Someone should write a detailed blog explaining stuff in this

dikey y says:

At 12:07, are cognitive states low-dimensional? If that is the case, are they sparse? If they are both sparse and low-dimensional, that contradicts what he said in his MSS talk in 2012, where he stated that high-dimensional and sparse is better than low-dimensional.

Jae Oppa says:

amazing talk.

Matías Grinberg says:

Adversarial examples are almost always used as an example of complete AI failure, because it is "obvious" that the object preserves its identity. But one could arguably do the same to us! As was already demonstrated in

Martin Lichtblau says:

Humans use fuzzy approaches, while computers use precise numbers. Which one can work in this complex world?

Micheal Bee says:

Doesn't translation into an abstract space necessitate a loss of information?

scose says:

Sampling rate × bit depth is a big overestimate of the amount of information in speech audio signals; look at the compression ratios that audio codecs can achieve.
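The gap the commenter points to is easy to quantify. A back-of-the-envelope sketch with typical wideband-speech figures (the codec bitrate is an illustrative ballpark, not a measured value):

```python
# Raw PCM bitrate of wideband speech vs. a typical speech-codec bitrate.
sample_rate = 16_000   # Hz, common for wideband speech
bit_depth = 16         # bits per sample
raw_kbps = sample_rate * bit_depth / 1000

codec_kbps = 24        # ballpark rate a modern codec like Opus uses for speech

print(f"raw PCM: {raw_kbps:.0f} kbps")           # 256 kbps
print(f"codec:   {codec_kbps} kbps")
print(f"ratio:   {raw_kbps / codec_kbps:.1f}x")
```

A roughly tenfold compression with little perceptual loss suggests the raw bitrate overstates the signal's information content by at least that factor.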

Siarez says:

Who is the gentleman at 1:09:35 asking a question, and bringing up gradual learning?

Naimul Haq says:

No matter how much machine learning or data processing we employ, human intervention will always remain. When Trump is in power, his popularity soars, even though he loses all re-elections to the Senate. When Obama was in power, Hillary led Trump in the polls, even though she lost.
When will humans offset the influence of Cambridge Analytica and other manipulations? When will fake news stop? Can the evil demon be banished from the net?

However, disentanglement leading to representations in higher cognition is interesting. I thought Turing predicted machines could never mimic human cognition, let alone consciousness.

muckvix says:

Does anyone have a link to the slides? And come on, camera people, it's not a beauty pageant; it's OK if you show slides instead of the speaker's face 🙂
