THE FUTURE IS HERE

Stuart Russell – Provably Beneficial Artificial Intelligence – AI Ethics @IJCAI (Full Version)

Exclusive interview with Stuart Russell. He discusses the importance of achieving friendly AI – Strong AI that is provably (probably approximately) beneficial.

Points of discussion:
A clash of intuitions about the beneficiality of Strong Artificial Intelligence
– A clash of intuitions: Alan Turing raised the concern that if we were to build an AI smarter than we are, we might not be happy about the result – yet there is a general notion amongst AI developers that building smarter-than-human AI would be good.
– But there is no guarantee that the objectives of a superintelligent AI would not be inimical to our values – so we need to solve what some people call the value alignment problem.
– we as humans learn values in conjunction with learning about the world

The Value Alignment problem

Basic AI Drives: Any objective generates sub-goals

– Designing an AI that does not want to disable its off switch
– 2 principles
– 1) the machine’s only objective is to maximise the human’s reward function (this is not an objective programmed into the machine, but a kind of unobserved latent variable)
– 2) the machine has to be explicitly uncertain about what that objective is
– if the robot thinks it knows what your objective functions are, then it won’t believe that its actions could make you unhappy, and therefore has an incentive to disable the off switch
– the robot will only allow itself to be switched off if it thinks staying on might make you unhappy
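The two principles above can be illustrated with a toy calculation in the spirit of the off-switch argument (all numbers and names here are illustrative assumptions, not from the interview): when the robot is genuinely uncertain about the human's payoff, deferring to the human (leaving the off switch enabled) is worth at least as much as acting unilaterally.

```python
import random

random.seed(0)

# Illustrative assumption: the robot's belief about the human's payoff u
# for its proposed action is a bag of samples (uncertain case) versus a
# single confident point estimate (certain case).
samples = [random.gauss(0.5, 1.0) for _ in range(100_000)]

def value_of_deferring(beliefs):
    # If the robot defers, a rational human lets it act when u > 0 and
    # switches it off (payoff 0) when u <= 0: expected payoff of max(u, 0).
    return sum(max(u, 0.0) for u in beliefs) / len(beliefs)

def value_of_acting(beliefs):
    # If the robot disables the switch and just acts, the payoff is u itself.
    return sum(beliefs) / len(beliefs)

uncertain_defer = value_of_deferring(samples)
uncertain_act = value_of_acting(samples)

# With a confident point estimate, deferring adds nothing...
assert value_of_deferring([0.5]) == value_of_acting([0.5])
# ...but under uncertainty, keeping the off switch enabled is strictly better.
assert uncertain_defer > uncertain_act
```

The gap between the two values is exactly the robot's incentive to let the human switch it off – and it vanishes when the robot is certain it already knows your objective, which is the point of principle 2.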

– How will machines do what humans want if they can’t observe the humans’ objective functions?
– one answer is to allow the machines to observe human behaviour, and interpret that behaviour as providing evidence of an underlying preference structure – inverse reinforcement learning
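A minimal sketch of the idea behind inverse reinforcement learning: rather than being told an objective, the machine watches choices and scores candidate objectives by how well they explain the observed behaviour. All the options, features, and candidate preferences below are made-up illustrations, not anything from the interview.

```python
# Options a "human" repeatedly chooses among, each described by two
# hypothetical features: (taste, healthiness).
options = {"cake": (0.9, 0.1), "salad": (0.2, 0.9), "apple": (0.5, 0.7)}

# Observed behaviour: the human keeps picking healthy options.
observed = ["salad", "apple", "salad"]

# Candidate preference structures: weights on (taste, healthiness).
candidates = {"hedonist": (1.0, 0.0), "health-focused": (0.1, 1.0)}

def utility(weights, features):
    # Linear utility: weighted sum of the option's features.
    return sum(w * f for w, f in zip(weights, features))

def explains(weights):
    # Count how many observed choices were utility-maximising
    # under this candidate objective.
    best = max(options, key=lambda o: utility(weights, options[o]))
    return sum(1 for choice in observed if choice == best)

# Infer the preference structure that best explains the behaviour.
inferred = max(candidates, key=lambda name: explains(candidates[name]))
print(inferred)  # prints "health-focused"
```

Real inverse reinforcement learning works over sequential decisions and noisy behaviour rather than a two-candidate lookup, but the inference direction – from observed behaviour back to an underlying preference structure – is the same.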

Aggregated Volition: How does an AI optimise for many peoples values?
– Has the benefit of symmetry
– difficulties in the commensurability of different human preferences
– Problem: If someone feels more strongly about a value X should they get more of a share of value X?

How to deal with people whose preferences include the suffering of others?

Should a robot be more obligated to its owner than to the rest of the world?
– should this have something to do with how much you pay for the robot?

Moral philosophy will be a key industry sector

Issues of near term Narrow AI vs future Strong AI
– Very easy to confuse the near term killer robot question with the existential risk question

Differences in the issues with the risk of the misuse of Narrow AI and the risk of Strong AI
– Weaponised Narrow AI

Should we replace the gainful employment of humans with AI?

A future where humans lose a sense of meaning & dignity

Hostility to the idea of Superintelligence and AI Friendliness
– there seems to be something else going on for AI experts to make arguments as simple-minded as ‘If the AI goes bad, just turn the AI off’
– beating AlphaGo is no problem – we just need to play better moves
– it’s theoretically possible that AI could pose an existential risk – but it’s also possible that a black hole could appear in near-Earth orbit; we don’t spend any time worrying about that, so why should we spend time worrying about the existential risk of AI?

Defensive psychological reactions to feeling one’s research is under attack
– People proposing AI safety are not anti AI any more than people wanting to contain a nuclear reaction are anti physics

Provably beneficial AI
– where the AI system’s responsibility is to figure out what you want
– though the data used to train the AI may sometimes be unrepresentative, leading to a small possibility of deviation from true beneficiality – hence ‘probably approximately beneficial’ AI
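The ‘probably approximately’ phrasing echoes the PAC (probably approximately correct) framework from learning theory; a minimal sketch of the kind of guarantee it alludes to, where the tolerance ε and failure probability δ are generic symbols rather than anything quoted from the interview:

```latex
\Pr\big[\,\text{deviation from beneficial behaviour} \le \varepsilon\,\big] \;\ge\; 1 - \delta
```

That is, with high probability (at least 1 − δ), the system’s behaviour is approximately (within ε of) beneficial – perfection is not guaranteed because the training data may be unrepresentative.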

Convincing the AI community that AI friendliness is important

Will there be a hard takeoff to superintelligence?

What are the benefits of building Strong AI?

Center for Human-Compatible AI – UC Berkeley
http://humancompatible.ai/

Stuart Jonathan Russell is a computer scientist known for his contributions to artificial intelligence. He is a Professor of Computer Science at the University of California, Berkeley and Adjunct Professor of Neurological Surgery at the University of California, San Francisco.
https://en.wikipedia.org/wiki/Stuart_J._Russell

Many thanks for watching!

Consider supporting SciFuture by:
a) Subscribing to the SciFuture YouTube channel: http://youtube.com/subscription_center?add_user=TheRationalFuture
b) Donating via Patreon: https://www.patreon.com/scifuture and/or
c) Sharing the media SciFuture creates: http://scifuture.org

Kind regards,
Adam Ford
– Science, Technology & the Future