Bill Hibbard on Ethical Artificial Intelligence


Bill Hibbard is a retired Senior Scientist at the University of Wisconsin-Madison Space Science and Engineering Center, currently working on issues of AI safety and unintended behaviors. He has a BA in Mathematics and MS and PhD in Computer Sciences, all from the University of Wisconsin-Madison. He is the author of Super-Intelligent Machines, “Avoiding Unintended AI Behaviors,” “Decision Support for Safe AI Design,” and “Ethical Artificial Intelligence.” He is also principal author of the Vis5d, Cave5D, and VisAD open source visualization systems.

Luke Muehlhauser: You recently released a self-published book, Ethical Artificial Intelligence, which “combines several peer reviewed papers and new material to analyze the issues of ethical artificial intelligence.” Much of the book is devoted to what you and I have described as exploratory engineering in AI (as in a recent CACM article), in which you mathematically analyze the behavioral properties of classes of future AI agents, e.g. utility-maximizing agents.

Many AI scientists have the intuition that such early, exploratory work is very unlikely to pay off when we are so far from building an AGI, and don’t know what an AGI will look like. For example, Michael Littman wrote:

…proposing specific mechanisms for combatting this amorphous threat [of AGI] is a bit like trying to engineer airbags before we’ve thought of the idea of cars. Safety has to be addressed in context and the context we’re talking about is still absurdly speculative.

How would you defend the value of the work you’ve done in Ethical Artificial Intelligence to Littman and others who share his skepticism?


Bill Hibbard: That’s a good question, Luke. The analogy with cars is useful. Unlike engineering airbags before anyone had even thought of cars, we are already working to develop AI, and we can already anticipate various types of dangers.

When cars were first imagined, engineers probably knew that they would propel human bodies at speed and that they would need to carry some concentrated energy source. They knew from accidents with horse carriages that human bodies travelling at speed are liable to injury, and they knew that concentrated energy sources are liable to fire and explosion which may injure humans. This is analogous with what we know about future AI: that to serve humans well AI will have to know a lot about individual humans and that humans will not be able to monitor every individual action by AI. These properties of future AI pose dangers just as the basic properties of cars (propelling humans and carrying energy) pose dangers.

Early car designers could have anticipated that no individual car would carry all of humanity, and thus that car accidents would not pose existential threats to humanity. To the extent that cars threaten human safety and health via pollution, we have time to notice these threats and address them. With AI we can anticipate possible scenarios that do threaten humanity and that may be difficult to address once the AI system is operational. For example, as described in the first chapter of my book, the Omniscience AI, with a detailed model of human society and a goal of maximizing profits, threatens to control human society. AI offers much greater potential benefits than cars, but also poses much greater dangers. This justifies greater effort to anticipate the dangers of AI.

It is also worth noting that the abstract frameworks used for exploratory engineering apply to any reasonable future AI design. As described in the second chapter of my book, any set of preferences among outcomes that is complete and transitive can be represented by a utility function. If the preferences are not complete, then there are outcomes A and B between which there is no preference, so the AI agent cannot decide between them. If the preferences are not transitive, then there are outcomes A, B, and C such that A is preferred to B, B is preferred to C, and C is preferred to A; again, the AI agent cannot decide. Thus our exploratory engineering can assume utility-maximizing agents and still cover every case in which an AI agent can decide among outcomes.
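As a minimal formal sketch of the representation claim above (this is a standard construction for finite outcome sets, not text taken from the book):

```latex
% Sketch: a complete, transitive preference relation \succeq on a finite
% outcome set O can be represented by counting dominated outcomes.
\[
  u(o) \;=\; \bigl|\{\, o' \in O : o \succeq o' \,\}\bigr|
\]
% Completeness and transitivity of \succeq then give, for all a, b \in O,
\[
  a \succeq b \iff u(a) \ge u(b),
\]
% so an agent that always picks a most-preferred outcome behaves exactly
% like a maximizer of u.
```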

Similarly, the dangers discussed in the book apply quite generally. Any design for a powerful AI should explain how it avoids the problem of self-delusion described by Ring and Orseau, the problem of corrupting the reward generator described by Hutter, and the problem of unintended instrumental actions described by Omohundro (he calls them basic AI drives).

The level of threat posed by AI justifies devoting substantial resources now to addressing its dangers. We are developing tools that allow us to analyze the dangers of AI systems before we know the details of their designs.


Luke: Your book mostly discusses AGI rather than contemporary narrow AI systems. Roughly when do you expect humanity to develop something resembling the AGIs you have in mind? Or rather, what does your probability distribution over “year of AGI” look like?


Bill: In my 2002 book, Super-Intelligent Machines, I wrote that “machines as intelligent as humans are possible and will exist within the next century or so.” (The publisher owns the copyright for my 2002 book, preventing me from giving electronic copies to people, and charges more than $100 per print copy. This largely explains my decision to put my current book on arxiv.org.) I like to say that we will get to human-level AI during the lives of children already born and in fact I can’t help looking at children with amazement, contemplating the events they will see.

In his 2005 book, The Singularity is Near, Ray Kurzweil predicted human-level AI by 2029. He has a good track record at technology prediction and I hope he is right: I was born in 1948 so have a good chance of living until 2029. He also predicted the singularity by 2045, which must include the kind of very powerful AI systems discussed in my recent book.

Although it has nowhere near human-level intelligence, the DeepMind Atari player is a general AI system in the sense that it has no foreknowledge of Atari games other than knowing that the goal is to get a high score. The remarkable success of this system increases my confidence that we will create true AGI systems. DeepMind was purchased by Google, and all the big IT companies are energetically developing AI. It is the combination of AGI techniques and access to hundreds of millions of human users that can create the scenario of the Omniscience AI described in Chapter 1 of my book. Similarly for government surveillance agencies, which have hundreds of millions of unwitting users.

In 1983 I made a bet that a computer would beat the world champion by 2013, and lost. In fact, most predictions about AI have turned out to be wrong. So we must bring some humility to our predictions of dates for AI milestones.

Because Ray Kurzweil’s predictions are based on quantitative extrapolation from historical trends, and because of his good track record, I generally defer to his predictions. If human-level AI will exist by 2029, and very capable and dangerous AGI systems will exist by 2045, then there is real urgency to understand the social effects and dangers of AI as soon as possible.


Luke: Which part of your book do you think is most likely to appeal to computer scientists, in the sense that they would learn something that strikes them as both novel (to them) and significant?


Bill: Thanks, Luke. Several parts of the book may be interesting or useful.

At the AI and Ethics workshop at AAAI-15, there was some confusion about the generality of utilitarian ethics, based on the assumption that utility functions are defined as linear functions or similarly simple expressions. However, as described in Chapter 2, and in my first answer in this interview, more complex utility functions can express any complete and transitive preferences among outcomes. That is, if an agent always has a most-preferred outcome in any finite set of outcomes, then the agent can be expressed as a utility-maximizing agent.
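To make that concrete, here is a toy sketch (purely illustrative, not code from the book; the function names and example preference are hypothetical) that builds a utility table from an arbitrary complete, transitive preference over a finite outcome set and then acts by maximizing it:

```python
# Toy illustration (not from the book): any complete, transitive preference
# over a finite outcome set can be turned into a utility table, and choosing
# a most-preferred outcome is then just maximizing that table.

def utility_from_preference(outcomes, prefers):
    """prefers(a, b) -> True iff a is weakly preferred to b.
    Assumes prefers is complete and transitive over `outcomes`."""
    return {o: sum(prefers(o, other) for other in outcomes) for o in outcomes}

def best_outcome(outcomes, utility):
    """Pick a most-preferred outcome by maximizing the utility table."""
    return max(outcomes, key=lambda o: utility[o])

# Hypothetical example preference: C preferred to A preferred to B.
outcomes = ["A", "B", "C"]
rank = {"C": 2, "A": 1, "B": 0}
prefers = lambda x, y: rank[x] >= rank[y]

u = utility_from_preference(outcomes, prefers)
assert best_outcome(outcomes, u) == "C"
```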

Chapter 4 goes into detail on the issues of agents whose environment models are finite stochastic programs. Most of the papers in the AGI community assume that environments are modeled by programs for universal Turing machines, with no limit on their memory use. I think that much can be added to what I wrote in Chapter 4, and hope that someone will do that.

The self-modeling agents of Chapter 8 are the formal framework analog of value learners such as the DeepMind Atari player, and their use as a formal framework is novel. Self-modeling agents have useful properties, such as the capability to value agent resource increases and a way to avoid the problem of the agent utility function being inconsistent with the agent’s definition. An example of this problem is what Armstrong refers to as “motivated value selection.” More generally, it is the problem of adding any “special” actions to a utility maximizing agent, where those special actions do not maximize the utility function. In motivated value selection, the special action is the agent evolving its utility function. A utility maximizing agent may choose an action of removing the special actions from its definition, as counter-productive to maximizing its utility function. Self-modeling agents include such evolutionary special actions in the definition of their value functions, and they learn a model of their value function which they use to choose their next action. Thus there is no inconsistency. I think these ideas should be interesting to other computer scientists.

At the FLI conference in San Juan in January 2015 there was concern about the kind of technical AI risks described in Chapters 5 – 9 of my book, and concern about technological unemployment. However, there was not much concern about the dangers associated with:

    1. Large AI servers connected to the electronic companions that will be carried by large numbers of people, and the ability of the human owners of those AI servers to manipulate society, and
    2. A future world in which great wealth can buy increased intelligence and superior intelligence can generate increased wealth. This positive feedback loop will result in a power law distribution of intelligence, as opposed to the current normal distribution of IQs with mean = 100 and standard deviation = 15.

These issues are discussed in Chapters 1 and 10 of my book. The Global Brain researchers study the way intelligence is exhibited by the network of humans; the change in the distribution of intelligence of the humans and machines that are nodes of the network will have profound effects on the nature of the Global Brain. Beyond computer scientists, I think the public needs to be aware of these issues.

Finally, I’d like to expand on my previous answer, specifically that the DeepMind Atari player is an example of general AI. In Chapter 1 of my book I describe how current AI systems have environment models that are designed by human engineers, whereas future AI systems will need to learn environment models that are too complex to be designed by human engineers. The DeepMind system does not use an environment model designed by engineers. It is “model-free” but the value function that it learns is just as complex as an environment model and in fact encodes an implicit environment model. Thus the DeepMind system is the first example of a future AI system with significant functionality.


Luke: Can you elaborate on what you mean by “the self-modeling agents of Chapter 8 are the formal framework analog of value learners such as the DeepMind Atari player”? Are you saying that the formal work you do in Chapter 8 has implications even for an extant system like the DeepMind Atari player, because they are sufficiently analogous?


Bill: To elaborate on what I mean by “the self-modeling agents of Chapter 8 are the formal framework analog of value learners such as the DeepMind Atari player,” self-modeling agents and value learners both learn a function v(ha) that produces the expected value of proposed action a after interaction history h (that is, h is a sequence of observations and actions; see my book for details). For the DeepMind Atari player, v(ha) is the expected game score after action a and h is restricted to the most recent observation (i.e., a game screen snapshot). Whereas the DeepMind system must be practically computable, the self-modeling agent framework is a purely mathematical definition. This framework is finitely computable but any practical implementation would have to use approximations. The book offers a few suggestions about computing techniques, but the discussion is not very deep.
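As an illustrative sketch only (this is not DeepMind’s implementation and not the book’s exact definitions; the class, parameters, and the lookup table standing in for a learned function approximator are all hypothetical), a value learner of this general shape fits v(h, a) from experience and then acts greedily on it:

```python
# Illustrative sketch of a value learner that estimates v(h, a), the expected
# value of action a after interaction history h, and acts on that estimate.
# A dictionary stands in for a learned function approximator such as a deep
# network; h is assumed to be a hashable summary, e.g. the latest observation.
import random

class ValueLearner:
    def __init__(self, actions, alpha=0.1, epsilon=0.1):
        self.actions = actions
        self.alpha = alpha        # learning rate
        self.epsilon = epsilon    # exploration rate
        self.v = {}               # (h, a) -> estimated expected value

    def value(self, h, a):
        return self.v.get((h, a), 0.0)

    def act(self, h):
        # Epsilon-greedy: usually choose the action with the highest v(h, a).
        if random.random() < self.epsilon:
            return random.choice(self.actions)
        return max(self.actions, key=lambda a: self.value(h, a))

    def update(self, h, a, target):
        # Move v(h, a) toward an observed return (e.g. a game score).
        old = self.value(h, a)
        self.v[(h, a)] = old + self.alpha * (target - old)
```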

Because extant systems such as the DeepMind Atari player are not yet close to human-level intelligence, there is no implication that this system should be subject to safety constraints. It is encouraging that the folks at DeepMind and at Vicarious are concerned about AI ethics, for two reasons: 1) They are likely to apply ethical requirements to their systems as they approach human-level, and 2) They are very smart and can probably add a lot to AI safety research.

Generally, research on safe and ethical AI complicates the task of creating AI by adding requirements. My book develops a three-argument utility function expressing human values which will be very complex to compute. Similarly for other components of the definition of self-modeling agents in the book.

I think there are implications the other way around. The self-modeling framework is based on statistical learning and the success of the DeepMind Atari player, the Vicarious captcha solver, IBM’s Watson, and other practical systems that use statistical learning techniques increases our confidence that these techniques can actually work for AI capability and safety.

Some researchers suggest that safe AI should rely on logical deduction rather than statistical learning. This idea offers greater possibility of proving safety properties of AI, but so far there are no compelling demonstrations of AI systems based on logical deduction (at least, none that I am aware of). Such demonstrations would add a lot of confidence in our ability to prove safety properties of AI systems.


Luke: Your 10th chapter considers the political aspects of advanced AI. What do you think can be done now to improve our chances of solving the political challenges of AI in the future? Sam Altman of YC has proposed various kinds of regulation; do you agree with his general thinking? What other ideas do you have?


Bill: Central to my 2002 book was the need for public education about, and control of, above-human-level AI. The current public discussion of the dangers of AI by Stephen Hawking, Bill Gates, Elon Musk, Ray Kurzweil, and others is very healthy because it educates the public. The same is true of the Singularity Summits organized by the Singularity Institute (MIRI’s predecessor), which I think were the best things the Singularity Institute did.

In the US people cannot own automatic weapons, guns of greater than .50 caliber, or explosives without a license. It would be absurd to license such things but to allow unregulated development of above-human-level AI. As the public is educated about AI, I think some form of regulation will be inevitable.

However, as they say, the devil will be in the details and humans will be unable to compete with future AI on details. Complex details will be AI’s forte. So formulating effective regulation will be a political challenge. The Glass-Steagall Act of 1933, regulating banking, was 37 pages long. The Dodd-Frank bill of 2010, also to regulate banking 77 years later, was 848 pages long. An army of lawyers drafted the bill, many employed to protect the interests of groups affected by the bill. The increasing complexity of laws reflects efforts by regulated entities to lighten the burden of regulation. The stakes in regulating AI will be huge and we can expect armies of lawyers, with the aid of the AI systems being regulated, to create very complex laws.

In Chapter 2 of my book I argue that ethical rules are inevitably ambiguous, and I base my proposed safe AI design on human values expressed in a utility function rather than on rules. Consider the current case before the US Supreme Court to interpret the meaning of the words “established by the state” in the context of the 363,086 words of the Affordable Care Act. This is a good example of the ambiguity of rules. Once AI regulations become law, armies of lawyers, aided by AI, will be engaged in debates over their interpretation and application.

The best counterbalance to armies of lawyers creating complexity on any legal issue is a public educated about the issue and engaged in protecting their own interests. Automobile safety is a good example. This will also be the case with AI regulation. And, as discussed in the introductory section of Chapter 10, there is precedent for the compassionate intentions of some wealthy and powerful people and this may serve to counterbalance their interest in creating complexity.

Privacy regulations, which affect existing large IT systems employing AI, already exist in the US and even more so in Europe. However, many IT services depend on accurate models of users’ preferences. At the recent FLI conference in San Juan, I tried to make the point that a danger from AI will be that people will want the kind of close, personal relationship with AI systems that will enable intrusion and manipulation by AI. The Omniscience AI described in Chapter 1 of my book is an example. As an astute IT lawyer said at the FLI conference, the question of whether an IT innovation will be legal depends on whether it will be popular.

This brings us back to the need for public education about AI. For people to resist being seduced by the short-term benefits of close relationships with AI, they need to understand the long-term consequences. I think it is unrealistic to prohibit close relationships between humans and AI, but if the public understands the issues, perhaps some regulation of the purposes for which those relationships are exploited can be required.

The final section of my Chapter 10 says that AI developers and testers should recognize that they are acting as agents for the future of humanity and that their designs and test results should be transparent to the public. The FLI open letter and Google’s panel on AI ethics are encouraging signs that AI developers do recognize their role as agents for future humanity. Also, DeepMind has been transparent about the technology of their Atari player, even making source code available for non-commercial purposes.

AI developers deserve to be rewarded for their success. On the other hand, people have a right to avoid losing control over their own lives to an all-powerful AI and its wealthy human owners. The problem is to find a way to achieve both of these goals.

Among current humans, with naturally evolved brains, IQ has a normal distribution. When brains are artifacts, their intelligence is likely to have a power law distribution. This is the pattern of distributions of sizes of other artifacts such as trucks, ships, buildings, and computers. The average human will not be able to understand or ever learn the languages used by the most intelligent minds. This may mean the end of any direct voice in public policy decisions for average humans – effectively the end of democracy. But if large AI systems are maximizing utility functions that account for the values of individual humans, that may take the place of direct democracy.
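A toy simulation (purely illustrative, not a model from the book; all parameters are arbitrary) of why multiplicative feedback of the kind described above tends to produce heavy-tailed distributions, in contrast to the roughly normal distributions produced by independent additive variation:

```python
# Toy simulation: independent additive shocks yield a roughly normal
# distribution, while proportional (compounding) shocks yield a heavy
# right tail. Parameters are arbitrary and purely illustrative.
import random
import statistics

N, STEPS = 10000, 50
additive, multiplicative = [], []

for _ in range(N):
    a = m = 100.0
    for _ in range(STEPS):
        a += random.gauss(0, 3)           # additive, independent variation
        m *= 1 + random.gauss(0, 0.05)    # proportional, compounding variation
    additive.append(a)
    multiplicative.append(m)

for name, xs in (("additive", additive), ("multiplicative", multiplicative)):
    ratio = max(xs) / statistics.mean(xs)
    print(f"{name}: mean={statistics.mean(xs):.1f}, max/mean={ratio:.1f}")
# Expect a noticeably larger max/mean ratio for the multiplicative run,
# reflecting its heavy right tail.
```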

Chapters 6 – 8 of my book propose mathematical definitions for an AI design that does balance the values of individual humans. Chapter 10 suggests that this design may be modified to provide different weights to the values of different people, for example to reward those who develop AI systems. I must admit that the connection between the technical chapters of my book and Chapter 10, on politics, is weak. Political issues are just difficult. For example, the future will probably have multiple AI systems with conflicting utility functions and a power law distribution of intelligence. It is difficult to predict how such a society would function and how it would affect humans, and this unpredictability is a risk. Creating a world with a single powerful AI system also poses risks, and may be difficult to achieve.

Since my first paper about future AI in 2001, I have thought that the largest risks from AI are political rather than technical. We have an ethical obligation to educate the public about the future of AI, and an educated public is an essential element of finding a good outcome from AI.


Luke: Thanks, Bill!