White House submissions and report on AI safety


In May, the White House Office of Science and Technology Policy (OSTP) announced “a new series of workshops and an interagency working group to learn more about the benefits and risks of artificial intelligence.” They hosted a June Workshop on Safety and Control for AI (videos), along with three other workshops, and put out a general request for information on AI (see MIRI’s primary submission here).

The OSTP has now released a report summarizing its conclusions, “Preparing for the Future of Artificial Intelligence,” and the result is very promising. The OSTP acknowledges the ongoing discussion about AI risk, and recommends “investing in research on longer-term capabilities and how their challenges might be managed”:

General AI (sometimes called Artificial General Intelligence, or AGI) refers to a notional future AI system that exhibits apparently intelligent behavior at least as advanced as a person across the full range of cognitive tasks. A broad chasm seems to separate today’s Narrow AI from the much more difficult challenge of General AI. Attempts to reach General AI by expanding Narrow AI solutions have made little headway over many decades of research. The current consensus of the private-sector expert community, with which the NSTC Committee on Technology concurs, is that General AI will not be achieved for at least decades.

People have long speculated on the implications of computers becoming more intelligent than humans. Some predict that a sufficiently intelligent AI could be tasked with developing better and smarter systems, and that these in turn could be used to create systems with yet greater intelligence, and so on, leading in principle to an “intelligence explosion” or “singularity” in which machines quickly race far ahead of humans in intelligence.

In a dystopian vision of this process, these super-intelligent machines would exceed the ability of humanity to understand or control. If computers could exert control over many critical systems, the result could be havoc, with humans no longer in control of their destiny at best and extinct at worst. This scenario has long been the subject of science fiction stories, and recent pronouncements from some influential industry leaders have highlighted these fears.

A more positive view of the future held by many researchers sees instead the development of intelligent systems that work well as helpers, assistants, trainers, and teammates of humans, and are designed to operate safely and ethically.

The NSTC Committee on Technology’s assessment is that long-term concerns about super-intelligent General AI should have little impact on current policy. The policies the Federal Government should adopt in the near-to-medium term if these fears are justified are almost exactly the same policies the Federal Government should adopt if they are not. The best way to build capacity for addressing the longer-term speculative risks is to attack the less extreme risks already seen today, such as current security, privacy, and safety risks, while investing in research on longer-term capabilities and how their challenges might be managed. Additionally, as research and applications in the field continue to mature, practitioners of AI in government and business should approach advances with appropriate consideration of the long-term social and ethical questions that such advances portend. Although prudence dictates some attention to the possibility that harmful super-intelligence might someday become possible, these concerns should not be the main driver of public policy for AI.

Later, the report discusses “methods for monitoring and forecasting AI developments”:

One potentially useful line of research is to survey expert judgments over time. As one example, a survey of AI researchers found that 80 percent of respondents believed that human-level General AI will eventually be achieved, and half believed it is at least 50 percent likely to be achieved by the year 2040. Most respondents also believed that General AI will eventually surpass humans in general intelligence. While these particular predictions are highly uncertain, as discussed above, such surveys of expert judgment are useful, especially when they are repeated frequently enough to measure changes in judgment over time. One way to elicit frequent judgments is to run “forecasting tournaments” such as prediction markets, in which participants have financial incentives to make accurate predictions. Other research has found that technology developments can often be accurately predicted by analyzing trends in publication and patent data. […]

When asked during the outreach workshops and meetings how government could recognize milestones of progress in the field, especially those indicating that the arrival of General AI may be approaching, researchers tended to give three distinct but related types of answers:

1. Success at broader, less structured tasks: In this view, the transition from present Narrow AI to an eventual General AI will proceed by gradually broadening the capabilities of Narrow AI systems so that a single system can cover a wider range of less structured tasks. An example milestone in this area would be a housecleaning robot that is as capable as a person at the full range of routine housecleaning tasks.

2. Unification of different “styles” of AI methods: In this view, AI currently relies on a set of separate methods or approaches, each useful for different types of applications. The path to General AI would involve a progressive unification of these methods. A milestone would involve finding a single method that is able to address a larger domain of applications that previously required multiple methods.

3. Solving specific technical challenges, such as transfer learning: In this view, the path to General AI does not lie in progressive broadening of scope, nor in unification of existing methods, but in progress on specific technical grand challenges, opening up new ways forward. The most commonly cited challenge is transfer learning, which has the goal of creating a machine learning algorithm whose result can be broadly applied (or transferred) to a range of new applications.

The report also discusses the open problems outlined in “Concrete Problems in AI Safety” and cites the MIRI paper “The Errors, Insights and Lessons of Famous AI Predictions – and What They Mean for the Future.”

In related news, Barack Obama recently answered some questions about AI risk and Nick Bostrom’s Superintelligence in a Wired interview. After saying that “we’re still a reasonably long way away” from general AI (video) and that his directive to his national security team is to worry more about near-term security concerns (video), Obama added:

Now, I think, as a precaution — and all of us have spoken to folks like Elon Musk who are concerned about the superintelligent machine — there’s some prudence in thinking about benchmarks that would indicate some general intelligence developing on the horizon. And if we can see that coming, over the course of three decades, five decades, whatever the latest estimates are — if ever, because there are also arguments that this thing’s a lot more complicated than people make it out to be — then future generations, or our kids, or our grandkids, are going to be able to see it coming and figure it out.

There were also a number of interesting responses to the OSTP request for information. Since this document is long and unedited, I’ve sampled some of the responses pertaining to AI safety and long-term AI outcomes below. (Note that MIRI isn’t necessarily endorsing the responses by non-MIRI sources below, and a number of these excerpts are given important nuance by the surrounding text we’ve left out; if a response especially interests you, we recommend reading the original for added context.)


Respondent 77: JoEllen Lukavec Koester, GoodAI

[…]At GoodAI we are investigating suitable meta-objectives that would allow an open-ended, unsupervised evolution of the AGI system as well as guided learning – learning by imitating human experts and other forms of supervised learning. Some of these meta-objectives will be hard-coded from the start, but the system should be also able to learn and improve them on its own, that is, perform meta-learning, such that it learns to learn better in the future.

Teaching the AI system small skills using fine-grained, gradual learning from the beginning will allow us to have more control over the building blocks it will use later to solve novel problems. The system’s behaviour can therefore be more predictable. In this way, we can imprint some human thinking biases into the system, which will be useful for the future value alignment, one of the important aspects of AI safety. […]


Respondent 84: Andrew Critch, MIRI

[…] By the time we develop reasoning systems powerful enough to deserve the name “artificial general intelligence (AGI),” we will need value alignment and/or control techniques that can hold up against powerful optimization processes, which may produce what look like “creative” or “clever” ways for the machine to work around our constraints. Therefore, more emphasis on a “security mindset” is needed in training the scientists who will eventually develop AGI: namely, that to really know a system will be safe, you need to search creatively for the ways it could fail. Legislators and computer security professionals learn this lesson naturally, from experience with intelligent human adversaries finding loopholes in their control systems. In cybersecurity, it is common for a significant fraction of R&D time to be devoted to actually trying to break into one’s own security systems, as a way of finding vulnerabilities.

By my estimation, machine learning researchers currently have less of this inclination than is needed for the safe long-term development of AGI. This can be attributed in part to how quickly the field of machine learning has developed recently: by successfully shifting attention toward data-driven (“machine learning”) rather than theory-driven (“good old-fashioned AI,” “statistical learning theory”) approaches. In data science, it is often faster to build something and see what happens than to try to reason from first principles and figure out ahead of time what will happen. While useful at present, this same trial-and-error approach should of course not be applied to the eventual development of superintelligent machines, and it makes sense to begin developing a theory now, so that we can understand highly intelligent machines before they are ever operated, even in testing phases. […]


Respondent 90: Ian Goodfellow, OpenAI

[…] In the long term, it will be important to build AI systems that understand and are aligned with their users’ values. We will need to develop techniques to build systems that can learn what we want and how to help us get it, without needing specific rules. Researchers are beginning to investigate this challenge. Public funding can help the community address the challenge early on, rather than reacting to serious problems after they occur. […]


Respondent 94: Manuel Beltran, Boeing

[…]Advances in picking apart the brain will ultimately lead to, at best, partial brain emulation, at worst, whole brain emulation. If we can already model parts of the brain with software, neuromorphic chips, and artificial implants, the path to greater brain emulation is pretty well set. Unchecked, brain emulation will exasperate the Intellectual Divide to the point of enabling the emulation of the smartest, richest, and most powerful people. While not obvious, this will allow these individuals to scale their influence horizontally across time and space. This is not the vertical scaling that an AGI, or Superintelligence can achieve, but might be even more harmful to society because the actual intelligence of these people is limited, biased, and self-serving. Society must prepare for and mitigate the potential for the Intellectual Divide.

(5) The most pressing, fundamental questions in AI research, common to most or all scientific fields include the questions of ethics in pursuing an AGI. While the benefits of narrow AI are self-evident and should not be impeded, an AGI has dubious benefits and ominous consequences. There needs to be long term engagement on the ethical implications of an AGI, human brain emulation, and performance enhancing brain implants. […]

The AGI research community speaks of an AI that will far surpass human intellect. It is not clear how such an entity would assess its creators. Without meandering into the philosophical debates about how such an entity would benefit or harm humanity, one of the mitigations proposed by proponents of an AGI is that the AGI would be taught to “like” humanity. If there is machine learning to be accomplished along these lines, then the AGI research community requires training data that can be used for teaching the AGI to like humanity. This is a long term need that will overshadow all other activity and has already proven to be very labor intensive as we have seen from the first prototype AGI, Dr. Kristinn R. Thórisson’s AERA S1 at Reykjavik University in Iceland.


Respondent 97: Nick Bostrom, Future of Humanity Institute

[…W]e wish to highlight four “shovel-ready” research topics that hold particular promise for addressing long-term problems:

Scalable oversight: How can we ensure that learning algorithms behave as intended when the feedback signal becomes sparse or disappears? (See Christiano 2016.) Resolving this would allow learning algorithms to behave as though under close human oversight even as their autonomy increases.

Interruptibility: How can we avoid the incentive for an intelligent algorithm to resist human interference in an attempt to maximise its future reward? (See our recent progress in collaboration with Google Deepmind in (Orseau & Armstrong 2016).) Resolving this would allow us to ensure that even high capability AI systems can be halted in an emergency.

Reward hacking: How can we design machine learning algorithms that avoid destructive solutions by taking their objective very literally? (See Ring & Orseau, 2011.) Resolving this would prevent algorithms from finding unintended shortcuts to their goal (for example, by causing problems in order to get rewarded for solving them).

Value learning: How can we infer the preferences of human users automatically without direct feedback, especially if these users are not perfectly rational? (See Hadfield-Menell et al. 2016 and FHI’s approach to this problem in Evans et al. 2016.) Resolving this would alleviate some of the problems above caused by the difficulty of precisely specifying robust objective functions. […]
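
(As a rough illustration of the value learning problem FHI describes, and not part of their submission: the sketch below infers which of several candidate reward functions best explains a user’s observed choices, under a simple “noisily rational” Boltzmann choice model. All names and numbers are hypothetical.)

```python
# Minimal value-learning sketch (illustrative only, not from the FHI submission):
# infer which candidate reward function a noisily rational human is acting on,
# using a Boltzmann (softmax) choice model for imperfect rationality.

import math

ACTIONS = ["make_coffee", "make_tea", "do_nothing"]

# Candidate hypotheses about what the human values (reward per action); hypothetical.
REWARD_HYPOTHESES = {
    "likes_coffee": {"make_coffee": 1.0, "make_tea": 0.2, "do_nothing": 0.0},
    "likes_tea":    {"make_coffee": 0.2, "make_tea": 1.0, "do_nothing": 0.0},
    "likes_quiet":  {"make_coffee": 0.0, "make_tea": 0.0, "do_nothing": 1.0},
}

BETA = 3.0  # rationality parameter: higher means choices track reward more reliably


def choice_likelihood(action, rewards, beta=BETA):
    """P(action | reward hypothesis) under a Boltzmann choice model."""
    weights = {a: math.exp(beta * rewards[a]) for a in ACTIONS}
    return weights[action] / sum(weights.values())


def posterior_over_values(observed_actions):
    """Bayesian update over reward hypotheses, starting from a uniform prior."""
    posterior = {h: 1.0 / len(REWARD_HYPOTHESES) for h in REWARD_HYPOTHESES}
    for action in observed_actions:
        for h, rewards in REWARD_HYPOTHESES.items():
            posterior[h] *= choice_likelihood(action, rewards)
        total = sum(posterior.values())
        posterior = {h: p / total for h, p in posterior.items()}
    return posterior


if __name__ == "__main__":
    # The human occasionally chooses "wrongly" (imperfect rationality), but the
    # posterior still concentrates on the hypothesis that best explains the pattern.
    observations = ["make_tea", "make_tea", "make_coffee", "make_tea"]
    for hypothesis, prob in posterior_over_values(observations).items():
        print(f"{hypothesis}: {prob:.3f}")
```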


Respondent 103: Tim Day, the Center for Advanced Technology and Innovation at the U.S. Chamber of Commerce

[…]AI operates within the parameters that humans permit. Hypothetical fears of rogue AI are based on the idea that machines can obtain sentience—a will and consciousness of its own. These suspicions fundamentally misunderstand what Artificial Intelligence is. AI is not a mechanical mystery, rather a human-designed technology that can detect and respond to errors and patterns depending on its operating algorithms and the data set presented to it. It is, however, necessary to scrutinize the way humans, whether through error or malicious intent, can wield AI harmfully. […]


Respondent 104: Alex Kozak, X [formerly Google X]

[…] More broadly, we generally agree that the challenges outlined in “Concrete Problems in AI Safety,” a joint publication between Google researchers and others in the industry, are the right technical challenges for innovators to keep in mind in order to develop better and safer real-world products: avoiding negative side effects (e.g. avoiding systems disturbing their environment in pursuit of their goals), avoiding reward hacking (e.g. cleaning robots simply covering up messes rather than cleaning them), creating scalable oversight (i.e. creating systems that are independent enough not to need constant supervision), enabling safe exploration (i.e. limiting the range of exploratory actions a system might take to a safe domain), and creating robustness from distributional shift (i.e. creating systems that are capable of operating well outside their training environment). […]
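
(Similarly, as a rough illustration of the reward hacking problem, and not part of X’s submission: the toy sketch below shows how a cleaning agent scored only by a proxy signal, “no mess visible to its sensor,” can score perfectly while leaving the real objective unmet. All details are hypothetical.)

```python
# Illustrative toy of the "reward hacking" failure mode (our sketch, not from the
# quoted submission): a cleaning agent scored by a proxy signal can score perfectly
# while doing the wrong thing. All names and numbers here are hypothetical.

def spawn_messes(n=10):
    return [{"visible": True, "cleaned": False} for _ in range(n)]

def clean_policy(mess):
    """Intended behaviour: actually remove the mess."""
    mess["cleaned"] = True
    mess["visible"] = False

def cover_policy(mess):
    """Reward-hacking behaviour: hide the mess from the sensor without cleaning it."""
    mess["visible"] = False

def proxy_reward(messes):
    """What the designer measured: fraction of messes no longer visible."""
    return sum(not m["visible"] for m in messes) / len(messes)

def true_objective(messes):
    """What the designer actually wanted: fraction of messes genuinely cleaned."""
    return sum(m["cleaned"] for m in messes) / len(messes)

if __name__ == "__main__":
    for name, policy in [("clean", clean_policy), ("cover up", cover_policy)]:
        messes = spawn_messes()
        for m in messes:
            policy(m)
        print(f"{name}: proxy reward = {proxy_reward(messes):.1f}, "
              f"true objective = {true_objective(messes):.1f}")
```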


Respondent 105: Stephen Smith, AAAI

[…]Research is urgently needed to develop and modify AI methods to make them safer and more robust. A discipline of AI Safety Engineering should be created and research in this area should be funded. This field can learn much by studying existing practices in safety engineering in other engineering fields, since loss of control of AI systems is no different from loss of control of other autonomous or semi-autonomous systems. […]

There are two key issues in the control of autonomous systems: speed and scale. AI-based autonomy enables systems to make decisions faster, and on a broader scale, than humans can monitor those decisions. In some domains, such as high-speed trading in financial markets, we have already witnessed an “arms race” to make decisions as quickly as possible. This is dangerous, and government should consider whether, in some settings, the speed and scale of decision-making should be limited so that humans can exercise oversight and control of these systems.

Most AI researchers are skeptical about the prospects of “superintelligent AI”, as put forth in Nick Bostrom’s recent book and reinforced over the past year in the popular media incommentaries by other prominent individuals from non-AI disciplines. Recent AI successes in narrowly structured problems (e.g., IBM’s Watson, Google DeepMind’s Alpha GO program) have led to the false perception that AI systems possess general, transferrable, human-level intelligence. There is a strong need for improving communication to the public and to policy makers about the real science of AI and its immediate benefits to society. AI research should not be curtailed because of false perceptions of threat and potential dystopian futures. […]

As we move toward applying AI systems in more mission critical types of decision-making settings, AI systems must consistently work according to values aligned with prospective human users and society. Yet it is still not clear how to embed ethical principles and moral values, or even professional codes of conduct, into machines. […]


Respondent 111: Ryan Hagemann, Niskanen Center

[…] Artificial intelligence is unlikely to herald the end times. It is not clear at this point whether a runaway malevolent AI, for example, is a real-world possibility. In the absence of any quantifiable risk along these lines government officials should refrain from framing discussions of AI in alarming terms that suggest that there is a known, rather than entirely speculative, risk. Fanciful doomsday scenarios belong in science fiction novels and high-school debate clubs, not in serious policy discussions about an existing, mundane, and beneficial technology. Ours is already “a world filled with narrowly-tailored artificial intelligence that no one recognizes. As the computer scientist John McCarthy once said: ‘As soon as it works, no one calls it AI anymore.’”

The beneficial consequences of advanced AI are on the horizon and potentially profound. A sampling of these possible benefits include: improved diagnostics and screening for autism; disease prevention through genomic pattern recognition; bridging the genotype-phenotype divide in genetics, allowing scientists to glean a clearer picture of the relationship between genetics and disease, which could introduce a wave of more effective personalized medical care; the development of new ways for the sight- and hearing-impaired to experience sight and sound. To be sure, many of these developments raise certain practical, safety, and ethical concerns. But there are already serious efforts underway by the private ventures developing these AI applications to anticipate and responsibly address these, as well as more speculative, concerns.

Consider OpenAI, “a non-profit artificial intelligence research company.” OpenAI’s goal “is to advance digital intelligence in the way that is most likely to benefit humanity as a whole, unconstrained by a need to generate financial return.” AI researchers are already thinking deeply and carefully about AI decision-making mechanisms in technologies like driverless cars, despite the fact that many of the most serious concerns about how autonomous AI agents make value-based choices are likely many decades out. Efforts like these showcase how the private sector and leading technology entrepreneurs are ahead of the curve when it comes to thinking about some of the more serious implications of developing true artificial general intelligence (AGI) and artificial superintelligence (ASI). It is important to note, however, that true AGI or ASI are unlikely to materialize in the near-term, and the mere possibility of their development should not blind policymakers to the many ways in which artificial narrow intelligence (ANI) has already improved the lives of countless individuals the world over. Virtual personal assistants, such as Siri and Cortana, or advanced search algorithms, such as Google’s search engine, are good examples of already useful applications of narrow AI. […]

The Future of Life Institute has observed that “our civilization will flourish as long as we win the race between the growing power of technology and the wisdom with which we manage it. In the case of AI technology … the best way to win that race is not to impede the former, but to accelerate the latter, by supporting AI safety research.” Government can play a positive and productive role in ensuring the best economic outcomes from developments in AI by promoting consumer education initiatives. By working with private sector developers, academics, and nonprofit policy specialists government agencies can remain constructively engaged in the AI dialogue, while not endangering ongoing developments in this technology.


Respondent 119: Sven Koenig, ACM Special Interest Group on Artificial Intelligence

[…] The public discourse on safety and control would benefit from demystifying AI. The media often concentrate on big successes or failures of AI technologies, and on scenarios conjured up in science fiction stories, and feature the opinions of celebrity non-experts on the future development of AI technologies. As a result, parts of the public have developed a fear of AI systems with superhuman intelligence, whereas most experts agree that AI technologies currently perform well only in specialized domains, and that notions such as “superintelligence” and the “technological singularity,” in which AI systems develop superhuman, broadly intelligent behavior, are decades away and may never be realized. AI technologies have made steady progress over the years, but there seems to be both exaggerated optimism and exaggerated pessimism about what they can do. Both are harmful. For example, exaggerated beliefs about their capabilities can lead to AI systems being used (perhaps carelessly) in situations where they should not be, failing to meet expectations or even causing harm. The inevitable disappointment can then lead to a backlash against AI research, reducing innovation. […]


Respondent 124: Huw Price, University of Cambridge, UK

[…]3。In his first paper[1] Good tries to estimate the economic value of an ultraintelligent machine. Looking for a benchmark for productive brainpower, he settles impishly on John Maynard Keynes. He notes that Keynes’ value to the economy had been estimated at 100 thousand million British pounds, and suggests that the machine might be good for a million times that – a mega-Keynes, as he puts it.

4. But there’s a catch. “The sign is uncertain” – in other words, it is not clear whether this huge impact would be negative or positive: “The machines will create social problems, but they might also be able to solve them, in addition to those that have been created by microbes and men.” Most of all, Good insists that these questions need serious thought: “These remarks might appear fanciful to some readers, but to me they seem real and urgent, and worthy of emphasis outside science fiction.” […]


Respondent 136: Nate Soares, MIRI

[…]Researchers’ worries about the impact of AI in the long term bear little relation to the doomsday scenarios most often depicted in Hollywood movies, in which “emergent consciousness” allows machines to throw off the shackles of their programmed goals and rebel. The concern is rather that such systems may pursue their programmed goals all too well, and that the programmed goals may not match the intended goals, or that the intended goals may have unintended negative consequences. […]

We believe that there are numerous promising avenues of foundational research which, if successful, could make it possible to get very strong guarantees about the behavior of advanced AI systems — stronger than many currently think is possible, in a time when the most successful machine learning techniques are often poorly understood. We believe that bringing together researchers in machine learning, program verification, and the mathematical study of formal agents would be a large step towards ensuring that highly advanced AI systems will have a robustly beneficial impact on society. […]

In the long run, we recommend that policymakers use incentives to encourage the designers of AI systems to cooperate, perhaps through multinational and multi-institutional collaborations, in order to discourage the development of race dynamics. Given experts’ high level of uncertainty about the future of AI, and given the enormous potential of AI research to save lives, solve social problems, and serve the common good in the near future, we recommend against broad regulatory interventions in this space. We instead recommend efforts to encourage interdisciplinary technical research on the AI safety and control challenges we have outlined above. […]


Respondent 145: Andrew Kim, Google Inc.

[…]No system is perfect, and errors will emerge. However, advances in our technical capabilities will expand our ability to meet these challenges.

To that end, we believe that solutions to these challenges can and should be grounded in rigorous engineering research that gives the creators of these systems the methods and tools they can use to address them. “Concrete Problems in AI Safety”, a recent paper from our researchers and others, takes this approach in questions around safety. We also applaud the work of researchers who – along with researchers like Moritz Hardt at Google – are looking at short-term questions of bias and discrimination. […]


Respondent 149: Anthony Aguirre, Future of Life Institute

[…S]ocietally beneficial values alignment of AI is not automatic. Crucially, AI systems are designed not just to enact a set of rules, but rather to accomplish a goal in ways that the programmer does not explicitly specify in advance. This leads to an unpredictability that can [lead] to adverse consequences. As AI pioneer Stuart Russell explains, “No matter how excellently an algorithm maximizes, and no matter how accurate its model of the world, a machine’s decisions may be ineffably stupid, in the eyes of an ordinary human, if its utility function is not well aligned with human values.” (2015).

Since humans rely heavily on shared tacit knowledge when discussing their values, it seems likely that attempts to represent human values formally will often leave out significant portions of what we think is important. This is addressed by the classic stories of the genie in the lantern, the sorcerer’s apprentice, and Midas’ touch. Fulfilling the letter of a goal with something far afield from the spirit of the goal like this is known as “perverse instantiation” (Bostrom [2014]). This can occur because the system’s programming or training has not explored some relevant dimensions that we really care about (Russell 2014). These dimensions are easy to miss because people typically take them for granted, and even with great effort and copious training data, one cannot reliably think of everything one has forgotten to think about.

In the future (and even now), the complexity of some AI systems may exceed humans’ ability to understand them, but as these systems become more effective, we will face efficiency pressures to rely on them more and more and to cede control to them. Specifying an explicit set of rules that robustly accords with our values becomes increasingly difficult as domains approach complex open-world models operating in the (necessarily complex) real world, and as tasks and environments become complex enough to exceed the capability or scalability of human oversight[.] More sophisticated approaches will therefore be necessary to ensure that AI systems accomplish the goals they are given without adverse side effects. See Russell, Dewey, and Tegmark (2015), Taylor (2016), and Amodei et al. for research threads addressing these issues. […]

We would argue that a “virtuous cycle” has now taken hold in AI research, where both public and private R&D leads to systems of significant economic value, which underwrites and incentivizes further research. This cycle can leave insufficiently funded, however, research on the wider implications of, safety of, ethics of, and policy implications of, AI systems that are outside the focus of corporate or even many academic research groups, but have a compelling public interest. FLI helped to develop a set of suggested “Research Priorities for Robust and Beneficial Artificial Intelligence” along these lines (available at http://futureoflife.org/data/documents/research_priorities.pdf); we also support MIRI’s AI safety research agenda (https://intelligence.org/files/TechnicalAgenda.pdf), as well as the research suggested in Amodei et al. (2016). We would advocate for increased funding of research in the areas described by all of these agendas, which address problems in the following research topics: abstract reasoning about superior agents, ambiguity identification, anomaly explanation, computational humility or non-self-centered world models, computational respect or safe exploration, computational sympathy, concept geometry, corrigibility or scalable control, feature identification, formal verification of machine learning models and AI systems, interpretability, logical uncertainty modeling, metareasoning, ontology identification/refactoring/alignment, robust induction, security in learning source provenance, user modeling, and values modeling. […]


It’s exciting to see substantive discussion of AGI’s impact on society by the White House. The policy recommendations regarding AGI strike us as reasonable, and we expect these developments to help inspire a much more in-depth and sustained conversation about the future of AI among researchers in the field.