MIRI Frequently Asked Questions

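1. What is MIRI’s mission?
2. Why think that AI can outperform humans?
3. Why is safety important for smarter-than-human AI?
4. Do researchers think AI is imminent?
5. What technical problems are you working on?
6. Why work on AI safety early?
7. How can I contribute?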

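1. What is MIRI’s mission?

Our mission statement is “to ensure that the creation of smarter-than-human artificial intelligence has a positive impact.” This is an ambitious goal, but we believe that some early progress is possible, and we believe that the goal’s importance and difficulty make it prudent to begin work at an early date.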

Our two research agendas, “Agent Foundations for Aligning Machine Intelligence with Human Interests” and “Value Alignment for Advanced Machine Learning Systems,” focus on three groups of technical problems:

highly reliable agent design: learning how to specify highly autonomous systems that reliably pursue some fixed goal;
value specification: supplying autonomous systems with the intended goals; and
error tolerance: making such systems robust to programmer error.

We publish new research, host research workshops, attend conferences, and fund outside researchers who are interested in investigating these problems. We also host a blog and an online research forum.

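2. Why think that AI can outperform humans?

Machines are already smarter than humans at many specific tasks: performing calculations, playing chess, searching large databanks, detecting underwater mines, and more.¹ However, human intelligence continues to dominate machine intelligence in generality.

A powerful chess computer is “narrow”: it can’t play other games. In contrast, humans have problem-solving abilities that allow us to adapt to new contexts and excel in many domains other than the ones the ancestral environment prepared us for.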

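In the absence of a formal definition of “intelligence” (and therefore of “artificial intelligence”), we can heuristically cite humans’ perceptual, inferential, and deliberative faculties (as opposed to, e.g., our physical strength or agility) and say that intelligence is “those kinds of things.” On this conception, intelligence is a bundle of distinct faculties, albeit a very important bundle that includes our capacity for science.

Our cognitive abilities stem from high-level patterns in our brains, and these patterns can be instantiated in silicon as well as carbon. This tells us that general AI is possible, though it doesn’t tell us how difficult it is. If intelligence is sufficiently difficult to understand, then we may arrive at machine intelligence by scanning and emulating human brains or by some trial-and-error process (like evolution), rather than by hand-coding a software agent.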

If machines can achieve human equivalence in cognitive tasks, then it is very likely that they can eventually outperform humans. There is little reason to expect that biological evolution, with its lack of foresight and planning, would have hit upon the optimal algorithms for general intelligence (any more than it hit upon the optimal flying machine in birds). Beyond qualitative improvements in cognition, Nick Bostrom notes more straightforward advantages we could realize in digital minds, e.g.:

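editability: “It is easier to experiment with parameter variations in software than in neural wetware.”²
speed: “The speed of light is more than a million times greater than that of neural transmission, synaptic spikes dissipate more than a million times more heat than is thermodynamically necessary, and current transistor frequencies are more than a million times faster than neuron spiking frequencies.”
serial depth: On short timescales, machines can carry out much longer sequential processes.
storage capacity: Computers can plausibly have greater working and long-term memory.
size: Computers can be much larger than a human brain.
duplicability: Copying software onto new hardware can be much faster and higher-fidelity than biological reproduction.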

Any one of these advantages could give an AI reasoner an edge over a human reasoner, or give a group of AI reasoners an edge over a human group. Their combination suggests that digital minds could surpass human minds more quickly and decisively than we might expect.

3. Why is safety important for smarter-than-human AI?

Present-day AI algorithms already demand special safety guarantees when they must act in important domains without human oversight, particularly when they or their environment can change over time:

Achieving these gains [from autonomous systems] will depend on development of entirely new methods for enabling “trust in autonomy” through verification and validation (V&V) of the near-infinite state systems that result from high levels of [adaptability] and autonomy. In effect, the number of possible input states that such systems can be presented with is so large that not only is it impossible to test all of them directly, it is not even feasible to test more than an insignificantly small fraction of them. Development of such systems is thus inherently unverifiable by today’s methods, and as a result their operation in all but comparatively trivial applications is uncertifiable.

It is possible to develop systems having high levels of autonomy, but it is the lack of suitable V&V methods that prevents all but relatively low levels of autonomy from being certified for use.³

As AI capabilities improve, it will become easier to give AI systems greater autonomy, flexibility, and control, and there will be increasingly large incentives to make use of these new possibilities. The potential for AI systems to become more general, in particular, will make it difficult to establish safety guarantees: regularities that hold reliably during testing may not always hold after testing.

The largest and most lasting changes in human welfare have come from scientific and technological innovation — which in turn comes from our intelligence. In the long run, then, much of AI’s significance comes from its potential to automate and enhance progress in science and technology. The creation of smarter-than-human AI brings with it the basic risks and benefits of intellectual progress itself, at digital speeds.

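As AI agents become more capable, it becomes more important (and more difficult) to analyze and verify their decisions and goals. Stuart Russell writes:

The primary concern is not spooky emergent consciousness but simply the ability to make high-quality decisions. Here, quality refers to the expected outcome utility of actions taken, where the utility function is, presumably, specified by the human designer. Now we have a problem:

The utility function may not be perfectly aligned with the values of the human race, which are (at best) very difficult to pin down.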

Any sufficiently capable intelligent system will prefer to ensure its own continued existence and to acquire physical and computational resources – not for their own sake, but to succeed in its assigned task.

A system that is optimizing a function of n variables, where the objective depends on a subset of size k, will often set the remaining unconstrained variables to extreme values; if one of those unconstrained variables is actually something we care about, the solution found may be highly undesirable. This is essentially the old story of the genie in the lamp, or the sorcerer’s apprentice, or King Midas: you get exactly what you ask for, not what you want.⁴
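Russell’s point about unconstrained variables can be made concrete with a toy optimizer. The following is a minimal sketch under stated assumptions, not a model of any real system: the variable names, the shared budget of 10 units, and the exhaustive search are all hypothetical choices made for illustration. The objective reads only one of three variables, and the optimizer is free to sacrifice the other two.

```python
from itertools import product

# Hypothetical setup: n = 3 variables share a fixed resource budget.
BUDGET = 10
NAMES = ("task_output", "air_quality", "spare_capacity")


def objective(allocation):
    """Score an allocation; depends on only k = 1 of the n = 3 variables."""
    return allocation[0]


def feasible_allocations(n, budget):
    """Yield every way to split `budget` whole units across n variables."""
    for allocation in product(range(budget + 1), repeat=n):
        if sum(allocation) == budget:
            yield allocation


# Exhaustive search for the allocation that maximizes the objective.
best = max(feasible_allocations(len(NAMES), BUDGET), key=objective)
result = dict(zip(NAMES, best))
print(result)  # {'task_output': 10, 'air_quality': 0, 'spare_capacity': 0}
```

Nothing in the objective says that air quality should be zero; it is driven to an extreme only because every unit spent on it is a unit not spent on the one variable that is scored.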

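Bostrom’s “The Superintelligent Will” lays out these two concerns in more detail: that we may not correctly specify our actual goals when programming smarter-than-human AI systems, and that most agents optimizing for a misspecified goal will have incentives to treat humans adversarially, as potential threats or obstacles to achieving the agent’s goal.

If the goals of human and AI agents are not well aligned, the more knowledgeable and technologically capable agent may use force to get what it wants, as has occurred in many conflicts between human communities. Having noticed this class of concerns in advance, we have an opportunity to reduce risk from this default scenario by directing research toward aligning artificial decision-makers’ interests with our own.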

4. Do researchers think AI is imminent?

In early 2013, Bostrom and Müller surveyed the one hundred top-cited living authors in AI, as ranked by Microsoft Academic Search. Conditional on “no global catastrophe halt[ing] progress,” the twenty-nine experts who responded assigned a median 10% probability to our developing a machine “that can carry out most human professions at least as well as a typical human” by the year 2023, a 50% probability by 2048, and a 90% probability by 2080.⁵

Most researchers at MIRI approximately agree with the 10% and 50% dates, but think that AI could arrive significantly later than 2080. This is in line with Bostrom’s analysis in Superintelligence:

My own view is that the median numbers reported in the expert survey do not have enough probability mass on later arrival dates. A 10% probability of HLMI [human-level machine intelligence] not having been developed by 2075 or even 2100 (after conditionalizing on “human scientific activity continuing without major negative disruption”) seems too low.

Historically, AI researchers have not had a strong record of being able to predict the rate of advances in their own field or the shape that such advances would take. On the one hand, some tasks, like chess playing, turned out to be achievable by means of surprisingly simple programs; and naysayers who claimed that machines would “never” be able to do this or that have repeatedly been proven wrong. On the other hand, the more typical errors among practitioners have been to underestimate the difficulties of getting a system to perform robustly on real-world tasks, and to overestimate the advantages of their own particular pet project or technique.

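Given experts’ (and non-experts’) poor track record at predicting progress in AI, we are relatively agnostic about when full AI will be invented. It could come sooner than expected, or later than expected.

Experts also reported a 10% median confidence that superintelligence would be developed within 2 years of human equivalence, and a 75% confidence that superintelligence would be developed within 30 years of human equivalence. Here MIRI researchers’ views differ significantly from AI experts’ median view; we expect AI systems to surpass humans relatively quickly once they near human equivalence.

5. What technical problems are you working on?

“Aligning smarter-than-human AI with human interests” is an extremely vague goal. To approach this problem productively, we attempt to factorize it into several subproblems. As a starting point, we ask: “What aspects of this problem would we still be unable to solve even if the problem were much easier?”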

In order to achieve real-world goals more effectively than a human, a general AI system will need to be able to learn its environment over time and decide between possible proposals or actions. A simplified version of the alignment problem, then, would be to ask how we could construct a system that learns its environment and has a very crude decision criterion, like “Select the policy that maximizes the expected number of diamonds in the world.”
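As a rough illustration of what such a crude decision criterion looks like once everything difficult has been assumed away, here is a minimal sketch. The policy names, the probabilities, and the diamond counts are all hypothetical, and the world model is simply handed to the agent as a lookup table in which each outcome already comes labeled with a diamond count.

```python
# Hypothetical world model: for each policy, a list of
# (probability, resulting number of diamonds) outcomes.
WORLD_MODEL = {
    "mine": [(0.5, 4), (0.5, 0)],
    "synthesize": [(0.9, 3), (0.1, 0)],
    "idle": [(1.0, 1)],
}


def expected_diamonds(outcomes):
    """Expected diamond count for one policy's outcome distribution."""
    return sum(probability * diamonds for probability, diamonds in outcomes)


def select_policy(world_model):
    """Return the policy with the highest expected number of diamonds."""
    return max(world_model, key=lambda policy: expected_diamonds(world_model[policy]))


chosen = select_policy(WORLD_MODEL)
print(chosen)  # synthesize
```

The selection step is a single `max` call. What the sketch takes for granted is the part that remains open: a learned world model does not arrive with a trustworthy “diamonds” label attached to each outcome.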

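Highly reliable agent design is the technical challenge of formally specifying a software system that can be relied upon to pursue some preselected toy goal. An example of a subproblem in this space is ontology identification: how do we formalize the goal of “maximizing diamonds” in full generality, allowing that a fully autonomous agent may end up in unexpected environments and may construct unanticipated hypotheses and policies? Even if we had unbounded computational power and all the time in the world, we don’t currently know how to solve this problem. This suggests that we’re missing not only practical algorithms but also a basic theoretical framework through which to understand the problem.

The formal agent AIXI is an attempt to define what we mean by “optimal behavior” in the case of a reinforcement learner. A simple AIXI-like equation is lacking, however, for defining what we mean by “good behavior” if the goal is to change something about the external world (and not just to maximize a pre-specified reward number). In order for the agent to evaluate its world-models to count the number of diamonds, as opposed to having a privileged reward channel, what general formal properties must its world-models possess? If the system updates its hypotheses (e.g., discovers that string theory is true and quantum physics is false) in a way its programmers didn’t expect, how does it identify “diamonds” in the new model? The question is a very basic one, yet the relevant theory is currently missing.

We can distinguish highly reliable agent design from the problem of value specification: “Once we understand how to design an autonomous AI system that promotes a goal, how do we ensure its goal actually matches what we want?” Since human error is inevitable, and we will need to be able to safely supervise and redesign AI algorithms even as they approach human equivalence in cognitive tasks, MIRI also works on formalizing error-tolerant agent properties. Artificial Intelligence: A Modern Approach, the standard textbook in AI, summarizes the challenge: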

Yudkowsky […] asserts that friendliness (a desire not to harm humans) should be designed in from the start, but that the designers should recognize both that their own designs may be flawed, and that the robot will learn and evolve over time. Thus the challenge is one of mechanism design: to design a mechanism for evolving AI under a system of checks and balances, and to give the systems utility functions that will remain friendly in the face of such changes.⁶

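Our technical agenda describes these open problems in more detail, and our research guide collects online resources for learning more.

6. Why work on AI safety early?

MIRI prioritizes early safety work because we believe such work is important, time-sensitive, tractable, and informative.

The importance of AI safety work is outlined in Q3, above. We see the problem as time-sensitive as a result of: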

neglectedness: Only a handful of people are currently working on the open problems outlined in the MIRI technical agenda.

apparent difficulty: Solving the alignment problem may demand a large number of researcher hours, and may also be harder to parallelize than capabilities research.

risk asymmetry: Working on safety too late has larger risks than working on it too early.

AI timeline uncertainty: AI could progress faster than we expect, making it prudent to err on the side of caution.

discontinuous progress in AI: Progress in AI is likely to speed up as we approach general AI. This means that even if AI is many decades away, it would be hazardous to wait for clear signs that general AI is near: clear signs may only arise when it’s too late to begin safety work.

We also think it is possible to do useful work in AI safety today, even if smarter-than-human AI is 50 or 100 years away. We think this for a few reasons:

lack of basic theory: If we had simple idealized models of what we mean by correct behavior in autonomous agents, but didn’t know how to design practical implementations, this might suggest a need for more hands-on work with developed systems. Instead, however, simple models are what we’re missing. Basic theory doesn’t necessarily require that we have experience with a software system’s implementation details, and the same theory can apply to many different implementations.

precedents: Theoretical computer scientists have had repeated success in developing basic theory in the relative absence of practical implementations. (Well-known examples include Claude Shannon, Alan Turing, Andrey Kolmogorov, and Judea Pearl.)

early results: We’ve made significant advances since prioritizing some of the theoretical questions we’re looking at, especially in decision theory and logical uncertainty. This suggests that there’s low-hanging theoretical fruit to be picked.

Finally, we expect progress in AI safety theory to be useful for improving our understanding of robust AI systems, of the available technical options, and of the broader strategic landscape. In particular, we expect transparency to be necessary for reliable behavior, and we think there are basic theoretical prerequisites to making autonomous AI systems transparent to human designers and users.

Having the relevant theory in hand may not be strictly necessary for designing smarter-than-human AI systems — highly reliable agents may need to employ very different architectures or cognitive algorithms than the most easily constructed smarter-than-human systems that exhibit unreliable behavior. For that reason, some fairly general theoretical questions may be more relevant to AI safety work than to mainline AI capabilities work. Key advantages to AI safety work’s informativeness, then, include:

general value of information: Making AI safety questions clearer and more precise is likely to give insights into what kinds of formal tools will be useful in answering them. Thus we’re less likely to spend our time on entirely the wrong lines of research. Investigating technical problems in this area may also help us develop a better sense for how difficult the AI problem is, and how difficult the AI alignment problem is.

requirements for informative testing: If the system is opaque, then online testing may not give us most of the information that we need to design safer systems. Humans are opaque general reasoners, and studying the brain has been quite useful for designing more effective AI algorithms, but it has been less useful for building systems for verification and validation.

requirements for safe testing: Extracting information from an opaque system may not be safe, since any sandbox we build may have flaws that are obvious to a superintelligence but not to a human.

7. How can I contribute?

MIRI is a research nonprofit funded primarily by small and mid-sized donors. Donations are therefore helpful for funding our mathematics work, workshops, academic outreach, etc.

For people interested in learning more about our research focus and possibly working with us, we offer an application form and a number of regularly updated online resources.

Written by Rob Bensinger. Last updated September 18, 2016.

1. Nilsson (2009). The Quest for Artificial Intelligence. Cambridge University Press.

2. Bostrom (2014). Superintelligence: Paths, Dangers, Strategies. Oxford University Press.

3. Office of the US Air Force Chief Scientist (2010). Technology Horizons: A Vision for Air Force Science and Technology 2010-30.

4. Russell (2014). “Of Myths and Moonshine.” Edge.org. Edge Foundation, Inc.

5. Müller and Bostrom (2014). “Future Progress in Artificial Intelligence: A Survey of Expert Opinion.” In Müller (ed.), Fundamental Issues of Artificial Intelligence. Springer.

6. Russell and Norvig (2009). Artificial Intelligence: A Modern Approach. Pearson.