Five theses, two lemmas, and a couple of strategic implications


MIRI’s primary concern about self-improving AI isn’t so much that it might be created by ‘bad’ actors rather than ‘good’ actors in the global sphere; rather, most of our concern is in remedying the situation in which no one knows at all how to create a self-modifying AI with known, stable preferences. (This is why we see the main problem in terms of doing research and encouraging others to perform relevant research, rather than trying to stop ‘bad’ actors from creating AI.)

This, and a number of other basic strategic views, can be summed up as a consequence of 5 theses about purely factual questions about AI, and 2 lemmas we think are implied by them, as follows:

Intelligence explosion thesis. A sufficiently smart AI will be able to realize large, reinvestable cognitive returns from things it can do on a short timescale, like improving its own cognitive algorithms or purchasing/stealing lots of server time. The intelligence explosion will hit very high levels of intelligence before it runs out of things it can do on a short timescale. See: Chalmers (2010); Muehlhauser & Salamon (2013); Yudkowsky (2013).

Orthogonality thesis. Mind design space is vast enough to contain agents with almost any possible set of preferences, and such agents can be instrumentally rational about achieving those preferences and can have great computational power. For example, mind design space theoretically contains powerful, instrumentally rational agents that act as expected paperclip maximizers and always choose the option leading to the greatest number of expected paperclips. (A toy sketch of such an agent appears after the five theses below.) See: Bostrom (2012); Armstrong (2013).

Convergent instrumental goals thesis. Most utility functions will generate a subset of instrumental goals which follow from most possible final goals. For example, if you want to build a galaxy full of happy sentient beings, you will need matter and energy, and the same is also true if you want to make paperclips. This thesis is why we’re worried about very powerful entities even if they have no explicit dislike of us: “The AI does not love you, nor does it hate you, but you are made of atoms it can use for something else.” Note though that by the Orthogonality Thesis you can always have an agent which explicitly, terminally prefers not to do any particular thing — an AI which does love you will not want to break you apart for spare atoms. See: Omohundro (2008); Bostrom (2012).

Complexity of value thesis. It takes a large chunk of Kolmogorov complexity to describe even idealized human preferences. That is, what we ‘should’ do is a computationally complex mathematical object even after we take the limit of reflective equilibrium (judging your own thought processes) and other standard normative theories. A superintelligence with a randomly generated utility function would not do anything we see as worthwhile with the galaxy, because it is unlikely to accidentally hit on final preferences for having a diverse civilization of sentient beings leading interesting lives. See: Yudkowsky (2011); Muehlhauser & Helm (2013).

Fragility of value thesis. Getting a goal system 90% right does not give you 90% of the value, any more than correctly dialing 9 out of 10 digits of my phone number will connect you to somebody who’s 90% similar to Eliezer Yudkowsky. There are multiple dimensions for which eliminating that dimension of value would eliminate almost all value from the future. For example, an alien species which shared almost all of human value except that their parameter setting for “boredom” was much lower, might devote most of their computational power to replaying a single peak, optimal experience over and over again with slightly different pixel colors (or the equivalent thereof). Friendly AI is more like a satisficing threshold than something where we’re trying to eke out successive 10% improvements. See: Yudkowsky (2009, 2011).
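
As a toy illustration of the Orthogonality thesis (our own sketch, not drawn from the cited papers): the decision machinery of an expected-utility maximizer is completely indifferent to what its utility function rewards, so the same few lines of code yield a paperclip maximizer or a very different agent depending only on the function passed in. All option names and numbers below are hypothetical.

```python
# Minimal sketch, standard library only: an expected-utility maximizer whose
# "final goal" lives entirely in the utility function it is handed.

def best_option(outcome_distributions, utility):
    """Pick the option with the highest expected utility.

    outcome_distributions maps each option name to a list of
    (probability, outcome) pairs describing what that option might lead to.
    """
    def expected_utility(option):
        return sum(p * utility(outcome)
                   for p, outcome in outcome_distributions[option])
    return max(outcome_distributions, key=expected_utility)

# Hypothetical options and outcomes, purely for illustration.
outcomes = {
    "convert_matter_to_paperclips": [
        (0.9, {"paperclips": 10**6, "happy_beings": 0}),
        (0.1, {"paperclips": 0, "happy_beings": 0}),
    ],
    "build_flourishing_civilization": [
        (1.0, {"paperclips": 10, "happy_beings": 10**9}),
    ],
}

# Same decision procedure, two different sets of terminal preferences:
print(best_option(outcomes, lambda o: o["paperclips"]))    # -> convert_matter_to_paperclips
print(best_option(outcomes, lambda o: o["happy_beings"]))  # -> build_flourishing_civilization
```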

These five theses seem to imply two important lemmas:

Indirect normativity. Programming a self-improving machine intelligence to implement a grab-bag of things that seem like good ideas will lead to a bad outcome, no matter how much those ideas sound like apple pie and motherhood. For example, if you give the AI the final goal “make people happy,” it will just turn people’s pleasure centers up to maximum. “Indirectly normative” is Bostrom’s term for an AI that calculates the ‘right’ thing to do via, e.g., looking at human beings and modeling their decision processes and idealizing those decision processes (e.g. what you would-want if you knew everything the AI knew and understood your own decision processes, reflective equilibria, ideal advisor theories, and so on), rather than being told a direct set of ‘good ideas’ by the programmers. Indirect normativity is how you deal with Complexity and Fragility. If you can succeed at indirect normativity, then small variances in essentially good intentions may not matter much — that is, if two different projects do indirect normativity correctly, but one project has 20% nicer and kinder researchers, we could still hope that the end results would be of around equal expected value. See: Muehlhauser & Helm (2013).

Large bounded extra difficulty of Friendliness. You can build a Friendly AI (by the Orthogonality thesis), but it takes a lot of work and cleverness to get the goal system right. Perhaps more importantly, the rest of the AI needs to meet a higher standard of cleanness so that the goal system stays invariant through a billion sequential self-modifications. Any AI sufficiently intelligent to do clean self-modification will tend to do so anyway, but the problem is that the intelligence explosion might start with AIs well below that level — for example, AIs that rewrite themselves using genetic algorithms or other such methods that don’t preserve a set of consequentialist preferences. In this case, building a Friendly AI could mean that our AI has to be smarter about self-modification than the minimal AI that could undergo an intelligence explosion. See: Yudkowsky (2008); Yudkowsky (2013).

These lemmas in turn have two major strategic implications:

  1. We have a lot of work to do on things like indirect normativity and stable self-improvement. At this stage much of this work looks really foundational — that is, we can’t yet describe how to do these things using unlimited computing power, let alone limited computing power. We should get started on this work as early as possible, since basic research often takes a lot of time.
  2. There needs to be a Friendly AI project that has some sort of boost over competing projects which don’t live up to a (very) high standard of Friendly AI work — a project which can successfully build a stable-goal-system self-improving AI, before a less-well-funded project hacks together a much sloppier self-improving AI. Giant supercomputers may be less important to this than being able to bring together the smartest researchers (see the open question posed in Yudkowsky 2013), but the required advantage cannot be left up to chance. Leaving things to default means that projects less careful about self-modification would have an advantage greater than casual altruism is likely to overcome.

AGI Impact Experts and Friendly AI Experts


MIRI’s mission is “to ensure that the creation of smarter-than-human intelligence has a positive impact.” A central strategy for achieving this mission is to find and train what one might call “AGI impact experts” and “Friendly AI experts.”

AGI impact experts develop skills related to predicting technological development (e.g. building computational models of AI development or reasoning about intelligence explosion microeconomics), predicting AGI’s likely impact on society, and identifying which interventions are most likely to increase humanity’s chances of safely navigating the creation of AGI. For overviews, see Bostrom & Yudkowsky (2013); Muehlhauser & Salamon (2013).

Friendly AI experts develop skills useful for the development of mathematical architectures that can enable AGIs to be trustworthy (or “human-friendly”). This work is carried out at MIRI research workshops and in various publications, e.g. Christiano et al. (2013); Hibbard (2013). Note that the term “Friendly AI” was chosen (in part) to avoid suggesting that our understanding of the subject is very good — a phrase like “moral AI” might sound like the sort of thing one could understand by looking it up in an encyclopedia, whereas our understanding of trustworthy AI remains poor.

Now, what do we mean by “expert”?

Read more »

“Intelligence Explosion Microeconomics” Released

||Papers

MIRI’s new, 93-page technical report by Eliezer Yudkowsky, “Intelligence Explosion Microeconomics,” has now been released. The report explains one of the open problems of our research program. Here’s the abstract:

I. J. Good’s thesis of the ‘intelligence explosion’ is that a sufficiently advanced machine intelligence could build a smarter version of itself, which could in turn build an even smarter version of itself, and that this process could continue enough to vastly exceed human intelligence. As Sandberg (2010) correctly notes, there are several attempts to lay down return-on-investment formulas intended to represent sharp speedups in economic or technological growth, but very little attempt has been made to deal formally with I. J. Good’s intelligence explosion thesis as such.

I identify the key issue as returns on cognitive reinvestment – the ability to invest more computing power, faster computers, or improved cognitive algorithms to yield cognitive labor which produces larger brains, faster brains, or better mind designs. There are many phenomena in the world which have been argued as evidentially relevant to this question, from the observed course of hominid evolution, to Moore’s Law, to the competence over time of machine chess-playing systems, and many more. I go into some depth on the sort of debates which then arise on how to interpret such evidence. I propose that the next step forward in analyzing positions on the intelligence explosion would be to formalize return-on-investment curves, so that each stance can say formally which possible microfoundations they hold to be falsified by historical observations already made. More generally, I pose multiple open questions of ‘returns on cognitive reinvestment’ or ‘intelligence explosion microeconomics’. Although such questions have received little attention thus far, they seem highly relevant to policy choices affecting the outcomes for Earth-originating intelligent life.
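
To give a concrete sense of what “formalizing return-on-investment curves” could look like, here is one toy functional form (our own illustration; the report surveys and debates many possible microfoundations rather than endorsing this one). Suppose cognitive capacity I(t) grows by reinvesting its own output with a returns exponent k:

```latex
\[
  \frac{dI}{dt} = c\, I(t)^{k}, \qquad c > 0 .
\]
% k < 1: sub-exponential (polynomial) growth -- returns peter out.
% k = 1: exponential growth.
% k > 1: the solution
\[
  I(t) = \bigl[\, I_0^{\,1-k} - (k-1)\,c\,t \,\bigr]^{-1/(k-1)}
\]
% diverges in finite time at t* = I_0^{1-k} / [(k-1) c],
% one crude way to formalize "explosive" returns on cognitive reinvestment.
```

Which regime (if any such simple form applies at all) is supported by evidence like hominid evolution, Moore’s Law, or machine chess is the sort of dispute the report tries to make precise.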

The preferred place for public discussion of this research is here. There is also a private mailing list for technical discussants, which you can apply to join here.

“Singularity Hypotheses” Published

||Papers

Singularity Hypotheses: A Scientific and Philosophical Assessment has now been published by Springer, in hardcover and ebook forms.

The book contains 20 chapters about the prospect of machine superintelligence, including 4 chapters by MIRI researchers and research associates.

“Intelligence Explosion: Evidence and Import” (pdf) by Luke Muehlhauser and (former MIRI researcher) Anna Salamon reviews

the evidence for and against three claims: that (1) there is a substantial chance we will create human-level AI before 2100, that (2) if human-level AI is created, there is a good chance vastly superhuman AI will follow via an “intelligence explosion,” and that (3) an uncontrolled intelligence explosion could destroy everything we value, but a controlled intelligence explosion would benefit humanity enormously if we can achieve it. We conclude with recommendations for increasing the odds of a controlled intelligence explosion relative to an uncontrolled intelligence explosion.

“Intelligence Explosion and Machine Ethics” (pdf) by Luke Muehlhauser and Louie Helm discusses the challenges of formal value systems for use in AI:

Many researchers have argued that a self-improving artificial intelligence (AI) could become so vastly more powerful than humans that we would not be able to stop it from achieving its goals. If so, and if the AI’s goals differ from ours, then this could be disastrous for humans. One proposed solution is to program the AI’s goal system to want what we want before the AI self-improves beyond our capacity to control it. Unfortunately, it is difficult to specify what we want. After clarifying what we mean by “intelligence,” we offer a series of “intuition pumps” from the field of moral philosophy for our conclusion that human values are complex and difficult to specify. We then survey the evidence from the psychology of motivation, moral psychology, and neuroeconomics that supports our position. We conclude by recommending ideal preference theories of value as a promising approach for developing a machine ethics suitable for navigating an intelligence explosion or “technological singularity.”

“Friendly Artificial Intelligence” by Eliezer Yudkowsky is a shortened version of Yudkowsky (2008).

Finally, “Artificial General Intelligence and the Human Mental Model” (pdf) by Roman Yampolskiy and (MIRI research associate) Joshua Fox reviews the dangers of anthropomorphizing machine intelligence:

When the first artificial general intelligences are built, they may improve themselves to far-above-human levels. Speculations about such future entities are already affected by anthropomorphic bias, which leads to erroneous analogies with human minds. In this chapter, we apply a goal-oriented understanding of intelligence to show that humanity occupies only a tiny portion of the design space of possible minds. This space is much larger than what we are familiar with from the human example; and the mental architectures and goals of future superintelligences need not have most of the properties of human minds. A new approach to cognitive science and philosophy of mind, one not centered on the human example, is needed to help us understand the challenges which we will face when a power greater than us emerges.

The book also includes short critical responses to most chapters, including replies written by Eliezer Yudkowsky and (former MIRI staffer) Michael Anissimov.

Altair’s Timeless Decision Theory Paper Published

||Papers

During his time as a research fellow for MIRI, Alex Altair wrote a paper on Timeless Decision Theory (TDT) that has now been published: “A Comparison of Decision Algorithms on Newcomblike Problems.”

Altair’s paper is both more concise and more precise in its formulation of TDT than Yudkowsky’s earlier paper, “Timeless Decision Theory.” Thus, Altair’s paper should serve as a handy introduction to TDT for philosophers, computer scientists, and mathematicians, while Yudkowsky’s paper remains required reading for anyone interested in developing TDT further, for it covers more ground than Altair’s paper.

Altair’s abstract reads:

When formulated using Bayesian networks, two standard decision algorithms (Evidential Decision Theory and Causal Decision Theory) can be shown to fail systematically when faced with aspects of the prisoner’s dilemma and so-called “Newcomblike” problems. We describe a new form of decision algorithm, called Timeless Decision Theory, which consistently wins on these problems.
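
For readers new to the setup, here is a minimal sketch of how the two standard algorithms come apart on Newcomb’s problem (the numbers and code are our own illustration, not taken from Altair’s paper, and TDT’s actual Bayesian-network construction is not reproduced here):

```python
# Newcomb's problem, toy numbers: box A always holds $1,000; box B holds
# $1,000,000 iff a highly accurate predictor predicted you would take only box B.

ACCURACY = 0.99                      # assumed predictor accuracy (illustrative)
MILLION, THOUSAND = 1_000_000, 1_000

def payoff(action, box_b_full):
    base = MILLION if box_b_full else 0
    return base + (THOUSAND if action == "two-box" else 0)

def edt_value(action):
    # Evidential DT: treat the chosen action as evidence about the prediction.
    p_full = ACCURACY if action == "one-box" else 1 - ACCURACY
    return p_full * payoff(action, True) + (1 - p_full) * payoff(action, False)

def cdt_value(action, p_full=0.5):
    # Causal DT: the box contents are already fixed; choosing now cannot change
    # them, so the probability of a full box is independent of the action.
    return p_full * payoff(action, True) + (1 - p_full) * payoff(action, False)

for name, value in (("EDT", edt_value), ("CDT", cdt_value)):
    print(name, "recommends", max(("one-box", "two-box"), key=value))
# EDT recommends one-boxing; CDT recommends two-boxing, since two-boxing
# dominates once the contents are held fixed. TDT is designed to one-box
# here (the "winning" move) while still computing in a causal style.
```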

We may submit the paper to a journal later, but we’ve already posted the current version to our website so that readers don’t have to wait two years (from submission to accepted publication) to read it.

For a gentle introduction to the entire field of normative decision theory (including TDT), see Muehlhauser and Williamson’s Decision Theory FAQ.

MIRI’s April Newsletter: Relaunch Celebration and a New Math Result

||Newsletter


Greetings from The Executive Director

Dear friends,

These are exciting times at MIRI.

经过多年的宣传和capacity-building, we have finally transformed ourselves into a research institute focused on producing the mathematical research required to build trustworthy (or “human-friendly”) machine intelligence. As our most devoted supporters know, this has been our goal for roughly a decade, and it is a thrill to have made the transition.

It is also exciting to see how much more quickly one can get academic traction with mathematics research, as compared to philosophical research and technological forecasting research. Within hours of publishing a draft of our first math result, Fields Medalist Timothy Gowers had seen the draft and commented on it (here), along with several other professional mathematicians.

We celebrated our “relaunch” at an April 11th party in San Francisco. It was a joy to see old friends and make some new ones. You can see photos and read some details below.

For more details on our new strategic priorities, see our blog post: MIRI’s Strategy for 2013.

Cheers,

Luke Muehlhauser
Executive Director

Read more »

MIRI’s Strategy for 2013

||MIRI Strategy

This post is not a detailed strategic plan. For now, I just want to provide an update on what MIRI is doing in 2013 and why.

Our mission remains the same. The creation of smarter-than-human intelligence will likely be the most significant event in human history, and MIRI exists to help ensure that this event has a positive impact.

Still, much has changed in the past year:

  • The short-term goals in our August 2011 strategic plan were largely accomplished.
  • We changed our name from “The Singularity Institute” to “The Machine Intelligence Research Institute” (MIRI).
  • We were once doing three things — research, rationality training, and the Singularity Summit. Now we’re doing one thing: research. Rationality training was spun out into a separate organization, CFAR, and the Summit was acquired by Singularity University. We still co-produce the Singularity Summit with Singularity University, but this requires limited effort on our part.
  • After dozens of hours of strategic planning in January–March 2013, and with input from 20+ external advisors, we’ve decided to (1) put less effort into public outreach, and to (2) shift our research priorities to Friendly AI math research.

It’s this last pair of changes I’d like to explain in more detail below.

Read more »

Facing the Intelligence Explosion Ebook

||News

Facing the Intelligence Explosion is now available as an ebook!

You can get it here. It is available as a “pay-what-you-want” package that includes the ebook in three formats: MOBI, EPUB, and PDF.

It is also available on Amazon Kindle (US, Canada, UK, and most others) and the Apple iBookstore (US, Canada, UK, and most others).

All sources are DRM-free. Grab a copy, share it with your friends, and review it on Amazon or the iBookstore.

All proceeds directly fund technical and strategic research at the Machine Intelligence Research Institute.