New Research Page and Two New Articles

||Papers

Our new research page has launched!

Our previous research page was a simple list of articles, but the new page describes the purpose of our research, explains four categories of research to which we contribute, and highlights the papers we think are most important to read.

We’ve also released drafts of two new research articles.

Tiling Agents for Self-Modifying AI, and the Löbian Obstacle (discuss it here), by Yudkowsky and Herreshoff, explains one of the key open problems in MIRI’s research agenda:

We model self-modification in AI by introducing “tiling” agents whose decision systems will approve the construction of highly similar agents, creating a repeating pattern (including similarity of the offspring’s goals). Constructing a formalism in the most straightforward way produces a Gödelian difficulty, the “Löbian obstacle.” By technical methods we demonstrate the possibility of avoiding this obstacle, but the underlying puzzles of rational coherence are thus only partially addressed. We extend the formalism to partially unknown deterministic environments, and show a very crude extension to probabilistic environments and expected utility; but the problem of finding a fundamental decision criterion for self-modifying probabilistic agents remains open.

Robust Cooperation in the Prisoner’s Dilemma: Program Equilibrium via Provability Logic (discuss it here), by LaVictoire et al., explains some progress in program equilibrium made by MIRI research associate Patrick LaVictoire and several others during MIRI’s April 2013 workshop:

Rational agents defect on the one-shot prisoner’s dilemma even though mutual cooperation would yield higher utility for both agents. Moshe Tennenholtz showed that if each program is allowed to pass its playing strategy to all other players, some programs can then cooperate on the one-shot prisoner’s dilemma. Program equilibria is Tennenholtz’s term for Nash equilibria in a context where programs can pass their playing strategies to the other players.

One weakness of this approach so far has been that any two programs which make different choices cannot “recognize” each other for mutual cooperation, even if they are functionally identical. In this paper, provability logic is used to enable a more flexible and secure form of mutual cooperation.
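The weakness described in the abstract above can be illustrated with a minimal Python sketch. This is not code from the paper: the names `clique_bot` and `play`, and the choice to represent each program as a plain source string, are assumptions made for illustration. The bot cooperates only with an exact textual copy of itself, so two programs that behave identically but differ by a single comment end up defecting against each other.

```python
# Hypothetical illustration only; not taken from LaVictoire et al.
COOPERATE, DEFECT = "C", "D"


def clique_bot(own_source, opponent_source):
    """Cooperate only if the opponent's source text is identical to our own."""
    return COOPERATE if opponent_source == own_source else DEFECT


def play(source_a, source_b):
    """Each player reads the other's source, then both move simultaneously."""
    return clique_bot(source_a, source_b), clique_bot(source_b, source_a)


# Two made-up program texts with the same behavior but different spelling.
SOURCE_1 = "return C if opponent == me else D"
SOURCE_2 = "return C if opponent == me else D  # same behavior, extra comment"

exact_copies = play(SOURCE_1, SOURCE_1)  # identical text on both sides
near_copies = play(SOURCE_1, SOURCE_2)   # same behavior, different text

print(exact_copies, near_copies)
```

Comparing source text is the simplest possible recognition rule, which is why it is so brittle; the paper's provability-logic approach is aimed at relaxing exactly this requirement.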

Participants of MIRI’s April workshop also made progress on Christiano’s probabilistic logic (an attack on the Löbian obstacle), but that work is not yet ready to be released.

We’ve also revamped the Get Involved page, which now includes an application form for forthcoming workshops. If you might like to work with MIRI on some of its open research problems sometime in the next 18 months, please apply! Likewise, if you know someone who might enjoy attending such a workshop, please encourage them to apply.

Friendly AI Research as Effective Altruism

||Analysis

MIRI was founded in 2000 on the premise that creating[1] Friendly AI might be a particularly efficient way to do as much good as possible.

Some developments since then include:

  • The field of “effective altruism” (trying not just to do good but to do as much good as possible[2]) has seen more publicity and better research than ever before, in particular through the work of GiveWell, the Center for Effective Altruism, philosopher Peter Singer, and the community at Less Wrong.[3]
  • In his recent PhD dissertation, Nick Beckstead has clarified the assumptions behind the claim that shaping the far future (e.g. via Friendly AI) is overwhelmingly important.
  • Due to research performed by MIRI, the Future of Humanity Institute (FHI), and others, our strategic situation with regard to machine superintelligence is more clearly understood, and FHI’s Nick Bostrom has organized much of this work in a forthcoming book.[4]
  • MIRI’s Eliezer Yudkowsky has begun to describe in more detail which open research problems constitute “Friendly AI research,” in his view.

Given these developments, we are in a better position than ever before to assess the value of Friendly AI research as effective altruism.

Still, this is a difficult question. It is challenging enough to evaluate the cost-effectiveness of anti-malaria nets or direct cash transfers. Evaluating the cost-effectiveness of attempts to shape the far future (e.g. via Friendly AI) is even more difficult than that. Hence, this short post sketches an argument that can be given in favor of Friendly AI research as effective altruism, to enable future discussion, and is not intended as a thorough analysis.



  1. In this post, I talk about the value of humanity in general creating Friendly AI, though MIRI co-founder Eliezer Yudkowsky usually talks about MIRI in particular (or at least, a functional equivalent) creating Friendly AI. This is because I am not as confident as Yudkowsky that it is best for MIRI to attempt to build Friendly AI. When updating MIRI’s bylaws in early 2013, Yudkowsky and I came to a compromise on the language of MIRI’s mission statement, which now reads: “[MIRI] exists to ensure that the creation of smarter-than-human intelligence has a positive impact. Thus, the charitable purpose of [MIRI] is to: (a) perform research relevant to ensuring that smarter-than-human intelligence has a positive impact; (b) raise awareness of this important issue; (c) advise researchers, leaders and laypeople around the world; and (d) as necessary, implement a smarter-than-human intelligence with humane, stable goals” (emphasis added). My own hope is that it will not be necessary for MIRI (or a functional equivalent) to attempt to build Friendly AI itself. But of course I must remain open to the possibility that this will be the wisest course of action as the first creation of AI draws nearer. There is also the question of capability: few people think that a non-profit research organization has much chance of being the first to build AI. I worry, however, that the world’s elites will not find it fashionable to take this problem seriously until the creation of AI is only a few decades away, at which time it will be especially difficult to develop the mathematics of Friendly AI in time, and humanity will be forced to take a gamble on its very survival with powerful AIs we have little reason to trust.
  2. One might think of effective altruism as a straightforward application of decision theory to the subject of philanthropy. Philanthropic agents of all kinds (individuals, groups, foundations, etc.) ask themselves: “How can we choose philanthropic acts (e.g. donations) which (in expectation) will do as much good as possible, given what we care about?” The consensus recommendation for all kinds of choices under uncertainty, including philanthropic choices, is to maximize expected utility (Chater & Oaksford 2012; Peterson 2004; Stein 1996; Schmidt 1998:19). Different philanthropic agents value different things, but decision theory suggests that each of them can get the most of what they want if they each maximize their expected utility. Choices which maximize expected utility are in this sense “optimal,” and thus another term for effective altruism is “optimal philanthropy.” Note that effective altruism in this sense is not too dissimilar from earlier approaches to philanthropy, including high-impact philanthropy (making “the biggest difference possible, given the amount of capital invested”), strategic philanthropy, effective philanthropy, and wise philanthropy. Note also that effective altruism does not say that a philanthropic agent should specify complete utility and probability functions over outcomes and then compute the philanthropic act with the highest expected utility; that is impractical for bounded agents. We must keep in mind the distinction between normative, descriptive, and prescriptive models of decision-making (Baron 2007): “normative models tell us how to evaluate… decisions in terms of their departure from an ideal standard. Descriptive models specify what people in a particular culture actually do and how they deviate from the normative models. Prescriptive models are designs or inventions, whose purpose is to bring the results of actual thinking into closer conformity to the normative model.” The prescriptive question, about what bounded philanthropic agents should do to maximize expected utility with their philanthropic choices, tends to be extremely complicated, and is the subject of most of the research performed by the effective altruism community.
  3. See, for example: Efficient Charity, Efficient Charity: Do Unto Others, Politics as Charity, Heuristics and Biases in Charity, Public Choice and the Altruist’s Burden, On Charities and Linear Utility, Optimal Philanthropy for Human Beings, Purchase Fuzzies and Utilons Separately, Money: The Unit of Caring, Optimizing Fuzzies and Utilons: The Altruism Chip Jar, Efficient Philanthropy: Local vs. Global Approaches, The Effectiveness of Developing World Aid, Against Cryonics & For Cost-Effective Charity, Bayesian Adjustment Does Not Defeat Existential Risk Charity, How to Save the World, and What is Optimal Philanthropy?
  4. I believe Beckstead and Bostrom have done the research community an enormous service in creating a framework, a shared language, for discussing trajectory changes, existential risks, and machine superintelligence. When discussing these topics with my colleagues, it has often been the case that the first hour of conversation is spent merely trying to understand what the other person is saying: how they are using the terms and concepts they employ. Beckstead’s and Bostrom’s recent work should enable clearer and more efficient communication between researchers, and therefore greater research productivity. Though I am not aware of any controlled, experimental studies on the effect of shared language on research productivity, a shared language is widely considered to be of great benefit for any field of research, and I shall provide a few examples of this claim which appear in print. Fuzzi et al. (2006): “The use of inconsistent terms can easily lead to misunderstandings and confusion in the communication between specialists from different [disciplines] of atmospheric and climate research, and may thus potentially inhibit scientific progress.” Hinkel (2008): “Technical languages enable their users, e.g. members of a scientific discipline, to communicate efficiently about a domain of interest.” Madin et al. (2007): “terminological ambiguity slows scientific progress, leads to redundant research efforts, and ultimately impedes advances towards a unified foundation for ecological science.”
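The normative ideal mentioned in note 2, choosing the act with the highest expected utility, can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the option names, probabilities, and utilities are entirely made up, and, as note 2 stresses, bounded agents cannot actually write down complete probability and utility functions like this.

```python
def expected_utility(outcomes):
    """Sum of probability * utility over (probability, utility) pairs.

    The probabilities for one option are assumed to sum to 1.
    """
    total_probability = sum(p for p, _ in outcomes)
    if abs(total_probability - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return sum(p * u for p, u in outcomes)


# Hypothetical options with made-up numbers, for illustration only.
options = {
    "safe_bet": [(1.0, 10.0)],                   # certain, modest benefit
    "long_shot": [(0.01, 2000.0), (0.99, 0.0)],  # unlikely, large benefit
}

scores = {name: expected_utility(outs) for name, outs in options.items()}
best = max(scores, key=scores.get)

print(scores, best)
```

With these invented numbers the unlikely option scores higher, which shows why the conclusion is only as good as the probability and utility estimates fed in.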

MIRI May Newsletter: Intelligence Explosion Microeconomics and Other Publications

||Newsletters

Greetings From the Executive Director

Dear friends,

It’s been a busy month!

Mostly, we’ve been busy publishing things. As you’ll see below, Singularity Hypotheses has now been published, and it includes four chapters by MIRI researchers or research associates. We’ve also published two new technical reports (one on decision theory and another on intelligence explosion microeconomics) and several new blog posts analyzing various issues relating to the future of AI. Finally, we added four older articles to the research page, including Ideal Advisor Theories and Personal CEV (2012).

In our April newsletter we spoke about our April 11th party in San Francisco, celebrating our relaunch as the Machine Intelligence Research Institute and our transition to mathematical research. Additional photos from that event are now available as a Facebook photo album. We’ve also uploaded a video from the event, in which I spend 2 minutes explaining MIRI’s relaunch and some tentative results from the April workshop. After that, visiting researcher Qiaochu Yuan spends 4 minutes explaining one of MIRI’s core research questions: the Löbian obstacle to self-modifying systems.

Some of the research from our April workshop will be published in June, so if you’d like to read about those results right away, you might like to subscribe to our blog.

Cheers!

Luke Muehlhauser

Executive Director


Sign up for DAGGRE to improve science & technology forecasting

||News

In When Will AI Be Created?, I named four methods that might improve our forecasts of AI and other important technologies. Two of these methods were explicit quantification and leveraging aggregation, as exemplified by IARPA’s ACE program, which aims to “dramatically enhance the accuracy, precision, and timeliness of… forecasts for a broad range of event types, through the development of advanced techniques that elicit, weight, and combine the judgments of many… analysts.”
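As a rough illustration of what it means to “weight and combine” quantified judgments, here is a minimal Python sketch of a weighted average of probability forecasts, scored with the Brier score. It does not describe how ACE or DAGGRE actually aggregate forecasts; the function names, the analysts' probabilities, and the weights are all hypothetical.

```python
def combine_forecasts(probabilities, weights):
    """Weighted average (linear pool) of probability forecasts for one event."""
    if not probabilities or len(probabilities) != len(weights):
        raise ValueError("need exactly one weight per forecast")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(p * w for p, w in zip(probabilities, weights)) / total_weight


def brier_score(forecast, outcome):
    """Squared error of a probability forecast; outcome is 0 or 1. Lower is better."""
    return (forecast - outcome) ** 2


# Made-up numbers: three analysts estimate the chance of the same event.
analyst_probabilities = [0.2, 0.5, 0.8]
analyst_weights = [1.0, 1.0, 2.0]  # hypothetical, e.g. based on track record

pooled = combine_forecasts(analyst_probabilities, analyst_weights)
score_if_event_happens = brier_score(pooled, 1)

print(round(pooled, 3), round(score_if_event_happens, 3))
```

Explicit quantification is what makes this possible at all: a forecast stated as a probability can be averaged with others and scored against what actually happened, while a vague verbal prediction cannot.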

GMU’s DAGGRE program, one of five teams participating in ACE, recently announced a transition from geopolitical forecasting to science & technology forecasting:

DAGGRE will continue, but it will transition from geo-political forecasting to science and technology (S&T) forecasting to better use its combinatorial capabilities. We will have a brand new shiny, friendly and informative interface co-designed by Inkling Markets, opportunities for you to provide your own forecasting questions and more!

Another exciting development is that our S&T forecasting prediction market will be open to everyone in the world who is at least eighteen years of age. We’re going global!

If you want to help improve humanity’s ability to forecast important technological developments like AI, please register for DAGGRE’s new S&T prediction website here.

I did.

Four Articles Added to Research Page

||Papers

Four older articles have been added to our research page.

The first is the early draft of Christiano et al.’s “Definability of ‘Truth’ in Probabilistic Logic” previously discussed here and here. The draft was last updated on April 2, 2013.

The second paper is a cleaned-up version of an article originally published to Less Wrong in December 2012 by Luke Muehlhauser and Chris Williamson: “Ideal Advisor Theories and Personal CEV.”

The third and fourth papers were originally published by Bill Hibbard in the AGI 2012 Conference Proceedings: “Avoiding Unintended AI Behaviors” and “Decision Support for Safe AI Design.” Hibbard wrote these articles before he became a MIRI research associate, but he gave us permission to include them on our research page because (1) he became a MIRI research associate during the AGI-12 conference at which the articles were published, (2) the articles were partly inspired by a public dialogue with Luke Muehlhauser, and (3) the articles build on MIRI’s paper “Intelligence Explosion and Machine Ethics.”

As mentioned in our December 2012 newsletter, “Avoiding Unintended AI Behaviors” was awarded MIRI’s $1000 Turing Prize for Best AGI Safety Paper. The prize was awarded in honor of Alan Turing, who not only discovered some of the key ideas of machine intelligence, but also grasped its importance, writing that “…it seems probable that once [human-level machine thinking] has started, it would not take long to outstrip our feeble powers… At some stage therefore we should have to expect the machines to take control…”

When Will AI Be Created?

||Analysis

Strong AI appears to be the topic of the week. Kevin Drum at Mother Jones thinks AIs will be as smart as humans by 2040. Karl Smith at Forbes and “M.S.” at The Economist seem to roughly concur with Drum on this timeline. Moshe Vardi, the editor-in-chief of the world’s most-read computer science magazine, predicts that “by 2045 machines will be able to do if not any work that humans can do, then a very significant fraction of the work that humans can do.”

But predicting AI is more difficult than many people think.

To explore these difficulties, let’s start with a 2009 bloggingheads.tv conversation between MIRI researcher Eliezer Yudkowsky and MIT computer scientist Scott Aaronson, author of the excellent Quantum Computing Since Democritus. Early in that dialogue, Yudkowsky asked:

It seems pretty obvious to me that at some point in [one to ten decades] we’re going to build an AI smart enough to improve itself, and [it will] “foom” upward in intelligence, and by the time it exhausts available avenues for improvement it will be a “superintelligence” [relative] to us. Do you feel this is obvious?

Aaronson replied:

The idea that we could build computers that are smarter than us… and that those computers could build still smarter computers… until we reach the physical limits of what kind of intelligence is possible… that we could build things that are to us as we are to ants — all of this is compatible with the laws of physics… and I can’t find a reason of principle that it couldn’t eventually come to pass…

The main thing we disagree about is the time scale… a few thousand years [before AI] seems more reasonable to me.

Those two estimates — several decades vs. “a few thousand years” — have wildly different policy implications.

If there’s a good chance that AI will replace humans at the steering wheel of history in the next several decades, then we’d better put our gloves on and get to work making sure that this event has a positive rather than negative impact. But if we can be pretty confident that AI is thousands of years away, then we needn’t worry about AI for now, and we should focus on other global priorities. Thus it appears that “When will AI be created?” is a question with high value of information for our species.

Let’s take a moment to review the forecasting work that has been done, and see what conclusions we might draw about when AI will likely be created.


Advise MIRI with Your Domain-Specific Expertise

||News

MIRI currently has a few dozen volunteer advisors on a wide range of subjects, but we need more! If you’d like to help MIRI pursue its mission more efficiently, please sign up to be a MIRI advisor.

If you sign up, we will occasionally ask you questions, or send you early drafts of upcoming writings for feedback.

We don’t always want technical advice (“Well, you can do that with a relativized arithmetical hierarchy…”); often, we just want to understand how different groups of experts respond to our writing (“The tone of this paragraph rubs me the wrong way because…”).

At the moment, we are most in need of advisors on the following subjects:

Even if you don’t have much time to help, please sign up! We will of course respect your own limits on availability.