December 2018 Newsletter


Announcing a new edition of “Rationality: From AI to Zombies”


MIRI is putting out a new edition of Rationality: From AI to Zombies, including the first set of R:AZ print books! Map and Territory (volume 1) and How to Actually Change Your Mind (volume 2) are out today!


  • Map and Territory is:
      • $6.50 on Amazon, for the print version.
      • Pay-what-you-want on Gumroad, for PDF, EPUB, and MOBI versions.
  • How to Actually Change Your Mind is:
      • $8 on Amazon, for the print version.
      • Pay-what-you-want on Gumroad, for PDF, EPUB, and MOBI versions (available in the next day).

Read more »

2017 in review

MIRI Strategy

This post reviews MIRI’s activities in 2017, including research, recruiting, exposition, and fundraising activities.

2017 was a big transitional year for MIRI, as we took on new research projects that have a much greater reliance on hands-on programming work and experimentation. We’ve continued these projects in 2018, and they’re described more in our 2018 update. This meant a major focus on laying groundwork for much faster growth than we’ve had in the past, including setting up infrastructure and changing how we recruit to reach out to more people with engineering backgrounds.

Read more »

MIRI’s newest recruit: Edward Kmett!


Prolific Haskell developer Edward Kmett has joined the MIRI team!

Edward is perhaps best known for popularizing the use of lenses for functional programming. Lenses are a tool that provides a compositional vocabulary for accessing parts of larger structures and describing what you want to do with those parts.
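
To make the idea of a “compositional vocabulary” concrete, here is a minimal sketch of lenses in Haskell. It uses only the base library and a small hand-rolled encoding (often called van Laarhoven lenses) rather than any published lens package, so it should be read as an illustration of the concept, not as that package's implementation. The `Person` and `Address` types and every name below are hypothetical, introduced only for this example.

```haskell
{-# LANGUAGE RankNTypes #-}
module Main where

import Data.Char (toUpper)
import Data.Functor.Const (Const (..))
import Data.Functor.Identity (Identity (..))

-- A lens focuses on a part 'a' inside a larger structure 's'.
-- (Hand-rolled for illustration; not imported from a lens package.)
type Lens s a = forall f. Functor f => (a -> f a) -> s -> f s

-- Build a lens from a getter and a setter.
lens :: (s -> a) -> (s -> a -> s) -> Lens s a
lens getter setter f s = setter s <$> f (getter s)

-- Read the focused part.
view :: Lens s a -> s -> a
view l = getConst . l Const

-- Apply a function to the focused part, leaving the rest untouched.
over :: Lens s a -> (a -> a) -> s -> s
over l g = runIdentity . l (Identity . g)

-- Replace the focused part with a new value.
set :: Lens s a -> a -> s -> s
set l x = over l (const x)

-- Hypothetical nested records used only for this example.
data Address = Address { _city :: String, _zip :: String }
  deriving (Show, Eq)

data Person = Person { _name :: String, _address :: Address }
  deriving (Show, Eq)

address :: Lens Person Address
address = lens _address (\p a -> p { _address = a })

city :: Lens Address String
city = lens _city (\a c -> a { _city = c })

-- Lenses compose with ordinary function composition, so a path into
-- a nested structure is built from smaller, reusable pieces.
personCity :: Lens Person String
personCity = address . city

main :: IO ()
main = do
  let alice = Person "Alice" (Address "Berkeley" "94704")
  putStrLn (view personCity alice)             -- read the nested city
  print (set personCity "Oakland" alice)       -- replace the nested city
  print (over personCity (map toUpper) alice)  -- transform the nested city
```

The point of the sketch is that `personCity` is built by composing two smaller lenses, and the same composed value serves for reading, replacing, and transforming the nested field.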

Beyond the lens library, Edward maintains a significant chunk of all libraries around the Haskell core libraries, covering everything from automatic differentiation (used heavily in deep learning, computer vision, and financial risk) to category theory (biased heavily towards organizing software) to graphics, SAT bindings, RCU schemes, tools for writing compilers, and more.

Initial support for Edward joining MIRI came in the form of funding from long-time MIRI donor Jaan Tallinn. Increased donor enthusiasm has put MIRI in a great position to take on more engineers in general, and to consider highly competitive salaries for top-of-their-field engineers like Edward who are interested in working with us.

At MIRI, Edward is splitting his time between helping us grow our research team and diving in on a line of research he’s been independently developing in the background for some time: building a new language and infrastructure to make it easier for people to write highly complex computer programs with known desirable properties. While we are big fans of his work, Edward’s research is independent of the directions we described in our 2018 Update, and we don’t consider it part of our core research focus.

We’re hugely excited to have Edward at MIRI. We expect to learn and gain a lot from our interactions, and we also hope that having Edward on the team will let him and other MIRI staff steal each other’s best problem-solving heuristics and converge on research directions over time.

As described in our recent update, our new lines of research are heavy on the mix of theoretical rigor and hands-on engineering that Edward and the functional programming community are well-known for:

In common between all our new approaches is a focus on using high-level theoretical abstractions to enable coherent reasoning about the systems we build. A concrete implication of this is that we write lots of our code in Haskell, and are often thinking about our code through the lens of type theory.

MIRI’s nonprofit mission is to ensure that smarter-than-human AI systems, once developed, have a positive impact on the world. And we want to actually succeed in that goal, not just go through the motions of working on the problem.

Our current model of the challenges involved says that the central sticking point for future engineers will likely be that the building blocks of AI just aren’t sufficiently transparent. We think that someone, somewhere, needs to develop some new foundations and deep theory/insights, above and beyond what’s likely to arise from refining or scaling up currently standard techniques.

We think that the skillset of functional programmers tends to be particularly well-suited to this kind of work, and we believe that our new research areas can absorb a large number of programmers and computer scientists. So we want this hiring announcement to double as a hiring pitch: consider joining our research effort!

To learn more about what it’s like to work at MIRI and what kinds of candidates we’re looking for, see our last big post, or shoot MIRI researcher Buck Shlegeris an email.

November 2018 Newsletter


MIRI’s 2018 Fundraiser


Update January 2019: MIRI’s 2018 fundraiser is now concluded.

345 donors contributed

MIRI is a math/CS research nonprofit with a mission of maximizing the potential humanitarian benefit of smarter-than-human artificial intelligence. You can learn more about the kind of work we do in “Ensuring Smarter-Than-Human Intelligence Has A Positive Outcome” and “Embedded Agency.”

Our funding targets this year are based on a goal of raising enough in 2018 to match our “business-as-usual” budget next year. We view “make enough each year to pay for the next year” as a good heuristic for MIRI, given that we’re a quickly growing nonprofit with a healthy level of reserves and a budget dominated by researcher salaries.

Read more »

2018 Update: Our New Research Directions

MIRI Strategy, News

For many years, MIRI’s goal has been to resolve enough fundamental confusions around alignment and intelligence to enable humanity to think clearly about technical AI safety risks, and to do this before this technology advances to the point of potential catastrophe. This goal has always seemed to us to be difficult, but possible.[1]

Last year, we said that we were beginning a new research program aimed at this goal.[2] Here, we’re going to provide background on how we’re thinking about this new set of research directions, lay out some of the thinking behind our recent decision to do less default sharing of our research, and make the case for interested software engineers to join our team and help push our understanding forward.

Read more »

  1. This post is an amalgam put together by a variety of MIRI staff. The byline saying “Nate” means that I (Nate) endorse the post, and that many of the concepts and themes come in large part from me, and I wrote a decent number of the words. However, I did not write all of the words, and the concepts and themes were built in collaboration with a bunch of other MIRI staff. (This is roughly what bylines have meant on the MIRI blog for a while now, and it’s worth noting explicitly.)
  2. See our 2017 strategic update and fundraiser posts for more details.



This is the conclusion of the Embedded Agency series. Previous posts:

  • Embedded Agents
  • Decision Theory
  • Embedded World-Models
  • Robust Delegation
  • Subsystem Alignment

A final word on curiosity, and intellectual puzzles:

I described an embedded agent, Emmy, and said that I don’t understand how she evaluates her options, models the world, models herself, or decomposes and solves problems.

In the past, when researchers have talked about motivations for working on problems like these, they’ve generally focused on the motivation from AI risk. AI researchers want to build machines that can solve problems in the general-purpose fashion of a human, and dualism is not a realistic framework for thinking about such systems. In particular, it’s an approximation that’s especially prone to breaking down as AI systems get smarter. When people figure out how to build general AI systems, we want those researchers to be in a better position to understand their systems, analyze their internal properties, and be confident in their future behavior.

This is the motivation for most researchers today who are working on things like updateless decision theory and subsystem alignment. We care about basic conceptual puzzles which we think we need to figure out in order to achieve confidence in future AI systems, and not have to rely quite so much on brute-force search or trial and error.

But the arguments for why we may or may not need particular conceptual insights in AI are pretty long. I haven’t tried to wade into the details of that debate here. Instead, I’ve been discussing a particular set of research directions as an intellectual puzzle, and not as an instrumental strategy.

One downside of discussing these problems as instrumental strategies is that it can lead to some misunderstandings about why we think this kind of work is so important. With the “instrumental strategies” lens, it’s tempting to draw a direct line from a given research problem to a given safety concern. But it’s not that I’m imagining real-world embedded systems being “too Bayesian” and this somehow causing problems, if we don’t figure out what’s wrong with current models of rational agency. It’s certainly not that I’m imagining future AI systems being written in second-order logic! In most cases, I’m not trying at all to draw direct lines between research problems and specific AI failure modes.

What I’m instead thinking about is this: We sure do seem to be working with the wrong basic concepts today when we try to think about what agency is, as seen by the fact that these concepts don’t transfer well to the more realistic embedded framework.

If AI developers in the future are still working with these confused and incomplete basic concepts as they try to actually build powerful real-world optimizers, that seems like a bad position to be in. And it seems like the research community is unlikely to figure most of this out by default in the course of just trying to develop more capable systems. Evolution certainly figured out how to build human brains without “understanding” any of this, via brute-force search.

Embedded agency is my way of trying to point at what I think is a very important and central place where I feel confused, and where I think future researchers risk running into confusions too.

There’s also a lot of excellent AI alignment research that’s being done with an eye toward more direct applications; but I think of that safety research as having a different type signature than the puzzles I’ve talked about here.

Intellectual curiosity isn’t the ultimate reason we privilege these research directions. But there are some practical advantages to orienting toward research questions from a place of curiosity at times, as opposed to only applying the “practical impact” lens to how we think about the world.

When we apply the curiosity lens to the world, we orient toward the sources of confusion preventing us from seeing clearly; the blank spots in our map, the flaws in our lens. It encourages re-checking assumptions and attending to blind spots, which is helpful as a psychological counterpoint to our “instrumental strategy” lens—the latter being more vulnerable to the urge to lean on whatever shaky premises we have on hand so we can get to more solidity and closure in our early thinking.

Embedded agency is an organizing theme behind most, if not all, of our big curiosities. It seems like a central mystery underlying many concrete difficulties.