When Will AI Be Created?


Strong AI appears to be the topic of the week. Kevin Drum at Mother Jones thinks AIs will be as smart as humans by 2040. Karl Smith at Forbes and “M.S.” at The Economist seem to roughly agree with Drum’s timeline. Moshe Vardi, editor-in-chief of the world’s most-read computer science magazine, predicts that “by 2045 machines will be able to do if not any work that humans can do, then a very significant fraction of the work that humans can do.”

But predicting AI is more difficult than many people think.

To explore these difficulties, let’s begin with a 2009 bloggingheads.tv conversation between MIRI researcher Eliezer Yudkowsky and MIT computer scientist Scott Aaronson, author of the excellent Quantum Computing Since Democritus. Early in that dialogue, Yudkowsky asked:

It seems pretty obvious to me that at some point in [one to ten decades] we will build an AI smart enough to improve itself, and [it will] “foom” upward in intelligence, and by the time it exhausts available avenues for improvement it will be a “superintelligence” [relative] to us. Do you feel this is obvious?

Aaronson replied:

The idea that we could build computers that are smarter than us… and that those computers could build still smarter computers… until we reach the physical limits of what kind of intelligence is possible… that we could build things that are to us as we are to ants — all of this is compatible with the laws of physics… and I can’t find a reason of principle that it couldn’t eventually come to pass…

The main thing we disagree about is the time scale… a few thousand years [before AI] seems more reasonable to me.

Those two estimates — several decades vs. “a few thousand years” — have wildly different policy implications.

If there’s a good chance that AI will replace humans at the steering wheel of history in the next several decades, then we’d better put our gloves on and get to work making sure that this event has a positive rather than negative impact. But if we can be pretty confident that AI is thousands of years away, then we needn’t worry about AI for now, and we should focus on other global priorities. Thus it appears that “When will AI be created?” is a question with high value of information for our species.

Let’s take a moment to review the forecasting work that has been done, and see what conclusions we might draw about when AI will likely be created.

The challenge of forecasting AI

Expert elicitation

Maybe we can just ask the experts? Astronomers are quite good at predicting eclipses, even decades or centuries in advance. Technological development tends to be messier than astronomy, but perhaps experts can still give us a range of years during which we can expect AI to be built? This method is called expert elicitation.

Several people have surveyed experts working in AI or computer science about their AI timelines. Unfortunately, most of these surveys suffer from rather strong sampling bias, and thus aren’t very helpful for our purposes.1

Should we expect experts to be good at predicting AI, anyway? As Armstrong & Sotala (2012) point out, decades of research on expert performance2 suggest that predicting the first creation of AI is precisely the kind of task on which we should expect experts to show poor performance — e.g. because feedback is unavailable and the input stimuli are dynamic rather than static. Muehlhauser & Salamon (2013) add, “If you have a gut feeling about when AI will be created, it is probably wrong.”

That said, the experts surveyed in Michie (1973) — a more representative sample than in other surveys3 — did pretty well. When asked to estimate the timeline for “a computing system exhibiting intelligence at adult human level,” the most common response was “more than 50 years.” Assuming (as most do) that AI will not arrive by 2023, these experts will have been correct.

Unfortunately, “more than 50 years” is a broad time frame that includes both “several decades from now” and “thousands of years from now.” So we don’t yet have any evidence that a representative survey of experts can predict AI within a few decades, and we have general reasons to suspect experts may not be capable of doing this kind of forecasting very well — although various aids (e.g. computational models; see below) may help them to improve their performance.

How else might we forecast when AI will be created?

Trend extrapolation

Many have tried to forecast the first creation of AI by extrapolating various trends. Like Kevin Drum, Vinge (1993) based his own AI predictions on hardware trends (e.g. Moore’s law). But in a 2003 reprint of his article, Vinge noted the insufficiency of this reasoning: even if we acquire hardware sufficient for AI, we may not have the software problem solved.4 As Robin Hanson reminds us, “AI takes software, not just hardware.”

Perhaps instead we could extrapolate trends in software progress?5 Some people estimate the time until AI by asking what proportion of human abilities today’s software can match, and how quickly machines are “catching up.”6 Unfortunately, it’s not clear how to divide up the space of “human abilities,” nor how much each ability matters. Moreover, software progress seems to come in fits and starts.7 With the possible exception of computer chess progress, I know of no trend in software progress as strong and many-decades-long as Moore’s law has been in computing hardware.
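To see what hardware-style trend extrapolation amounts to, here is a minimal sketch with made-up numbers (none of the data points below are real benchmark figures): fit a log-linear model to a Moore’s-law-like metric, then project it forward. As noted above, such a projection says nothing about when the software problem will be solved.

```python
# A minimal sketch of naive trend extrapolation. The data points are
# invented for illustration and are not real benchmark figures.
import numpy as np

years = np.array([1990, 1995, 2000, 2005, 2010])
perf = np.array([1.0e2, 5.7e2, 3.2e3, 1.8e4, 1.0e5])  # hypothetical "computations per dollar"

# Fit log10(perf) = a * year + b, i.e. exponential growth in linear time.
a, b = np.polyfit(years, np.log10(perf), deg=1)

def extrapolate(year):
    """Project the fitted exponential trend to a future year."""
    return 10 ** (a * year + b)

print(f"Implied doubling time: {np.log10(2) / a:.2f} years")
print(f"Projected level in 2030: {extrapolate(2030):.2e}")
```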

On the other hand, Tetlock (2005) found that, at least in his large longitudinal database of pundits’ political forecasts, simple trend extrapolation was hard to beat. And consider an example from the field of AI: when David Levy asked participants at the 1989 World Computer Chess Championship when a chess program would defeat the human world champion, their estimates tended to be inaccurately pessimistic,8 even though computer chess had by then shown two decades of regular and predictable progress. Those who predicted the event by naive trend extrapolation (e.g. Kurzweil 1990) got almost precisely the correct answer (1997).

Hence, it may be worth searching for a measure for which (a) progress is predictable enough to extrapolate, and for which (b) a given level of performance on that measure robustly implies the arrival of Strong AI. But to my knowledge, this has not yet been done, and it’s not clear that trend extrapolation can tell us much about AI timelines until such an argument is made, and made well.

Disruptions

Worse, several events could significantly accelerate or decelerate our progress toward AI, and we don’t know whether these events will occur, or in what order. For example:

  • An end to Moore’s Law. The “serial speed” version of Moore’s law broke down in 2004, forcing a flight to parallel processors that has created significant new difficulties for software developers (Fuller & Millett 2011). The most economically relevant formulation of Moore’s law, computations per dollar, has been maintained thus far,9 but it remains unclear whether this will continue much longer (Mack 2011; Esmaeilzadeh et al. 2012).
  • Depletion of low-hanging fruit. Progress is a function not only of effort but also of the difficulty of the problem. Some fields show a pattern of increasing difficulty with each successive discovery (Arbesman 2011). AI may prove to be a field in which new progress requires far more effort than earlier progress. That is clearly the case already for many parts of AI, for example natural language processing (Davis 2012).
  • Societal collapse. Political, economic, technological, or natural disasters could lead to a societal collapse during which progress toward AI would essentially halt (Posner 2004; Bostrom and Ćirković 2008).
  • Disinclination. Chalmers (2010) and Hutter (2012a) argue that the most likely “speed bump” in our progress toward AI will be disinclination. As AI technologies become more powerful, humans may question whether it is wise to create machines more powerful than themselves.
  • A breakthrough in cognitive neuroscience. It is difficult, with today’s tools, to infer the cognitive algorithms behind human intelligence (Trappenberg 2009). However, new tools and methods might enable cognitive neuroscientists to decode how the human brain implements its own intelligence, which could allow AI scientists to replicate that approach in silicon.
  • Human enhancement. Human enhancement technologies could make scientists more effective via cognitive-enhancement drugs (Bostrom and Sandberg 2009), brain-computer interfaces (Groß 2009), and genetic selection or engineering for cognitive enhancement.10
  • Quantum computing. Quantum computing has overcome some early obstacles (Rieffel and Polak 2011), but it remains difficult to predict whether it will contribute significantly to the development of machine intelligence. Progress in quantum computing depends on particularly unpredictable breakthroughs. Furthermore, it seems likely that even if built, a quantum computer would provide dramatic speedups only for specific applications (e.g. searching unsorted databases).
  • A tipping point in development incentives. The launch of Sputnik in 1957 demonstrated the possibility of space flight to the public. The event triggered a space race between the United States and the Soviet Union, and led to long-term funding for space projects from both governments. If there is a “Sputnik moment” for AI that makes it clear to the public and to governments that smarter-than-human AI is inevitable, a powerful AI race could ensue, especially since the winner of an AI race could gain extraordinary economic, technological, and geopolitical advantages.11

Great uncertainty

Given these considerations, I think the most appropriate stance on the question “When will AI be created?” is something like this:

We can’t be confident AI will come in the next 30 years, and we can’t be confident it’ll take more than 100 years, and anyone who is confident of either claim is pretending to know too much.

How confident is “confident”? Let’s say 70%. That is, I think it is unreasonable to be 70% confident that AI is fewer than 30 years away, and I also think it’s unreasonable to be 70% confident that AI is more than 100 years away.

This statement admits my inability to predict AI, but it also constrains my probability distribution over “years of AI creation” quite a lot.
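To make these constraints concrete, here is a minimal sketch (my own illustration, not part of the original analysis) that checks an arbitrarily chosen candidate distribution over “years until AI” against both bounds. The log-normal shape and its parameters are assumptions picked purely for illustration.

```python
# Check a hypothetical distribution over "years until AI" against the
# post's two constraints: P(<30 years) < 0.7 and P(>100 years) < 0.7.
from scipy.stats import lognorm

dist = lognorm(s=1.0, scale=60)  # illustrative choice: median of 60 years

p_within_30 = dist.cdf(30)        # probability AI arrives within 30 years
p_beyond_100 = 1 - dist.cdf(100)  # probability AI takes more than 100 years

assert p_within_30 < 0.7 and p_beyond_100 < 0.7  # both constraints satisfied
print(f"P(<30y) = {p_within_30:.2f}, P(>100y) = {p_beyond_100:.2f}")
```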

I think the considerations above justify these constraints on my probability distribution, but I haven’t spelled out my reasoning in detail. Doing so would require a more thorough analysis than I can present here. But I hope I have at least summarized the basic considerations on this topic, and that those whose probability distributions differ from mine can now build on the work here when trying to justify them.

How to reduce our ignorance

But let us not be satisfied with declaring our ignorance. Admitting our ignorance is an important step, but it is only the first step. Our next step should be to reduce our ignorance if we can, especially for high-value questions that have large strategic implications concerning the fate of our entire species.

How can we improve our long-term forecasting performance? Horowitz & Tetlock (2012), based on their own empirical research and prediction training, offer some advice on the subject:12

  • Explicit quantification: “The best way to become a better-calibrated appraiser of long-term futures is to get in the habit of making quantitative probability estimates that can be objectively scored for accuracy over long stretches of time. Explicit quantification enables explicit accuracy feedback, which enables learning.”
  • Signposting the future: Thinking through specific scenarios can be useful if those scenarios “come with clear diagnostic signposts that policymakers can use to gauge whether they are moving toward or away from one scenario or another… Falsifiable hypotheses bring high-flying scenario abstractions back to Earth.”13
  • Leveraging aggregation: “The average forecast is often more accurate than the vast majority of the individual forecasts that went into computing the average. [Forecasters] should also get into the habit that some of the better forecasters in the IARPA forecasting tournament have already gotten into: comparing their predictions to group averages, weighted-averaging algorithms, prediction markets, and financial markets.” See Ungar et al. (2012) for some of the aggregation methods leveraged in the ACE tournament. (A minimal sketch of scoring and aggregating forecasts follows this list.)
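The following sketch (my own illustration, not from Horowitz & Tetlock) makes the first and third recommendations concrete: it scores made-up probability forecasts with the Brier score and compares each forecaster to the unweighted group average.

```python
# Brier-score individual probability forecasts for a binary event and
# compare them to the unweighted group average. All numbers are invented.
import numpy as np

def brier(p, outcome):
    """Brier score for a binary event: lower is better."""
    return (p - outcome) ** 2

forecasts = np.array([0.5, 0.9, 0.6, 0.4, 0.7])  # five forecasters' probabilities
outcome = 1.0                                    # the event occurred

individual_scores = brier(forecasts, outcome)
aggregate_score = brier(forecasts.mean(), outcome)

# By Jensen's inequality, the average forecast's score is never worse than
# the average of the individual scores; over many questions it also tends
# to beat most of the individual forecasters.
beaten = np.mean(individual_scores > aggregate_score)
print(f"Aggregate Brier score: {aggregate_score:.3f}")
print(f"Fraction of individuals beaten by the average: {beaten:.0%}")
```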

Many forecasting experts add that when making highly uncertain predictions, it often helps to decompose the phenomena into many parts and make predictions about each of the parts.14 As Raiffa (1968) succinctly put it, our strategy should be to “decompose a complex problem into simpler problems, get one’s thinking straight [on] these simpler problems, paste these analyses together with a logical glue, and come out with a program for action for the complex problem” (p. 271). MIRI’s The Uncertain Future is a simple toy model of this kind, but more sophisticated computational models — like those used successfully in climate change modeling (Allen et al. 2013) — could be produced, and integrated with other prediction techniques.
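As a toy illustration of decomposition (my own sketch, in the spirit of — but not taken from — The Uncertain Future), one might give each sub-question its own uncertain estimate and then combine them by Monte Carlo sampling. Every distribution and parameter below is an assumption chosen purely for illustration.

```python
# Toy decomposition: split "years until AI" into hypothetical sub-questions,
# model each with its own distribution, and combine via Monte Carlo.
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Sub-question 1: years until sufficient hardware exists (illustrative guess).
hardware = rng.lognormal(mean=np.log(15), sigma=0.5, size=n)
# Sub-question 2: years until the software problem is solved (wider uncertainty).
software = rng.lognormal(mean=np.log(40), sigma=1.0, size=n)
# Sub-question 3: extra delay from a disruption, occurring in 20% of runs.
disruption = rng.exponential(scale=10, size=n) * (rng.random(n) < 0.2)

# AI requires both hardware and software, plus any disruption delay.
years_until_ai = np.maximum(hardware, software) + disruption

for q in (0.1, 0.5, 0.9):
    print(f"{q:.0%} quantile: {np.quantile(years_until_ai, q):.0f} years")
```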

We should expect AI forecasting to be difficult, but we need not remain as ignorant about AI timelines as we are today.

Acknowledgements

My thanks to Carl Shulman, Ernest Davis, Louie Helm, Scott Aaronson, and Jonah Sinick for their helpful feedback on this post.


  1. First, Sandberg & Bostrom (2011) gathered the AI timeline predictions of 35 participants at a 2011 academic conference on human-level machine intelligence. Participants were asked by what year they thought there is a 10%, 50%, and 90% chance that AI will have been built, assuming that “no global catastrophe halts progress.” Five of the 35 respondents expressed varying degrees of confidence that human-level AI would never be achieved. The median figures, calculated from the views of the other 30 respondents, were: 2028 for “10% chance,” 2050 for “50% chance,” and 2150 for “90% chance.” Second, Baum et al. (2011) surveyed 21 participants at a 2009 academic conference on machine intelligence, and found estimates similar to those in Sandberg & Bostrom (2011). Third, Kruel (2012) has, as of May 7th, 2013, interviewed 34 people via email about AI timelines and risks, 33 of whom could be considered “experts” of one kind or another in AI or computer science (Richard Carrier is a historian). Of those 33 experts, 19 provided full, quantitative answers to Kruel’s question about AI timelines: “Assuming beneficial political and economic development and that no global catastrophe halts progress, by what year would you assign a 10%/50%/90% chance of the development of artificial intelligence that is roughly as good as humans (or better, perhaps unevenly) at science, mathematics, engineering and programming?” For those 19 experts, the median estimates for 10%, 50%, and 90% were 2025, 2035, and 2070, respectively (spreadsheet here). Fourth, Bainbridge (2005), surveying participants of 3 conferences on “Nano-Bio-Info-Cogno” technological convergence, found a median estimate of 2085 for when “the computing power and scientific knowledge will exist to build machines that are functionally equivalent to the human brain.” However, the participants in these four surveys were disproportionately HLAI enthusiasts, and this introduces a significant sampling bias. The database of AI forecasts discussed in Armstrong & Sotala (2012) probably suffers from a similar problem: individuals who believe AI is imminent rather than distant are more likely to make public predictions about it.
  2. Shanteau (1992); Kahneman and Klein (2009).
  3. Another survey was taken at the AI@50 conference in 2006. When participants were asked when computers would be able to simulate every aspect of human intelligence, 41% said “more than 50 years” and another 41% said “never.” Unfortunately, many of the survey participants were not AI experts but university students attending the conference. Moreover, the wording of the question may have introduced a bias. The “never” answer may often have been given because some participants took “every aspect of human intelligence” to include consciousness, and many people have philosophical objections to the idea that machines could be conscious. Had they instead been asked “When will AIs replace humans in nearly all jobs?”, I suspect far fewer would have answered “never.” As for myself, I do not accept any in-principle objections to the possibility of AI. For the most common objections, see Chalmers (1996), ch. 9, and Chalmers (2012).
  4. Though, Muehlhauser & Salamon (2013) point out that “Hardware extrapolation may be a more useful method in a context where the intelligence software is already written: whole brain emulation [WBE]. Because WBE seems to rely mostly on scaling up existing technologies like microscopy and large-scale cortical simulation, WBE may be largely an ‘engineering’ problem, and thus the time of its arrival may be more predictable than is the case for other kinds of AI.” However, it is especially difficult to forecast WBE while we do not even have a proof of concept via a simple organism like C. elegans (David Dalrymple is working on this). Moreover, much progress in neuroscience will be required (Sandberg & Bostrom 2011), and such progress may be less predictable than hardware extrapolation.
  5. I’m not sure what a general measure of software progress would look like, though we can certainly identify local examples of software progress. For example, Holdren et al. (2010) note: “in many areas, performance gains due to improvements in algorithms have vastly exceeded even the dramatic performance gains due to increased processor speed… [a model] solved using linear programming would have taken 82 years to solve in 1988, using the computers and the linear programming algorithms of the day. Fifteen years later — in 2003 — this same model could be solved in roughly 1 minute, an improvement by a factor of roughly 43 million. Of this, a factor of roughly 1,000 was due to increased processor speed, whereas a factor of roughly 43,000 was due to improvements in algorithms! Grötschel also cites algorithmic improvements in mixed integer programming between 1991 and 2008.” Muehlhauser & Salamon (2013) give another example: “For example, IBM’s Deep Blue played chess at the level of world champion Garry Kasparov in 1997 using about 1.5 trillion instructions per second (TIPS), but a program called Deep Junior did it in 2003 using only 0.015 TIPS. Thus, the computational efficiency of the chess algorithms increased by a factor of 100 in only six years (Richards and Shaw 2004).” A third example is Setty et al. (2012), which improved the efficiency of probabilistically checkable proofs by 20 orders of magnitude with a single breakthrough. On the other hand, it is easy to find examples of very slow progress, too (again, Davis 2012).
  6. For example, see Good (1970).
  7. As I wrote earlier: “Increasing computational power is fairly predictable, but for AI you probably need fundamental mathematical insights, and it’s damn hard to predict those. In 1900, David Hilbert posed 23 unsolved problems in mathematics. Imagine trying to predict when those would be solved.” Some of these problems were solved quickly, some of them required several decades to solve, and many of them remain unsolved. Even the order in which Hilbert’s problems would be solved was hard to predict. According to Erdős & Graham (1980), p. 7, “Hilbert lectured in the early 1920’s on problems in mathematics and said something like this: probably all of us will see the proof of the Riemann hypothesis, some of us… will see the proof of Fermat’s last theorem, but none of us will see the proof that √2^√2 is transcendental.” In fact, these results came in the reverse order: the last was proved by Kusmin a few years later, Fermat’s last theorem was proved by Wiles in 1994, and the Riemann hypothesis still has not been proved or disproved.
  8. According to Levy & Newborn (1991), one participant guessed the correct year (1997), thirteen participants guessed years from 1992–1995, twenty-eight participants guessed years from 1998–2056, and one participant guessed “never.” Of the twenty-eight who guessed in the 1998–2056 range, eleven guessed years toward the later end of that range.
  9. As Fuller & Millett (2011, p. 81) note, “When we talk about scaling computing performance, we implicitly mean increasing the computing performance we can buy for each dollar we spend.” Most of us don’t really care whether our new computers have more transistors or other structures. We just want them to do more stuff, more cheaply. Kurzweil (2012), ch. 10, footnote 10 shows “calculations per second per $1,000” growing exponentially from 1900 through 2010, including several data points after the serial speed version of Moore’s law broke down in 2004. The continuation of this trend is confirmed by “instructions per second per dollar” data for 2006–2011, gathered from Intel and other sources by Chris Hallquist (spreadsheet here). Thus it appears the computations per dollar form of Moore’s Law has continued unabated, at least for now.
  10. One possible breakthrough here may be iterated embryo selection. See Miller (2012), ch. 9, for more details.
  11. It is interesting, however, that the United States did not pursue extraordinary economic, technological, and geopolitical advantage in the period during which it was the sole possessor of nuclear weapons. Also, it is worth noting that violence and aggression have steadily declined throughout human history (Pinker 2012).
  12. Tetlock (2010) adds another recommendation: “adversarial collaboration” (Mellers et al. 2001). Tetlock explains: “The core idea is simple: rival epistemic and political camps would nominate experts to come together to reach agreements on how they disagree on North Korea or deficit reduction or global warming — and then would figure out how to resolve at least a subset of their factual disputes. The disputants would need to specify, ex ante, how much belief change each side would ‘owe’ the other if various agreed-upon empirical tests were to work out one way or the other. When adversarial collaboration works as intended, it shifts the epistemic incentives from favoring cognitive hubris (generating as many reasons why one’s own side is right and the other is wrong) and toward modesty (taking seriously the possibility that some of the other side’s objections might have some validity). This is so because there is nothing like the prospect of imminent falsification to motivate pundits to start scaling back their more grandiose generalizations: ‘I am not predicting that North Korea will become conciliatory in this time frame if we did x — I merely meant that they might become less confrontational in this wider time frame if we did x, y and z — and if there are no unexpected endogenous developments and no nasty exogenous shocks.'” Tetlock (2012), inspired by Gawande (2009), also tentatively recommends the use of checklists in forecasting: “The intelligence community has begun developing performance-appraisal checklists for analysts that nudge them in the direction of thinking more systematically about how they think. But it has yet — to our knowledge — taken the critical next step of checking the usefulness of the checklists against independent real-world performance criteria, such as the accuracy of current assessments and future projections. Our experience in the [ACE] IARPA forecasting tournament makes us cautiously optimistic that this next step is both feasible and desirable.”
  13. But, let us not fool ourselves concerning the difficulty of this task. Good (1976) asserted that human-level performance in computer chess was a good signpost for human-level AI, writing that “a computer program of Grandmaster strength would bring us within an ace of [machine ultra-intelligence].” But of course this was not so. We may chuckle at this prediction today, but how obviously wrong was Good’s prediction in 1976?
  14. E.g. Armstrong & Sotala (2012); MacGregor (2001); Lawrence et al. (2006).
