How Big is the Field of Artificial Intelligence? (initial findings)

Co-authored with Jonah Sinick.

How big is the field of AI, and how big was it in the past?

This question is relevant to several issues in AGI safety strategy. To name just two examples:

  • AI forecasting. Some people forecast AI progress by looking at how much has been accomplished for each calendar year of research. But as inputs to AI progress, (1) AI funding, (2) quality-adjusted researcher years (QARYs), and (3) computing power are more relevant than calendar years.1 To use these metrics to predict future AI progress, we need to know how many dollars and QARYs and computing cycles at various times in the past have been required to produce the observed progress in AI thus far.
  • Leverage points. If most AI research funding comes from relatively few funders, or if most research is produced by relatively few research groups, then these may represent high-value leverage points through which one could influence the field as a whole, e.g. to be more concerned with the long-term social consequences of AI.

For these reasons and more, MIRI recently investigated the current size and past growth of the AI field. This blog post summarizes our initial findings, which are meant to provide a “quick and dirty” launchpad for future, more thorough research into the topic.

To begin, we tried to quantify the size and past growth of the field using metrics such as

  • Number of researchers
  • Number of journals
  • Publication count
  • Number of conferences
  • Number of organizations
  • Famous prizes awarded for AI research
  • Amount of funding

It’s difficult to interpret these figures, and they may be significantly less informative than an object level study of the research would be, but the figures still have some relevance:

  • For the purpose of investigating growth, one can look at year-to-year percentage growth in the statistics, combining this with other measures of the amount of progress that has occurred in AI, in order to estimate the amount of AI research that will occur in the medium term future.
  • For the purpose of investigating the current size of the AI field, one can look at the quantitative metrics relative to the corresponding metrics for computer science (CS)2, and use these in conjunction with a holistic sense of the current size of the CS field to inform one’s holistic sense of the amount of progress that there’s been in AI.
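
One reason to work with AI-to-CS ratios rather than raw counts is the argument spelled out in note 2: distortions that a source applies to both fields partly cancel in the ratio. The following is a minimal numerical sketch of that argument. Every number in it is made up for illustration (none comes from MAS or any other source used here), and the split of each distortion factor into a shared part and a field-specific part is an assumption of the sketch.

```python
# Hypothetical illustration of the ratio argument in note 2.
# All numbers below are invented for the sketch; none come from MAS.

def relative_error(estimate: float, truth: float) -> float:
    """Absolute relative error of an estimate against the true value."""
    return abs(estimate / truth - 1.0)

# True quantities for AI and CS (unknown in practice).
q_ai, q_cs = 100.0, 1000.0

# Distortion factors: a bias shared by both fields times a field-specific part.
shared_bias = 1.5
f_ai = shared_bias * 1.10   # F1 in note 2
f_cs = shared_bias * 0.95   # F2 in note 2

e_ai = q_ai * f_ai          # E1: the source's estimate for AI
e_cs = q_cs * f_cs          # E2: the source's estimate for CS

err_single = relative_error(e_ai, q_ai)
err_ratio = relative_error(e_ai / e_cs, q_ai / q_cs)

print(f"relative error of E1 alone: {err_single:.1%}")
print(f"relative error of E1/E2:    {err_ratio:.1%}")
```

The shared bias drops out of E1/E2 entirely, so only the field-specific parts contribute to the error of the ratio. If the two fields were distorted in opposite directions instead, the ratio could be worse than either estimate alone.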

The data that we were able to collect provide a decent picture of the size of the AI field relative to the size of the CS field, but they are insufficient to support a robust conclusion, and more investigation is warranted. Unless otherwise specified, see the spreadsheet “Current size & past growth of AI field” for the raw data on which this blog post is based.

The size of the AI field

According to a variety of metrics, the amount of AI research being done appears to be about 10% of the amount of computer science (CS) research being done. The metrics used, however, mostly capture research quantity rather than research quality, and thus may be a weak proxy for measuring how many QARYs have been invested. That said, the fact that roughly 10% of CS research prizes are awarded for AI work may indicate that research quality is similar in CS and AI.

We obtained many of the relevant figures from Microsoft Academic Search (MAS). MAS allows one to search under the headings:

  • Computer science
  • Artificial intelligence
  • Natural language and speech
  • Machine learning and pattern recognition
  • Computer vision

One gets different figures depending on whether one counts the latter three subjects (hereafter referred to as “cognate disciplines”) as AI. Below, we give figures both for items that fall under the “artificial intelligence” heading alone, and for items that fall under the heading “artificial intelligence” or under the heading of one of the cognate disciplines.

Number of researchers

MAS gives number of authors in CS, AI, and the cognate disciplines of AI, but these figures don’t pick up on the amount of research done as well as publication count figures do.3

The IEEE Computational Intelligence Society has ~7,000 members and the IEEE Computer Society has ~85,000 members, so the membership of the first is 8% of the membership of the second.

Some other relevant figures (which don’t paint a cohesive picture):

  • According to the Bureau of Labor Statistics, there are 26,700 computer and information science researchers in the US.
  • ACM’s Special Interest Group on Artificial Intelligence (SIGAI) has “more than 1,000 members.”
  • The International Neural Network Society (INNS) has “more than 2,000 members.”

Number of journals

MAS lists 1360 CS journals, with 106 in AI, and 172 in either AI or one of AI’s cognate disciplines, so 8% and 13% respectively.4

Publication count

Between 2005 and 2010, of those publications listed under MAS’s “CS” heading, about 10% were listed under “AI” and about 20% were listed under “AI” or one of its cognate disciplines.5 If one looks at publications between 1990 and 1995, between 1995 and 2000, and between 2000 and 2005, one sees roughly the same percentages.6 Searching Google Scholar for “computer science” and “artificial intelligence,” one finds that the number of hits for the latter search is about 30% of the number of hits for the former,7 which could mean that the amount of AI research is significantly more than 10% of the amount of CS research, but some papers that contain the phrase “artificial intelligence” are not artificial intelligence research, and some computer science papers may not contain the phrase “computer science.”

Number of conferences

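MAS lists 3,519 “top conferences” in CS and 361 “top conferences” in AI, so the latter figure is about 10% of the former. There are 561 “top conferences” in AI or its cognate disciplines, so 16% of the number of CS conferences.8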

Number of organizations

Microsoft Academic Search lists 11,338 organizations for CS and 7,125 organizations for AI, so 63%. If one counts cognate disciplines as AI, the number of AI organizations is 21,802, so 192% that of CS organizations.9 Taken in isolation, this would suggest that the amount of AI research is much greater than 10% of the amount of CS research.

“Number of organizations” seems likely to be a weaker metric of amount of research than “number of publications,” etc., so this should be discounted. Nevertheless, the fact that the ratio of AI organizations to CS organizations is so much higher than the other ratios that we looked at is a puzzle. Perhaps the difference comes from the CS community and the AI community having different cultural norms. Or, perhaps MAS is less consistent about how it counts organizations than how it counts publications.

Famous prizes awarded for AI research vs. CS research

ACM Turing Award: Six out of 46 prizes were awarded for AI research, so 13% of the total.10

Nevanlinna Prize: One of the 8 prizes was awarded for AI work, so 12.5% of the total. However, the prize for AI work was awarded in 1986, which is a long time ago.11
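
The percentages quoted in the subsections above can be recomputed from the raw counts given in the text. The sketch below does only that. The dictionary layout and the helper name `percent_of` are choices made for this illustration, and the IEEE membership figures are the approximate ones quoted above.

```python
# Raw counts quoted in the text: (AI alone, AI plus cognate disciplines, CS).
# None means the text gives no figure for that cell.
counts = {
    "journals (MAS)": (106, 172, 1360),
    "conferences (MAS)": (361, 561, 3519),
    "organizations (MAS)": (7125, 21802, 11338),
    "Turing Awards": (6, None, 46),
    "Nevanlinna Prizes": (1, None, 8),
    # Approximate membership: IEEE CIS against IEEE Computer Society.
    "IEEE society members": (7000, None, 85000),
}

def percent_of(part, whole):
    """Return part as a percentage of whole, or None if part is missing."""
    return None if part is None else 100.0 * part / whole

ratios = {
    name: (percent_of(ai, cs), percent_of(ai_plus, cs))
    for name, (ai, ai_plus, cs) in counts.items()
}

for name, (ai_pct, plus_pct) in ratios.items():
    plus_text = "n/a" if plus_pct is None else f"{plus_pct:.1f}%"
    print(f"{name:22s} AI: {ai_pct:5.1f}%   AI + cognates: {plus_text}")
```

Laid out this way, the organization counts stand out as the one metric far from the roughly 10% figure, which is the puzzle discussed above.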

Amount of funding

In 2011, the National Science Foundation (NSF) received $636 million for funding CS research (through CISE). Of this, $169 million went to Information and Intelligent Systems (IIS). IIS has three programs: Cyber-Human Systems (CHS), Information Integration and Informatics (III) and Robust Intelligence (RI). If roughly 1/3 of the funding went to each of these, then $56 million went to Robust Intelligence, so 9% of the total CS funding. (Some CISE funding may have gone to AI work outside of IIS, that is, via ACI, CCF, or CNS, but at a glance, non-IIS AI funding through CISE looks negligible.)
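
The Robust Intelligence estimate rests entirely on the guess that IIS funding is split evenly among its three programs. The short sketch below reproduces the arithmetic and shows how the result moves if that share is different. The function name and the alternative shares (1/4 and 1/2) are hypothetical, chosen only to show the sensitivity; the actual split is not given in our data.

```python
# Figures from the text, in millions of US dollars (2011).
CISE_TOTAL = 636.0   # NSF funding for CS research
IIS_TOTAL = 169.0    # portion of that which went to IIS

def robust_intelligence_estimate(ri_share_of_iis: float = 1 / 3):
    """Estimate RI funding and its percentage of total CS funding.

    ri_share_of_iis is an assumption: the text guesses an even three-way
    split among the IIS programs, but the true share is not known here.
    """
    ri_funding = IIS_TOTAL * ri_share_of_iis
    return ri_funding, 100.0 * ri_funding / CISE_TOTAL

ri_musd, ri_pct = robust_intelligence_estimate()
print(f"even split: ~${ri_musd:.0f} million, {ri_pct:.0f}% of CS funding")

# Sensitivity to the even-split assumption (these shares are hypothetical).
for share in (0.25, 0.5):
    musd, pct = robust_intelligence_estimate(share)
    print(f"share {share:.2f}: ~${musd:.0f} million, {pct:.0f}% of CS funding")
```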

Other major U.S. funding sources for CS research include ONR, DARPA, and several companies (Microsoft, Google, IBM, etc.) but we have not investigated these funding sources yet. We also did not investigate non-U.S. funding sources.

The growth of the AI field

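We did not investigate the growth rate of the number of AI researchers in enough depth to give a meaningful estimate. However, the growth rate of the number of scientists and engineers in all fields may serve as a very weak proxy measure for the growth rates of AI or CS.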

For example, the annual growth rate of science and engineering researchers in OECD countries, between 1995 and 2005, appears to be about 3.3%, corresponding to a doubling time of 23 years.12 This needs to be juxtaposed with indications that average researcher productivity (as measured by patents per researcher, time spent per researcher, number of coauthors per paper, and number of papers cited) has decreased.13 The NSF budget for Information and Intelligent Systems (IIS) has generally increased between 4% and 20% per year since 1996, with a one-time percentage boost of 60% in 2003, for a total increase of 530% over the 15 year period between 1996 and 2011.14 “Robust Intelligence” is one of three program areas covered by this budget. According to MAS, the number of publications in AI grew by 100+% every 5 years between 1965 and 1995, but between 1995 and 2010 it has been growing by about 50% every 5 years. One sees a similar trend in machine learning and pattern recognition.15
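
To compare growth figures stated over different periods, it can help to convert them to a common basis. The sketch below assumes constant growth with annual compounding, which is an assumption of this illustration and not necessarily the convention behind the quoted figures. Under it, 3.3% per year doubles in roughly 21 years, a little shorter than the 23 years quoted from the NSF source, which may rest on a different convention or on unrounded data.

```python
import math

def doubling_time(annual_rate: float) -> float:
    """Years needed to double at a constant annual growth rate."""
    return math.log(2) / math.log(1 + annual_rate)

def annualized(total_growth: float, years: float) -> float:
    """Constant annual rate equivalent to total_growth over the given years.

    total_growth is fractional: 1.0 means +100%, 0.5 means +50%.
    """
    return (1 + total_growth) ** (1 / years) - 1

print(f"3.3% per year doubles in {doubling_time(0.033):.1f} years")
print(f"+100% per 5 years is {annualized(1.0, 5):.1%} per year")
print(f"+50% per 5 years is {annualized(0.5, 5):.1%} per year")
```

On this basis, the slowdown in AI publication growth after 1995 corresponds to a drop from roughly 15% per year to roughly 8% per year, still well above the 3.3% proxy figure for researchers in general.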

Notes on further research

Future research on this topic could dig much deeper, and come to more robust conclusions. Our purpose here is to lay some groundwork for future research. With that in mind, here are some miscellaneous notes to future researchers investigating the current size and past growth of the AI field:

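  • If the papers being cited are more recent, this may indicate faster progress. On the other hand, it may also indicate faddishness, and one would need some way to distinguish between the two.
  • Some citation databases that may be useful for analyzing citation patterns are Scopus, Web of Science, Microsoft Academic Search, and the Science Citation Index (SSI).16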
  • Some sources of noise in citation counts are: (a) Journal editors asking authors of submitted papers to add citations to other papers in the same journal in order to boost the journal’s impact factor & (b) Authors citing their own papers excessively in order to increase their citation counts.17

Our thanks to Sebastian Nickel for data collection, and to Carl Shulman for feedback.


  1. Another important input metric is theoretical progress imported from other fields, e.g. methods from statistics.
  2. It’s also worth noting the following point. Suppose that a source S can be used to generate an estimate E1 for a quantity Q1 having to do with AI and an estimate E2 for a quantity Q2 having to do with CS. Then E1 and E2 may overstate or understate Q1 and Q2 (respectively). Let the factors by which E1 and E2 differ from Q1 and Q2 be F1 and F2, so that E1 = Q1*F1 and E2 = Q2*F2. We don’t have good estimates for F1 and F2, but if we compute the ratio (E1)/(E2) we get [(Q1)/(Q2)]*[(F1)/(F2)]. The quantity (F1)/(F2) will typically be closer to 1 than F1 is to 1, because some of the factors that lead E1 to deviate from Q1 to given degrees will also lead E2 to deviate from Q2 to similar degrees. So (E1)/(E2) is closer to (Q1)/(Q2) (in relative terms) than E1 is to Q1 (in relative terms).
  3. MAS shows 1.6 million authors in CS and 0.26 million authors in AI, so 16%. If one adds up the numbers of authors listed under AI and under the cognate disciplines, the figure rises to 39%. However, some authors publish in multiple disciplines (e.g. an author may publish in both artificial intelligence and machine learning).
  4. Cells B96 through B100 of the spreadsheet
  5. Some papers may be listed under multiple categories, making it unclear whether the 10% figure or the 20% figure is more representative.
  6. Table with upper left hand corner A2 in the spreadsheet
  7. Google Scholar results:

    Search term “computer science” (in quotes) yields 2,650,000 results
    “artificial intelligence” -> 1,710,000
    “machine intelligence” -> 655,000

    Since 2013:
    Search term “computer science” (in quotes) yields 99,600 results
    “artificial intelligence” -> 32,300
    “machine intelligence” -> 11,600

    2012:
    Search term “computer science” (in quotes) yields 163,000 results
    “artificial intelligence” -> 52,500
    “machine intelligence” -> 22,600

    2011:
    Search term “computer science” (in quotes) yields 247,000 results
    “artificial intelligence” -> 66,100
    “machine intelligence” -> 23,000

  8. Cell B27 through Cell B31 of the spreadsheet
  9. Cell B119 through Cell B123 of the spreadsheet
  10. The annual Turing Prize was first awarded in 1966 and most recently last year (2012), so 46 prizes so far. Of those, 6 were for achievements in AI related research, namely:
    • 1969 Marvin Minsky
    • 1971 John McCarthy
    • 1975 Newell & Simon
    • 1991 Robin Milner (machine assisted proof construction)
    • 1994 Edward Feigenbaum & Raj Reddy
    • 2010 Leslie Valiant (Probably Approximately Correct Learning)
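    • 2011 Judea Pearl
    Of the 46 prizes, 8 were awarded to 2 people and another 2 were awarded to 3 people, so the total number of recipients is 58, of whom 8 received the prize for AI-related achievements.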
  11. The Nevanlinna Prize has been awarded every 4 years since 1982; 8 times so far.
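  12. From U.S. National Science Foundation (NSF). Science and Engineering Indicators: 2010, Chapter 3. Science and Engineering Labor Force:

    In the early 1960s, a prominent historian of science, Derek J. de Solla Price, examined the growth of science and the number of scientists over very long periods in history and summarized his findings in a book entitled Science Since Babylon (1961). Using a number of empirical measures (most over at least 300 years), Price found that science, and the number of scientists, tended to double about every 15 years, with measures of higher quality science and scientists tending to grow slower (doubling every 20 years) and measures of lower quality science and scientists tending to grow faster (every 10 years). According to Price (1961), one implication of this long-term exponential growth is that “80 to 90% of all the scientists that ever lived are alive today.” This insight follows from the likelihood that most of the scientists from the past 45 years (a period of three doublings) would still be alive. Price was interested in many implications of these growth patterns, but in particular, he was interested in the idea that this growth could not continue indefinitely and the number of scientists would reach “saturation.” Price was concerned in 1961 that saturation had already begun.

    How different are the growth rates in the number of scientists and engineers in recent periods from what Price estimated for past centuries? Table 3-A shows growth rates for some measures of the S&E labor force in the United States and elsewhere in the world for periods of available data. Of these measures, the number of S&E doctorate holders in the U.S. labor force showed the lowest average annual growth of 2.4% (doubling in 31 years if this growth rate were to continue). The number of doctorate holders employed in S&E occupations in the United States showed a faster average annual growth of 3.8% (doubling in 20 years if continued). There are no global counts of individuals in S&E, but counts of “researchers” in member countries of the Organisation for Economic Co-operation and Development (OECD) grew at an average annual rate of 3.3% (doubling in 23 years if continued). Data on the population of scientists and engineers in most developing countries are very limited, but OECD data for researchers in China show an average annual growth rate of 10.8% (doubling in 8 years if continued). All of these numbers are consistent with a continuation of growth in the S&E labor force exceeding the rate of growth in the general labor force.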

  13. Here are some references on declining productivity per researcher. Our thanks to Gwern for compiling many of these in the article Scientific Stagnation:
    • Machlup, Fritz. The Production and Distribution of Knowledge in the United States, Princeton, NJ: Princeton University Press, 1962, 170-176
    • Segerstrom, Paul. Endogenous Growth Without Scale Effects, American Economic Review, December 1998, 88, 1290-1310
    • Terman, F.E. A Brief History of Electrical Engineering Education, Proceedings of the IEEE, August 1998, 86 (8), 1792-1800
    • Adams, James D., Black, Grant C., Clemmons, J.R., and Stephan, Paula E. Scientific Teams and Institutional Collaborations: Evidence from U.S. Universities, 1981-1999, NBER Working Paper #10640, July 2004
    • Jones (2006), Age and Great Invention
    • Jones, Benjamin F. The Burden of Knowledge and the Death of the Renaissance Man: Is Innovation Getting Harder? NBER Working Paper #11360, 2005
    • National Research Council, On Time to the Doctorate: A Study of the Lengthening Time to Completion for Doctorates in Science and Engineering, Washington, DC: National Academy Press, 1990
    • Tilghman, Shirley (chair) et al. Trends in the Early Careers of Life Scientists, Washington, DC: National Academy Press, 1998
    • Zuckerman, Harriet and Merton, Robert. Age, Aging, and Age Structure in Science, in Merton, Robert, The Sociology of Science, Chicago, IL: University of Chicago Press, 1973, 497-559
    • Cronin et al, 2004 Visible, Less Visible, and Invisible Work: Patterns of Collaboration in 20th Century Chemistry, Journal of the American Society for Information Science and Technology, 2004, 55(2), 160-168
    • Grossman, Jerry. The Evolution of the Mathematical Research Collaboration Graph, Congressus Numerantium, 2002, 158, 202-212
    • Cronin, Blaise, Shaw, Debora, and La Barre, Kathryn. A Cast of Thousands: Coauthorship and Subauthorship Collaboration in the 20th Century as Manifested in the Scholarly Journal Literature of Psychology and Philosophy, Journal of the American Society for Information Science and Technology, 2003, 54(9), 855-871
    • McDowell, John and Melvin, Michael. The Determinants of Co-Authorship: An Analysis of the Economics Literature, Review of Economics and Statistics, February 1983, 65, 155-160
    • Hudson, John. Trends in Multi-Authored Papers in Economics, Journal of Economic Perspectives, Summer 1996, 10, 153-158
    • Laband, David and Tollison, Robert. Intellectual Collaboration, Journal of Political Economy, June 2000, 108, 632-662
    • Jones 2010. As Science Evolves, How Can Science Policy?
    • The Collapse of the Soviet Union and the Productivity of American Mathematicians, by George J. Borjas and Kirk B. Doran, NBER Working Paper No. 17800, February 2012

  14. See the table with upper left hand corner A367 in the spreadsheet
  15. Table with upper left hand corner A2 in the spreadsheet
  16. A 2008 study compared PubMed, Scopus, Web of Science, and Google Scholar, and concluded: “PubMed and Google Scholar are accessed for free […] Scopus offers about 20% more coverage than Web of Science, whereas Google Scholar offers results of inconsistent accuracy. PubMed remains an optimal tool in biomedical electronic research. Scopus covers a wider journal range […] but it is currently limited to recent articles (published after 1995) compared with Web of Science. Google Scholar, as for the Web in general, can help in the retrieval of even the most obscure information but its use is marred by inadequate, less often updated, citation information.” Larsen & Von Ins (2010) claim that the coverage of SSI has been declining.
  17. Here are some caveats about citations as a measure of quality: Wilhite and Fong (2012): “…impact factors continue to be a primary means by which academics “quantify the quality of science”. One side effect of impact factors is the incentive they create for editors to coerce authors to add citations to their journal. Coercive self-citation does not refer to the normal citation directions, given during a peer-review process, meant to improve a paper. Coercive self-citation refers to requests that (i) give no indication that the manuscript was lacking in attribution; (ii) make no suggestion as to specific articles, authors, or a body of work requiring review; and (iii) only guide authors to add citations from the editor’s journal.” And Storbeck (2012): “The [extent] of manipulation is amazing. For example, according to figures published by the Managing Editor of the ‘Review of Finance’, the impact factor of the ‘Journal of Banking and Finance’ – the fourth worst offender according to the study by Wilhite and Fong – dwindles if self-citations are excluded. While the raw impact factor of the journal is 2.731, the one without self-citations is just 0.748.”