Roman Yampolskiy on AI Safety Engineering - Machine Intelligence Research Institute

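Roman V. Yampolskiy holds a PhD from the Department of Computer Science and Engineering at the University at Buffalo, where he was the recipient of a four-year NSF IGERT fellowship. Before beginning his doctoral studies, Dr. Yampolskiy received a BS/MS (High Honors) combined degree in Computer Science from Rochester Institute of Technology, NY, USA.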

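After completing his PhD, Dr. Yampolskiy held a position as an Affiliate Academic at the Center for Advanced Spatial Analysis, University of London, College of London. In 2008 Dr. Yampolskiy accepted an assistant professor position at the Speed School of Engineering, University of Louisville, KY. He had previously conducted research at the Laboratory for Applied Computing (currently known as the Center for Advancing the Study of Infrastructure) at the Rochester Institute of Technology and at the Center for Unified Biometrics and Sensors at the University at Buffalo. Dr. Yampolskiy is also an alumnus of Singularity University (GSP2012) and a past visiting fellow of MIRI.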

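Dr. Yampolskiy’s main areas of interest are behavioral biometrics, digital forensics, pattern recognition, genetic algorithms, neural networks, artificial intelligence, and games. Dr. Yampolskiy is an author of over 100 publications, including multiple journal articles and books. His research has been cited by numerous scientists and profiled in popular magazines both American and foreign (New Scientist, Poker Magazine, Science World Magazine), on dozens of websites (BBC, MSNBC, Yahoo! News), and on radio (German National Radio, Alex Jones Show). Reports about his work have attracted international attention and have been translated into many languages, including Czech, Danish, Dutch, French, German, Hungarian, Italian, Polish, Romanian, and Spanish.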

Luke Muehlhauser: In Yampolskiy (2013) you argue that machine ethics is the wrong approach for AI safety, and we should use an “AI safety engineering” approach instead. Specifically, you write:

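We don’t need machines which are Full Ethical Agents debating about what is right and wrong, we need our machines to be inherently safe and law abiding.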

As you see it, what is the difference between “machine ethics” and “AI safety engineering,” and why is the latter a superior approach?

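Roman Yampolskiy: The main difference between the two approaches is in how the AI system is designed. In the case of machine ethics, the goal is to construct an artificial ethicist capable of making ethical and moral judgments about humanity. I am particularly concerned if such decisions include “live or die” decisions, but that is a natural domain of Full Ethical Agents, and so many have stated that machines should be given such decision power. In fact, some have argued that machines will be superior to humans in that domain, just as they are (or will be) in most other domains.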

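I think it is a serious mistake to give machines such power. First, once we relinquish moral oversight we will not be able to undo that decision and get the power back. Second, we have no way to reward or punish machines for their incorrect decisions. Essentially, we will end up with an immortal dictator with perfect immunity against any prosecution. That sounds like a very dangerous scenario to me.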

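On the other hand, AI safety engineering treats AI system design like product design, where your only concern is product liability. Does the system strictly follow its formal specifications? The important thing to emphasize is that the product is not a full moral agent by design, and so never gets to pass moral judgment on its human owners.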

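A real-life example of this difference can be seen in military drones. A fully autonomous drone deciding whom to fire at has to make an ethical decision about which humans are enemies worthy of killing, while a drone with a man-in-the-loop design may autonomously locate a potential target but needs a human to make the decision to fire.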

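Obviously, the situation is not as clear-cut as my example tries to show, but it gives you an idea of what I have in mind. To summarize, the AI systems we design should remain tools, not equal or superior partners in “live or die” decision making.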

Luke: I tend to think of machine ethics and AI safety engineering as complementary approaches. AI safety engineering may be sufficient for relatively limited AIs such as those we have today, but when we build fully autonomous machines with general intelligence, we’ll need to make sure they want the same things we want, as the constraints that come with “safety engineering” will be insufficient at that point. Are you saying that safety engineering might also be sufficient for fully autonomous machines, or are you saying we might be able to convince the world to never build fully autonomous machines (so that we don’t need machine ethics), or are you saying something else?

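Roman: I think fully autonomous machines can never be safe and so should not be constructed. I am not naive; I don’t think I will succeed in convincing the world not to build fully autonomous machines, but I still think that point of view needs to be verbalized.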

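You are right to point out that AI safety engineering can only work on AIs which are not fully autonomous, but since I think that fully autonomous machines can never be safe, AI safety engineering is the best we can do.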

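I guess I should briefly explain why I think that fully autonomous machines can never be assumed to be safe. The difficulty of the problem is not that one particular step on the road to friendly AI is hard and once we solve it we are done; all the steps on that path are simply impossible. First, human values are inconsistent and dynamic, and so can never be understood by or programmed into a machine. Suggestions for overcoming this obstacle require changing humanity into something it is not, and so by definition destroying it. Second, even if we did have a consistent and static set of values to implement, we would have no way of knowing whether a self-modifying, self-improving, continuously learning intelligence greater than ours would continue to enforce that set of values. Some can argue that friendly AI research is exactly what will teach us how to do that, but I think fundamental limits on verifiability will prevent any such proof. At best we will arrive at a probabilistic proof that a system is consistent with some set of fixed constraints, but that is far from “safe” for an unrestricted set of inputs.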

It is also unlikely that a Friendly AI will be constructible before a general AI system, due to higher complexity and impossibility of incremental testing.

Worse yet, any truly intelligent system will treat its “be friendly” desire the same way very smart people deal with constraints placed in their minds by society. They basically see them as biases and learn to remove them. In fact if I understand correctly both the LessWrong community and CFAR are organizations devoted to removing pre-existing bias from human level intelligent systems (people) — why would a superintelligent machine not go through the same “mental cleaning” and treat its soft spot for humans as completely irrational? Or are we assuming that humans are superior to super-AI in their de-biasing ability?

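Luke: Thanks for clarifying. I agree that “Friendly AI” (a machine superintelligence that stably optimizes for humane values) might be impossible. Humans provide an existence proof for the possibility of general intelligence, but we have no existence proof for the possibility of Friendly AI. (Though, by the orthogonality thesis, there should be some super-powerful optimization process we would be happy to have created, though it may be very difficult to identify it in advance.)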

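You asked, “why would a superintelligent machine not . . . treat its soft spot for humans as completely irrational?” Rationality as typically defined in cognitive science and AI is relative to one’s goals. So if a rational-agent-style AI valued human flourishing (as a terminal rather than instrumental goal), then it would not treat its preference for human flourishing as irrational. It would do so only if its preference for human flourishing were an instrumental goal and it discovered a way to achieve its terminal values more efficiently without achieving the instrumental goal of human flourishing. Of course, the first powerful AIs to be built might not use a rational-agent architecture, and we might fail to specify “human flourishing” properly, and we might fail to build the AI such that it will preserve that goal structure upon self-modification, and so on. But if we succeed in all those things (and a few others) then I’m not so worried about a superintelligent machine treating its “soft spot for humans” as irrational, because rationality is defined in terms of one’s values.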

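Anyway: so it seems your recommended strategy for dealing with fully autonomous machines is “Don’t ever build them,” the “relinquishment” strategy surveyed in section 3.5 of Sotala & Yampolskiy (2013). Is there any conceivable way we could succeed in implementing that strategy?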

Roman: Many people are programmed from early childhood with a terminal goal of serving God. We can say that they are God friendly. Some of them, as they mature and become truly human-level-intelligent, remove this God friendliness bias despite it being a terminal, not instrumental, goal. So despite all the theoretical work on the orthogonality thesis, the only actual example of intelligent machines we have is extremely likely to give up its pre-programmed friendliness via rational de-biasing if exposed to certain new data.

I previously listed some problematic steps on the road to FAI, but it was not an exhaustive list. Additionally, all programs have bugs, can be hacked or malfunction because of natural or externally caused hardware failure, etc. To summarize, at best we will end up with a probabilistically safe system.

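Anyway, you asked me whether there is any conceivable way we could succeed in implementing the “Don’t ever build them” strategy. Conceivable, yes; desirable, NO. Societies such as the Amish or the North Koreans are unlikely to create superintelligent machines anytime soon. However, forcing a similar level of restriction on the use and development of technology is neither practical nor desirable.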

As the cost of hardware exponentially decreases, the capability necessary to develop an AI system opens up to single inventors and small teams. I would not be surprised if the first AI came out of a garage somewhere, in a way similar to how Apple and Google were started. Obviously, there is not much we can do to prevent that from happening.

Luke: Our discussion has split into two threads. I’ll address the first thread (about changing one’s values) in this question, and come back to the second thread (about relinquishment) in a later question.

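You talked about humans deciding that their theistic preferences are irrational. That is a good example of a general intelligence deciding to change its values. Indeed, as a former Christian, I had exactly that experience! And I agree that many general intelligences would do this kind of thing.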

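What I said in my previous comment was just that some kinds of AIs would not change their terminal values in this way, for example those with a rational agent architecture. Humans, famously, are not rational agents: we might say they have a “spaghetti code” architecture instead. (Even rational agents, however, will in some cases change their terminal values. See e.g. De Blanc 2011 and Bostrom 2012.)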

Do you think we disagree about anything here?

Roman: I am not sure. To me, “Even rational agents, however, will in some cases change their terminal values” means that friendly AI may decide to be unfriendly. If you agree with that, we are in complete agreement.

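Luke: Well, the idea is that if we can identify the particular contexts in which agents will change their terminal values, then perhaps we can prevent such changes. But that isn’t yet known. In any case, I certainly agree that an AI which seems to be “friendly,” as far as we can discern, could turn out not to be friendly, or could become unfriendly at some later point. The question is whether we can make the risk of that happening so small that it is worth running the AI anyway, especially in a context in which e.g. other actors will soon run other AIs with fewer safety guarantees. (This idea of running or “turning on” an AI for the first time is of course oversimplified, but hopefully I’ve communicated what I’m trying to say.)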

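Now, back to the question of relinquishment: Perhaps I’ve misheard you, but it sounds like you’re saying that machine ethics is hopeless, that AI safety engineering will be insufficient for fully autonomous AIs, and that fully autonomous AIs will be built because we can’t/shouldn’t rely on relinquishment. If that’s right, it seems like we have no “winning” options on the table. Is that what you’re saying?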

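Roman: Yes. I don’t see a permanent, 100% safe option. We can develop temporary solutions such as Confinement or AI Safety Engineering, but at best this will delay the full outbreak of problems. We can also get very lucky: maybe constructing AGI turns out to be too difficult or impossible, or maybe it is possible but the constructed AI will happen to be human-neutral by chance. Maybe we are less lucky, and an artilect war will take place and prevent development. It is also possible that, as more researchers join in AI safety research, a realization of the danger will result in diminished efforts to construct AGI. (This would be similar to how the perceived dangers of chemical and biological weapons or human cloning have at least temporarily reduced efforts in those fields.)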

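Luke: You’re currently raising funds on Indiegogo to support you in writing a book about machine superintelligence. Why are you writing the book, and what do you hope to accomplish with it?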

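Roman: Most people don’t read research papers. If we want the issue of AI safety to become as well-known as global warming, we need to address the majority of people in a more direct way. With such popularity might come some benefit, as I said in my answer to your previous question. Most people whose opinion matters read books. Unfortunately, the majority of AI books on the market today talk only about what AI systems will be able to do for us, not to us. I think that writing a book which addresses, in purely scientific terms, the potential dangers of AI and what we can do about them is going to be extremely beneficial to the reduction of risk posed by AGI. So I am currently writing the book I called Artificial Superintelligence: a Futuristic Approach. I can help cut down the final cost of publishing by taking advantage of printing in large quantities. In addition to crowd-funding the book, I am also relying on the power of the crowd to help me edit it. For just $64, anyone can become an editor of the book. You will get an early draft of the book to proofread, and you can suggest modifications and improvements! Your help will be acknowledged in the book, and you will of course also get a free signed hardcopy of the book in its final form. In fact, the option to become an editor turned out to be as popular as the option to pre-order a digital copy of the book, indicating that I am on the right path here. So I encourage everyone concerned about the issue of AI safety to consider helping out with the project in any way they can.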

Luke: Thanks, Roman!
