What is the function of a fire alarm?
One might think that the function of a fire alarm is to provide you with important evidence about a fire existing, allowing you to change your policy accordingly and exit the building.
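In the classic experiment by Latane and Darley in 1968, eight groups of three students each were asked to fill out a questionnaire in a room that shortly afterward began filling with smoke. Five of the eight groups didn’t react or report the smoke, even as it became dense enough to make them start coughing. Subsequent manipulations showed that a lone student will respond 75% of the time, while a student accompanied by two actors told to feign apathy will respond only 10% of the time. This and other experiments seemed to pin down that what’s happening is pluralistic ignorance. We don’t want to look panicky by being afraid of what isn’t an emergency, so we try to look calm while glancing out of the corners of our eyes to see how others are reacting, but of course they are also trying to look calm.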
(I’ve read a number of replications and variations on this research, and the effect size is blatant. I would not expect this to be one of the results that dies to the replication crisis, and I haven’t yet heard about the replication crisis touching it. But we have to put a maybe-not marker on everything now.)
A fire alarm creates common knowledge, in the you-know-I-know sense, that there is a fire; after which it is socially safe to react. When the fire alarm goes off, you know that everyone else knows there is a fire, you know you won’t lose face if you proceed to exit the building.
The fire alarm doesn’t tell us with certainty that a fire is there. In fact, I can’t recall one time in my life when, exiting a building on a fire alarm, there was an actual fire. Really, a fire alarm is weaker evidence of fire than smoke coming from under a door.
But the fire alarm tells us that it’s socially okay to react to the fire. It promises us with certainty that we won’t be embarrassed if we now proceed to exit in an orderly fashion.
It seems to me that this is one of the cases where people have mistaken beliefs about what they believe, like when somebody loudly endorsing their city’s team to win the big game will back down as soon as asked to bet. They haven’t consciously distinguished the rewarding exhilaration of shouting that the team will win, from the feeling of anticipating the team will win.
When people look at the smoke coming from under the door, I think they think their uncertain wobbling feeling comes from not assigning the fire a high-enough probability of really being there, and that they’re reluctant to act for fear of wasting effort and time. If so, I think they’re interpreting their own feelings mistakenly. If that was so, they’d get the same wobbly feeling on hearing the fire alarm, or even more so, because fire alarms correlate to fire less than does smoke coming from under a door. The uncertain wobbling feeling comes from the worry that others believe differently, not the worry that the fire isn’t there. The reluctance to act is the reluctance to be seen looking foolish, not the reluctance to waste effort. That’s why the student alone in the room does something about the fire 75% of the time, and why people have no trouble reacting to the much weaker evidence presented by fire alarms.
It’s now and then proposed that we ought to start reacting later to the issues of Artificial General Intelligence (background here), because, it is said, we are so far away from it that it just isn’t possible to do productive work on it today.
(For direct argument about there being things doable today, see: Soares and Fallenstein (2014/2017); Amodei, Olah, Steinhardt, Christiano, Schulman, and Mané (2016); or Taylor, Yudkowsky, LaVictoire, and Critch (2016).)
(If none of those papers existed, or if you were an AI researcher who’d read them but thought they were all garbage, and you wished you could work on alignment but knew of nothing you could do, the wise next step would be to sit down and spend two hours by the clock sincerely trying to think of possible approaches. Preferably without self-sabotage that makes sure you don’t come up with anything plausible; as might happen if, hypothetically speaking, you would actually find it much more comfortable to believe there was nothing you ought to be working on today, because e.g. then you could work on other things that interested you more.)
(But never mind.)
So if AGI seems far-ish away, and you think the conclusion licensed by this is that you can’t do any productive work on AGI alignment yet, then the implicit alternative strategy on offer is: Wait for some unspecified future event that tells us AGI is coming near; and then we’ll all know that it’s okay to start working on AGI alignment.
This seems to me to be wrong, on a number of grounds. Here are some of them.