MIRI updates
- MIRI researcher Evan Hubinger discusses learned optimization, interpretability, and homogeneity in takeoff speeds on the Inside View podcast.
- Scott Garrabrant releases part three of "Finite Factored Sets", on conditional orthogonality.
- UC Berkeley's Daniel Filan provides examples of conditional orthogonality in finite factored sets: 1, 2.
- Abram Demski proposes factoring the alignment problem into "outer alignment" / "on-distribution alignment", "inner robustness" / "capability robustness", and "objective robustness" / "inner alignment".
- MIRI senior researcher Eliezer Yudkowsky summarizes "the real core of the argument for 'AGI risk' (AGI ruin)" as "appreciating the power of intelligence enough to realize that getting superhuman intelligence wrong, on the first try, will kill you on the first try, not let you learn and try again".
News and links