Towards an Axiological Approach to AI Alignment
AI alignment research currently operates primarily within the framework of decision theory, seeking ways to align or constrain agents' utility functions so that they avoid “bad” outcomes and favor “good” ones. I think this is a reasonable approach…