Constitutional AI: Harmlessness from AI Feedback
Authors: Yuntao Bai, Saurav Kadavath, Sandipan Kundu, et al.
Year: 2022
Journal: arXiv
DOI: 10.48550/arXiv.2212.08073
Publisher: https://arxiv.org/abs/2212.08073
Keywords: constitutional ai, cai
Abstract
We present Constitutional AI, a method for training a harmless AI assistant using a constitution of principles.
Cite this paper
bibtex
@misc{cai2022,
title = {Constitutional AI: Harmlessness from AI Feedback},
author = {Yuntao Bai and Saurav Kadavath and Sandipan Kundu and others},
year = {2022},
journal = {arXiv},
doi = {10.48550/arXiv.2212.08073},
url = {https://doi.org/10.48550/arXiv.2212.08073},
}