Honesty is the best policy: defining and mitigating AI deception
File(s)
2312.01350.pdf (599.07 KB)
Preprint
Author(s)
Ward, Francis Rhys
Belardinelli, Francesco
Toni, Francesca
Everitt, Tom
Type
Preprint
Date Issued
2023-12-03
Citation
arXiv, 2023
Journal / Book Title
arXiv
Copyright Statement
Copyright © 2023 The Author(s). This work is licensed under a Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/).
License URL
https://creativecommons.org/licenses/by/4.0/
Identifier
https://arxiv.org/abs/2312.01350