InformativeEssay RicardoRosa#32141266
Account number:
32141266
Campus:
Ceutec Tegucigalpa
Instructor:
Lic. Jose Ucles
Section:
1371
Due date:
26/05/2024
I. Introduction
Thesis: It is critical that we prioritize AI safety by aligning AI
systems with human values and ethics, ensuring AI systems are
robust, reliable, and secure, and mitigating existential risk from
advanced AI systems, so that these powerful technologies are
developed and deployed responsibly.
V. Conclusion
“Aligning AI systems with human values and ethics, ensuring AI
systems are robust, reliable, and secure, and mitigating existential
risk from advanced AI systems are critical challenges that must be
resolved through research and responsible development.”
It is essential that AI systems be aligned with human values and ethics. This is an
important challenge, since defining and codifying human values is a complex philosophical
and technical problem. Learning-based approaches, and debate, where artificial
intelligence systems argue opposing points of view, show promise in this area. However,
much more research is needed to reliably align advanced AI systems with human values.
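The debate approach mentioned above can be illustrated with a toy sketch. This is purely illustrative and not the actual method from the literature: the scoring function and all names here are hypothetical stand-ins for what would, in practice, be learned models and a human or trained judge.

```python
# Toy model of AI safety via debate: two "debaters" argue opposite
# sides of a claim over several rounds, and a simple "judge" endorses
# the side whose arguments score higher overall. The scoring function
# is a hypothetical placeholder for a learned or human judge.

def debate(judge_scores, rounds=3):
    """judge_scores: function mapping (side, round) -> score in [0, 1]."""
    totals = {"pro": 0.0, "con": 0.0}
    for r in range(rounds):
        for side in totals:
            totals[side] += judge_scores(side, r)
    # The judge endorses the side with the stronger cumulative case.
    winner = max(totals, key=totals.get)
    return winner, totals

# Example: a fixed scoring function that consistently favors "pro".
winner, totals = debate(lambda side, r: 0.8 if side == "pro" else 0.4)
```

In the real proposal the judge is the component being trained to reward honest, verifiable arguments; this sketch only captures the round-based adversarial structure.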
AI systems must be robust, reliable, and secure. It is critical that these systems
behave as intended, even in the face of distributional shift or hacking. Techniques like
amplification, which recursively improves the performance of an AI system, and debate,
where AI systems scrutinize the reliability of other AI systems, can help address these
challenges. However, ensuring the reliability of advanced AI systems remains an open
problem.
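One concrete, if simplistic, guard against the distributional shift mentioned above is to flag inputs whose features fall far outside the training distribution. The sketch below assumes numeric feature vectors and uses a simple z-score threshold; both the threshold and the statistics used are hypothetical choices for this example, not a production technique.

```python
import statistics

def fit_stats(training_data):
    """Compute per-feature (mean, stdev) from a list of feature vectors."""
    columns = list(zip(*training_data))
    return [(statistics.mean(c), statistics.stdev(c)) for c in columns]

def is_out_of_distribution(x, stats, z_threshold=3.0):
    """True if any feature of x lies more than z_threshold standard
    deviations from the training mean (features with zero variance
    are skipped)."""
    return any(abs(v - mu) / sigma > z_threshold
               for v, (mu, sigma) in zip(x, stats) if sigma > 0)

# Hypothetical training inputs with two numeric features each.
train = [[1.0, 10.0], [1.2, 9.5], [0.9, 10.4], [1.1, 9.8]]
stats = fit_stats(train)
```

A system could refuse to act, or defer to a human, whenever such a check fires, rather than extrapolating silently outside its training distribution.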
We must reduce the existential risk posed by advanced AI systems. Since predicting
and controlling the trajectory of AI is extremely difficult, the potential consequences of
getting it wrong could be severe. AI alignment research, which aims to ensure that
advanced AI systems remain aligned with human values, and technical work on AI safety,
which develops tools to make AI systems safer, are essential to meeting this challenge.
However, much more work is needed to ensure that advanced AI systems remain under human
control.
References
Amodei, D., Olah, C., Steinhardt, J., Christiano, P., Schulman, J., & Mané, D. (2016).
Concrete problems in AI safety. arXiv [cs.AI]. http://arxiv.org/abs/1606.06565
Dalrymple, D. “davidad”, Skalse, J., Bengio, Y., Russell, S., Tegmark, M., Seshia, S.,
Omohundro, S., Szegedy, C., Goldhaber, B., Ammann, N., Abate, A., Halpern, J., Barrett,
C., Zhao, D., Zhi-Xuan, T., Wing, J., & Tenenbaum, J. (2024). Towards guaranteed safe AI:
A framework for ensuring robust and reliable AI systems. arXiv [cs.AI].
http://arxiv.org/abs/2405.06624
Krakovna, V., Orseau, L., Martic, M., & Legg, S. (n.d.). Penalizing side effects using
stepwise relative reachability. Ceur-ws.org. Retrieved May 27, 2024, from
https://ceur-ws.org/Vol-2419/paper_1.pdf
Two types of AI existential risk: Decisive and accumulative. (n.d.). arXiv. Retrieved
May 27, 2024, from https://arxiv.org/html/2401.07836v2