InformativeEssay RicardoRosa#32141266

Informative Essay #4
Nombre del estudiante:

Ricardo David Rosa Barahona
Número de cuenta:
32141266
Sede de estudio:
Ceutec Tegucigalpa
Docente:
Lic. Jose Ucles
Sección:
1371
Fecha de entrega:
26/05/2024
I. Introduction
Thesis: It is critical that we prioritize AI safety by aligning AI
systems with human values and ethics, ensuring AI systems are
robust, reliable and secure and Mitigating existential risk from
advanced AI systems to ensure these powerful technologies are
developed and deployed responsibly.
II. Aligning AI systems with human values and ethics

a. Importance of imbuing AI with human values and ethics.
b. Challenges in defining and encoding human values.
III. Ensuring AI systems are robust, reliable and secure

a. Importance of AI systems behaving as intended.
b. Challenges like distributional shift or hacking.
IV. Mitigating existential risk from advanced AI systems

a. Avoiding catastrophic outcomes from advanced AI.
b. Predicting and controlling the trajectory of AI.
V. Conclusion
“Aligning AI systems with human values and ethics, ensuring AI
systems are robust, reliable and secure, and mitigating existential
risk from advanced AI systems are critical challenges that must be
resolved through research and responsible development.”
Artificial Intelligence Safety

As artificial intelligence systems become more advanced and capable, it is critical
that we prioritize AI safety to ensure these powerful technologies are developed and
deployed responsibly. There are three key aspects of AI safety that must be addressed:
aligning AI systems with human values and ethics, ensuring AI systems are robust, reliable
and secure, and mitigating existential risk from advanced AI systems.
It is essential that AI systems have human values and ethics. This is an important
challenge, since defining and codifying human values is a complex philosophical and
technical problem. There are many learning approaches that require advanced features, and
debate, where artificial intelligence systems argue opposing points of view, show promise
in this area. However, much more research is needed to reliably align advanced AI systems
with human values.
AI systems must be robust, reliable and secure. It is critical that these systems
behave as intended, even in the face of distributional shift or hacking. Techniques like
amplification, which recursively improves the performance of an AI system, and debate,
where AI systems argue about the reliability of other AI systems, can help address these
challenges. However, ensuring the reliability of advanced AI systems remains an open
problem.
We must reduce the existential risk shown by advanced AI systems. Since predicting
and controlling the trajectory of AI is extremely difficult, the potential consequences of
getting it wrong can be serious. AI alignment research, responsible for ensuring that
advanced AI systems remain aligned with human values, and technical work on AI security,
which develops tools to make AI systems safer, are essential to this challenge. However,
much more work is needed to make advanced AI systems remain under human control.
In conclusion, as artificial intelligence becomes more advanced, prioritizing AI

safety is essential. Aligning AI systems with human values and ethics, ensuring AI systems
are robust, reliable and secure, and mitigating existential risk from advanced AI systems are
critical challenges that must be resolved through research and responsible development. By
prioritizing AI safety, we can make good use of artificial intelligence while avoiding its
most serious risks.
Bibliography
Amodei, D., Olah, C., Steinhardt, J., Christiano, P., Schulman, J., & Mané, D. (2016).
Concrete Problems in AI Safety. En arXiv [cs.AI]. http://arxiv.org/abs/1606.06565
Dalrymple, D. “davidad”, Skalse, J., Bengio, Y., Russell, S., Tegmark, M., Seshia, S.,
Omohundro, S., Szegedy, C., Goldhaber, B., Ammann, N., Abate, A., Halpern, J., Barrett,
C., Zhao, D., Zhi-Xuan, T., Wing, J., & Tenenbaum, J. (2024). Towards guaranteed safe AI:
A framework for ensuring robust and reliable AI systems. En arXiv [cs.AI].
http://arxiv.org/abs/2405.06624
Krakovna, V., Orseau, L., Martic, M., & Legg, S. (s/f). Penalizing side effects using
stepwise relative reachability. Ceur-ws.org. Recuperado el 27 de mayo de 2024, de
https://ceur-ws.org/Vol-2419/paper_1.pdf
Two types of AI existential risk: Decisive and accumulative. (s/f). En arXiv. Recuperado el
27 de mayo de 2024, de https://arxiv.org/html/2401.07836v2

InformativeEssay RicardoRosa#32141266

Uploaded by

Document Information

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

InformativeEssay RicardoRosa#32141266

Uploaded by

Copyright:

Available Formats

Informative Essay #4

Nombre del estudiante:

II. Aligning AI systems with human values and ethics

III. Ensuring AI systems are robust, reliable and secure

IV. Mitigating existential risk from advanced AI systems

Artificial Intelligence Safety

In conclusion, as artificial intelligence becomes more advanced, prioritizing AI

You might also like