
Informative Essay #4

Student name:


Ricardo David Rosa Barahona

Account number:
32141266

Campus:
Ceutec Tegucigalpa

Instructor:
Lic. Jose Ucles

Section:
1371

Due date:
26/05/2024

I. Introduction
Thesis: It is critical that we prioritize AI safety by aligning AI
systems with human values and ethics, ensuring AI systems are
robust, reliable and secure, and mitigating existential risk from
advanced AI systems, so that these powerful technologies are
developed and deployed responsibly.

II. Aligning AI systems with human values and ethics


a. Importance of imbuing AI with human values and ethics.
b. Challenges in defining and encoding human values.

III. Ensuring AI systems are robust, reliable and secure


a. Importance of AI systems behaving as intended.
b. Challenges like distributional shift or hacking.

IV. Mitigating existential risk from advanced AI systems


a. Avoiding catastrophic outcomes from advanced AI.
b. Predicting and controlling the trajectory of AI.

V. Conclusion
“Aligning AI systems with human values and ethics, ensuring AI
systems are robust, reliable and secure, and mitigating existential
risk from advanced AI systems are critical challenges that must be
resolved through research and responsible development.”

Artificial Intelligence Safety


As artificial intelligence systems become more advanced and capable, it is critical
that we prioritize AI safety to ensure these powerful technologies are developed and
deployed responsibly. There are three key aspects of AI safety that must be addressed:
aligning AI systems with human values and ethics, ensuring AI systems are robust, reliable
and secure, and mitigating existential risk from advanced AI systems.

It is essential that AI systems embody human values and ethics. This is a major
challenge, since defining and encoding human values is a complex philosophical and
technical problem. Learning-based approaches, in which AI systems infer values from
human feedback, and debate, where artificial intelligence systems argue opposing points
of view, show promise in this area. However, much more research is needed to reliably
align advanced AI systems with human values.

AI systems must be robust, reliable and secure. It is critical that these systems
behave as intended, even in the face of distributional shift, where inputs differ from the
data a system was trained on, or hacking. Techniques like amplification, which recursively
improves the performance of an AI system, and debate, where AI systems argue about the
reliability of other AI systems, can help address these challenges. However, ensuring the
reliability of advanced AI systems remains an open problem.
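The distributional-shift concern above can be made concrete with a toy check: comparing a simple statistic of incoming data against the data a model was trained on, and flagging inputs that look unfamiliar. This is only an illustrative sketch; the statistic and threshold here are assumptions, not a standard detection method.

```python
import statistics

def shift_score(train, live):
    """How far the live data's mean drifts from the training mean,
    measured in training standard deviations."""
    return abs(statistics.mean(live) - statistics.mean(train)) / statistics.stdev(train)

# Training data and two batches of incoming data.
train = [0.9, 1.0, 1.1, 1.0, 0.95, 1.05]
similar = [1.0, 0.98, 1.02]   # looks like the training distribution
shifted = [3.0, 3.2, 2.9]     # clearly outside it

# A batch scoring above a chosen threshold would be flagged for review.
print(shift_score(train, similar))  # small score: no shift detected
print(shift_score(train, shifted))  # large score: distributional shift
```

Real systems use far richer tests than a mean comparison, but the principle is the same: a model should notice, rather than silently mishandle, inputs unlike anything it was trained on.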

We must mitigate the existential risk posed by advanced AI systems. Predicting
and controlling the trajectory of AI is extremely difficult, and the potential consequences of
getting it wrong could be severe. AI alignment research, which aims to ensure that
advanced AI systems remain aligned with human values, and technical work on AI security,
which develops tools to make AI systems safer, are essential to meeting this challenge.
However, much more work is needed to ensure that advanced AI systems remain under
human control.

In conclusion, as artificial intelligence becomes more advanced, prioritizing AI
safety is essential. Aligning AI systems with human values and ethics, ensuring AI systems
are robust, reliable and secure, and mitigating existential risk from advanced AI systems are
critical challenges that must be resolved through research and responsible development. By
prioritizing AI safety, we can make good use of artificial intelligence while avoiding its
most serious risks.
