Enhanced Hybrid RAG-LLM Architecture for Domain-Specific Cloud Infrastructure Management: Advancing Context-Aware Decision-Making Strategies

Madhu Chavva
Sathiesh Veera

Abstract

Effective management of cloud infrastructure requires decision-making strategies that account for the dynamic and complex nature of cloud environments. This paper presents an enhanced hybrid Retrieval-Augmented Generation with Large Language Model (RAG-LLM) architecture tailored to domain-specific cloud infrastructure management. The proposed framework integrates retrieval-based knowledge extraction with generative reasoning to improve contextual awareness and decision accuracy. By combining real-time data retrieval from the cloud infrastructure with the generative capabilities of LLMs, the system supports more informed and adaptive decision-making. The hybrid approach addresses key challenges, including latency, scalability, and domain-specific knowledge gaps, by leveraging both structured and unstructured data sources. Experimental results show faster response times, higher decision accuracy, and better adaptability to evolving infrastructure states. The architecture thus advances cloud infrastructure management by improving situational awareness and enabling more strategic resource allocation and fault resolution.
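The retrieve-then-generate flow the abstract describes can be illustrated with a minimal sketch. All names here (the `Document` type, the keyword-overlap scorer, the sample corpus of metrics, runbook, and inventory entries) are illustrative assumptions, not the authors' implementation; a production system would use a dense retriever and a real LLM call in place of the toy scorer and prompt assembly shown:

```python
from dataclasses import dataclass

@dataclass
class Document:
    source: str  # e.g. "metrics", "runbook", "inventory"
    text: str

def score(query: str, doc: Document) -> float:
    """Toy relevance score: fraction of query tokens found in the document."""
    q = set(query.lower().split())
    d = set(doc.text.lower().split())
    return len(q & d) / (len(q) or 1)

def retrieve(query: str, corpus: list[Document], k: int = 2) -> list[Document]:
    """Return the k most relevant documents for the query."""
    return sorted(corpus, key=lambda doc: score(query, doc), reverse=True)[:k]

def build_prompt(query: str, docs: list[Document]) -> str:
    """Assemble retrieved infrastructure context into a prompt for an LLM."""
    context = "\n".join(f"[{d.source}] {d.text}" for d in docs)
    return f"Context:\n{context}\n\nTask: {query}\nAnswer:"

# Hypothetical snapshot of cloud-infrastructure state and operational knowledge.
corpus = [
    Document("metrics", "node-3 cpu utilization above 90 percent for 10 minutes"),
    Document("runbook", "high cpu on a node: cordon the node and scale the pool"),
    Document("inventory", "cluster has 5 nodes in pool default"),
]

query = "node-3 cpu utilization high, what action?"
prompt = build_prompt(query, retrieve(query, corpus))
# `prompt` would then be passed to an LLM for the generative-reasoning step.
```

The sketch shows the two stages the hybrid approach combines: grounding (retrieving current infrastructure state and domain runbooks) and generation (handing that context to an LLM), which is what distinguishes the architecture from a purely generative baseline.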

How to Cite
Chavva, M., & Veera, S. (2023). Enhanced Hybrid RAG-LLM Architecture for Domain-Specific Cloud Infrastructure Management: Advancing Context-Aware Decision-Making Strategies. American Journal of AI & Innovation, 5(5). Retrieved from https://journals.theusinsight.com/index.php/AJAI/article/view/70
