In this case you will need access to a GPU accelerated AWS EC2 instance with 8 cores, 64GB memory and 2X A100 or L40s GPUs. For this option, deploy the RAG pipeline on AWS using a g5.12xlarge machine.
"Now, the vast majority of generative AI workloads today run on Nvidia GPUs, and AWS is by far the best place anywhere in the world to run GPU workloads," boasted Garman. "Part of the reason is ...