Gemini CAG Demo (Open Source)
Cache-Augmented Generation with the Google Gemini API: cache documents once, query cached tokens at about 90% lower cost.
Key Metrics
Cost Reduction: ~90%
About This Project
A working demo of Cache-Augmented Generation (CAG) with the Google Gemini API. Documents are loaded into a context cache once; later queries reuse the cached tokens, which are billed at roughly 90% lower cost than resending the documents with every request. The demo includes a live cost-savings dashboard, a compare mode that runs CAG against full-context prompting, and handling of the cache TTL lifecycle. It is tested with pytest and containerized with Docker.
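The cost argument above can be sketched with a little arithmetic. This is a minimal illustration, not the project's dashboard code: the `Pricing` class, the function names, and the price of 1.00 USD per million input tokens are all hypothetical, and the model assumes the documents are written to the cache once at the full input price while ignoring cache storage fees and output tokens.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    # Hypothetical price in USD per 1M input tokens; not an actual Gemini rate.
    input_per_million: float = 1.00
    # The "roughly 90% lower" figure for cached tokens, taken from the text above.
    cached_discount: float = 0.90


def full_context_cost(doc_tokens: int, query_tokens: int, n_queries: int, p: Pricing) -> float:
    """Cost when every query resends the documents at the full input price."""
    return n_queries * (doc_tokens + query_tokens) * p.input_per_million / 1_000_000


def cag_cost(doc_tokens: int, query_tokens: int, n_queries: int, p: Pricing) -> float:
    """Cost when documents are cached once and read at the discounted rate.

    Assumption: one cache write at full price; storage fees are ignored.
    """
    write = doc_tokens * p.input_per_million / 1_000_000
    cached_read = doc_tokens * p.input_per_million * (1 - p.cached_discount) / 1_000_000
    fresh = query_tokens * p.input_per_million / 1_000_000
    return write + n_queries * (cached_read + fresh)


def savings_fraction(doc_tokens: int, query_tokens: int, n_queries: int, p: Pricing) -> float:
    """Fraction of the full-context cost saved by caching (negative if caching costs more)."""
    full = full_context_cost(doc_tokens, query_tokens, n_queries, p)
    return 1 - cag_cost(doc_tokens, query_tokens, n_queries, p) / full


if __name__ == "__main__":
    p = Pricing()
    # Illustrative workload: 100k document tokens, 100-token questions, 50 queries.
    print(f"full context: ${full_context_cost(100_000, 100, 50, p):.3f}")
    print(f"CAG:          ${cag_cost(100_000, 100, 50, p):.3f}")
    print(f"savings:      {savings_fraction(100_000, 100, 50, p):.1%}")
```

Under these assumptions the overall savings approach the 90% figure only as the number of queries grows, because the one-time cache write is spread across more requests; with a single query, caching costs slightly more than sending the full context.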
Technologies Used
Interested in Similar Work?
I'm available for freelance projects and full-time opportunities. Let's discuss how I can help bring your ideas to life.