open/source
2026

Gemini CAG Demo (Open Source)

Cache-Augmented Generation with the Google Gemini API: cache documents once, query cached tokens at about 90% lower cost.

Key Metrics

Cost Reduction: ~90%

About This Project

A working demo of Cache-Augmented Generation (CAG) with the Google Gemini API. Documents are loaded into a context cache once; later queries reuse the cached tokens, which are billed at roughly 90% less than tokens sent as regular input. The demo includes a live cost-savings dashboard, a compare mode that runs the same query with CAG and with the full context, and handling for the cache TTL lifecycle. It is tested with pytest and ships with a Docker setup.
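The cost comparison behind the dashboard can be sketched with a few lines of arithmetic. This is a minimal illustration, not the project's actual code: the function `estimate_costs`, the `CostEstimate` class, and the price of 1.00 per million input tokens are hypothetical placeholders. Only the roughly 90% discount on cached tokens comes from the description above. The sketch also assumes that creating the cache costs one full-price pass over the documents, and it ignores any cache storage fee.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CostEstimate:
    """Input-token cost of answering a batch of queries, in currency units."""

    full_context: float  # documents resent in full with every query
    cached: float        # documents read from the context cache

    @property
    def savings_ratio(self) -> float:
        """Fraction of the full-context cost saved by caching."""
        if self.full_context == 0:
            return 0.0
        return 1.0 - self.cached / self.full_context


def estimate_costs(
    doc_tokens: int,
    question_tokens: int,
    num_queries: int,
    price_per_million: float = 1.00,  # hypothetical price, not a real quote
    cached_discount: float = 0.90,    # "about 90% lower cost" for cached tokens
) -> CostEstimate:
    """Compare full-context and cache-augmented input costs."""
    rate = price_per_million / 1_000_000
    cached_rate = rate * (1.0 - cached_discount)

    # Full-context mode: every query pays full price for documents + question.
    full = num_queries * (doc_tokens + question_tokens) * rate

    # CAG mode (assumption): one full-price pass to create the cache, then
    # each query pays the discounted rate for the cached documents and the
    # full rate for its own, uncached question tokens.
    cached = doc_tokens * rate + num_queries * (
        doc_tokens * cached_rate + question_tokens * rate
    )
    return CostEstimate(full_context=full, cached=cached)


if __name__ == "__main__":
    estimate = estimate_costs(doc_tokens=100_000, question_tokens=50, num_queries=100)
    print(f"full context: {estimate.full_context:.4f}")
    print(f"cached:       {estimate.cached:.4f}")
    print(f"savings:      {estimate.savings_ratio:.1%}")
```

Under these assumptions the overall saving stays a little below 90%, because the cache creation pass and the uncached question tokens are still paid at the full rate. The gap narrows as the number of queries per cached document set grows.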

Technologies Used

Python, Flask, Google Gemini, google-genai

Interested in Similar Work?

I'm available for freelance projects and full-time opportunities. Let's discuss how I can help bring your ideas to life.
