Optimise LLM usage costs with Semantic Cache
A semantic cache in a RAG system answers similar questions from cached responses instead of issuing new LLM calls, cutting token usage and lowering overall API costs without affecting answer quality.
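The idea can be sketched in a few lines: embed each incoming question, compare it against the embeddings of previously answered questions, and return the cached answer when the similarity clears a threshold. This is a minimal, self-contained sketch; the toy bag-of-words embedding, the `SemanticCache` class, and the 0.8 threshold are illustrative assumptions, not the article's implementation — a real system would use a proper sentence-embedding model and a vector store.

```python
import hashlib
import math
from typing import Callable, Optional

def embed(text: str, dim: int = 64) -> list[float]:
    """Toy embedding: hash each word into a fixed-size normalized vector.
    Stands in for a real sentence-embedding model (an assumption here)."""
    vec = [0.0] * dim
    for word in text.lower().split():
        word = word.strip("?!.,")
        idx = int(hashlib.md5(word.encode()).hexdigest(), 16) % dim
        vec[idx] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def cosine(a: list[float], b: list[float]) -> float:
    # Vectors are already unit-normalized, so the dot product is the cosine.
    return sum(x * y for x, y in zip(a, b))

class SemanticCache:
    """Linear-scan semantic cache: fine for a sketch, replace with a
    vector index (e.g. an ANN library) at scale."""

    def __init__(self, threshold: float = 0.8):
        self.threshold = threshold
        self.entries: list[tuple[list[float], str]] = []  # (embedding, answer)

    def lookup(self, question: str) -> Optional[str]:
        q = embed(question)
        best_answer, best_sim = None, 0.0
        for vec, answer in self.entries:
            sim = cosine(q, vec)
            if sim > best_sim:
                best_answer, best_sim = answer, sim
        return best_answer if best_sim >= self.threshold else None

    def store(self, question: str, answer: str) -> None:
        self.entries.append((embed(question), answer))

def answer_with_cache(cache: SemanticCache, question: str,
                      call_llm: Callable[[str], str]) -> str:
    cached = cache.lookup(question)
    if cached is not None:
        return cached          # cache hit: no LLM call, zero token cost
    answer = call_llm(question)
    cache.store(question, answer)  # pay once, reuse for similar questions
    return answer
```

With this in place, a repeated or near-duplicate question ("What is the capital of France?" vs. "what is the capital of france") triggers only one paid LLM call; the second is served from the cache. The threshold trades cost savings against the risk of returning a cached answer to a question that is similar in wording but different in intent.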
Source: HackerNoon →