class Memo::QueryCache

Memo::QueryCache < Reference < Object
Overview
LRU query embedding cache with optional DB persistence.
Avoids repeated API calls for the same query text by caching the embedding vector. The in-memory LRU is checked first (instant), then the DB (fast), then the API (slow).
Memory is bounded by max_entries (LRU eviction); the DB is bounded by max_db_entries (oldest entries evicted on prune).
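A minimal usage sketch of the lookup order described above. The require path follows "Defined in" below; `embed_api` is a hypothetical stand-in for the real embedding API call, not part of this class:

```crystal
require "memo/query_cache"

cache = Memo::QueryCache.new(max_entries: 10_000)

query = "how do I reset my password?"

# Fast path: memory LRU first, then DB (when one was passed to .new).
if cached = cache.get(query)
  embedding, token_count = cached
else
  # Slow path: `embed_api` is a hypothetical helper wrapping the API.
  embedding, token_count = embed_api(query)
  cache.put(query, embedding, token_count)
end
```

With this pattern the slow API path runs at most once per distinct query text.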
Defined in:
memo/query_cache.cr
Constructors
- .new(max_entries : Int32 = 10000, max_db_entries : Int32 = 100000, db : DB::Database | Nil = nil, service_id : Int64 = 0)
Instance Method Summary
-
#clear
Clear all cached entries (memory and DB).
-
#get(query : String) : Tuple(Array(Float64), Int32) | Nil
Look up a cached embedding for a query string.
-
#hit_rate : Float64
Cache hit rate as a percentage.
- #hits : Int64
- #max_db_entries : Int32
- #max_entries : Int32
- #misses : Int64
-
#put(query : String, embedding : Array(Float64), token_count : Int32)
Store an embedding for a query string.
-
#size : Int32
Number of entries in the memory cache.
Constructor Detail
def self.new(max_entries : Int32 = 10000, max_db_entries : Int32 = 100000, db : DB::Database | Nil = nil, service_id : Int64 = 0)
Instance Method Detail
def get(query : String) : Tuple(Array(Float64), Int32) | Nil
Look up a cached embedding for a query string. Returns {embedding, token_count} or nil on miss.
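Because the return type is `Tuple(Array(Float64), Int32) | Nil`, a hit must be distinguished from a miss before destructuring. A sketch, assuming a freshly constructed cache:

```crystal
cache = Memo::QueryCache.new

# `get` returns nil on a miss, so guard before unpacking the tuple.
if result = cache.get("hello world")
  embedding, token_count = result
  puts "hit: #{embedding.size}-dim vector, #{token_count} tokens"
else
  puts "miss: fall through to the embedding API"
end
```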
def put(query : String, embedding : Array(Float64), token_count : Int32)
Store an embedding for a query string.
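A sketch of how `put` pairs with the counters and maintenance methods summarized above; the vector values are illustrative, not real embeddings:

```crystal
cache = Memo::QueryCache.new

cache.put("greeting", [0.12, -0.04, 0.33], 2)

cache.get("greeting")   # served from the memory LRU
cache.get("farewell")   # miss: nothing cached for this text

puts cache.size         # entries in the memory cache
puts cache.hit_rate     # percentage, derived from #hits and #misses

cache.clear             # drop all entries (memory and DB)
```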