class Memo::QueryCache

Overview

LRU query embedding cache with optional DB persistence.

Avoids repeated API calls for the same query text by caching the embedding vector. The in-memory LRU is checked first (instant), then the DB (fast), then the API (slow).

The memory cache is bounded by max_entries (LRU eviction); the DB cache is bounded by max_db_entries (oldest entries evicted on prune).
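
A usage sketch of that flow; the embed_via_api helper is hypothetical and stands in for whatever slow embedding API call the caller makes on a miss:

```crystal
cache = Memo::QueryCache.new(max_entries: 5_000)

query = "how do I reset my password?"
if cached = cache.get(query)                    # memory LRU first, then DB
  embedding, token_count = cached
else
  embedding, token_count = embed_via_api(query) # hypothetical: slow API call
  cache.put(query, embedding, token_count)      # cache for subsequent lookups
end
```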

Defined in:

memo/query_cache.cr

Constructors

Instance Method Summary

Constructor Detail

def self.new(max_entries : Int32 = 10000, max_db_entries : Int32 = 100000, db : DB::Database | Nil = nil, service_id : Int64 = 0) #
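
A hedged construction example with DB persistence; the sqlite3 driver, connection string, and service_id value are assumptions (any DB::Database should work, and db defaults to nil for memory-only caching):

```crystal
require "db"
require "sqlite3" # assumption: any DB::Database-backed driver should work

db = DB.open("sqlite3://./memo.db")
cache = Memo::QueryCache.new(
  max_entries: 10_000,     # memory LRU bound
  max_db_entries: 100_000, # DB bound; oldest entries pruned past this
  db: db,                  # omit (nil) for memory-only caching
  service_id: 1_i64        # assumed to scope entries per embedding service
)
```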

[View source]

Instance Method Detail

def clear #

Clear all cached entries (memory and DB).


[View source]
def get(query : String) : Tuple(Array(Float64), Int32) | Nil #

Look up a cached embedding for a query string. Returns {embedding, token_count} or nil on miss.
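
For example, destructuring the returned tuple (assuming a cache instance is in scope):

```crystal
if result = cache.get("what is an LRU cache?")
  embedding, token_count = result
  puts "hit: #{embedding.size}-dim vector, #{token_count} tokens"
else
  puts "miss: fetch the embedding from the API, then call put"
end
```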


[View source]
def hit_rate : Float64 #

Cache hit rate as a percentage (presumably hits / (hits + misses) × 100, inferred from the counters below).
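
For example (assuming a cache instance that has served some lookups):

```crystal
puts "hit rate: #{cache.hit_rate.round(1)}%" # e.g. 87.5 after 7 hits, 1 miss
```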


[View source]
def hits : Int64 #

Number of cache hits (memory or DB) since creation.

[View source]
def max_db_entries : Int32 #

Maximum number of entries kept in the DB cache.

[View source]
def max_entries : Int32 #

Maximum number of entries kept in the memory LRU.

[View source]
def misses : Int64 #

Number of cache misses since creation.

[View source]
def put(query : String, embedding : Array(Float64), token_count : Int32) #

Store an embedding and its token count for a query string.


[View source]
def size : Int32 #

Number of entries in the memory cache.


[View source]