class Memo::QueryCache

Overview

LRU query embedding cache with optional DB persistence.

Avoids repeated API calls for the same query text by caching the embedding vector. The in-memory LRU is checked first (instant), then the DB (fast), then the API (slow).

The memory cache is bounded by max_entries (LRU eviction); the DB cache is bounded by max_db_entries (oldest entries evicted on prune).
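
A usage sketch of that flow; the embed_via_api helper is hypothetical and stands in for whatever slow embedding API call the caller makes on a miss:

```crystal
cache = Memo::QueryCache.new(max_entries: 5_000)

query = "how do I reset my password?"
if cached = cache.get(query)                    # memory LRU first, then DB
  embedding, token_count = cached
else
  embedding, token_count = embed_via_api(query) # hypothetical: slow API call
  cache.put(query, embedding, token_count)      # cache for subsequent lookups
end
```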

Defined in:

memo/query_cache.cr

Constructors

Instance Method Summary

Constructor Detail

def self.new(max_entries : Int32 = 10000, max_db_entries : Int32 = 100000, db : DB::Database | Nil = nil, service_id : Int64 = 0) #
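
A hedged construction example with DB persistence; the sqlite3 driver, connection string, and service_id value are assumptions (any DB::Database should work, and db defaults to nil for memory-only caching):

```crystal
require "db"
require "sqlite3" # assumption: any DB::Database-backed driver should work

db = DB.open("sqlite3://./memo.db")
cache = Memo::QueryCache.new(
  max_entries: 10_000,     # memory LRU bound
  max_db_entries: 100_000, # DB bound; oldest entries pruned past this
  db: db,                  # omit (nil) for memory-only caching
  service_id: 1_i64        # assumed to scope entries per embedding service
)
```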

[View source]

Instance Method Detail

def clear #

Clear all cached entries (memory and DB).


[View source]
def get(query : String) : Tuple(Array(Float64), Int32) | Nil #

Look up a cached embedding for a query string. Returns {embedding, token_count} or nil on miss.
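
For example, destructuring the returned tuple (assuming a cache instance is in scope):

```crystal
if result = cache.get("what is an LRU cache?")
  embedding, token_count = result
  puts "hit: #{embedding.size}-dim vector, #{token_count} tokens"
else
  puts "miss: fetch the embedding from the API, then call put"
end
```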


[View source]
def hit_rate : Float64 #

Cache hit rate as a percentage (presumably hits / (hits + misses) × 100, inferred from the counters below).
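
For example (assuming a cache instance that has served some lookups):

```crystal
puts "hit rate: #{cache.hit_rate.round(1)}%" # e.g. 87.5 after 7 hits, 1 miss
```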


[View source]
def hits : Int64 #

Number of cache hits (memory or DB) since creation.

[View source]
def max_db_entries : Int32 #

Maximum number of entries kept in the DB cache.

[View source]
def max_entries : Int32 #

Maximum number of entries kept in the memory LRU.

[View source]
def misses : Int64 #

Number of cache misses since creation.

[View source]
def put(query : String, embedding : Array(Float64), token_count : Int32) #

Store an embedding and its token count for a query string.


[View source]
def size : Int32 #

Number of entries in the memory cache.


[View source]