Andrej Baranovskij Blog
Blog about Oracle, Full Stack, Machine Learning and Cloud
Monday, December 23, 2024
Stateless MLX Inference with FastAPI in Sparrow
I show how to run MLX inference in stateless mode, where the loaded model is released as soon as inference completes. This is useful when inference requests are infrequent, because it reclaims the resources MLX would otherwise keep reserved between requests.
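To make the stateless idea concrete, here is a minimal sketch of the pattern in plain Python. It is not Sparrow's actual code: `DummyModel`, `run_stateless_inference`, and the `loader` parameter are hypothetical names used for illustration, and the dummy model stands in for a real MLX model so the example runs without MLX installed. The point is only the lifecycle: load inside the request, run inference, then drop the reference and collect garbage before returning.

```python
import gc
from typing import Callable


class DummyModel:
    """Hypothetical stand-in for a loaded MLX model (not a real MLX class)."""

    def generate(self, prompt: str) -> str:
        # A real model would run inference here; this one just echoes.
        return f"echo: {prompt}"


def run_stateless_inference(
    prompt: str, loader: Callable[[], DummyModel] = DummyModel
) -> str:
    """Load a model, run one inference, and release the model afterwards."""
    # Load the model per request instead of keeping it in a global.
    model = loader()
    try:
        return model.generate(prompt)
    finally:
        # Drop the only reference and force a collection so the memory
        # can be reclaimed. With real MLX you would likely also clear
        # its buffer cache at this point (assumption, not shown here).
        del model
        gc.collect()


if __name__ == "__main__":
    print(run_stateless_inference("hello"))
```

In a FastAPI app, a function like this would be called from inside the endpoint handler, so nothing model-related outlives the request. The trade-off is that every request pays the model load time, which is why this fits infrequent requests better than a busy service.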