Andrej Baranovskij Blog
Blog about Oracle, Full Stack, Machine Learning and Cloud
Tuesday, September 16, 2025
Ollama vs MLX Inference Speed on Mac Mini M4 Pro 64GB
MLX is faster on the first inference run, but on the second and subsequent runs Ollama comes out ahead, likely thanks to model caching or other optimizations in Ollama.
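To illustrate why keeping a model resident changes repeat-run timings, here is a minimal, hypothetical Python sketch: the first call pays a simulated model-load cost, while later calls are served from an in-process cache. The `MODEL_LOAD_SECONDS` delay and the `lru_cache`-based cache are stand-ins for the idea only, not Ollama's or MLX's actual internals.

```python
import time
from functools import lru_cache

MODEL_LOAD_SECONDS = 0.2  # stand-in for one-time weight loading / warm-up cost


@lru_cache(maxsize=None)
def load_model(name: str) -> str:
    # Simulate loading model weights into memory; cached after the first call.
    time.sleep(MODEL_LOAD_SECONDS)
    return f"model:{name}"


def run_inference(name: str) -> float:
    # Time a single "inference": model load (cached after first run) + a
    # trivial stand-in for token generation.
    start = time.perf_counter()
    model = load_model(name)
    _ = f"{model} -> output"
    return time.perf_counter() - start


cold = run_inference("llama3")  # pays the load cost
warm = run_inference("llama3")  # served from the warm cache
print(f"cold: {cold:.3f}s, warm: {warm:.3f}s")
```

The cold run takes at least the load cost, while the warm run is near-instant, which is the same shape of result the benchmark above shows for Ollama's second and subsequent inferences.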