Andrej Baranovskij Blog
Blog about Oracle, Full Stack, Machine Learning and Cloud
Tuesday, September 16, 2025
Ollama vs MLX Inference Speed on Mac Mini M4 Pro 64GB
MLX runs faster on the first inference, but Ollama wins on the second and subsequent runs, likely thanks to model caching or other warm-start optimizations on the Ollama side.
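To see the cold-start vs. warm-run difference yourself, a minimal timing harness like the one below can be pointed at either backend. The `fake_infer` function is a hypothetical stand-in: in practice you would replace it with a call to your local Ollama server (e.g. its REST API on `localhost:11434`) or to `mlx_lm`'s `generate`, and compare the first latency against the rest.

```python
import time

def time_runs(infer, prompt, runs=3):
    """Time repeated calls to an inference function.

    Returns a list of per-call latencies in seconds; the first entry
    is the cold run, the rest are warm runs.
    """
    latencies = []
    for _ in range(runs):
        start = time.perf_counter()
        infer(prompt)
        latencies.append(time.perf_counter() - start)
    return latencies

# Hypothetical stand-in for a real backend call (an HTTP request to a
# local Ollama server, or an mlx_lm generate call) -- replace with yours.
def fake_infer(prompt):
    time.sleep(0.01)
    return "ok"

if __name__ == "__main__":
    cold, *warm = time_runs(fake_infer, "Why is the sky blue?")
    print(f"cold: {cold:.3f}s, warm avg: {sum(warm) / len(warm):.3f}s")
```

Running the same prompt several times per backend (rather than once) is what surfaces the caching effect described above.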