Andrej Baranovskij Blog
Blog about Oracle, Full Stack, Machine Learning and Cloud
Tuesday, September 16, 2025
Ollama vs MLX Inference Speed on Mac Mini M4 Pro 64GB
MLX runs faster on the first inference, but Ollama wins on the second and subsequent runs, likely thanks to model caching or other warm-start optimizations on the Ollama side.
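To see the cold-start vs. warm-run difference yourself, a minimal timing harness like the one below can be pointed at either backend. The `fake_infer` function is a hypothetical stand-in: in practice you would replace it with a call to your local Ollama server (e.g. its REST API on `localhost:11434`) or to `mlx_lm`'s `generate`, and compare the first latency against the rest.

```python
import time

def time_runs(infer, prompt, runs=3):
    """Time repeated calls to an inference function.

    Returns a list of per-call latencies in seconds; the first entry
    is the cold run, the rest are warm runs.
    """
    latencies = []
    for _ in range(runs):
        start = time.perf_counter()
        infer(prompt)
        latencies.append(time.perf_counter() - start)
    return latencies

# Hypothetical stand-in for a real backend call (an HTTP request to a
# local Ollama server, or an mlx_lm generate call) -- replace with yours.
def fake_infer(prompt):
    time.sleep(0.01)
    return "ok"

if __name__ == "__main__":
    cold, *warm = time_runs(fake_infer, "Why is the sky blue?")
    print(f"cold: {cold:.3f}s, warm avg: {sum(warm) / len(warm):.3f}s")
```

Running the same prompt several times per backend (rather than once) is what surfaces the caching effect described above.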