Who is Apple's MLX for?
A few days ago Apple quietly announced two Python-based open source machine learning projects: "MLX: An array framework for Apple silicon", which is a lot like PyTorch, and "MLX-Data: Efficient framework-agnostic data loading", which helps you load and manage data when training a model.
People are proclaiming that Apple is getting into the Gen AI race. The packages look great, and I'm not trying to disparage their creators or dissuade anyone from using them, but I don't fully believe that claim yet. I'm curious who these packages are for and why they were created and released.
Let me explain. The M-series machines have a unified memory architecture: both the CPU and GPU have direct access to the same memory. This lets you 1) have much more GPU-accessible memory than you normally would and 2) save the time spent moving data back and forth to and from the GPU. This is different from machines with a separate GPU card, and MLX may lead to new approaches and algorithms that take advantage of it. Which is a good thing.
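To see what that looks like in practice, here is a minimal sketch following the pattern in the MLX documentation's unified-memory example: the same arrays can be handed to an op on either device, with no explicit copies or `.to(device)` calls.

```python
import mlx.core as mx

# Arrays live in unified memory; there is no device-transfer step.
a = mx.random.normal((4096, 4096))
b = mx.random.normal((4096, 4096))

# The same buffers can be used by an op on either device,
# by choosing which stream the op runs on.
c_gpu = mx.add(a, b, stream=mx.gpu)
c_cpu = mx.add(a, b, stream=mx.cpu)

# MLX is lazy, so force evaluation to actually run the computation.
mx.eval(c_gpu, c_cpu)
```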
Having more memory is great. It means you can use all of the 128 gigs available on an M3 MacBook Pro, or the 192 gigs available on a Mac Studio, to run LLMs instead of being limited to the 24 gigs on an NVIDIA 4090 or the 40 or 80 gigs on more powerful cards.
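To make that concrete, here is a rough back-of-envelope calculation (my own illustration, not from either announcement) of how much RAM a model's weights alone need at different precisions; it ignores activations, the KV cache, and framework overhead.

```python
# Rough back-of-envelope: bytes needed just to hold a model's weights.
def weight_gb(params_billions: float, bytes_per_param: float) -> float:
    return params_billions * 1e9 * bytes_per_param / 1e9

for bits in (16, 8, 4):
    print(f"70B model at {bits}-bit: ~{weight_gb(70, bits / 8):.0f} GB")
# ~140 GB at 16-bit: too big for a 24 GB 4090, fits in a 192 GB Mac Studio.
# ~35 GB at 4-bit: fits comfortably in a 128 GB MacBook Pro.
```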
And as I mentioned, not having to move data back and forth can save a lot of time, and MLX is designed to take advantage of that. However, a 4090 or an A100, not to mention an H100, all have many more GPU cores and are faster than current Macs for ML work. Also, most LLM training is done on large clusters of big NVIDIA cards. A single Mac does not stack up in most cases.
So my conundrum is this:

- If you are a researcher, you probably want or need to use a cluster.
- If you are an industry macOS or iOS programmer, you are probably using a pretrained model and not doing much training on your Mac.
- If you just want to run some models locally, Ollama lets you do that with no work on your part.
- PyTorch already has an "mps" (Metal Performance Shaders) backend, so how much performance are you really gaining? (See the sketch after this list.)
- Most models are written in PyTorch, so do you really want to convert or rewrite them? There is already a huge ecosystem around PyTorch.
- The MLX API differs slightly in places from NumPy and PyTorch, so it is something else to learn.
- MLX does not address the pain of converting models to Core ML.
- MLX uses the CPU and GPU but not the Apple Neural Engine.
- MLX addresses training but not in-app inference, since it is Python based rather than Swift based.
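For reference, the "mps" backend point is this: Apple-silicon GPUs have been usable from PyTorch for a while now. A minimal sketch (nothing MLX-specific here):

```python
import torch

# PyTorch's Metal Performance Shaders backend targets the Mac GPU directly.
device = torch.device("mps" if torch.backends.mps.is_available() else "cpu")

x = torch.randn(4096, 4096, device=device)
y = x @ x          # matrix multiply runs on the GPU via Metal
print(y.device)    # mps:0 on an Apple-silicon Mac
```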
Which leads me to think that currently MLX is great for Apple developers who want to get into ML (though they could have been using PyTorch and related libraries for a while now), and maybe for actual Apple engineers, who have a strong incentive to maximize performance on M-series hardware with a library they control. But in either case you're limiting the advantage you could get from an industry that works in PyTorch and targets other AI processors (GPU cards from various vendors, TPUs, etc.).
It's hard to tell right now whether this is a side project or the beginning of Apple taking ML more seriously. Evidence for the latter would be announcements about growing the ml-explore team, more products that simplify bringing models into Core ML, Swift integration for training, and, the killer, on-device in-app training/fine-tuning from Swift.
That last one would let you add personalized AI/ML to your apps, and would make all of this make more sense. I really hope it comes to be.