Machine Learning on Mobile
There is a lot of hype around Artificial Intelligence (AI) and Machine Learning (ML). AI has been called 'the new electricity', and many believe it will change our lives as fundamentally as the internet and the Industrial Revolution did.
We know that every major company is investing heavily in AI/ML, but how does that help us, as app/product developers and owners, unless we can figure out how to make it work for us and our customers? It is critical that we evaluate our apps and businesses and ask how we can use AI and ML to:
- fine-tune app mechanics
- improve user productivity and performance
- improve user outcomes
- increase user satisfaction
- and increase engagement
We already use it every day, as consumers, in the form of spam detection, language translation, search, image understanding and recommendation engines. Bringing these same technologies into our apps means a better, more customized user experience and higher user engagement and satisfaction. Let's step back and think of some examples:
- Image understanding can be used for handwriting recognition, object recognition, face recognition, emotion recognition, identifying plants and animals, estimating crop yields, estimating crop types and mixtures
- Natural Language Understanding can be used for text prediction, text summarization, automatic highlight generation, sentiment analysis, custom domain/app specific search, improving customer service, custom automated responses
- Audio Processing can be used for speech recognition, voice first interfaces, bird song recognition, machine operation analysis, room activity analysis
- User Behavior Analysis can be used to guide the user to accomplish the tasks they want to do, generate recommendations, surface features and important information
... and much, much more. Not to mention all the health-related apps that can take advantage of the myriad sensors on devices.
What Is Machine Learning?
Many similar terms are used interchangeably in different contexts, which can be confusing. A good starting point is:
Artificial Intelligence (AI) - the study of "intelligent agents". Reasoning, knowledge representation, planning, robotics, etc.
Artificial General Intelligence (AGI) - the sci-fi future where robots are generally as smart as we are.
Artificial Super Intelligence (ASI) - the dystopian sci-fi future where robots are smarter than we are.
Artificial Narrow Intelligence (ANI) - the present and near future where machines are as good as or better than us at very narrow, specific tasks (playing Go, playing Poker, driving vehicles, detecting anomalies).
Machine Learning (ML) - a subset of AI focusing on programs that learn from data and make predictions. Compare this with rule engines, for example, where the programmer and a subject matter expert encode explicit bits of information in the program.
Deep Learning (DL) - a subset of ML/AI using artificial neural networks (ANNs), in contrast to traditional statistical forms of ML such as linear regression, support vector machines (SVMs), decision trees and forests.
Regression - the prediction of continuous values such as the expected price of a house, time of user engagement, or total purchase size.
Classification - the prediction of discrete values/labels such as fraud or not fraud, positive or negative sentiment or emotion, likely to convert or not.
Training Phase - typically we create models by training (offline in a batch process) on data we have collected. This can also include validation on data that has been held out and not used during training in order to evaluate training progress.
Inference or Prediction Phase - when we're done training we use the model to make predictions on new data. This can be on test data to ensure it is performing as expected or with real / live data in production.
Online Learning - algorithms that can learn or refine what they know as you use them rather than in a separate training phase. These are not yet universally common or well understood and are currently heavily researched.
Supervised Learning - using labeled data during training. That is, we know the values we are trying to predict (such as cats vs dogs or sale price of a house) in the training data and we want the model to make similar predictions on new data.
Unsupervised Learning - using unlabeled data to group and better understand it. For example, we might generate user personas based on what we know about our users rather than personas based on who we think might be using our app.
Reinforcement Learning - the algorithm learns by interacting with its environment and getting rewards and punishments based on its actions. Imagine a robot/agent exploring a space and learning how to interact with its environment.
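To make a few of these terms concrete, here is a minimal sketch of supervised regression with a training phase, a held-out validation step and an inference step. The house sizes and prices are made-up numbers for illustration, and the model is a plain least-squares line rather than anything you would ship.

```python
def fit_line(points):
    """Training phase: fit y = slope * x + intercept by ordinary least squares."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in points)
    variance = sum((x - mean_x) ** 2 for x, _ in points)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Inference phase: apply the fitted line to a new input."""
    slope, intercept = model
    return slope * x + intercept


def mean_absolute_error(model, points):
    """Average absolute gap between predictions and known labels."""
    return sum(abs(predict(model, x) - y) for x, y in points) / len(points)


# Hypothetical labeled data: (size in square metres, sale price in thousands).
labeled = [(50, 152), (60, 179), (70, 212), (80, 238),
           (90, 271), (100, 301), (110, 329), (120, 362)]

# Hold out the last two examples so they are never seen during training.
training, validation = labeled[:6], labeled[6:]

model = fit_line(training)
validation_error = mean_absolute_error(model, validation)
predicted_price = predict(model, 95)  # a new, unlabeled house

print(f"validation error: {validation_error:.2f}")
print(f"predicted price for 95 square metres: {predicted_price:.1f}")
```

Swapping the continuous price for a label such as "fraud" or "not fraud" would turn the same workflow into a classification problem.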
Show Me The Data
The key to using AI/ML is having enough good quality data.
It doesn't have to be big data but more is always better.
Luckily, as software developers, we don't have to go out in the field to catch and measure specimens. We can instrument our software to process and save meaningful events.
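As a rough sketch of what that instrumentation could look like, the snippet below collects named events with a timestamp and a few properties and serializes them as JSON Lines. The `EventLog` class, the event names and the property names are all hypothetical; a real app would also batch, persist and upload these events.

```python
import json
import time


class EventLog:
    """Collects app events in memory and serializes them as JSON Lines."""

    def __init__(self, clock=time.time):
        self._clock = clock  # injectable so tests can supply a fixed time
        self._events = []

    def record(self, name, **properties):
        """Store one event with the current timestamp and its properties."""
        self._events.append({
            "name": name,
            "timestamp": self._clock(),
            "properties": properties,
        })

    def to_json_lines(self):
        """Return one JSON document per line, ready to write or upload."""
        return "\n".join(json.dumps(event, sort_keys=True) for event in self._events)


log = EventLog()
log.record("search_submitted", query_length=12, results_shown=8)
log.record("item_purchased", price=4.99, seconds_since_open=73)
exported = log.to_json_lines()
print(exported)
```

Recording properties such as query length rather than the raw query text is one way to keep the data useful while collecting less personal information.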
Some important questions to ask yourself:
- What business questions do I want to answer?
- What data do I have or can get that can help me answer those questions?
- Can a human answer those questions with the data available?
- What is my performance metric? What level of performance does my application require before it is useful?
With this information you'll be able to design a data collection and processing framework that can make AI/ML model selection and training possible.
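The performance metric question deserves a quick illustration, because the obvious metric is not always the useful one. The sketch below compares accuracy with precision and recall on a made-up fraud example; the labels and predictions are invented for illustration only.

```python
def precision_recall(predicted, actual, positive="fraud"):
    """Precision: how many flagged items were right. Recall: how many real cases were caught."""
    true_positives = sum(
        1 for p, a in zip(predicted, actual) if p == positive and a == positive
    )
    predicted_positives = sum(1 for p in predicted if p == positive)
    actual_positives = sum(1 for a in actual if a == positive)
    precision = true_positives / predicted_positives if predicted_positives else 0.0
    recall = true_positives / actual_positives if actual_positives else 0.0
    return precision, recall


# Hypothetical ground truth and model output for eight transactions.
actual = ["ok", "ok", "fraud", "ok", "fraud", "ok", "ok", "fraud"]
predicted = ["ok", "fraud", "fraud", "ok", "ok", "ok", "ok", "fraud"]

accuracy = sum(p == a for p, a in zip(predicted, actual)) / len(actual)
precision, recall = precision_recall(predicted, actual)

print(f"accuracy={accuracy:.2f} precision={precision:.2f} recall={recall:.2f}")
```

Here the model is right on most transactions overall yet still misses one of the three fraud cases, which may or may not be acceptable depending on what your application requires.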
As I mentioned, more data is always better, so collecting as much high-quality data as possible should be a priority. Fortunately, there is a lot you can do with pre-trained networks, weakly labeled data and synthetic/augmented data.
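To give a feel for augmented data, here is a minimal sketch that turns one labeled grayscale image into four training examples by flipping it and shifting its brightness. The tiny image, the label and the helper names are assumptions for illustration, and whether a transformation preserves the label depends on the task (a mirrored photo of a cat is still a cat, but mirrored handwriting is not the same character).

```python
def flip_horizontal(image):
    """Mirror each row of a grayscale image stored as a list of rows."""
    return [list(reversed(row)) for row in image]


def adjust_brightness(image, delta):
    """Shift every pixel by delta, clamped to the valid 0..255 range."""
    return [[min(255, max(0, pixel + delta)) for pixel in row] for row in image]


def augment(image, label):
    """Return the original example plus variants that keep the same label."""
    variants = [
        image,
        flip_horizontal(image),
        adjust_brightness(image, 30),
        adjust_brightness(image, -30),
    ]
    return [(variant, label) for variant in variants]


# A hypothetical 2x3 grayscale image labeled "cat".
tiny_image = [[0, 50, 100], [150, 200, 250]]
augmented = augment(tiny_image, "cat")

for variant, label in augmented:
    print(label, variant)
```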
Baking It In To Mobile
When it comes to training and inference, inference may require significant memory and computation power, but training is the real challenge, especially on mobile devices.
To reduce latency and improve privacy, you'd like to do as much as possible on the device. Though modern phones are powerful handheld computers, training AI/ML models is still too demanding of memory, processing power and battery life to be practical for most applications.
There is a lot of work on improving this so that we can push as much computation as possible out to the edge, but for now this leaves us with a few options:
- Use simpler models - Traditional ML models (SVMs, decision trees, small ANNs) require much less processing power than modern deep ANNs and may be accurate enough for many situations.
- Use a web/network API - Either your own or a third party's, such as those from Microsoft, IBM, Amazon, Google and many others.
Web APIs are easy for you to implement, and third-party ones don't require maintenance on your part and are usually built to scale to any number of requests. The downsides, however, are the increased latency of an off-device request, the need for a network connection and possibly the privacy implications of moving data off the device.
- Train on a server, do inference on device - In this approach you leverage powerful servers to train a model and then load the model onto the device for inference.
Models can be updated and downloaded as needed. However, deep learning models can be pretty large, 500MB or more, so people are working on ways to make them smaller without impacting performance too much.
Plus you still have the privacy concerns of getting the data off the device to be trained on the servers. Though perhaps that can be addressed somewhat by anonymizing the data or updating the model on device where possible.
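The "train on a server, do inference on device" option can be sketched in a few lines. The example below assumes a tiny logistic regression model whose weights were already trained elsewhere (the numbers are invented), packs the weights into 8-bit integers to shrink the download, and then restores them and runs a prediction as a device would. Linear 8-bit quantization is just one common way of making models smaller, used here for illustration; all function names are hypothetical.

```python
import json
import math


def quantize(weights):
    """Map float weights to integers in 0..255 plus the values needed to undo it."""
    low, high = min(weights), max(weights)
    scale = (high - low) / 255 or 1.0  # fall back to 1.0 if all weights are equal
    values = [round((w - low) / scale) for w in weights]
    return {"low": low, "scale": scale, "values": values}


def dequantize(packed):
    """Recover approximate float weights from the packed form."""
    return [packed["low"] + v * packed["scale"] for v in packed["values"]]


# "Server" side: pretend these came out of training (made-up numbers).
trained_weights = [0.8, -1.2, 0.05, 2.4]
trained_bias = -0.3
payload = json.dumps({"weights": quantize(trained_weights), "bias": trained_bias})


# "Device" side: download the payload, unpack it and run inference locally.
def load_model(serialized):
    model = json.loads(serialized)
    return dequantize(model["weights"]), model["bias"]


def predict_probability(weights, bias, features):
    """Logistic regression: weighted sum of features squashed into 0..1."""
    score = sum(w * x for w, x in zip(weights, features)) + bias
    return 1 / (1 + math.exp(-score))


weights, bias = load_model(payload)
probability = predict_probability(weights, bias, [1.0, 0.5, 3.0, 0.2])
print(f"probability: {probability:.3f}")
```

The restored weights differ slightly from the trained ones, which is the trade-off mentioned above: a smaller model in exchange for a small loss of precision.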
Software Tools
Yet another challenge for machine learning on mobile is that the languages typically used for AI/ML, such as Python, R, and Matlab, are not particularly mobile friendly.
There are libraries for traditional models in every language, but it's a bit different when it comes to neural nets. Many of the Python packages simply wrap an underlying C/C++ library for training, and you can directly access that library for inference on a device. There is a lot of work in this area and the landscape changes frequently, so keep an eye on Caffe2Go, TensorFlow, and MXNet.
On iOS, Basic Neural Network Subroutines (BNNS, part of the Accelerate framework) and Metal Performance Shaders can be used for inference on the device, but they are not a complete solution, and integrating with other libraries (importing models trained elsewhere) is tricky.
I would not be surprised if Apple introduced a Swift-based DL library so that you could train on your Mac/server and do inference on an iOS device.
Conclusions
There are many great opportunities to improve your apps with AI/ML. You can start out with simpler models such as linear regression, gradient boosting machines (GBM) and random forests, and then move to deep learning if your needs warrant it. This is a fast-moving field. Keep an eye out for new developments in software and hardware from Apple and Google, on both the server and device sides, to speed things up.
How are you planning on using AI in your app?
How many ways can you think of to improve your apps with these technologies?
Resources
Interested in learning more? These are some good places to start.
- Andrew Ng's Coursera Course
- Neural Networks and Deep Learning online book
- Curated list of machine learning resources for iOS developers
- Learn TensorFlow and deep learning, without a Ph.D.
Take care.