AI and ML Terms in Plain English

Artificial Intelligence (AI) and Machine Learning (ML) are making their way into our lives more and more. In this article I'll define some of the terms that I get asked about most frequently.

Information Retrieval for LLMs

Vector databases have been getting a lot of attention, but what are they, and are they strictly necessary?

Prompt Engineering and Fine Tuning

Many people assume that they have to train a Large Language Model (LLM) on their data to get better results. Though that would likely improve responses, I suggest that prompt engineering may be sufficient and is faster, cheaper and easier to implement.

Intro to Generative AI

Recently I had the pleasure of speaking to local business and education leaders at the ChatGPT-AI Forum put on by Mountain Area Workforce Development Board of North Carolina. This is a summary of the speaker notes from that talk.

Analyzing Themes in Reviews with Natural Language Processing (NLP)

Recently I was working on analyzing some short texts and came up with an idea for extracting interesting themes from them. I thought the technique might be particularly useful for app/product reviews. Many businesses are interested in analyzing sentiment but this goes beyond that and tries to analyze recurring themes automatically.

Finding Similar Text with Machine Learning and Natural Language Processing

I've been working with a client on analyzing some text documents and wanted to share a bit of what has been working for us. I can't share the data or the exact project details, but it entails finding the documents in a large collection that are most similar to a specific query example. Imagine searching a database of company statements, product descriptions, articles, contracts, emails, support/trouble tickets, etc. not by keyword but by 'meaning' and 'similarity'.
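
To make the idea of ranking documents by similarity concrete, here is a minimal sketch. It is not the approach used in the client project: it swaps in plain word-count vectors and cosine similarity, which only capture word overlap, where a search by 'meaning' would use learned text embeddings. The function names `tokenize`, `cosine_similarity` and `rank_by_similarity` and the sample documents are assumptions made up for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(text_a, text_b):
    """Cosine similarity between the word-count vectors of two texts.

    Word counts are a stand-in (assumption) for the embedding vectors
    that a meaning-based search would use.
    """
    counts_a = Counter(tokenize(text_a))
    counts_b = Counter(tokenize(text_b))
    shared = counts_a.keys() & counts_b.keys()
    dot = sum(counts_a[t] * counts_b[t] for t in shared)
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text is similar to nothing
    return dot / (norm_a * norm_b)


def rank_by_similarity(query, documents):
    """Return the documents sorted from most to least similar to the query."""
    return sorted(
        documents,
        key=lambda doc: cosine_similarity(query, doc),
        reverse=True,
    )


# Hypothetical sample data, for illustration only.
documents = [
    "Invoice for annual software license renewal",
    "Support ticket: app crashes when uploading a photo",
    "Contract terms for the consulting engagement",
]
query = "The app crashes every time I upload a photo"
best_match = rank_by_similarity(query, documents)[0]
```

With embeddings in place of word counts, the ranking step stays the same; only the way each text is turned into a vector changes.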

Helping your Community with Data Science

I'm a data scientist (machine learning engineer) and would like to use my skills to help others in my community. I'd also really like to help others help others. So I'm starting a project to bring together volunteer data scientists, engineers and students with non-profits and organizations to find opportunities to do good in their own communities.

Analyzing Features Associated with Churn

Customer churn, the percentage of customers that stop using your product or service in a particular time period, can quickly become disastrous to your revenue. Acquiring new customers is more costly than keeping existing ones, and even a reasonable-sounding churn rate can result in a leaky bucket that is impossible to fill.
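
A short sketch of the leaky-bucket arithmetic may help here. It assumes a constant churn rate per period and no new customers, which is a simplification; the function names `churn_rate` and `retained_fraction` and the 5% figure are illustrative assumptions, not numbers from the article.

```python
def churn_rate(customers_at_start, customers_lost):
    """Fraction of customers lost during one period."""
    if customers_at_start <= 0:
        raise ValueError("customers_at_start must be positive")
    return customers_lost / customers_at_start


def retained_fraction(rate, periods):
    """Fraction of the original customers left after `periods`,
    assuming the same churn rate every period (a simplification)."""
    if not 0.0 <= rate <= 1.0:
        raise ValueError("rate must be between 0 and 1")
    return (1.0 - rate) ** periods


# Hypothetical example: losing 50 of 1,000 customers in a month is 5% churn.
monthly = churn_rate(1000, 50)
# Compounded over 12 months, only about 54% of the original customers remain.
after_one_year = retained_fraction(monthly, 12)
```

Under these assumptions, a 5% monthly churn rate means replacing nearly half of the customer base every year just to stay level.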

Measuring Image Similarity with Neural Nets

Automatically finding similar and duplicate images can be very useful as a quick way to show similar products or items from a collection of images. For example, I was shopping for a phone case and the online store had many many interesting designs but they were hard to navigate. Once I found a case that I _kind of_ liked I wanted to see other similar cases to find one that I _really_ liked. Unfortunately they only showed other popular cases that were not at all similar to the one I was considering.

An Introduction to Machine Learning Interpretability

Lately there has been a lot of interest in explainable AI/ML. Nobody wants to feel discriminated against by an algorithm and when we don't like its prediction or decision we want to know why it made that decision. Plus there is an added sense of security when we feel we understand (or could understand) how something works.

Tuning a Text Classification Algorithm

In previous articles we worked through basic approaches for text classification by presenting a simplified version of a problem posed by a client and examining the performance of several algorithms. In this article we improve (slightly) the performance of one of the algorithms using grid search and random search for hyperparameter optimization.

TensorFlow for JavaScript for ClojureScript

Among the many announcements at the TensorFlow Dev Summit was the announcement of 'TensorFlow for JavaScript' and I of course wanted to play with that ... from ClojureScript. These are the steps I took to get a simple polynomial regression example working in cljs. I created a re-frame template app but that's not important. I just needed a place to keep the code and I liked having a button to press to fire the function.

Text Classification with Scikit-Learn

In this article we talk about using the next simplest approach, which is TF-IDF with basic classifiers from Scikit-Learn (sklearn). We show that with minimal processing and no parameter tuning at all we get impressive accuracy.

Text Classification with IBM Watson

Recently I had a request from a client for help classifying short pieces of text. The exact nature of the text is confidential but the passages were similar to paragraphs from reviews and comments users write about products and services. We wanted a quick and easy way to establish a baseline we could use to compare various approaches. This would help us decide, based on performance and expected necessary investment, if further efforts in research, development and operations were necessary.

AI and ML Unconference

We're hosting an unconference March 24, 2018 in Portland, Oregon focused on finding ways to use AI to improve the lives of everyone in the Community.

About the ACA Bot

The ACA Bot is a very early version of a chatbot that tries to answer some basic questions related to the Affordable Care Act.

Machine Learning on Mobile

There is a lot of hype around Artificial Intelligence (AI) and Machine Learning (ML). It's been called *'the new electricity'* and many believe it will fundamentally change our lives as much as the internet and the industrial revolution.

How to Run a Successful Mastermind Group

Working on a business or new project can be lonely. Your family and friends support your efforts, but they don’t really understand the details of what you are trying to do or what exactly you are going through. Projects are difficult and the long hours can make you feel isolated, frustrated or overwhelmed - but don’t let that kill your morale or progress.

Random Forests At Scale

These are slides and code from a workshop in Portland on using random forests at scale with Python, Apache Spark and H2O.