In an earlier blog post, I suggested that Machine Learning might be an interesting alternative to rigorous facilitation for the discovery of business rules. Today I’d like to talk about how to identify domains where Machine Learning is worth considering.
Broadly speaking, there are two large classes of Machine Learning algorithms.
In supervised Machine Learning, we have a collection of “training data” that we can feed to an algorithm to create a model for the data domain. In those scenarios, we know both the input data and the expected result. Examples include approval decisions, where someone fills out an application for something and the application is either approved or rejected. Enterprise environments with workflow systems may already hold a tremendous amount of data in this form.
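To make the approval example concrete, here is a minimal sketch of supervised learning using a nearest-neighbor vote. The applications, their two features (income and existing debt), and the `predict` function are all invented for illustration; a real workflow system would have many more fields and would likely use an established library rather than hand-rolled code.

```python
from collections import Counter
from math import dist

# Hypothetical historical applications:
# (income in $1000s, existing debt in $1000s) -> recorded decision.
# The numbers are invented for illustration only.
training_data = [
    ((85, 10), "approved"),
    ((92, 5), "approved"),
    ((60, 8), "approved"),
    ((30, 25), "rejected"),
    ((42, 40), "rejected"),
    ((25, 12), "rejected"),
]


def predict(application, examples, k=3):
    """Label a new application by majority vote of its k nearest past examples."""
    nearest = sorted(examples, key=lambda ex: dist(ex[0], application))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


print(predict((78, 12), training_data))  # approved
print(predict((28, 30), training_data))  # rejected
```

The point is that nobody wrote down an approval rule. The decision for a new application comes entirely from past inputs paired with their known outcomes.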
In unsupervised Machine Learning, we’re trying to solve a slightly different problem. In this case, we may not know the outcomes that we are looking for, but we expect the data to group together in some way based on the inputs. An example might be an algorithm that looks at a lot of documents and tries to group them by content. We don’t know in advance how we want them classified. We use the algorithm to help us identify the classifications.
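The document-grouping idea can be sketched with k-means clustering over simple word counts. This is one possible technique, not the only one. The four documents are invented, and seeding the clusters with the first `k` documents is a simplification chosen so the run is repeatable; real implementations usually pick starting points randomly or with a smarter scheme.

```python
from collections import Counter

# Hypothetical documents with no labels attached.
documents = [
    "invoice payment due invoice",
    "vacation request leave approval",
    "payment overdue invoice reminder",
    "leave balance vacation days",
]


def vectorize(texts):
    """Turn each text into a list of word counts over a shared vocabulary."""
    vocabulary = sorted({word for text in texts for word in text.split()})
    vectors = []
    for text in texts:
        counts = Counter(text.split())
        vectors.append([counts[word] for word in vocabulary])
    return vectors


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(vectors, k, max_iterations=10):
    """Group vectors into k clusters; returns one cluster index per vector."""
    # Simplification: seed the centroids with the first k vectors.
    centroids = [list(v) for v in vectors[:k]]
    labels = None
    for _ in range(max_iterations):
        # Assign every vector to its closest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(v, centroids[c]))
            for v in vectors
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, label in zip(vectors, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


labels = k_means(vectorize(documents), k=2)
for label, text in zip(labels, documents):
    print(label, text)
```

On this toy data the billing documents end up in one group and the leave documents in the other, even though the code was never told that those two categories exist.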
When we build software to solve business problems, there are some questions we can ask ourselves about the problems that we’re trying to solve. These questions might help us to identify whether a problem is a good candidate for Machine Learning.
- Do we have enough data that a machine can actually learn from?
Some business processes may be hard to model because they are built around infrequent events. For example, using Machine Learning to automate tasks relating to annual planning or annual financial events may be challenging, not only because the tasks are infrequent, but also because the rules that determine correct input and output may change over time.
- Do we have a lot of examples of both input and desired output?
The cost of storing data has gone down dramatically in the last several decades, leading many companies to archive an enormous amount of data. When this data contains both an input and a desired result, we may have an opportunity to apply supervised learning techniques. An example could be how companies manage Search functionality. Could we do a better job of displaying relevant results if we have a large volume of data that shows what people are actually clicking on when they perform searches? Google built its initial lead in search with PageRank, an algorithm that ranks pages based on links from other “reputable” pages. However, that ranking is now only one of many factors used in Machine Learning algorithms that predict what people are actually searching for.
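As a rough illustration of why click data counts as “input plus desired output,” the sketch below turns a search log into click-through rates and orders results by them. The log entries, file names, and function names are hypothetical. Note that this is plain counting rather than a learned model; in practice, these click labels would more likely serve as training data for a ranking model.

```python
from collections import defaultdict

# Hypothetical search log entries: (query, result shown, whether it was clicked).
search_log = [
    ("expense policy", "travel-policy.pdf", False),
    ("expense policy", "expense-policy.pdf", True),
    ("expense policy", "expense-policy.pdf", True),
    ("expense policy", "travel-policy.pdf", True),
    ("expense policy", "holiday-calendar.pdf", False),
    ("expense policy", "expense-policy.pdf", False),
]


def click_through_rates(log):
    """Return {(query, result): clicks / impressions} for every pair in the log."""
    shown = defaultdict(int)
    clicked = defaultdict(int)
    for query, result, was_clicked in log:
        shown[(query, result)] += 1
        clicked[(query, result)] += int(was_clicked)
    return {key: clicked[key] / shown[key] for key in shown}


def rank_results(query, log):
    """Order the results seen for a query by click-through rate, highest first."""
    rates = click_through_rates(log)
    candidates = [(result, rate) for (q, result), rate in rates.items() if q == query]
    candidates.sort(key=lambda item: item[1], reverse=True)
    return [result for result, _ in candidates]


print(rank_results("expense policy", search_log))
# ['expense-policy.pdf', 'travel-policy.pdf', 'holiday-calendar.pdf']
```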
- Are we relying on humans to make repetitive decisions?
Many companies that specialize in business process outsourcing use teams of competitively priced workers to do mundane tasks in finance, HR, and operations. When people can collectively perform a series of tasks well and at a low cost, we frequently see scenarios where the entire task may be amenable to Machine Learning based automation.
Being able to identify domains where Machine Learning might play a helpful role enables analysts to focus on building the right software solutions. Rather than blindly documenting business rules and processes, we should step back and ask whether an algorithm could do this better than we can do it ourselves.