As a beginner in machine learning (ML), you’re typically confronted by several questions: Should I really invest the effort to learn about this technology? If so, how do I get started? Are there any pitfalls? Will this knowledge increase my employability? This article will aim to answer some of these questions.
For starters, you should definitely learn about machine learning. It is the future: intelligent machines will make predictions about your world, surface valuable insights, automate some tasks (e.g., driving a car) and assist you in making better decisions. Even as a software professional, you need to enhance your skill set, because ML will increasingly be used to build expert systems, reducing the need for traditional software development.
Getting started may seem daunting. Should you enroll in a course? Not necessarily. Most of the courses are designed to make you a data scientist, and that may not be what you want for your career. You may prefer instead to use ML as a casual practitioner, not an expert.
The way forward is to read up on how ML can help with your objectives, collect a limited set of training data (the variables you think are important) to train a model and sign on to an ML platform that walks you through the process of building predictive models. Some prefer that the ML platform be open source, but that comes at a cost: assumed expertise in programming, in languages like Python, which may deter you from doing ML.
When researching an ML platform, opt for a user-friendly one with easy-to-follow self-help videos and generous support to help you overcome any issues. When you have your first ML model, check its predictions against what you actually observe, knowing full well that ML is an iterative refinement process.
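To make that check concrete, here is a minimal sketch in Python of comparing a model’s predictions with what you actually observed. The labels and the two helper functions are hypothetical, made up for illustration; a real platform would normally report this kind of score for you.

```python
def accuracy(predicted, observed):
    """Fraction of predictions that match what was actually observed."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not predicted:
        raise ValueError("need at least one prediction to score")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(predicted)


def mismatches(predicted, observed):
    """Positions where the model disagreed with reality, for later review."""
    return [i for i, (p, o) in enumerate(zip(predicted, observed)) if p != o]


# Hypothetical data: what the model suggested vs. what actually happened.
predicted = ["approve", "reject", "approve", "approve", "reject"]
observed = ["approve", "reject", "reject", "approve", "reject"]

print(accuracy(predicted, observed))    # share of correct predictions
print(mismatches(predicted, observed))  # cases to study before the next round
```

The mismatched cases are the useful part: they tell you which variables or examples to add before you retrain, which is what makes the process iterative.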
The guidelines above seem logical, but let’s illustrate them with specific scenarios that you can relate to.
Say you are the subject matter expert for an operations process at an enterprise. The process accesses data from a variety of sources, including the cloud, and decides how to execute. It is not easy to write rules that cover the various possibilities, which include many exceptions. Investigating each request is exhausting. You have often wondered whether you could build a system to assist you: one that suggests how to process a request and gives the reason for it. You have heard about machine learning and how it may help build such a system, alleviating your workload. But you don’t know how to get started, in particular how to put together the training data needed to train a model (for supervised learning).
Here’s another potential dilemma. Let’s say you are an economics major who has collected tons of data for your research project. You would like insights into what the data reveals about the outcome you are analyzing. What should you do? Study the correlations between the variables? But that can be misleading. You have heard about ML and how it could help, but you don’t know how to get started or how to learn enough ML to use it for your project.
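One way correlation can mislead, sketched in Python with made-up numbers: a variable can completely determine the outcome and still show a correlation of zero, because the usual (Pearson) correlation only measures straight-line relationships. The data and the `pearson` helper below are illustrative assumptions, not part of any particular platform.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient, computed from its definition."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical data: the outcome is exactly the square of the input,
# so the input fully determines the outcome.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]

r = pearson(xs, ys)
print(r)  # no linear relationship is detected despite the exact dependence
```

A model that can capture curved relationships would pick up a pattern like this one, which is part of the appeal of ML over a table of correlations.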
I believe these challenges have been holding back ML from widespread adoption. So what should you look for in an ML platform to help you overcome these hurdles?
Let’s start with the training data issue, a considerable challenge. Many training data providers have sprung up (e.g., Lionbridge, Appen). They provide you with crowdsourced training data, making every effort to sanitize it and free it of biases. However, it could well be that you can’t divulge your process details because they contain strictly confidential information, so you can’t benefit from these services. Or you may be exploring ML without an approved budget to pay for these services.
When researching an ML platform, look for one that has “connectors” to the most common sources of data, including your systems and the cloud, to automatically extract the information elements you are interested in. These connectors should be able to grab information elements automatically and assemble them in a file in the appropriate order, so that the platform can train statistical models from the training data in your file and you can make predictions. The connectors should be sophisticated enough to pull in data periodically, once daily or at some other interval, and accumulate enough over days and weeks to give you the training data you need to get started with ML.
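To give a feel for what such a connector does behind the scenes, here is a rough Python sketch. Everything in it is hypothetical: the column names, the two `fetch_…` functions standing in for real data sources, and the in-memory file. It only illustrates the idea of pulling records from several sources and appending them to one training file in a fixed column order.

```python
import csv
import io

# Hypothetical column order; the outcome you want to predict goes last.
COLUMNS = ["request_id", "amount", "region", "decision"]


def fetch_from_internal_system():
    """Stand-in for a connector to an in-house system (made-up record)."""
    return [{"request_id": 1, "amount": 250.0, "region": "EU", "decision": "approve"}]


def fetch_from_cloud():
    """Stand-in for a connector to a cloud source (made-up record)."""
    return [{"region": "US", "request_id": 2, "decision": "reject", "amount": 9000.0}]


def append_pull(out, sources, write_header):
    """Run one periodic pull and append every fetched record to the file."""
    writer = csv.DictWriter(out, fieldnames=COLUMNS)
    if write_header:
        writer.writeheader()
    rows = 0
    for fetch in sources:
        for record in fetch():
            writer.writerow(record)  # fields are written in COLUMNS order
            rows += 1
    return rows


training_file = io.StringIO()  # in-memory stand-in for a file on disk
sources = [fetch_from_internal_system, fetch_from_cloud]
append_pull(training_file, sources, write_header=True)   # e.g., day one
append_pull(training_file, sources, write_header=False)  # e.g., day two

lines = training_file.getvalue().splitlines()
print(lines[0])        # header row
print(len(lines) - 1)  # number of training rows collected so far
```

Note that the two sources return their fields in different orders, yet every row lands in the file in the same column order. In practice a scheduler would trigger each pull; here the two calls simply stand in for two days of collection.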
So how do you use this data to understand ML and build models? Look for an ML platform that automatically builds models from training data. It should also explain the various steps involved so that you can understand and appreciate how the platform is processing your data, which statistical techniques it is applying and why, and how it is selecting the best model to learn and predict for your use case. Without these answers, you are left in an uncomfortable position where you can’t discuss your work confidently with your peers.
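If you are curious what “selecting the best model” can look like, the Python sketch below shows the basic idea under heavy simplification. The two candidate models, the data and all function names are assumptions chosen for illustration; a real platform would typically try many more techniques and use more careful validation. Each candidate is trained on one part of the data, scored on a held-back part, and the best scorer wins.

```python
from collections import Counter


def fit_majority(train):
    """Baseline: always predict the most common label seen in training."""
    label = Counter(y for _, y in train).most_common(1)[0][0]
    return lambda x: label


def fit_threshold(train):
    """Learn one cutoff on x: predict 'reject' above it, 'approve' otherwise."""
    def errors(cut):
        return sum(("reject" if x > cut else "approve") != y for x, y in train)

    best_cut = min(sorted(x for x, _ in train), key=errors)
    return lambda x: "reject" if x > best_cut else "approve"


def accuracy(model, rows):
    """Share of rows where the model's prediction matches the known label."""
    return sum(model(x) == y for x, y in rows) / len(rows)


def select_best(candidates, train, validation):
    """Train each candidate, score it on held-back data, return the winner."""
    scores = {name: accuracy(fit(train), validation) for name, fit in candidates.items()}
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical (amount, decision) pairs.
train = [(100, "approve"), (200, "approve"), (300, "approve"),
         (5000, "reject"), (7000, "reject")]
validation = [(150, "approve"), (6000, "reject"), (250, "approve"), (8000, "reject")]

candidates = {"majority": fit_majority, "threshold": fit_threshold}
best, scores = select_best(candidates, train, validation)
print(best, scores)
```

Scoring on data the models did not see during training is the key design choice: it rewards the model that generalizes, not the one that merely memorizes. A platform that shows you these scores side by side is also showing you why it picked the model it did.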
The bottom line: Learn ML by doing ML. Machine learning has evolved to the point where you should no longer be afraid to get started.