Using Machine Learning To Reduce User Intervention
In recent years there’s been a lot of focus on leveraging machine learning to automate repetitive user functions. The challenge comes from multiple factors. How do we identify repetitive actions performed by our users? How do we categorize those actions and formulate them into machine-understandable rules? How do we create our own rules to take the corresponding actions?
These are challenges we’ve been trying to solve in Liquidnet’s middle office platform. We recently deployed a new AI-driven product called “Clippy” (Client Preference Python) that evaluates past user actions and makes recommendations to automate certain repetitive tasks via a customizable rules engine.
Data Gathering – Identifying Repetitive Actions
The middle office system is written on a web stack, so we can easily record all POST requests made to our webserver from our users. These POST requests serve as an audit stream for all the actions we need to categorize.
We can then record a few key features needed for our machine learning element:
· Type of database record our users are modifying
· Features of the record (IDs, trade identifiers, client information, etc.)
· Action taken by our users (cancels, corrections, etc.)
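To make the three features above concrete, here is a minimal sketch of turning one POST request into an audit record. The URL layout (/record_type/action), the form field names, and the audit_record helper are all assumptions made for illustration; the actual request format is not described here.

```python
from urllib.parse import parse_qs, urlsplit


def audit_record(path: str, body: str, user: str) -> dict:
    """Flatten one POST request into a feature dict for the audit stream."""
    # Hypothetical URL layout: /<record_type>/<action>, e.g. /allocation/cancel
    parts = [p for p in urlsplit(path).path.split("/") if p]
    record_type, action = parts[0], parts[1]
    # parse_qs returns a list per field; keep the first value of each
    features = {key: values[0] for key, values in parse_qs(body).items()}
    return {"user": user, "record_type": record_type, "action": action, **features}


record = audit_record(
    "/allocation/cancel",
    "trade_id=T123&client=ACME&quantity=500",
    "jdoe",
)
```

Each record then carries the record type, the record's features, and the action taken, which is the shape a classifier needs later on.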
Using Python and SQLite, we built a full-blown rules engine that can alter data on the fly. Our Python rules engine is able to monitor events happening in our system. Each event gets converted to an in-memory SQLite record where we can use SQL-like syntax to create WHERE clauses and determine if the event needs to be processed by our rules engine. Using SQLite gives us a robust language that is already well known, without creating a new DSL.
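The matching step can be sketched with Python's built-in sqlite3 module. This is only an illustration under stated assumptions: the table name event, the column names, and the matches helper are hypothetical, and the rule text is treated as trusted configuration rather than user input.

```python
import sqlite3


def matches(event: dict, where_clause: str) -> bool:
    """Load one event into an in-memory table and test a rule's WHERE clause."""
    conn = sqlite3.connect(":memory:")
    try:
        columns = ", ".join(f'"{name}"' for name in event)
        placeholders = ", ".join("?" for _ in event)
        conn.execute(f"CREATE TABLE event ({columns})")
        conn.execute(f"INSERT INTO event VALUES ({placeholders})", list(event.values()))
        # The WHERE clause comes from the rule definition (trusted config).
        row = conn.execute(f"SELECT COUNT(*) FROM event WHERE {where_clause}").fetchone()
        return row[0] > 0
    finally:
        conn.close()


event = {"record_type": "allocation", "client": "ACME", "quantity": 500}
rule = "record_type = 'allocation' AND client = 'ACME' AND quantity >= 100"
is_match = matches(event, rule)
```

Because the rule is plain SQL, anyone who can write a WHERE clause can write a rule, which is the benefit of not inventing a new DSL.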
With the event now in our in-memory SQLite database, we can also use UPDATE statements to alter the features, which are now columns, directly. Once the data is altered and committed, we can then translate it back to a user action, which can take the form of a POST request, a middleware message, a direct database modification, or another event in our system.
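A rough sketch of that alter-then-translate step follows. The column names, the next_action column, and the outgoing payload shape are assumptions chosen for the example, not the real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # lets us read columns by name
conn.execute("CREATE TABLE event (trade_id, client, status, next_action)")
conn.execute(
    "INSERT INTO event VALUES (?, ?, ?, ?)",
    ("T123", "ACME", "unmatched", None),
)

# Hypothetical rule: WHERE selects the event, SET rewrites its columns.
conn.execute(
    "UPDATE event SET status = 'cancelled', next_action = 'cancel' "
    "WHERE client = 'ACME' AND status = 'unmatched'"
)
conn.commit()

row = conn.execute("SELECT * FROM event").fetchone()

# Translate the altered row back into an outgoing action, here a POST payload.
outgoing = None
if row["next_action"] is not None:
    outgoing = {
        "url": f"/trade/{row['next_action']}",
        "data": {"trade_id": row["trade_id"], "status": row["status"]},
    }
conn.close()
```

The same altered row could just as well be serialized into a middleware message or a new event; only the final translation step changes.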
Now that we have a way to identify users’ repetitive actions and express them as actions in our system, we use Python’s pandas and scikit-learn libraries to create a decision tree classifier. We correlate the features from our audit stream with the actions taken by our users, which lets us identify, with a high degree of accuracy, the records we’d expect them to act on. Scikit-learn helps us build a model, which we convert back into our SQLite rules engine: the nodes in the decision tree translate to rules in our SQLite database.
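One way to picture the tree-to-rules conversion is to walk a fitted scikit-learn tree and emit one WHERE clause per leaf. This is a minimal sketch on made-up data: the column names, the toy audit rows, and the tree_to_rules helper are assumptions, and a production conversion may look quite different.

```python
import pandas as pd
from sklearn.tree import DecisionTreeClassifier

# Toy audit stream: numeric features plus the action the user took.
audit = pd.DataFrame({
    "quantity": [100, 200, 5000, 7000, 150, 6000],
    "is_acme": [0, 0, 1, 1, 1, 0],
    "action": ["none", "none", "cancel", "cancel", "none", "none"],
})
features = ["quantity", "is_acme"]
clf = DecisionTreeClassifier(random_state=0).fit(audit[features], audit["action"])


def tree_to_rules(clf, feature_names):
    """Return one (where_clause, predicted_label) pair per leaf of the tree."""
    tree = clf.tree_
    rules = []

    def walk(node, conditions):
        if tree.children_left[node] == -1:  # -1 marks a leaf node
            label = clf.classes_[tree.value[node][0].argmax()]
            rules.append((" AND ".join(conditions) or "1 = 1", str(label)))
            return
        name = feature_names[tree.feature[node]]
        threshold = float(tree.threshold[node])
        # scikit-learn sends samples with feature <= threshold to the left child
        walk(tree.children_left[node], conditions + [f"{name} <= {threshold}"])
        walk(tree.children_right[node], conditions + [f"{name} > {threshold}"])

    walk(0, [])
    return rules


rules = tree_to_rules(clf, features)
cancel_rules = [where for where, label in rules if label == "cancel"]
```

Each entry in cancel_rules is a plain SQL condition, so it can be dropped straight into a rule's WHERE clause and reviewed by a person before it is switched on.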
Since most of our workflows are automated and don’t require user intervention, we end up with a lot of noise in our model that comes from “no action” events. To eliminate that noise we feed our raw data through a sparse matrix from the SciPy library, a structure designed for data that is mostly empty. This lets us work with millions of rows and events in our system and build a model where high-probability events are automated and no longer require user intervention.
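How the sparse matrix is built is not spelled out above, so the following is only one plausible reading: categorical features are one-hot encoded, and because each event sets only a few of the many possible columns, the result is stored in scipy.sparse's CSR format. The feature strings and variable names are invented for the example.

```python
import numpy as np
from scipy.sparse import csr_matrix

# Each event sets only a handful of the possible categorical features.
events = [
    {"client=ACME", "type=allocation"},
    {"client=ZED", "type=trade"},
    {"client=ACME", "type=trade"},
]

# Assign one column index per distinct feature value.
vocab = {name: i for i, name in enumerate(sorted(set().union(*events)))}

rows, cols = [], []
for row_index, event in enumerate(events):
    for name in event:
        rows.append(row_index)
        cols.append(vocab[name])

# Only the non-zero cells are stored; every other cell is an implicit zero.
X = csr_matrix(
    (np.ones(len(rows), dtype=np.int8), (rows, cols)),
    shape=(len(events), len(vocab)),
)
density = X.nnz / (X.shape[0] * X.shape[1])
```

scikit-learn's DecisionTreeClassifier accepts sparse input, so a matrix like X can be passed to fit without first expanding it into a dense array.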
By Yevgeny Abov, Technology