Salesforce Launches the Open-source Merlion Project for Time-series Machine Learning Analysis

Gabriel Jones
3 min read · Jun 10, 2022


In late 2020, Salesforce’s application performance management team faced a challenge: it had to improve its anomaly detection algorithms. The team monitors data quality across all of Salesforce’s data centers, which produce large volumes of real-time metrics. One such metric is CPU usage, which is recorded as time-series data.

“When you can detect anomalies you’re getting from Salesforce data centers, you can identify the potential events happening within Salesforce,” says Aadyot Bhatnagar, Senior Research Engineer at Salesforce. “Thus, you can address and resolve anomalies faster while shortening your customers’ downtime. This is one of the reasons why Salesforce is interested in time-series.”

Bhatnagar and his team have spent the last two years developing Merlion, an open-source machine learning (ML) library for time-series analysis. Initially designed to address the anomaly detection challenge, Merlion has since grown into an end-to-end Python library for a wide range of time-series tasks, including forecasting.
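To make the anomaly detection problem concrete, here is a minimal sketch of flagging a spike in a CPU usage series with a trailing-window z-score. This is a generic textbook technique, not Merlion's algorithm or API; the function name, window size, threshold, and sample readings are all assumptions chosen for illustration.

```python
from statistics import mean, stdev


def rolling_zscore_anomalies(values, window=5, threshold=3.0):
    """Return indices whose value deviates from the mean of the preceding
    `window` points by more than `threshold` standard deviations.

    Illustrative only: a hypothetical helper, not part of Merlion.
    """
    anomalies = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # Flat history: treat any change at all as a deviation.
            if values[i] != mu:
                anomalies.append(i)
            continue
        if abs(values[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies


# Made-up CPU usage percentages sampled at regular intervals.
cpu_usage = [41.0, 42.5, 40.8, 43.1, 41.9, 42.2, 97.5, 42.0, 41.4, 42.8]
anomaly_indices = rolling_zscore_anomalies(cpu_usage)
print(anomaly_indices)
```

With these sample readings, only the reading of 97.5 stands out from its preceding window. A production detector would also need to handle seasonality, trends, and noisy metrics, which is the kind of work a dedicated library takes on.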

How Does the Open-source Merlion Work?

The Merlion project began as a global collaboration between Salesforce research teams in Singapore and Palo Alto. It is named after the Merlion, Singapore’s official mascot, a mythical creature that is half lion and half fish. Like its namesake, the project does more than one thing.

Beyond its ML models, Merlion supports data loading, data processing, a diverse set of models unified under a common API, and training of those models for specific tasks. The library also provides the steps needed to produce the desired outputs and a robust framework for evaluating model performance.
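The idea of diverse models behind a common API, paired with an evaluation framework, can be sketched in a few lines of plain Python. The class and function names below are hypothetical and are not Merlion's actual interface; the two toy models and the sample series are assumptions made for illustration.

```python
from abc import ABC, abstractmethod
from statistics import mean


class Forecaster(ABC):
    """Hypothetical common interface: every model trains and forecasts the same way."""

    @abstractmethod
    def train(self, series):
        ...

    @abstractmethod
    def forecast(self, steps):
        ...


class MeanForecaster(Forecaster):
    """Predicts the average of the training data for every future step."""

    def train(self, series):
        self.level = mean(series)

    def forecast(self, steps):
        return [self.level] * steps


class LastValueForecaster(Forecaster):
    """Predicts the last observed value for every future step."""

    def train(self, series):
        self.level = series[-1]

    def forecast(self, steps):
        return [self.level] * steps


def mean_absolute_error(actual, predicted):
    return mean(abs(a - p) for a, p in zip(actual, predicted))


def evaluate(model, series, horizon):
    """Hold out the last `horizon` points, train on the rest, and score the forecast."""
    train, test = series[:-horizon], series[-horizon:]
    model.train(train)
    return mean_absolute_error(test, model.forecast(horizon))


series = [10.0, 12.0, 11.0, 13.0, 12.0, 14.0, 13.0, 15.0]
scores = {
    type(model).__name__: evaluate(model, series, horizon=2)
    for model in (MeanForecaster(), LastValueForecaster())
}
print(scores)
```

Because both models expose the same `train` and `forecast` methods, the `evaluate` function does not need to know which model it is scoring. That is the practical benefit of a shared interface: swapping or comparing models requires no changes to the surrounding pipeline.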

Once the project was under way, Bhatnagar and his team found that many of Salesforce’s internal tasks also needed time-series ML capabilities. Merlion’s time-series forecasting is one such capability, and it applies to a variety of tasks.

Consider an IT operations service that makes heavy use of computational resources such as CPU and memory. Time-series forecasting can predict how that resource usage will fluctuate, helping Salesforce professionals plan capacity better.
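As a rough illustration of forecasting for capacity planning, the sketch below fits a straight-line trend to daily peak CPU readings by ordinary least squares and extrapolates it a few days ahead. It is a deliberately simple stand-in, not one of Merlion's models; the readings and the 63% capacity limit are made-up assumptions.

```python
def fit_linear_trend(values):
    """Ordinary least squares fit of value = intercept + slope * t for t = 0..n-1."""
    n = len(values)
    t_mean = (n - 1) / 2
    v_mean = sum(values) / n
    covariance = sum((t - t_mean) * (v - v_mean) for t, v in enumerate(values))
    variance = sum((t - t_mean) ** 2 for t in range(n))
    slope = covariance / variance
    intercept = v_mean - slope * t_mean
    return intercept, slope


def forecast_trend(values, steps):
    """Extrapolate the fitted line for the next `steps` time points."""
    intercept, slope = fit_linear_trend(values)
    n = len(values)
    return [intercept + slope * (n + k) for k in range(steps)]


# Hypothetical daily peak CPU usage (percent) for days 0 through 5.
daily_peak_cpu = [50.0, 52.0, 54.0, 56.0, 58.0, 60.0]
predicted = forecast_trend(daily_peak_cpu, steps=3)

# Hypothetical capacity limit: list the future days expected to exceed it.
capacity_limit = 63.0
days_over_limit = [
    day
    for day, value in enumerate(predicted, start=len(daily_peak_cpu))
    if value > capacity_limit
]
print(predicted, days_over_limit)
```

Real resource metrics rarely follow a clean line, so a practical forecaster would also account for seasonality and uncertainty. The planning logic stays the same, though: compare the forecast against a limit and act before the limit is reached.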

Open-Source Merlion: From Ideation to Use

Having the idea for an ML library is one thing; having the supporting technology to bring that library into a working production environment is another. Bhatnagar noted that most ML libraries are difficult to integrate with a production environment.

Some challenges one may encounter include:

· Access to the essential computational resources

· Data retrieval in the required format and manner

· The ability to return data to the appropriate locations

To overcome such challenges, the Merlion project includes default options that give users a good starting point. The project as a whole aims to simplify workflows and automate operations wherever possible.

Toward a New Open-Source Model for Time-Series Machine Learning Analysis

Before Merlion, the Facebook-led Prophet project was one of the best-known open-source projects for time-series forecasting. Bhatnagar stated that Prophet offers only a subset of Merlion’s features, which span pre-processing, modeling, post-processing, and evaluation, and as a result does not meet Salesforce’s requirements.

This led Salesforce to create its own project to address these time-series analysis needs. The company then open-sourced the result, so Merlion can be used both within Salesforce and by any other organization that wants to apply ML to time-series analysis.

“Perceiving that there’s no standard solution meeting all the requirements of those who want to access and use time-series machine learning analysis from a single platform, we deduced that the Merlion project will not only be useful for Salesforce internally but for others who’ve encountered time-series problems as well,” added Bhatnagar.


Written by Gabriel Jones

Hi, I am Gabriel, Salesforce architect at Solunus; I have helped several firms use cutting-edge products to achieve their business goals.
