NLP – Buy, or Build?

In recent years, deep learning architectures and algorithms have made impressive advances in the field of natural language processing (NLP). As firms explore how best to apply this technology, they are asking — should we buy, or build?

In this article, we look at the options available to firms that want to deploy NLP and discuss the main factors that, in our experience, drive successful outcomes for clients.

The global NLP market is expected to be worth USD 34.80 billion by 2025, with a CAGR of 21.5% during 2020–2025.

This explosive growth is fueled by firms that want to drive a step-change in their analytics and automation capabilities by extracting insights from the large volumes of unstructured data that flow through their enterprise systems — email, chat, incident management, CRM, etc.

But how should a firm approach NLP? Should they try to build something themselves, or just buy a pre-built solution?

An in-house NLP solution can be delivered with open-source software such as spaCy, NLTK, or Hugging Face Transformers. Alternatively, it can be powered by a cloud cognitive API such as Azure Cognitive Services or Google’s Cloud Natural Language.

Both of these approaches fit into the “build” category because the whole end-to-end data pipeline has to be built — real-time connectors, content extraction, data cleansing, labeling, model training, model testing, model versioning and deployment, monitoring, and integration with downstream applications.

Any production solution must also support enterprise requirements such as security, access control, auditing, data lineage, availability, and recoverability.

In our experience, the key to evaluating the options is to understand what it means to develop a machine learning model and to deploy and operate that model in a production environment.

What does it mean to build a model?

Developing a machine learning model with sufficient accuracy to be useful is a lot harder than most people think. Accuracy is really important — false positives create unnecessary work and false negatives create risk.

As a minimum, developing a model involves the following tasks:

  1. Choosing an algorithm — firms must select a learning algorithm to suit their data and use case(s). We see many firms experiment with simple statistical approaches (e.g. TF-IDF features fed to a linear classifier) only to find the accuracy is poor. Enterprise data tends to be highly variable, so firms generally need some kind of neural network to achieve good accuracy. Most of the recent advances in NLP were only made possible by deep neural networks, and to support them your solution will almost certainly need to run on GPU machines.

  2. Preparing training data — you need to use your own enterprise data for training, even if you are using a cloud API. It can take a lot of time and effort to source the raw data, clean it, and transform it into a form suitable for training.

  3. Training the model — we have seen poorly designed solutions (from blue-chip companies) involving manual labeling of data on spreadsheets and long training cycles measured in weeks or months. Generally speaking, training automation and good user experience design are essential considerations if you want accuracy.

  4. Testing the model — in our experience, a model needs to attain precision/recall thresholds of ~97% before it can even be considered for production use. Below this threshold, the number of false positives and false negatives will create a net increase in operational effort and risk, thereby eliminating the ROI of your machine learning initiatives.

  5. Iteration — tasks 1–4 require frequent and continuing iteration to refine accuracy. Even after a model is versioned and released to production, it is common to refine it further as new data is observed. In many cases, a solution may need to support an adaptive learning approach, whereby training occurs continuously against the real-time data in production.
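The testing gate described in step 4 can be sketched in a few lines. This is a minimal illustration, assuming a single binary label (1 = positive) and the ~97% threshold quoted above; the names `precision_recall` and `ready_for_production` are hypothetical and do not come from any particular product or library.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Metrics:
    precision: float
    recall: float


def precision_recall(y_true: Sequence[int], y_pred: Sequence[int]) -> Metrics:
    """Compute precision and recall for a binary label (1 = positive)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # correct positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # unnecessary work
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # missed items (risk)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return Metrics(precision, recall)


def ready_for_production(m: Metrics, threshold: float = 0.97) -> bool:
    """Assumed gate: both precision and recall must reach the threshold."""
    return m.precision >= threshold and m.recall >= threshold


# Held-out labels vs. model predictions (illustrative data only).
metrics = precision_recall([1, 1, 1, 0, 0], [1, 1, 0, 1, 0])
print(ready_for_production(metrics))  # False: precision and recall are both 2/3
```

In practice the check would run on a held-out test set for every label the model predicts, and it would be repeated on each iteration of tasks 1 to 4.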

In a build option, most of the above tasks remain largely manual, even when a cloud API is used:

  • Cloud APIs offer pre-trained models, but they are highly unlikely to work well on enterprise-specific data.
  • Cloud APIs do allow you to train your own model, but you still need to source the data, prepare it, test the model, iterate, and build an end-to-end data pipeline.

What does it mean to put a model into production?

In an enterprise setting, all the normal release management and quality controls apply to a machine learning model:

  • Models must be tested and signed off by business owners — that means they have to give deterministic results in any environment.
  • Models need to be versioned, with sufficient controls to release new versions without service interruption, and manage rollbacks to a previous version if necessary.
  • As with any automated process, a firm needs monitoring and control procedures in place to detect performance degradation and manage the impact on flow.
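To make the versioning and rollback requirement concrete, here is a minimal in-memory sketch. It assumes a model is simply a function from text to a label; the `ModelRegistry` class and its methods are hypothetical illustrations, not the API of any real model registry, and a production system would add persistence, sign-off records, and access control.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Assumption for this sketch: a model maps input text to a label.
Model = Callable[[str], str]


@dataclass
class ModelRegistry:
    versions: Dict[str, Model] = field(default_factory=dict)
    history: List[str] = field(default_factory=list)  # promoted versions, oldest first

    def register(self, version: str, model: Model) -> None:
        """Store an immutable, named model version."""
        if version in self.versions:
            raise ValueError(f"version {version!r} is already registered")
        self.versions[version] = model

    def promote(self, version: str) -> None:
        """Make a registered version live; earlier versions stay available."""
        if version not in self.versions:
            raise KeyError(version)
        self.history.append(version)

    def rollback(self) -> str:
        """Return to the previously promoted version."""
        if len(self.history) < 2:
            raise RuntimeError("no previous version to roll back to")
        self.history.pop()
        return self.history[-1]

    @property
    def live(self) -> str:
        if not self.history:
            raise RuntimeError("no version has been promoted")
        return self.history[-1]

    def predict(self, text: str) -> str:
        """Serve requests from whichever version is currently live."""
        return self.versions[self.live](text)


registry = ModelRegistry()
registry.register("v1", lambda text: "complaint" if "refund" in text else "other")
registry.register("v2", lambda text: "complaint" if "refund" in text or "broken" in text else "other")
registry.promote("v1")
registry.promote("v2")   # callers of predict() switch to v2 without redeploying
registry.rollback()      # v1 is live again
```

Because callers only ever use `predict`, switching the live version does not require them to change or restart, which is the property the release controls above are asking for.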

We are also seeing specific requirements emerge for machine learning solutions:

  • Auditability — to explain where training data came from and how decisions were reached.
  • Model bias detection — requires appropriate reporting metrics and compensating actions.
  • Adaptability — ability to detect unseen data and handle it gracefully.
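One simple way to approximate the adaptability requirement is to measure how much of an incoming message the model has never seen. The sketch below is only an illustration under stated assumptions: it uses a crude out-of-vocabulary ratio as the signal, the 0.5 cut-off is arbitrary, and all function names are hypothetical. Real systems often rely on model confidence or embedding distance instead.

```python
import re
from typing import Iterable, List, Set

TOKEN = re.compile(r"[a-z0-9']+")


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into simple word tokens."""
    return TOKEN.findall(text.lower())


def build_vocabulary(training_texts: Iterable[str]) -> Set[str]:
    """Collect every token observed in the training data."""
    vocabulary: Set[str] = set()
    for text in training_texts:
        vocabulary.update(tokenize(text))
    return vocabulary


def unseen_ratio(text: str, vocabulary: Set[str]) -> float:
    """Fraction of tokens in `text` that never appeared in training."""
    tokens = tokenize(text)
    if not tokens:
        return 1.0  # treat empty or non-text input as fully unseen
    unseen = sum(1 for token in tokens if token not in vocabulary)
    return unseen / len(tokens)


def needs_review(text: str, vocabulary: Set[str], max_unseen: float = 0.5) -> bool:
    """Route a message to a human when too much of it is unfamiliar."""
    return unseen_ratio(text, vocabulary) > max_unseen


vocab = build_vocabulary(["please reset my password", "refund my order"])
print(needs_review("please refund my order", vocab))                   # False
print(needs_review("quarterly derivatives settlement failed", vocab))  # True
```

Handling the message "gracefully" here means sending it to a manual queue rather than letting the model guess; flagged messages are also natural candidates for the next round of training data.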

Firms are also having to develop new governance procedures around machine learning solutions to answer questions like:

  • Who trains a model and decides when a model is ready for production?
  • Who monitors model performance in production?

In practice, this requires new roles and fine-grained user permissions to meet compliance requirements — for example, separation of duties between model management and operational roles.
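A separation-of-duties rule of this kind can be expressed as a small permission check. The sketch below is illustrative only: the role names, the `Permission` values, and the rule that the person who trained a model may not approve its release are assumptions made for the example, not a description of any specific compliance framework.

```python
from enum import Enum
from typing import Dict, FrozenSet


class Permission(Enum):
    TRAIN_MODEL = "train_model"
    APPROVE_RELEASE = "approve_release"
    MONITOR_PRODUCTION = "monitor_production"


# Hypothetical roles: each one holds a deliberately narrow set of permissions.
ROLES: Dict[str, FrozenSet[Permission]] = {
    "model_trainer": frozenset({Permission.TRAIN_MODEL}),
    "model_approver": frozenset({Permission.APPROVE_RELEASE}),
    "operations": frozenset({Permission.MONITOR_PRODUCTION}),
}


def can(role: str, permission: Permission) -> bool:
    """Return True if the role grants the permission; unknown roles grant nothing."""
    return permission in ROLES.get(role, frozenset())


def approve_release(trained_by: str, approver: str, approver_role: str) -> None:
    """Raise PermissionError unless the approval respects separation of duties."""
    if not can(approver_role, Permission.APPROVE_RELEASE):
        raise PermissionError(f"role {approver_role!r} cannot approve releases")
    if trained_by == approver:
        raise PermissionError("separation of duties: the trainer cannot approve their own model")


approve_release(trained_by="alice", approver="bob", approver_role="model_approver")  # allowed
```

Keeping the rule in one function makes it straightforward to audit who was allowed to release a model and why.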

In a build option, you need to integrate the solution with the firm’s release management, security, and control frameworks — as well as real-time data sources and downstream systems.

Key takeaways

  1. Businesses want to deliver NLP benefits quickly and sustainably: focus on reducing time-to-value while ensuring the solution can be scaled and maintained across the enterprise in order to deliver a step-change in analytics and automation capabilities.

  2. Building a production NLP solution is typically a multi-year effort: model development is hard and requires careful thought around automation, reporting, and user experience. In addition, the solution needs to meet enterprise requirements and be integrated with source systems and control frameworks.

  3. A bought solution reduces risk and guarantees fast time-to-value: NLP is non-core and non-differentiating; it just needs to work. A bought solution gives you the latest algorithms (constantly updated), short training cycles, automatic model validation, out-of-the-box controls, and pre-built integration with enterprise systems and security frameworks.
