Case Study

Speech to Text & Call Summarization using NLP

Overview

The client is a leading US-based teledentistry company that works closely with dental health experts to help consumers through its products.

The Challenge

  • The client’s agents spent 5-7 minutes per call on after-call work, manually summarizing the call details. This led to low agent productivity and made the process prone to human error.
  • The challenge was to automate call classification and summarization: converting the inbound audio calls received by agents to text (speech to text, STT), performing speaker diarization, categorizing each call, and generating a summary.
  • Identifying, benchmarking, selecting, and continuously evaluating the best-performing deep learning models was core to solving this problem.
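
To make the diarization step concrete, here is a minimal sketch of how word-level STT output with speaker labels could be grouped into speaker turns before categorization and summarization. The `Word` record, its field names, and the `spk_0`/`spk_1` labels are assumptions for illustration; they are not the client's schema or the exact response format of any particular STT service.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Word:
    """One recognized word with a speaker label (hypothetical schema)."""
    speaker: str   # e.g. "spk_0" for the agent, "spk_1" for the caller
    start: float   # start time in seconds
    text: str


def to_turns(words: List[Word]) -> List[Tuple[str, str]]:
    """Merge consecutive words from the same speaker into one turn."""
    turns: List[Tuple[str, str]] = []
    # Sort by start time so out-of-order items do not split a turn.
    for word in sorted(words, key=lambda w: w.start):
        if turns and turns[-1][0] == word.speaker:
            speaker, text = turns[-1]
            turns[-1] = (speaker, text + " " + word.text)
        else:
            turns.append((word.speaker, word.text))
    return turns


if __name__ == "__main__":
    sample = [
        Word("spk_0", 0.0, "Hello"),
        Word("spk_1", 1.2, "Hi"),
        Word("spk_0", 0.5, "there"),
    ]
    for speaker, text in to_turns(sample):
        print(f"{speaker}: {text}")
```

Turn-level text of this kind is a convenient input for both the call categorizer and the summarizer, since it preserves who said what.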

Our Solution

  1. Mindtree benchmarked cloud API services and state-of-the-art (SOTA) models for speech to text (STT) on accuracy, pricing, and response time. An AWS API was selected to convert streaming audio files into transcripts with speaker diarization.
  2. Summaries were generated using SOTA models such as Pegasus and BART variants, and ROUGE scores were calculated to validate their accuracy.
  3. The implementation included building training and inference pipelines for data preprocessing and for custom model training and evaluation, then exposing these models as APIs for consumption using AWS SageMaker.
  4. Tech stack used: AWS SageMaker, Python, Jupyter Notebooks
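
The ROUGE validation in step 2 can be illustrated with a simplified ROUGE-1 calculation. This is a sketch under stated assumptions, not the evaluation code used in the project: it tokenizes on whitespace, lowercases, and applies no stemming, so its numbers will differ somewhat from standard ROUGE packages. The function name `rouge1` and the sample sentences are made up for illustration.

```python
from collections import Counter
from typing import Dict


def rouge1(reference: str, candidate: str) -> Dict[str, float]:
    """Simplified ROUGE-1: unigram overlap between a reference summary
    and a generated summary (whitespace tokens, lowercase, no stemming)."""
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())

    # Clipped overlap: each unigram counts at most as often as it
    # appears in both texts.
    overlap = sum((ref_counts & cand_counts).values())

    precision = overlap / sum(cand_counts.values()) if cand_counts else 0.0
    recall = overlap / sum(ref_counts.values()) if ref_counts else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    human_summary = "patient asked about aligner shipping delay"
    model_summary = "patient asked about shipping"
    print(rouge1(human_summary, model_summary))
```

Comparing agent-written summaries against model output in this way gives a repeatable score for ranking candidate summarization models, although ROUGE measures word overlap only and is usually paired with human review.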
