improving recognition

  • Idea
  • Updated 1 month ago
I assume that you are using the IBM Watson Speech to Text service. If so, feeding text together with the audio to the recognition system should improve recognition, as shown where the text is added as keywords to spot. If you have such an interface and the user has PowerPoint or other slides, they could extract the text and pass it together with the audio to get better recognition.
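
To make the suggestion concrete, here is a minimal sketch of turning slide text into a keyword list. It assumes the recognizer accepts a comma-separated `keywords` value and a `keywords_threshold` value on its recognize request, as I understand the Watson Speech to Text HTTP interface to do; please check this against the current documentation. The helper names `extract_keywords` and `build_recognize_params`, the stopword list, and the default limits are my own illustration and not part of any library.

```python
import re
from collections import Counter

# Small illustrative stopword list (an assumption, not a standard list).
STOPWORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "on",
             "for", "is", "are", "with", "this", "that"}


def extract_keywords(slide_text, max_keywords=50, min_length=4):
    """Return the most frequent non-stopword terms found in the slide text."""
    words = re.findall(r"[A-Za-z][A-Za-z'-]*", slide_text.lower())
    counts = Counter(
        w for w in words if len(w) >= min_length and w not in STOPWORDS
    )
    return [word for word, _ in counts.most_common(max_keywords)]


def build_recognize_params(keywords, threshold=0.5):
    """Build request parameters for keyword spotting (hypothetical helper).

    Assumes the service takes a comma-separated keyword string plus a
    confidence threshold between 0 and 1.
    """
    return {
        "keywords": ",".join(keywords),
        "keywords_threshold": threshold,
    }


slide_text = "Packet switching and circuit switching in the network. Packet loss."
params = build_recognize_params(extract_keywords(slide_text, max_keywords=3))
```

The resulting `params` would be sent along with the audio in the recognition request. Frequency is only a rough proxy for importance; slide titles or domain-specific terms may deserve more weight.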

Since it also seems that you are not preserving any per-speaker information, it is not clear how quickly recognition will improve, even if what you say is statistically true (given the aggregate volume of recognition requests that the IBM recognizer receives). What it lacks is feedback, so it would be good if you could also provide it with the corrected text (the recognized text after human editing).
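
As a rough illustration of what such feedback could contain, the sketch below compares the recognized text with the human-edited text and lists the spans the editor had to change. It uses only Python's standard `difflib` module. The function name `corrected_words` is hypothetical, and how the recognizer would actually consume these corrections is left open, since I do not know what your interface allows.

```python
import difflib


def corrected_words(recognized, edited):
    """Return (recognized_span, edited_span) pairs where the editor changed the text."""
    rec = recognized.lower().split()
    edt = edited.lower().split()
    matcher = difflib.SequenceMatcher(a=rec, b=edt, autojunk=False)
    corrections = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        # "replace" means the editor substituted words; "insert" means the
        # recognizer dropped words entirely (the recognized span is empty).
        if tag in ("replace", "insert"):
            corrections.append((" ".join(rec[i1:i2]), " ".join(edt[j1:j2])))
    return corrections


pairs = corrected_words(
    "the packet which in network",
    "the packet switching network",
)
```

The edited spans are candidates for keywords or custom vocabulary on later recordings by the same speaker.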
chip.maguire



Posted 1 month ago
