Automatically transcribing audio from a video or audio file

  • Question
  • Updated 3 months ago
Hi,

You can't do this in Camtasia, but does anyone have any suggestions for automatically transcribing audio from a video or audio file?
  • Have you used any services? 
  • What are they like?
  • How much do they cost?
The intention is to bring in the transcription as closed captions, but going through an hour-long video manually is going to be painful.

chris.swinney

  • 60 Posts
  • 11 Reply Likes

Posted 4 months ago


Bill Sistler

  • 1 Post
  • 0 Reply Likes
I use speech to text, a Windows program.

kdwalkerjr

  • 31 Posts
  • 18 Reply Likes
Trust me, I feel your pain. I do hour-long sales videos and transcribe every word. None of the subtitle services will give you great-looking subtitles. It’ll look like closed captioning. If that’s acceptable, the rates are very low.


If you’re doing it yourself, here are some tips.


Put a text block on the timeline. Stretch it to cover the entire timeline.


Set its placement, font, size, effects... and then zoom right in so you can see the waveform detail. You’ll often cut on the gaps you see.


Play the video and edit the text block. As sales material, the text in the video is very important, so I only do one line of text so that I can make the font size larger.


Cut the text block. Always reposition the playhead to just before your last edit and play to confirm you got it right.


After you do this a lot you develop a rhythm and it goes pretty well. I can get through 10 minutes of video in an hour.
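
The waveform gaps mentioned in the tips can also be located with a script, which may help decide where to cut before opening the editor. This is only a minimal sketch: it assumes the audio has already been decoded into a list of amplitudes normalized to -1.0..1.0, and the function name `find_gaps`, the threshold, and the minimum gap length are illustrative choices, not anything Camtasia provides.

```python
def find_gaps(samples, sample_rate, threshold=0.02, min_gap_seconds=0.3):
    """Return (start, end) times in seconds of quiet stretches.

    samples: amplitudes assumed to be normalized to -1.0..1.0.
    threshold and min_gap_seconds are illustrative defaults to tune by ear.
    """
    min_len = max(1, round(min_gap_seconds * sample_rate))
    gaps = []
    run_start = None  # index where the current quiet run began
    for i, value in enumerate(samples):
        if abs(value) < threshold:
            if run_start is None:
                run_start = i
        else:
            # A loud sample ends the quiet run; keep it if long enough.
            if run_start is not None and i - run_start >= min_len:
                gaps.append((run_start / sample_rate, i / sample_rate))
            run_start = None
    # Handle a quiet run that lasts until the end of the audio.
    if run_start is not None and len(samples) - run_start >= min_len:
        gaps.append((run_start / sample_rate, len(samples) / sample_rate))
    return gaps
```

Each returned pair is a candidate cut point; short pauses inside a sentence are filtered out by the minimum gap length.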

David Bookbinder

  • 120 Posts
  • 73 Reply Likes
You could try uploading the video to YouTube and letting its auto-captioning do its thing. I've found that basic correcting of the auto-captioning is not too time-consuming.

kayakman, Champion

  • 8041 Posts
  • 2779 Reply Likes
Try using Camtasia's built-in speech-to-text captioning feature.



kayakman, Champion

  • 8041 Posts
  • 2779 Reply Likes
How To Do Speech To Text Across Short Media Sections In CS9 2017-05-21
https://www.screencast.com/t/Udc5Xpk6PkD

Applies to Camtasia 2018 and 2019 as well.

David Mbugua

  • 20 Posts
  • 18 Reply Likes
There are many services that you can use to automatically transcribe and caption videos.

1. YouTube is free.
Uploading your video to YouTube will automatically generate subtitles that you can download and add to your videos in Camtasia.

NOTE: You'll need to run the subtitle via subtitle next for it to be accepted in Camtasia. I'm not sure why this happens. 

Here's how to download YouTube Subtitles (automatically generated and manually uploaded) using Google2SRT.

https://www.youtube.com/watch?v=pBq6AqhyHPU

2. Otter Voice Notes is an awesome AI-powered tool that has a paid and a free option (600 free minutes per month).

It can automatically transcribe and provide you with an SRT (paid). It's super fast. 

Here's how Otter works.

https://www.youtube.com/watch?v=SCaBnObVsqI&t=2s

3. Descript for Windows or Mac. 

Awesome application for podcasters, video editors, etc.
It's extremely good.

You can even add a ready transcript and have Descript automatically append timestamps for free.

Check it out.
https://www.youtube.com/watch?v=tZZcZQrpJjw

There are many others, including Temi, Trint, Sonix.ai, the Google Docs Voice Typing feature, etc.

All the best.

chris.swinney

  • 60 Posts
  • 11 Reply Likes
Nice, guys :) Thank you very much.

Most of the time (well, for my videos), I pre-write a script, then record the audio separately, which allows me to add captions sentence by sentence, and they are of course word for word.

This time a video was simply handed to me, and the video is all I have. I have extracted the audio and processed it, so using a service should be fine, or I can indeed upload to YouTube.

I prefer to work with audio and video separately as you get a lot more control, especially processing audio outside of Camtasia. 

I have tried the Windows speech to text, and Google notes, but replaying the audio into the mic is not very friendly. The Camtasia option is interesting, but I think I will check out some of the other services.

bill.raymond

  • 70 Posts
  • 37 Reply Likes
macOS can transcribe your voice while you talk, and it has speech services, but I do not think Camtasia supports any of that, which is really unfortunate. However, for about $1 you can use Amazon Web Services (AWS) and their Amazon Transcribe product. There are super simple instructions to set it up and use it. While this is technically a cloud service that would normally need a programmer, they have an easy way to use it without writing any code.

https://us-west-2.console.aws.amazon.com/transcribe/home?region=us-west-2#welcome

The basic steps are to set up an AWS bucket (think of it as a hard drive) and create a folder. You place the video to transcribe in that folder.

Then you point Amazon Transcribe to the video and it outputs the transcribed audio for you into the same folder. You might find yourself editing it a bit since it is not perfect.
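
For anyone who would rather script these steps than click through the console, they could look roughly like the sketch below. It assumes the boto3 package is installed, AWS credentials are configured, and the media file is already in the bucket. The helper names, the job name, and the idea of guessing the media format from the file extension are illustrative assumptions; as written, the transcript is placed in the same bucket under the job name.

```python
def build_transcription_request(job_name, bucket, key, language_code="en-US"):
    """Build the arguments for Amazon Transcribe's StartTranscriptionJob call."""
    # Assumption: the file extension matches a supported format such as mp4, mp3, or wav.
    media_format = key.rsplit(".", 1)[-1].lower()
    return {
        "TranscriptionJobName": job_name,
        "LanguageCode": language_code,
        "MediaFormat": media_format,
        "Media": {"MediaFileUri": f"s3://{bucket}/{key}"},
        "OutputBucketName": bucket,  # write the transcript JSON to the same bucket
    }


def start_job(request):
    """Submit the job. Needs boto3 and configured AWS credentials."""
    import boto3  # imported here so the request builder works without boto3

    client = boto3.client("transcribe")
    return client.start_transcription_job(**request)
```

A call such as `start_job(build_transcription_request("my-video-job", "my-bucket", "videos/talk.mp4"))` would submit the job; the bucket, key, and job name here are placeholders.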

Here is a useful website that will take your transcription text and convert it to a format you can import into Camtasia.

https://www.yash.info/aws-srt-creator.htm
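
If you would rather do that conversion locally, the idea can be sketched in a few lines of Python. This is a rough sketch, not how the site above works: it assumes the Amazon Transcribe JSON layout with word entries under `results.items`, and the function names and the eight-words-per-caption grouping are illustrative choices.

```python
def srt_time(seconds):
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    millis = round(float(seconds) * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def transcribe_json_to_srt(data, max_words=8):
    """Group word items into captions of at most max_words words each."""
    cues = []  # list of (start, end, text)
    words, start, end = [], None, None
    for item in data["results"]["items"]:
        text = item["alternatives"][0]["content"]
        if item["type"] == "punctuation":
            # Punctuation has no timestamps; attach it to the previous word.
            if words:
                words[-1] += text
            elif cues:
                s, e, t = cues[-1]
                cues[-1] = (s, e, t + text)
            continue
        if start is None:
            start = item["start_time"]
        end = item["end_time"]
        words.append(text)
        if len(words) >= max_words:
            cues.append((start, end, " ".join(words)))
            words, start, end = [], None, None
    if words:
        cues.append((start, end, " ".join(words)))
    blocks = [
        f"{n}\n{srt_time(s)} --> {srt_time(e)}\n{t}\n"
        for n, (s, e, t) in enumerate(cues, start=1)
    ]
    return "\n".join(blocks)
```

Load the transcript with `json.load`, pass the result to `transcribe_json_to_srt`, and save the returned text as a `.srt` file. Grouping by a fixed word count is crude; splitting on sentence punctuation or on pauses would give nicer captions.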

Of course, you can just look up transcription services on Google and you will find plenty that use real human beings at very low cost (starting at $2).

Please note that I posted an idea on this Camtasia forum asking TechSmith to allow us to build our own plugins so developers can write tools like this. If that were an option, I would write an AWS service so you can use it within the product.

Finally, I have used Fiverr.com to create transcriptions and you will find lots of high-quality transcription options there.

Hope this helps.

-Bill

Joe Morgan

  • 9125 Posts
  • 4806 Reply Likes

I like Dragon NaturallySpeaking.


I’m using it to transcribe this response. I’ve been using the program off and on for several years. Meaning, I only use it once in a while.

However, it’s my opinion it’s the most accurate program out there.

The biggest problem with electronic transcriptions is that nobody’s voice sounds exactly the same. You have to train Dragon to recognize your voice. I like to call it training your Dragon. My voice is a little hoarse today. My health has been less than stellar the last few days. Thus far, the program hasn’t gotten any of my words wrong.

If you don’t speak clearly and enunciate, you will encounter errors.

Dragon’s not cheap; my last upgrade was $100. The regular price for the new version is $300.

So, when it comes right down to it, between training your Dragon, which takes time, and the initial cost of the program, it’s more than a casual user might need.

However, if you’re going to be doing a lot of this, going back and correcting all the mistakes other programs make adds up over time.

Couple that with the less than user-friendly closed captions in Camtasia.

You may be better off hiring a service.

As you start racking up hours to do all of this, paying a service a nominal fee to handle all your needs starts looking pretty attractive.

However, anytime you hire out, you’re on hold until they deliver the goods.

You can add audio files directly to Dragon for transcription.

I just wanted to toss in my two cents here. I haven’t fired up Dragon in a while.

For what it’s worth.

Regards, Joe


davemillman

  • 716 Posts
  • 252 Reply Likes
Now that the thread has migrated to dictation, I'll point out that Mac has dictation built into the OS.
https://support.apple.com/guide/mac-help/use-dictation-mh40584/10.15/mac/10.15

For fast-turn transcription projects that require humans, I use Rev.com. Best results I've found, good but not perfect.

Joe Morgan

  • 9125 Posts
  • 4806 Reply Likes
I think you misinterpreted my post.
Electronic transcription requires the same tools used to dictate accurately. That was my point.


Dragon can do transcriptions from audio recordings.
Once trained to your voice, it's more accurate than programs that can't read the nuances of your voice.



Microsoft has a similar program built into Windows. It doesn't hold a candle to Dragon. I think Microsoft should scrap it or fix it. I lean towards scrap it and start over. {:>)