The Ultimate Guide to Automatic Captioning
March 2, 2021

Have you ever stumbled upon a video that you wanted to watch but couldn't do so because the audio was too low or muffled? Or the video wasn't in a language that you spoke or understood?


Or maybe you were in a crowded place without headphones and playing it out loud would have earned you death stares and a few colorful insults.


This is where automatic captioning comes in. It solves all these problems and more.


Thanks to the advancement of technology, more and more people are using smartphones, and online video platforms are popping up everywhere.


Reports show that mobile video consumption increases by 100% every year. Over 78% of people watch videos online every week with an average of 84 minutes spent on this endeavor per day. By 2022, online videos are predicted to make up more than 82% of all internet traffic.


What does this all mean?


Video is the future of communication. But it's not enough for you to just create and share videos. You need to make sure that people from all walks of life can easily access your video and understand your message.


In this article, we'll walk you through everything there is to know about automatic captioning, why it's important, and how your brand can maximize impact and expand its reach by adding captions and subtitles to the spoken media you produce.


You'll also learn about the best practices to implement when captioning your content and the tools you can use to get this done efficiently.

What is Automatic Captioning?

Automatic captioning is the process of using artificial intelligence to reduce the manual labor and stress involved in captioning media files.


Anyone who's spent time on transcriptions or captioning can tell you it's very challenging, painstaking, and time-consuming. You could end up spending 5 to 10 hours captioning an hour-long video depending on how experienced you are.


This presents a lot of difficulties, especially when you churn out large volumes of content. Consumers are not known for their patience. They want what they want and they want it right now. If you can't give it to them, they will go somewhere else.


Training and leveraging AI-based solutions can save you a lot of time and money while ensuring that your media assets are properly captioned and ready to be shared with the world.


AI-based tools draw on many techniques and technologies to create automatic captions for your videos, such as speech recognition, audio recognition, diarization, language identification, context, audio description, and AI vocabulary.


You might be wondering: who needs captions anyway?


The answer is everyone, but especially people who are deaf or hard of hearing. Research shows that over 5% of the world's population (466 million people) has some degree of hearing loss. Your content needs to be accessible to them.


There are numerous instances where captioning comes in handy, from theatre productions to lectures, workplace meetings and interactions, television, and events.


Captioning not only promotes inclusion but also gives your audience a chance to follow or catch up with the dialogue and absorb the message you're trying to get across.


Here's another thing: search engine algorithms are designed to read text, so they won't be able to tell what your video is about unless you describe it to them with words.


Captions and subtitles provide these details so Google and the rest can interpret them and accurately present your videos in the result pages for relevant searches.


Although there are plenty of captioning platforms and software out there, and they're learning and getting more powerful by the day, these auto captioning tools are still far from perfect.


Spoken words can get captioned incorrectly due to background noise, accents, mispronunciations, or the machine's lack of familiarity with certain terms.


As a result, there's still a need for human influence. You have to proofread and edit the generated subtitles and captions to ensure they're an accurate representation of your video.

The Difference Between Closed Captioning and Open Captioning

When using automatic captioning software to add text to your videos, you can choose to go the closed caption route or use the open caption method. The main difference between the two is user control.


Closed captions are encoded in a way that allows users to turn the captions in the video on or off. The captions are contained in a separate file rather than the video itself.


Open captions are burned directly into the video so that it's impossible for the user to turn the captions off. They're permanent. Anyone who watches your video will see the captions whether they like it or not.


Social media platforms have embraced open captioning because it creates opportunities for the captions to grab the user's attention even when the video autoplay is silent. According to Facebook, adding captions to videos increases view time by up to 12%.


Open captioning is also commonly used in cinema and in clips intended for platforms that don't allow sidecar files.

Key features and differences

Closed captions:

  • Little control over the look and style of your captions
  • Easy to include multiple languages
  • Highly beneficial for search engine optimization
  • Correcting or modifying your captions can be easily done even after they’re exported or published
  • Supported by a wider variety of video platforms, which makes uploading and syncing easier
  • Easier to repurpose captions for other content

Open captions:

  • Font, caption size, and other stylistic elements display exactly as they are embedded
  • Works best when only one language is used
  • Doesn’t benefit SEO because they can’t be read by search engines
  • You have to remove and re-embed captions every time you need to make edits
  • Requires expensive video captioning software and professional help to create them



Captions vs Subtitles 

These terms are frequently used interchangeably, but they're not necessarily the same. So how are captions different from subtitles?


Captions are used to relay the message of the audio in the same language it's spoken in. They are the words on the screen that represent what is being said in the video, including non-speech audio information. Captions are based on the assumption that the user cannot hear.


Subtitles, on the other hand, translate what's being said into another language that the viewer is familiar with. They're intended for people who can hear the audio but don't understand the spoken language. That said, people who do speak and understand the language may also use subtitles, for example when watching with the sound turned off.

DIY: How to Add and Edit Automatic Captions for Videos

Creating your own captions might be a tedious, time-consuming activity, but it's very doable if you've got the backbone for it, especially when there's no room in the budget for a professional captioning service.


These DIY captioning techniques will help make your videos more accessible and engaging:

1. Use YouTube to create captions

Whether you intend to share your videos on YouTube or not, you can use the free automatic closed captioning tools the platform offers to caption your content.


Simply upload your video to YouTube, set a language, and wait a few hours for it to auto-generate subtitles. But as we mentioned earlier, automated closed captions are hardly free of errors, so you'll need to review and edit them.


All you have to do is:


  1. Sign in to YouTube, go to the video manager, select the desired video and click Edit > Subtitles
  2. Click the language you want to edit to display the automatic captions
  3. Click Edit, scroll to any line in the tracking panel and select the frames you want to edit
  4. When you're done, click Publish to overwrite previously saved captions
  5. Scroll to Actions > Download and select the caption file format you want to download (usually .srt)
  6. Open the caption file in a text editor, delete any extra lines or spaces, and upload it alongside your video to your desired platform
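
The cleanup in step 6 can also be scripted. The following is a minimal Python sketch, not a YouTube feature. It assumes a standard .srt file in which each cue has a timing line such as 00:00:01,000 --> 00:00:02,500, and the helper name clean_srt is hypothetical. It trims stray spaces, drops extra blank lines, and renumbers the cues.

```python
import re

# A timing line looks like "00:00:01,000 --> 00:00:02,500".
TIMING = re.compile(r"^\d{2}:\d{2}:\d{2},\d{3}\s*-->\s*\d{2}:\d{2}:\d{2},\d{3}")


def clean_srt(text: str) -> str:
    """Return the SRT text with tidy spacing and cues renumbered from 1."""
    # Trim every line and throw away the empty ones.
    lines = [line.strip() for line in text.splitlines()]
    lines = [line for line in lines if line]

    cues = []  # each cue is [timing_line, text_line, ...]
    for i, line in enumerate(lines):
        if TIMING.match(line):
            cues.append([line])
            continue
        if not cues:
            continue  # skip anything before the first timing line
        next_line = lines[i + 1] if i + 1 < len(lines) else ""
        if line.isdigit() and TIMING.match(next_line):
            continue  # old cue number; cues are renumbered below
        cues[-1].append(line)

    blocks = ["\n".join([str(n), *cue]) for n, cue in enumerate(cues, start=1)]
    return "\n\n".join(blocks) + "\n"


if __name__ == "__main__":
    sample = "1\n00:00:01,000 --> 00:00:02,500  \nHello there\n\n\n\n3\n00:00:03,000 --> 00:00:04,000\n\nSecond line \n"
    print(clean_srt(sample))
```

One known limitation of this sketch: a caption line that consists only of digits and sits directly before a timing line would be mistaken for a cue number, so it is still worth skimming the result.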


If you already have a transcript, you can use YouTube to accurately time code your closed captions to your video.

2. Transcribe your audio manually or use automatic speech recognition software

A trained transcriptionist might take five hours or more to transcribe one hour of video or audio content. For an untrained person, captioning content from scratch could take double or even triple the amount of time.


And here's the kicker: your text is going to be riddled with human errors.


When you weigh the time cost of manually transcribing content on your own from scratch, it's seldom worth it.


The best thing you can do is leverage voice recognition closed captioning technology like Keevi, Dragon, YouTube, Watson, or Otter to automate some parts of the process. This will save you tons of time and frustration. 

3. Create captions with an existing transcript

Using a transcript makes the work of creating closed captions for your videos much easier. All you need to do is synchronize your video and transcript so that they flow together. 


Attempting to add the time codes by yourself is not advisable. For utmost accuracy, use tools like YouTube, Dotsub, or Camtasia to create the time codes.


The final step is to ensure your time-coded transcript is in the right format for the media player you plan to use before uploading it to your video.


Automatic speech recognition software such as Camtasia, YouTube, and Dictation can help produce a rough draft transcript which you can edit and use for this process.

Automatic Video Captioning

Each video player comes with its own rules and quirks, so what works for one might not fly with another. As a result, you need to know how to add automatic closed captions to your videos using different methods.

1. Burn your captions into the video

When you burn captions into your video file, they'll remain there permanently, so anyone who views your video will see them whether they like it or not. This method works best for social media or offline videos. These captions are called hardcoded captions.


Keevi is one of the easiest and most cost-friendly ways to hardcode captions into your videos. Alternatively, you can use open captioning software like Adobe Premiere Pro or hire a professional captioning company.

2. Use a "sidecar" file

This is the most common way to add subtitles or captions to videos that are intended for online viewing. The sidecar file is the computer file that contains the transcript or captioning data for your video. 
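
To give a feel for what a sidecar file contains, here is a minimal Python sketch that writes caption cues in the SRT layout. The function names srt_timestamp and build_srt are hypothetical, and the input is assumed to be a list of (start seconds, end seconds, text) tuples that you have already timed by some other means.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as the HH:MM:SS,mmm stamp used by SRT."""
    ms = round(seconds * 1000)
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    secs, ms = divmod(ms, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def build_srt(cues) -> str:
    """Build SRT text from (start_seconds, end_seconds, text) tuples."""
    blocks = []
    for number, (start, end, text) in enumerate(cues, start=1):
        timing = f"{srt_timestamp(start)} --> {srt_timestamp(end)}"
        blocks.append(f"{number}\n{timing}\n{text}")
    # Cues are separated by one blank line.
    return "\n\n".join(blocks) + "\n"


if __name__ == "__main__":
    cues = [(0, 2.5, "Hello"), (2.5, 4, "World")]
    # Saving this text next to the video (e.g. as video.srt) gives the sidecar file.
    print(build_srt(cues))
```

The resulting text is what you would upload alongside the video on a platform that accepts SRT files.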

3. Utilize an integration or API workflow

This automatic captioning method will create a link between your video player and captioning vendor so the latter can automatically post the captions to your video file once it's ready. This saves you the stress of having to create the captions or upload and merge them by yourself. 

4. Embed your captions

Embedding or encoding captions into your videos works best for offline videos or videos intended for a platform that doesn't offer captioning support.


Instead of having your video and captions in separate files, they'll be merged into one asset. However, users who view the video offline will still be able to turn the captions on or off.
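
If you prefer a command-line route, the difference between burning and embedding can be sketched with FFmpeg, a free tool this guide does not otherwise cover. The Python snippet below only builds the two commands. It assumes FFmpeg is installed (with subtitle rendering support for the burn-in case) and that file names are simple, without spaces or special characters. The function names are hypothetical.

```python
import shlex


def burn_in_command(video: str, captions: str, output: str) -> list:
    """Open captions: the subtitles filter draws the text into the picture."""
    return ["ffmpeg", "-i", video, "-vf", f"subtitles={captions}", output]


def embed_command(video: str, captions: str, output: str) -> list:
    """Embedded captions: keep the text as a separate, switchable track.

    "-c copy" copies audio and video without re-encoding, and
    "-c:s mov_text" then sets the subtitle codec used inside MP4 files.
    """
    return [
        "ffmpeg", "-i", video, "-i", captions,
        "-c", "copy", "-c:s", "mov_text", output,
    ]


if __name__ == "__main__":
    # Print the commands so they can be reviewed before running them,
    # for example with subprocess.run(command, check=True).
    print(shlex.join(burn_in_command("talk.mp4", "talk.srt", "talk_burned.mp4")))
    print(shlex.join(embed_command("talk.mp4", "talk.srt", "talk_embedded.mp4")))
```

The first command produces a video whose captions can never be switched off; the second produces a single file in which players that support text tracks can toggle them.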

Benefits of Using Captions

Investing in video captioning can produce tremendous gains for both you and your audience such as:

1. Improved SEO 

The text in your captions will enable search engines to properly crawl and index your videos. This will boost brand awareness and lead to your videos ranking higher and getting more views.

2. Accessibility

71% of people with disabilities will abandon your content immediately if it's not accessible. Captions and subtitles allow your message to reach a wider audience.

3. Content viewing flexibility

Over 85% of videos are consumed without sound. With captions, your message can still get across and remain accessible when viewers watch your videos without audio.

4. Increased engagement and retention

Captions allow for better comprehension. People are far more likely to remember what they see than what they hear. And if they like what they see, they'll happily engage and share it.

5. Videos can easily be repurposed

You can turn your captions into transcripts that people can download and refer to at will. They can also be converted into infographics, blog posts, newsletters, or social videos. That’ll save you the time, money, and energy of creating new content from scratch.

The Cost of Poor-Quality Captions

Having no captions on your videos is better than having a video where the captions are blurry, illegible, or don't properly capture what's being said. Inaccurate or poor video captions can negatively impact the accessibility of your content.


Beyond that, it can also:


  • Ruin your content's message by fostering confusion and misinformation, which can be potentially harmful if relied upon.
  • Create a frustrating viewing experience for the user that ultimately tarnishes your brand reputation and costs you sales.
  • Damage website or social media SEO because search engines don't appreciate poor content riddled with grammatical errors and inaccuracies.
  • Complicate your work and force you to spend time and resources fixing your errors when you should be focused on other projects.
  • Lead to low user engagement and video ROI because viewers will abandon the content as soon as they discover the caption quality is poor.

How to Add Captions and Subtitles to Social Media Videos

Social videos have become a powerful tool for brands, influencers, marketers, and users to share interesting stories and communicate with their audience in an engaging way. 


It's quickly become social media's most popular and beloved form of content.


Studies show that 72% of people prefer to learn about products and services through videos rather than through other mediums. In 2016, HubSpot and Wistia revealed that social media posts that contained videos generated 48% more views and 1200% more shares than those that didn't. (WordStream)


Instagram video consumption has grown by more than 40%, while Twitter claims over 2 billion daily video views on its site. 


Adding captions to your social videos will entice more people to watch them and increase the interactions you get. 


When you consider the ROI, captioning your social videos seems like a small price to pay for such massive rewards.


Here's how you can add captions to your social videos on various platforms:


Social Media Platform

Video Captioning Method

Facebook

  • Transcribe your video and save it in SRT file format with the naming convention: filename.[two-letter language code]_[two-letter country code].srt. For example, a marketing video file in UK English would be named marketing.en_GB.srt (GB is the country code for the United Kingdom)
  • Go to your Facebook page, select Edit Video, scroll to the Captions tab and click Upload SRT File, select the desired video, then hit Save.

Instagram 

  • Burn or encode your captions directly into your video, preferably in MP4 format
  • Keep your video length under 60 seconds
  • Upload video to Instagram 

Twitter

  • Transcribe your video and save the caption file in .srt format
  • Log in to Twitter, click on a video within your Media Studio library, select the Subtitles option, then select the language of your subtitles
  • Hit the upload button and add an SRT file from your computer

LinkedIn

  • Create a caption file for your video in SRT format
  • Log in to LinkedIn on desktop, scroll to the share box, and click the Video icon to select the desired video 
  • Click the Edit tab in the upper right corner to view the Video Settings, click Select File to attach your SRT caption file, and hit Save
  • Enter any additional text into your post and click Post
  • LinkedIn will send you a notification when your video is up and ready for viewing
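
The Facebook naming convention from the table above is easy to get wrong by hand, so here is a minimal Python sketch that builds the file name. The helper name facebook_caption_name is hypothetical, and the codes are assumed to be a two-letter language code and a two-letter country code, for instance en and US for US English.

```python
from pathlib import Path


def facebook_caption_name(video_name: str, language: str, country: str) -> str:
    """Return a caption file name shaped like filename.ll_CC.srt."""
    for code in (language, country):
        if len(code) != 2 or not code.isalpha():
            raise ValueError(f"expected a two-letter code, got {code!r}")
    stem = Path(video_name).stem  # "marketing.mp4" -> "marketing"
    return f"{stem}.{language.lower()}_{country.upper()}.srt"


if __name__ == "__main__":
    print(facebook_caption_name("marketing.mp4", "en", "us"))
```

Running the script prints marketing.en_US.srt, which is the name to give the SRT file before uploading it.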


Common Captioning Formats

Captions are not a one-size-fits-all deal. There are dozens of caption formats for you to choose from depending on your needs and the platform you want your video to run on.


Closed caption formats come with varying levels of functionality, compatibility, and ease of use. They include XML, SBV, RT, SCC, CAP, SRT, SMPTE-TT, and more.


However, the most commonly used formats for closed captioning are SRT (SubRip Subtitle) and WebVTT (Web Video Text Tracks) files. These formats read like a script, are fairly easy to create by yourself, and are compatible with most lecture capture programs and video players.
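
Because the two formats are so similar, converting between them is straightforward. The sketch below is a minimal Python illustration, not a full converter: a WebVTT file starts with a WEBVTT header line, and its timestamps use a period before the milliseconds where SRT uses a comma. The function name srt_to_vtt is hypothetical, and styling or positioning data is not handled.

```python
def srt_to_vtt(srt_text: str) -> str:
    """Convert basic SRT text to WebVTT text."""
    body = srt_text.lstrip("\ufeff").replace("\r\n", "\n").strip()
    out = ["WEBVTT", ""]  # required header, then a blank line
    for line in body.split("\n"):
        if "-->" in line:
            # Timing line: 00:00:01,000 becomes 00:00:01.000
            line = line.replace(",", ".")
        out.append(line)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    srt = "1\n00:00:01,000 --> 00:00:02,500\nHello, world\n\n2\n00:00:03,000 --> 00:00:04,000\nBye\n"
    print(srt_to_vtt(srt))
```

Only the timing lines are changed, so commas inside the caption text itself are left alone.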


Now, let's take a look at some closed caption formats that you are more likely to deal with, their uses and systems they're compatible with:


File Format

Compatibility

Use Cases

Web Video Text Tracks (WebVTT, .vtt)

YouTube, MediaCore, Brightcove, Video.js, JW Player, Vimeo, and more

  • It supports positioning, audio description, and text formatting 
  • Used for HTML5 media players, web videos, and cloud-based video management programs 

SubRip Subtitle (SRT, .srt)

Kaltura, Windows Media Player, VLC, Blip.tv, Slideshare, Mediasite, Facebook, Camtasia, thePlatform, Wistia, YouTube, Adobe Presenter, etc

  • Runs with most popular video recording software, lecture capture systems, and media players
  • Easy to learn and create on your own

RealText (RT, .rt)

RealOne Player, RealPlayer

  • Streams easily and uses very little bandwidth
  • It's a timed-text file for RealMedia

Distribution Format Exchange Profile (DFXP, .dfxp)

Panopto, Flowplayer, YouTube, Kaltura, Ooyala, MediaSpace, Adobe Flash, Limelight, etc

  • Not CVAA compliant
  • Uses timed-text format

Scenarist Closed Captions (SCC, .scc)

DVD Studio Pro, iTunes, Adobe Encore, Final Cut Pro, Apple Compressor, YouTube, and more

  • Adding captions to VHS and DVD videos
  • For web and broadcast videos
  • Based on CEA-608, the standard transmission/broadcast format for closed captions in North America

SubViewer (SBV, .sbv or .sub)

YouTube

  • It's similar to an SRT file
  • Doesn't recognize style markups

Society of Motion Picture and Television Engineers Timed Text (SMPTE-TT, .xml)

AOL, SubtitlePlus, Crackle, Yahoo, Open Source Media Framework, Microsoft Media Platform's Player Framework, Netflix, Flowplayer, Amazon Video, Open DCP, YouTube, etc

  • It enables captions to include symbols and special characters 
  • FCC regulations compliant 
  • It supports caption positioning
  • It references video frames rather than video time


Tips for Video Captioning, Subtitling, and Transcription

Not all captions are created equal. Accuracy, quality, positioning, and readability go a long way in improving the viewing experience for your audience. You can't just add some text to a screen and call it a day; subtitling videos requires extra care and effort.


Here are some best practices you'll want to keep in mind when creating captions for your videos:


  • Display your captions with a maximum of two lines and 35 characters per frame.
  • Match the spoken-word style. If informal language is used in the video, your caption should do the same.
  • Time and position your captions so viewers can read and follow them easily.
  • Don't obstruct important visual content with your captions.
  • Be consistent with the style and size of your captions and use a legible, sans-serif font.
  • Describe sound effects and non-verbal information in brackets.
  • Captions should show up on the screen at a time that matches the speed of the words being spoken.
  • Use a font that supports symbols so you can provide non-verbal information where necessary.
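
The first guideline in the list can be checked automatically. Here is a minimal Python sketch that splits a long caption into frames of at most two lines. It reads the 35-character limit as a per-line limit, which is an assumption, and the helper name split_caption is hypothetical.

```python
import textwrap


def split_caption(text: str, max_chars: int = 35, max_lines: int = 2) -> list:
    """Wrap text into frames of at most max_lines lines of max_chars each."""
    # textwrap.wrap breaks on spaces and keeps every line within max_chars.
    lines = textwrap.wrap(text, width=max_chars)
    # Group the wrapped lines into frames of max_lines lines.
    return [
        "\n".join(lines[i:i + max_lines])
        for i in range(0, len(lines), max_lines)
    ]


if __name__ == "__main__":
    sentence = "Captions should be short enough to read before the next line appears on screen."
    for frame in split_caption(sentence):
        print(frame)
        print("---")
```

Each frame still needs its own start and end time, so this only handles the layout side of the guideline.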


But that's not all. These captioning mistakes can compromise your content so do your best to avoid them:


  • Free captioning software can save you lots of money, but it isn't 100% reliable, so don't depend entirely on it.
  • Your captions shouldn't be word-for-word audio transcriptions. Consider reading time on screen and reading speed and shorten your captions accordingly without sacrificing accuracy and meaning.
  • Don't do it all by yourself. Use reputable closed captioning software to do the grueling work, then edit the transcribed text.
  • You're not entering a graphic design contest so don't go overboard with the style of your captions. A simple, readable font and standard color is all you need.
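
The point about reading speed in the list above can be made concrete with a quick check. The Python sketch below flags cues that show more characters per second than a chosen limit. The limit of 17 characters per second is only an illustrative assumption, not a figure from this guide, and the function names are hypothetical.

```python
def chars_per_second(text: str, start: float, end: float) -> float:
    """Characters shown per second for one cue (line breaks not counted)."""
    duration = end - start
    if duration <= 0:
        raise ValueError("a cue must end after it starts")
    return len(text.replace("\n", "")) / duration


def too_fast(cues, limit: float = 17.0) -> list:
    """Return (cue_number, rate) for cues that exceed the limit.

    cues is a list of (start_seconds, end_seconds, text) tuples.
    """
    flagged = []
    for number, (start, end, text) in enumerate(cues, start=1):
        rate = chars_per_second(text, start, end)
        if rate > limit:
            flagged.append((number, round(rate, 1)))
    return flagged


if __name__ == "__main__":
    cues = [
        (0.0, 2.0, "Hello there"),
        (2.0, 3.0, "This caption is far too long to read in a second"),
    ]
    print(too_fast(cues))
```

Cues that get flagged are candidates for shortening or for a longer display time.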

Voice Recognition Closed Captioning Software

Many tools use machine learning and voice recognition technology to produce captions without human intervention by converting spoken words into text.


The use cases for voice recognition software are practically endless. It can be used for legal documentation, customer service, healthcare, marketing, internal workplace communications, and more.


Here are some of the most popular and top-rated speech recognition tools that you can try out:

1. Express Scribe

Although Express Scribe is one of the more pricey options on the market, it's well worth the price. It's easy to learn and offers many features to make your transcription process smoother and faster. This includes audio and video playback, comprehensive editing, and audio quality adjustment.


Express Scribe supports all audio and video formats recorded on a variety of devices. You can use the hot key function to automatically insert accurate timestamps into your transcript. 

2. The FTW Transcriber

This transcription software allows you to create captions for your audio files easily. It offers automatic timestamps, hot keys, and great sound quality to improve your captioning experience and make the work go faster. You can try out the free plan to see what the fuss is about before handing over your money.

3. Speech Notes

If you're looking for a free speech recognition closed captioning tool that can help you seamlessly transform your audio files into text, Speech Notes could be right for you. It claims 90% accuracy, recognizes pauses, and inserts appropriate punctuation where needed.


Although the user interface leaves something to be desired, it gets the job done.

4. Google Cloud Speech to Text

This easy-to-use voice recognition software supports various audio and video formats and provides near-perfect captions in a short amount of time.


It's reasonably priced and can easily transcribe jargon and up to 120 languages. It also offers automatic backup so you never have to worry about losing your files or ongoing work if something goes wrong.

Top Free Automatic Closed Captioning Software

The automated closed captioning marketplace is flush with all sorts of apps and web-based tools that can help simplify your captioning process.


To save you the time and trouble of researching the best free captioning tools to suit your needs, we've done the heavy lifting for you.


The following programs will transcribe and caption your audio and video with some degree of accuracy. But you may still need to edit the auto-generated subtitles to correct technical terms, fix punctuation, and rectify other errors.

1. Keevi

You can convert your YouTube videos, interviews, recordings, documentaries, speeches, podcasts, and more into text accurately and quickly with our captioning tool. Keevi takes only a few minutes to set up and is very simple to use and understand. It allows you to transcribe files in different languages into your language and easily insert timestamps into your text. All you have to do is upload your audio or video file and let our algorithms work their magic. In just 30 minutes, a 2-hour file can be transcribed with 90% accuracy, so your only job will be to review, download, and share the subtitled or captioned text.


2. Otter.ai

Otter offers 600 minutes of high-quality automatic transcriptions for free each month. If you don't have large volumes of content to caption, you can enjoy the wonderful benefits of this tool without paying a dime.


Its interface is sleek and easy to master in just a few minutes. Otter's software can detect pauses, differences between statements and questions, and the end of sentences.


The free service also includes timed text. The only downside is that it only allows you to download the caption file in TXT format. So you'll need to use another tool to convert to SRT and other formats.

3. Subtitle Edit

This software runs offline so you'll need to download and install it on your device first.


You might feel a little overwhelmed when you run the app for the first time and see all the options it has on offer, but don't let that scare you. The software offers a learning mode for beginners and it'll guide you as you go with clear directions.


Once you get comfortable with Subtitle Edit, you can move on to the advanced interface where you'll get more creative control.


The editing feature allows you to fix grammatical and spelling errors. You can also generate srt files from your videos, import a wide array of subtitle files, and embed open captions into your video. 

4. Aegisub

This flexible software solution comes with a variety of caption customization options. Aegisub allows you to change the color, size, outline, and font of your text and position it wherever you like on the screen.


The tool is completely free and there's an array of caption file formats for you to choose from.


However, Aegisub is not without a few drawbacks. There's no cloud-based option, so you must download and install the tool before you can use it. The process of adding time codes to captions is quite tedious. Also, you can't edit a video while it's still playing.

5. Kapwing

Video captioning is one of several web-based video editing solutions that Kapwing offers. The tool is pretty easy to use. You just have to select the auto captioning feature, upload your video, then review and edit the auto-generated captions.


When you're done making changes to the text, you can export it to your preferred storage destination.


Although Kapwing allows you to use their captioning services for free, their logo (watermark) will be added to the final product, your exported video. However, you can make a one-time payment of $6 to get rid of this.

Conclusion

More and more consumers want to see videos from the brands they patronize and admire. Video has become a huge part of our lives, and it's going to get even bigger in the coming years. This is why making your video content accessible should not be overlooked.


Captioning opens the doors for your message to reach more people than ever, arouse their interest, and push them to take action that moves you closer to achieving your brand goals.


If you don't want to undertake the task yourself or you need high-quality and accurate captions, there are free and paid tools that can do the work for you.


If you want to consistently churn out great, fresh content without constantly coming up with new ideas, subscribe to Keevi, and transform your content marketing game.


