This page gathers the IT Subcommittee's resources and reviews of the accessibility of Otter.ai. This page will be updated as new information becomes available or further reviews are conducted.
Otter.ai is one of the most frequently recommended AI transcription tools, with a focus on making meetings accessible to the Deaf/Hard of Hearing. Two of its major features are a high-quality AI-generated audio transcript and an integrated transcript editing platform that connects the recorded audio to the words in the transcript. This allows a quick basic transcript to be developed from an audio or video file (or a live transcript integrated into Zoom Pro), and makes it easy to listen repeatedly to troublesome audio when correcting the transcript. However, in testing, we found major cognitive load issues as well as some basic navigation problems when using keyboard navigation and/or a screen reader. We’ve brought these issues to the attention of Otter.ai, as they may be bugs. Since the initial review of the software, Otter.ai has implemented fixes for some of the issues we identified. The updated information is integrated into the page below.
Our testing focused on using the browser-based version of the software. After finding issues using a screen reader with the browser-based version, we tested the mobile apps on the recommendation of the Otter.ai support team.
Recording a Transcript
Recording a transcript (or uploading an audio/video file) is generally accessible, with some exceptions. Only a visual display shows that recording is in progress. When you start recording with the Record button from the homepage, you are automatically put into an active recording with no audio cues about what happened, where you are, or that recording has already started. Not all buttons are labeled correctly, and some link and button labels differ from the text shown visually. There are navigation issues whenever a pop-over window of information appears, because focus doesn’t automatically move to the pop-over. You have to navigate to the bottom of the page in order to access the pop-over window.
Recording via the mobile apps was generally accessible. However, if you aren’t using headphones or an alternate microphone, you will record the screen reader speaking the time codes instead of the audio you are trying to record.
The free version of Otter.ai limits each recording to 40 minutes of audio, which may not be enough to transcribe an entire meeting. It is also limited to 600 minutes per month, and you can only import 3 audio or video files per account (NOT per month). Paid accounts are required for longer audio support, more minutes, and other options.
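As a rough planning aid, the free-plan limits above can be turned into a quick calculation. The sketch below is only an illustration: the constants are the figures reported in this review and may have changed, and the function names `recordings_needed` and `fits_monthly_allowance` are hypothetical helpers, not part of any Otter.ai tool.

```python
import math

# Free-plan limits as reported in this review; they may have changed since.
MAX_MINUTES_PER_RECORDING = 40
MAX_MINUTES_PER_MONTH = 600


def recordings_needed(meeting_minutes):
    """Return how many separate recordings are needed to cover one meeting."""
    if meeting_minutes <= 0:
        return 0
    return math.ceil(meeting_minutes / MAX_MINUTES_PER_RECORDING)


def fits_monthly_allowance(meeting_lengths):
    """Return True if the total minutes stay within the monthly allowance."""
    return sum(meeting_lengths) <= MAX_MINUTES_PER_MONTH
```

For example, under these assumed limits a 90-minute meeting would need to be split across three recordings, and ten one-hour meetings would use the whole monthly allowance.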
Editing a Transcript
There are significant cognitive overload issues when using a screen reader to edit a transcript. We cannot recommend Otter.ai for editing a transcript with a screen reader.
The AI engine used to recognize speech is fairly high quality, though it has the expected issues with multiple voices and accents. With use, Otter.ai can learn to recognize speakers who have been identified in previous conversations. This option is only available after the audio is processed, and not while the live transcript is playing.
Exporting a Transcript
We encountered some difficulties while trying to export a transcript that might have been caused by a temporary bug. Right now, you can access the More menu via keyboard and tab between the options in the menu. However, when the pop-over window appears to select how to export the transcript, you must navigate to the bottom of the page to access it.
Otter.ai recommended switching to the mobile app if you’re having difficulties exporting with a screen reader in the browser version. We tested this with iOS and Android, and both seemed to work with a screen reader.
The free version of Otter.ai only allows you to export the transcript as a .txt file or the audio as an .mp3. Paid accounts can copy the transcript to the clipboard or export it as a .docx Word document, a PDF, or an .srt file. There have been reports of some difficulties with the timing accuracy of the .srt file when used with YouTube; in those instances it is best to use the YouTube automatic timing.
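If the .srt timings are unreliable and you want to fall back on YouTube's automatic timing, one option is to reduce the .srt export to plain text first. The following is a minimal sketch under the assumption that the export follows the common SubRip layout (a cue number, a timing line such as `00:00:01,000 --> 00:00:04,000`, then the caption text); the function name `srt_to_plain_text` is hypothetical and this has not been checked against an actual Otter.ai export.

```python
import re

# Matches an SRT timing line such as "00:00:01,000 --> 00:00:04,000".
TIMING_LINE = re.compile(
    r"^\d{2}:\d{2}:\d{2},\d{3}\s*-->\s*\d{2}:\d{2}:\d{2},\d{3}"
)


def srt_to_plain_text(srt_text):
    """Drop cue numbers and timing lines, keeping only the caption text."""
    kept = []
    lines = srt_text.splitlines()
    for i, line in enumerate(lines):
        stripped = line.strip()
        # Skip blank separator lines and the timing lines themselves.
        if not stripped or TIMING_LINE.match(stripped):
            continue
        # A cue number is a digits-only line directly followed by a timing line.
        next_line = lines[i + 1].strip() if i + 1 < len(lines) else ""
        if stripped.isdigit() and TIMING_LINE.match(next_line):
            continue
        kept.append(stripped)
    return "\n".join(kept)
```

Caption lines that happen to consist only of digits are kept, because a line is treated as a cue number only when a timing line immediately follows it.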
- What is the best auto captioning for video calls? Blogpost by Meryl Evans, April 22, 2020
- Otter.ai blogpost on accessibility
- Keyboard shortcuts by Otter.ai (only covers transcript and playback shortcuts)
- Otter.ai Help Center
- There is no dedicated accessibility support contact or information. For accessibility issues with Otter.ai, email email@example.com
Known Accessibility Issues
The biggest obstacle when using a screen reader is balancing the screen reader’s audio of the written transcript with the actual recorded audio while editing. It may require switching between focus and browse modes, as well as using the Otter.ai keyboard shortcuts. This was overwhelming for our testers, and made it nearly impossible to do any editing of a transcript. One option is to export the transcript and edit it in another app, pausing the screen reader as needed while still controlling the audio playback in Otter.ai. You can also export the recorded audio as an mp3 and play it back on an external device. Switching to the mobile apps did not solve this issue. In the Android and iOS mobile apps, editing was not accessible at all.
An additional difficulty with using a screen reader and the mobile apps is that the screen reader continually reads the screen. Without headphones or an alternate microphone, you will simply record the screen reader speaking aloud. We were able to use the export option in the mobile apps with a screen reader, even when it failed in the browser.
A strange issue that may be a bug is that with NVDA, the zeros in numbers are read as “24.” When you stop or pause a recording, it’ll say “twenty-four, twenty-four,” though when the recording starts, there is no audio to let you know it’s recording.
It may also be advisable to use a more traditional transcript editing program, where one can utilize transcription pedals to control the audio while typing the transcript. Other automatic transcription options would be Word 365 Online or Nuance Dragon speech recognition.
Legal, Ethical and Privacy Issues
- A Politico article outlines how more attention needs to be paid to the fact that, while the data/transcriptions aren't being actively shared with governmental entities, they can be subpoenaed. While we continue to rely on easy technologies to support our work in accessibility, they are all vulnerable to hacking. Otter.ai isn't particularly vulnerable, but if you are working with vulnerable groups, you should keep privacy in mind.