Basic understanding of transcription services – 8 key factors

After receiving several questions regarding transcription, I have decided to put together some factors and basic concepts about this service offered by many translators and linguists, which I hope will be useful especially for those who haven’t yet dared to take the first step into this field.

What is transcription?

Transcription is the act of documenting an audio file in writing: the typing of recorded speech by a qualified typist. But transcription involves more than just typing. It involves listening to a recording, researching, understanding and only then typing.

So, how long does it take to transcribe a recording?

This process is actually quite time-consuming, but don’t panic.

One hour of recording can take between 4 and 10 hours to transcribe. Much longer than expected, isn’t it? Well, it actually depends on several aspects.
You will find that many transcriptionists charge per hour of recorded audio (not per hour of work); others charge per recorded minute. So it is a good idea to have a basic understanding of the factors that affect transcription turnaround time in order to set your rates:


First of all, the number of speakers. It seems quite obvious that a single speaker will be easier to transcribe. Who thinks it is a simple task to transcribe 3 or 4 speakers talking at the same time?
Another key factor related to the number of speakers is whether the client requires the speakers to be identified. With recordings of simple one-to-one interviews, identifying the speakers is usually easy.
On the other hand, with large groups, meetings, conferences or roundtable discussions, this task becomes quite complicated (unless the speakers identify themselves), as people in groups tend to all speak at once, interrupt each other or raise their voices, forcing the transcriptionist to ‘tune in’.


Recordings made without a microphone, or outdoors, tend to have much more background noise than recordings made indoors with professional equipment. Poor-quality recordings need several rounds of listening before they can be understood, and therefore take longer to transcribe.
Also, if the voice is hard to hear (sometimes the speaker is too far away from the microphone), or if the speaker mumbles, speaks too fast or too quietly, it will be difficult to figure out what he or she was trying to say. Each voice has a different tone, pitch, and speed, as well as a different accent, which we will see in the next point.


The stronger (and less common) the accents of the speakers are, the longer it will take to transcribe a recording. For example, in Spain, we have different accents depending on the region, and believe me, they are very different. So first of all, check the audio file and be sure you are familiar with the accent!


You will probably notice that recordings with technical terminology require prior preparation and research to get the spellings right, and will therefore take longer to transcribe.
Knowledge of a subject, or specialization in it, can greatly reduce research time and improve transcription speed. Be sure to know what you are talking (typing) about.


If the participants speak very fast and with few pauses, the recording will definitely take a lot of listening and re-listening, and re-re-listening… That means extra transcription time, for sure.
The advantage is that after transcribing a few fast recordings, normal-speed recordings will seem much easier :-).


Remember your client is paying you to transcribe exactly what’s on the audio file for them. So it is very important to listen carefully, look up names (places, companies), read through the transcription and spell-check the whole document once you’ve finished. And if you don’t understand something, just mark it and insert a note in square brackets with the time of the unclear section, e.g. [unclear 20:15].
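If you keep track of the playback position in seconds, the note can be generated rather than typed by hand. Here is a minimal sketch in Python; the function name unclear_marker is made up for illustration, and it assumes the timestamp in the note is minutes:seconds (with hours added in front only for long recordings).

```python
def unclear_marker(seconds: int) -> str:
    """Build a note such as [unclear 20:15] from a position given in seconds.

    Assumption: the timestamp is written as minutes:seconds, and as
    hours:minutes:seconds once the recording passes the one-hour mark.
    """
    if seconds < 0:
        raise ValueError("seconds must be non-negative")
    minutes, secs = divmod(seconds, 60)
    hours, minutes = divmod(minutes, 60)
    if hours:
        stamp = f"{hours}:{minutes:02d}:{secs:02d}"
    else:
        stamp = f"{minutes}:{secs:02d}"
    return f"[unclear {stamp}]"


# 20 minutes and 15 seconds into the recording
print(unclear_marker(20 * 60 + 15))
```

Whatever format you choose, agree on it with the client and use the same one throughout the document.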


Okay, so typing speed is THE key factor regarding transcriptions. It depends on your ability, practice, and all the previously mentioned points.
Let’s say that at a normal typing speed it would take approximately 4-5 minutes to transcribe a 1-minute recording. So basically, an hour of clearly recorded audio/video would take about 4-5 hours to transcribe.
People speak seven times faster than they write and four times faster than they type, which is why the standard transcription time is set to four times the length of a recording.
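The arithmetic is easy to reproduce for a recording of any length. The sketch below is only an illustration: the function name estimate_work_minutes is hypothetical, and the ratio is the 4-5 minutes of work per minute of audio mentioned above, not a measured value.

```python
def estimate_work_minutes(audio_minutes: float, ratio: float = 4.0) -> float:
    """Estimate the typing time for a recording.

    ratio is the number of minutes of work per minute of audio
    (4 is the rule of thumb from the text; adjust it to your own speed).
    """
    if audio_minutes < 0 or ratio <= 0:
        raise ValueError("audio_minutes must be >= 0 and ratio must be > 0")
    return audio_minutes * ratio


# A one-hour recording at 4 and at 5 minutes of work per minute of audio
low = estimate_work_minutes(60, ratio=4)
high = estimate_work_minutes(60, ratio=5)
print(f"{low / 60:.0f}-{high / 60:.0f} hours")
```

Timing yourself on a few short recordings is the simplest way to find the ratio that matches your own typing speed.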


Some clients like to stick with their transcriptionist once they’ve found a good one, as familiarity with a speaker’s style helps improve transcription time and accuracy. It will also save you time (so try to do a really good job).
Happy client, happy transcriptionist!

So, after analyzing all these factors, we can state that:

– Simple transcription (with no background noise and 1-2 speakers) will take around 4-5 hours per hour of recording

– Complex transcription (with background noise and 3 or more speakers, strong accents, or technical vocabulary) will take around 5-10 hours per hour of recording
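Since many transcriptionists charge per hour of recorded audio rather than per hour of work, these two estimates can be turned into a quick check of what a given price really pays per hour worked. This is a rough sketch only: the function name and the price of 100 are invented for the example, and the hour ranges are simply the two estimates above.

```python
# Hours of work per hour of recording, taken from the two estimates above.
WORK_HOURS_PER_AUDIO_HOUR = {
    "simple": (4, 5),
    "complex": (5, 10),
}


def effective_hourly_rate(price_per_audio_hour: float, kind: str) -> tuple[float, float]:
    """Return the (lowest, highest) pay per hour of work for a given price.

    price_per_audio_hour is what you charge per hour of recording;
    kind is "simple" or "complex".
    """
    fastest, slowest = WORK_HOURS_PER_AUDIO_HOUR[kind]
    return price_per_audio_hour / slowest, price_per_audio_hour / fastest


# Hypothetical price of 100 (in any currency) per hour of recording
for kind in ("simple", "complex"):
    low, high = effective_hourly_rate(100, kind)
    print(f"{kind}: {low:.2f} to {high:.2f} per hour of work")
```

If the lower figure for a complex recording is less than you are willing to work for, that is a sign to quote a higher rate for difficult audio.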


Don’t forget to leave your comments below!

About the author

Translator and interpreter from English, French and Portuguese into Spanish. Sworn translator of French.
