Text in other languages

Whenever text appears in another language, it is important to identify the language of that text properly. This ensures that screen readers, braille displays, and other assistive technologies can render the content accurately and read it according to the pronunciation rules for that language. When no other language has been specified for a phrase or passage of text, its human language is the default human language of the book.
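The way a book-wide default language applies to untagged text can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the element names (`book`, `p`, `span`) and the helper `effective_languages` are hypothetical and are not taken from our production tools. It shows that text without its own `xml:lang` inherits the nearest language declared above it.

```python
import xml.etree.ElementTree as ET

# Fully qualified name ElementTree uses for the xml:lang attribute.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

def effective_languages(elem, inherited=None):
    """Yield (text, language) pairs; untagged text inherits the nearest declared language."""
    lang = elem.get(XML_LANG, inherited)
    if elem.text and elem.text.strip():
        yield elem.text.strip(), lang
    for child in elem:
        yield from effective_languages(child, lang)
        # Text that follows a child element belongs to the parent's language.
        if child.tail and child.tail.strip():
            yield child.tail.strip(), lang

# Hypothetical markup: an English book containing one French phrase.
book = ET.fromstring(
    '<book xml:lang="en"><p>He said '
    '<span xml:lang="fr">bonne nuit</span> and left.</p></book>'
)
pairs = list(effective_languages(book))
for text, lang in pairs:
    print(lang, "|", text)
```

Only the French phrase is reported as `fr`; everything else falls back to the book's default, `en`.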

In some cases, though, marking up the change in language is not desirable because it actually harms accessibility. Do not mark up the language in these cases:

  1. Proper names
    1. Examples: Bellevue, Pierre
  2. Technical terms
    1. Examples: Homo sapiens, Alpha Centauri, hertz, and habeas corpus
    2. Most professions require frequent use of technical terms which may originate from a foreign language. Such terms are usually not translated to all languages. The universal nature of technical terms also facilitates communication between professionals.
  3. Words or phrases that have become part of the language
    1. Examples:
      1. "Rendezvous" is a French word that has been adopted in English, appears in English dictionaries, and is properly pronounced by English screen readers.
      2. "Podcast" used in a French sentence: "À l'occasion de l'exposition "Energie éternelle. 1500 ans d'art indien", le Palais des Beaux-Arts de Bruxelles a lancé son premier podcast. Vous pouvez télécharger ce podcast au format M4A et MP3." Because "podcast" is part of the vernacular of the immediately surrounding French text, no indication of language change is required.
    2. Frequently, when the human language of text appears to be changing for a single word, that word has become part of the language of the surrounding text. Because this is so common in some languages, single words should be considered part of the language of the surrounding text unless it is clear that a change in language was intended. If there is doubt whether a change in language is intended, consider whether the word would be pronounced the same (except for accent or intonation) in the language of the immediately surrounding text.
  4. Words of indeterminate language
    1. In the rare case where, for one reason or another, we cannot determine what the appropriate language information is, then we just leave it as is (do not mark it up). This might be a situation where we're not sure if the text is non-linguistic. We haven't come across this situation in an ebook yet!

For more information, please refer to WCAG 2.0 technique H58: Using language attributes to identify changes in the human language.

The important thing to keep in mind is why the guidelines exist. This guideline is for non-visual readers who use audio (text-to-speech) to access the text. I sometimes find it helpful to ask, “Would this negatively affect reading comprehension if it were voiced in English or in French?” You can easily test this by activating the text-to-speech tool on Windows (Narrator) or Mac (VoiceOver).

Applying language styles

The language can be set using styles at either the paragraph or the character level. For entire paragraphs in another language, we use a Paragraph style; for inline words or phrases in another language, we use a Character style.

For example, in the image below, we can create a new Character style (let's call the style Turkish) and set the language to Turkish using the Format drop-down menu and selecting Language.

Following these steps will ensure that the text is spoken in the correct language and that the language is carried over when the document is converted into XML.

Step 1: Create a new style (character or paragraph)

Create a style

Step 2: Go to "Language" in the drop-down menu

Go to Language in the drop-down

Step 3: Set the language of the text

Select the language

For entire documents written in another language

If the entire book is written in another language, we will need to change the language of the document so that it is not English.

To change the document language on a Mac, you can follow these steps: Change document language on a Mac

On a PC, Word should automatically detect the language of the document: Change document language on a PC

A note about poetry

When you are working on poetry, you will not be able to apply a language style to individual words and phrases, because applying a language to one word changes the entire line or stanza. In this case, leave the Word version without language markup and use only the Poetry (Poem (DAISY)) style. Make a note in the RT ticket that there are multiple languages.

Working with images of words and different alphabets

Sometimes a word or phrase will appear as an image in line with the sentence instead of typed text. This is an issue that originates with the publisher. Words or phrases should not be formatted as images, but sometimes publishers do not follow these guidelines. When this happens, you will need to transcribe the image of the term or phrase and then apply the language style. Be sure to delete the images once you have added the text version.

Some languages cannot be transcribed reliably due to their complexity. Arabic is one example: unless you are a native speaker, you cannot transcribe it correctly. In this case, treat the image of the word like other images in the document and add alt text stating that it is an Arabic word. Then put a Producer's Note at the beginning of the book to explain why you did this. If you are unsure whether you can safely transcribe a language, please contact your supervisor for more feedback.

Sometimes the terms or phrases are typed out in line with the rest of the text, but with a language that uses a different alphabet. In this case, if the text appears as typed text, and not an image, then you can simply apply a language style to it as usual.
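When checking a long passage, a rough script check can help flag typed text in another alphabet that may need a language style. The sketch below is a heuristic only, and `scripts_in` is a hypothetical helper name: it takes the first word of each letter's Unicode character name as the script, which works for common cases such as Latin, Hebrew, Greek, and Arabic. It identifies the script, not the language, so the language style still has to be chosen by a person.

```python
import unicodedata

def scripts_in(text):
    """Return the set of scripts used by the letters in `text`.

    Heuristic: the script is taken to be the first word of the
    character's Unicode name (e.g. 'HEBREW LETTER SHIN' -> 'HEBREW').
    """
    scripts = set()
    for ch in text:
        if ch.isalpha():
            scripts.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return scripts

# A Latin-script word followed by a Hebrew word (written with escapes
# so the example does not depend on right-to-left display support).
sample = "shalom \u05e9\u05dc\u05d5\u05dd"
print(sorted(scripts_in(sample)))  # ['HEBREW', 'LATIN']
```

Any passage that reports more than one script is worth a second look for missing language styles.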

If you're not sure how to type in different languages, see: Enable keyboard layouts in different languages in Office for Mac and Windows.

In other cases, you can use Unicode to enter the characters of the language. For more information on Unicode, go to the Symbols page.


Q: I have a book that uses Innuinaktun words, but it also has two images. One is an image of a table with the word symbols beside the sound (no English translation), and the other is a full piece of text in Innuinaktun. How should I address these images in the alt text? And should I also include a producer's note about the Innuinaktun words?

A: According to the publication information, this looks like the Inuktitut language. Inuktitut can be represented with the Unicode block Unified Canadian Aboriginal Syllabics. We will need to transcribe the images into Unicode text. If you're using a Mac, enable your "Unicode Hex Input" keyboard (see the Language section in the wiki for instructions). To type each symbol or letter into Word, hold down the Alt (Option) key and type its 4-digit hexadecimal code, e.g. 1400.
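To double-check a code before typing it, the mapping from a 4-digit hexadecimal code to a character can be sketched in Python. The helper names `char_from_hex` and `is_syllabic` are hypothetical; the range U+1400 to U+167F is the main Unified Canadian Aboriginal Syllabics block, and the sketch does not cover the separate extended block.

```python
import unicodedata

# First and last code points of the Unified Canadian Aboriginal Syllabics block.
SYLLABICS_START, SYLLABICS_END = 0x1400, 0x167F

def char_from_hex(code: str) -> str:
    """Convert a hexadecimal code such as '1400' to its Unicode character."""
    return chr(int(code, 16))

def is_syllabic(ch: str) -> bool:
    """True if the character falls inside the main syllabics block."""
    return SYLLABICS_START <= ord(ch) <= SYLLABICS_END

# Print each code, its character, its official Unicode name, and the block check.
for code in ["1400", "1403"]:
    ch = char_from_hex(code)
    print(code, ch, unicodedata.name(ch), is_syllabic(ch))
```

Comparing the printed character and name against the symbol in the image is a quick way to confirm the right code was chosen.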

Q: I am editing a poetry book that uses Italian, French, and Latin. If I apply a language to one word, it changes the entire line or stanza. Should I just leave it as poetry style?

A: Unfortunately, identifying languages in Word doesn't translate well to DAISY XML and requires manual editing of language tags in the XML. You can just leave the Word version without language markup and use just the poetry style. Just make a note in the RT ticket that there are multiple languages.
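For reference, the kind of manual language tagging done later in the XML can be sketched as follows. This is a minimal illustration, not our actual conversion step: the element names `line` and `span`, the sample sentence, and the helper `tag_phrase` are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Fully qualified name ElementTree uses for the xml:lang attribute.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

def tag_phrase(line, phrase, lang):
    """Wrap the first occurrence of `phrase` in the line's leading text
    in a <span xml:lang="..."> element. Returns True if a tag was added."""
    text = line.text or ""
    start = text.find(phrase)
    if start == -1:
        return False
    span = ET.Element("span", {XML_LANG: lang})
    span.text = phrase
    # Whatever followed the phrase becomes the text after the span.
    span.tail = text[start + len(phrase):]
    line.text = text[:start]
    line.insert(0, span)
    return True

# Hypothetical line of poetry containing an Italian phrase.
line = ET.fromstring("<line>She whispered dolce far niente and smiled.</line>")
tag_phrase(line, "dolce far niente", "it")
result = ET.tostring(line, encoding="unicode")
print(result)
```

Only the tagged phrase changes language; the rest of the line keeps the language of its surroundings, which is the behaviour that Word's styles cannot produce inside the poetry style.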

Q: I have a book that deals with Hebrew words. Some of the words are typed, and I can create a style for them, but the other words are images of just a letter, or an entire word. How should I deal with them? Should I just put in the alt text that this is an image of this Hebrew letter or word? Or should I put in a producer's note? I included examples below:

A: Using images instead of text is a very bad publishing practice :( Images of text should all be converted to text in the body of the narrative. We should type out all the text including the Hebrew and Greek text and use a style to tag them as words in the Hebrew or Greek language (as we usually do with foreign language words).

If you're not sure how to type in different languages, see: Enable keyboard layouts in different languages in Office for Mac and Windows.

WCAG 2.0 - H58: Using language attributes to identify changes in the human language

