Step 1: Convert Word to DTBook XML

The Save as DAISY add-in translates Microsoft Word 2003, 2007, 2010 and 2013 documents into DAISY. It only works on a Windows computer. NNELS uses this add-in to generate the DAISY XML for further processing.

Here's how to use it:

  1. Rename the Word document by removing the Production Assistant's last name from the file, so that it's simply Title_of_the_book.docx.
  2. Open the Word document.
  3. Check it over to ensure that everything is structured with the correct styles. Check whether the book has endnotes/footnotes and, if so, follow the special instructions for books with footnotes/endnotes further down this page.
  4. Validate the file by:
    1. going to the Accessibility tab in the top menu bar.
    2. clicking Validate. This should be quick and complete without errors ("The document is valid"). If you get an error, fix it and validate again. Tip: A common error is that your document has two consecutive headings with no content in between (e.g. a Heading 1 followed directly by another Heading 1). If it makes sense to combine these headings, do so; if it doesn't, ignore this validation error and proceed…
  5. Convert the document to DTBook XML by:
    1. going to the SaveAsDAISY drop-down button within the Accessibility tab.
    2. selecting DAISY XML (from Single docx).
  6. A box called DAISY Translator should pop up. Tip: If it gets stuck at the "Initializing translation" window then there's a problem with the Word file (could be hidden anchors somewhere in the document; check there are no extra spaces before heading text; double-check all images have had their formatting cleared, etc.).
    1. In this box, set your Destination folder. Save the output (XML, CSS + image files) in its own folder.
    2. Ensure all the Document Properties are correctly set: Title, Creator, Publisher (National Network for Equitable Library Service), Uid (automatically created).
  7. Click Translate. The translation process should initiate and shouldn't take too long ("Translating to DAISY…"). The bigger the book, the longer it takes, but the progress bar shouldn't freeze.
  8. When it's done, you should get the message: "Successfully Translated the document". All the files should be in your chosen output folder.
    1. Tip: If the document contains notes you might get some type of error message but files will still be output. Double-check the generated XML file and ensure all endnotes are there. If they're all there, you can safely ignore the message…
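
The two heading problems mentioned in the tips above (a heading followed directly by another heading of the same level, and extra spaces before heading text) can also be looked for before opening the add-in. The following is only a minimal sketch, not part of Save as DAISY: it assumes the document uses Word's built-in heading styles, whose style IDs start with "Heading" (e.g. "Heading1"), and the function names are made up for illustration. The add-in's own validation rules may differ.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used inside word/document.xml
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def read_paragraphs(docx_path):
    """Return a (style_id, text) tuple for each paragraph in a .docx file."""
    with zipfile.ZipFile(docx_path) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    paragraphs = []
    for p in root.iter(W + "p"):
        style = p.find(W + "pPr/" + W + "pStyle")
        style_id = style.get(W + "val") if style is not None else ""
        text = "".join(t.text or "" for t in p.iter(W + "t"))
        paragraphs.append((style_id, text))
    return paragraphs


def find_heading_problems(paragraphs):
    """List (paragraph index, message) pairs for suspicious headings.

    Assumption: heading style IDs start with "Heading".
    """
    problems = []
    previous_style = ""
    for index, (style_id, text) in enumerate(paragraphs):
        is_heading = style_id.startswith("Heading")
        if is_heading and style_id == previous_style:
            problems.append((index, "same-level heading with no content before it"))
        if is_heading and text != text.lstrip():
            problems.append((index, "heading text starts with whitespace"))
        previous_style = style_id
    return problems
```

To try it on a book, call `find_heading_problems(read_paragraphs("Title_of_the_book.docx"))` and look up each reported paragraph in Word. An empty list only means that these two patterns were not found; the file still needs to pass Validate.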

Special instructions for books with footnotes/endnotes

The SaveAsDAISY plugin has a "notes bug" that we need to work around. If the Word document contains any footnotes/endnotes, then you need to follow the steps below before you convert the document to XML. This ensures that the endnotes will be referenced correctly (i.e. note 1 will link to the reference for endnote 1).

  1. Add an empty endnote to the document:
    1. put your cursor somewhere in the document - I usually put it at the end of the "About this digital talking book" section.
    2. go to the References tab in the top menu bar.
    3. click Insert Endnote. An empty endnote will be inserted into the doc. This will translate to endnote-0 in the XML output, which you can then just delete later.
  2. Now you can go ahead and convert the file to DAISY XML following the instructions at the top of this page. You'll get a "Translation failed" message, but just click OK and ignore it.
  3. Delete this endnote from the Word document and save the file (we don't want the published Word document to have this empty reference!).
  4. The DAISY XML output will be generated as usual. When editing the XML you'll need to delete this endnote-0. The numbering for the book's endnotes will now be correct.
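
After editing the XML, the note numbering can be double-checked with a short script. This is a rough sketch under assumptions, not a description of the plugin's output: it assumes that notes appear as DTBook `<note id="endnote-N">` elements, that references appear as `<noteref idref="#endnote-N">`, and that the placeholder really carries the id `endnote-0` as described above. The function name `check_notes` is made up for illustration.

```python
import xml.etree.ElementTree as ET

PLACEHOLDER_ID = "endnote-0"  # id of the empty endnote, per the steps above


def local_name(tag):
    """Strip any '{namespace}' prefix from an element tag."""
    return tag.rsplit("}", 1)[-1]


def check_notes(dtbook_xml):
    """Compare note ids with noteref targets in a DTBook XML string."""
    root = ET.fromstring(dtbook_xml)
    note_ids = []
    ref_targets = []
    for element in root.iter():
        name = local_name(element.tag)
        if name == "note" and element.get("id"):
            note_ids.append(element.get("id"))
        elif name == "noteref" and element.get("idref"):
            # Assumption: idref values look like "#endnote-1"
            ref_targets.append(element.get("idref").lstrip("#"))
    return {
        # True if the placeholder note or a reference to it is still present
        "placeholder_left": PLACEHOLDER_ID in note_ids
        or PLACEHOLDER_ID in ref_targets,
        # references that point at a note id that does not exist
        "dangling_refs": sorted(set(ref_targets) - set(note_ids)),
        # notes that nothing in the text points to
        "unreferenced_notes": sorted(set(note_ids) - set(ref_targets)),
    }
```

Read the generated XML file into a string and pass it to `check_notes`. If the file uses named entities that are defined only in the DTD, the standard-library parser may reject it, in which case a manual check is still needed.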

Special instructions for books with acronyms or abbreviations

The Save-As-DAISY plugin for MS Word allows you to identify acronyms and abbreviations in a text. Once identified, these acronyms or abbreviations will be voiced in full to readers. Please use this feature with caution. We rarely need it, because most acronyms or abbreviations will automatically be voiced as intended, e.g. "TED Talks" will be voiced as "Ted talks". As well, most texts will state what an acronym or abbreviation stands for if it is not commonly used.

Follow the steps below to identify key acronyms or abbreviations that need clarification:

  1. In the Word document, go to the Accessibility tab in the menu bar
  2. Highlight the word that you want to identify as either an acronym or an abbreviation*
  3. Click on the Mark As Acronym or Mark as Abbreviation button
  4. You will then be prompted to enter the "full form" of your selected word, e.g. if you select the word "approx." then the full form would be "approximately". In the output XML file, you will see the <abbr> tag around each occurrence of the word.
  5. If you want all occurrences of the word in the book to be tagged then check Apply for all occurrences of this word.
  6. For Acronyms, there is also a checkbox for Pronounce the Acronym in Reader. Select this box if you want all occurrences of the word to be pronounced in full. In the output XML file, you will see <acronym pronounce="yes">NNELS</acronym>. This means that every time the screen reader comes across this word, it will say the complete text, "National Network for Equitable Library Service", instead of saying it letter-by-letter, "N N E L S".
    1. Please note that the plugin only allows you to mark a word as an acronym if that word contains dots to separate each letter (so, you can mark "N.N.E.L.S." but not "NNELS"). Note: you can always manually input the code in the XML file.
  7. Click Mark.
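
For acronyms without dots, which the plugin cannot mark, the markup has to be added to the XML by hand. As a rough illustration of what that manual edit produces, the sketch below wraps whole-word occurrences of an acronym in the `<acronym pronounce="yes">` markup shown in step 6. The function name `tag_acronym` is made up for illustration, and the sketch is meant for a fragment of plain text; running it over a whole XML file could also touch attribute values or text that is already tagged.

```python
import re


def tag_acronym(text, acronym):
    """Wrap whole-word occurrences of `acronym` in <acronym pronounce="yes">."""
    # (?<!\w) and (?!\w) keep longer words such as "NNELSX" untouched
    pattern = re.compile(r"(?<!\w)" + re.escape(acronym) + r"(?!\w)")
    return pattern.sub(
        lambda match: '<acronym pronounce="yes">' + match.group(0) + "</acronym>",
        text,
    )
```

For example, `tag_acronym("NNELS serves readers.", "NNELS")` returns `<acronym pronounce="yes">NNELS</acronym> serves readers.`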

*Definitions:

Abbreviation: An abbreviation is typically a shortened form of words used to represent the whole.
Examples: Dr. (Doctor), Prof. (Professor), St. (Street), Ave. (Avenue), cm (centimeters), vs. (versus)

Acronym: Technically, an acronym is a set of initial letters from a phrase that can be pronounced as a word.
Examples: NASA (National Aeronautics and Space Administration), AIDS (acquired immune deficiency syndrome), gif (graphics interchange format)

public/nnels/daisy/1-convert-word-to-xml.txt · Last modified: 2019/07/03 05:05 by farrah.little