Remove page breaks

All page breaks in the DAISY XML need to be removed unless these page breaks correspond to the print version page numbers (this is rare, and we'll let you know if that's the case).

The page breaks are automatically inserted by the Save-As-DAISY plugin and almost always correspond to the Word document's page breaks and not the original book's page breaks.

You can remove the page breaks in bulk using regex (regular expressions). Try the operations below, in order, to remove all pagenum instances. If they do not catch every page break in your file, you may need to devise your own regex. Make sure regular expressions are turned on in your editor (look for a .* button near the find and replace box):

Find: </p>\s+<pagenum.*\s+<p>
Replace: leave blank

If the above search finds nothing, try the following instead:

Find: </p><pagenum.*<p>
Replace: leave blank

After you have completed the above, some pagenum elements will still be left over (for example, those that do not sit directly between two paragraphs). To remove them, do the following:

Find: <pagenum.*</pagenum>
Replace: leave blank
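
If you would rather script the clean-up than run it in an editor, the three operations above can be sketched in Python as follows. This is only an illustration, not part of the documented workflow: the function name remove_page_breaks and the sample XML are made up for the example, and the patterns are copied unchanged from the Find fields above. Note that .* is greedy, so if several pagenum elements share one line, everything between the first and the last of them would be removed too. Run it on a copy of your file and check the result.

```python
import re


def remove_page_breaks(xml: str) -> str:
    """Apply the three find-and-replace operations, in order.

    Hypothetical helper for illustration; each pattern is replaced
    with an empty string, mirroring "Replace: leave blank".
    """
    # Operation 1: a pagenum between two paragraphs, with whitespace
    # (such as line breaks) around it. This joins the two paragraphs.
    xml = re.sub(r"</p>\s+<pagenum.*\s+<p>", "", xml)
    # Operation 2: the same case, with no whitespace between the tags.
    xml = re.sub(r"</p><pagenum.*<p>", "", xml)
    # Operation 3: any pagenum elements that are still left over.
    xml = re.sub(r"<pagenum.*</pagenum>", "", xml)
    return xml


# Made-up sample: one page break splits a paragraph, another follows a heading.
sample = (
    '<p>First half </p>\n'
    '<pagenum page="normal" id="page2">2</pagenum>\n'
    '<p>second half.</p>\n'
    '<h1>Chapter</h1>\n'
    '<pagenum page="normal" id="page3">3</pagenum>\n'
    '<p>More text.</p>'
)

cleaned = remove_page_breaks(sample)
print(cleaned)
```

In this sample, the first operation merges the two halves of the split paragraph, and the third removes the page break after the heading, leaving an empty line where it used to be.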


If you're not familiar with regex ("regular expressions"), it is worth learning the basics. There are lots of resources online to help you devise handy Find and Replace operations, such as this Regex cheatsheet.

public/nnels/daisy/edit-xml/pages.txt · Last modified: 2019/05/04 00:39 by farrah.little