Basic markup process for ebook conversion

We’ll now cover the basic markup process for a simple book layout. More complex works need additional steps and markup elements, but these four steps are enough for many ebooks:

  • Apply a single paragraph style to the entire book
  • Apply basic character formatting (bold, italic, underline, font size, alignment)
  • Apply styles to the title, chapter headings and sub-headings. This is usually done by applying Heading 1, 2 and 3 styles.
  • Insert images

Starting with a clean slate

If you have a manuscript with a lot of the formatting features noted above, you might find it easier to start from scratch by first eliminating all formatting, then adding just the formatting you need.

There are two simple methods to accomplish this.

  1. Normalize text. This method applies a single style — MS Word’s default Normal paragraph style — to the entire document. This will give you a consistent style to start from. You can add back required markup styles such as chapter headings.
  2. Remove all styles from all text. This method is more extreme but it also guarantees there will be no hidden ‘gotchas’ in the file to trip up any conversion program you might use. The disadvantage of this method is that it strips all formatting (except paragraph marks, which Word reinstates). You’ll have to reapply everything else, including bold and italic formatting.

In spite of the extra work upfront, it’s strongly advisable that you use one of these methods. If you don’t, it’s likely that the few hours you’ll save at the start of the project will turn into many more hours later as you track down and fix frustrating conversion errors.

Instructions on how to normalize or remove all formatting from text

How to normalize text

Follow the steps below to normalize all text in the MS Word document. Remember to copy your file first so you can return to the original if you need to.

  1. Select all of the text. Click anywhere in the text, then press Control + A (Command + A on a Mac) to highlight all of the text in the document.
  2. Apply the Normal style from Word’s Style menu.

[Screenshots, MS Word 2010: the document before and after applying the Normal paragraph style.]
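If you’re comfortable running a short script, you can double-check that the document really does use a single paragraph style. The following is a minimal Python sketch and not a feature of Word. It assumes the manuscript is saved in .docx format (Word 2007 or later), which is a zip archive whose main text lives in word/document.xml. The function names and the ‘(default)’ label are our own inventions. A paragraph with no explicit style entry uses the document’s default paragraph style, which is normally Normal.

```python
import zipfile
import xml.etree.ElementTree as ET
from collections import Counter

# Namespace used by the main text elements inside a .docx file.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def paragraph_style_counts(document_xml):
    """Count paragraphs per style ID in the XML of word/document.xml."""
    root = ET.fromstring(document_xml)
    counts = Counter()
    for para in root.iter(f"{{{W_NS}}}p"):
        style = para.find(f"{{{W_NS}}}pPr/{{{W_NS}}}pStyle")
        if style is None:
            # No explicit style: the default paragraph style applies.
            counts["(default)"] += 1
        else:
            counts[style.get(f"{{{W_NS}}}val")] += 1
    return counts


def docx_style_counts(path):
    """Open a .docx file and count the paragraph styles it uses."""
    with zipfile.ZipFile(path) as docx:
        return paragraph_style_counts(docx.read("word/document.xml"))
```

After normalizing, calling docx_style_counts("mybook.docx") (a hypothetical file name) should report a single entry. Any other style IDs in the result point to paragraphs that were missed.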

How to remove all styles

Follow the steps below to remove all formatting. Remember to copy your file first so you can return to the original if you need to.

  1. Select all of the text. Click anywhere in the text, then press Control + A (Command + A on a Mac) to highlight all of the text in the document.
  2. Copy the text to the clipboard (Control + C for Windows, Command + C for Mac).
  3. Open NotePad for Windows or a Mac equivalent such as TextEdit. Every Windows PC has NotePad; you’ll find it in the Programs/Accessories folder. Mac users can find TextEdit in the Applications folder, but you’ll need to take an extra step: go to Preferences and check ‘Ignore rich text commands in HTML files’, and make sure the document is in plain text mode (Format > Make Plain Text).
  4. Paste the text into NotePad (Control + V for Windows, Command + V for Mac). This will strip out all formatting and leave a clean, text-only file.
  5. Select all the text in NotePad, then copy it to the clipboard.
  6. Paste it into a new Word document.

Tip: Before you remove all formatting

If your book contains a lot of bold, italic and underline styles, they will be removed along with all of the other formatting. This will create a lot of extra work adding them back. Here’s how to reduce this problem.

  • Before you remove formatting, mark each occurrence of one of these formats using a distinctive combination of characters such as ‘b>’ and ‘i>’. To do this, use the handy Find and Replace Styles feature, which we cover later in the section Creating Multiple Versions. Choose a character combination that doesn’t appear elsewhere in the text (including inside any words).
  • After you’ve removed all formatting, use Find to locate each character combination and reapply the matching style to the adjacent text.
  • Finally, use Find and Replace again to find each character combination and replace it with nothing (leave the ‘Replace with’ box empty). This will delete it.

This technique will produce a Word document devoid of all formatting. You can then format it using just the styles you need for a successful ebook conversion.
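To see why the marker tip works, here is a minimal Python sketch of the same round trip. It does not touch a Word file. Instead it models the text as a list of (text, style) pairs, and the markers ‘[[b]]’ and ‘[[i]]’ with matching closing markers are hypothetical choices of ours. Any combination that never appears in your book would do.

```python
import re

# Hypothetical marker tags; the style names are labels for this sketch only.
TAGS = {"bold": "b", "italic": "i"}
NAMES = {tag: style for style, tag in TAGS.items()}
MARKED = re.compile(r"\[\[(b|i)\]\](.*?)\[\[/\1\]\]", re.DOTALL)


def mark_runs(runs):
    """Flatten (text, style) runs to plain text, wrapping styled runs in markers."""
    parts = []
    for text, style in runs:
        if "[[" in text or "]]" in text:
            # The markers must never occur in the book's own text.
            raise ValueError(f"marker characters already present in: {text!r}")
        if style is None:
            parts.append(text)
        else:
            tag = TAGS[style]
            parts.append(f"[[{tag}]]{text}[[/{tag}]]")
    return "".join(parts)


def restore_runs(plain):
    """Rebuild (text, style) runs from marked plain text and drop the markers."""
    runs, pos = [], 0
    for match in MARKED.finditer(plain):
        if match.start() > pos:
            runs.append((plain[pos:match.start()], None))
        runs.append((match.group(2), NAMES[match.group(1)]))
        pos = match.end()
    if pos < len(plain):
        runs.append((plain[pos:], None))
    return runs
```

The string returned by mark_runs plays the role of the text you paste through NotePad: it carries no formatting at all, yet restore_runs can still tell which words were bold or italic. The ValueError check mirrors the advice to pick a combination that appears nowhere else in the text.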

By the way, pasting through a plain-text editor is also useful in other situations where you want to copy formatted text from an outside application.

For instance, when you paste directly from a web page or other formatted source, hidden codes that are actually part of the page will be brought into your document and will produce messy, unwanted formatting. Using the technique above, you can bring in just the visible text then apply your own formatting.
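To make those ‘hidden codes’ concrete, the following Python sketch pulls only the visible text out of a fragment of web-page HTML, which is roughly what the plain-text paste does for you. It is an illustration under our own assumptions: the class name, the function name and the short list of tags treated as paragraph breaks are ours, and real pages can contain markup this sketch does not handle.

```python
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collect the text a reader sees, ignoring tags, styles and scripts."""

    SKIP = {"script", "style"}                      # contents are never shown
    BLOCK = {"p", "div", "h1", "h2", "h3", "li"}    # end of these starts a new line

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1
        elif tag == "br":
            self.parts.append("\n")

    def handle_endtag(self, tag):
        if tag in self.SKIP:
            if self._skip_depth:
                self._skip_depth -= 1
        elif tag in self.BLOCK:
            self.parts.append("\n")

    def handle_data(self, data):
        if not self._skip_depth:
            self.parts.append(data)


def visible_text(html):
    """Return the visible text of an HTML fragment, one block per line."""
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    lines = (" ".join(line.split()) for line in "".join(parser.parts).splitlines())
    return "\n".join(line for line in lines if line)
```

Everything inside the angle brackets, including inline colors, fonts and class names, is discarded. Only the words survive, ready for you to apply your own styles.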

Inserting images

Here’s how to get images into an ebook document in a way that will convert correctly. Later, we’ll go into more detail about how to create and enhance images for publication.

What to do: Make it an inline image

The best result comes from a centered inline image. Inline images sit between blocks of text, so the text does not wrap around them. This won’t look as neat as properly wrapped text but it will convert correctly and will work more predictably in e-readers.

How to center an inline image

1. Insert the image:

    • (In MS Word 2003 or 2007): select Insert > Picture > From File
    • (In Word 2010): select Insert > Picture

2. Set the image to inline:

    • (In MS Word 2003 or 2007): Right-click on the image, select Format Picture > Layout > In line with text
    • (In Word 2010): Click on the image, then select Picture Tools Format (on the top menu ribbon), then Wrap Text > In Line with Text

3. Select the image again and center it using the Align Center button.

What to avoid

There are two common errors which lead to problems when dealing with images for publication.

  1. Do not use copy and paste to place images into the document. Instead, you must insert (also referred to as ‘embed’) the images into the document. You do this with the Insert > Picture or the Insert > Picture > From File command.
  2. Do not use ‘floating images’, that is, images you can drag around with your mouse. This is usually the default when you insert an image. It’s also used to neatly flow text around an image. Newer formats including EPUB3 and KF8 will provide better support for floating images but in the meantime they’re best avoided.
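In a long manuscript it is easy to miss a floating image. As a rough check, the Python sketch below counts inline and floating drawings in a document’s XML. It assumes a .docx file (Word 2007 or later), where the text is stored in word/document.xml inside a zip archive, inline drawings are recorded as wp:inline elements and floating ones as wp:anchor elements. The function name is our own, and the count also includes other floating objects such as text boxes, so treat the result as a prompt to look more closely and not as a verdict.

```python
import xml.etree.ElementTree as ET

# Namespace of the elements that record how a drawing is placed on the page.
WP_NS = "http://schemas.openxmlformats.org/drawingml/2006/wordprocessingDrawing"


def count_image_placements(document_xml):
    """Count inline and floating drawings in the XML of word/document.xml."""
    root = ET.fromstring(document_xml)
    inline = sum(1 for _ in root.iter(f"{{{WP_NS}}}inline"))
    floating = sum(1 for _ in root.iter(f"{{{WP_NS}}}anchor"))
    return {"inline": inline, "floating": floating}
```

To run it on a real file, read the XML with Python’s zipfile module, for example zipfile.ZipFile("mybook.docx").read("word/document.xml") (the file name is hypothetical), and pass the result to the function. A non-zero ‘floating’ count means at least one object still needs to be set to inline.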

Creating and modifying images

If you have images that are correctly sized, cropped and modified for your ebook, you shouldn’t have to do any more than we’ve just described to add images to a document. However, it’s far more likely that the images themselves will need to be modified before they’ll work.

Later in this module, we’ll look at the most common image editing tasks and which image editing tools to use. We’ll also take a short detour and explain some of the basics of how digital images work, what the jargon you’ll encounter means, and how to optimize them for ebooks.

But for now, we’ll assume you have images and text ready to use and we’ll look at how you assemble them ready for conversion into an ebook.



