Quick Tip: Tag and Restore Italics in Word


You all know that the key to a good ebook format is a squeaky clean source file, right? Word doesn’t produce particularly clean documents. For best results, you should strip out extraneous codes before you begin to format. Mark Coker of Smashwords calls it the “Nuclear Option.” You copy/paste your document into a text editor and that will remove all the unwanted coding. Then you copy/paste the clean text back into Word and you are ready to format.

Anyone who has tried this knows that doing so will not only remove unwanted coding, it’ll nuke your italics, too (and other special formatting and styles). Here is an easy way to tag all your special formatting and then restore it. (What I will show you applies to bolding, underlining, different sized fonts, etc., too.)

Here is a document in need of a good cleaning:

[Screenshot: Tag]

Open the search box and make it look like this:

[Screenshot: Tag 1]

If you open the “Format” box you’ll see a drop down menu that gives you a “Font” option. Open that.

[Screenshot: Tag 3]

Notice the many, many options you can search for. Cool, huh?

I have come up with tags through trial and error. I use several different programs when I format ebooks, so I needed something unique for search purposes that didn’t make any of the programs say, “Oh no you don’t!” and crash the search box. I use all caps and hyphens to make sure they don’t get mixed up in the text. The most common tags I use are:

  • -STARTI- for italics
  • -STARTB- for bold
  • -STARTU- for underline
  • -END- to close the tag
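For anyone who thinks better in code, here is a rough Python sketch of what the tagging pass does. It is not how Word works internally. It assumes a made-up model where a paragraph is a list of (text, style) runs, and the names tag_runs, START_TAGS and END_TAG are mine. In Word itself, a Replace value such as -STARTI-^&-END- is the usual way to wrap found text (^& stands for whatever was matched), but the screenshots are not reproduced here, so treat that as an assumption about the exact setup.

```python
# Hypothetical model: a paragraph is a list of (text, style) runs, where style
# is "i", "b", "u", or None. This imitates wrapping each formatted run in
# tags; it does not read a real Word file.
START_TAGS = {"i": "-STARTI-", "b": "-STARTB-", "u": "-STARTU-"}
END_TAG = "-END-"


def tag_runs(runs):
    """Return plain text with each styled, non-empty run wrapped in its tags."""
    parts = []
    for text, style in runs:
        if style in START_TAGS and text:
            parts.append(START_TAGS[style] + text + END_TAG)
        else:
            parts.append(text)
    return "".join(parts)


runs = [("She said ", None), ("never", "i"), (" again.", None)]
tagged = tag_runs(runs)
print(tagged)
```

Once the text is tagged like this, it can go through a plain text editor and the information about what was italic survives the trip.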

Back to the document. Click Replace All.

[Screenshot: Tag 2]

Now all your italics are wrapped in tags. This is a good time to go through and make sure your tags are in the right place and that you don’t have any blank space tagged.
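If you have the tagged text in a plain text file, the "blank space tagged" check can be roughed out in Python. This is only a sketch: it assumes the -STARTI-/-STARTB-/-STARTU- and -END- tags listed above, and the function name find_suspect_tags is made up for the example.

```python
import re

# Matches an opening tag followed by the shortest span up to the next -END-.
TAGGED = re.compile(r"-START([IBU])-(.*?)-END-", re.DOTALL)


def find_suspect_tags(text):
    """Return tagged spans that are empty, whitespace-only, or padded with spaces."""
    suspects = []
    for match in TAGGED.finditer(text):
        inner = match.group(2)
        if not inner.strip() or inner != inner.strip():
            suspects.append(match.group(0))
    return suspects


sample = "A -STARTI- -END-word and -STARTI-fine-END- and -STARTB-pad -END-."
print(find_suspect_tags(sample))
```

Anything it reports is worth a look by eye. A tag that swallowed a trailing space is usually harmless, but a tag around nothing but a space is just clutter.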

Now copy/paste into a text editor:

[Screenshot: Tag 4]

All your formatting is gone.

Now open a new file in Word and apply your main style sheet. Copy/paste your text into the new file. Open the search box and make it look like this:

[Screenshot: Tag 5]

Do a Replace All and… ta da!

[Screenshot: Tag 6]

I generally wait until I’ve formatted all my headers and centering and any other styling necessary before I restore special formatting. Once done, all that’s left to do is to get rid of the tags.

[Screenshot: Tag 7]

Replace All and done!
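The same restore-then-remove idea can be sketched outside of Word. The Python below is an illustration, not the Word procedure: it assumes the tags from the list above, swaps them for html tags (handy if the file is headed for html rather than Word), and shows the final "get rid of the tags" pass. The names tags_to_html and strip_tags are invented for the example, and it does no html escaping of the text itself.

```python
import re

HTML_FOR = {"I": "i", "B": "b", "U": "u"}
TAGGED = re.compile(r"-START([IBU])-(.*?)-END-", re.DOTALL)


def tags_to_html(text):
    """Swap each -STARTx-...-END- pair for the matching html element."""
    def swap(match):
        element = HTML_FOR[match.group(1)]
        return f"<{element}>{match.group(2)}</{element}>"
    return TAGGED.sub(swap, text)


def strip_tags(text):
    """Delete every opening and closing tag, leaving the text in place."""
    return re.sub(r"-START[IBU]-|-END-", "", text)


line = "She said -STARTI-never-END- again."
print(tags_to_html(line))
print(strip_tags(line))
```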

In the time it took you to read this blog post, you could have tagged and restored six files. It really is that easy.



-SB-, -PB-, and Italics

So I’m chatting with a friend and she asks, “What you doing?” I sez, “Nuking tabs. Bwahaha! All gone.”

I was, of course, prepping a manuscript for ebook formatting. That means going through the manuscript and getting rid of everything that will screw up the ebook.

My learning processes are always convoluted: overly complicated in the beginning, and then, as I figure out what’s important and what is not, I streamline and pare down to the essentials. If you are a regular follower of this blog, you’ve seen my process regarding source files for writers. I’ve gone from suggesting writers set up and use style sheets (they should) to what I’m going to suggest today.

When creating a source file with the end goal of turning it into an ebook, all the writer needs to do, formatting-wise, is three things:

  • Indicate page breaks
  • Indicate scene or section breaks
  • Indicate italics, bolding and underlining

When it comes to page breaks, “indicate” means exactly that. Don’t actually break pages either with inserted page breaks or multiple paragraph returns. Why? Because when you’re ready to format, you or the person you hire has to take them out.

The more formatting you put into your source file, the more formatting has to be removed. The more that has to be removed, the greater the chance that something will be missed (screwing up the ebook) and the more it costs in time and money.

When I get a manuscript to format, it’s generally been created with a word processor. Whether I’m going to format it for Smashwords (a Word file) or for everything else (html files), the very first thing I have to do is–

  • Remove extra spaces
  • Remove extra paragraph returns
  • Remove page and section breaks
  • Remove headers, footers and page numbers
  • Tag page breaks
  • Tag scene or section breaks
  • Tag special formatting

I don’t actually have to remove tabs because those are going to disappear when I transfer the text to a text editor, but it’s easy (one Find/Replace operation) and it gives me a clearer picture of the dangerous stuff.
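For the curious, the first few items on that list (extra spaces, extra paragraph returns, tabs) can be roughed out in Python on a plain text copy of the manuscript. This is a sketch under assumptions, not my actual Word routine: it assumes tabs are only used for indents (deleting a tab that sits between two words would glue them together), and clean_manuscript is a made-up name.

```python
import re


def clean_manuscript(text):
    """Strip tabs, collapse repeated spaces, and drop blank lines."""
    # Normalize Windows and old Mac line endings to plain newlines.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # Nuke tabs (assumed to be paragraph indents only).
    text = text.replace("\t", "")
    # Collapse runs of two or more spaces into one.
    text = re.sub(r" {2,}", " ", text)
    # Trim each line and throw away the empty ones (extra paragraph returns).
    lines = [line.strip() for line in text.split("\n")]
    return "\n".join(line for line in lines if line)


messy = "\tFirst  line.  \n\n\n\tSecond line.\r\n"
print(clean_manuscript(messy))
```

Run something like this only after page breaks and scene breaks are tagged, because the blank lines it throws away may be the only thing marking a scene break.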

Now, seriously, I’m a writer. I fully understand the NEED to make the manuscript look RIGHT. But writers, you have to understand that every effort you make to that end is going to have to be undone. Because of the nature of word processors, some of the fancy touches you include can actually corrupt the ebook.

The less you do–Honestly! Truly! I’m not lying about this!–the better the ebook will be.

So what’s a poor writer to do? Not much, actually. Use whatever font and font size you like. That’s not what will end up in the ebook, but use whatever is comfortable for you while composing. Line space however you like. It makes no difference in the end. Get out of the habit of using tabs. If you can’t stand not having indented paragraphs, set up a simple style sheet that indents the paragraphs with every hard paragraph return. Get out of the habit of two spaces between sentences. Get out of the habit of adding extra hard paragraph returns to space the text. Get out of the habit of making pages. There are no pages in ebooks.

How does one indicate a page break?

I use a code. -PB- It’s unique, the dashes keep it from melding with text, and thus it is easy to find. What my clean file looks like before I take it to the text editor is this:

Final line in chapter one.
-PB-
Chapter Two
First line in chapter two and so it goes.

Use whatever makes sense to you. If you want to make extra sure you or your formatter don’t miss it, spell it out. -PAGE BREAK- That’s it. That’s all you have to do. When you do the actual formatting, that’s when you center, bold, add graphics, extra spacing, etc.

Scene breaks are another place you should get in the habit of tagging–especially if your habit is to use extra paragraph returns to make a blank line. Those can be easy to miss. My little code is -SB-. The text looks like

Last line of scene or section.
-SB-
First line of new scene or section.

It doesn’t matter much what you use as long as you use something. Asterisks, a pound sign, plus signs, or spell it out: -SCENE BREAK-. Use something so the scene break doesn’t get lost.
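Here is why unique tags pay off later. A short Python sketch can carve a tagged text file into chapters and scenes with no guesswork. It assumes each -PB- and -SB- sits on a line of its own, and the function name split_book is made up for the example. Swap in whatever tags you actually use.

```python
def split_book(text, page_tag="-PB-", scene_tag="-SB-"):
    """Split tagged text into chapters, each a list of scenes (lists of lines)."""
    chapters = [[[]]]  # start with one chapter holding one empty scene
    for line in text.splitlines():
        marker = line.strip()
        if marker == page_tag:
            chapters.append([[]])      # new chapter, new first scene
        elif marker == scene_tag:
            chapters[-1].append([])    # new scene in the current chapter
        elif marker:
            chapters[-1][-1].append(marker)
    return chapters


book = (
    "Final line in chapter one.\n"
    "-PB-\n"
    "Chapter Two\n"
    "First scene.\n"
    "-SB-\n"
    "Second scene."
)
chapters = split_book(book)
print(len(chapters), "chapters")
```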

As for special formatting–italics, bolding and underlining–at some point, whether you do the job yourself or hire it out, you are going to have to tag the special formatting. I’ve gotten into the habit, in my own writing, of tagging as I write rather than highlighting the text and italicizing (or whatever). It’s easier in the long run and I’m used to how it looks. Most writers are not going to want to do that. No biggie. If you are going to tag your special formatting, a few things I have learned–

  • If you’re going to format in html, you know to tag the special formatting with open/close codes: < i > and < /i >
  • If you are going to format your ebook in Word and are tagging for the purpose of stripping extra coding out of the document, do NOT use html tags. Using < i > TEXT or Wild Card < /i > in the Find box of a word processor can have… interesting results. Not the fun kind of interesting either.
  • For Word files I use -I- and -ENDI- to open and close italics. Easy to find and doesn’t give Find/Replace fits.
  • Make sure your special formatting is paragraph specific. In other words, don’t just highlight big blocks of text and toggle on italics. Highlight the necessary text within each paragraph, italicize it, then do the same in the next paragraph. Fewer chances for conversion programs to argue about what you really mean.
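The "paragraph specific" rule is easy to check by machine once the text is tagged. The Python sketch below assumes the -I- and -ENDI- tags mentioned above and one paragraph per line. The function name unbalanced_paragraphs is invented for the example. It reports the line numbers of paragraphs where a tag opens without closing, closes without opening, or opens twice in a row.

```python
import re


def unbalanced_paragraphs(text, open_tag="-I-", close_tag="-ENDI-"):
    """Return 1-based line numbers of paragraphs whose tags do not pair up."""
    pattern = re.compile(re.escape(close_tag) + "|" + re.escape(open_tag))
    problems = []
    for number, paragraph in enumerate(text.split("\n"), start=1):
        is_open = False
        balanced = True
        for match in pattern.finditer(paragraph):
            opening = match.group(0) == open_tag
            if opening == is_open:
                # Either an open tag while already open, or a close tag
                # with nothing open: the pairing is broken.
                balanced = False
                break
            is_open = opening
        if is_open or not balanced:
            problems.append(number)
    return problems


sample = "-I-Fine-ENDI- here.\n-I-Starts here\nand ends here.-ENDI-\nPlain."
print(unbalanced_paragraphs(sample))
```

In the sample, the italics that start in the second paragraph and end in the third are exactly the "big block" habit the last bullet warns about.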

That’s pretty much it. To get the best results in your ebook, no matter who does the formatting, copy the following, print it out, and tape it to your computer monitor as a reminder:

This is a FILE not a document. Less is more. Less is good. The less you do now, the less you have to do later.