Indie Writers: Make MS Word Work for You Instead of Against You

A Quick Primer for Fiction Writers on Using Microsoft Word in the Digital Age

It always saddens me a little when a writer sends me an overly formatted Word doc to turn into an ebook or print-on-demand. It’s not that I have to clean it up–I can strip and flip the messiest files in less than an hour. What bugs me is how much thought and effort the writer wasted on utterly useless manuscript styling.

Example of a Word doc that has been overstyled.

The majority of writers I work with use Word. The vast majority have no idea how to use Word for their own benefit. I understand. I was a fiction writer for over two decades and even though I have been using computers and a variety of word processing programs since the late ’80s, it wasn’t until I started learning book production that I figured out how those programs worked. Why would I? All I needed was a printed manuscript in standard format to mail to my editor. Word processors made that easy.

Now I produce books for digital and print, and those old ways of “thinking print” make the writer’s job harder. Especially indie writer/publishers who might be doing it all alone or working with contractor editors and proofreaders and formatters.

Since it would take a full book–or volumes–to explain how word processors work, I’m going to urge you all to take what I tell you in this post and play around in your word processor. I will be talking about MS Word, but much of what I show you will apply to almost any word processor.

STUFF YOU DON’T NEED AND NEED NEVER USE AGAIN

  • Tabs
  • Page breaks
  • Headers
  • Footers
  • Page Numbers
  • More than one space for any reason
  • More than two hard returns for any reason
  • Multiple fonts
  • Text boxes
  • Justification
Example of a manuscript that uses NONE of the above.

STUFF THAT MAKES WORD “WORK” FOR YOU

  • Style sheets (fiction writers can get away with using only two or three, four at the most)
  • Find/Replace
  • Save As
  • Web View
  • “Show” feature
  • Formatting tags
(Left) Basic manuscript formatting; (Right) Overly formatted manuscript.

See that backward P-looking icon I’ve circled? That’s the “show” feature. Toggle it on and you can see paragraph returns, spaces, tabs and a few other formatting features. With the basic formatting on the left, all I had to do was apply one style (Normal) to the entire manuscript, then apply heading styles to the chapters and sections, and done. To style an entire manuscript takes minutes this way. The manuscript on the right is an entirely different matter. To get it looking the way I want would take hours, if not days, manually lining everything up, trying to get it to look the way I want it. Worse, I have to remember what I’ve done so I can remain consistent throughout. When I’m done, I still have to scroll endlessly through the entire document to find whatever I might need to find.

And what about what is happening behind the scenes? MS Word uses html to control all those features. If you’re printing a document, the only true concern you have is making sure your fonts print properly. If you’re turning your work into an ebook, all that hard work (and useless effort) works against you.

The html in the basic Word doc and how it displays in Firefox.

The overly formatted file in html and how it displays in Firefox.

So let’s make Word work for you. The NUMBER ONE thing (print it out and blow it up to poster size and post it where you can see it while you work) is:

IT DOESN’T MATTER A RAT’S PATOOT WHAT YOUR WORKING DOCUMENT/SOURCE FILE LOOKS LIKE

(Seriously, if your Happy Place while composing fiction involves Comic Sans font, 22pts, with 2 inch margins, triple spaced, then go for it. The only time it matters what your document looks like is when you intend to print.)

STYLE SHEETS

Set ’em and forget ’em; the best tool in MS Word

Every version of Word has a style sheets feature. If you’re using 2010, you’ll find them in the “Home” toolbar. Word comes with a huge variety of pre-built style sheets. You can use them as-is or modify them. You can create your own style sheets. The most useful styles for the fiction writer are: Normal, Heading 1, Heading 2.

  • Normal: apply to the body of your text. Set your paragraph indents, line spacing, and font. Never worry about spacing, margins and indents again.
  • Heading 1 & 2: apply to titles, chapter heads or sections. Bonus: Word will automatically list your headings in the navigation window. No more scrolling through a long document to find a specific chapter or section. Another Bonus: Ebook conversion programs recognize heading styles. Some, like Calibre, will automatically build a table of contents for you based on headings 1 & 2.

Additional styles fiction writers might find useful:

  • Emphasis: Remember, styles apply to paragraphs. “Emphasis” is italics. If your entire paragraph is italicized, use “emphasis”.
  • Strong: “Strong” is bold.
  • Custom style–“Center”: Instead of clicking on the icon for centering, create a style sheet. Makes life easy.
  • Poetry: For poetry, quotes, lyrics, anything you want with different margins and font style.

FIND/REPLACE

This is the most useful and the most underused tool in MS Word. You can use it to not only find words, you can find special characters, styles, highlighting, and special formatting (such as italics or bold).

Click on the dropdown menus and you can look for anything that appears.

A few useful search terms:

  • ^& (caret ampersand): Stands for a string of text. Say I want to tag my italics. I would leave the Find box blank, but ask it to search for italics. In the Replace box I’d type -STARTI-^&-ENDI-, do a Replace All and Word will wrap all my italicized text in tags.
  • ^p : Hard return. You can search for them or insert them
  • ^l  (caret lower case L): Soft return (shift enter)
  • ^t : Tab. Working on a document in which you or someone else used tabs and want to kill them all? Type ^t in the Find box, leave the Replace box blank, and do a Replace all. Done.
  • * (asterisk): A string of text. Use as a ‘wild card’ when you’re restoring your special formatting. Say I want to restore my italics. In the Find box type -STARTI-*-ENDI-, click the ‘wild card’ box, and leave the Replace box blank but ask it to replace text with italics. Do a Replace All and all your tagged text is italicized. Then use Find/Replace to get rid of the tags.

SAVE AS

When I’m working on a project, I might have four, five, ten versions of a file. If I’m making major formatting changes, I NEVER EVER mess with my source file. Let’s say I want a printed version. I do a Save As to make a new version that is named Print_Docname_date. Then I apply headers/footers, page numbers, page breaks and modify my styles to make it suitable for printing. My original source file remains unchanged and ready to use. Using Save As is the best habit you can get into while you’re working. (And it’s not like you’re having to save your work to floppy disks–your computer has lots of space. Use it!)

WEB VIEW

Forsake print view and get used to web view while you work. This view is flexible (flow text) and enables you to easily display multiple screens and compare text while you work. You can adjust the width of your screen, too, and not lose chunks of text or reduce the image size in order to see everything.

FORMATTING TAGS

Because I use a variety of programs, and I intensely dislike losing formatting such as italics or trying to remember where I want a block of offset text, I tag my formatting. Now, because Word is html-based, you do NOT want to use html tags in your text. It’s okay if you’re outputting a file to a text editor, but if you’re going to a program that is html-based such as Scrivener or InDesign, or if you intend to bring the text back (you’re ‘nuking’ it, according to the Smashwords style guide), then those html tags are going to seriously mess things up.

My tags are arbitrary. I’ve come up with them because they are unique and easy to search for; they don’t show up in text (normally). Feel free to use mine if you want or come up with something that makes sense to you. IMPORTANT TO REMEMBER: Special formatting such as italics or bolding requires OPEN and CLOSE tags.

  • Italics: -STARTI- (open) -ENDI- (close)
  • Bold: -STARTB- -ENDB-
  • Underline: -STARTU- -ENDU-
  • For any special formatting such as headlines, poetry, etc: -SPECIAL- (this tag is a note to myself)
  • Placing Images: -IMAGE-
  • Scenebreaks or deliberate blank lines: ##

That’s it. Simple, no? This is MS Word in the digital age, a writing tool you can make work for you instead of against you.

 

Managing File Sizes for Ebooks

The majority of fiction writer/publishers will not run into overall file size problems. Text doesn’t create monster files. Using graphics or illustrations can add significantly to the overall file size, but I’ve yet to create an ebook that exceeds–or even comes close to–Amazon’s 50MB limit (which may be changing due to the introduction of the new Fire HD tablets). Even with illustrations and graphics, I do my best to keep the overall file size under 5MB because of Amazon’s delivery fees ($.15 per MB). Those fees are charged against the publisher and can eat up royalties quickly.
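To put numbers on that, here is a minimal Python sketch of the arithmetic. It assumes a flat fee of $0.15 per megabyte charged on each sale, which is the figure quoted above; the function name is made up for illustration, and the real fee schedule may differ by marketplace or royalty plan.

```python
from decimal import Decimal

# Assumption: a flat $0.15 per MB, the figure quoted in this post.
FEE_PER_MB = Decimal("0.15")


def delivery_fee(file_size_mb):
    """Return the per-sale delivery fee, in dollars, for a file of this size."""
    return Decimal(str(file_size_mb)) * FEE_PER_MB


# Compare a text-only novel, a lightly illustrated book, and a heavy one.
for size in (1, 5, 20):
    print(f"{size} MB -> ${delivery_fee(size):.2f} per sale")
```

Decimal is used instead of float so the cents come out exact.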

As I said, most fiction writer/publishers will not run into problems with overall file size.

Where fiction writer/publishers do run into problems is with the size of individual chapter files within the ebook. When you use <h1> or <h2> tags in html, or the Heading 1 or Heading 2 style in a word processor, you are alerting the conversion programs (such as Calibre or KindleGen) that this is a new chapter and should be split into a new file.* If you don’t use the headings or tags, the conversion programs look for certain words–Chapter, Part, Section, etc.–to determine where the file should be split. What is NOT reliable at all is using page breaks (in a word processor) or the “page-break-before” command in html/CSS. (I have absolutely no idea why those work sometimes, but sometimes they don’t–my best guess is the whims or moods of the Digital God.)
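If you want to see roughly where a heading-based split would fall in your own html, a few lines of Python will do it. This is only a sketch of the idea, not how Calibre or KindleGen actually work inside; the function name is hypothetical and it looks at nothing but <h1> and <h2> tags.

```python
import re


def split_at_headings(html):
    """Split an html string into chunks, each starting at an <h1> or <h2> tag."""
    # Zero-width lookahead: the heading tag stays with the chunk it opens.
    parts = re.split(r"(?=<h[12][\s>])", html, flags=re.IGNORECASE)
    # Drop the empty piece produced when the text begins with a heading.
    return [part for part in parts if part.strip()]


sample = "<h1>Chapter One</h1><p>Text.</p><h2>Part Two</h2><p>More text.</p>"
for chunk in split_at_headings(sample):
    print(len(chunk), chunk[:30])
```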

I always split html (text) files into chapters or parts, which manages the overall ebook very nicely. Even though this example is from a novel (Prophet of Paradise by J. Harris Anderson) that is almost 200,000 words long, notice the size of the individual chapters:

File Size

What happens if you don’t use tags or headings and your chapters have titles the conversion programs don’t recognize? What happens if you don’t have chapters at all and your ebook is deliberately one long tract? If it runs up against the 300KB file size limit (approximately 45,000 words), several things could happen:

  • Your file fails to convert
  • The conversion program inserts page breaks whether they are appropriate or not
  • The file converts, but some devices tell the user the ebook can’t be loaded

If your files are less than 300KB, but still largish (over 150KB) your readers could experience serious screen lag as they page through your story. This is an important consideration for genre fiction writers since the chances are your readers are Super-Readers and might have hundreds or even thousands of ebooks loaded on their devices. They will not be happy if your file sizes and their addiction cause several seconds of lag every time they “turn” the page.
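A quick way to check a chapter against those two numbers is sketched below in Python. The 300KB and 150KB thresholds come straight from this post; treating 1KB as 1,024 bytes and measuring the text as UTF-8 are my assumptions, and the function name is invented for the example.

```python
# Thresholds from this post; treat them as rules of thumb.
HARD_LIMIT = 300 * 1024  # bytes; at or above this a chunk may fail outright
LAG_LIMIT = 150 * 1024   # bytes; above this readers may see page-turn lag


def check_chunk(text):
    """Classify one chapter's text by its size when encoded as UTF-8."""
    size = len(text.encode("utf-8"))
    if size >= HARD_LIMIT:
        return "too big"
    if size > LAG_LIMIT:
        return "may lag"
    return "ok"


print(check_chunk("A short chapter."))
```

Run it over each chapter file and restructure anything that does not come back "ok".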

What to do?

  • If you are using a word processor to style your ebooks, use the Heading 1 and Heading 2 styles for your chapters, parts and sections. (Do NOT depend on the conversion programs to recognize your inserted page breaks!)
  • If you are styling in html, use the <h1> and <h2> tags.
  • If your project does not have natural breaks such as chapters or parts (it’s a long short story or novella), consider a minor restructure. Use the word count as your guide and try to find natural breaks around the 15,000 word mark–a scene break or time or pov shift or even an illustration that sits on its own “page”.

* If you are using Calibre to convert your ebooks, you can check the file splits in Calibre’s EPUB editor. You’ll see the list of individual text/html files and can open each one on the viewer/edit screen. If you are experiencing inappropriate page breaks, you can manage the fixes in the editor.

 

 

Why You Shouldn’t Format Your Word Docs

There’s a reason my ebooks are superior–two reasons, actually–and neither has anything to do with my technical prowess (I don’t have much) or talent (anyone can do what I’m about to tell you).

Reason Number One: Pre-production, I clean the text. As soon as a document comes up in the queue, I open it and start stripping it of everything that can mess up an ebook: extraneous paragraph returns, extra spaces, and tabs. I tidy up punctuation, tag areas that require special coding, neaten italics and check for special characters that won’t translate. As a writer and editor myself, I know most of the writer tricks and have a rather lengthy list of things to look for. By the time I’m ready to start coding, the text is so clean it squeaks.

Reason Number Two: Post-production, the ebook is proofread. I don’t care who proofreads the ebook. I can do it, the writer can do it, the writer can hire the job out to someone else. I give the writer a proof copy of the ebook and a mark-up document and encourage them to be as picky as they can stand. Even if they hire me to proofread, they still get the proof copy to load on a device or their computer so they can check the formatting and layout. The point is to find mistakes before the readers do. The point is to make sure the ebook works properly.

I am shocked and appalled that every single person who produces ebooks doesn’t do the exact same thing. They don’t and I know they don’t because I read ebooks that are filled with the types of errors and hiccups that text cleaning and proofreading would have rooted out.

The trad pubs are actually worse offenders than are indies, especially when it comes to back list. I can see it with my own eyes, but it’s amusing to see a publisher admit it publicly on The Passive Voice blog:

J.A. Our experience with Kindle is that as soon as a customer complains they take down the file and send the publisher a takedown notice. It’s actually a real pain in the neck. It could be one person complained and something very minor. We get them occasionally and we fix them right away. They give the reader a credit for the download. I should add that when files are converted they generally aren’t checked page for page like a print book might normally be. We rely on the conversion house to do a good job. If we keep catching errors or getting complaints we would change vendors. We pay pretty good money for these conversions. Our books are almost all straight text so conversions aren’t generally a major issue, but books with columns or charts, or unusual layouts do cause problems and need to be checked carefully. –Steven Zacharius, CEO, Kensington Books

Emphasis mine.

Having personally cleaned up well over a million words of scanned and OCR’d text, that statement offends the shit out of me. Writers deserve better. Readers deserve better.

So what’s that got to do with formatting Word docs? Everything.

If you’re a Do-It-Yourselfer, and are formatting your own ebooks, you cannot skip these steps. (On a sidenote, my biggest gripe with Smashwords is how difficult they make it to proofread an ebook. An upload has to go through the whole publishing process before you can look at it live on a device. Depending on how fast you are at proofreading, the ebook can be live–all goofs intact–for weeks before you can fix them and go through the process again.) My suggestion for the indie formatting Word docs for Smashwords (or any other distributor who accepts Word docs) is to convert them first with a program like Calibre and proofread the results. Find and fix problems before uploading the Word doc to Smashwords.

If you’re hiring a formatter, find out first if they clean up your file pre-production. Many do not. If that’s the case, you need to do the cleaning. Some pros charge by the hour to clean up the Word doc. The more elaborately you’ve formatted your document, the longer it will take to clean it up and the more expensive it will be. (Not to mention wasting your own time on needless work.) My suggestion, if you have special requirements, arrange for a system of tags to let the formatter know what you want. I ask writers to put instructions inside square brackets, i.e. [HEADLINE, PUT IN SMALL CAPS, CENTERED, EXTRA SPACE ABOVE AND BELOW].

Find out, too, the professional’s policy on proofreading. Do you get a proof copy? Does the formatter charge extra to input changes and corrections? (I charge for actual proofreading, but I don’t charge to input changes and corrections from somebody else’s proofread.) If you are not allowed to make post-production changes to your ebook, find another service. Trust me, no matter how well edited, cleaned and formatted the file is going in, you will find something to fix while proofreading. (Gremlins!)

So, for you writers working in Word, one final suggestion: Post the following where you can see it while you work and keep repeating it until it sinks in:

What I see on the computer screen is NOT how my text will look, or act, in an ebook.

Word to Calibre to MOBI: Part 2: The html File

You finished Part 1 of this tutorial. Now on to Part 2. If you’re not familiar with html, what happens next is going to be freaky. But trust me, if you can copy/paste, you can do this.

NOTE: If your ebook is as simple as the one I’m using as an example, with no images and limited styles, you can stop right now and directly upload your Word file to Amazon. It will convert just fine and work well.

STEP 1: Do a Save As of your styled .doc file as an html file. It will look something like this:

Now you are done with Word.

STEP 2: Open your html file in Notepad++

Holy Moley! This is what it looks like?!?

STEP 3: Turn your special formatting tags into proper html tags

  • Italics <i> </i>
  • Bold <b> </b>
  • Underline <u> </u>

Easy to do with Find/Replace in Notepad++.

Very important. ALL tags that are open must be closed. So if you have <i> for italics, then you must have </i> to close the tag. So use Find/Replace and make sure your numbers match up (Notepad++ will tell you how many items it replaced).
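The same swap, with the count check built in, can be sketched in a few lines of Python. The tag names are the ones used in this tutorial; the function names are hypothetical, and the check only compares totals, so it will not catch tags that are balanced in number but out of order.

```python
# Placeholder tags from this tutorial mapped to their html equivalents.
TAG_MAP = {
    "-STARTI-": "<i>", "-ENDI-": "</i>",
    "-STARTB-": "<b>", "-ENDB-": "</b>",
    "-STARTU-": "<u>", "-ENDU-": "</u>",
}
PAIRS = [("-STARTI-", "-ENDI-"), ("-STARTB-", "-ENDB-"), ("-STARTU-", "-ENDU-")]


def tags_to_html(text):
    """Replace every placeholder tag; also return how many of each were found."""
    counts = {tag: text.count(tag) for tag in TAG_MAP}
    for tag, html_tag in TAG_MAP.items():
        text = text.replace(tag, html_tag)
    return text, counts


def unbalanced(counts):
    """List the open/close pairs whose totals do not match."""
    return [(o, c) for o, c in PAIRS if counts[o] != counts[c]]


converted, counts = tags_to_html("He -STARTI-really-ENDI- meant it.")
print(converted)
print(unbalanced(counts))
```

An empty list from unbalanced() is the scripted version of "the numbers match up."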

STEP 4 (Optional): Get rid of soft returns. Word has a nasty habit of inserting soft returns at the end of lines in paragraphs. In theory, they are meaningless. If you leave them in, they won’t affect your ebook very much. I have noticed, however, that they cause a wobbly quality to the justified text and some unusual behavior in line spacing. Not enough to affect reading quality, but enough to bug hyper-sensitive readers (like me). I prefer to remove them. If they bug you, too, let me know and I’ll show you how to use Find/Replace in Notepad++  to quickly remove them.

STEP 5: Get rid of the Section junk. If you styled your document the same way I did, you will have two lines of code–one at the beginning that says something like <div class=Section1> and a closing tag at the end of the document, </div>. They are extraneous. Delete them.
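For those who prefer a script to hand deletion, here is a rough Python sketch. It assumes there is exactly one Section wrapper and that the last </div> in the file is its closing tag; the "WordSection" spelling is an assumption about what other Word versions write, and the function name is made up. Check the result by eye before moving on.

```python
import re

# Matches <div class=Section1>, with or without quotes, any section number.
# Assumption: some Word versions write "WordSection1" instead.
SECTION_OPEN = re.compile(r'<div\s+class="?(?:Word)?Section\d+"?\s*>', re.IGNORECASE)


def remove_section_div(html):
    """Remove the first Section <div> and the last </div> in the file."""
    html, found = SECTION_OPEN.subn("", html, count=1)
    if found:
        # Assumption: the final </div> is the one that closes the Section div.
        head, sep, tail = html.rpartition("</div>")
        if sep:
            html = head + tail
    return html


print(remove_section_div("<body><div class=Section1><p>Hi</p></div></body>"))
```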

(By the way, if your Notepad++ file doesn’t look the same as mine, it’s because I have turned off word wrap and eliminated the extra soft returns.)

STEP 6: Extract your styles. In my example there are three: MsoNormal, Center, and h1. Select them, copy them and paste them into a new text file.

This is what they look like. Comments in parentheses are mine.

h1
{mso-style-next:Normal; (Word junk, delete)
margin-top:48.0pt; (We are going to change this)
margin-right:0in;
margin-bottom:48.0pt;
margin-left:0in;
text-align:center;
page-break-before:always;
mso-pagination:none; (Word junk, delete)
mso-outline-level:1; (Word junk, Delete)
font-size:14.0pt; (We are going to change this)
mso-bidi-font-size:16.0pt; (Word junk, delete)
font-family:"Times New Roman"; (Delete)
mso-bidi-font-family:Arial; (Delete)
mso-font-kerning:0pt;} (Delete)

p.MsoNormal, li.MsoNormal, div.MsoNormal
{mso-style-parent:""; (Junk, Delete)
margin:0in; (Delete)
margin-bottom:.0001pt; (Delete)
text-indent:.3in; (Change)
mso-pagination:none; (Delete)
font-size:12.0pt; (Delete)
font-family:"Times New Roman"; (Delete)
mso-fareast-font-family:"Times New Roman";} (Delete)

p.Center, li.Center, div.Center
{mso-style-name:Center; (Delete)
margin-top:6.0pt; (Change)
margin-right:0in;
margin-bottom:6.0pt;
margin-left:0in;
text-align:center;
mso-pagination:none; (Delete)
font-size:12.0pt; (Delete)
font-family:"Times New Roman"; (Delete)
mso-fareast-font-family:"Times New Roman";} (Delete)

STEP 7: Modify the styles. The coding in an ebook is actually quite simple. The major bits for your css stylesheet are as follows and most are self-explanatory:

  • margin /This is the margin for each paragraph block. This controls the top, bottom, right and left
  • text-indent /This is for paragraph indents
  • font-size /Kindle books render in either “ems” or percentages. Converters do their best to recognize points (pts) and inches, but results are iffy. That is why we’re going to change them.
  • font-style /For italics
  • font-weight /For bold

We are going to keep this very, very simple. Because there will be some coding for the body text, you don’t need much in these paragraph styles. Basically, we will whittle and adjust so they look like this (feel free to copy/paste these):

p.MsoNormal
{text-indent: 1.4em;}

h1
{margin: 2em 0;
text-indent: 0;
text-align:center;
page-break-before:always;
font-size: 1.4em;
font-weight: bold;}

p.Center
{margin: 0.5em 0;
text-indent: 0;
text-align:center;}
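The "Word junk" half of that whittling is mechanical enough to script. The Python sketch below drops every declaration that starts with mso- from the inside of one rule; it is a partial helper under a hypothetical name, it does not touch the fonts, points or inches you still have to change by hand, and it assumes no semicolons hide inside quoted values.

```python
def strip_word_junk(declarations):
    """Drop Word-only (mso-*) declarations from the body of one CSS rule.

    `declarations` is the text between the braces, not the whole rule.
    """
    kept = []
    for decl in declarations.split(";"):
        decl = decl.strip()
        # Keep anything that is not empty and not an mso-* property.
        if decl and not decl.lower().startswith("mso-"):
            kept.append(decl)
    return "; ".join(kept) + (";" if kept else "")


rule_body = "mso-style-next:Normal; margin-top:48.0pt; text-align:center; mso-pagination:none;"
print(strip_word_junk(rule_body))
```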

If you want to play with the styling, go to the w3schools website. To know what works in a Kindle book, you can look at their “approved” list (which often seems to change on a whim).

STEP 8: Replace the header. Copy the text that follows:

<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd" >
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" >
<head>
<meta http-equiv="Content-Type" content="application/xhtml+xml; charset=utf-8" />
<title>BOOK TITLE</title>
<style>
/*===Reset===*/
html, body, div, applet, object, iframe, h1, h2, h3, h4, h5, h6, p, blockquote, pre, acronym, address, code, del, dfn, img, ins, kbd, s, samp, small, strike, strong, sub, sup, tt, var, center, fieldset, form, label, legend, table, caption, tbody, tfoot, thead, article, aside, canvas, details, embed, figure, figcaption, footer, header, hgroup, menu, nav, output, ruby, section, summary, time, mark, audio, video
{margin: 0; padding: 0; border: 0; font-size: 100%; vertical-align: baseline;}
body {text-align: justify; line-height: 120%;}

/* Insert your paragraph styles here */
</style>
</head>
<body>

Paste it in your file as follows:

New header and styles pasted in:

STEP 9: In the menu bar in Notepad++ find Encoding and click it. In the drop down menu it will say: Convert to UTF-8 without BOM. Click that.
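If you ever need to do the same conversion outside Notepad++, the idea looks like this in Python. It is a sketch only: the function name is invented, and the default source encoding of cp1252 (Windows-1252) is an assumption about how Word saved your html file. Check the charset declared in your own file and pass that instead if it differs.

```python
import codecs


def to_utf8_no_bom(raw, source_encoding="cp1252"):
    """Decode raw file bytes and re-encode them as UTF-8 with no byte order mark."""
    if raw.startswith(codecs.BOM_UTF8):
        # Already UTF-8 with a BOM: strip the three marker bytes and decode.
        text = raw[len(codecs.BOM_UTF8):].decode("utf-8")
    else:
        # Assumption: anything else is in the encoding the caller names.
        text = raw.decode(source_encoding)
    return text.encode("utf-8")


# Curly quotes are the characters most likely to break if the encoding is wrong.
word_bytes = "“Hello,” she said.".encode("cp1252")
print(to_utf8_no_bom(word_bytes).decode("utf-8"))
```

Read your file in binary mode, pass the bytes through the function, and write the result back in binary mode.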

To see your styling live, in the menu bar you will see “Run.” Click it and in the drop down menu choose “Launch in (whatever browser you use)” Here is mine in Firefox:

See, that wasn’t so hard, was it? Now you have a serviceable html file you can convert into an ebook. BUT, your job isn’t done quite yet. In Part 3 I’ll show you how to convert your file into a MOBI file that works.

_________________________________________

Styling ebooks isn’t difficult. Armed with only a few lines of code, you can create beautiful ebooks and some very interesting text effects. If anyone is having trouble getting their styles just right, feel free to email me–jayewmanus at gmail dot com–and I can probably come up with just the paragraph style you need.

 

 

Word to Calibre to MOBI: Part 1: Styling in Word

So, I’ve been obsessanating–again. In my last post I promised that there was a way to convert Word files in Calibre into ebooks that work perfectly on Kindles. That is true. It can be done. I was looking for a quick and dirty hack that worked every time. That is not possible.

Here’s the real problem. You got your indie writer who has put her heart and soul into writing her story. She’s not technical. She’s not a computer geek. She just wants readers to find and love her stories. Problem: How to get the story from Word onto a reader’s Kindle? Enter Calibre. Just save your Word file as an html file, load it into Calibre, convert it into a mobi file and upload it to Amazon. Done!

The problem with that? Calibre mobi files don’t quite work right when uploaded to Amazon. Period. They can work, at best, almost right. For the writer who’s eager to get back to writing her next story, that’s good enough.

As a reader, that attitude pisses me off. I buy and read a lot of ebooks. It pisses me off when the user preference controls don’t work. It pisses me off when I can’t navigate an ebook. (It’s not just indie publishers, folks. I get pissed off by the Big Pubs who can’t bother proofreading the ebooks and by the nastiness that turns up in ebooks built with InDesign, and don’t even get me started on the crap that happens when they turn scanned backlist books into ebooks.) A poorly produced ebook is equivalent to a writer using a mimeograph and newsprint, stapling the pages together and saying, “Here you go. That’ll be five bucks.” I’m insulted.

As an ebook producer, I get it. Amazon doesn’t make it easy. It’s next to impossible to break open a mobi file to tinker around in the code and fine tune it. Plus, as I explained before, Amazon has… quirks. They build their devices, then create the platforms, then play catch up with updates to older models, and it’s not easy keeping up.

NOTE: The last time I bitched about Calibre being the wrong tool, Calibre’s creator informed me that the “line-squish” problem could be solved by converting the ebooks into azw3. That works. Except… I didn’t explore far enough. Amazon rejects azw3 files, so they are useless for distribution through Amazon.

The easiest thing a writer can do to ensure having a perfect ebook to sell on Amazon is to hire someone who knows what they are doing. For any number of reasons, that isn’t always realistic. I’m a realist. Hence, this series of posts that will take you step-by-step through the process of turning a Word file into a commercial-quality ebook to sell on Amazon. The beauty of this is, you don’t really need to understand html or how ebooks work or anything technical at all. All you have to know is how to Copy/Paste.

Before you begin, you will need four–FOUR!–programs on your computer.

Microsoft Word
Notepad++
Calibre
Kindle Previewer

I assume since you are using Word, you have Word. The other three are freeware. A note about Word. You do not want to do this with .docx files. You want .doc files. Older versions of Word actually work a lot better for making ebooks than do later versions of Word.

Ready? Let’s begin.

PART 1: STYLING IN WORD

STEP 1: Do a Save As so your original stays intact.

STEP 2: Tag your special formatting (italics, bolding, underlining). A word about “special formatting.” This only applies to words or passages that are italicized, bolded or underlined in the body text. Such things as headers and sub-heads will be dealt with later.

I use a simple tagging system for special formatting.

  • Italics: -STARTI- -ENDI-
  • Bold: -STARTB- -ENDB-
  • Underline: -STARTU- -ENDU-

STEP 3: Turn “manuscript” punctuation into “printer” punctuation.

  • “Curly” or “Smart” quotes, not straight quotes (and apostrophes). Do make sure your quote marks and apostrophes are turned in the proper direction–Word has a bad habit of reversing them.
  • Proper em dashes, not two hyphens or en dashes or spaced hyphens
  • Proper ellipses

STEP 4: Kill “soft” returns and tabs, and eliminate extra spaces

  • To turn “soft” returns into hard returns: In Find/Replace search for ^l (that’s a caret mark and lower case L) and replace with ^p (caret mark and lower case P)
  • To get rid of tabs: In Find/Replace, search for ^t (caret and lower case T) and replace with nothing
  • Don’t forget to get rid of extra spaces before and after paragraphs
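Once the text is out of Word, the same three cleanups can be sketched in Python. One loud assumption: a Word soft return is taken to arrive in plain text as the vertical tab character (\x0b). If your export uses something else, pass it in. The function name is made up for the example.

```python
import re


def clean_text(text, soft_return="\x0b"):
    """Apply the STEP 4 cleanups to plain text copied out of Word."""
    text = text.replace(soft_return, "\n")  # soft returns become hard returns
    text = text.replace("\t", "")           # kill tabs
    text = re.sub(r" {2,}", " ", text)      # collapse runs of spaces to one
    # Trim stray spaces before and after each paragraph.
    return "\n".join(line.strip(" ") for line in text.split("\n"))


print(clean_text("\tIt was a dark  and stormy night. \x0bThe rain fell. "))
```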

STEP 5: Select all, copy and paste entire file into Notepad++

Yes, that is what it looks like. That’s what it is supposed to look like. This is a straight text file.

STEP 6: Finish cleaning up the file

  • Delete blank lines
  • Tag scene breaks (I use ## because it is easy to find)
  • Search for and clean up special formatting tags. Word is very sloppy and you’ll find tags around empty spaces and jumping paragraphs and other untidiness.
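Two of those chores, deleting blank lines and removing tag pairs that wrap nothing but empty space, are easy to sketch in Python. The tag names are the ones from this tutorial; the function name is hypothetical, and scene-break tagging and tags that straddle real text across paragraphs still need a human eye.

```python
import re

# A START/END pair of the same kind (I, B or U) with only whitespace between.
EMPTY_PAIR = re.compile(r"-START([IBU])-(\s*)-END\1-")


def tidy(text):
    """Remove empty tag pairs, then delete blank lines."""
    # A pair around nothing formats nothing; keep only the whitespace it held.
    text = EMPTY_PAIR.sub(r"\2", text)
    # Drop lines that are empty or contain only whitespace.
    return "\n".join(line for line in text.split("\n") if line.strip())


print(tidy("She said-STARTI- -ENDI-nothing.\n\n   \nHe left."))
```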

STEP 7: Back in Word, open a New Document and set your Styles (I am going by the assumption that you know how to use style sheets in Word.) For the purposes of this tutorial, I used three styles for my ebook:

  • Normal (built in style in Word, modify as you wish)
  • Heading 1 (built in, also modified)
  • Center (user-defined style)

It doesn’t matter much what font you choose. Times New Roman is fine.

This will be used for your chapter heads. Again, font doesn’t matter much.

STEP 8: Apply the “Normal” style to the new document. Select all and copy the text file in Notepad++ and paste the entire document into Word.

STEP 9: Style the document.

  • Apply the Heading 1 style to all chapter/story headings
  • Apply the Center style to any text you want centered (in this case, I applied it to the scene break indicators, THE END and table of contents entries)

STEP 10: Bookmark all your Heading 1 entries. (Word automatically bookmarks Heading entries, but those will not transfer over, so you need to insert bookmarks manually.)

STEP 11: Link your bookmarks in the table of contents

That’s it for Part 1. Your document is now clean and styled and ready for Part 2: turning your .doc file into a proper html file.

_____________________________________

A word about styles. Like I said, for this tutorial I am using only three styles. You can use all sorts of styles to create visually pleasing ebooks–just remember one very important thing: Word is a program whose main purpose is to create print documents. What you see on the screen is pretty much what you will get on a sheet of paper, but it is not at all what you would get in an ebook. I suspect after you finish this full tutorial you will have a better understanding of how ebooks work and how Word works, and you will understand why it is so important to use style sheets religiously.

A word about questions. I know you have them. Let’s make them useful for everybody. If you have a question about this tutorial, especially if it is a “How do I do this…?” type of question, email it to me at

jayewmanus at gmail dot com

I’ll put together a post with questions and answers.

 

 

 

 

Calibre, Word and MOBI: A Tale of Three Programs

(Yes, I know, MOBI is not a program, but my blog, my headlines…)

Ever since I started blogging about ebooks, I’ve cautioned people against using Microsoft Word to format their ebooks. Not because Word is a bad program and not because it’s impossible to create ebooks with it. It’s because it’s not quite the right tool. Word’s strength lies in creating print documents or pdfs.

Recently, I’ve been cautioning people to not use Calibre to convert their Word files into MOBI files in order to sell them on Amazon. Not because Calibre is a bad program and not because it’s impossible to create MOBI files with it. It’s because it’s not quite the right tool. Calibre’s strength lies in managing a person’s digital library. It was not created to convert commercial ebook files.

EPUB files are not as troublesome as MOBI files. EPUB is EPUB is EPUB, and while each device has its own special way of rendering the file to fit the platform, the differences between devices aren’t big enough for most people to notice. A single EPUB file will work pretty much the same on a Nook as it does on an iPad.

Calibre is set up for optimum use with EPUB files. If a publisher converts a Word (html) file into an EPUB file using Calibre, then what they see there is pretty close to what a Nook or iPad reader will see.

This is not true with MOBI files. The reason is Amazon. You see, EPUB devices have evolved and changed and upgraded and gone the way all technology goes, ever upward and onward. But the device makers built the newer devices around the existing ebook platform. So an EPUB ebook formatted five years ago will work pretty much the same on a new iPad as it did on a first generation Nook. Amazon went bass-ackwards. They built the new devices then tinkered and recreated entirely new ebook platforms to fit the new devices. So a MOBI file being sold on Amazon isn’t just a MOBI file. It’s also a KF8 file and an iOS file and an AZW3 file and god knows what else is there. I don’t quite get all the technical stuff. What I do get is that the same ebook can work fine on a Kindle Fire, but go to hell on a Paperwhite and look okay on a Kindle Keyboard and turn into gibberish if an iPad user gets hold of it.

The whys and wherefores don’t matter as much as the fact that a file formatted in a program which is optimal for printing documents and then converted with a program that is at its best with EPUB files, is going to have trouble meeting the very odd demands of Kindles.

(By the way, if you are using Scrivener or InDesign to create your ebooks for sale on Amazon, you will run into the same exact problems because Amazon is constantly tweaking and fiddling with the platform(s) and updating devices and they don’t necessarily share what they’ve done with the rest of the world.)

I realize that none of what I just wrote is going to dissuade people from using Calibre to convert their Word docs into MOBI files to sell on Amazon. I know this because people are using Word because that’s the program they know and love(hate) and they need a way to convert those Word files and Calibre is the shortest distance between A and B.

So instead of wagging my finger and clucking my tongue, I did some research. Question: Is it possible to format a file in Word and convert it with Calibre and create a MOBI file good enough to sell on Amazon? (Here, I make a very clear distinction. If your Nook died and you bought a Kindle, and you want to convert all your Nook books into MOBI files you can load onto your Kindle, Calibre is a great tool. That’s personal use. You expect that the ebook might not work completely right, but that’s okay, at least you have it. You can’t ask your paying customers to accept that standard.)

What I discovered is: Yes, it is possible.

I managed to fix the worst problems I see with Calibre-converted ebooks. I managed to create ebooks that respond properly to all the user preferences in three generations of Kindles (Kindle Keyboard, Paperwhite and Fire). I almost got Calibre to build a toc.ncx (what the user sees in the Go To features on Fires and Paperwhites) the way I want it to. I think with some more tinkering and fiddling around inside the opf file, I can fix that problem. I couldn’t get the cover to display on the bookshelf in my Paperwhite, but that’s kind of a non-issue, since Amazon will handle that when the book is uploaded. (It is only a big deal if a publisher is selling direct.)

Even though the ebooks I created this way aren’t up to my standards, they will respond to user preferences and they will look fine and read fine, and thus, they are good enough for uploading to Amazon.

There is a caveat. If you format your document, save it as an html file and convert it as is with Calibre, your ebook will be broken. It will be a substandard product you should not ask people to pay for. What you have to do first and foremost is format your Word file so it works within Calibre’s parameters, and secondly, you have to fix the html coding in the Word file.

Sound scary? It is, kind of. Word’s html coding is a nightmare, full of mso odd bits that give Kindles the hiccups. The good news is, all you really need to do is remove some very specific lines of code and rearrange a few others.
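The exact lines to remove are a matter for the follow-up post. As a rough illustration of what those “mso odd bits” look like and how they could be stripped in bulk, here is a Python sketch using regular expressions. The function name `strip_mso` and the sample markup are assumptions for the example, and this is not the clean-up procedure from the follow-up post. Regular expressions are a blunt tool for html, so check the result by eye.

```python
import re

def _clean_style(match):
    """Rebuild one style attribute without its mso-* declarations."""
    quote, body = match.group(1), match.group(2)
    # Naive split on ";" (assumes no semicolons inside quoted values)
    kept = [d.strip() for d in body.split(";")
            if d.strip() and not d.strip().lower().startswith("mso-")]
    if not kept:
        return ""  # nothing left, so drop the whole attribute
    return " style=" + quote + ";".join(kept) + quote

def strip_mso(html):
    """Remove some common Word-only leftovers from saved html (illustrative only)."""
    # Drop conditional comments such as <!--[if gte mso 9]> ... <![endif]-->
    html = re.sub(r"<!--\[if[^\]]*\]>.*?<!\[endif\]-->", "", html, flags=re.DOTALL)
    # Drop mso-* declarations inside style attributes
    html = re.sub(r"\s+style=([\"'])(.*?)\1", _clean_style, html, flags=re.DOTALL)
    # Drop class=MsoNormal and similar Mso* class attributes
    html = re.sub(r"\s+class=([\"']?)Mso\w*\1", "", html)
    return html

sample = '<p class=MsoNormal style="mso-pagination:none;text-indent:.3in">Hello</p>'
print(strip_mso(sample))  # prints <p style="text-indent:.3in">Hello</p>
```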

Since this post is running long and I don’t even have any pretty pictures to enliven it, (plus I have a buttload of Christmas gifts to wrap) I am going to explain how I did it in my next post. It’ll have pictures. In the meantime, if any of you, Dear Readers, have figured this out and feel like sharing in the comments, feel free.

Quick Tip: Tag and Restore Italics in Word

TRY THIS AT HOME

You all know that the key to a good ebook format is a squeaky clean source file, right? Word doesn’t produce particularly clean documents. For best results, you should strip out extraneous codes before you begin to format. Mark Coker of Smashwords calls it the “Nuclear Option.” You copy/paste your document into a text editor and that will remove all the unwanted coding. Then you copy/paste the clean text back into Word and you are ready to format.

Anyone who has tried this knows that doing so will not only remove unwanted coding, it’ll nuke your italics, too (and other special formatting and styles). Here is an easy way to tag all your special formatting and then restore it. (What I will show you applies to bolding, underlining, different sized fonts, etc., too.)

Here is a document in need of a good cleaning:

Open the search box and make it look like this:

If you open the “Format” box you’ll see a drop down menu that gives you a “Font” option. Open that.

Notice the many, many options you can search for. Cool, huh?

I have come up with tags through trial and error. I use several different programs when I format ebooks, so I needed something unique for search purposes that didn’t make any of the programs say, “Oh no you don’t!” and crash the search box. I use all caps and hyphens to make sure they don’t get mixed up in the text. The most common tags I use are:

  • -STARTI- for italics
  • -STARTB- for bold
  • -STARTU- for underline
  • -END- to close the tag

Back to the document. Click Replace All.

Now all your italics are wrapped in tags. This is a good time to go through and make sure your tags are in the right place and that you don’t have any blank space tagged.
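Checking tags by eye gets tedious in a long manuscript. If the tagged text has been saved as plain text, a short script can flag the two problems mentioned above: tags that are not properly paired and tags that wrap nothing but blank space. This is a minimal Python sketch; the function name `check_tags` is hypothetical, and it assumes the -STARTI-, -STARTB-, -STARTU- and -END- tags listed earlier.

```python
import re

TAG = re.compile(r"-START[IBU]-|-END-")

def check_tags(text):
    """Return a list of problems found with -STARTx- / -END- tags."""
    problems = []
    open_tag = None   # the start tag currently waiting for its -END-
    open_end = 0      # position just after that start tag
    for match in TAG.finditer(text):
        tag = match.group()
        if tag == "-END-":
            if open_tag is None:
                problems.append(f"-END- without a start tag at position {match.start()}")
            else:
                # Flag tags that enclose only spaces, tabs or returns
                if not text[open_end:match.start()].strip():
                    problems.append(f"{open_tag} wraps only blank space at position {open_end}")
                open_tag = None
        else:
            if open_tag is not None:
                problems.append(f"{open_tag} never closed before {tag} at position {match.start()}")
            open_tag, open_end = tag, match.end()
    if open_tag is not None:
        problems.append(f"{open_tag} never closed")
    return problems

for problem in check_tags("She said -STARTI-no-END- and -STARTI- -END-left."):
    print(problem)
```

An empty list means every start tag has its -END- and nothing blank is tagged.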

Now copy/paste into a text editor:

All your formatting is gone.

Now open a new file in Word and apply your main style sheet. Copy/paste your text into the new file. Open the search box and make it look like this:

Do a Replace All and… ta da!

I generally wait until I’ve formatted all my headers and centering and any other styling necessary before I restore special formatting. Once done, all that’s left to do is to get rid of the tags.

Replace All and done!

In the time it took you to read this blog post, you could have tagged and restored six files. It really is that easy.

 

Fun With Formatting Ebooks: Paragraph Styles

Whether a reader is conscious or not of doing it, they are judging at least some of the quality of your writing by how it looks on the screen. When you send your writing into the world you want it to look polished, professional, and assertive. Even if you don’t use fancy bits and curlicues, you can make your ebook look polished, professional and, yes, assertive–as in, “I am a smart and sophisticated writer who knows what she is talking about, so pay attention!”–just by taking care with your paragraph styles.

The most basic of basic styles are indented and block paragraphs. Convention says, indented paragraphs for fiction and block style for non-fiction. Why the convention? Indented paragraphs are quicker to read (not really, but doesn’t it seem that way?), while block paragraphs tend to be weightier, denser, and can add a measure of gravitas to the text. It’s really a preference and not about right and wrong. Readers do expect text to look a certain way, though, and you take a chance of distracting them from the prose if you mess with their expectations.

For those of you using anything other than html to format your ebooks, (pardon my shouting) NO TABS! Tabs, and using the space bar to indent paragraphs, play havoc with ebooks. NO TABS. Your word processor enables you to use style sheets–use them. NO TABS.

How wide an indent?

The narrow indent is a leftover from the days of pulp fiction when every sheet of paper counted against the bottom line and so the publisher needed to cram as much text onto a page as possible. It looks a bit squishy, especially if the reader prefers narrow line spacing on their device. Wide indents are a writer habit, I think, from being used to working on manuscripts with their half inch indents. Too wide, though, and the ebook can assume the look of a manuscript, and that’s not polished. I prefer a medium width indent of 1.4ems (.3″ in a word processor).

Block paragraphs require spaces between the paragraphs so they don’t run together.

Whether you’re using a word processor or html, you need to include that extra leading in your style sheet–not (never) by manually inserting a blank line between paragraphs. Be aware, too, that you do not want to increase the space between indented paragraphs. Doing so means users of the Kindle iOS app will end up with huge spaces between paragraphs. Smashwords will reject files for inserting extra space.

Another style is one I don’t recommend for full paragraphs. Centering.

Don’t forget that centering IS a style. Don’t just highlight the text then click the “center” command in the menu bar. Make sure your text indent is set to zero so the center doesn’t end up off-center.

Sometimes you’ll need to set off text. Quotes, song lyrics, poetry, missives.

The only difference in coding between the first block quote and the lines of poetry is the use of italics.

What if you want to set off an entire section of text?

Keep it simple, aim for sophisticated, and keep your reader’s comfort in mind while you style your paragraphs.

What about the rest of you? Any fun styling tricks you’d like to share?

Book Templates From the Book Designer!

This isn’t an ad–it’s a public service announcement. Joel Friedlander aka the Book Designer has launched a new service for indie publishers. It looks like a good one.

BOOK DESIGN TEMPLATES

If you don’t know who Joel is, pop over to his blog for a minute and look around. Not only does he post interesting and informative articles, he also does the monthly cover awards–open to all–AND he does the Carnival of Indies, a monthly round-up of the best blog articles for indie publishers.

His new service offers templates to help you create a professional looking print book using Word. Yes, that bane of the formatter’s existence, MS Word. What makes these different and better than the templates offered by CreateSpace is the heart and mind of an experienced, artful book designer behind them.

If you are interested in DIY print book formatting, you should at least check this out.

Format A Nice-Looking Novel For Smashwords

Everybody knows, or at least regular readers know, I don’t like using Word to make ebooks. Just about all distributors allow you to submit a doc or docx file to be converted into an ebook. You shouldn’t. You really, really shouldn’t. An ebook converted from Word will not work properly on many ereaders.

But. One major distributor does require Word files–Smashwords. They have their reasons and until they change those reasons, Word it is.

Rather than bitch again about the sheer silliness of using Word for ebooks, I’ll be constructive. Here is a quick primer on how to make a Word document that will make its way through the Meatgrinder without too much damage. (This is for fiction only. Trying to shove complicated formatting through Smashwords’ Meatgrinder will give you hives and bald spots, so if you want to give it a shot, you’re on your own.)

Before you do anything, I recommend that you go into TOOLS in Word and turn off all the auto-correct and auto-format features. This will cut down on Word’s “helpfulness” and make a better ebook. I also recommend that you turn on the SHOW feature so you can see the paragraph returns and extra spaces (in the menu bar it looks like a pilcrow).

STEP ONE: START WITH A CLEAN FILE

This is imperative. You will prevent 95% of ebook glitches by making sure your document file is clean. By clean I mean free of the excess or extraneous coding that Word inserts at every opportunity. You must use a text editor for this. I use Notepad++, which is a free downloadable program. Easy to use once you get used to the way it looks.

After your text is edited in Word, go through this checklist:

  1. Make sure your curly quotes are turned the right way.
  2. Get rid of tabs and extra spaces, including those before and after paragraph returns. Including those between sentences. You do not want double spaces between sentences in an ebook.
  3. Get rid of extra paragraph returns.
  4. Tag your special formatting such as italics, bolding and underlines. (VERY IMPORTANT: Your special formatting will disappear in the text editor)***
  5. Make sure you have proper em dashes and ellipses.
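Most of that checklist can be done with Find/Replace in Word, but the tab, space and paragraph-return items are mechanical enough to script once the text is in plain-text form. Here is a minimal Python sketch. The function name `clean_text` is hypothetical, and it makes three assumptions you may not share: every blank line is unwanted, a double hyphen always means an em dash, and three periods always mean an ellipsis. Curly quotes and tagging still need a human eye.

```python
import re

def clean_text(text):
    """Apply the tab, space, and paragraph-return clean-up to plain text."""
    text = text.replace("\r\n", "\n")      # normalize Windows line endings
    text = text.replace("\t", " ")         # turn tabs into spaces so words stay separated
    text = re.sub(r" {2,}", " ", text)     # collapse runs of spaces, including between sentences
    text = re.sub(r" *\n *", "\n", text)   # trim spaces before and after paragraph returns
    text = re.sub(r"\n{2,}", "\n", text)   # collapse extra paragraph returns
    text = text.replace("--", "\u2014")    # assumption: double hyphen stands for an em dash
    text = text.replace("...", "\u2026")   # assumption: three periods stand for an ellipsis
    return text.strip()

sample = "\tHe stopped.  Then--nothing...  \n\n\nShe waited. "
print(clean_text(sample))
```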

Now COPY/PASTE your text into the text editor. This makes a txt file (text). Go through your file and make sure you have gotten rid of all your extra spaces and hard returns. It will look a little odd, but don’t worry about the lack of formatting–you DO NOT WANT any formatting at this stage. If you are using Notepad++, open the Character Panel (it’s in the Edit drop down menu). That will give you ASCII characters. If you need to change your double or single quotes, em dashes, special characters, etc. use the characters and symbols from the Character Panel.
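To see which special characters a file actually contains before deciding what to change, a short script can list them. This Python sketch counts every non-ASCII character and reports its Unicode name, so stray quotes, dashes and ellipses are easy to spot. The function name `special_characters` and the sample sentence are made up for the example.

```python
import unicodedata
from collections import Counter

def special_characters(text):
    """Count every non-ASCII character in the text, with its Unicode name."""
    counts = Counter(ch for ch in text if ord(ch) > 127)
    return [(ch, unicodedata.name(ch, "UNKNOWN"), n) for ch, n in counts.most_common()]

# Curly double quotes, an em dash, curly single quotes and an ellipsis
sample = "\u201cWait\u2014\u201d she said. \u2018No\u2026\u2019"
for ch, name, n in special_characters(sample):
    print(f"U+{ord(ch):04X} {name}: {n}")
```

Anything unexpected in the list (a stray non-breaking space, for instance) is worth searching for in the text editor before the text goes back into Word.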

STEP TWO: MAKE YOUR STYLE SHEETS

Smashwords’ Meatgrinder is set up to work best with certain stylesheets already built in to Word. If you are not familiar with using stylesheets in Word, now is the time to learn. You’ll find them under FORMAT in the main menu. For most fiction, all you need are four stylesheets.

  1. NORMAL
  2. HEADING 1
  3. HEADING 2
  4. CENTER

NORMAL: This is what you’ll use for the body of your text–the main style. You will find it listed in the style sheets. You can modify it. My recommendation is to stick as close to ereader defaults as possible. So don’t modify too much. Safe settings are:

  • Font: 12 point Times New Roman
  • Align: Left
  • Level: body text
  • Indent: 0 for right and left
  • Special: First Line by 0.3″ or 0.4″ (this is the paragraph indent)
  • Spacing: Before 0; After 0
  • Line Spacing: single

HEADING 1: This is what the Meatgrinder will look for to title your book. For most projects, you only need to use it once. Here you can increase the font size (don’t go higher than 16 points and use the same font as for the rest of your book) and bold or italicize it. You can also center your text, drop it down on the “page” and add some space between your title and the author name. A setup might look like this:

  • Font: 16 point Times New Roman, bold
  • Align: Center
  • Level: body text
  • Indent: 0 for right and left
  • Special: (none)
  • Spacing: Before 12pt; After 6pt
  • Line Spacing: single

HEADING 2: This is what the Meatgrinder will look for to find your chapters so it can build the toc.ncx (very important).

  • Font: 16 point Times New Roman, bold
  • Align: Center
  • Level: body text
  • Indent: 0 for right and left
  • Special: (none)
  • Spacing: Before 12pt; After 3pt
  • Line Spacing: single
  • Page Break Before (check this box under Format > Paragraph > Line and Page Breaks)

CENTER: You don’t have to have a style sheet for centering text, but it makes life easier since you don’t have to remember to get rid of the indent. Set it up exactly like NORMAL, except:

  • Align: Center
  • Special: (none)

STEP THREE: Open a new Word file and apply the NORMAL style sheet. COPY your text from the text editor and PASTE it into Word. Your text should be formatted in NORMAL style with indented paragraphs. (Just in case a hiccup occurred, scan through the text and make sure there aren’t any extra paragraph returns–look for blank lines and delete the extra paragraph return.)

STEP FOUR: Use FIND/REPLACE to restore italics, bolding and underlining. Then use FIND/REPLACE to delete your special formatting tags.

STEP FIVE: Make your title page. Highlight your book title and apply the HEADING 1 stylesheet. A nice title page for Smashwords will look something like this:

Only the title uses HEADING 1. With everything else I used the CENTER stylesheet.

STEP SIX: Do your chapter heads. Select (highlight) your chapter head and apply the HEADING 2 stylesheet. If you set up the stylesheet the way I recommended, it will give you a page break.

STEP SEVEN: If you used scene breaks, go through and select whatever you used to indicate scene breaks and center them. I also like to add a paragraph return before and after a scene break just to make them stand out a bit more.

STEP EIGHT: Add links and/or make a table of contents. Both are optional. Links and hyperlinks are something Word handles very well and generally cause no problems with Smashwords. Use the INSERT HYPERLINK command from the menu. If you make a table of contents, use the BOOKMARK option, and bookmark the chapter heads then link within the document.

And there you go. A simple format, rather generic, but it will go through the Meatgrinder, have minimal formatting errors and be readable on the platforms Smashwords distributes to.

Have fun!

***A word about tagging special formatting. The text editor will strip out your special formatting, so you must tag it. All you need are unique strings of text that you can search for. I use hyphens and all caps to make sure the tags don’t get mixed up with my story text.

  • Italics: -STARTI- and -ENDI-
  • Bolding: -STARTB- and -ENDB-
  • Underlining: -STARTU- and -ENDU-

To tag quickly–for italics–in the FIND box, ask it to look for italics but leave the box empty. In the REPLACE box type -STARTI-^&-ENDI- and do a REPLACE ALL. That will wrap all your italics in tags. To reverse the process, toggle on “wild cards”, type -STARTI-*-ENDI- in the FIND box and toggle on “italics” in the REPLACE box, but leave it empty. Do a REPLACE ALL and your italics are restored. Then do a FIND/REPLACE to delete your tags.
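The same pairing logic works outside Word. If the tagged text is headed for html rather than back into a Word file, the wildcard search -STARTI-*-ENDI- has a close cousin in regular expressions. This is a minimal Python sketch under that assumption; the function name `restore_tags` and the mapping of tags to html elements are hypothetical, chosen only to show the pattern.

```python
import re

# Hypothetical mapping from tag letter to html element, for illustration only
ELEMENTS = {"I": "i", "B": "b", "U": "u"}

def restore_tags(text):
    """Turn -STARTx-...-ENDx- pairs into html elements."""
    def to_element(match):
        element = ELEMENTS[match.group(1)]
        return f"<{element}>{match.group(2)}</{element}>"
    # \1 makes the closing tag letter match the opening one; .*? is non-greedy,
    # so each start tag pairs with the nearest matching end tag on the same line
    return re.sub(r"-START([IBU])-(.*?)-END\1-", to_element, text)

print(restore_tags("It was -STARTI-not-ENDI- her -STARTB-fault-ENDB-."))
# prints: It was <i>not</i> her <b>fault</b>.
```

Because the tags are removed in the same pass that adds the markup, there is no separate step to delete them afterward.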

QUICK UPDATE: NOW Smashwords is kicking back files if there is leading (extra space) after a paragraph (indented paragraphs only, not block style). So let us cross our fingers that Apple has fixed whatever it was that caused them to squish paragraphs. grumble grumble grumble…