In the Formatting Wars, How Writers Can Win

In my post earlier this week I talked about standardizing ebook formats. My friend, Jonathan Allen, did a pretty good job of explaining why there are so many platforms and proprietary formats for ereaders. Today he continued the explanation as to why it probably is going to take quite a while before we have ONE format and universal ereaders. Even though I now have a better understanding about how the situation turned into a mess and what it will take to untangle it, I don’t hold out much hope the situation is going to resolve itself anytime soon.

As a writer and ebook producer, this is not particularly heartening. I guess I hoped that if I turned my energies toward learning HTML, all my stress would magically disappear. Now I know that is not true. And it is stressful. I just finished a big project. (It turned out great, by the way, I am very proud of it–you can look at it here) The night before I had nightmares about the book being all messed up. After uploading it yesterday to Amazon, while it was in review, I tossed and turned and fretted all night that the book would be all messed up. Today I uploaded it to Smashwords and my stomach is clenched up with worry that I accidentally did something that will mess up the book.

All it takes is one little bit of wayward code that I can’t even see and weird crap could show up on some unsuspecting reader’s device.

Ay yi yi.

All is not lost. While the computer wizards are hashing it out, there is one thing we writers can do to make sure our ebooks don’t become casualties of the formatting wars.

Clean source files.

I bet 80% of the writers who read this post use a version of MS Word. As much as Word frustrates me because it’s so darned helpful, I love the way it produces documents. Therein lies the problem. To produce those gorgeous documents, Word uses a lot of codes, hidden and unhidden. In a printed document, it doesn’t matter how much junk is hidden in the file. Most printers have no beef with MS Word. In most cases, whatever you tell Word to print, it will gladly do so. Ebook files, however, are not documents. Much of that lovely formatting–tabs and extra paragraph returns and centering and font changes and special characters and headers and footers and page numbers and footnotes–will be interpreted by other programs as junk that needs to be fixed. Or some program, somewhere, might throw up its hands in despair and fill a screen with gobbledegook.

Writers who intend to publish their writing as ebooks–whether they do the formatting and conversions themselves, or hire someone else to do it–need to get out of the “document” mindset. What they need to start doing is thinking of the composition–the novel, short story, article, whatever–as a “source file.” Start thinking of formatting as a completely separate process. You compose a source file, then you use a copy of it to create a printed document, a pdf, an ebook or whatever else you require. There is no special formatting in a source file.


The suggestions that follow are for Word users, but no matter what word processor you’re using, you can adapt to suit your needs.

  1. No tabs. Ever. Never ever use the tab key in a source file. Not even one for good luck. No tabs!
  2. No extra spaces. Not between sentences, not after paragraphs, not at the top of the page, not to indent a passage, not to set off text. No extra paragraph returns either.
  3. No page breaks. But, but, Jaye, what about between chapters? No. Not even one.
  4. No headers. No footers. No page numbers.
  5. Turn off Auto-Correct and Auto-Format. You are safe with leaving on italics, bolding and underlining, but everything else, turn it off. Even curly quotes can cause a problem, so turn them off, too.
  6. Use “typewriter” special characters. Two hyphens for an em dash. Three connected periods for an ellipsis. (c), TM, (R) instead of the special symbols. Do not insert subscript and superscript characters. If you have words requiring umlauts, accents or whatever, keep track of them. They can be made right during formatting.
  7. No bullets or ordered lists or outlines.
  8. Set up a Source File style sheet. (I give instructions for how to set up style sheets in Word here) Make it simple, bare bones, with a font you like to work in. Use it religiously.
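
For anyone who wants a second pair of eyes on a source file, here is a minimal Python sketch of a checker for some of the rules above. It is only an illustration, not a tool this post depends on. It assumes the source file has been saved as plain text with one paragraph per line, and the names `RULES` and `check_source` are made up for the example. It cannot see headers, footers, or Word styles; it only looks at the characters in the text.

```python
import re

# Hypothetical rule set: each entry pairs a label with a pattern that
# flags one of the habits the checklist above says to avoid.
RULES = [
    ("tab character", re.compile(r"\t")),
    ("two or more spaces in a row", re.compile(r" {2,}")),
    ("leading or trailing space", re.compile(r"^ +| +$")),
    ("curly quote", re.compile(r"[\u2018\u2019\u201C\u201D]")),
    ("typeset dash or ellipsis", re.compile(r"[\u2013\u2014\u2026]")),
    ("page break (form feed)", re.compile(r"\f")),
]


def check_source(text):
    """Return (line_number, problem) pairs for a plain-text manuscript.

    Assumes one paragraph per line, so any blank line counts as an
    extra paragraph return.
    """
    problems = []
    lines = text.replace("\r\n", "\n").split("\n")
    if lines and lines[-1] == "":
        lines.pop()  # a single newline ending the file is not a blank paragraph
    for number, line in enumerate(lines, start=1):
        if line.strip(" ") == "":
            problems.append((number, "extra paragraph return"))
            continue
        for label, pattern in RULES:
            if pattern.search(line):
                problems.append((number, label))
    return problems
```

Calling `check_source(open("mybook.txt", encoding="utf-8").read())` on a hypothetical `mybook.txt` would return a list of line numbers and the problem found on each; an empty list means none of these particular rules were tripped.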

Source files are plain as milk and not particularly pretty. What they should be is clean. Make a copy of your source file to create a printed document with headers, footers, special characters, centering, specified page breaks, and whatever you desire. Make a copy of the source file to format your ebooks according to different platform requirements. If you outsource the work, include a set of instructions to the formatter as to how you want your book laid out, along with a list of special characters, symbols and any special formatting you desire.

That’s how we keep our heads during the format wars, Writers. Clean source files. Make those standard and we can endure the wait until the powers-that-be, whoever they are, get their act together and stop making things difficult for the rest of us.



18 thoughts on “In the Formatting Wars, How Writers Can Win”

  1. What I can’t help noticing is that every article like this makes the case for switching to a more friendly word processor and leaving Word behind for anything that’s going to be turned into an ebook. It makes me very grateful to be one of the 20%, a number that I suspect is increasing.

    • I suspect you are right, Catana. It’s really not about the software, it’s about the mindset. I think as writers slowly stop thinking about “documents” and how the file will look in print–and how it looks on the screen while you’re working–and start getting in the groove with digital files, they’ll make their upgrades and changeovers to programs that produce good source files. “Word processors” will give way to file generators. It’s just going to take time.

  2. I’m learning this stuff, too. Question: what do you do instead of a page break? How do you force the file to end this chapter halfway down the page and start a new chapter at the top of the next page?

    • Depends, Margaret. If it’s your source file, you don’t. If you’re formatting an ebook file in Word, use the INSERT>PAGE BREAK command and make sure you do it at the exact end of the text so you don’t end up with a blank page. If you’re generating a digital file, you cannot depend on what you see on your screen translating onto another device. Spacing is a bitch and I still haven’t cracked the code on it, and quite frankly with all the differently sized screens, I don’t know if there is a code to crack.

      If you’re doing a layout for a print book, that is way beyond my skill set. The most I’ve learned so far is how to make a fairly nice looking pdf file. I know why the print book designers make the big bucks. 😀

  3. I’ve started using Scrivener. Any advice there? Scrivener puts in an indent at the start of each paragraph and then the entire document is one long page. No fuss, no muss. But what about italics and m-dashes? Scrivener seems to convert two dashes into an m-dash-like-thing. Are those okay?

    Thanks 🙂

  4. Great post, Jaye. I’ve started using Scrivener, too. What happens if I take an existing Word doc and move it into Scrivener? Will Scrivener clean it up magically with a few settings? I’ve been too busy writing the new story to experiment much with formatting in Scrivener yet but I really do enjoy Scrivener’s features.

    • Char, I like to clean up files in Word, because its find and replace function makes it very easy to find extra spaces and such. Once the file is clean, then I move it into Scrivener for formatting. Scrivener’s formatting features are very basic, elegant in their simplicity. What it’s not so great at (that I’ve found so far) is super fine tuning. I have become fairly proficient at using Word and Scrivener simultaneously, taking advantage of both their best features. What I have found out, the hard way, is that some special formatting in Word will make Scrivener balk. Clashing codes, so to speak. If I’m doing something that requires a lot of special formatting, I will make a copy of my Scrivener file and work on that just in case something makes the program go, “Nuh uh, can’t make me.”

      To make sure you’re not introducing balk-inducing code, copy your Word file into a text editor like Notepad (make sure you tag your italics, bolding and underlines because they will be lost) then copy that and paste it into Scrivener. That will give you a clean slate to work from.

      Or, give up Word altogether (if you’re just working on your stuff–I do a lot of work on other people’s stuff) and get used to doing all your writing in Scrivener. It takes some getting used to, but after a while it feels just fine, and you can be sure you aren’t introducing any rogue codes.

      If you intend to use Smashwords, Scrivener generates .doc files with no problem at all.

  5. Great post, Jaye. I’m working on creating an eBook that will give step-by-step instructions for being able to do the kind of things you’re talking about with Word. I’d add one minor note about your no-page-breaks rule. You’re definitely right, NEVER insert a page break (we have an unwritten rule at my day job about this), but you can still have your page-breaking cake and eat it too by defining a style that puts a page break before. That way the style interpretation is entirely internal to Word and won’t throw off anything that has to deal with Word’s screwy interpretation of XHTML.

    • Indeed. I will be very interested in seeing your book, Jonathan.

      Rule of thumb: In a DOCUMENT for print, page breaks don’t matter, use as many as you like and make them any way you like. In a DIGITAL FILE, page breaks are persnickety and must be handled with care.

  6. Jaye,

    Writers need to understand the hard truth behind how and why standards work in some areas and not in others. The tech world pretty well fits the Hobbesian view, life is nasty, brutish, and short. There’s not going to be an ebook standard format in your lifetime. Let’s take a short detour into the wonderful world of web standards to see why.

    Forget the morass that is HTML and think for a minute about HTTP. No one, aside from a tiny group of geeks, ever thinks about HTTP. That’s because HTTP just works. All around the world, millions of times per second, connecting the widest array of devices humanity has ever invented, carried on virtually every type of network, in application scenarios never envisioned by its creator (some of them wildly inappropriate), across all but the most stringent of firewalls, HTTP works as flawlessly as any software system there is. The number of failures caused by protocol implementation issues is essentially zero (not quite zero, but close enough).

    HTTP stands for HyperText Transfer Protocol and HTML stands for HyperText Markup Language. The two standards were very closely linked in the beginning. They were invented by the same guy (Tim Berners-Lee) at more or less the same time. But the course of the two standards couldn’t be more different. The fundamental reason for the difference is pretty simple. Everyone needed a reliable way for machines to communicate at the application level and verification of standard compliance was relatively straightforward. There was no advantage to anyone in creating breaking proprietary extensions to the protocol. In short, the natural forces of software ecology meant that the protocol that became the dominant application protocol would be rigorously adhered to. HTTP became that protocol largely because it was dead simple to implement.

    Now let’s come back to HTML. Think about the natural forces working on that standard. There are billions of end users, millions of web page creators, thousands of applications that generate HTML, and tens of companies that make browsers. It’s not hard to see that the browser companies, with their software in the hands of all those end users, will end up calling the shots in this environment. The browser makers have to compete based on features. They all have incentives to extend the standards in ways that make life miserable for the content creators. End users don’t care about standards (and really shouldn’t, in my view). No group of web page creators or HTML tool makers controls any significant portion of end users, who just want to get to whatever part of the web strikes their fancy. Blaming the lack of HTML standards solely on Internet Explorer is silly (as is blaming Amazon for the same problem in ebooks, as I will explain).

    The situation in ebooks is very similar to the one in HTML but on a slightly smaller scale, with one interesting exception. There are millions of readers, tens of thousands of content creators, and tens of companies making ebook readers (or software). But where are the hundreds of companies making software to create ebooks? Like the early days of the web, there are content creators like Jaye who are doing the formatting themselves with the tools at hand (on the web, it was writing your web page in Notepad or EMACS) and companies offering hand tuned formatting as a service. The cartel that is Big Publishing is holding back the expansion of ebooks which means the natural market for good software for formatting ebooks is being retarded.

    The good news for writers is that we (the software folks) know how to solve this problem. Forget about a single markup standard for displaying ebooks. It won’t happen. In the software business we look for commonalities in the forces that create problems so that we can apply patterns to our solutions. The appropriate pattern to solve the problem that faces writers in the digital era is called “pipes and filters”. Think of an oil refinery. A pipeline (or other transport mechanism) brings crude oil into the refinery. The oil flows through a series of stages where its components are separated, treated, and mixed with various additives and catalysts. A bunch of different hydrocarbon products may come out the other end (gasoline, heating oil, diesel, natural gas, etc.).

    The ebook refinery should work the same way. Your crude oil is a manuscript. It gets edited, proofed, and perhaps translated. Then it gets transformed into whatever display formats are current at the time of publication. When a new format comes out, you just have to add a new filter stage to the pipeline.
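
To make the pipes-and-filters idea a little more concrete, here is a minimal Python sketch. It is an illustration under simple assumptions, not a description of any real ebook tool: the manuscript is a plain string with one paragraph per line, and every name in it (`build_pipeline`, `strip_tabs`, `html_edition`, and so on) is invented for the example. Each filter is a function that takes text and returns text, and a pipeline is just a list of filters run in order.

```python
import html
import re


def strip_tabs(text):
    """Filter: remove tab characters."""
    return text.replace("\t", "")


def collapse_spaces(text):
    """Filter: turn runs of spaces into a single space."""
    return re.sub(r" {2,}", " ", text)


def typeset_punctuation(text):
    """Filter: swap typewriter punctuation for typeset characters."""
    return text.replace("--", "\u2014").replace("...", "\u2026")


def wrap_paragraphs_in_html(text):
    """Filter: emit one <p> element per non-empty line."""
    paragraphs = [line for line in text.split("\n") if line]
    return "\n".join(f"<p>{html.escape(p)}</p>" for p in paragraphs)


def build_pipeline(*filters):
    """Connect filters so the output of each feeds the next."""
    def run(text):
        for apply_filter in filters:
            text = apply_filter(text)
        return text
    return run


# Shared cleaning stages, then one pipeline per output format.
clean = build_pipeline(strip_tabs, collapse_spaces)
text_edition = build_pipeline(clean)
html_edition = build_pipeline(clean, typeset_punctuation, wrap_paragraphs_in_html)
```

The point of the design is that supporting a new output format means writing one new final stage and building one more pipeline; the cleaning stages and the manuscript itself stay untouched.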

    The key to making this work is coming up with an intermediate format for the conceptual notion of a book, separate from the way it is displayed. This format represents a book structurally; it knows titles, tables of content, chapters, paragraphs, and emphasized text (what you think of as bold, italic, and underlined). It might also know more complex structural elements like notes (such as end notes and footnotes), non-standard textual features (line fragments for poetry or recipes, tables, lists, code listings, etc.), non-textual insets (graphics, etc.) and indices.

    Designed properly, this format can grow over time to take into account most of the features necessary for almost any traditional book and quite a few non-traditional books. We could start to free people from the tyranny of hand-coded pseudo-HTML and vendor-supplied meatgrinders.
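
As a rough sketch of what a structural intermediate format could look like, here is a small Python model. No such standard exists in this discussion, so every class and function name below (`Book`, `Chapter`, `Span`, `render_html`, `render_plain`) is hypothetical. It covers only titles, chapters, paragraphs, and emphasized text, and it shows the same book being displayed two different ways without the book itself knowing anything about display.

```python
import html
from dataclasses import dataclass, field


@dataclass
class Span:
    """A run of text; 'emphasized' is structural, not a font choice."""
    text: str
    emphasized: bool = False


@dataclass
class Paragraph:
    spans: list


@dataclass
class Chapter:
    title: str
    paragraphs: list = field(default_factory=list)


@dataclass
class Book:
    title: str
    chapters: list = field(default_factory=list)


def table_of_contents(book):
    """The contents list is derived from structure, never typed by hand."""
    return [chapter.title for chapter in book.chapters]


def render_html(book):
    """One possible display format: simple HTML."""
    parts = [f"<h1>{html.escape(book.title)}</h1>"]
    for chapter in book.chapters:
        parts.append(f"<h2>{html.escape(chapter.title)}</h2>")
        for paragraph in chapter.paragraphs:
            body = "".join(
                f"<em>{html.escape(s.text)}</em>" if s.emphasized else html.escape(s.text)
                for s in paragraph.spans
            )
            parts.append(f"<p>{body}</p>")
    return "\n".join(parts)


def render_plain(book):
    """Another display format: plain text with _underscores_ for emphasis."""
    lines = [book.title.upper(), ""]
    for chapter in book.chapters:
        lines.append(chapter.title)
        for paragraph in chapter.paragraphs:
            lines.append("".join(f"_{s.text}_" if s.emphasized else s.text for s in paragraph.spans))
        lines.append("")
    return "\n".join(lines)
```

Notes, tables, graphics and the rest would be further classes alongside `Paragraph`; each new display format would be one more `render_...` function reading the same structure.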

    • You all heard William, smart computer people. Get cracking!

      Indeed. I am so not interested in the mechanics of how anything works (until I run into a problem and have to figure out the mechanics, and quite frankly, I spend way too much time with that). I would love a program that gives me the basics so that I could turn all my energy into the creative end of it without worrying that the creative tweaks are going to make my files break down or blow up.

      • Yes, exactly. I have no doubt it will happen. The question is, how gracefully it will happen.

        After all, people used to have to hand code HTML and whatever to build a website. Now, if you don’t mind some inflexibility, you can use WordPress or the equivalent without looking at a single .

  7. Exactly, Marie. And then, knowing the basics will work just fine and continue working fine, those who want the fancy stuff can learn to do the fancy stuff.

