Punctuation & Special Characters

When it comes to punctuation and special characters in ebooks, what you see is not always what you get. For example, here is the standard keyboard and some special characters in MS Word:

I took exactly what you see in that screenshot and turned it into an ebook for my Kindle. This is what happened:

Why does it do that? It’s because ebooks use ASCII characters. Word processors can use dozens of different character sets, and will allow users to utilize them–for print–interchangeably. Ebooks turn non-ASCII characters into gibberish.

The ASCII character set contains most special characters that are used in fiction–letters with acute or grave marks, umlauts, copyright symbols, em dashes, currency symbols. To find out if you have special characters outside the ASCII set, do a quick convert of your file and see if gibberish shows up. If it does, you’ll need to use named entities (if you are formatting in html) or find a replacement (if you are using a word processor).

You can find the complete list of named entities here: HTML Symbol Entities Reference guide at w3schools.com.
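If you would rather not look up each entity by hand, here is a minimal Python sketch of the idea. It assumes you are working on plain text (not on a file that already contains html tags or entities), and the function name to_named_entities is made up for illustration. It leans on the standard library's html.entities table, which covers the HTML 4 named entities; anything without a name in that table falls back to a numeric reference.

```python
from html.entities import codepoint2name


def to_named_entities(text: str) -> str:
    """Replace every non-ASCII character with an HTML entity (hypothetical helper)."""
    out = []
    for ch in text:
        cp = ord(ch)
        if cp < 128:
            # Plain ASCII passes through untouched.
            out.append(ch)
        elif cp in codepoint2name:
            # A named entity exists, e.g. the em dash becomes &mdash;
            out.append(f"&{codepoint2name[cp]};")
        else:
            # No HTML 4 name: fall back to a numeric character reference.
            out.append(f"&#{cp};")
    return "".join(out)


print(to_named_entities("caf\u00e9 \u2014 50\u00a2\u2026"))
# prints: caf&eacute; &mdash; 50&cent;&hellip;
```

Run it on a copy of your text first and spot check the result on a device before trusting it with a whole book.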

Be aware, too, that some characters (special or otherwise) are considered “reserved characters” in html: straight quotes (double and single), the ampersand, and the less-than and greater-than marks. Depending on the text, they can break an ebook. Those characters should be turned into named entities.
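For the reserved characters, Python's standard library already has a function that does the swap. This is only a sketch, and it assumes you run it on the story text itself, before any html tags are added (otherwise it will escape your tags too). The wrapper name escape_reserved is hypothetical.

```python
from html import escape


def escape_reserved(text: str) -> str:
    """Escape &, <, > and both kinds of straight quote (hypothetical wrapper)."""
    # quote=True tells html.escape to handle " and ' as well as &, < and >.
    return escape(text, quote=True)


print(escape_reserved('He said "5 < 6 & 7 > 2" and didn\'t blink.'))
# prints: He said &quot;5 &lt; 6 &amp; 7 &gt; 2&quot; and didn&#x27;t blink.
```

Note that the straight apostrophe comes out as a numeric reference (&#x27;) rather than a named entity. That is how html.escape handles it.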


Curly Quotes: Curly quotes are a pain. The temptation is to dispense with them and go with straight quotes, except that straight quotes look like crap in an ebook and can even break it. If you’re using Word, turn on the auto-correct feature that allows for curly quotes, then do a Find/Replace with a quote mark (or single quote/apostrophe) in the Find box and the same in the Replace box. Do a Replace All and Word will automatically turn the quote marks (and apostrophes) in the right direction. Unfortunately, the “right” direction is sometimes the wrong direction. For instance, if you have a bit of dialogue that ends in an em dash—

“Watch out for that—“

Word will put a left double quote mark at the end of the sentence. You can use Find/Replace to find the wrong-way quote marks and apostrophes (and it’s a bit of a pain, but it must be done). This is actually faster in a text editor than it is in Word.
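If you do the cleanup outside of Word, a regular expression can catch the em dash case in one pass. Here is a rough Python sketch. The rule it uses is my assumption, not a law of punctuation: an opening quote that comes right after an em dash and is followed by a space or the end of the text is treated as a closing quote that got turned the wrong way. The names are made up for illustration.

```python
import re

EM_DASH = "\u2014"
LDQUO, RDQUO = "\u201C", "\u201D"   # left and right double curly quotes
LSQUO, RSQUO = "\u2018", "\u2019"   # left and right single curly quotes

# An opening quote directly after an em dash, followed by whitespace or end of text.
WRONG_WAY = re.compile(f"{EM_DASH}([{LDQUO}{LSQUO}])(?=\\s|$)")
FLIP = {LDQUO: RDQUO, LSQUO: RSQUO}


def fix_quotes_after_dash(text: str) -> str:
    """Turn wrong-way quotes after an em dash into closing quotes (hypothetical helper)."""
    return WRONG_WAY.sub(lambda m: EM_DASH + FLIP[m.group(1)], text)


broken = f"{LDQUO}Watch out for that{EM_DASH}{LDQUO}"
print(fix_quotes_after_dash(broken))
```

An opening quote that starts new dialogue right after a dash is left alone, because a letter follows it rather than a space. Wrong-way apostrophes at the start of words (as in ’em) are a separate problem and still need a manual pass.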

Ellipses: You wouldn’t think three little dots would give anybody any trouble (and yes, writers, there are ONLY THREE periods in an ellipsis). It is quite proper to use punctuation after an ellipsis (comma, period, question mark, etc.) BUT, if you merely do a dot dot dot you could end up with orphaned periods…

. or lonesome quote marks sitting on lines by themselves.

If you space them the way they often are in print . . . they could end up unevenly spread across a line in an ebook (because of the text justification).

Treat your ellipses as special characters. In Word, do a Find/Replace with three periods (...) in the Find box and the single ellipsis character (…) in the Replace box, then do a Replace All. That will turn the three periods into proper ellipses.
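The same swap works in a script. This is a small Python sketch with a made-up function name. It assumes your ellipses are typed either as three closed-up periods or as three periods with single spaces between them. It does not try to be clever about a sentence-ending period followed by an ellipsis (four dots), so check those by hand.

```python
import re

# Three periods, either closed up (...) or spaced print-style (. . .)
THREE_DOTS = re.compile(r"\.(?: ?\.){2}")


def fix_ellipses(text: str) -> str:
    """Replace three typed periods with the single ellipsis character (hypothetical helper)."""
    return THREE_DOTS.sub("\u2026", text)


print(fix_ellipses("Wait... what? In print . . . they spread out."))
```

Because the result is one character, the reader software cannot split it across lines or stretch it when it justifies the text.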

Dashes: I’m not sure about devices that use the EPUB format (Nook, iPad, Kobo Reader, etc.) but Kindles now handle em dashes reasonably well. They break at the end of sentences the way they are supposed to. Refer to a style manual for the proper use of hyphens, en dashes and em dashes.

12 thoughts on “Punctuation & Special Characters”

  1. Hi Jaye:

    Great info, especially on the substitution of named entities for special characters!

    However, following your example of “&zwnj;&mdash;&zwnj;” has, for me at least, led to a number of “issues.” On some PC-based e-readers (including the Nook Reader, Sigil, and Calibre), the “&zwnj;” entity actually appears on-screen as a thin vertical line. And, while my Nook Tablet e-reader doesn’t show this entity, it does appear in the results of doing a Find within said e-reader that includes text around the found text.

    It’s a great idea, but my experience is that it’s not working correctly. Any suggestions for something that will work? I may just change the “&zwnj;” to a plain ol’ space. The positioning in the text may not be perfect, but that seems preferable to the vertical line.

    Just my observations….


    • Ay yi yi, Jon—another thing to worry about? I have no idea why that particular named entity would translate into a different character. The problem is in the ereaders and how they read formatting codes. Experimenting with different devices is the only way I would know how to root out the problem. Alas, I do not have access to a multitude of devices. What version of HTML coding are you using for your ebooks? I’ve been having good results with UTF-8. (knock on wood)

      I’ve come to the conclusion that ereaders just plain don’t like the em dash—they certainly don’t respect it. I’ve fussed and experimented with it until even I am sick of the subject. Judging by the number of search term results about the em dash that send people to this blog I’d have to say that I’m not the only person who is frustrated with it.

      Maybe it is time to partially retire the em dash. That’s not being a smart aleck. The em dash sort of replaced the parentheses to bracket parenthetical statements. I’ve had editors tell me over the years that parentheses are “old fashioned” and “distracting.” To use the em dashes instead. Well, ereaders don’t mind parentheses and crappy text justification doesn’t distort them overmuch. The only problem I ever have with them is that they look like crap when italicized, but taking care to put the italics inside the parentheses takes care of that.

      • Hello,
        I’d like to weigh in on the “em-dash” issue. I write a lot of dialogue. Speakers are interrupted from time to time, so the em-dash is my way of dealing with that to let the reader know that the character has been interrupted. If I can’t use an em-dash because the readers can’t translate it into the proper grapheme, what can I use? It seems a little crazy to me that this would affect my writing style, but it will if I have to use finished sentences. One reader suggested I use the ellipsis; however, that conveys a completely different tone. I guess I’m frustrated because it seems that, rather than being able to write my book the way it is intended, I now have to “change” the phrasing and writing style to match what is readily available from a formatting standpoint. That’s just crazy. I have seen books on e-readers that do have em-dashes in the text body, but they are published via traditional publishing houses rather than self-published. That’s not an option for me at this point. Any ideas?

  2. Pingback: Damn It. ePubbing Is No Place For Purists | J W Manus

  3. Pingback: Self-Publishers: Do You Need Nurturing? | J W Manus

  4. Pingback: An Admonition for Self-Publishers. Ahem… | J W Manus

  5. Hi Gabrielle,

    What I have learned. If you are formatting your ebooks in Word, use Times New Roman as your font. The converters at all the big companies recognize the font and the “special characters” it uses. When I compose in Word, I use the old typewriter trick of two dashes to indicate an em dash. When I am finished, I do a Replace/All with the two dashes in the Find box and ^+ (caret plus sign) in the Replace box. That will turn your dashes into em dashes and the special character will be recognized by ereaders.

    If you are using html, you must make sure your file is coded in UTF-8 and that you use ASCII characters (of which, the em dash is one). If you need characters outside the ASCII selection, you’ll need to use named entities (the em dash is &mdash;).

  6. Hi,

    I’ve been reading through all of your posts and have found your advice to be hugely valuable. I do have a question about speech marks in regard to ebooks, and as I write this I wonder if it’s a silly question, but here goes. Does it matter what style the speech marks come in? Either the double ” or the single ‘ one? I ask because I’ve noticed both ebooks and traditional vary from one author or publisher to another.

    Is there a rule of thumb for this or is it simply based on the preference of the author?

    A second question, just to be annoying: as I’ve used html for designing websites, would it be worth copying text from my Word doc into Dreamweaver and editing the html there prior to uploading to Amazon? After reading your topic on Clean Source Files I figured this would be a good way to ensure text is as clean as possible. Any feedback would be much appreciated.

  7. Hi Dave,

    Punctuation is a cultural thing. American English standard is to use double quotes for direct speech and single quotes for everything else. In the UK, it’s the opposite. If you go further around the world you can find all types of punctuation styles ranging from double brackets to dashes to nothing at all. If you’re writing mainly for an American audience, go for the double quotes. Best of all is consistency. Readers will accept just about anything as long as it is consistent and doesn’t confuse them (even if it takes them a little while to get used to it). A pretty-much-can’t-go-wrong-with-it style manual is Strunk & White’s Elements of Style. It’s short, to the point, and if you follow their style recommendations, you won’t confuse anybody.

    As for the html question. Ebooks act just like little websites, so you shouldn’t have any trouble transferring your skill set. I haven’t used Dreamweaver, but a lot of readers of this blog do and they get good results.

    • Jaye,

      Thanks for your reply and clarifying that for me.

      I’ve ordered Elements of Style from Amazon after having a look inside, a worthy read indeed. It looks similar to My Grammar and I by Caroline Taggart, though it can’t hurt to delve deep and discover more about the craft of writing.

  8. Pingback: Book covers, Skyrim and the speech mark conundrum | Dave Farmer
