Punctuation & Special Characters

When it comes to punctuation and special characters in ebooks, what you see is not always what you get. For example, here is the standard keyboard and some special characters in MS Word:

I took exactly what you see in that screenshot and turned it into an ebook for my Kindle. This is what happened:

Why does it do that? It’s because of character encoding. Word processors can use dozens of different character sets and will let users mix them freely for print. When an ebook conversion misreads a character from outside plain ASCII, it turns that character into gibberish.

Plain ASCII covers only unaccented letters, digits, and basic keyboard punctuation. Most of the special characters used in fiction (letters with acute or grave accents, umlauts, copyright symbols, em dashes, currency symbols) sit outside it but have standard HTML named entities. To find out if you have special characters that will cause trouble, do a quick conversion of your file and see if gibberish shows up. If it does, you’ll need to use named entities (if you are formatting in HTML) or find a replacement (if you are using a word processor).
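For anyone comfortable with a little scripting, the hunt for troublesome characters can be sketched in Python. This is only an illustration under stated assumptions, not a required step: the function names are made up for the example, and it simply treats every character above the 7-bit ASCII range as one to convert, using the entity names that ship with Python's standard library.

```python
from html.entities import codepoint2name


def find_non_ascii(text: str) -> list[str]:
    """List the distinct non-ASCII characters, in order of first appearance."""
    seen = []
    for ch in text:
        if ord(ch) >= 128 and ch not in seen:
            seen.append(ch)
    return seen


def to_named_entities(text: str) -> str:
    """Replace each non-ASCII character with an HTML entity."""
    pieces = []
    for ch in text:
        code = ord(ch)
        if code < 128:
            pieces.append(ch)  # plain ASCII passes through untouched
        elif code in codepoint2name:
            pieces.append(f"&{codepoint2name[code]};")  # named entity
        else:
            pieces.append(f"&#{code};")  # no name available: numeric reference
    return "".join(pieces)


# \u2014 is an em dash, \u00a2 is a cent sign.
sample = "Caf\u00e9 \u00a9 2012 \u2014 50\u00a2"
print(find_non_ascii(sample))
print(to_named_entities(sample))  # Caf&eacute; &copy; 2012 &mdash; 50&cent;
```

Run it on the text of a chapter, not on a finished HTML file you have already hand-coded, and check the result on a device before trusting it.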

You can find the complete list of named entities here: HTML Symbol Entities Reference guide at w3schools.com.

Be aware, too, that some characters (special or otherwise) are considered “reserved characters” in HTML: straight quotes (double and single), the ampersand, and the less-than and greater-than signs. Depending on the text, they can break an ebook. Those characters should be turned into entities.
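As a rough sketch of what that conversion looks like, Python's standard `html.escape` function handles the reserved characters. The `paragraph` helper below is a hypothetical name used only for this example. One caveat: `html.escape` turns the straight single quote into the numeric reference `&#x27;` rather than a named entity.

```python
import html


def paragraph(text: str) -> str:
    """Escape reserved characters in plain text, then wrap it in <p> tags."""
    # Escape the text first, then add markup, so the tags themselves survive.
    return "<p>" + html.escape(text, quote=True) + "</p>"


print(paragraph('Smith & Sons sold it for < $5, "as is"'))
# <p>Smith &amp; Sons sold it for &lt; $5, &quot;as is&quot;</p>
```

The order matters: escaping a file that already contains HTML tags would turn the tags into visible text.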

PUNCTUATION

Curly Quotes: Curly quotes are a pain. The temptation is to dispense with them and go with straight quotes, except that straight quotes look like crap in an ebook and can even break it. If you’re using Word, turn on the AutoCorrect option that replaces straight quotes with curly quotes, then do a Find/Replace with a quote mark (or single quote/apostrophe) in the Find box and the same in the Replace box. Do a Replace All and Word will automatically turn the quote marks (and apostrophes) in the right direction. Unfortunately, the “right” direction is sometimes the wrong direction. For instance, if you have a bit of dialogue that ends in an em dash:

“Watch out for that—”

Word will put a left double quote mark at the end of the sentence. You can use Find/Replace to find the wrong-way quote marks and apostrophes (and it’s a bit of a pain, but it must be done). This is actually faster in a text editor than it is in Word.
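For the text-editor route, the em dash case can be sketched as a regular expression in Python. This is a minimal illustration, assuming the only pattern being fixed is an opening quote that sits directly after an em dash and is followed by whitespace or the end of the text. The function name is hypothetical, and other wrong-way cases (such as apostrophes at the start of a word) still need a manual pass.

```python
import re

EM_DASH = "\u2014"
LEFT_DOUBLE, RIGHT_DOUBLE = "\u201c", "\u201d"
LEFT_SINGLE, RIGHT_SINGLE = "\u2018", "\u2019"

# An opening quote that directly follows an em dash and is itself followed by
# whitespace or the end of the text is almost certainly meant to be closing.
WRONG_WAY = re.compile(
    f"(?<={EM_DASH})([{LEFT_DOUBLE}{LEFT_SINGLE}])(?=\\s|$)"
)


def fix_quotes_after_dash(text: str) -> str:
    """Flip wrong-way opening quotes that trail an em dash."""
    swap = {LEFT_DOUBLE: RIGHT_DOUBLE, LEFT_SINGLE: RIGHT_SINGLE}
    return WRONG_WAY.sub(lambda m: swap[m.group(1)], text)


broken = "\u201cWatch out for that\u2014\u201c"
print(fix_quotes_after_dash(broken))
```

An opening quote that follows a dash but begins new dialogue (a dash, then a quote mark, then a word) is left alone, because a letter rather than a space comes after it.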

Ellipses: You wouldn’t think three little dots would give anybody any trouble (and yes, writers, there are ONLY THREE periods in an ellipsis). It is quite proper to use punctuation after an ellipsis (comma, period, question mark, etc.) BUT, if you merely do a dot dot dot you could end up with orphaned periods…

. or lonesome quote marks sitting on lines by themselves.

If you space them the way they often are in print . . . they could end up unevenly spread across a line in an ebook (because of the text justification).

Treat your ellipses as special characters. In Word, do a Find/Replace with ... (three typed periods) in the Find box and … (the single ellipsis character) in the Replace box, then do a Replace All. That will turn the three periods into proper ellipses.
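The same replacement can be sketched outside of Word in a few lines of Python. This is an assumption-laden example, not a required step: the function name is hypothetical, and the pattern covers only three periods typed tight or separated by single spaces (regular or non-breaking). It can emit either the ellipsis character or the `&hellip;` named entity for HTML.

```python
import re

ELLIPSIS = "\u2026"

# Three periods, either tight (...) or spaced print-style (. . .).
THREE_DOTS = re.compile(r"\.[ \u00a0]?\.[ \u00a0]?\.")


def fix_ellipses(text: str, as_entity: bool = False) -> str:
    """Collapse three typed periods into a single ellipsis."""
    return THREE_DOTS.sub("&hellip;" if as_entity else ELLIPSIS, text)


print(fix_ellipses("Wait... what?", as_entity=True))  # Wait&hellip; what?
```

A fourth period is left in place after the ellipsis, which matches the rule that punctuation may follow an ellipsis.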

Dashes: I’m not sure about devices that use the EPUB format (Nook, iPad, Kobo Reader, etc.), but Kindles now handle em dashes reasonably well. They break at the ends of lines the way they are supposed to. Refer to a style manual for the proper use of hyphens, en dashes, and em dashes.

4 thoughts on “Punctuation & Special Characters”

  1. Hi Jaye:

    Great info, especially on the substitution of named entities for special characters!

    However, following your example of “&zwnj;—&zwnj;” has, for me at least, led to a number of “issues.” On some PC-based e-readers (including the Nook Reader, Sigil, and Calibre), the “&zwnj;” entity actually appears on-screen as a thin vertical line. And, while my Nook Tablet e-reader doesn’t show this entity, it does appear in the results of doing a Find within said e-reader that includes text around the found text.

    It’s a great idea, but my experience is that it’s not working correctly. Any suggestions for something that will work? I may just change the “&zwnj;” to a plain ol’ space. The positioning in the text may not be perfect, but that seems preferable to the vertical line.

    Just my observations….

    Jon

    • Ay yi yi, Jon. Another thing to worry about? I have no idea why that particular named entity would translate into a different character. The problem is in the ereaders and how they read formatting codes. Experimenting with different devices is the only way I would know how to root out the problem. Alas, I do not have access to a multitude of devices. What character encoding are you using for your ebooks? I’ve been having good results with UTF-8. (knock on wood)

      I’ve come to the conclusion that ereaders just plain don’t like the em dash—they certainly don’t respect it. I’ve fussed and experimented with it until even I am sick of the subject. Judging by the number of search term results about the em dash that send people to this blog I’d have to say that I’m not the only person who is frustrated with it.

      Maybe it is time to partially retire the em dash. That’s not being a smart aleck. The em dash sort of replaced parentheses for bracketing parenthetical statements. I’ve had editors tell me over the years that parentheses are “old fashioned” and “distracting,” and that I should use em dashes instead. Well, ereaders don’t mind parentheses, and crappy text justification doesn’t distort them overmuch. The only problem I ever have with them is that they look like crap when italicized, but taking care to put the italics inside the parentheses takes care of that.
