Kindlegen and the OPF file

In an earlier post, I talked about Kindlegen and how to make it sing. This post talks about the key to all that singing: the OPF file. The OPF file is essentially a list of things for Kindlegen to do. After a short header and some metadata, it consists of three parts: the “manifest,” the “spine,” and the “guide.” You feed the OPF file to Kindlegen rather than the HTML file. Let me demystify the OPF file.

First of all, an OPF file is just a plain old ordinary text file. You DO NOT create it with your word processor. You create it with a text editor. If you are running Windows, you have Notepad (dreadful) and Wordpad (not quite so dreadful) available at your fingertips. I use TextPad. Lots of people use Notepad++.

The OPF file is an XML file, so it has some stuff in the header:

<?xml version="1.0" encoding="utf-8"?>
<package xmlns="http://www.idpf.org/2007/opf" version="2.0" unique-identifier="tut">

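Notice right off the bat that I created my own value for the ‘unique-identifier.’ It doesn’t matter what it is, just be consistent: the value is meant to match the id of a dc:identifier line in the metadata (mine appears further down), and it should stay the same between files.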

We now come to the metadata. The word “metadata” means data about data. Sort of data-squared. The metadata starts with, appropriately enough, a metadata tag:

<metadata xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:opf="http://www.idpf.org/2007/opf">

Don’t worry what it means, just pick it up verbatim and put it in your file.

Now you need to add some metadata. You can probably figure out what to do from context:

<dc:title>The Unexpected Traveler</dc:title>
<dc:language>en-us</dc:language>
<meta name="cover" content="my-cover-image" />

Obviously your title goes in place of mine (The Unexpected Traveler). Copy the two lines after the title line verbatim. The line about the cover is just a prelude to actually telling Kindlegen what the cover image is: its content value, my-cover-image, has to match the id of the cover item in the manifest.

<dc:identifier id="tut" opf:scheme="ISBN">9781375890815</dc:identifier>

Use that line only if you have an ISBN. Hint: for a Kindle book, you don’t need an ISBN.

<dc:creator>David Casler</dc:creator>
<dc:publisher>Mt. Sneffels Press</dc:publisher>
<dc:subject>Fantasy</dc:subject>
<dc:date>2012-03-01</dc:date>
<dc:description>The best fantasy novel you've ever read!</dc:description>
</metadata>

Obviously you’ll put your own data in there. The date is the book’s release date. And you’ll want to make your description longer—think of it as the description that goes on the back cover. You close the metadata with the </metadata> tag.
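The "be consistent" advice about the unique-identifier can be checked mechanically. Here is a minimal sketch in Python (standard library only) that confirms the value on the package tag matches the id of a dc:identifier line. The function name identifier_matches and the trimmed-down sample text are illustrative assumptions, not anything Kindlegen itself provides.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"

# Trimmed-down sample built from the header and metadata lines shown above.
SAMPLE_OPF = """
<package xmlns="http://www.idpf.org/2007/opf" version="2.0" unique-identifier="tut">
<metadata xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:opf="http://www.idpf.org/2007/opf">
<dc:title>The Unexpected Traveler</dc:title>
<dc:language>en-us</dc:language>
<dc:identifier id="tut" opf:scheme="ISBN">9781375890815</dc:identifier>
</metadata>
</package>
"""


def identifier_matches(opf_text):
    """Return True if the package's unique-identifier names a dc:identifier id."""
    package = ET.fromstring(opf_text)
    wanted = package.get("unique-identifier")
    # Collect the id attribute of every dc:identifier element in the file.
    found = [el.get("id") for el in package.iter("{%s}identifier" % DC_NS)]
    return wanted is not None and wanted in found


print(identifier_matches(SAMPLE_OPF))
```

If the two values drift apart (say, after copying a header from someone else's sample), the function returns False, which is a cheap thing to learn before running Kindlegen.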

The next element is the manifest. Now this doesn’t mean “manifest destiny” or anything so magnificent. It’s just a list. A list of the cargo on a ship is called the ship’s manifest. That’s where the word comes from: blame the shipping industry. But it’s just a list. You start it and close it in the usual way:

<manifest>

What comes after is your list of HTML files. Note that I split my HTML into three parts: the front matter, the table of contents, and the rest. Each gets a mention in the manifest.

<item id="front_matter" media-type="application/xhtml+xml" href="tut_front_matter.html"></item>

The “front matter” is everything from the cover up until the start of the table of contents. Next, I pulled out the table of contents into its own html file:

<item id="item1" media-type="application/xhtml+xml" href="tuttoc.html"></item>

And now comes the rest of the book:

<item id="item2" media-type="application/xhtml+xml" href="tut.html"></item>

Now comes your cover image. Copy the tag verbatim, except put in the file name of your own cover image. Here I have a concept image that I’m using for testing. (You need to change image/jpeg to image/gif if you’re using a GIF cover image.)

<item id="my-cover-image" media-type="image/jpeg" href="Brett_Concept.jpg"/>

Now this next part is very important. It looks like a table of contents. But, no, it’s an NCX file, which is the subject of another post. Note again that you keep everything verbatim except the actual file name.

<item id="toc" media-type="application/x-dtbncx+xml" href="tutbas.ncx"/>
</manifest>
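Since every href in the manifest has to match an actual file name exactly, a quick check before running Kindlegen can save a round of head-scratching. The following is a minimal sketch, assuming all files sit in the same directory as the OPF file and use plain relative names; missing_manifest_files is a hypothetical helper name.

```python
import os
import xml.etree.ElementTree as ET

OPF_NS = "{http://www.idpf.org/2007/opf}"


def missing_manifest_files(opf_path):
    """Return the manifest hrefs that do not exist next to the OPF file."""
    base_dir = os.path.dirname(os.path.abspath(opf_path))
    root = ET.parse(opf_path).getroot()
    missing = []
    for item in root.iter(OPF_NS + "item"):
        href = item.get("href")
        # Assumption: hrefs are plain relative file names (no URL escaping).
        # Note that Windows file systems ignore case, so a case mismatch
        # may pass here and still cause trouble elsewhere.
        if href and not os.path.exists(os.path.join(base_dir, href)):
            missing.append(href)
    return missing
```

Calling missing_manifest_files("tut.opf") would return an empty list when every listed file is present, or the names that could not be found.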

Okay, the next part is the spine. The spine tells Kindlegen in what order you want things to appear. (Aha, you think, didn’t the manifest convey that information? No. The manifest is just a list. Not an ordered list. Just a list.)

<spine toc="toc">
<itemref idref="front_matter"/>
<itemref idref="toc"/>
<itemref idref="item1"/>
<itemref idref="item2"/>
</spine>

The careful observer will note that the idref attributes in the spine refer to the id attributes of items in the manifest. Make your items match accordingly. Note especially here that you can break up your book into chunks to make handling easier. For example, each chapter can be its own HTML file. The spine section sets the order so that Kindlegen can sew things together.
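The matching between spine and manifest is easy to get wrong by hand, so here is a minimal sketch of a cross-check in Python. It reports spine entries that point at no manifest item, and HTML items in the manifest that never appear in the spine. The function name check_spine is an assumption for illustration; the sample text is the manifest and spine from this post.

```python
import xml.etree.ElementTree as ET

OPF_NS = "{http://www.idpf.org/2007/opf}"

SAMPLE_OPF = """
<package xmlns="http://www.idpf.org/2007/opf" version="2.0" unique-identifier="tut">
<manifest>
<item id="front_matter" media-type="application/xhtml+xml" href="tut_front_matter.html"/>
<item id="item1" media-type="application/xhtml+xml" href="tuttoc.html"/>
<item id="item2" media-type="application/xhtml+xml" href="tut.html"/>
<item id="my-cover-image" media-type="image/jpeg" href="Brett_Concept.jpg"/>
<item id="toc" media-type="application/x-dtbncx+xml" href="tutbas.ncx"/>
</manifest>
<spine toc="toc">
<itemref idref="front_matter"/>
<itemref idref="toc"/>
<itemref idref="item1"/>
<itemref idref="item2"/>
</spine>
</package>
"""


def check_spine(opf_text):
    """Return (unknown_idrefs, html_items_missing_from_spine)."""
    root = ET.fromstring(opf_text)
    # Map each manifest id to its media type.
    manifest = {item.get("id"): item.get("media-type")
                for item in root.iter(OPF_NS + "item")}
    spine_ids = [ref.get("idref") for ref in root.iter(OPF_NS + "itemref")]
    # Spine entries whose idref matches no manifest id.
    unknown = [i for i in spine_ids if i not in manifest]
    # HTML files listed in the manifest but never placed in the spine.
    unlisted = [i for i, media_type in manifest.items()
                if media_type == "application/xhtml+xml" and i not in spine_ids]
    return unknown, unlisted


print(check_spine(SAMPLE_OPF))
```

Two empty lists mean every spine entry has a manifest item and every HTML file has a place in the reading order.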

Now there’s just one last thing, the guide section. You’ll note a suspicious relationship between the items in the guide section and the navigational items found on a Kindle or Kindle App. Specifically here we tell Kindlegen that the Table of Contents is actually the Table of Contents that it’s looking for. Further, the Welcome is actually the place to which Kindle first opens the book, usually at the beginning of the first chapter.

<guide>
<reference type="toc" title="Table of Contents" href="tuttoc.html"></reference>
<reference type="text" title="Welcome" href="tut.html#start_here"></reference>
</guide>
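The guide entries point at files too, so the same kind of check applies. This minimal sketch strips any #anchor from each guide href and confirms the file is listed in the manifest. It does not open the HTML to confirm that the anchor itself (start_here in my case) exists. The name unknown_guide_targets is a hypothetical helper, not part of any tool.

```python
import xml.etree.ElementTree as ET

OPF_NS = "{http://www.idpf.org/2007/opf}"

SAMPLE_OPF = """
<package xmlns="http://www.idpf.org/2007/opf" version="2.0" unique-identifier="tut">
<manifest>
<item id="item1" media-type="application/xhtml+xml" href="tuttoc.html"/>
<item id="item2" media-type="application/xhtml+xml" href="tut.html"/>
</manifest>
<guide>
<reference type="toc" title="Table of Contents" href="tuttoc.html"/>
<reference type="text" title="Welcome" href="tut.html#start_here"/>
</guide>
</package>
"""


def unknown_guide_targets(opf_text):
    """Return (type, file) pairs for guide references not found in the manifest."""
    root = ET.fromstring(opf_text)
    manifest_hrefs = {item.get("href") for item in root.iter(OPF_NS + "item")}
    problems = []
    for ref in root.iter(OPF_NS + "reference"):
        # Drop the "#anchor" part; only the file name is listed in the manifest.
        target = ref.get("href", "").split("#", 1)[0]
        if target not in manifest_hrefs:
            problems.append((ref.get("type"), target))
    return problems


print(unknown_guide_targets(SAMPLE_OPF))
```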

We now close the file:

</package>

What could be simpler? (I’m laughing through gritted teeth. It took me hours to figure all this out. Hopefully my notes are a helpful adjunct to Amazon’s rather inadequate explanations.)
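For anyone who would rather generate the file than type it, here is a minimal sketch that assembles the same parts (metadata, manifest, spine, guide) with Python's standard library. Several things in it are assumptions rather than requirements: the function name build_opf, the generated ids item1, item2 and so on, the fixed id BookId, a JPEG cover, and a spine that lists only the HTML files, which differs from the spine shown in this post.

```python
import xml.etree.ElementTree as ET

OPF = "http://www.idpf.org/2007/opf"
DC = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("", OPF)    # write OPF elements without a prefix
ET.register_namespace("dc", DC)   # write Dublin Core elements as dc:...


def build_opf(title, creator, uid, html_files, cover_file, ncx_file,
              toc_file, start_href):
    """Return the text of an OPF file listing html_files in reading order."""
    package = ET.Element("{%s}package" % OPF,
                         {"version": "2.0", "unique-identifier": "BookId"})

    metadata = ET.SubElement(package, "{%s}metadata" % OPF)
    ET.SubElement(metadata, "{%s}title" % DC).text = title
    ET.SubElement(metadata, "{%s}language" % DC).text = "en-us"
    ET.SubElement(metadata, "{%s}creator" % DC).text = creator
    # The id here matches unique-identifier on the package element.
    ET.SubElement(metadata, "{%s}identifier" % DC, {"id": "BookId"}).text = uid
    ET.SubElement(metadata, "{%s}meta" % OPF,
                  {"name": "cover", "content": "my-cover-image"})

    manifest = ET.SubElement(package, "{%s}manifest" % OPF)
    spine = ET.SubElement(package, "{%s}spine" % OPF, {"toc": "toc"})
    for number, href in enumerate(html_files, start=1):
        item_id = "item%d" % number
        ET.SubElement(manifest, "{%s}item" % OPF,
                      {"id": item_id, "media-type": "application/xhtml+xml",
                       "href": href})
        # Spine order follows the order of html_files.
        ET.SubElement(spine, "{%s}itemref" % OPF, {"idref": item_id})
    # Assumption: the cover is a JPEG; use image/gif for a GIF cover.
    ET.SubElement(manifest, "{%s}item" % OPF,
                  {"id": "my-cover-image", "media-type": "image/jpeg",
                   "href": cover_file})
    ET.SubElement(manifest, "{%s}item" % OPF,
                  {"id": "toc", "media-type": "application/x-dtbncx+xml",
                   "href": ncx_file})

    guide = ET.SubElement(package, "{%s}guide" % OPF)
    ET.SubElement(guide, "{%s}reference" % OPF,
                  {"type": "toc", "title": "Table of Contents",
                   "href": toc_file})
    ET.SubElement(guide, "{%s}reference" % OPF,
                  {"type": "text", "title": "Welcome", "href": start_href})

    body = ET.tostring(package, encoding="unicode")
    return '<?xml version="1.0" encoding="utf-8"?>\n' + body
```

With my file names, the call would look like build_opf("The Unexpected Traveler", "David Casler", "9781375890815", ["tut_front_matter.html", "tuttoc.html", "tut.html"], "Brett_Concept.jpg", "tutbas.ncx", "tuttoc.html", "tut.html#start_here"). Save the returned text as a .opf file and feed that file to Kindlegen as described above. The output comes out on one long line, which is still valid XML.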


26 Responses to Kindlegen and the OPF file

  1. IRAL NELSON says:

    Dave,

    Thanks for the prompt reply and info.

    Iral

  2. Dave says:

    As you can tell from the date of the post (four years old), it’s been a long time since I’ve been through this. Amazon is always changing how they do things. I put my TOC in a separate file and referenced it in the OPF file. What’s in the OPF file has to carefully and exactly match the actual file names. Otherwise I have to plead ignorance—it’s just been too long. Sorry I couldn’t offer more help.

  3. IRAL NELSON says:

    Dave,
    Thanks for the info on the OPF. I have yet to put it together for my book, but will give it a whirl tomorrow.
    Right now I have my book published on Kindle, but the TOC is greyed out. All of the TOC links in the HTML file (which is the whole book) I submitted work.
    So my questions: 1) Do I need to extract the TOC links in the HTML file and redo them in another HTML file? 2) If using my current HTML file (or another TOC HTML file) and the OPF, how do I get the two onto the KindleGen using the open Books feature on Kindle Previewer?
    Please advise.

  4. Dennis says:

    “Appearance of Text” on Kindle for PC app.

    I am running Kindle for PC. This is how the text appears on that device. To me it appears that there is one blank line between one paragraph and the next.

    You say your paragraphs are evenly spaced on your ipad, does that mean no space or one space between paragraphs?

    I will happily take this jpg down if you object to it. It was derived from a screen capture saved to MS Paint.

    Dennis

  5. Dave says:

    Dennis, I didn’t actually intend the extra space between paragraphs. On my iPad it shows up evenly-spaced. I just looked at a preview in my web browser and it looks evenly-spaced, so I am guessing it must be different for different devices. I’m glad you enjoyed the sample. By all means, buy the book! I look forward to your review. One consistent comment I get is that there are so many characters (it is, after all, an epic fantasy action-adventure). I’m thinking I should put something up on the website—maybe a list by chapter of the new characters and a little about their background. –Dave

  6. Dennis says:

    I finally got a moment to download the free sample of The Unexpected Traveler.

    The first question I have, is how do you get the blank line between paragraphs. Is this an html tag or an entry in your style sheet? I’d really like to get some “white space” in my book.

    Also, I was favorably impressed by the sample. Especially the carefully-crafted maps at the front give a highly favorable first impression.

    I liked your opening line. “Three hundred,” said Tom Ranier. Here’s a main character’s name that suggests a mountain and he’s exploring a mine. Great choice.

    As a kid, I remember reading a short story that started off, “John Smith rolled his mechanic’s slider from beneath the equipment he was working on only to see Bill Jones standing over him,” or something like that. I remember marvelling at how anyone could come up with that first line. That being said, I’d like to attend your fiction-writing seminar, but I’m 1,000 miles from you and that’s just a bit too far.

    Dennis

  7. Dennis says:

    Personal Liberty & Constitutional Liberty in 2013 is now up with the corrected mobi file. If you’re interested in the constitution, I highly recommend the book. If you’re more interested in the formatting results and the exhaustively hyperlinked ToC and List of Illustrations, get the free sample.

    Enjoy.
    Dennis

  8. Dennis says:

    Well, last night I did it. I uploaded the mobi file to amazon. In 2010, it took 3 days to get it available after the upload. Now it’s 6 hours. That’s the good news.

    I had vowed not to do it – no – no matter what. I would not upload a changed version. But then I saw it. And then I saw them. Mistakes.

    The first one I saw was that I had swapped the art between figures 5 and 6. It would be painfully obvious to anyone paying half attention.

    First there was a mountain then there was no mountain then there was. Which is to say my foreword page was there, but it wasn’t there. It was in a separate file. When I started scrolling down from the cover page, I got the full title page, the acknowledgements, and then the table of contents. No foreword. But I could click on the foreword link in the ToC and voila – there it was. It was great on the previewer. Then I noticed the file position was at 100%. The foreword was at the END of the book, not near the front. I had failed to list it in the [spine] section of the .opf file. I made an entry in the appropriate space and it went where it was supposed to be.

    From this, I gather that the only way kindlegen “knew” the file was there was because it was listed in the toc.ncx file and in the manifest section of the .opf file. Also, there were hyperlinks to anchor name tags in the file in the ToC file. Apparently, if it’s not in the spine, kindlegen just puts it at the back of the bus.

    Then there was the matter of the internal title page, that page with the full title in text only immediately following the cover. I used heading tags. Kindle defaults to left justify on headings. I went to the [font size="7"] tag. It’s officially deprecated after html 4.0.1 or something like that but kindle supports it.

    I used sizes 7 5 and 4. Default is 3. Funny thing, I had to put [center] and [/center] tags before and after each item I wanted to be on a single line. Also, the [br] tag worked better to give vertical spacing than the [p] tag on this page.

    Kindlegen still spits out a flurry of forcefully closed [p] tags. I don’t have any idea what that’s about, but there they are!

    Last but not least, this version of the book is the final unedited draft. I found I had typed “undedited” instead of “unedited” (note stray “d” after “n” in first typing). I fixed that too. How do YOU spell onometapaeia?

    After taking care of these items, I uploaded the new mobi file. When I know it’s up there, I’ll post a link. But the foreword helps sell the book. It’s nice if it appears in the sample. With it sitting in the caboose, that doesn’t happen. The rendering of the internal title page was not conducive to creating a favorable impression. Neither was the misspelling of unedited. Fixing those will not hurt anything. And getting that art in the proper spot is another big plus.

    Dennis

  9. Dennis says:

    A few things. I just achieved everything I wanted to in this book.

    One of the things I took the most satisfaction from accomplishing in the first book was having an exhaustively linked table of contents. The ToC took you to the corresponding heading in the text. Clicking a “Back” hyperlink just above each heading took you back to the corresponding entry in the ToC.

    In this book, I wanted to hyperlink the headings themselves, and get rid of the “Back” hyperlinks.

    There was some erroneous info on-line, as one might imagine.

    Here is a sample ToC and text link

    (ToC)
    [a name="toc163"][/a]
    [a href="chapter10.html#text163"]Amendment I[/a][br]

    (text)
    [a name="text163"][/a][br]
    [h2][a href="htmltoc.html#toc163"]Amendment I[/a][/h2][br]

    Two points:

    You cannot have the href attribute and the name attribute in the same tag. You have to use separate tags placed immediately adjacent to each other.

    Also, the hyperlink for the heading to ToC link must be nested within the heading tag – not the other way around. Both ways work with MSIE, but only the method shown works with Kindle.

    The other guy’s stylesheet disabled the [blockquote] tag. I needed nested blockquote tags to show h1, h2 and h3 levels of headings. I took out the line in his css file that disabled it. Voila! It works fine.

    Turns out, you can still run kindlegen on an html file. I did this to test my new heading hyperlinks. Still, you don’t get the virtual navigation links you do if you use the toc.ncx file. A friend with a Kindle bought my previous book and was very disappointed to find that menu empty. I hope he’s a better poker player than he showed that day! Really, you could see it on his face. Kindle owners really want these features to work.

    I wrote a program in visual basic 5 (circa 1997 – the file i/o in vb 2005 baffled me – If MS ever quits supporting VB5, I’ll be out of the programming business). I had intended to make it elegant and pepper it with plenty of comments. The way it went, I barely can follow it and I wrote it in 3 days. Oh, well – if anyone wants the source code, contact me.

    With 230 or so headings to hyperlink, you can see how a program is worth the time and effort – especially when what worked in 2010 did not work this time.

    Also, he forced a page break before both h1 and h2 headings. You cannot just change the option from always to avoid – you have to remove the line entirely from the css file.

    Amazon’s publishing guidelines recommended .gif files for line art. All my art (except the cover) is line art, so I went to the trouble of converting from jpg to gif. They rendered atrociously in the Kindle Previewer.

    A couple of my jpegs had “ghost” shading in a place I wanted unshaded. If you worked on that in the jpeg format in MS Paint (anybody besides me on a shoestring here?!), you get very spotty fills.

    I reprocessed my original files. I began all my work in double-sized bit map files. Kindle allows 500 X 300, so I made the oversized version 1000 X 600. Lacking as I do, the hand-eye co-ordination of a neurosurgeon, the oversized versions made up for a lot of my limitations. Last time (in 2010) I’m pretty sure I went to jpg too soon. This time, I did everything in the oversized bitmap (with a _oversized suffix in the file name) and then shrunk it to final size (with a _finalsize suffix in the file name) all in the bmp format. Then I saved the final-sized bmp file as a jpeg. Voila! No ghost shadows.

    My cover art only includes the first line of the full title along with my name. The title page immediately following includes the full title, which covers almost an entire page. It uses h1, h2, and h3 headings for effect. Kindle left-justifies these. I wanted everything centered. With the css stylesheet, the page rendered horribly in MSIE, but at least it was centered. I tried commenting the css file out for that file, but kindlegen refused to compile the mobi. Except for the left-justification where I don’t want it, it’s ok.

    Oh well, that’s everything I can think of right now.

    Dennis

    PS: One VERY helpful pointer I got from your site is the thing about feeding kindlegen the content.opf file. I have seen that nowhere else, but it works! Thanks. That probably saved me from giving up on this entirely.

  10. Dave says:

    Dennis, thanks for sharing this – this page is one of the most-visited on my site, so you’re helping lots of other people too. My most recent experience was last May, when I uploaded my fantasy epic action adventure. You can download the first few pages as a free preview by going to the book’s Amazon page, and you’ll see that my TOC links all work, plus a link to my site where higher-res maps are located. (Of course, if you downloaded the whole thing, read it, and wrote a five-star review, I’d really like that too 😉 ) So good luck with your book. Please give me the Amazon link so I can check it out!- Dave

  11. Dennis says:

    Well, I finally got a mobi file tonight.

    Frankly, I have made so many changes in such a fast and furious fashion, I cannot begin to guess what the problem was before. I only got 2 lines, one being a “forcibly closed unclosed tag” in line 7050. Funny thing, the tag was there.

    I was happy to see my cover art rendered in black and white. It turned out very nicely.

    On the other hand, I went to a great deal of trouble to exhaustively hyperlink my table of contents and text forward and backward – click on title in toc, go to related text. click on title in text, go to that title in toc. I have 228 titles in 3 levels.

    In 2010, this worked. 2013, not so much. I used nested <blockquote> tags. The other guy’s sample stylesheet cancelled <blockquote>. Amazon publishing guidelines say <blockquote> is supported, so I removed the offending reference.

    I still get a mass of forcefully closed unclosed tags – but now, they’re all tags. Go figure.

    There are still a couple of vague errors there in the kindlegen output.

    Oh, well, just in case anyone was interested in the results of my new approach.

    Dennis

  12. Dennis says:

    I could not stop browsing. Am doing a shotgun research project on this, as I had hoped to publish by Saturday and had even made announcements to that effect.

    I found highly promising info at this link:

    http://bbebooksthailand.com/bb-epub-kindlegen-tutorial.html

    I consider it highly promising because the guy discusses directory structure and the proper content for several of the “administrative” files (files other than content). Also, he is addressing kindlegen specifically.

    Following the heading EPUB Structure and just above the heading The mimetype and container.xml Files he shows the directory structure.

    I see there are no fixed file names for these files, except possibly the container file. Some guys call the css file style.css, others stylesheet.css etc. Each of these is specified in a superior/parent file, along with path. That being so, why not put everything in one directory and just reference the file name with no path? It should work, but that’s one of the things I’ve been doing all day long and it hasn’t worked. So tomorrow, it’s time to try something rather different.

    I’m too beat to do any more coding tonight – I think I’ll hit this fresh tomorrow. If your next kindle book bombs, this guy looks like a good reference too.

    Thanks again for your help. (So far you’re the first and only person to respond – and in less than an hour. That was really nice.)

    Dennis

  13. Dennis says:

    Thanks for the input.

    I used kindlegen in 2010. Those were the days! I just ran it on the html file and it did everything. Gosh!

    At least I know that having all the files in one directory worked for you. I bet that’s not the source of the problem I’m having.

    Dennis

  14. Dave says:

    Dennis, the whole thing requires considerable experimentation. I had all my files – HTML, JPG, OPF, NCX, you name it, all in a single directory. Of course, the HTML called for the JPGs in the same directory (sometimes when you create an HTML file from your word processor, it might want to put the images in a subdirectory). I would suggest that you start with a small experimental file, maybe just the first chapter of your book, and keep experimenting until it works. Then you can expand until you get the whole thing. Good luck!

  15. Dennis says:

    Do you put the .opf file in the same directory as the .html files? I’m havin’ heck with this. I keep getting cover not specified and closed unclosed tag at line 7050. Odd thing is, the tag is closed.

    I have read a few sites that mention various directory names where certain files go.

    Can you help?

    Thanks,
    Dennis

  16. Sean says:

    Problem solved! My bookmark wasn’t named “toc”, but something else. As soon as I changed it in the html document and in the opf, it now works!!
    Thanks for your post and your answer to my first question, which have played a big part in getting me to the finish line.

  17. Sean says:

    Thanks for the tip re changing the .txt. I wasn’t able to do it on my mac but was able to do it ok in Notepad on a PC. So, I now have my html file (which has no images) which looks fine on the Kindle Previewer when I upload it on its own. I now also have an opf file which as far as I can tell (having cross-checked many times) seems to have all that it needs. But when I upload them both together in a zip file, the Table of Contents button is still greyed out! I’ve tried many times now, tweaking here and there, but no luck. There must be something that’s preventing Kindle from recognising the opf file or what’s in it. Should there be a third file uploaded?
    Thanks again.

  18. Dave says:

    Sean, I wish I could help you, but Macs are mysteries to me. You can try this URL: macdoktor.blogspot.com/. John knows Macs far better than I do.

  19. Sean says:

    This was very helpful, thank you very much. I am using TextEdit on a Mac. I have a question about saving the document in .opf format. Is it possible to get this extension using a text editor? I tried saving it as mybook.opf, but it automatically adds the .txt extension. Do you know if Notepad allows you to save it as .opf? If not, is special software needed to save the text file in .opf format?
    Many thanks.

  20. dave says:

    David, I haven’t updated my Kindle info since I uploaded The Unexpected Traveler. Kindlegen is constantly changing, so I probably can’t offer you more than encouragement to keep experimenting.

  21. David Roys says:

    Hi Dave,

    Brilliant explanation! I wish I’d read this before trying to figure it out for myself. I used the Amazon-supplied sample and replaced it with my files.

    I think I’ve done everything right (don’t we always?) but when I download my book from Amazon, it opens at the table of contents, even though I have the text element in the guide section pointing to chapter 1. It works as expected when I email the Prc (can’t remember if that’s the book) produced by KindleGen to my device, but when I download it’s messed up. Any ideas?

  22. Stephen Goss says:

    Hey, Dave!

    Thanks again for the insight you’ve shared with me and other authors.
    Heidi’s Hope is now available on the Kindle. I decided to charge $2.99.
    I’m glad I took the extra time to figure out (with your kind help) how to get my Kindle book to work with the Kindle’s navigational tools. A friend of mine bought HEIDI’S HOPE for his Kindle, then he whipped out his IPAD and we looked at HEIDI’S HOPE on both devices. Wow, what a treat (you’re an author, so you know what I mean!) The navigation tools on the Kindle and IPAD worked as hoped! Yeah!!!
    I trust you are making progress on your Kindle book.
    Thanks again.

  23. Stephen Goss says:

    This reply covers four things: 1 – Thanks, 2 – A Problem, 3 – A Solution, 4 – A Question

    1 – Thanks: I’ve spent over a week scratching my head and other parts of my anatomy trying to figure out how to assemble my book through Kindlegen and get the navigational tools to navigate to the cover and the Table of Contents. Your blog helped me over the hump. THANK YOU!

    2 – A Problem: I have Word 2003. I took my book file and broke it into 3 parts, as you suggested on your blog. But when I extracted my original table of Contents (TOC) and made a separate TOC.html file, I lost all the links. Even when I edited the links with long path names, it didn’t work right; the links were broken. I was rethinking the syntax when I got the following solution:

    3 – A Solution: I still used 3 html files as you suggested, BUT I kept my toc in the 3rd file. I made the second file a “contentless” html file. I figured kindlegen would build nothing from it and then tack on the real Table of Contents from the third file. The actual code of the second file is as follows:

    <html>
    <head>
    <meta http-equiv=Content-Type content="text/html; charset=windows-1252">
    <meta name=Generator content="Microsoft Word 11 (filtered)">
    <title>Heidi’s Hope</title>
    </head>
    <body lang=EN-US>
    </body>
    </html>

    Other changes I made to the OPF file were as follows: 1 – I removed the line which specified the NCX file, and 2 – I took the toc reference out of the spine. The spine code is as follows:

    <spine>
    <itemref idref="part1f"/>
    <itemref idref="part2t"/>
    <itemref idref="part3m"/>
    </spine>

    where “part1f” is the front end material, “part2t” is the contentless HTML file described above, and “part3m” is the main file containing the Table of Contents and the story.

    With these changes, I ran the OPF file through KindleGen and the resulting prc file was successfully built with no errors or warnings. Most importantly, the navigational tools work when I run the prc file through my Kindle For PC application.

    4 – A Question. Is it true that the prc file is COMPILED? I hope that it is. The way I see it, if the prc file is compiled and it works correctly on the kindle today, then it should work on the kindle tomorrow or 5 years from now. Would you agree?

    God’s Grace, and thanks again!

    Stephen Goss
    Author of HEIDI’S HOPE
    available now (in book form) on Amazon
    Kindle version pending

  24. Peter says:

    You are a life saver. Thanks so much for saving me sooo much time!

  25. dave says:

    Well, not actually twice. You can skip the NCX file if you wish. But there’s a feature on some Kindles that can take advantage of the NCX file. It would seem that one ought to be able to create a single HTML file and have it tell Kindlegen what the cover is and what the TOC is. But, alas, I have figured out how to make the single file tell Kindlegen what the cover image is, but not what the TOC is, hence the need for the OPF. I created the TOC in Open Office and told it to make all the entries be hyperlinks. Then I created the HTML file, then yanked the resulting HTML for the TOC out into a separate file, then “manifested” them in the OPF. To me it all seems absurdly complicated. However, we have to deal with the world the way it is, not the way it should be!

  26. Sam says:

    So you need to have the toc twice? Once as html file for display (item1 in your case) and once as ncx file for kindle itself (toc in your case)?

    Can’t the html toc be contained within the html file for the book? This way it could be tested just by opening the html file in a browser and clicking.
