
What's the Difference: e-book formats

29 Jan 2018

Surprisingly, for something as simple as an e-book, a great many formats have been invented.

Sometimes it seems that every manufacturer of reading devices considers it a duty to come up with a format of its own.

How do they differ from one another, which formats do readers understand, and what can you do if your reader cannot open the file you want? This article covers all of that.

WHAT FORMATS EXIST

AZW

Amazon's proprietary format, used in its Kindle readers (AZW is said to stand for Amazon Word). It is based on the Mobipocket standard (Mobipocket was acquired by Amazon in 2005) and follows it almost exactly, apart from details such as the lack of JavaScript support and the compression used. Books in AZW come both with and without DRM protection. The protection ties a book to the account that purchased it, so it can be read on any device linked to that account (a single account can have up to six devices). The AZW format also supports companion files that store bookmarks, quotations, reading progress and some other metadata.

A newer version of the format, AZW4, has also appeared recently. It is essentially a PDF, and Amazon calls it "Print Replica", meaning an exact copy of the printed page. On top of the usual PDF functionality, AZW4 supports Kindle-specific features such as annotations and synchronization of reading position between devices.

CHM

The full name is Microsoft Compiled HTML Help. It is Microsoft's proprietary format for contextual help, based on HTML. Unlike plain HTML, it can hold a set of pages and images in a single file. As an e-book format it is of interest mainly for storing documentation, since ordinary books are not distributed in it. A variant of CHM is the LIT format (short for "literature") used by the Microsoft Reader software (support for which, along with the LIT format itself, was discontinued in August 2012).

DjVu

The format is designed for storing scanned documents. Its processing algorithms separate text and graphics into different layers, each with its own compression method, which makes a very high compression ratio possible: a DjVu file can be about 10 times smaller than a PDF of comparable quality. This makes DjVu a strong option for storing large collections of technical documentation with graphical illustrations. Note that if the file contains a text layer, the user can run a full-text search on the document. When only a single graphic layer is used, this variant of the format is called IW44, and some readers list it separately among the supported formats, although in practice any device that can open a DjVu document will have no trouble with IW44 files.

ePub

The name is short for "electronic publication". It is an open format developed by the International Digital Publishing Forum (IDPF). ePub is based on XHTML and XML, with optional CSS styling. The format was designed for reflowable documents, which lets the display of a book adapt to the screens of different devices. ePub replaced the organization's previous standard, Open eBook. An ePub container is really a Zip archive with the .epub extension; it holds the text in XHTML, HTML or PDF form. The container can also hold image files, including vector graphics, and embedded fonts. The latest version, 3.1, is meant to address the shortcomings ePub was criticized for earlier: poor suitability for fixed-layout books, no support for MathML mathematical expressions, and a number of others.
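
Because the container is an ordinary Zip archive, it can be inspected with standard tools. The sketch below is only an illustration, not a validator: it builds a tiny ePub-like archive in memory and checks the `mimetype` entry, which in an ePub container is the first entry and holds the string `application/epub+zip`. The function names and the sample paths (`OEBPS/content.opf`, `OEBPS/chapter1.xhtml`) are made up for this example, and the sample archive is deliberately incomplete (it has no package file).

```python
import io
import zipfile

# Illustrative container.xml; the path "OEBPS/content.opf" is a made-up example.
CONTAINER_XML = (
    '<?xml version="1.0"?>'
    '<container version="1.0" '
    'xmlns="urn:oasis:names:tc:opendocument:xmlns:container">'
    '<rootfiles><rootfile full-path="OEBPS/content.opf" '
    'media-type="application/oebps-package+xml"/></rootfiles>'
    '</container>'
)


def build_minimal_epub():
    """Build a tiny, incomplete ePub-like archive in memory (hypothetical helper)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        # The mimetype entry is written first and stored without compression.
        zf.writestr("mimetype", "application/epub+zip",
                    compress_type=zipfile.ZIP_STORED)
        zf.writestr("META-INF/container.xml", CONTAINER_XML,
                    compress_type=zipfile.ZIP_DEFLATED)
        zf.writestr("OEBPS/chapter1.xhtml",
                    "<html><body><p>Hello</p></body></html>",
                    compress_type=zipfile.ZIP_DEFLATED)
    return buf.getvalue()


def list_entries(data):
    """Return the archive's entry names in the order they were stored."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        return zf.namelist()


def looks_like_epub(data):
    """Rough check: a Zip archive whose mimetype entry names the ePub type."""
    stream = io.BytesIO(data)
    if not zipfile.is_zipfile(stream):
        return False
    with zipfile.ZipFile(stream) as zf:
        if "mimetype" not in zf.namelist():
            return False
        return zf.read("mimetype").strip() == b"application/epub+zip"


if __name__ == "__main__":
    epub_bytes = build_minimal_epub()
    print(list_entries(epub_bytes))
    print(looks_like_epub(epub_bytes))
```

For a real book, the same functions can be fed the bytes of an .epub file read from disk.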

At the moment, ePub is the most widespread format and is supported by almost all modern readers (except, perhaps, the Amazon Kindle). The ePub standard allows DRM protection to be included in a file, but the specification does not restrict the publisher's choice of scheme.

FB2

FB2, or FictionBook version 2, is based on XML. According to its developer, the main goals were to preserve the document structure in full, to allow easy (ideally automatic) conversion to other formats, and to display accurately on any device. What sets it apart from other formats is the focus on structure rather than appearance: FB2 does not define how the document will look on different devices or in print. Instead, special elements mark the various parts of the book, such as quotations, epigraphs, poetry and so on. A book in FB2 is stored in a single XML file: the images it contains are encoded in Base64 and inserted using a special tag, which increases the file size somewhat.
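
The Base64 embedding can be illustrated with a short sketch. It assumes that images sit in `binary` elements carrying `id` and `content-type` attributes, under the FictionBook 2.0 namespace; the sample document, the placeholder "image" bytes and the function names are invented for this example. Base64 produces four characters for every three bytes of input, which is where the size increase comes from.

```python
import base64
import xml.etree.ElementTree as ET

# FictionBook 2.0 namespace (assumed here for the sample document).
FB2_NS = "http://www.gribuser.ru/xml/fictionbook/2.0"

# Placeholder "image": 300 arbitrary bytes, not a real picture.
SAMPLE_IMAGE = bytes(range(256)) + bytes(44)


def build_sample_fb2(image):
    """Build a minimal FB2-like document with one embedded image (hypothetical)."""
    encoded = base64.b64encode(image).decode("ascii")
    return (
        f'<FictionBook xmlns="{FB2_NS}">'
        "<body><section><p>Hello</p></section></body>"
        f'<binary id="cover.png" content-type="image/png">{encoded}</binary>'
        "</FictionBook>"
    )


def extract_binaries(fb2_text):
    """Return a dict mapping each binary element's id to its decoded bytes."""
    root = ET.fromstring(fb2_text)
    result = {}
    for element in root.iter(f"{{{FB2_NS}}}binary"):
        result[element.get("id")] = base64.b64decode(element.text or "")
    return result


if __name__ == "__main__":
    document = build_sample_fb2(SAMPLE_IMAGE)
    images = extract_binaries(document)
    raw_size = len(images["cover.png"])
    encoded_size = len(base64.b64encode(SAMPLE_IMAGE))
    print(f"raw: {raw_size} bytes, Base64: {encoded_size} characters")
```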

The format was developed in Russia, so it is no surprise that it became especially popular in the Russian-speaking world: all readers from local brands support it, and a number of online libraries offer books in FB2.

At the end of 2008 the first information appeared about the next version of the format, FictionBook 3.0, but it seems to have gone no further than a description, although the features looked quite promising (use of the Open Packaging Convention standard, a container in the form of a Zip archive with separate files for text, images and metadata, and so on).

MOBI

The format used by the free MobiPocket Reader software (available for Windows as well as mobile platforms). The main "consumer" of MOBI books is the Amazon Kindle family of readers, for which it is effectively the only supported non-native format. MOBI books may carry either the .mobi or the .prc extension (the latter was introduced because of PalmOS restrictions on file extensions). MOBI was originally based on the PalmDOC format with some HTML tags added; a later version introduced stronger data compression. More recently, as new features were added, the format's developer has followed the Open eBook standard. Even so, MOBI places a fair number of restrictions on formatting, especially on text indentation and on images and tables embedded in the text.
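
Since the same book can hide behind either a .mobi or a .prc extension, looking at the file header is more reliable than trusting the name. The sketch below assumes the usual Palm database layout: a 78-byte header with a 32-byte name at the start and 4-byte type and creator fields at offsets 60 and 64, which read `BOOKMOBI` for MOBI files and `TEXtREAd` for plain PalmDOC. The helper names and the sample headers are invented for illustration; real files carry much more data after the header.

```python
PDB_HEADER_SIZE = 78    # size of the Palm database header, in bytes
TYPE_CREATOR_START = 60  # type (4 bytes) followed by creator (4 bytes)
TYPE_CREATOR_END = 68


def build_sample_header(name, type_creator):
    """Build a bare Palm database header for demonstration (hypothetical helper)."""
    if len(name) > 31 or len(type_creator) != 8:
        raise ValueError("name must fit in 31 bytes; type+creator must be 8 bytes")
    header = bytearray(PDB_HEADER_SIZE)
    header[0:len(name)] = name  # the rest of the name field stays zero-padded
    header[TYPE_CREATOR_START:TYPE_CREATOR_END] = type_creator
    return bytes(header)


def database_name(data):
    """Return the zero-terminated database name stored in the first 32 bytes."""
    return data[:32].split(b"\x00", 1)[0].decode("latin-1")


def classify_palm_database(data):
    """Classify a file by the type/creator fields of its Palm database header."""
    if len(data) < PDB_HEADER_SIZE:
        return "unknown"
    ident = data[TYPE_CREATOR_START:TYPE_CREATOR_END]
    if ident == b"BOOKMOBI":
        return "mobi"
    if ident == b"TEXtREAd":
        return "palmdoc"
    return "unknown"


if __name__ == "__main__":
    mobi_header = build_sample_header(b"Sample_Book", b"BOOKMOBI")
    print(database_name(mobi_header), classify_palm_database(mobi_header))
```

The check works the same way whichever extension the file carries, because it reads only the header bytes.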

PDF

Portable Document Format, or PDF for short, was developed by Adobe back in 1993 and works well on modern reading devices. The format was originally designed for printing, so it fully describes how a document should look, including paper size, fonts (which can be embedded in the document), and so on. Besides text, a PDF can contain vector and bitmap graphics as well as metadata. PDF is also a way for users to get features their devices lack, such as alternative fonts or hyphenation in the text.

The main problem arises with PDF files that are not optimized for small screens: their pages were usually laid out for monitor resolutions (this applies above all to PDF versions of magazines and to technical literature). The user then has to keep switching between the full-page view and a magnified portion. Some readers support a reflow function that re-lays out the page according to screen size and zoom level, but as a rule it does not work very well.