The Project Gutenberg FAQ - V-79

V.79. What are "8-bit" and "7-bit" texts?

For practical purposes, 7-bit texts are plain ASCII; 8-bit texts have accented letters.

This comes from computer jargon. You can represent the 128 characters of ASCII using 7 bits (binary digits), since 7 bits give 128 possible values. The various codepages and ISO-8859 standards define 256 characters, which is where accented letters live, and 256 values need 8 bits. Hence, we call a text that uses non-ASCII characters from a character set like Codepage 850 or ISO-8859-1 an "8-bit" text.
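To make the distinction concrete, here is a minimal Python sketch. It is only an illustration: the function name is_seven_bit and the sample word are assumptions, not part of any Project Gutenberg tool. It checks whether every byte of a text fits in 7 bits, and shows that the same accented letter gets a different 8-bit value in ISO-8859-1 and in Codepage 850.

```python
def is_seven_bit(data: bytes) -> bool:
    """Return True if every byte is in the range 0-127, i.e. plain ASCII."""
    return all(b < 0x80 for b in data)


# Hypothetical sample text: "cafe" without and with an accented e (U+00E9).
plain = "cafe".encode("ascii")
latin1 = "caf\u00e9".encode("iso-8859-1")
cp850 = "caf\u00e9".encode("cp850")

print(is_seven_bit(plain))    # True: every byte is below 128
print(is_seven_bit(latin1))   # False: the accented letter needs the 8th bit
print(latin1[-1], cp850[-1])  # 233 130: same letter, different byte per character set
```

The last line is why an 8-bit text only displays correctly when the reader knows which character set it was written in.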

When we post a text as both 8-bit and 7-bit, as we do when ASCII is not enough to render the text acceptably, we name the file with an "8" or a "7" at the start. So, for example, Crime and Punishment by Dostoevsky is named 8crmp10 for the 8-bit version with accents, and 7crmp10 for the 7-bit version without accents.
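As a rough sketch of how the naming convention could be read by a program, the Python function below looks only at the first character of a base filename. The function name text_variant is hypothetical, and the rule it applies is just the "8" or "7" prefix described above; it does not cover the rest of the filename scheme.

```python
from typing import Optional


def text_variant(basename: str) -> Optional[str]:
    """Guess the variant from the leading digit of a base filename.

    Returns "8-bit", "7-bit", or None when the name starts with neither digit.
    """
    if basename.startswith("8"):
        return "8-bit"
    if basename.startswith("7"):
        return "7-bit"
    return None


print(text_variant("8crmp10"))  # 8-bit
print(text_variant("7crmp10"))  # 7-bit
```

A name that starts with neither digit returns None here, since the prefix alone says nothing about such a file.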

See also FAQ [R.35]: "What do the filenames of the texts mean?"
