This question has two answers: one for books numbered up to 10,000, and one for books numbered above 10,000 or reposted since we passed 10,000.
Books posted after 10,000 go into a new, simpler, naming scheme. Books REposted after we passed 10,000 (around November 2003) also use this scheme. We are reposting many older books, with improvements and corrections, all the time, and older books may also be reposted into the new scheme.
You can see clearly from the line in GUTINDEX.ALL whether the book is in the old naming scheme or the new naming scheme. Where the line starts with a Month and Year, and contains a file-name template in square brackets, the book is still in the old scheme, for example:
Feb 2005 Mike, by P. G. Wodehouse [mikewxxx.xxx] 7423
The line for the same book, in the new naming scheme, would omit the Month and Year, and the filename base, and look like:
Mike, by P. G. Wodehouse 7423
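If you want to make this check in a script, the rule above can be sketched roughly as follows. This is only an illustration: the function name uses_old_scheme and the exact pattern are our own assumptions, based on the two sample lines shown here, and real GUTINDEX.ALL lines may need a more forgiving pattern.

```python
import re

# Assumption: an old-scheme line starts with a three-letter month and a
# four-digit year, and contains a template such as [mikewxxx.xxx].
MONTHS = "Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec"
OLD_SCHEME = re.compile(rf"^(?:{MONTHS}) \d{{4}} .*\[\w+\.xxx\]")


def uses_old_scheme(index_line: str) -> bool:
    """Return True if a GUTINDEX.ALL line looks like an old-scheme entry."""
    return OLD_SCHEME.match(index_line.strip()) is not None


if __name__ == "__main__":
    print(uses_old_scheme("Feb 2005 Mike, by P. G. Wodehouse [mikewxxx.xxx] 7423"))
    print(uses_old_scheme("Mike, by P. G. Wodehouse 7423"))
```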
Books after 10,000 -- the new naming scheme
To find a text with a number over 10,000, or one that has been reposted since we passed 10,000, you must know the eBook number. You can get this from <http://www.gutenberg.net/GUTINDEX.ALL>.
Once you know the number, you can find the directory containing all formats of it. Formally, the directory for the eBook sits inside a hierarchy of single-digit directories, one for each digit of the eBook number except the last, in order. The name of the directory for the eBook itself is the number of the eBook. But it's easier to see by example.
The files for eBook number 10214 will be found in the directory /1/0/2/1/10214 on the download site you choose. So, for example, if you are downloading eBook 10214 from our main site by HTTP from www.gutenberg.net, you can just go to
http://www.gutenberg.net/1/0/2/1/10214/ and download whichever of the formats you want.
Or, instead of typing in the whole address, for numbers beginning with the digit "1", you can just go to http://www.gutenberg.net/1/ and navigate down the list of directories.
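For anyone scripting downloads, the directory rule can be written in a few lines. Treat this as a minimal sketch: the function names are our own, and it deliberately refuses single-digit eBook numbers, because the rule as described above does not say where those live.

```python
def ebook_directory(number: int) -> str:
    """Build the new-scheme directory path for an eBook number.

    Every digit except the last becomes a one-digit parent directory,
    and the final directory is the full number.
    """
    if number < 10:
        # Assumption: single-digit numbers are not covered by the rule
        # described in the text, so this sketch does not guess.
        raise ValueError("rule is only described for numbers of two or more digits")
    digits = str(number)
    parents = "/".join(digits[:-1])
    return f"/{parents}/{digits}"


def ebook_url(number: int, site: str = "http://www.gutenberg.net") -> str:
    """Join a download site with the directory for the eBook."""
    return site.rstrip("/") + ebook_directory(number) + "/"


if __name__ == "__main__":
    print(ebook_directory(10214))  # /1/0/2/1/10214
    print(ebook_url(10214))        # http://www.gutenberg.net/1/0/2/1/10214/
```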
Books before 10,000 -- the old naming scheme
In short, just browse to:
<http://www.ibiblio.org/pub/docs/books/gutenberg/>
choose the schedule year of the text (newly-posted texts will usually be in the latest year) and look down the list to find the filename you're looking for.
In general, you need to know:
a) the address of an FTP site
b) the schedule year of the text you want
c) the basename of the text you want.
The fastest and safest FTP site to use for this is ftp.ibiblio.org, which is the first of our two primary posting sites (the other being ftp.archive.org). We post to these two sites, and then other sites copy from them at intervals, so with any FTP sites other than these two, the file may not be available immediately.
You can get the schedule year and basename of the text from its line in GUTINDEX.ALL. Let's take an example. The eBook
Mar 2004 The Herd Boy and His Hermit, by C. M. Yonge [#32][hrdbhxxx.xxx] 5313
was posted just a few hours ago as I write this. From the GUTINDEX entry, the schedule year is 2004, and the basename of the text is hrdbh.
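If you are reading GUTINDEX.ALL with a program rather than by eye, pulling out these two values might look something like the sketch below. The function name parse_old_entry is hypothetical, and the rule that the basename is the template with its trailing "x" padding removed is our inference from the examples in this answer; it would mangle a basename that itself ends in "x".

```python
import re

# Assumption: Month, Year, free text, a [basenamexxx.xxx] template,
# then the eBook number, with or without a space before the number.
ENTRY = re.compile(
    r"^[A-Z][a-z]{2} (?P<year>\d{4}) .*"
    r"\[(?P<template>\w+)\.xxx\]\s*(?P<number>\d+)\s*$"
)


def parse_old_entry(line: str) -> tuple[int, str, int]:
    """Return (schedule year, basename, eBook number) for an old-scheme line."""
    match = ENTRY.match(line.strip())
    if match is None:
        raise ValueError("not an old-scheme GUTINDEX entry")
    # Strip the trailing 'x' padding from the template to get the basename.
    basename = match.group("template").rstrip("x")
    return int(match.group("year")), basename, int(match.group("number"))


if __name__ == "__main__":
    entry = "Mar 2004 The Herd Boy and His Hermit, by C. M. Yonge [#32][hrdbhxxx.xxx]5313"
    print(parse_old_entry(entry))  # (2004, 'hrdbh', 5313)
```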
We divide our texts into directories (folders) based on the schedule year, so this eBook will be in the directory for 2004, which will be named something ending in /etext04. All the directories are named etext plus the last two digits of the year. (Somebody's going to have to change that convention about 87 years from now! :-) We currently have directories starting at 90, running through the 90s and then 00, 01, 02, 03, 04. All eBooks produced before 1991 are in the /etext90 directory, so if you're looking for
Dec 1971 Declaration of Independence [whenxxxx.xxx] 1
or
Aug 1989 The Bible, Both Testaments, King James Version [kjv10xxx.xxx] 10
you should look in /etext90.
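The directory naming rule is simple enough to express in a few lines. This is just a sketch of the convention as described here, with a function name of our own choosing; it makes no attempt to check that a directory for the given year actually exists.

```python
def schedule_directory(year: int) -> str:
    """Map a schedule year to its old-scheme directory name."""
    if year < 1991:
        # Everything produced before 1991 lives in /etext90.
        return "/etext90"
    # Otherwise: "etext" plus the last two digits of the year.
    return f"/etext{year % 100:02d}"


if __name__ == "__main__":
    print(schedule_directory(2004))  # /etext04
    print(schedule_directory(1971))  # /etext90
```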
As it happens, ibiblio supports both HTTP (web) and FTP access to the text, so we can just browse to <http://www.ibiblio.org/pub/docs/books/gutenberg/> and choose the 2004 directory from there.
If you want to automate this, you could also use the more direct address <ftp://www.ibiblio.org/pub/docs/books/gutenberg/etext04/>
The equivalent address for ftp.archive.org is <ftp://ftp.archive.org/pub/etext/etext04/>
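For automation, it may help to keep the site addresses in one place and append the year directory to whichever one you choose. The sketch below uses only the addresses quoted in this answer; the short site labels and the function name are made up for the example.

```python
# Base addresses taken from this answer; the dictionary keys are
# hypothetical labels chosen for this example.
SITES = {
    "ibiblio-http": "http://www.ibiblio.org/pub/docs/books/gutenberg/",
    "ibiblio-ftp": "ftp://www.ibiblio.org/pub/docs/books/gutenberg/",
    "archive-ftp": "ftp://ftp.archive.org/pub/etext/",
}


def year_directory_url(site: str, year_dir: str) -> str:
    """Join a site's base address with a year directory such as 'etext04'."""
    return SITES[site] + year_dir.strip("/") + "/"


if __name__ == "__main__":
    for label in SITES:
        print(year_directory_url(label, "etext04"))
```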
Either way, we see a long page of files, in alphabetical order. Scroll down to the "H"s and look for hrdbh. We see four files with this basename:
hrdbh10.txt
hrdbh10.zip
hrdbh10h.htm
hrdbh10h.zip
This means that both plain text and HTML formats are available, and you can choose to download them either zipped or uncompressed. For more detail about conventions for filenames, see the FAQ "What do the filenames of the texts mean?" [R.35]. The main thing you need to know is that any file beginning with hrdbh is some format or edition of this book.
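If you already have a directory listing as a list of filenames, picking out the files for one book is a simple prefix match. The sketch below assumes you have fetched the listing yourself; the function name is our own, and "other10.txt" is a made-up filename standing in for the rest of the directory. Note that a plain prefix match would also pick up a different book whose basename happened to begin with the same letters.

```python
def files_for_basename(listing: list[str], basename: str) -> list[str]:
    """Return the filenames in a listing that begin with the basename."""
    return sorted(name for name in listing if name.startswith(basename))


if __name__ == "__main__":
    listing = [
        "hrdbh10h.zip",
        "other10.txt",  # hypothetical unrelated file
        "hrdbh10.txt",
        "hrdbh10h.htm",
        "hrdbh10.zip",
    ]
    for name in files_for_basename(listing, "hrdbh"):
        print(name)
```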
Finally, all you have to do is click on the format you want to download.