Mirroring HOWTO

Project Gutenberg Europe is always seeking sites to mirror (copy) our collection. A mirror brings the collection closer to readers in your region. This HOWTO describes how to set one up.

The Project Gutenberg Europe etext collection may be distributed by FTP, HTTP, or both. For example, these URLs point to the same content:

The collection is over 135 GB (as of May 2004) and is expected to double in size every year or sooner. New etexts are added almost every day, so nightly mirroring is desirable. There are over 100,000 files in 21 languages and 20 different file formats.

Our experience has been that a static IP address and a permanent network connection of T1 speed (~1.5 Mbit/s symmetric) or better are the minimum for a public mirror. (Of course, you can build a private mirror with a DSL or cable modem, but sharing it with the world requires somewhat higher bandwidth.)
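To put the bandwidth figure in perspective, the following sketch estimates how long the first full download would take. It is a rough illustration only: the function name estimate_hours is made up for this example, and it assumes decimal units, a fully saturated link, and no protocol overhead, so a real transfer will take longer.

```shell
#!/bin/sh
# Rough estimate of the initial transfer time for a full mirror.
# Assumptions (not from this HOWTO): 1 GB = 8,000,000 kilobits,
# the link is fully saturated, and there is no protocol overhead.

# estimate_hours SIZE_GB LINK_KBPS
# Prints the transfer time in whole hours, rounded down.
estimate_hours() {
    size_gb=$1
    link_kbps=$2
    total_kbits=$((size_gb * 8000000))
    seconds=$((total_kbits / link_kbps))
    echo $((seconds / 3600))
}

# 135 GB collection over a T1-class link (about 1500 kbit/s).
hours=$(estimate_hours 135 1500)
echo "Initial sync: about $hours hours ($((hours / 24)) days)"
```

With these inputs the script prints "Initial sync: about 200 hours (8 days)". Later nightly runs fetch only new or changed files, so they should be much shorter.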

The best place to mirror from currently is our master download site at ftp://ftp.ibiblio.org/pub/docs/books/gutenberg. Most mirrors use rsync (easiest), wget (easy), or the Perl-based mirror software (requires some configuration). Here is an overview of each:

  1. Rsync (available for all Unix systems; standard on Linux; part of Cygwin for Windows). The last argument is the local directory for the mirror destination:

    rsync -rlHtSv --delete ftp@ftp.ibiblio.org::gutenberg /home/ftp/pub/mirrors/gutenberg

  2. Wget: Freely available from any GNU mirror. With appropriate command-line options, this can be used with either an HTTP or FTP interface, but please use the FTP URL above for Project Gutenberg Europe. The wget homepage is http://www.gnu.org/directory/wget.html. The key is to get only updated files, not files you already have. A wget command line that should work with some adjustment for your local needs (run it from wherever you want the mirror to go) is:

    wget --mirror --no-host-directories --passive-ftp --no-parent --cut-dirs=4 --output-file=/tmp/wget-gutenberg.log ftp://ftp.ibiblio.org/pub/docs/books/gutenberg

  3. Mirror Perl software: Available from http://sunsite.org.uk/packages/mirror/ (among other places). We can help you set this up on a Unix system. The mirror software has been reported to work with Perl for WinNT, as well as on Unix/Linux/BSD. Note that the wu-ftpd software patch supplied with the program must be applied for it to work!

For any mirror method, run a daily job to check for new and updated files. Unix and Linux systems use cron for this; Windows systems can use the Task Scheduler. We can help you set up the mirroring software, or with any other details, if you would like.
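As one possible way to set up the daily job on Unix or Linux, here is a minimal sketch of a wrapper script around the rsync command from method 1. It is not an official Project Gutenberg script. The names MIRROR_DIR, LOCK_DIR, MIRROR_EXECUTE, and run_mirror are made up for this example, and the lock directory exists only to keep two runs from overlapping if one night's transfer is slow. Because --delete removes local files, the script only prints the command unless MIRROR_EXECUTE is set to 1.

```shell
#!/bin/sh
# Nightly mirror wrapper: a sketch under the assumptions described above.
# The rsync arguments are copied from the rsync example in this HOWTO.
# MIRROR_DIR, LOCK_DIR, MIRROR_EXECUTE and run_mirror are hypothetical names.

MIRROR_DIR=${MIRROR_DIR:-/home/ftp/pub/mirrors/gutenberg}
LOCK_DIR=${LOCK_DIR:-/tmp/gutenberg-mirror.lock}

# Run the mirror once; skip the run if a previous one still holds the lock.
run_mirror() {
    # mkdir either creates the directory or fails, so it can serve as a lock.
    if ! mkdir "$LOCK_DIR" 2>/dev/null; then
        echo "previous mirror run still active, skipping" >&2
        return 1
    fi
    if [ "${MIRROR_EXECUTE:-0}" = "1" ]; then
        rsync -rlHtSv --delete ftp@ftp.ibiblio.org::gutenberg "$MIRROR_DIR"
    else
        # Safe default: only show the command that would be run.
        echo "would run: rsync -rlHtSv --delete ftp@ftp.ibiblio.org::gutenberg $MIRROR_DIR"
    fi
    status=$?
    rmdir "$LOCK_DIR"
    return "$status"
}

run_mirror
```

If the script were saved as /usr/local/bin/gutenberg-mirror.sh (a hypothetical path), a crontab entry along these lines would run it every night at 03:15:

    15 3 * * * MIRROR_EXECUTE=1 /usr/local/bin/gutenberg-mirror.sh

If a run is killed before it finishes, the lock directory is left behind and has to be removed by hand before the next run will start.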

We don't distribute the Web-based search engine that's available on the main PG page at http://www.gutenberg.org. The FTP directories are the only part we offer for mirroring; the list of mirrors and the search capability are hosted centrally at http://www.gutenberg.org. However, we'll add your site to that list of mirrors, so people can find you.

Once you tell us your mirror is active (email mirrors_AT_pglaf.org), we'll announce it in our next weekly and monthly newsletters. After a month or so (to confirm stability), we'll add you to the mirror list and download facility at http://www.gutenberg.org.

You might want to view our mirror list to check whether the geographical location of your server would be a good addition to the list.

Thanks for your interest in helping Project Gutenberg Europe reach more readers.