Project Gutenberg Newsletter:
Distributed Proofreaders Update

Distributed Proofreaders Update for 19 October 2004


"Books are the legacies that a great genius leaves to humanity, which are delivered down from generation to generation as presents to the posterity of those who are yet unborn."

--Joseph Addison


The 5,000th Title Collection
- What has been accomplished.
- Why it is significant beyond commemorating a milestone.
- Why the Collection was organized.
- How it was accomplished.
- How the past led us to this expansion of DP's potential.
- What the 5,000 commemoration means for the future of DP.
- What is presently involved in supplementing the 5K Collection.
- The DP News & Information Center
- Acknowledgements.




by Thierry Alberto

"I have always imagined that Paradise will be a kind of library."

--Jorge Luis Borges

On the morning of October 8, 2004, near his library window overlooking a quiet lake in upstate New York, David Widger ran a series of final checks and verifications on a partitioned element of the 11th edition of the Encyclopedia Britannica. Yes, that same EB11 which has long been known as a formidable processing challenge throughout the Project Gutenberg community. This latest approach towards its digital conversion did little to diminish that reputation. It was about to do something of great importance, however. Dr. Widger had been anticipating the project files for some 48 hours by the time they were at last wrestled into acceptable compliance.

For everyone aware of what was on the Whitewashing agenda that morning, an intermingled sense of relief and excitement began to spread. This "slice" of EB11 was not simply another single project being posted to the PG shelves, but the final component in a varied and impressive collection which had been in preparation for several months. When the final checks were cleared, the Britannica text joined 50 other finished works as a single commemorative event organized to mark the completion of Distributed Proofreaders' 5,000th unique title produced for Project Gutenberg and the digital public domain.

This event is the largest single contribution to Project Gutenberg in the four-year history of Distributed Proofreaders. Through the orchestrated collective efforts of its members throughout the world, DP produced 50 diverse and significant written works to accompany the 5,000th title on its journey to the PG library. More than 13,000 original pages from a distinctive variety of books in several languages were carefully chosen to display the merits and strengths of the DP production model. A sister objective for this collection was to produce a newsworthy example of the fertility of public domain content yet to be converted into digital formats.

In our own "Field of Dreams," we have learned collectively that if we build it ... they will definitely come. What we do not know yet is how strongly "they" and others to come will defend what we are building, should the public domain be challenged further. At the heart of the inspiration for this Collection is a belief that we can help ensure that present and future volunteers will care enough to stand up should they be called to. If we make a dramatic and valid statement now and then, proving what a fertile field this really is, we may indeed embolden many to care and nurture these legacy resources.

"All that Mankind has done, thought, gained or been--it is lying as in magic preservation in the pages of Books."

--Thomas Carlyle

These 51 projects have much in common that binds them together as a family with a focused intention. In one manner or other, all of these titles belong to a larger set of projects, some of them so vast that it will require several years to complete them. The titles within the collection were selected from amongst DP's most challenging and complex projects in order to demonstrate the varied strengths of the Distributed Proofreaders production model.

There are four classes of super or "uber" projects which make up the 5,000 Collection. These are:

1) 'Classic Ubers' - Truly massive works such as EB11 or the Bureau of American Ethnology reports, which span more than thirty years.

2) 'Large Book Sets' - Smaller in scale than the Classic Ubers, yet still daunting enough to discourage commencement by an individual. Among this class are Hakluyt's 'Principal Voyages of the English Nation', 'The Psychology of Sex' by Havelock Ellis and 'The Library of the World's Best Literature.'

3) 'Author Collections' - Organized programs to provide entire libraries of an author's work for Project Gutenberg, or to complete catalogs of authors partially represented. The intention with these sets is to systematically advance PG towards comprehensive coverage of commonly requested authors with works in the public domain. Featured authors include: Hendrik Conscience, Alexandre Dumas, Victor Hugo, Edward Lear and George Sand.

4) 'Periodicals' - As a whole, the processing of entire runs of many volumes of journal titles is perhaps the largest initiative DP has ever undertaken. Among the many periodicals represented within the 5K Collection are: 'The Atlantic Monthly', 'Blackwood's Edinburgh Magazine', 'Notes and Queries', 'Punch', and 'Scientific American.'

The accomplishment represented by the 5,000 milestone is a source of great pride to the membership of DP. Fueling the labor of our intentions for the 5K Collection was a desire to produce for the world a gift of immense value that would dramatically exemplify the best of which Distributed Proofreaders is now capable. Read through the manifest at the end of this newsletter and see if that wealth of titles does not stir up a sense of profound excitement about the work we are all engaged in here. There is a fine hope which we are capable of encouraging together with our dedication. Intending towards a greater good for the world with our industrious energies, we become a living example of what is possible when like-minded people join together in creative endeavor. It is a rare example, true ... but it is powerful ... and so greatly needed within this time of deep and uncertain change. Faith in a brighter future is an essential source of courage and stability. Preserving history and cultural legacies of the past sustains that faith by providing recognition of the continuity of the human story across time and change. This is not the day-to-day reason why we preserve public domain works, yet it is a subtle, derivative effect of what we do.

"For books are more than books, they are the life, the very heart and core of ages past, the reason why men lived and worked and died, the essence and quintessence of their lives."

--Amy Lowell

5,000 unique written works. As accustomed as many of us have become to the steady accomplishments of the Project Gutenberg community, being book lovers, we are still capable of being awed by our milestones. 5,000 published works is a vast library by the measure of any human lifetime. To secure such an archive in print form would cost a fortune, the likes of which few of us will ever amass. Yet here they all are! ... available to anyone in the world with access to the Internet; and now to some even without such access. We are not talking about just any books we can lay our hands on. Amongst these 5,000 titles are a large number of creative works that are treasured among the great legacies of world culture. In organizing the choices for this collection, it was decided from the beginning to emphasize the value of content within the public domain which has yet to be converted into digital formats. This has been successfully achieved through the inclusion of exemplary representatives of such written treasures.

Having been closely involved with the previous milestone celebrations, I can avow that these rituals mean much more to the members of DP than crowing over past accomplishments. What we each derive from these celebrations has more to do with inspiration than anything else. When we pause in our daily efforts at DP to commemorate the rounded figures of Golden projects posted to PG, what we experience is a refreshed recognition of why it was that we came to devote our time and energy to this cause in the first place. These are times when we reflect on how far we have each come, the projects we have been involved with and what we have learned. At these times we remember (or blissfully forget) the various challenges we have passed through along the way and we share all these recollections and reflections with people who have grown to become our friends, even though continents may lie between us. There is always a renewed sense of energy and commitment after each celebration, and this time the inspiration is more deeply felt than ever before. The reasons for this are as varied as the individuals who make up the DP membership. In this past week, I have received a wondrous outpouring of distinctive expressions from people involved in all fields of project production. One common feeling that I have been noticing is joy. It is a joy that comes from participating in something greater than oneself, which is at the same time intimately familiar and personally valued. This is a deeply felt and powerful emotion experienced by many involved in this field of work. Yet, all we are doing here is crafting some old, dusty tomes into e-texts. Right?

"Read, read, read."

--William Faulkner

It is just four years now since Charles Franks put forth an innovative idea for how to enhance the development process of public domain texts for Project Gutenberg. It is not at all uncommon for someone to pull down a good or even brilliant idea from wherever it is that spirit of invention springs. What is very uncommon is for someone to take a great idea and follow through upon it across the forge of trial and error unto eventual success. Project Gutenberg is an impressive example of what can happen in this world when one individual does pass through the crucible of creative innovation to see an idea crafted into manifest utility. Sometimes the hardest challenge while bringing a new idea into existence is rallying enough supporters to your cause. An idea in and of itself will not do the work for you. Action must be taken and the validity of the idea must be proven, if others are to ally themselves with the values and intentions underlying the idea. The positive effects of Michael Hart's contribution to our age will continue on well into the lives of generations to come. We may not be able to measure the full extent of Project Gutenberg's benefit, but we nonetheless know it is extensive, world-spanning and long-lasting. What Charles and the early circle of DP members proved was that inspirational lightning could definitely strike the same ground twice.

It took a while and a lot of puzzle work to figure out how best to implement those early concepts of distribution, but in time they were well worked out. As with most productive endeavors, we can measure the success of DP by the quality and growth of its expressive output. Both of these measures have been improving at a steadily impressive rate over the past three years, from about the time the foundational process of DP was settled upon. For quality, I offer that you merely spend some time with any number of the texts within this collection. Let the work convey its own merit to you, in ways I could never hope to match. For the quantity measure, we can look to the primary indicator of growth at DP ... the number of pages being proofed each day.

By the middle of 2001, with the basic production models in use, the daily average of pages proofed was 259. One year later that figure had nearly quadrupled to an average of 1,001 pages per day. Step forward two more years, to the present, and DP is averaging in the neighborhood of 6,000 pages proofed on a daily basis. Of course this measure alone does not come near to conveying the advances of all the production processes involved in the creation of finished texts at DP. It is merely a single snapshot ... yet one that captures a clear sense of how successful the distributed model has proved to be.

Another new and dramatic measure, which became evident in 2004, is the broadening interest in adapting the DP model to other archival projects. The most advanced of these at present is Distributed Proofreaders Europe, initiated and maintained by the Rastko family of archives and cultural initiatives based in Belgrade. The European DP will provide support for Project Gutenberg Europe, which is soon to begin its official testing phase. PG Europe will expand what is available online in the public domain according to European copyright terms.

What the future holds for DP seems bright with promise in many directions, and it is to the future, more than anywhere else, that we are turning our attention this month.

"The oldest books are still only just out to those who have not read them."

--Samuel Butler

The significance of the public domain edition of Encyclopedia Britannica and the broad-based interest in its development led to its selection as a milestone project for DP. While EB's 11th edition easily meets the classification as one of the most complex of DP's multi-volume projects, its selection was also influenced by future possibilities. By giving the Britannica project spotlighted prominence as the 5,000th title during ongoing promotions, we will also be drawing attention to the need for new waves of volunteers to assist in its eventual completion. This is true for most of the projects within the 5K Collection. Some of the book sets and periodical titles are nearing completion, a few are even completed. These will serve as inspiration ... impressive examples of the significant work which people can participate in. For the most part, however, the complex projects represented in the collection have a long way to go before they are entirely available from the PG shelves. The ongoing promotional work beginning this month with the 5,000th title shall carry forward through the end of the year, reaching out to new generations of volunteers, who will see these vast works through to the final page.

It is for these future generations of volunteers that the second exhibition of the 5,000 Collection was conceived. Yes ... a second exhibition. After providing everyone, including myself, some time to recuperate from the preparations of the First Collection, the Second Collection is now in production. The aim of this follow-through pageant is to sustain the messages of the 5,000 Collection while providing a dynamic demonstration of DP's production capabilities. The presentation of this broad selection of complex projects was an organized event, yet it was not an extraordinary exercise spent for the sake of promotional theatrics. Processing large-scale, complex projects is part of DP's daily stock in trade. Because the labor is broadly distributed, we often lose sight of the scale of what is being accomplished over time. The purpose of extending the presentation of the 5K Collection is to provide a stage upon which the present strengths of the DP model can be substantiated.

The Second Collection is timed to coincide with the end of the month. The actual date of posting will depend upon the reluctance of those few projects who may shy away from their initial public premiere. Among the featured titles will be several additional volumes or issues of titles that appeared in the First Collection, as well as some projects more complex, in their own way, than any among the initial 50. If the Fates are kind, we expect to deliver another portion of EB11. With what we have been learning from these first two partitions, we do expect to soon witness a steady stream of 11th Edition "slices" making their way to PG. Once each volume has been completed, a single file matching the original print edition will be compiled.

An original content feature is being crafted as a supportive initiative to the 5,000 events. Background information about each project is being gathered and prepared to form a permanent resource for public access. These articles will explore the history of each work and its author, augmented by a chronicle of the project's development at DP. In time, other features will join the project backgrounds to form an evolving news and information center at DP. Among other content being developed for the center are articles documenting the history and lore of DP's evolution, a variety of community resources, support for new visitors and timely coverage of current events. Watch for announcements on the main page at DP regarding the progress of the news and information center.

"Books--the children of the brain."

--Jonathan Swift

It is a little over a week now since the First Collection posted to PG. In that time, over 100 titles have followed after those 51 and turned to Gold. As we turn our attention to the labors of the Second Collection and move on towards the future, we carry with us a happy certainty: wherever that future takes us, there will be plenty to read when we get there.

With an initiative of this scale it is difficult in the space allowed to fairly credit each person who participated. Once the second collection is complete, we will compose a roll of all those involved and post it in a very public place at DP. At present, I want to express a deep sense of gratitude to three volunteers on the PG side who deserve noted recognition. Well in advance of posting the first collection, David Widger, Jim Tinsley and Joseph Lowenstein began working in close coordination with us. Performing like great chefs over a two-week period, the Whitewashing team orchestrated all 51 projects into organized assembly. Presenting these works as a cohesive collection would not have been possible without their advice and dedication.

To everyone who made the 5,000 Milestone a reality ... one page at a time ... Congratulations! I look forward to reporting on all the future milestones yet to be realized as we continue the labor begun by Michael Hart of building the world's grandest library.

For now.


"Books, books, books! I had found the secret of a garret-room piled high with cases in my father's name; Piled high, packed large,--where, creeping in and out among the giant fossils of my past, like some small nimble mouse between the ribs of a mastodon, I nibbled here and there at this or that box, pulling through the gap, in heats of terror, haste, victorious joy, the first book first. And how I felt it beat under my pillow, in the morning's dark, an hour before the sun would let me read! My books!"

--Elizabeth Barrett Browning



Notes:
- Initial number: PG E-Text number
- Second number: DP Unique Title number
- Where other than English, language is noted in parentheses.

13600 / 5000 - Encyclopedia Britannica 11th edition, Vol II, Part I - AND- ANI

13601 / 5001 - Expositions of Holy Scripture - Romans, Corinthians, by Alexander Maclaren

13602 / 5002 - Slave Narratives, a Folk History of Slavery, Vol. IV, Georgia, I

13603 / 5003 - Bureau of American Ethnology Publications - The Romance of Laieikawai by Haleole & Beckwith (English & Hawaiian)

13604 / 5004 - TIA Children's Library - Thrilling Stories of the Ocean by Marmaduke Park

13605 / 5005 - The Principal Navigations, Voyages, Traffiques and Discoveries of the English Nation, Vol. XII, by Richard Hakluyt (English & Latin)

13606 / 5006 - General History & Collection of Voyages and Travels, Vol. XVIII, by Robert Kerr

13607 / 5007 - Histoire de la Révolution française, Vol. X, by Adolphe Thiers (French)

13608 / 5008 - Filosofía fundamental, Vol. I, by Jaime Balmes (Spanish)

13609 / 5009 - Vector Analysis and Quaternions by Alexander MacFarlane (Cornell Math Collection) TeX and pdf format only

13610 / 5010 - The Psychology of Sex, Vol I - Evolution of Modesty, by Havelock Ellis

13611 / 5011 - The Psychology of Sex, Vol II - Sexual Inversion, by Havelock Ellis

13612 / 5012 - The Psychology of Sex, Vol III - Analysis of the Sexual Impulse, by Havelock Ellis

13613 / 5013 - The Psychology of Sex, Vol IV - Sexual Selection in Men, by Havelock Ellis

13614 / 5014 - The Psychology of Sex, Vol V - Erotic Symbolism, by Havelock Ellis

13615 / 5015 - The Psychology of Sex, Vol VI - Sex in Relation to Society, by Havelock Ellis

13616 / 5016 - The Philippine Islands, 1493-1898, Vol. III by Blair & Robertson

13617 / 5017 - A Compilation of the Messages and Papers of the Presidents - Benjamin Harrison, ed. Richardson

13618 / 5018 - Bell's Cathedrals - The Cathedral Church of Peterborough, by W.D. Sweeting

13619 / 5019 - Little Journeys to Homes of the Great, Vol. 5, English Authors by Elbert Hubbard

13620 / 5020 - The World's Greatest Books, Vol. XIII, ed. Mee & Hammerton

13621 / 5021 - Jonathan Swift - Poems, Volume II

13622 / 5022 - Une histoire d'Amour by Paul Mariéton (French)

13623 / 5023 - Library of the World's Best Literature, Vol. VI, ed. Charles Dudley Warner

13624 / 5024 - Chronicles Vol. I, The Historie of England, Part 2 by Raphael Holinshed

13625 / 5025 - De Kerels van Vlaanderen by Hendrik Conscience (Dutch)

13626 / 5026 - The Forty-Five Guardsmen by Alexandre Dumas

13627 / 5027 - Memorie del Presbiterio by Emilio Praga (Italian)

13628 / 5028 - La Esmeralda by Victor Hugo (French)

13629 / 5029 - Correspondence, Volume I - George Sand (French)

13630 / 5030 - As Farpas, Junho a Julho 1882 (Portuguese)

13631 / 5031 - Atlantic Monthly - Issue 71 Sept. 1863

13632 / 5032 - Bay State Monthly - Vol. I, Issue 5 May, 1884

13633 / 5033 - Blackwood's Edinburgh Magazine - April 1844

13634 / 5034 - Continental Monthly - Vol. I - Issue 2 Feb 1862

13635 / 5035 - The New York Times Current History: The European War, Vol. 1 Issue 1, What Men of Letters Say

13636 / 5036 - Lippincott's Magazine - February, 1873

13637 / 5037 - McClure's Magazine - January 1896

13638 / 5038 - Notes and Queries - Vol. I, Number 19, March 9, 1850

13639 / 5039 - Punch - Vol. I, Issue 1, July 17, 1841

13640 / 5040 - Scientific American Supplement - No. 821, September 26, 1891

13641 / 5041 - The American Missionary - October, 1888, Vol. XLII. No. 10.

13642 / 5042 - The Journal of Negro History - Vol. I, No. 1, Jan. 1916

13643 / 5043 - International Weekly Miscellany, Vol. I, No. 6, August 5, 1850

13644 / 5044 - The Mirror of Literature, Amusement & Instruction - Issue 360, March 14, 1829

13645 / 5045 - The Tatler, Vol. I

13646 / 5046 - A Book of Nonsense by Edward Lear

13647 / 5047 - Nonsense songs, stories, botany, and alphabets by Edward Lear

13648 / 5048 - More nonsense, pictures, rhymes, botany, etc by Edward Lear

13649 / 5049 - Laughable Lyrics by Edward Lear

13650 / 5050 - Nonsense Books by Edward Lear, DP Compilation
