The Processed Book
Joseph J. Esposito
Summary
The "processed book" is about content, not technology, and
contrasts with the "primal book"; the latter is the book we
all know and revere: written by a single author and viewed as the embodiment
of the thought of a single individual. The processed book, on the other
hand, is what happens to the book when it is put into a computerized,
networked environment. To process a book is more than simply building
links to it; it also includes a modification of the act of creation,
which tends to encourage the absorption of the book into a network of
applications, including but not restricted to commentary. Such a book
typically has at least five aspects: as self-referencing text; as portal;
as platform; as machine component; and as network node. An interesting
aspect of such processing is that the author's relationship to his or
her work may be undermined or compromised; indeed, it is possible that
author attribution in the networked world may go the way of copyright.
The processed book, in other words, is the response to romantic notions
of authorship and books. It is not a matter of choice (as one can still
write an imitation, for example, of a Victorian novel today) but an
inevitable outcome of inherent characteristics of digital media.
_________________________________________________________
The electronic book or ebook has arrived, but
it has not come very far. Optimistic expectations of the rate of ebook
acceptance have been dashed, and numerous people are debating why something
as obviously useful as the digital display of text has not already begun
to replace paper. It may be that the current debate about electronic
publishing is missing the point, however; it may be too focused on devices
(however amazingly cool these devices may be) and is not reaching to
the heart of the matter, which is why we care about books in the first
place. We care about books because of what's inside them, because of
what they mean. The intriguing aspect of electronic publishing is not
simply whether we will all someday dump print in favor of screens or
what file format will become the standard, but how electronic publishing
will affect what goes inside of books. It is my view that our current
notion of books is naive and raw, and that what electronic publishing will
give us is something that is highly thought out, cooked and processed.
To the world of processed food and processed hair, we now add the processed
book.
Some definitions are in order. Usually books are identified with their
physical package. That package is generally between four and six inches
in one dimension and seven and nine inches in the other; it is printed
on paper; and it is the product of an author (usually one). The content
of such a package, however, is also called a book, and that is the kind
of book I wish to discuss here. As we begin to publish some books in
electronic form, the print package gets tossed out and only the content
remains. Is an ebook (or e-book or eBook) the content, the device that
displays it, or both? Some interested parties now use the term etext
to distinguish the content from its package. This would be more helpful
if enough people subscribed to the convention. What I will call books
are texts or etexts or content. This kind of book is the same whether
it is displayed in a handsomely bound hardcover book, within a Web browser,
as an Adobe PDF file, or in Microsoft Reader (among a multitude of other
formats). By this definition, the book of the future will be a . . .
book!
Here we should note that once we separate a book's
content from its hardcopy package, the notion of what is "book-length"
disappears. A very short book is likely to be about 120 pages, which
comes to about 40,000 words. Most books are roughly three times that
length, and those long, gooey novels you curl up with on the beach can
be twice that again--well over 200,000 words. I am drafting this document
just after completing a commercial novel of about a half-million words,
and I enjoyed every one of them. The connection between the physical
book and our sense of a book-length idea is important because the literal
physical package has come to define what we mean by a well-thought-out
argument or story--because such an idea would fill the pages of a book.
In other words, the accident of the convenient size of a single volume
has served to create an arbitrary image of an intellectual category;
the medium in this case has served to define the message. But in electronic
form, anything goes. A book (probably better to refer to it as a text,
though that term lacks the historical resonance of book) could be millions
of words long or it could be a simple e-mail of a few lines: no particular
length serves to define what is meant by a complete idea and the physical
display of such a book or text (whether on a computer screen, a personal
digital assistant, or whatever) is thoroughly agnostic when it comes
to meaning. It would be interesting to speculate what it will mean culturally
to lose the sense of a well-developed idea when such an idea is no longer
hardwired to paper and ink. Throughout this essay I use the term book
to refer to texts of any length.
I
Before we have a processed book, however, we
must have a traditional book, a primal book, an utterance that precedes
or has escaped the bureaucratization and systematization of the modern
world. The primal book is a curiously romantic myth that a number of
otherwise skeptical and dispassionate people (mostly authors) cling
to unreflectively. The primal book is usually written by a single author,
someone who has Something to Say. The author's job is to get it out,
to get it on paper. It is a serious task. It requires a serious person.
To assert the seriousness of the effort, the author may rent a garret
and embrace poverty; even more reckless souls may teach at a university.
It is a spiritual mission. It is hoped that the author's creation will
ultimately be wrapped in the appropriate robes of ritual: a stiff hardcover
binding with a glossy dust jacket, acid-free paper, perhaps a colophon
page, with extra points for deckle edge. The most important aspect of
the primal book, however, is its air of authenticity. The author, the
creator, has made the book in his image. Such a book is a bit of the
inner life of the author brought into the world for all to admire.
The notion of authenticity is insidious and,
apparently, resistant to all attempts to stamp it out. Perhaps one could
identify a young acolyte by the expression of an early interest in the
poetry of Wordsworth and drag the reprobate to the woodshed for improvement;
or the ungrateful cur could be presented with the complete works of
Kerouac and admonished: Do you see now? Do you see where this could
lead? It is to no avail. The author has Something to Say and the book
is Where It Is Said. This myth can emerge unexpectedly. So, for example,
I am drafting this essay in the wake of a well-publicized plagiarism
scandal. The crime of plagiarism is an assault on the church of authenticity.
It threatens to undo the primal book.
People familiar with the book industry, especially those few who pay
attention to the numbers, are aware that the primal book is a myth.
For starters, many books are not created by a single author or even
a dynamic duo but by teams of writers, who may be writing to scripts
created by someone else. This is virtually always the case with reference
books and for many textbooks and is often true of paperback series publishing,
where legions of writers (we probably should not call them authors)
"fill in" the details of plot and character that have been
outlined by a project head. Then there are all the books that only pretend
to be books to get bookstore distribution. (The late Hayward Cirker,
founder of Dover Publications, used to publish juvenile titles that
he called "toys within covers.") And there are books that
are compilations of the works of others, sometimes of people whose identity
is unknown (e.g., The Darwin Awards). Celebrity books represent
a particularly cynical attack on authenticity. Such a book has a prominent
personality "tell" his or her story with the assistance of
a ghost writer; and in a twist that almost seems like postmodern wish-fulfillment,
often the "ghost" is no longer invisible but is cited on the
dust jacket right below the name of the celebrity, whose true role is
that of marketing lightning rod. But the myth of the primal book is
too potent to be troubled by the facts. Books are what authors write.
Books express authors' ideas. Books have a certain integrity born of
the fact that they are the authentic manifestations of the serious men
and women who create them.
I offer this caricature of the traditional book
in order to more easily contrast it with the evolving forms of electronic
publishing, where the author's authentic voice is buried within a network
of references and interpretations. This is the world of the processed
book, the book where the primal utterance of the author gives rise to
hyperlinks and paralinks and neural networks and whatever other kinds
of connections and cross-connections computer scientists come up with.
Do a Google search on "computational linguistics" and you
will never again think of books in the same way. A processed book is
processed in two senses: the original utterance undergoes a series of
transformations before the end-point is reached, and it is micro-processed,
that is, it uses the astounding capabilities of computers to augment
the original text.
The current crop of electronic books is often criticized
as being dressed up with bells and whistles (to which a programmer might
say: Whistles! Why didn't I think of that!?). Well, you ain't seen nothing
yet. The list of features differs from format to format (you can do
more with an etext on a personal computer than you can with the relatively
weaker processing power of a handheld reader), but generally includes
such worthwhile things as bookmarking, highlighting, perhaps hyperlinks,
and the integration of a mediocre dictionary. We are often told of the
wonders of not having to carry six heavy books on a plane when you can
put that much and more into a handheld reader's memory. This is perhaps
not as wonderful a feature as some (nonreaders) might suppose, as virtually
no one would dream of carrying six books on a plane (you can't even
finish a novel in the time it takes to fly from San Francisco to New
York and there are hardcopy bookstores everywhere). Despite these added
features, though, all the current ebooks brag that they preserve the
text and spirit of the original, the printed book; one device even looks
like a classy leather-bound book! Now, when in the last century has
anyone routinely read a leather-bound book? The trap that the ebook
publishers and the device manufacturers have fallen into is the myth
of the primal book. The ebook is supposed to have the same aura as a
tony printed book. To my mind, this is an insult to the digital medium.
The processed book is not boundless, but it is
vast. It is not limited to dedicated handheld devices but can be displayed
on any computer screen. Indeed, some of the most interesting examples
of the processed book are difficult to use except on a personal computer,
as they require a large screen for display and insist on being read
with a Web browser (see, for example, the outstanding search engine
and links of the Reed Elsevier collection of scientific, technical,
and medical journals). But as we project into the future, we assuredly
will see a multitude of computing devices, some that sit on desks, some
that slip into our pockets, some that are combined with wireless phones,
and perhaps some that are surgically implanted at the base of our skulls.
And the good news is: they will all display books. These books will
be everything books have always been in the past and more. By "everything"
from the past, I mean just that. If you want the smell of a leather
binding, it can be programmed in. If you want the complete text of The
Adventures of Huckleberry Finn or Little Dorrit, you can
have it. You can have the text of Crime and Punishment in Russian
and English; for that matter, you could see an exact copy of the manuscript
displayed in a small window. Critical commentary? Pack it in. Full text
of footnoted sources? Included. It does not matter where some of this
data resides physically, whether it sits in the memory of the reading
device or must be instantly retrieved from a remote outpost on the Internet.
The processed book collapses time and space and makes all of civilization's
documents available in the palm of your hand. The processed book is
thus an assault on the natural rhythm of things; it occupies a deracinated
world of ideas. We have exchanged the garret for the microprocessor.
When placed into the context of the processed book, the primal book
doesn't disappear; rather, it is stripped of its air of being a vital
expression of a human being and is reduced to its text. If this is beginning
to sound like some abstruse critical theories of literature and texts,
this is because aspects of those theories have proved to be predictive.
This is painful to behold for someone who prayed earnestly that Isabel
Archer would not return to Osmond, but words are symbols and are ideally
suited for the manipulations of the symbolic logic of computers. The
processed book takes Isabel Archer and shows her to be the collection
of words that she is. She then can be processed.
II
Above and beyond the text of the
primal book that serves as its staging point, the processed book has
at least five aspects, which may overlap; and some of these aspects
are more developed today than others:
1. The book as portal
This is the aspect of ebooks that most people are
familiar with. A book becomes a specialized portal by encouraging readers
to click through to other sources of information. The most primitive
example of this is a book with a built-in dictionary. Every word in
an ebook can be linked to its definition, pronunciation, etymology,
etc., which augment the reading experience. Some ebooks link to proper
names or Web sites where background information on the primary topic
can be found. With hyperlinked footnotes an ebook can point a reader
to its sources, including in some instances the full text of those sources.
The ebook thus becomes a window on a bigger, interpretively supportive
world of data.
Can't you do this with even our lowly primal book?
You can do some of it. Printed books can have footnotes and bibliographies;
they may have other metadata as well, such as an author's preface, an
afterword by a scholar, or even a collection of critical essays (see,
for example, the excellent Norton Critical Editions). Publishers have
done great things with print, and they have every reason to be proud.
But the primal book breaks down in the face of the microprocessor much
as a horse would in a race with an SUV. The printed book has a footnote,
but the processed book can have the full text of the citation. A bibliography
in a processed book can be tantamount to a library dedicated to a particular
subject. You can read a printed book with a good dictionary at your
side, but with the processed book you can look up every entry in Webster's
Third New International Dictionary with a mere click or two; and
the lucky members of some academic institutions can now read a word's
entire history in the OED.
But the electronic portal goes far beyond even this, connecting readers
to specialized databases of information and online services. A bibliography
in an ebook can link to the online catalog of a nearby university library,
where you can determine if a particular book is in the collection. Or
readers of Books in Print, a huge reference work marketed mostly
to publishing-industry professionals, can look up any title they desire
and then check the inventory status of that book with two of the nation's
leading book wholesalers. This particular reference book, in other words,
has become a "front end" to a business process by which booksellers
can restock their stores. The processed book has the potential to make
the contents of a book actionable, not merely readable.
One aspect of the book as portal is that it undermines
the reading experience even as it augments it. Reading is linear and
requires concentration. A portal link takes the reader away from the
author's linear design and focuses his or her attention on other text.
While that text may enrich the meaning of the original book, it also
distracts the reader, who then must reorient himself or herself upon returning
to the primary material. As authors become increasingly aware of the
potentialities of the processed book, we should expect that they will
begin to write with these jumps in attention in mind. Perhaps they will
encourage leaping, perhaps not; or perhaps they will learn to accommodate
this aspect of the medium, just as audiobook publishers have learned
to give their listeners special cues to help them with the transition
from print to sound ("This is Moby-Dick by Herman Melville,
cassette four, side two").
What will be especially interesting to see in
the years ahead is whether authors will begin to regard their own work
as portals and begin to write with an eye toward extending the book
beyond its own contours. They could never do this in the printed form;
it would be peculiar to ebooks. Perhaps some authors will be more open
to including obscure references, knowing that the reader can obtain
a gloss with a simple click. One wonders what T. S. Eliot would say
if he were alive today and could view Ray Parker's online annotated
version of "The Waste Land" (http://world.std.com/~raparker/exploring/thewasteland/explore.html).
On the other hand, it may be that some authors will resent the ease
with which obscure references can be glossed electronically. For example,
part of the meaning of the notoriously elliptical poetry of Ezra Pound
lies in its very obscurity, in the sheer difficulty of catching all
the allusions to other works. (For those unfamiliar with Pound, think
of the allusive music of Elvis Costello or Smashmouth.) As a processed
book, Pound's Cantos would lose some of its aesthetics of difficulty
as every allusion is presented to the reader with helpful commentary.
If Pound had been aware of the possibility of the processed book, he
might have written a different kind of poetry altogether. I am inclined
to think, though, that the processed book will make all writing, from
serious literature to notes to the baby-sitter, more Pound-like for
everyone--except Pound's disciples, who, perversely, will seek to distinguish
themselves by the clarity and completeness of their expression. Ebooks
are likely to become increasingly compressed as the need to spell everything
out in the primary text is lessened by the one-click availability of
explanatory texts. Writing, in other words, becomes not simply expression
but also computer-assisted calculation. It is not too much to say that
Pound, with his enormous influence on Modernist aesthetics, helped create
the intellectual pressure that made the development of electronic publishing
necessary.
2. The book as self-referencing text
Books consist of words that are organized in a particular
way. Change the organization and you change the book's meaning, but
some changes reveal some of the original meaning that previously had
been obscure. You can try this with a printed nonfiction book with an
index. Read the book through without looking once at the index. Now
turn to the index and read all of the headwords. New patterns will appear.
By taking the distillate of the book, an index becomes an interesting
heuristic device.
It goes without saying that computers can do
this better. The processed book can index a book in any number of ways,
and each method will highlight a different aspect of the primary text.
This capability is beautifully spoofed in Italo Calvino's If on
a Winter's Night a Traveler, where a literary critic proclaims
that she no longer reads literature, preferring instead to study a computer
print-out of word-frequency counts. The processed book can show us word
frequencies; it can map such frequencies against a statistically determined
dictionary of "normal" usage, note the standard deviation,
and output the result visually; it can associate certain words with
specific characters; it can identify webs of metaphors that even the
most attentive of readers may have missed. By identifying these patterns
(or, it would be more accurate to say, by revealing these patterns for
us to see and interpret), the processed book is doing some of the reader's
work. A reader of The Scarlet Letter notes that there is a
red "A" sewn on Hester Prynne's bosom, and later notes a passage
where Hester walks in a garden of red roses: the connection noted, interpretation
is possible. A computer can pick up this connection and many, many more,
potentially providing us with some of the richness that we usually associate
with rereading. The processed book is spatial: it takes the linear progression
of a book and makes events from different times spring to mind simultaneously.
It takes the primary book and makes it comment upon itself.
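The frequency mapping described above can be sketched in a few lines. This is only an illustration, not a description of any existing ebook software: the function names, the shape of the "normal usage" baseline (a word mapped to a mean frequency and a standard deviation), and the threshold are all assumptions made for the example.

```python
import re
from collections import Counter


def word_frequencies(text):
    """Return the relative frequency of each lowercase word in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    total = len(words)
    counts = Counter(words)
    return {word: count / total for word, count in counts.items()}


def unusual_words(text, baseline, threshold=2.0):
    """List words used far more often than a baseline of 'normal' usage.

    baseline is a hypothetical table: word -> (mean_frequency, std_dev).
    A word is reported when its frequency in the text lies at least
    `threshold` standard deviations above the baseline mean.
    Words missing from the baseline are skipped.
    """
    results = []
    for word, freq in word_frequencies(text).items():
        if word not in baseline:
            continue
        mean, std_dev = baseline[word]
        if std_dev <= 0:
            continue
        deviation = (freq - mean) / std_dev
        if deviation >= threshold:
            results.append((word, deviation))
    # Most over-represented words first.
    return sorted(results, key=lambda pair: pair[1], reverse=True)


# Usage with invented baseline figures:
sample = "the rose was red and the letter was red"
normal_usage = {"red": (0.01, 0.05), "the": (0.2, 0.1)}
flagged = unusual_words(sample, normal_usage)
```

Here "red" would be flagged and "the" would not, which is the kind of pattern a reader would otherwise have to notice unaided. The interpretation, of course, remains the reader's job.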
While this particular aspect of the processed
book is generally unavailable in the current generation of commercial
ebooks, it has been in use in research institutions for several decades.
(Indeed, many of the features of the processed book mentioned in this
essay will not become widespread for years.) I first saw word-frequency
counts used in the study of literature over twenty years ago at an exhibit
at the Modern Language Association annual convention. There was an alphabetized
list with corresponding frequencies of every term in Joyce's Ulysses--one
way to help to understand a notoriously difficult book. Dictionary-makers
now routinely search through large content databases in this way, seeking
to isolate new words and new meanings for old terms. (The latter--new
meanings--requires a bit of artificial intelligence to be effective.)
Dictionaries can also be made to "read" themselves, a good
way to check for spelling errors and to make sure that every word that
appears in a definition also is given its own headword, making the text
of a good dictionary into a closed hermeneutic circle. For the most
part, when fiction writers, as in the example from Calvino above, study
this self-referencing characteristic, they treat it humorously. For
example, a character, a writer, in a novel by David Lodge gets writer's
block when he sees one of his own books processed in this way. He simply
couldn't bear the self-consciousness that comes with knowledge. In this
case, humor is the revenge of the primal upon the processed.
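The dictionary that "reads" itself, mentioned above, is simple to illustrate. The sketch below assumes a dictionary held as a plain mapping of headword to definition, and the function name is invented for the example. It ignores inflected forms ("animals" versus "animal"), which a real lexicographic tool would have to handle.

```python
import re


def undefined_words(dictionary):
    """Return words used in definitions that have no headword of their own.

    dictionary: mapping of headword -> definition text (an assumed format).
    An empty result means the dictionary is a closed circle: every word
    used to define something is itself defined.
    """
    headwords = {headword.lower() for headword in dictionary}
    missing = set()
    for definition in dictionary.values():
        for word in re.findall(r"[a-z]+", definition.lower()):
            if word not in headwords:
                missing.add(word)
    return sorted(missing)


# Usage with a toy dictionary:
toy = {
    "cat": "a small animal",
    "a": "one",
    "small": "not large",
    "animal": "a living thing",
}
gaps = undefined_words(toy)
```

For the toy dictionary the gaps are "large", "living", "not", "one", and "thing"; each would need its own entry before the circle closes.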
3. The book as platform
There is a simple and a complex form of the book
as platform. The simple form is where commentary is heaped upon the
poor unsuspecting text of the original work. This is not peculiar to
electronic books, of course; hardcopy books often feature extensive
critical commentary. To some extent this simple form of platform overlaps
with the book as portal, in that the commentary is often found by clicking
through the primary text. The commentary need not be restricted to formal
criticism; it could include such things as a student's highlighted text
or notes provided by an instructor. There are a number of companies
exploring the simple platform now. We should expect this technical capability
to become widespread, especially (at least initially) in higher education,
where many students are required to have laptop computers and many instructors
supplement classroom activity with online communication.
The complex form of the book-as-platform, however,
may be a bit obscure. Currently, it is much less developed than the
processed book's portal and self-referencing qualities. It is the opposite
of the book as portal. As a portal, a processed book points to other
things; as a platform, a book invites other things to point to it.
Books want to be pointed to for the same reason that
people want to be the center of attention. In a hardcopy book this desire
may take the form of crafting curmudgeonly aphorisms, which lend themselves
to quotation. In scientific work, the production of primary data can
place a particular publication at the center of a huge web of citations.
This is an extraordinary aspect of the processed book and bears some
reflection. The ISI company publishes quantitative reports on how often
particular scientific articles have been cited by other articles. A high
score is presumed to indicate a good paper, which redounds favorably
on the author. Well, do more citations mean that a paper is better?
And what do we mean by "better" anyway? Or is it simply that
we have thrown up our hands at the really hard question, the determination
of value (an artifact, we should note, of the primal book), and have
chosen instead to use a computer-assessable mechanism, a simple count,
as a proxy for the hard question? This is not to say that this measure
of the processed book is wrong; it simply isn't exactly right. We don't
use it because it gives us the right answer, we use it because it gives
us an easily derived answer--not unlike the old joke about a boy who
loses a quarter on one end of a dark street, but chooses to look for
it on the other end, because the light is better.
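How easily derived the answer is can be seen from a sketch of the count itself. This is not ISI's actual method, which is not described here; it is a minimal illustration, assuming the citation data arrives as a mapping from each paper to the papers it cites. The choices to count each citing paper once and to ignore self-citation are assumptions of the example.

```python
from collections import Counter


def citation_counts(citations):
    """Count how many distinct papers cite each paper.

    citations: mapping of paper -> list of papers it cites (assumed format).
    """
    counts = Counter()
    for citing, cited_list in citations.items():
        for cited in set(cited_list):   # a citing paper counts only once
            if cited != citing:         # a paper citing itself is ignored
                counts[cited] += 1
    return counts


# Usage with invented papers:
graph = {
    "A": ["B", "C"],
    "B": ["C"],
    "C": [],
    "D": ["C", "C", "D"],
}
ranking = citation_counts(graph).most_common()
```

Paper "C" comes out on top with three citations. Nothing in the count says whether it is cited with approval or in rebuttal, which is the gap between the easy question and the hard one.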
A platform is a specific and important thing in the
computer industry; Bill Gates owes his wealth to having overseen the
development of one of the most significant, the Windows operating system.
A platform is what other things rest upon. Those other things (called
applications) draw on or "call" the resources of the platform
to perform certain tasks. So, for instance, software developers don't
have to teach computers how to output color on the display; they simply
invoke the platform's capability to display in color.
In the world of books, reference books most readily
lend themselves to being reinvented as platforms. Dictionaries are now
being created with software tools to allow them to be "called"
from any word displayed on the screen: highlight the term and click
and the definition appears. Encyclopedias are being used as platforms
as well, though the implementation is generally limited to manually
inserting hyperlinks in the primary text, links that then "call"
the encyclopedia database. We shouldn't limit our thinking about platforms
to natural-born backgrounders like reference books, however. For example,
some books, in their primal form, come to be thought of as seminal.
On the Internet something that is seminal can be instantiated as a parent
text that links to all its offspring. When a book achieves seminal status,
the publisher may then provide tools to make it easier for other works
to link to it, converting it from a primal text to a platform. The ultimate
book-as-platform is the Bible, which serves as a platform for a large
swath of Western civilization. The Bible has yet to be published as
a platform, however, though it has been published as a limited portal.
To publish the Bible as a platform means not
only getting the content "right" (which in this context means
having content that other people want to build upon) but also providing
tools for other developers, whom we are likely to call authors or publishers,
that make it easy to build on that platform. Metatags, information that
helps to define the components of the documents that they are attached
to, are such tools; they can identify such things as graphical categories
("this is a picture"; "this is a paragraph indentation"),
rhetorical categories ("this is a paragraph"; "this is
the beginning of a chapter"), and topical categories ("this
passage is about cats"; "this passage is about dogs"),
even when obvious keywords ("cats" and "dogs") are
missing ("this is a passage about household pets"). Metatags
can be weighted, which means that their importance can be ranked. This
paragraph, for example, includes the keyword Bible, but the
passage is not about the Bible, so a Bible metatag should be given a low weighting.
A metatag for literary theory would be given a higher ranking,
even though neither term appears here; and a metatag for a consultant's
marketing tool would get the highest ranking of all. At the risk
of pushing the metaphor too far, the publisher of a book-as-platform
needs to "expose the API," the application program interface,
allowing other authors and publishers to write to the platform. The
content of the platform is then conceived of as information objects,
defined and discernible modules that can be invoked by other works.
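To make the metaphor a little more concrete, here is a minimal sketch of a book exposed as weighted information objects. Everything in it is hypothetical: the class names, the method names, the identifiers, and the numeric weights are invented for illustration and do not describe any published platform.

```python
from dataclasses import dataclass, field


@dataclass
class InfoObject:
    """One addressable module of a book, such as a paragraph."""
    object_id: str
    text: str
    # Topical metatags mapped to weights between 0 and 1 (assumed scale).
    metatags: dict = field(default_factory=dict)


class BookPlatform:
    """A tiny 'exposed API' that other works could call."""

    def __init__(self, objects):
        self._objects = {obj.object_id: obj for obj in objects}

    def get(self, object_id):
        """Invoke a single information object by its identifier."""
        return self._objects[object_id]

    def about(self, topic, min_weight=0.5):
        """Return objects whose metatag for `topic` meets the weight floor,
        highest weight first."""
        hits = [
            obj for obj in self._objects.values()
            if obj.metatags.get(topic, 0.0) >= min_weight
        ]
        return sorted(hits, key=lambda obj: obj.metatags[topic], reverse=True)


# Usage: the paragraph above, tagged with invented weights.
paragraph = InfoObject(
    object_id="para-1",
    text="To publish the Bible as a platform means ...",
    metatags={
        "Bible": 0.1,
        "literary theory": 0.6,
        "consultant's marketing tool": 0.9,
    },
)
platform = BookPlatform([paragraph])
on_theory = platform.about("literary theory")
on_bible = platform.about("Bible")
```

A query for "literary theory" returns the paragraph, while a query for "Bible" returns nothing, even though the word appears in the text. The weighting, not the keyword, decides what the passage is about.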
The book-as-platform strains the traditional
sense of what a book is, making it hard to reinvent or resuscitate a
traditional book for platform work. For this reason, some of the work
currently being done to create content platforms is original to the
Internet, though the business prospects for the entities in this area
are still uncertain. One venture is producing a set of reference data
keyed to news items. For example, a reader who comes upon a reference
to Ariel Sharon can click to a brief article about this figure.
Similarly, a text reference to petroleum will link to an article
about oil and the oil industry. The content created by this venture
differs from traditional reference works in that it was designed with
the Web in mind from the outset. The articles are short and can be easily
displayed in a window on a computer screen, and the article topics are
generated by scanning items that actually appear in the news (unlike
a traditional encyclopedia, many of whose entries may be obscure to
people who only read newspapers). A related venture has chosen not to
create new content for a platform but has developed a database of Web
site entries. So, when a reader comes upon a reference to Sharon,
he or she can link to a small group of Web sites that contain information
about Sharon, rather than to a specific article. As more and more reference
information is published on the Web, the split between content that
is made for the Web and content that is made with another medium primarily
in mind will close.
There is, I believe, a very large business opportunity
in creating books-as-platforms, especially by concentrating on reference
material in specialized markets. General reference works--a new version,
say, of Encyclopaedia Britannica, but with much more extensive
coverage and an atomistic, short-entry editorial strategy--are tempting,
but the cost of creating and maintaining such a work is staggering and
the economic prospects discouraging. Better to work in vertical markets,
whether for consumers or professionals. A definitive online encyclopedia
of garden flowers, organized as information objects, would be a good
project, but even better would be a highly technical encyclopedia of
the genomes of garden flowers, including the genetic maps of each plant
and flower, with downloadable files of data for simulation of genetic
engineering. From a business point of view, as a rule, the narrower,
the better; the more technical, the better; and if the data can be made
instrumental--how things work--as opposed to interpretive--what things
mean--better yet.
Publishers will devise various means to monetize
their investments in books-as-platforms, but finding the right economic
model (that is, the one that provides the highest return on capital)
will be a process of trial and error. Publishers with seminal content
may charge other publishers for the right to "call" the seminal
property, but, on the other hand, if the costs are perceived to be too
high, the seminal work may not be able to generate a substantial network
around it, thereby undermining its seminal status. There is a trade-off
here between short-term and long-term economic gain--which is another
reason that publishers will continue to get bigger and bigger, as only
large organizations can finance a long-term vision. An analogous situation
exists today in the library world, especially the public library segment,
where hardcopy trade or consumer publishers have always had ambivalent
feelings. Most trade books are sold through retail outlets (bookstores,
discount clubs, and online), but a book placed in a library's collection
has the potential to cannibalize retail sales. On the other hand, there
is strong evidence that library collections serve as a marketing mechanism
to stimulate retail sales. Publishers therefore support public library
sales provided that they don't become a substitute for retail distribution.
This leads to a paradox: If we ever developed a fully-funded public
library system in this country, where everyone looked to their local
libraries as the centerpiece of civic life, publishers would stop selling
books to them.
I suspect that a split will develop between marketing
books-as-platforms and books-as-applications (that is, mere books).
Books-as-platforms, seeking ubiquity and determined to keep their transactional
costs down, are likely to be marketed to pre-existing communities such
as the faculty and students of a university, the employees of a corporation,
or a special-interest group (the local chess club licensing an encyclopedia
of chess openings). Books that draw on these content platforms will
be marketed both on a community basis and to individuals. This two-track
marketing structure will encourage communities to take a larger role
in their members' informational needs, which in turn will encourage
closer community involvement. This is not a "one world" scenario,
but rather one of tribal associations born of common economic interests.
It is worth noting a curious aspect of the book-as-platform,
namely, that books that are created with this quality in mind are something
of a self-fulfilling prophecy. A book-as-platform announces its availability
to be invoked by other books in part through the suite of tools it makes
available to third parties. A good book with no tools will not get invoked
often. A bad book with good tools will not, one hopes, be invoked at
all. But a good-enough book with good tools is likely to get invoked
more often than the tool-less good book. Certain network effects--things
external to a particular product or service that tend to support and
even reinforce the original product, as the huge quantity of third-party
software supports the Microsoft Windows platform--may then kick in,
which will tend to strengthen the platform aspirations of the good-enough
book. The Google search engine, for example, ranks Web sites in part
by counting the number of links other sites have to the primary site;
and since search engines, of which Google is the current leader, are
a major source of traffic to Web sites, a large number of inbound links
can result in even more inbound links. This means that the creation
of a successful book will increasingly involve an awareness of what
tools are necessary to inspire invocation. It is not enough to say something;
it must be said in a way that others will choose to say it as well.
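The link-counting idea can be made concrete with a toy sketch. The
Python below is only an illustration: the map of which works invoke
which, the titles, and the function name are all invented for the
example, and real search engines rank with far more than a raw count
of inbound links.

```python
from collections import Counter


def count_inbound_links(outbound):
    """Count how many distinct sources point at each target.

    `outbound` maps a source (a book or site) to the targets it links to.
    """
    inbound = Counter()
    for source, targets in outbound.items():
        for target in set(targets):   # count each source at most once
            if target != source:      # ignore self-links
                inbound[target] += 1
    return inbound


# Hypothetical network of works invoking one another.
links = {
    "good_book_no_tools": [],
    "good_enough_book_with_tools": ["good_book_no_tools"],
    "commentary_a": ["good_enough_book_with_tools"],
    "commentary_b": ["good_enough_book_with_tools"],
    "anthology": ["good_enough_book_with_tools", "good_book_no_tools"],
}

inbound_counts = count_inbound_links(links)
ranking = inbound_counts.most_common()  # most-invoked work first
```

In this invented network the good-enough book with tools collects more
inbound links than the tool-less good book, which is the pattern the
paragraph above describes.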
But how about the outstanding book, the book
whose felt force is so great that it demands that we pay attention,
despite an absence of platform tools and inept publication? Provided
that we understand that the club for outstanding books is a small one
indeed, the exceptional book will foster its own followers, who will
assemble the network of processing tools around it. Despite its efforts,
the processed book cannot ultimately do away with the exceptional primal
book, whose very intensity exposes the limitations of computing.
4. The book as machine component
We have been spoiled by books. We believe that
they have been written for us to read, that their ultimate goal is to
reach us, that as readers we occupy a central place in the drama of
culture. If the processed book attempts to separate the author from
the text of his or her own work, we should not be surprised that the
reader will soon fall under attack. One aspect of the processed book
is the creation of books that are intended to be read by machines and embedded
within machine processes. It is only a matter of time before books will
be created with a machine-audience in mind. Considering the slow growth
of the publishing industry today, the future of publishing may be to
serve this new constituency.
Research into the use of aspects of human culture--books, for instance--as
parts of computer algorithms has been going on for decades; some examples
of this work are now finding their way into consumer devices and services.
We commonly encounter text-to-speech synthesis (TTS) technology, for
example, when we dial an information operator and are greeted with a
robotic voice. TTS works by developing a collection of sounds that are
mapped onto the letters and words of the text in question. While there
are only about forty such sounds (phonemes) in English, most TTS engines generate
more sounds than that in order to reduce the choppiness of pronouncing
one letter at a time; indeed the technical sophistication that "sits
behind" what would seem to be a simple sound is dazzling and tends
to overshadow the lexical content that it generates. Millions of personal
computers now come with this technology built in. You can have your
email read to you in a robotic voice, if you want to, which may not
make much sense for someone sitting at a computer, but is a great convenience
for someone driving a car who can't or shouldn't take his or her eyes
off the road; such mobile TTS is now available. One Silicon Valley company
has developed a TTS tool for reading books to the blind, which is a
wonderful addition to the world's media, as only a small portion of
published books are ever recorded as audiobooks (and even then, for
reasons of cost, mostly in abridged format). TTS will eventually find
its way to all books, giving the reader of a processed book the choice
of reading or listening. (This, by the way, will diminish or even destroy
the $2 billion audiobook business as we know it today, as the rights
for a book's text intended to be read and the rights for audio will
converge.)
The reverse of TTS is voice-recognition technology, though this technology
is not as far along as TTS. A voice-recognition system incorporates
a dictionary, which helps identify the words being spoken. Some of the
current systems require that a particular speaker "train"
the system for a period of time to make it work effectively, but even
so, the principle of embedding a dictionary is unchanged. One way to
improve the accuracy of voice recognition is to restrict the vocabulary
of the system. This is what is going on when you are talking to an automated
voice mail system, which may tell you something like: "At the tone
please say your Social Security Number or you may key it in using your
phone's dialpad." Creating such a restricted vocabulary is the
equivalent of making an abridged paperback version of a dictionary,
except that the requirements of voice-recognition technology are largely
determined empirically, by studying the words users actually employ
and expanding the vocabulary if users effectively demand it. This feedback
mechanism, which is peculiar to the processed book, can take place quickly,
even instantly. The processed book, in other words, "learns"
and adapts itself to the actual circumstances of its use. Traditional
books, on the other hand, like diamonds, are forever.
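The feedback loop described here, a restricted vocabulary that grows
when users effectively demand it, can be sketched in a few lines of
Python. This is a simplified illustration and not how any particular
voice-recognition product works: the class name, the threshold
parameter, and the sample words are assumptions, and the input is
already-transcribed text rather than audio.

```python
from collections import Counter


class RestrictedVocabulary:
    """A small word list that admits new words after repeated demand."""

    def __init__(self, words, promote_after=3):
        self.words = {w.lower() for w in words}
        self.promote_after = promote_after  # misses needed before a word is added
        self.misses = Counter()

    def recognize(self, word):
        """Return True if the word is known; otherwise record the miss."""
        word = word.lower()
        if word in self.words:
            return True
        self.misses[word] += 1
        if self.misses[word] >= self.promote_after:
            # Users keep saying it, so the vocabulary adapts.
            self.words.add(word)
            del self.misses[word]
        return False


digits = ["zero", "one", "two", "three", "four",
          "five", "six", "seven", "eight", "nine"]
vocab = RestrictedVocabulary(digits, promote_after=3)

# Callers keep saying "oh" for zero; after three misses it is accepted.
results = [vocab.recognize("oh") for _ in range(4)]
```

The threshold is the design choice that matters: set it low and the
vocabulary adapts almost instantly, set it high and it stays closer to
the fixed, abridged dictionary it started as.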
It may appear that reference books and dictionaries in particular have
an advantage over other books in becoming machine components, but in
fact all books aspire to the perfection that is a machine. It may require
a multistep process, however. Let's take romance fiction, for example.
How could we possibly make a machine want anything to do with a romance
novel? The first step is to convert the novel into a collection of indexes
through text analysis, much as described above in the section on the
self-referencing book. Such indexes are intermediate documents that
could be of value to marketers, who might extract word-frequency lists
(or some other underlying textual pattern) to assist in crafting copy
for advertising. But the processed book can do more. Why one romance
novel when you can have one thousand? And let's link the indexes of
each novel to a number of fields of metadata such as author, date of
publication, rate of sale, and the geographical distribution of sales.
Let's also capture data at the point of sale, directly from the cash
register, and update our metadocument in real-time. Now we have a dynamic
database that can tell us how the moods and tastes of a particular market
segment are changing minute by minute. (We could get even better information
if these books were being read online, where each page view could be
assessed.) We may as well disintermediate the copywriter and have the
processed data tweak the Web site of the client company; or we could
have the dynamic data feed a digital printing press, where last-minute
changes to a marketing brochure can be made.
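A rough sketch of the first steps of this pipeline, assuming each novel
is available as plain text along with a few metadata fields: build a
word-frequency index per novel, then weight the words by sales. The
field names, sample texts, and sales figures below are invented for
illustration; a real system would draw them from point-of-sale data.

```python
import re
from collections import Counter


def word_frequencies(text):
    """Build a simple word-frequency index for one text."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def build_metadocument(novels):
    """Link each novel's index to its metadata fields."""
    return [
        {
            "title": novel["title"],
            "author": novel["author"],
            "units_sold": novel["units_sold"],
            "index": word_frequencies(novel["text"]),
        }
        for novel in novels
    ]


def sales_weighted_terms(metadocument):
    """Weight every word by the sales of the novels it appears in."""
    weighted = Counter()
    for entry in metadocument:
        for word, count in entry["index"].items():
            weighted[word] += count * entry["units_sold"]
    return weighted


# Hypothetical input; two novels stand in for the thousand.
novels = [
    {"title": "A", "author": "X", "units_sold": 10,
     "text": "The duke smiled. The duke left."},
    {"title": "B", "author": "Y", "units_sold": 3,
     "text": "The pirate smiled."},
]

weighted = sales_weighted_terms(build_metadocument(novels))
```

Rerunning the weighting whenever new sales figures arrive is what would
turn this static index into the dynamic database the paragraph
describes.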
Outside the world of books, something like this is already underway.
The Benetton organization captures data on every garment it sells, data
that is then sent to a database and evaluated for trends in fashion.
These evaluations are then moved to the production line, where they
can influence the dye lots. It is quite possible that the color of shirts
in the Benetton inventory could change from one week to the next as
a direct response to the information being fed to the factory floor
from the cash register. The introduction of the processed book to such
a system will represent a refinement, the addition of a culturally based
weighting mechanism to optimize the effectiveness of the inventory management
and merchandising system.
5. The book as network node
The primal book is a discrete item. The processed
book is a node on a network. Now we know what has happened to the primal
book: it resides as a node, linked to other nodes, many of which themselves
are primal books. Compare this to DNA: Each individual has his or her
own unique DNA, but this DNA has much in common with that of all other
people, living and dead--and, for that matter, not yet born. All men
are cousins. And this is true of books as well: By being placed within
a network, where it is pointed to and pointed from, where it is analyzed
and measured and processed and redistributed, a book reveals its connections
to all other books. When Hemingway remarked that all American literature
can be traced back to Huckleberry Finn, he was acknowledging
kinship. When a text analysis program determines that writers from one
region use more dependent clauses than writers from another region,
it is defining kinship.
The relationships between the nodes of the network can be multiple.
One node can be used as a machine component and aid in creating another
node, which serves as a platform for a third, which supports the first.
The network map--how one node connects to another--is a portrait of
the processed book, showing its ancestry, its descendants, and the relationships
among the entire family. This map is itself a document--may we call
it a book?--or metadocument, which derives from the very field it comments
on and in turn influences that field, much as consciousness influences
the behavior of a human being.
This is all pretty abstract for someone whose ambition is to write a
simple, self-contained text such as a memoir or a category novel (mystery,
romance, science fiction, etc.). The problem is that the idea of a self-contained
text is a product of the fixed medium of print on paper. The challenge
that the processed book puts to writers is that of working with a double
consciousness, as primal authors looking over their own shoulders as
they see the book being processed even as it is written. The primal
book lives under surveillance. It is hard to imagine many authors whose
work will not be influenced by the fact of being observed by a camera.
And it is important to note that this is not a matter of choice. While
some romantic writers will try to bat away the intrusions of processed
media, those who embrace the network will be the most successful, success
being determined by the survival and "pointability" of the
text or node.
It is worth noting that the nodal aspect of the processed book has very
important business implications, which are likely to reshape the publishing
industry in the years to come. The threat to copyright, for example,
may be pushed back. Publishers have been watching the tribulations of
their brethren in the music industry and fear copyright piracy like
nothing else. Piracy is not peculiar to digital books, of course, as
any publisher who has travelled in Asia can tell you, but it is much
more extensive when copies of books can be sent around the world on
the Internet. While Napster, the first mass-market file-sharing service,
has been clipped by the music industry, file-sharing still takes place
on Napster's underground successors (LimeWire, Aimster, etc.) and within
the universe of Usenet. Many books are now being pirated on these services,
which has led not a few publishers to steer clear of distributing digital
copies of their products, for fear that they will end up in the file-sharing
underground.
A processed book, however, can be published as a node in a network,
with connections to other books, commentary, online library card catalogues,
teachers' recommendations, and so forth. If the network is usefully
developed (and this is an important "if," for links and other
connections for their own sake can be a distraction), the value of the
book-as-node is greatly enhanced by being part of it. Pirated copies
of the primary book, the node, would not have all the network connections,
making the pirated copies less valuable. This would serve to bring readers
back to the nodal book, not for its primal value (because the content
is elsewhere available for free in pirated copies) but for its processed,
networked value. This would embolden more publishers to make their books
available electronically, provided that they had the means to plug the
book into a network quickly.
Something like this is already going on in the world of academic journals.
As noted above, Reed Elsevier, the leading publisher of journals, has
built a powerful search engine for its collection of academic research.
This is a shrewd move, and it may be that Reed's customers, primarily
research libraries, don't yet see what is going on. There is a growing
movement in the research community for changes in the way research is
published, with not a few people arguing that all academic research
should be made available online for free to anyone with an Internet
connection. Reed has been the whipping boy for this movement, perhaps
with some reason, as it has pushed through aggressive price increases
on many of its journals. Many articles are now appearing in various
prepublication forms on the Internet, which could undermine the value
of Reed's journals, and there is a move afoot (see, for example, the
writings on this subject by Stevan Harnad) to have researchers self-archive
their work prior to submitting it to a journal. This would give a publisher
pause. If an article is self-archived on the Internet, then anyone with
a Web browser can read that article for free. Why then would librarians
continue to purchase journals, especially if the prices continue to
rise?
Reed's answer is to create a processed book, though, of course, it is
never phrased this way, nor is the underlying strategy ever expounded
on. Reed's search engine adds value to the journals indexed and searched;
the extensive links add more value. To the database of journals are
added many public domain documents, all of which can be searched at
the same time. The database gets bigger, thus the need to have a good
search engine becomes greater. Now the value of any one document is
significantly augmented by virtue of its being part of the network that
Reed has created. If copies of these articles are placed in a self-archive,
what value they have is theirs alone, assuming that anyone can find
them; but placed within the network, the value of the nodes rises. The
inclusion of public domain documents is particularly crafty. Reed has
migrated the value from the public domain documents themselves to the
search engine, the dynamic metadocument, which helps a reader find the
underlying documents. In some respects Reed is coopting the public domain.
So who needs copyright? The economic challenge for content creators
and publishers is to create content that requires its incorporation
into a network and to make sure that the network's domain-specific search
capability is always a step ahead of general-interest search engines
such as Google.
Can this work for books as well as journals? It can and it will. By
the time we get to the twenty-third title in the Harry Potter series,
ebooks may be ubiquitous. The new title will be published electronically
and will have built into it such things as links to key passages and
characters from the previous books, a Harry Potter dictionary, connections
to Web sites for Harry Potter clubs, and much, much more. There will
be a temptation to pirate the text, but the pirate won't get the built-in
links to the trailer to the next movie--not cool, as any kid will tell
you. Piracy will be kept in check by reinventing the highly primal Harry
Potter titles as processed books. The economics of publishing will demand
it.
III
Who is going to implement these aspects of the
processed book? Authors? Their agents? Or is there hope for publishers,
those benighted organizations that steal the souls of authors?
The answer is, All of the above and more. The processed book is not
one thing, nor will it be implemented as a single coherent system whenever
a new book is written or published. The processing of the book will
be organic. Authors will pitch in, as will their readers, critics, publishers,
librarians, and anyone else who touches a book. There is obviously a
place here for software vendors as well, and the telecommunications
companies that ship this content around will benefit regardless of what
a particular node of the network is about. Not all processed books are
created equal. Books that lend themselves to linkages to other texts
will be processed more. There will be a new quantitative measure: a
book is equal to the sum of things that can be linked to it and through
it.
Authors will process their own books in the act of writing; indeed they
have been doing so for some years now, as word-processing is an aspect
of the processed book. They will increasingly write with the processed
potential in mind, changing the nature of their texts, as reflected
in their style, choice of topic, etc. As they help to bring their books
to the attention of others, they will process their books further, perhaps
by creating Weblogs (see, for example, www.andrewsullivan.com),
which can help promote a book and encourage others to add new layers
of processing to it. They will choose their agents and publishers in
part for how much processing they can bring to the task, just as they
now inquire into who has the best relationships with television talk
show producers.
Will they self-publish and disintermediate publishers entirely? Some
will, but disintermediation is overblown. Publishers are in a better
position than individual authors to develop the network of essential
processed relationships. It is hard to build and maintain a single Web
site, but it is trivial to build and maintain a thousand. The scale
of a publishing house will benefit all the authors, books, and nodes
that connect to it. Publishers will acquire software tools for all their
titles that can be brought to bear for any individual author, tools
to create portals and platforms, for books to be made self-referencing
and to be converted into machine components, and ultimately to take
their positions as nodes on the network. These tools will be distributed
to others in the value chain, to readers and critics, who will add new
links to each text. We can see the beginnings of this on Amazon.com,
where each title is surrounded by copy provided by publishers and reviews
by amateurs and professionals. Because of the cost to create the processed
book, size matters: the processed book will contribute to the ongoing
consolidation of the publishing industry, as fewer but bigger houses
take a larger share of the market.
In the final indignity to authors, it seems likely that the creators
of books will begin to lose control of the editorial entity. This has
always been true to some extent, of course, as bad or misguided reviews
have influenced the reception of many books over the years. With the
processed book, however, matters change in both degree and kind. A book
can find itself overwhelmed with linked commentary, and if the commentary
is irresponsible, how does one correct that except by overlaying even
more commentary? Worse (or better, depending on one's bias), multiple
versions of a work can circulate independently, each developing its
own network around it; and if one or more of these versions do not precisely
represent the author's original intention, well, who is to say? What
is to stop someone from making some changes in the original work (I
never did like Desdemona's eyes) before forwarding the revised text
to the next person, and the next person, to everyone, and forever? Or
perhaps, by virtue of the sheer accessibility of all texts at all times,
with all their relationships mapped in exquisite and excruciating detail,
there is an Invisible Hand in critical commentary that ensures that
over time the "right" text and the "right" interpretation
will prevail.
IV
Suppose this essay were written and augmented
as a processed book? How would it differ from what you are reading right
now?
To begin with the obvious point, this essay is a processed book, though
the extent of its processing is too conventional to catch anyone's attention.
This document was created in Microsoft Word, which is word-processing
(good term) software. It has been edited and reedited, moved around,
and spliced and diced. It has also been sent over the Internet any number
of times to friends who agreed to provide comments. In sending it around,
I chose to leave it in the Word format, which enables editing, rather
than freezing the text as a PDF file. Indeed, one of the alleged virtues
of PDF files, namely, that they can prevent tampering with the text,
may in the context of a processed book prove to be a liability, as uneditable
files are less likely to have a network of comments built around them.
(Some new variants and add-ons to PDF technology permit a base file
to remain unaltered even as edited versions, including those with commentary,
are displayed alongside.) An early version of this essay has even been
mounted on a Web server, which may in time raise a practical problem
for me: How will I make sure readers find their way to the current version?
The answer, for better or worse, is that the processed book inevitably
leads to a loss of editorial control. This makes me wonder if in a world
without editorial control, authors may cease to write for attribution.
But let's work through the five aspects of the processed book and see
how they apply to this very document.
As a portal, "The Processed Book" would include links
to many Web sites for further information on people and ideas discussed
here. For example, there would be links to further information on Stevan
Harnad, Alan Kay, and Ted Nelson. There might also be links to earlier
drafts of this essay, perhaps including the many emails I collected
from people who have commented on it, not all of them favorable. The
reference to doing a Google search on "computational linguistics"
would be enacted right within the text: click and the search results
would appear.
The essay has already served as a makeshift portal for one reader, a
longtime friend who went through it carefully. He came upon the references
to Isabel Archer and Osmond and didn't know what they were. So he proceeded
to Google and soon found himself reading about a novel by Henry James
entitled The Portrait of a Lady, whose protagonist, Isabel, marries
the scheming Osmond. Why were these references not spelled out in the
text to begin with? For several reasons. First, in my (East Coast) circle
almost everyone knows who Isabel Archer is, though references to Calvino
(for example) require greater explanation. But you never can tell what
the range of reference of a particular reader is; there is, after all,
no agreed-upon culture to draw upon, no canon of bedrock ideas; in such
a world, which is the one we live in right now, the processed book becomes
a means of cultural unification. One reader (West Coast) will be puzzled
by Isabel Archer, another (East Coast) by references to algorithms derived
from Bayesian statistics. A writer has to work with all such possibilities,
which is why the processed book-as-portal is inevitable. Another reason
Isabel was not explained in the text was my own anticipation of making
the very point I am making now; in other words, I left the reference
slightly obscure in order to demonstrate the need for the book-as-portal.
I chose to write the essay under my own surveillance.
The book-as-portal will become more robust in time. It is one thing
to look up keywords in Google (e.g., proper names), but it is still
a considerable challenge to capture some allusions. The phrase "all
men are cousins" appears early in this essay, but what is a search
engine to do with it? An alert reader will pick up the allusion to "all
men are brothers," but Google also gives us "all men are pigs,"
"all men are scum," and "all men are created equal."
(Presumably for some audiences, all these phrases are equivalents.)
The alert reader will also note that the phrase "all men are cousins"
is the only time in the document that so-called sexist terms are used--but,
the writer protests, I couldn't very well have said "all men and
women are cousins," as that would have obscured the allusion to
"all men are brothers." There is cultural content here that
would make it very hard for a fully automated process to generate a
meaningful link. But this will improve, and probably soon.
There is also a reference to Walter Pater in this essay, which I will
decline to highlight. Literary students will pick it up, but for everyone
else it will remain hidden. When the time comes when everyone can find
the allusion, the processed book-as-portal will truly have arrived.
As a self-referencing text this essay could provide comprehensive
indexes, which are currently not included. Besides aiding readers,
such indexes would also make it easier for
search engines to find and classify this essay, which would in turn
potentially bring more readers to it, assuming it were posted on the
Web, as inevitably it will be. A self-referencing text would also clear
up some possible confusion in the preceding section, where I referenced
Harnad, Kay, and Nelson. Only Harnad has been discussed up to this point;
Kay and Nelson are yet to come. A self-referencing text would permit
a reader to see all references to Kay and Nelson simultaneously. Such
a text would also cluster metaphors and categories of information together.
So, for example, all literary references could be highlighted, as could
all references to the computer industry. A harder trick would be to
identify all the metacomments, of which there are many. How can a machine
tell the difference between saying something and saying something about
saying something? It may be that self-commentary will be the last bastion
of the purely human.
A self-referencing text could also provide quantitative information.
For example, a publishing friend asked if I intended to publish this
essay in book form, which brought up the question of its length. A word
or byte count is a trivial exercise for a machine. Although it isn't
clear why anyone would want to do it, this text could also be analyzed
to determine how much space was given to each topic or the distance
between literary allusions or the frequency of quotation marks and special
characters.
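How trivial these measurements are can be shown with a short Python
sketch. It is an illustration under simple assumptions: words are
whatever is separated by whitespace, and an "allusion" is detected only
by a marker string the caller supplies (here the name Isabel), which is
a crude stand-in for real allusion detection.

```python
def text_statistics(text, markers=()):
    """Report basic counts and the word distance between marked passages."""
    words = text.split()
    # Positions of words containing any of the marker strings.
    positions = [i for i, word in enumerate(words)
                 if any(marker in word for marker in markers)]
    gaps = [later - earlier
            for earlier, later in zip(positions, positions[1:])]
    return {
        "words": len(words),
        "bytes": len(text.encode("utf-8")),
        "quotation_marks": text.count('"'),
        "marker_gaps": gaps,
    }


sample = ('Isabel Archer marries Osmond. "All men are cousins," '
          'and Isabel knows it.')
stats = text_statistics(sample, markers=("Isabel",))
```

The counting is the easy part; deciding what counts as a literary
allusion in the first place is where the machine still needs help.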
The value of a self-referencing text grows with the length of the work.
For this essay, self-referencing is not particularly revealing; for
Moby-Dick, it would be breathtaking. On the other hand, if
all the comments made on the drafts of this essay were to be included
as part of the text, self-referencing would become more valuable, as
it would trace the evolution of ideas. This raises the question of whether
self-referencing of a text should apply only to a particular network
node or to the entire processed network.
It is very interesting to think of this essay as a platform.
To a small extent, it has already served in that capacity. One early
reader asked me for permission to use one section for a project he was
working on, a major reference work in botany. He wanted to present the
book-as-platform idea to the writers; he wanted, in other words, to
use a section of "The Processed Book" as material for his
own work--he wanted to use the essay as a platform. Well, this is only
a tiny matter of technology: all he needs to do is copy the relevant
section and paste it into his document. But what he perceived is that
a book is often copyrighted and that he needed more than cut-and-paste
technology to use the essay as a platform.
If this essay were a platform, it would include tools to enable other
writers to "call" its text or a section of the text. These
tools would necessarily include copyright information, without which
clearing permissions can become tiresome. (I don't want to get into
the fair use aspect of copyright law, though it is relevant here, as
it is complicated and certain to provoke much unproductive argument.)
One company (now defunct) had a technology, likely to be imitated, that
made the copyright policies of a particular work pop up on the screen
when the mouse passed over the object in question. What would
those policies be? It depends. A writer or publisher could take a tough
stance on copyright, requiring all uses of the platform to involve permission
and fees. Or there might be a matrix for copyright questions, depending
on the size and nature of the use--free for schools, costly for corporations,
and so forth. For that matter, the work could simply be put into the
public domain.
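Such a matrix could be as simple as a lookup table that a platform tool
consults before a section is "called." The Python below is a
hypothetical sketch: the categories, the fees, and the function name
are invented for illustration and do not reflect any actual licensing
scheme.

```python
# Hypothetical policy matrix: (kind of user, size of use) -> fee in dollars.
POLICY = {
    ("school", "excerpt"): 0.0,
    ("school", "full"): 0.0,
    ("corporation", "excerpt"): 50.0,
    ("corporation", "full"): 500.0,
}


def permission_terms(user_type, use_size, policy=POLICY):
    """Look up the terms for a proposed use of the platform's text."""
    fee = policy.get((user_type, use_size))
    if fee is None:
        # Uses the matrix does not cover go back to the rights holder.
        return {"allowed": False, "fee": None,
                "note": "ask the rights holder"}
    note = "free" if fee == 0 else "fee due"
    return {"allowed": True, "fee": fee, "note": note}


school_terms = permission_terms("school", "excerpt")
corporate_terms = permission_terms("corporation", "full")
unknown_terms = permission_terms("chess club", "excerpt")
```

Putting a work into the public domain corresponds to a table in which
every fee is zero; a tough stance corresponds to an empty table, so
that every request falls through to the rights holder.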
Fascinating work in this area is being put together by Hal Abelson and
Lawrence Lessig at their public service organization, Creative Commons.
Among other things, Creative Commons proposes to "brand" the
public domain, that is, it is developing a set of signposts so that
users will know whether or not a particular information object is under
copyright. As part of this project, a series of intellectual property
contract templates is being developed, which will allow the owner of
a creative work to determine the copyright status of his or her work.
This is important. Prior to the work of Creative Commons, much intellectual
property was either totally controlled by its owner or not controlled
at all, that is, it was in the public domain. The contracts being developed
by Creative Commons would allow me, as the author of this essay, to
choose an intermediate position. I might assert the right for all commercial
uses of this essay (not many and not worth much), but I might also stipulate
that noncommercial uses require no fees or permissions. If this were
a novel, I might insist that I controlled everything in it, but I might
make the characters available to others for free or for a fee for derivative
works.
The point here is that as we think of the processed book, we are dealing
not only with what technology can do with content but also with
the total set of social and legal issues that surround a work. Social
and business rules can be codified and instantiated within technology.
A reader or user can then draw on these rules without fear of violating
anyone's rights. The book-as-platform may have more to do with copyright
law and marketing strategy than with bits and bytes.
With the book as a machine component, things really begin to
get interesting. Try as I may to hide, this essay says a lot about me.
The word choice and syntax are mine, the allusions part of my mental
framework. Words and ideas don't have to be original to say something
about the person who uses them. For example, the fact that I prefer
the work of Borges to that of Faulkner, though Faulkner is arguably
the superior writer, says something about me, even though I couldn't
hope to write a line like Borges; we are, after all, our tastes as well
as our expression. The works of Marshall McLuhan and Ted Nelson are
as much a part of me as extraordinary tales of growing up in Fort Lee,
NJ. Computers can take this essay and convert it into a proxy for me
through various analyses. In other words, "The Processed Book"
is the raw material that can result in a computer agent.
What would an agent do? Just about anything. I would like a well-crafted
agent that would regularly poll the Internet for things of interest
and that would also filter out a number of related things. For example,
I am interested in copyright issues on the Internet (as this essay reveals),
but hardly want to read all the manifestoes of the information-wants-to-be-free
crowd: perhaps an agent can find information on copyright and weed out
the histrionics. An agent could also be used to find things that I don't
even know I care about by identifying themes in my writing (e.g., submerged
metaphors) and matching them to related themes found on servers anywhere.
Computer agents are not new. What is new is the increasing sophistication
with which they are being built and their purposes. All Internet users
are familiar with the kind of profiling that ecommerce sites habitually
engage in, profiling that says something about the kind of merchandise
to offer particular users. Most of these agents are put together, however,
in fairly clumsy ways. So, for example, the all-important Zip Code is
likely to say something about one's household income and education level
and many other things besides. But we all know how imperfect Zip Code
analysis is. On my street we have university faculty, Silicon Valley
executives, and (apparently) a couple of New Age households made up of
students and former students. And let's not forget the transplanted
retirees (this is a beach town). But what, someone is bound to ask,
does that say about me? By taking a statistical abstract of a person's
writings, these profiles can become more intimate, and their uses can
become more interesting than determining which digital camera I am likely
to buy.
One intriguing application of the use of personal content is to create
spam filters. Paul Graham (see http://www.paulgraham.com/spam.html)
has written a white paper on the use of Bayesian statistics to develop
highly accurate filters to catch unwanted unsolicited email. This works
by breaking a user's incoming email into spam and not-spam (the user
determines which is which). Then a statistical abstract is taken from
both groups and all further incoming email is measured against these
abstracts. An additional feature is that the filter becomes better the
more you use it, as you continue to build a larger database, which makes
the statistical measures increasingly accurate. It is not hard to imagine
similar processes being applied to the content of "The Processed
Book."
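The process just described can be sketched in a few lines of Python. This is a simplified naive Bayes classifier with add-one smoothing, offered as an assumption-laden illustration rather than as Graham's actual algorithm, which combines per-word probabilities somewhat differently. The class name, the labels "spam" and "ham," and the tokenizer are all invented for the example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word-like tokens."""
    return re.findall(r"[a-z0-9$'-]+", text.lower())


class NaiveBayesFilter:
    """Toy two-class filter; labels and smoothing are illustrative choices."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.message_counts = {"spam": 0, "ham": 0}

    def train(self, text, label):
        """Add one user-labeled message to the abstract for its class."""
        self.word_counts[label].update(tokenize(text))
        self.message_counts[label] += 1

    def spam_probability(self, text):
        """Estimate P(spam | text) from the two statistical abstracts."""
        vocabulary = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_messages = sum(self.message_counts.values())
        log_scores = {}
        for label in ("spam", "ham"):
            counts = self.word_counts[label]
            total_words = sum(counts.values())
            # Prior: smoothed share of training messages carrying this label.
            score = math.log((self.message_counts[label] + 1) / (total_messages + 2))
            for word in tokenize(text):
                # Add-one smoothing keeps unseen words from zeroing the score.
                score += math.log(
                    (counts[word] + 1) / (total_words + len(vocabulary) + 1)
                )
            log_scores[label] = score
        # Turn the two log scores into a probability for the spam class,
        # clamping the difference so that math.exp cannot overflow.
        diff = max(min(log_scores["ham"] - log_scores["spam"], 700.0), -700.0)
        return 1.0 / (1.0 + math.exp(diff))
```

Each call to train enlarges the database, which is why such a filter improves with use. Training the "ham" side on an author's own writing would yield the kind of statistical abstract of a person discussed above.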
Were this essay to become a machine component, its task would be to
serve as my virtual representative--it would become, in other words,
the soul of the machine. Such a machine would incorporate human culture
(mine) into its processes and thus become more human-like in the tasks
it can take on. And why stop with this essay? We could add all the email
I write (and give it a high ranking), all the Web pages I view (and
give them a lower ranking, because reading is not as close to the bone
as writing), and anything that is my personal expression. This is the
ultimate goal of the processed book: to inform a generation of robots,
not to make the world more machine-like but to make machines more human.
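The ranking of sources suggested here, with writing weighted above reading, might look something like the following Python sketch. The weights, the source labels, and the function name are hypothetical choices made for illustration; a real agent would need far more than word counts.

```python
import re
from collections import Counter

# Hypothetical weights: what I write counts for more than what I read.
SOURCE_WEIGHTS = {"essay": 1.0, "email": 1.0, "web_page": 0.3}


def build_profile(documents):
    """Combine (source_kind, text) pairs into one weighted word profile."""
    profile = Counter()
    for kind, text in documents:
        weight = SOURCE_WEIGHTS[kind]
        for word in re.findall(r"[a-z']+", text.lower()):
            profile[word] += weight
    return profile
```

Calling profile.most_common(10) on the result would list the terms that dominate the weighted abstract, a crude first approximation of the proxy described above.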
It should be clear by now how "The Processed Book" would serve
as a network node. All four of the other aspects of processing
would apply here: the portal, self-referencing text, platform, and machine
component. Each of these aspects contributes to the network. Commentary
would sit somewhere between the portal and platform aspects, depending
on which text is doing the pointing and which is being pointed to. "The
Processed Book," in other words, like any written document, develops
a community around it. The relative size of that network depends on
the importance of any particular book--a small network for this essay,
an enormous one for Ted Nelson's Literary Machines. It is noteworthy
that such a network has in fact not sprung up around Literary Machines,
despite that work's enormous importance, almost certainly as a result
of the author's eccentric decision to self-publish, depriving Literary
Machines of the marketing clout of even a modestly sized publisher.
It is an interesting marketing exercise to consider how to build such
a network for "The Processed Book." Most obviously, the paper
should be mounted on a Web server, where it will be indexed by search
engines, which will in turn point users to it. It can also be distributed
in various pre-publication forms, some of which will inevitably end
up on the Web as well (this is already happening). It can be sent around
to interested (and uninterested) parties as an email attachment. Links
to it can be posted in newsgroups. The way to market this book, or any
book, in networked mode is to let the network do the work. This means
relaxing some common controls. Digital Rights Management (DRM), for
example, which can reduce or eliminate the copying of digital works,
may be a good economic decision for Stephen King and John Grisham, but
unknown authors like that of "The Processed Book" are better
off allowing their work to be copied and sent around--and even in some
cases to be changed somewhat. Since a friend posted a draft of this
essay on a Web site, which I noted in two newsgroups, I have been astounded
by the number of responses I have received. The network is working.
Of course, not all network nodes are created equal. (Imagine for a moment
what a computer could do with that sentence. Besides picking up the
reference to "The Gettysburg Address," it would also note
the earlier passage in this essay where the phrase "all men are
created equal" appears and then back into "all men are scum,"
etc. The poor machine!) The book-as-network is a new phenomenon and
we still don't know what the inherent rules for building out such networks
are. Does every node have the potential of building an ever-growing
network, or do some nodes have the potential to diminish or even wipe
out the network aspirations of other nodes, as the wake of a large ship
will overwhelm that of a tiny rowboat? We don't know the answer to this
at this time, but my guess is that in a networked world, the big shall
rule and that the diversity of voices that currently characterizes the
Internet will increasingly become dominated by the roars of a handful
of media empires, barring a regulatory regime. The processed book of
tomorrow will have to fight for attention just as much as yesterday's
primal book.
V.
Although I noted at the outset that the processed
book was not to be confused with physical devices, it is useful to see
how the advent of the processed book will help to influence the shape
of such devices.
The concept of the processed book complements such ideas about electronic
publishing as Alan Kay's Dynabook and Ted Nelson's hypertext and networked
information. The Dynabook is essentially a hardware concept: a portable
hand-held computer that could serve as a viewing device for the world's
knowledge. It differs in an obvious way from the processed book in that
the processed book is about content in digital form, content whose
creation is shaped by the presence of ubiquitous computing, not about
hardware. Of course,
many of the ideas of the Dynabook have now found their way into the
marketplace in the form of personal digital assistants and some aspects
of wireless phones. These devices, among many others, play a role in
the development of the processed book: for viewing, editing, linking,
and communicating or transmitting.
Ted Nelson's vision of non-linear writing closely resembles the concept
of the book-as-network-node, though Nelson ultimately became devoted
to building a system to enable his vision and focused less on the creation
of content. The distinguishing aspects of the processed book are that
(a) it is about content; (b) it outlines how the creation of content
changes in a digital environment; (c) it implies a certain business dimension
(who will build these tools and why); and (d) it points to the increasing
alienation of an author from his or her work as the act of processing
serves to separate the wellsprings of creativity from all the acts of
summarizing, indexing, and abstracting that automation is heir to. Ultimately
Kay and Nelson are humanists, but the processed book is a post-modern
development.
Among the many competing visions for electronic publishing today, one
(mostly favored by established media companies) wants electronic publishing
to look very much like hardcopy publishing, but without the expense
of managing physical inventory. To which I say: Nice work if you can
get it. This vision usually concerns itself with such things as copyright
protection and is inclined to support electronic publishing initiatives
where properties are kept distinct. One outcome of this vision is a
generation of ebooks--in this case, hardware devices--that are dedicated
book readers. The word "dedicated" is important: an ebook
that is only an ebook is fundamentally different from a digital cell
phone or personal computer, which have multiple applications. A dedicated
ebook is a separate device. Most importantly, it is not designed as
a computer peripheral because to do so would mean that the content would
be copied in the process of moving from the computer to the peripheral,
and if it can be copied in that way, it can be copied in many others.
This vision is bound to fail, not because copyright is dead but because
all such books published in this manner will have to compete with books
that draw on the resources of the processed book. A book that stands
by itself literally stands by itself. It competes with an army of networked
information. This is not to say that some individual books are not better
in some important way than a book that is a network node, but to make
the obvious marketing point: the real challenge for creative people
is to get others to pay attention to their work. This is why publishers
exist and why they will continue to exist. Without the support of a
network, most books will get lost amidst the huge outpouring of new
material.
So we shouldn't expect to see dedicated ebook readers. Instead, we will
have reading devices that connect to other computing devices: the wireless
phone with a bigger and better display, for example. In the technology
world, these are called convergence devices. In time we should
expect that we will all carry one--and only one, serving multiple uses.
This means tradeoffs. While a dedicated ebook reader would naturally
be optimized for the reading experience, convergence devices will please
no one entirely. The importance of this is that it will slow down the
acceptance of digital readers as hardcopy simply continues to do a better
job for some functions. Over time, even as we see the hardcopy world
shrink, certain areas will remain mostly in ink and paper, literature
in particular. The processed book will invade professional information
first, college texts second, and then begin to nibble at the edges of
consumer or trade publishing. The inroads of the processed book will
be gradual enough that many people will not notice it happening, even
as they now happily and innocently purchase DVDs of movies that include
all sorts of "non-primal" elements such as previously deleted
scenes and interviews with directors and actors. The processed book
will inevitably take its place on the virtual bookshelf, where it will
be read in front of the fireplace, while the genetically engineered
dog snoozes on its pillow.
|