Tech Comm & Number Theory


As I’ve written previously, mathematics and technical communication (tech comm) both model reality. In math, numbers do not “exist” in the literal sense of the word. You can have 3 coins, but the concept of 3 does not occupy a physical point in time or space; it transcends it. Numbers, therefore, describe the quantities or properties of a person, place or thing but are not actual people, places or things.

Similarly, tech comm is a description of reality but is not reality itself. A guide explaining how to use a smartphone is not a smartphone but a representation of it. The ideas, lessons and concepts in the guide must be interpreted and understood by a human reader; therefore these things exist only in the reader’s mind.

Now, if mathematics and tech comm are attempts to describe reality, it follows that some of the basic principles of math should apply to tech comm.

Numbers are the building blocks of all mathematics. The 10 digits which form all numbers are math’s “alphabet”. However, not all numbers are equal; they fall into various groups.

Natural numbers are all whole positive numbers: 1, 2, 3 and so on. These are the practical, real-world numbers that we use each day when counting, ordering, adding, and so on. They are precise and complete because they exclude fractions or decimals. Any simple, clear and complete positively stated information corresponds to a natural number, for example: Sales increased 7% over last year.

Negative numbers are numbers less than 0. They were first envisioned by the Chinese over 2,000 years ago. There is a theory that the idea of duality in Chinese philosophy made it easier for this culture to develop the idea of a number less than zero.

Any negative statement corresponds to a negative number, for example:

Do not turn off your computer during the installation.

Fractions have at least two parts: the top number of the fraction, or numerator, and the lower portion, or denominator. However, fractions can have more than two parts in the form of a complex fraction, for example (2/3)/(5/7).

Complex modern content management systems (CMS) are actually composed of fractional pieces of information which are reused as required. For example, there may be many procedures which all refer to a specific part number. If you are using a typical word processor to document these procedures and the part number changes, you’d have to manually search and replace every occurrence of this number. However, in a CMS, the part number is stored once in a database as a variable, and therefore only has to be changed once. All references to that part number are then automatically updated. Any piece of information can be an “informational fraction”, from a word, to a sentence, a paragraph, and even a page.
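
To make the “stored once, changed once” idea concrete, here is a minimal sketch in Python. The variable store, the part numbers and the two procedures are all hypothetical stand-ins for a CMS database, not the behaviour of any particular product.

```python
from string import Template

# Hypothetical single-sourced variable store, standing in for a CMS database.
variables = {"part_number": "PN-1001"}

# Two procedures that reference the variable instead of hard-coding the value.
procedures = [
    Template("Order replacement part $part_number from the supplier."),
    Template("Install part $part_number before restarting the unit."),
]

def publish(procs, values):
    """Resolve every variable reference at publish time."""
    return [p.substitute(values) for p in procs]

before = publish(procedures, variables)
variables["part_number"] = "PN-2002"  # the part number changes once, in one place
after = publish(procedures, variables)

for line in after:
    print(line)
```

Every procedure picks up the new part number on the next publish, with no search and replace.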

Irrational numbers have an infinitely long series of non-repeating digits after the decimal place. You can’t write them as a fraction or ratio. Examples include the square root of 2 (1.4142135…) and pi (π), which is equal to 3.14159265…. As you continue down the line of infinite digits, you get incrementally closer and closer to the true value of the number. However, when calculating values, you have to stop at a certain point; you can’t simply go on forever. NASA scientists are able to keep the space station running using only 16 digits of pi. For calculating the fundamental constants of the universe, they need 32 digits.
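
A rough sketch of how quickly extra digits stop mattering: the Python below truncates pi at a few lengths and measures the gap against a 40-place value. The 40-place string is simply written out by hand, and the function names are mine.

```python
from decimal import Decimal, getcontext

getcontext().prec = 50

# First 40 decimal places of pi, written out as a string.
PI_40 = "3.1415926535897932384626433832795028841971"

def truncated_pi(decimals):
    """Keep only the given number of decimal places (truncation, not rounding)."""
    return Decimal(PI_40[: 2 + decimals])

def error_at(decimals):
    """Absolute gap between the truncated value and the 40-place value."""
    return abs(Decimal(PI_40) - truncated_pi(decimals))

errors = [error_at(d) for d in (1, 2, 4, 8, 16)]
for decimals, err in zip((1, 2, 4, 8, 16), errors):
    print(decimals, err)
```

Each batch of digits shrinks the error, but by less and less in absolute terms.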

Irrational communication is comprised of pieces of information which each add ever-decreasing value to the information. For example, let’s say you need to write a step that instructs users how to connect to a wi-fi network. The statement you develop is:

To connect to the wi-fi network, select the ABC_Network, then enter the following password: Pass1532.

For most people, this would suffice. However, what about novice users who don’t even know how to select a wi-fi network? We’d have to add another piece of information (“on your device, under Settings, select Wi-Fi connections”):

To connect to the wi-fi network, on your device, under Settings, select Wi-Fi connections, select the ABC_Network, then enter the following password: Pass1532.

This seems complete, right? But what about people who are not sure what you mean by “device”? To address this, we add even more information:

To connect to the wi-fi network, on your SmartPhone, tablet, laptop or desktop, under Settings, select Wi-Fi connections, select the ABC_Network, then enter the following password: Pass1532.

But what about people who don’t know what a wi-fi network is? We add:

You can use our wi-fi network to connect to the Internet. To connect to the wi-fi network, on your SmartPhone, tablet, laptop or desktop, under Settings, select Wi-Fi connections, select the ABC_Network, then enter the following password: Pass1532.

And what about those poor souls who don’t know what the Internet is?

The Internet is the world’s largest information network. It is used to send and receive information, view news items, images, videos and sound, and to connect with others. You can use our wi-fi network to connect to the Internet. To connect to the wi-fi network, on your SmartPhone, tablet, laptop or desktop, under Settings, select Wi-Fi connections, select the ABC_Network, then enter the following password: Pass1532.

One could go on adding information forever, but I think you get the point. Each piece of new information, just like each additional digit in an irrational number, adds a bit more value to the original piece of information. How many “decimals” of information are required depends on the knowledge level of the average user. Too much information is as bad as not enough.

Now we come to one of the most challenging types of numbers: imaginary. At some point, mathematicians asked: what is the square root of a negative number? There is no clear answer, because no real number multiplied by itself produces a negative number. Two negative numbers multiplied together produce a positive number. To resolve this, mathematicians invented imaginary numbers, written with the letter i. For example, the square root of -9 is 3i.
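
If you want to check the example yourself, Python’s standard cmath module handles square roots of negative numbers. Note that Python writes the imaginary unit as j rather than i.

```python
import cmath

root = cmath.sqrt(-9)   # principal square root of -9
print(root)             # 3j, Python's spelling of 3i
print(root * root)      # (-9+0j), so squaring the root gives back -9
```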

The essence of imaginary numbers is:

  • two numbers are combined together
  • combining numbers normally creates a larger number but in this case actually creates a smaller one, therefore,
  • an inherent contradiction is created

The informational equivalent of an imaginary number is a statement added to another statement that creates a conflict and therefore lowers the value of both statements.

For example:

  1. If you are over 18 years old, complete Form A.
  2. If you are less than 25 years old, complete Form B.

The second statement contradicts the first, and thereby negatively impacts both statements. It is the equivalent of the mathematical i, in this case, the i standing for incomprehensible, impossible, inexcusable and, quite possibly, insane. This is actually a common problem, especially with complex policies, procedures and regulations that are riddled with contradictions.
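
Contradictions like this can be caught mechanically. The sketch below encodes the two statements as age ranges and lists the ages that land on both forms. I am assuming “over 18” means 19 and up, “less than 25” means 24 and under, and an arbitrary ceiling of 120; none of that comes from a real policy.

```python
# Each rule: (form, lowest age it applies to, highest age it applies to).
# Assumed readings: "over 18" = 19 and up, "less than 25" = 24 and under.
rules = [("Form A", 19, 120), ("Form B", 0, 24)]

def forms_for(age, rules):
    """Return every form that the rules tell a person of this age to complete."""
    return [form for form, low, high in rules if low <= age <= high]

def conflicting_ages(rules, max_age=120):
    """Ages for which more than one rule applies."""
    return [age for age in range(max_age + 1) if len(forms_for(age, rules)) > 1]

conflicts = conflicting_ages(rules)
print(conflicts)
```

Anyone aged 19 to 24 is told to complete both forms, which is exactly where the reader gets stuck.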

Finally, an exponent is a number that dramatically increases the value of another number, for example, 3³ which equals 3 x 3 x 3 or 27. Conversely, a square root is a number that, when multiplied by itself, produces the original, larger number; for example, 10 is the square root of 100. In either case, we are changing a small number very rapidly to a much larger one, or vice versa.

If there’s one thing that information experts agree on, it’s that the amount of information in the world has grown exponentially. How much? A quick math lesson is in order. An exabyte is one quintillion (10¹⁸) bytes, which is one billion gigabytes or one thousand million billion bytes, a byte being equivalent to about one letter. One exabyte is up to 3,000 times the size of all the content in the Library of Congress. Between the start of history and 2003, five exabytes of information in total were created. We now create five exabytes every two days. Big data, indeed.
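
The unit arithmetic is easy to verify. This short Python check assumes decimal (power-of-ten) units, which is what the figures above use; the variable names are mine.

```python
EXABYTE = 10**18   # one quintillion bytes
GIGABYTE = 10**9

gigabytes_per_exabyte = EXABYTE // GIGABYTE        # one billion
thousand_million_billion = 10**3 * 10**6 * 10**9   # the same number, spelled out

# "Five exabytes every two days", expressed as bytes per day.
bytes_per_day = 5 * EXABYTE // 2

print(gigabytes_per_exabyte, thousand_million_billion == EXABYTE, bytes_per_day)
```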

The single informational device that has contributed to the ability to access this near-infinite amount of information is a textual object that is absurdly simple yet staggeringly complex: the hyperlink. For with a single hyperlink, a tiny piece of information directly connects to something much larger. (This small link, for example, links to something vast.) The destination of the hyperlink is the exponent of the hyperlink itself; the hyperlink, therefore, is the root of the much larger piece of information that it points to.

It’s no coincidence that the word mathematics literally means “to learn”. The primary goal of tech comm is that the user learns something, whether it is a concept or a task. The connections between mathematics and tech comm are, as with math itself, measured, complex, and infinite.


Rounding up

Rounding is a mathematical process in which a precise number is replaced with a simpler one, such as 1.343 rounded to 1.3. It makes numbers easier to communicate and work with. However, rounding applies not only to math but to all aspects of our existence.
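
Here is the same example as a small Python sketch. It uses the decimal module so that the detail given up by rounding can be shown exactly; the helper name round_to and the choice of half-up rounding are my own.

```python
from decimal import Decimal, ROUND_HALF_UP

def round_to(value, places):
    """Round a decimal string to the given number of decimal places."""
    quantum = Decimal(10) ** -places   # e.g. 0.1 for one decimal place
    return Decimal(value).quantize(quantum, rounding=ROUND_HALF_UP)

rounded = round_to("1.343", 1)
lost = Decimal("1.343") - rounded      # the detail we gave up

print(rounded, lost)
```

The rounded value is easier to say and remember; the lost value is the price paid for that simplicity.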

Starting with the essentials (matter, space and time): all matter is composed of atoms, which in turn are almost 100% space. If you could remove all the space between all the atoms of all the skyscrapers in New York City, they would fit within a matchbox. Why then do we perceive matter as solid? It is because our senses are simply not acute enough to detect the spaces. If we were much smaller (or more sensitive), we would see the spaces. Instead, we “round” the spaces up, filling in the gaps and thereby perceive matter as solid or liquid.

Similarly, we round space. Again, because we cannot perceive vastly small spaces, we round up to the nearest perceptible unit, usually about 1 mm, depending on the situation.

Finally, we round time. When we say it takes 20 minutes to do a task, we generally don’t mean exactly 20 minutes but rather 20 minutes, plus or minus a few minutes. Even for events that we measure precisely, again, because of our perceptual limitations, we cannot perceive tiny amounts of time, such as one ten-thousandth of a second. We round to the nearest second, minute, hour or even day.

We also round our senses. No two people perceive colour, sound, smell, taste and texture the same way. As with matter, space and time, we perceive these things within a certain perceptible range. It would be impossible, for example, to differentiate two nearly identical colours, one .000001% brighter than the other; we round up the colours and see them as identical. You are rounding the text displayed here. Your eyes and mind fill in the pixels this text is composed of to see the letters and words.

Now, if such fundamental and seemingly objective aspects of our existence as matter, space, time and our basic senses are rounded, how much more so the less objective and more ethereal aspects.

Concepts, thoughts, ideas and feelings are constantly “rounded”. In fact, because these things are non-physical, it would be tempting to say that math does not apply and that they cannot be “rounded”. One could argue that it would be ridiculous to say that you could like someone 12% more than someone else, or that a political party is 14% better than another. That may be true, but you can measure aspects of these things. For example, likeability by itself is not measurable, but surveys in which each person rates or ranks their feelings toward others are. The moment you introduce math or statistics, you can have rounding.

Rounding, therefore, is the process of taking something and replacing it with something less precise but easier to understand and perceive. In that sense, it is one of the purest forms of technical communication. For it is the job of a technical communicator to take something complex and simplify it so that it can be practically understood by the reader.

It is a constant struggle to determine the degree to which content should be simplified. Simplify it too much, and you lose valuable information; simplify it too little, and the content becomes inaccessible. Because of rounding, no two technical communicators will ever document something the same way.

May all your content be well-rounded.

 

 

A Lasting Theorem

Here is one of the world’s most difficult mathematical problems:

For the equation aⁿ + bⁿ = cⁿ, where a, b and c are positive whole numbers, there are no solutions when n is a whole number greater than 2. In other words, among whole-number powers of 2 or higher, the equation only works if n = 2.

For example, the following numbers fit this equation:
3² + 4² = 5² and 5² + 12² = 13². If n equals 3 or any larger whole number, you won’t find any solutions to this equation.
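
You can get a feel for the claim with a brute-force search. The Python sketch below looks for solutions among small numbers only; the function name and the search limits are my own choices. Finding nothing up to a limit is of course not a proof, which is exactly why the theorem was so hard.

```python
def solutions(n, limit):
    """All (a, b, c) with 1 <= a <= b <= limit and a**n + b**n == c**n."""
    # a**n + b**n is at most 2 * limit**n, so c can never exceed 2 * limit.
    powers = {c**n: c for c in range(1, 2 * limit + 1)}
    found = []
    for a in range(1, limit + 1):
        for b in range(a, limit + 1):
            c = powers.get(a**n + b**n)
            if c is not None:
                found.append((a, b, c))
    return found

print(solutions(2, 20)[:3])   # squares: plenty of solutions
print(solutions(3, 50))       # cubes: none in this range
```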

This problem is known as Fermat’s Last Theorem, named after the French mathematician Pierre de Fermat, who lived during the 1600s. While annotating a book about mathematics, Fermat claimed to have found a proof. He wrote: “I have discovered a truly marvelous proof of this, which this margin is too narrow to contain.” Too bad he wasn’t using a word processor with its ability to add notes of unlimited size.

The problem remained unsolved for over 350 years until a British mathematician named Andrew Wiles finally conquered it in a monumental 200-page proof. How he solved it is a fascinating adventure into the strange and mysterious world of higher mathematics.

Wiles’ solution involved two very strange mathematical shapes: elliptic curves and modular forms. Elliptic curves resemble doughnuts, whereas modular forms don’t resemble anything and are therefore much more difficult to describe, but here goes:

A modular form is an incredibly complex, highly symmetrical form with many dimensions. It is impossible to draw one because it only exists as a conceptual form.

Elliptic curves and modular forms are very different from each other. However, the solution to the theorem involved proving that these two shapes are, in fact, the same. When the idea that these two forms might be identical was initially proposed, it was a radical concept. It was like saying that an elephant is a banana, which is, quite simply, bananas.

However, in 1995, Wiles proved these two forms were indeed identical. In doing so, he solved Fermat’s Last Theorem. How proving that these two forms were the same also solves Fermat’s Last Theorem is beyond the scope of this article. (For a full explanation, read the PBS transcript from the Nova documentary, The Proof.)

Mathematics and technical communication both attempt to model reality, and both use informational objects to do so. The primary object (or shape) that a technical communicator develops is the information repository, which is comprised of:

  1. Topics (such as overviews or procedures) that answer specific questions.
  2. Containers and sub-containers for the various topics (such as other topics, pages, chapters or other sections).
  3. A function enabling the user to search the topics (an index, TOC, or content search function).
  4. An environment that contains all the topics and the search function (for example, a PDF, help system, or website).

Users deal with another shape: informational queries, which are comprised of:

  1. The generation of specific questions, such as “what is this thing?”, “how do I perform this task?”, “how do I resolve this problem or error?”
  2. The process of determining where to find the answers.
  3. Locating the relevant information repository.
  4. Searching the information repository.
  5. Locating the topic that they hope will answer their question.
  6. Understanding the answer to their question, that is, the contents of the relevant topic.
  7. Successfully resolving their question, for example, by understanding a concept, completing a task or resolving a problem or error.

Both of these shapes require all of their respective components in order to be considered complete shapes. For example, an informational query is incomplete if the user can only complete the first six steps. They may find and understand the relevant help topic, but if they cannot complete it (due to an error in the topic, the product or both), then the informational query is incomplete.

Just as elliptic curves and modular forms, two radically different shapes, were proven to be the same, both information repositories and informational queries are the same. This is because each shape is a reverse-engineered version of the other.

When a technical communicator creates an information repository, they are attempting to recreate the steps that a user will follow in an informational query. Communicators try to anticipate as many of the user’s questions as possible, then work backwards to create a repository that will answer those questions.

We can take the steps of an informational query, change their order (mostly by reversing them), and then structure them from the perspective of the technical communicator:

  1. Consider all the potential questions a user could have.
  2. Create topics that successfully resolve these questions.
  3. Ensure the topics are written so that the user will understand them.
  4. Index the topics so that they are searchable.
  5. Create a search system that will enable the user to find the relevant topic.
  6. Place the information repository in a location where the user can access it.
  7. Make it obvious to the user where the information repository is located.

Conversely, we can reverse engineer an information repository from the user’s perspective:

A user needs to:

  1. Locate the environment containing the relevant document that will answer their question.
  2. Search the topics for the answer.
  3. View the various topics that might contain the answer.
  4. Find the topic that answers the question.

One shape is but a mirror-image of the other.

The commonality goes even further, for all end users are potential communicators, and all communicators should ideally “be” the end user. The greatest documentation is created when end users actually communicate directly with the technical communicator, and when the technical communicator imagines themselves to be the end user, with all of their worries, concerns and, most of all, questions.

In fact, I have developed a formula that proves the number of end users in the world is equal to the number of technical communicators.

Unfortunately, this blog is too small to contain it.

 

The Governing Dynamics of Documentation

Game theory is a specialized field of mathematics that analyzes choices and results in strategic situations, or games, as the players try to maximize their success. It can be applied to practically any situation where one is making a choice for personal benefit, for example:

  • choosing a restaurant
  • hiring a worker
  • buying and selling stocks
  • deciding who to marry
  • playing poker (or any other game)

Game theory has been applied to such diverse fields as economics, evolutionary biology, engineering, political science, psychology, philosophy and business management. Google even uses it to maximize the advertising revenue generated from their AdWords. All of these areas require making the best decisions possible in order to create the maximum benefit.

Game theory takes into account the fact that humans are essentially self-centered, that is, that we tend to act in a way that we think will be best for us. Even when we appear to be selfless, we’re still acting in our own self-interest. For example, giving to charity gives us the benefit of feeling that we’re doing a good thing, a benefit that we’re willing to pay for.

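The Nash equilibrium (also known as governing dynamics) is a central concept in game theory. Strictly, it describes a set of strategies, one per player, in which no player can do better by changing strategy alone. A popular reading is that the individuals in any situation will benefit the most if they do not only what is best for them individually, but also what is best for the entire group.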

This equilibrium was developed by the mathematician John Nash, who was profiled in the wondrous film, A Beautiful Mind. The film gives a graphic example of his theory:

A group of men and women are at a bar. Each of the men wants to pair off with a woman. However, one of the women, a blonde, is more attractive than the others. The question is: should each man go for a less attractive woman, or try for the blonde?

The film’s answer is that no man should try to pick up the blonde. If they all do, odds are they will all be rebuffed. If the men then try to go after the other women, they’ll most likely be rejected because each woman will know that she was the man’s second choice. The best strategy, therefore, is for each man to pursue a woman other than the blonde.

To sum up: the Nash equilibrium states that in any situation involving trade-offs, the maximum benefit is achieved if everyone is making the best decision they can taking into account the decisions of the others.
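
To see what “the best decision given the decisions of the others” looks like in practice, here is a minimal sketch that searches a two-player game for pure-strategy Nash equilibria. The scenario is hypothetical: two writers each pick an index term, and the payoff numbers are invented purely for illustration.

```python
from itertools import product

terms = ["removing", "deleting"]

# payoffs[(a, b)] = (payoff to writer A, payoff to writer B).
# Invented numbers: matching terms helps both writers, mismatches help no one.
payoffs = {
    ("removing", "removing"): (2, 2),
    ("removing", "deleting"): (0, 0),
    ("deleting", "removing"): (0, 0),
    ("deleting", "deleting"): (1, 1),
}

def pure_nash_equilibria(strategies, payoffs):
    """Strategy pairs where neither player gains by switching alone."""
    equilibria = []
    for a, b in product(strategies, repeat=2):
        pay_a, pay_b = payoffs[(a, b)]
        a_cannot_improve = all(payoffs[(alt, b)][0] <= pay_a for alt in strategies)
        b_cannot_improve = all(payoffs[(a, alt)][1] <= pay_b for alt in strategies)
        if a_cannot_improve and b_cannot_improve:
            equilibria.append((a, b))
    return equilibria

print(pure_nash_equilibria(terms, payoffs))
```

Both matching choices are equilibria, even though one pays better than the other. An equilibrium is stable, not necessarily ideal, which is why agreeing on a standard up front matters.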

The Nash equilibrium has been applied to an amazing variety of areas including:

  • arms races
  • technical standards
  • bank runs
  • currency fluctuations
  • traffic flow
  • auctions

Nash’s work was so ground-breaking that in 1994 he shared the Nobel prize in economics for it. (There is no Nobel prize for mathematics.)

The philosophy of this equilibrium can be applied to information development on two different levels: within the architecture of information objects and to the information development process.

Documentation Objects

All documents contain sets of objects that can be thought of as “players” in a game. These objects include:

  • topics
  • cross-references
  • TOC entries
  • index entries
  • graphics

In addition, documents themselves form objects in a documentation set, as part of a group of related documents.

Conflicts will arise if there are two or more objects within each of these areas that are difficult to distinguish. Examples include:

  • topics with similar names, such as Overview and Introduction
  • index entries that begin with the term removing and others that begin with the term deleting
  • two graphics that describe the same thing but in a slightly different way
  • using the same term to describe different things
  • guides with similar names, for example an Administration Guide and a Technical Guide

Conflicts such as these create confusion for the end user, because they have no way of knowing which is the “correct” version. This creates an unnecessary game-like situation for the user, as they struggle to pick the winning object.

To avoid this, all documentation needs to be carefully reviewed and purged of all conflict. The end result is a series of objects that play nicely together. What is best for each individual documentation object is also best for the group of objects; the very essence of governing dynamics.
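
Part of that review can be automated. The sketch below flags index entries whose first words are synonyms of each other. The synonym groups and the sample entries are hypothetical; in practice they would come from your own style guide.

```python
# Hypothetical synonym groups; a real style guide would supply these.
synonym_groups = [{"removing", "deleting"}, {"overview", "introduction"}]

def find_conflicts(entries, synonym_groups):
    """Return the synonym sets where more than one variant starts an entry."""
    first_words = {entry.lower().split()[0] for entry in entries}
    conflicts = []
    for group in synonym_groups:
        used = sorted(group & first_words)
        if len(used) > 1:
            conflicts.append(used)
    return conflicts

entries = ["Removing a user", "Deleting a file", "Overview", "Printing a page"]
print(find_conflicts(entries, synonym_groups))
```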

Multiple Authors

The Nash equilibrium can be applied even more directly to the information development process in a multi-author environment. Many organizations have content management systems in which multiple authors can create, edit and manage their documentation.

While such systems can create a more balanced workload, the potential for conflict is enormous. Even if the system only allows one editor at a time (a standard feature of any content management system), it’s still easy for writers to get into editing conflicts in which whoever updates last “wins”.

This is not a technical issue but a management issue. Writers must understand they are not competing against each other but against incomplete and inaccurate documentation. The end user wants relevant information, and is not interested in the writers’ turf wars. As the Nash equilibrium implies, writers working on common documentation sets need to know that what is best for the group is also what is best for each of them.

There are probably few areas in life to which governing dynamics could not be applied. Therefore, the idea that “life is but a game” is far more true than we can possibly imagine.

The Fractal Factor

They’re in the sky, on the ground, in nature, clothing, special effects and even your cell phone. They’re incredibly simple yet extraordinarily complex. The closer you get to some of them, the further away they seem. They are fractals and they’re everywhere.

Fractals are complex, irregular, endlessly repeating geometric shapes. They can be easily created on a computer but also occur naturally. The classic example is a tree.

It’s Only Natural to Love Fractals

When you look closely at a tree, you’ll see a main section, with branches protruding outwardly from it. Each branch, in turn, is like a mini-tree, with sub-branches sprouting off the main branch. Each sub-branch may also contain a sub-branch. In other words, the tree shape repeats throughout the tree.

Another example is a coastline, which has a certain irregularity or “crinkliness” to its shape. You’ll see the same degree of crinkliness when viewing the coastline from 1 metre, 100 metres, a kilometre or even 10 kilometres – the overall pattern remains the same.

Mountains, flowers, clouds, plants and snowflakes – all of these naturally occurring things are fractals. However, it’s only recently that fractals have been recognized as much more than just pretty patterns. They have real, practical applications in both science and mathematics.

Fractal-shaped antennas are used in mobile devices such as cell phones. It’s been scientifically proven that this type of antenna shape is the most efficient at receiving the widest variety of signals. Without it, your cell phone would resemble a porcupine because it would require so many different antennas.

Many cinematic special effects use fractals. The spectacular lava effects in the finale of the Star Wars film Revenge of the Sith were generated using fractals. Fractals are also used in design, engineering and medicine.

Computer-generated fractals have a particularly unusual property: no matter how much you magnify them, the level of detail does not change. You can see an animated example here.
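
One way to see why the detail never runs out is the Koch curve, a classic fractal that I am using here purely as an illustration. Each step replaces every line segment with four segments one third as long, so every zoom reveals the same pattern again.

```python
from fractions import Fraction

def koch(depth):
    """Segment count, segment length and total length of a Koch curve
    built on a line of length 1."""
    segments = 4**depth
    segment_length = Fraction(1, 3**depth)
    return segments, segment_length, segments * segment_length

for depth in range(4):
    print(depth, koch(depth))
```

The segments keep shrinking while the total length keeps growing, by a factor of 4/3 at every step.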

Info-Fractals

New technology has unknowingly fractalized information. The best example of this is Wikipedia. Open any major topic and you’ll see it’s very detailed and contains dozens, if not hundreds, of hyperlinks. Click a hyperlink within the topic, and it takes you to another detailed topic, again with its own set of hyperlinks. As with fractals, the level of detail remains about the same no matter how much you “zoom in”. This fractalization does not just exist with websites. A complex online help system also allows you to move from one topic to another, with little or no diminishment of detail.

Just as fractals have no real start or end, neither do modern information structures. Although technically both Wikipedia and an online help system have a first and last topic, from the user’s perspective, they do not. Users rarely read documentation linearly – they go directly to the topic they need, perhaps follow some links to get additional information and then they leave. Documentation is not a novel.

Modern documentation, therefore, clearly resembles fractals. However, there is another more important similarity, and to understand it, you need to look at the history of fractals.

Math Wars

Benoît Mandelbrot (1924–2010) is considered the father of fractals. When he first presented his theories, the mathematical community did not take him seriously. They thought the shapes he created were “pretty” but had no practical applications and therefore did not represent genuine mathematics.

These mathematicians were trapped in their traditional, Euclidean view of mathematics: straight lines, simple curves and basic shapes. They simply could not fathom a math that was so irregular. Many years would pass before other mathematicians finally recognized Mandelbrot’s work as genuine mathematics.

Believe It or Not

Our profession suffers from similar disbelief. It’s held by writers who are unable to accept the new way of creating information, specifically XML, where all information is classified by tags and which separates the form from the content.

XML is as different from traditional documentation as fractals are from traditional mathematics. As an example, you may be used to this way of writing:

Printing a Page (Heading 3 paragraph style)

1. From the File menu, select Print. (Numbered paragraph style)
2. Select your print options. (Numbered paragraph style)
3. Click Print. (Numbered paragraph style)
The document prints. (Body paragraph style)

Now try this way of writing:

<procedure>
<title>Printing a Page</title>
<step>From the <ui_element>File</ui_element> menu, select <ui_element>Print</ui_element>.</step>
<step>Select your print options.</step>
<step>Click <ui_element>Print</ui_element>.</step>
<result>The document prints.</result>
</procedure>

Ugly, isn’t it? There’s no formatting or paragraph styles, just tags. Note that this is an extremely simple example. Still, many writers will balk at this and say it’s not really technical writing but “programming” or “coding”.

An Inconvenient Truth

Just as traditional mathematicians initially refused to accept fractals as actual mathematics, many writers are unable to accept XML as actual information development. However, XML is, in fact, information development in its purest form, because it separates the form of the information (the fonts, point sizes, formatting and layout) from the actual content. In addition, XML allows you to modify, manipulate and manage information in ways that are simply impossible using traditional documentation methods.
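
As a rough illustration of separating form from content, the Python sketch below takes a well-formed version of the printing procedure and applies the numbering and layout afterwards. The tag names (procedure, title, step, ui_element, result) are my own illustrative choices, not a real schema such as DITA; the parsing uses Python’s standard xml.etree.ElementTree module.

```python
import xml.etree.ElementTree as ET

# Content only: no fonts, no numbering, no paragraph styles.
source = """
<procedure>
  <title>Printing a Page</title>
  <step>From the <ui_element>File</ui_element> menu, select <ui_element>Print</ui_element>.</step>
  <step>Select your print options.</step>
  <step>Click <ui_element>Print</ui_element>.</step>
  <result>The document prints.</result>
</procedure>
"""

def render_plain(xml_text):
    """Form, applied afterwards: number the steps and lay them out as text."""
    root = ET.fromstring(xml_text.strip())
    lines = [root.findtext("title")]
    for number, step in enumerate(root.findall("step"), start=1):
        # itertext() flattens the step, including the nested ui_element text.
        lines.append(f"{number}. {''.join(step.itertext())}")
    lines.append(root.findtext("result"))
    return "\n".join(lines)

print(render_plain(source))
```

A second function could render the same source as HTML or a PDF layout without touching the content, which is the whole point.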

The choice is ours: we can continue our old ways, or we can change. If you’re struggling with the new ways, then remember the lowly fractal the next time your cell phone rings.

If it weren’t for one person challenging the system, you’d never be able to get that all-important call.

The Incomplete Guide to Gödel

Kurt Gödel was an Austrian-American mathematician who lived from 1906 to 1978. He’s best known for his incompleteness theorems, which have had a tremendous impact not only on mathematics but on other sciences and even philosophy.

Gödel’s theorems were actually a response to the ideas of another mathematician, David Hilbert. Hilbert had a dream of creating a complete, accurate and consistent system for all the mathematical principles (or axioms) that had been discovered to date. Earlier, some of these axioms were discovered to be wrong, so there was tremendous pressure to ensure such errors would never occur again. Hilbert wanted to establish a solid foundation for all mathematics, now and forever.

Sounds like a great idea, right? There was just one problem – to create such a “perfect” system is impossible, and Gödel proved it. He demonstrated that if you created a system rich enough to describe every mathematical truth, it would contain some true statements that could not be proven within the system. In other words, the system would be complete but not fully provable. Conversely, you could create a system that is fully provable, but then it would not be complete.

How Gödel did this is incredibly complex, but I will try to explain. He essentially created a mathematical system using a complex numerical notation, where one of the statements in the system was: This statement cannot be proven.

Now, if this statement cannot be proven, then what it says is true, so the system contains a true statement that it cannot prove and is therefore not fully provable. But, if the statement can be proven, then the system has proven something false (because the statement itself says it can’t be proven), making the system itself inconsistent.

As a workaround, we could just leave this pesky statement out of the system, because it’s causing so many problems. However, if we do that, then the system would be missing a statement, making the system fully provable (yay!), but incomplete (boo!).

To summarize, Gödel was saying that any system can be:

  • complete but not fully provable
    or
  • fully provable but incomplete

This was Gödel’s first incompleteness theorem. His second theorem went further and said that a consistent system of this kind can never prove its own consistency. That is, even a sound system cannot certify, using only its own rules, that it will never contradict itself.

The philosophical impact of Gödel’s theorems is enormous. Extrapolating his ideas, it means that we can never know everything, and that even if we could, some of what we know could never be proven.

The implications for information developers should be clear. All documents are essentially collections of statements, namely theory and procedures. It is impossible to create a document that is complete, and we don’t even need Gödel to show this. Users are complex beings and therefore completely unpredictable. Compounding this is the fact that the product being documented (especially if it’s software) is also unpredictable. Mathematically, we would express these two facts as:

user unpredictability x product unpredictability = lots and lots of unpredictability

It is therefore impossible to create a complete guide that would explain every possible situation. Even if you could, it would contain statements that are not true. Again, we don’t need Gödel to prove this. Too often we see guides that try to be complete, and in doing so sacrifice quality and accuracy. That’s why the mantra of every tech writer when responding to pressure to quickly complete the draft should be: “Do you want it fast or do you want it right?”

More importantly, if you really did try to create a complete guide, it would have a size approaching infinity, which is a rather large number. Some of the best examples of documentation are quick start guides, usually just a few pages or a fold-out poster. Size does matter, but not in the usual way, because less is often much, much more.

This column is now complete, but of course I cannot prove a word of it.

Dr. Drake and the Information Equation

One of the most fascinating films of 1997 was Contact, starring Jodie Foster. Foster portrays Dr. Ellie Arroway, an astronomer who discovers a complex signal transmitted from another world. The signal contains a giant user manual describing how to build a machine that will magically transport a single occupant to another galaxy. The machine is massive and costs $500 billion to build.

The film explores the mad race to build the machine, the decision of who will ride in it, and what they will encounter, if they even survive the trip.

A SETI is Not a Couch
Although the film is pure science fiction, the organization to which Dr. Arroway’s character belongs, the SETI Institute, is quite real. SETI, an abbreviation of “Search for Extraterrestrial Intelligence”, is made up of scientists and researchers working to find intelligent life on other planets.

One of the real scientists working for SETI is Dr. Frank Drake. Drake is a leading American astronomer who in 1960, while working for the National Radio Astronomy Observatory, conducted the first radio search for extraterrestrial intelligence.

The Drake Equation
He developed a formula, called the Drake Equation, which tries to calculate the number of planets in our galaxy with beings who are intelligent enough to communicate with us.

The Drake Equation goes something like this:

N = R* x fp x ne x fl x fi x fc x L
which roughly translates in English to:
N = the number of civilizations within our galaxy that can communicate, which equals:
R* = the rate of formation of stars that are suitable for the development of intelligent life
x
fp = the fraction of these stars that have planets
x
ne = the average number of planets, per star that has planets, with an environment suitable for life
x
fl = the fraction of these planets on which life actually appears
x
fi = the fraction of these planets where intelligent life develops
x
fc = the fraction of these planets that develop communication
x
L = the number of years that these civilizations keep communicating

Based on this formula, Drake estimates there are about 10,000 planets within our galaxy that could communicate with us. Of course, the big question is, why haven’t they?

There is no known solution to this equation, because it is impossible to know what all these numbers are. However, the equation is still a useful tool for scientists. Although they disagree on the numbers, they mostly agree that the factors in this equation are the ones that determine the final number (N).
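
To make the multiplication concrete, here is a minimal Python sketch of the equation. The function name `drake_n`, the parameter names, and every input value are illustrative assumptions, picked only so that the product lands near the 10,000 figure quoted above. They are not Drake's own inputs or measured values.

```python
def drake_n(r_star, f_p, n_e, f_l, f_i, f_c, lifetime_years):
    """Multiply the seven Drake factors together to estimate N."""
    return r_star * f_p * n_e * f_l * f_i * f_c * lifetime_years


# Purely illustrative guesses, not measurements.
n = drake_n(
    r_star=10,                 # R*: suitable stars formed per year
    f_p=0.5,                   # fp: share of those stars with planets
    n_e=0.2,                   # ne: planets with an environment suitable for life
    f_l=1.0,                   # fl: share on which life actually appears
    f_i=0.1,                   # fi: share where intelligent life develops
    f_c=0.1,                   # fc: share that develop communication
    lifetime_years=1_000_000,  # L: the time factor, in years
)
print(f"N is roughly {n:,.0f}")
```

Change any one guess by a factor of ten and N changes by a factor of ten as well, which is why scientists can agree on the factors while disagreeing widely on the answer.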

Don’t Chain Me Down
This formula is a mathematical way of saying that a chain is only as strong as its weakest link: if any of these numbers is 0, then N is 0, and there is no possibility of us Earthlings finding someone else to talk to. That would be unfortunate, if you think of all the potential business opportunities for tech writers if we could ever produce documentation for other planets.
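
The weakest-link effect is easy to check with a short sketch. The factor values here are made-up placeholders, and `math.prod` from the Python standard library simply multiplies them together.

```python
import math

# Made-up placeholder values for the seven factors.
factors = [10, 0.5, 0.2, 1.0, 0.1, 0.1, 1_000_000]
healthy = math.prod(factors)

# Break a single link in the chain by setting one factor to zero.
broken = list(factors)
broken[3] = 0.0
collapsed = math.prod(broken)

print(healthy > 0, collapsed)  # prints "True 0.0"
```

No matter how generous the other six values are, the single zero wipes out the whole product.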
In our field, we don’t seek other planets, but do seek to create the ultimate, perfectly complete and understandable document – this is our holy grail.

The Documentation Equation
Therefore, by applying the Drake Equation to information development, we can derive the Documentation Equation:
V = A x W x Aa x As x Tla x Tls x Tch x Ut x At x Tr x P

In this formula, V is the Value of an information set, or infoset, derived by multiplying the following values together:

  1. A = infoset Awareness
  2. W = Willingness to use infoset
  3. Aa = infoset Access ability
  4. As = infoset Access success
  5. Tla = Topic location ability
  6. Tls = Topic location success
  7. Tch = Topic comprehension
  8. Ut = User task completion
  9. At = Application task completion
  10. Tr = Task relevance
  11. P = Practicality of application

Let’s explore each of these values in turn.

V = Value of Infoset
This is the final value that we are after – a number on a scale of 0 to 100% that rates how valuable an infoset is. An infoset is defined as any grouping of information or documentation, for example, a single document or a related set of documents, an online help system, a website, and so on.

Factors of V
V is derived by multiplying eleven factors together. The following sections describe each of the factors in this equation. The factors involve probabilities of what average users will do in an average information search situation. However, average user and average search situation are difficult things to define.

For the purposes of this formula, we’ll just have to imagine that the factors describe what most users will be doing, or will encounter, most of the time, in most of the situations where they need information to help them complete a task or understand a concept.
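
As a rough sketch of how the multiplication might be carried out, the Python below stores the eleven factors in a dictionary and multiplies them. The key names, the `infoset_value` function, and all the sample probabilities are hypothetical. Only the eleven-factor structure comes from the equation itself.

```python
import math

# Hypothetical key names, one per factor in the equation.
FACTOR_NAMES = (
    "awareness",
    "willingness",
    "access_ability",
    "access_success",
    "topic_location_ability",
    "topic_location_success",
    "topic_comprehension",
    "user_task_completion",
    "application_task_completion",
    "task_relevance",
    "practicality",
)


def infoset_value(factors):
    """Return V: the product of the eleven probabilities (each 0.0 to 1.0)."""
    missing = [name for name in FACTOR_NAMES if name not in factors]
    if missing:
        raise ValueError(f"missing factors: {missing}")
    for name in FACTOR_NAMES:
        if not 0.0 <= factors[name] <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1")
    return math.prod(factors[name] for name in FACTOR_NAMES)


# Made-up sample: every user knows the infoset exists, and
# every other factor is guessed at 95%.
sample = dict.fromkeys(FACTOR_NAMES, 0.95)
sample["awareness"] = 1.0
v = infoset_value(sample)
print(f"V = {v:.0%}")
```

Even with these generous guesses, V comes out at roughly 60%, because ten separate 95% probabilities are being multiplied together.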

1. A = infoset Awareness
This is the probability that the average user knows that the infoset they require exists. In most cases, this should be close to 100%, but sometimes users aren’t even aware there is a manual or help system to refer to. If this is the case, the infoset has no value for that user. Fortunately, most users today have come to expect there will be some sort of information provided with their product, even though they may not always use it – silly users!

2. W = Willingness to use infoset
This is the probability that the average user not only knows that the infoset exists, but is willing to use it. As we all know from experience, many users view documentation with the same affection as going to the dentist: it’s a pain, but it’s something you have to put up with once in a while. If a user does not want to use an infoset, it has no value.

3. Aa = infoset Access ability
This is the probability that the average user not only knows that an infoset exists and wants to use it, but knows where and how to access it. Sometimes a user won’t know the exact location of an infoset. For example, they may not know where to locate a particular PDF file or support website that they need. Again, for these users, the infoset would have no value.

4. As = infoset Access success
This is the probability that the average user not only knows how to access the infoset, but is successfully able to do so. Examples of being unable to access an infoset include trying to view a website that is down, or trying to read a PDF or other type of document that has become corrupt and unreadable. If a user cannot successfully access an infoset, it has no value.

5. Tla = Topic location ability
This is the probability that the average user is not only able to successfully access the infoset, but knows how to locate a specific topic. For example, the user would have to know how to use the “contents”, “index” and “search” functions within an infoset to be able to even start looking for what they need. Inexperienced users may not have a clue where to start. If a user does not know how to locate a specific topic within an infoset, the set has little or no value.

6. Tls = Topic location success
This is the probability that the average user not only knows how to locate a topic in an infoset but successfully does so. This value will depend on the quality of the index, contents and search engine used in the infoset. It does not necessarily depend on the size of the infoset. Users should still be able to effectively search even the largest of infosets, provided the index, contents and search functions are well-developed. If a user cannot locate a specific topic within an infoset, the set has little or no value.

7. Tch = Topic comprehension
This is the probability that the average user can understand the specific topic they find. This factor is fairly straightforward: if a user can find the topic they were looking for but can’t understand it, then the topic has no value to the user. This, in turn, lowers the value of the infoset.

8. Ut = User task completion
This is the probability that the average user can complete the task using the specific topic they found. For the purposes of this formula, there are two basic types of tasks:

  • Procedural tasks: tasks that describe how to complete a specific procedure by following one or more steps.
  • Learning tasks: tasks in which the user simply wants to know more about something but does not actually complete a procedure. Learning tasks include reading an overview, learning high level concepts, and understanding descriptions of terms used in the application.

In many cases, users are searching for procedural tasks. Whether a user can successfully complete a procedural task depends on the clarity and completeness of the steps listed in the task. If the steps are unclear or cannot be easily followed, the topic has no value.
A user can successfully complete a learning task if they can accurately understand the concept, idea or definition they were after. Again, this will depend on the accuracy and clarity of the topic.

9. At = Application task completion
This is the probability that the application itself will correctly complete the requested task after receiving input from the user. This factor does not apply to learning tasks because these only involve understanding by the user, and not any application processing. Unless, of course, the user experiences the dreaded “brain malfunction”.

This is similar to the previous factor but relates to what happens after a user successfully completes all the steps outlined in a procedural task. One would assume, at that point, that the application would take over and do whatever processing is required to complete the task. However, if there is a defect in the application, and the processing fails, then the topic becomes useless. Or to put it another way: the operation was a success, but the patient died.

The fact that this failure is not the information developer’s fault is irrelevant. From the user’s perspective, if they cannot complete the task, then the topic describing that task has no value.

10. Tr = Task relevance
This is the probability that, after the average user finds the topic they were looking for and successfully completes the task it contains, the task itself was actually relevant to them. That is, that it was the correct and appropriate task for what they were trying to do in the first place.

Now, you may argue that if the user has successfully found the topic they were looking for, wouldn’t that automatically mean the topic is relevant to them? In most cases, yes, but sometimes users will find the topic they want, but not what they need.

Microsoft Word – Stop the Insanity!
For example, someone may want to send out a letter to a group of people using Microsoft Word. A novice user, completely unaware of the mail merge feature in Word, assumes they will have to copy and paste the text of the letter several times, and then just change the name on each letter. As the letter changes over time, they assume they will need to do a “search and replace” of the specific text. They therefore search for and successfully find information about copying, pasting, searching and replacing. Instead, what they really needed was information about mail merging.

Now, the big question is: does the information they successfully obtained have value? This question is really a litmus test of what you believe the purpose of information development is:

Is it to give users the information they:
a) want?
or
b) may not know they want, but in fact, need?
If you believe (a), then you will always assign a value to task relevance of 100%.
If you believe (b), then you may assign a value to task relevance of less than 100%.

Neither choice is “right”, but the distinction is something to be aware of as you try to determine the value of information.

One of the greatest challenges in information development is the fact that users don’t know what they don’t know. Great documentation anticipates users’ needs and clearly presents information about how to do things in the most effective way possible, in ways that users may not have even anticipated. In other words, it reads their minds!

11. P = Practicality of application

This is the probability that for the average user, the entire application is practical and relevant to them. This is similar to the previous factor, but deals with the application as a whole rather than the tasks that comprise it.

In most cases, we would hope this value would be close to 100%, but again users don’t know what they don’t know and sometimes work with the wrong tool. For example, a user could use WordPad to create basic documents. However, if they want to add elements such as borders, automatic page numbers, styles, tables, TOCs, indices, auto-numbering, and so on, they should use a full-featured word processor such as Word. Users who are using WordPad when they should be using a full-featured word processor may be doing what they want, but not what is best for their needs in the long run. They may find what they need to do in the WordPad help, but the help has little value if they are using the wrong tool.

Again, it comes down to whether you believe information should only give users what they want, even if what they want is not the right thing. It is the ultimate catch-22: if we try to give users what they need, we are called arrogant, self-righteous, know-it-alls. If we try to give users what they want, we are called weak, unhelpful panderers who are too lazy to tell users what they really need to know. Select whichever insult bothers you less and run with it.

The V Files – “The Truth is out There”
By multiplying all these factors together, we obtain V – the value of the infoset. The highest possible value for V is 100%, which, although theoretically possible, is impossible in practice. This is because there are too many uncontrollable variables, not the least of which are the users themselves. It is impossible to predict if or how every user will try to search for information, if they will find and understand the information, if they will successfully complete the task, and so on.

In fact, because V is derived by multiplying eleven numbers together (rather than adding them), the value of V will usually be very small. For example, if an infoset scored 90% in every variable, the value of V would be 0.90^11 (0.90 raised to the 11th power), or a measly 31%. Even if the infoset scored a whopping 97% for every value, V would be only about 72%.
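
The arithmetic is quick to verify. This snippet simply raises a uniform per-factor score to the eleventh power. The helper name `uniform_v` is just an illustrative label.

```python
def uniform_v(score, factor_count=11):
    """V when every one of the factors has the same score."""
    return score ** factor_count


for score in (0.90, 0.97):
    print(f"{score:.0%} on every factor gives V = {uniform_v(score):.1%}")
```

Run as written, this should report 31.4% for the first case and 71.5% for the second.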

Therefore, V, by itself, is a meaningless value. A score of 31% simply indicates that the particular infoset has a value of 31% of what is theoretically perfect. To give meaning to this value, you’d have to compare it to all other appropriate infosets. So, for example, if similar infosets for similar products scored an average of 23%, and the infoset you developed scored 31%, yours would be considered “above average”. The challenge is to define which infosets are similar enough to the one you are rating to make the comparison meaningful. That is, you want to compare apples to apples, and not apples to coffee mugs.
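
A comparison like this could be sketched in a few lines. The peer scores below are invented so that they average 23%, matching the example. Real numbers would have to come from rating genuinely comparable infosets.

```python
from statistics import mean

# Invented V scores for comparable infosets, chosen to average 23%.
peer_scores = [0.18, 0.21, 0.25, 0.28]
my_score = 0.31

peer_average = mean(peer_scores)
verdict = "above average" if my_score > peer_average else "at or below average"
print(f"peer average {peer_average:.0%}, ours {my_score:.0%}: {verdict}")
```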

Summing Up the Numbers

The Drake Equation and the Information Equation have much in common.

In both, it is impossible to obtain completely accurate numbers for all the values. You may be able to get some of the values if you were to spend huge amounts of money on research; however, you would then have to justify your return on investment.

Therefore, for both equations, the purpose is not necessarily to find the true numbers, but rather to make people aware of the factors that lead to the final value.

Your assignment, therefore, should you choose to accept it, is to find other factors that could be incorporated into the Information Equation. In doing so, you will help solve one of the greatest mysteries of our time:

What makes information have value?