Thursday, August 07, 2008

Notes on Tufte's The Visual Display of Quantitative Information

A while ago I attended one of Tufte's all-day seminars and got copies of all his books. Although they are all very interesting and full of great stuff, I like the first book, The Visual Display of Quantitative Information, the best. In my estimation, the other books build on the concepts introduced in this book rather than presenting really new ones. While I was reading, I took some notes as a reminder to myself of what he has to say about what makes visual data good and bad. I thought I'd put them here in case anyone else is curious.

In general, graphical displays should:

  1. Show data

  2. Induce the viewer to think about the substance rather than about the methodology, design or technique of the production

  3. Avoid distortion

  4. Present many numbers in a small space

  5. Make large data sets coherent

  6. Encourage the eye to compare different pieces of data

  7. Reveal data at several levels--from broad to fine

  8. Serve a clear purpose--description, exploration, tabulation or decoration

  9. Be closely integrated with verbal/written descriptions

"Graphics reveal data." p.13

Graphical excellence is the efficient communication of complex quantitative ideas. p15

"Time-series displays are at their best for big data sets with real variability. Why waste the power of data graphics on simple linear changes, which can be better summarized with one or two numbers? Instead, graphics should be reserved for the richer, more complex, more difficult statistical material." p 30

"...small, non-comparative, highly labeled data sets usually belong in tables." p 33

The Small Multiple: a closely spaced group of graphs that use the same design technique to show changes in data. p42

"The relational graphic...is the greatest of all graphical designs." p47

Principles of Graphical Excellence:

  1. Graphical excellence is the well-designed presentation of interesting data--a matter of substance, of statistics, and of design.

  2. Graphical excellence consists of complex ideas communicated with clarity, precision and efficiency.

  3. Graphical excellence is that which gives the viewer the greatest number of ideas in the shortest time with the least ink in the smallest space.

  4. Graphical excellence is almost always multivariate.

  5. Graphical excellence requires telling the truth about data. p51

"Tables usually outperform graphics in reporting on small data sets of 20 numbers or less. The special power of graphics comes in the display of large data sets." p56

Lie Factor = (size of effect shown in graphic) / (size of effect in data) p57
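To make the ratio concrete, here is a minimal sketch. It assumes that "size of effect" is measured as the relative change between two values, and the numbers (a quantity that doubles while its bar is drawn four times as long) are hypothetical, not taken from the book.

```python
def effect_size(first, second):
    """Relative change from the first value to the second (assumed measure of effect)."""
    return (second - first) / first


def lie_factor(graphic_first, graphic_second, data_first, data_second):
    """Size of effect shown in the graphic divided by size of effect in the data."""
    return effect_size(graphic_first, graphic_second) / effect_size(data_first, data_second)


# Hypothetical example: the data doubles from 10 to 20 (a 100% increase),
# but the bar on the page grows from 1.0 cm to 4.0 cm (a 300% increase).
factor = lie_factor(1.0, 4.0, 10.0, 20.0)
print(factor)  # 3.0
```

A factor of 1 means the graphic shows the effect at its true size; the further it is from 1, the more the graphic distorts the data.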

6 Principles of Graphical Integrity:

  1. The representation of numbers, as physically measured on the surface of the graph itself, should be directly proportional to the numerical quantities represented.

  2. Clear, detailed and thorough labeling should be used to defeat graphical distortion and ambiguity. Write out explanations of the data on the graphic itself. Label important events in the data.

  3. Show data variation, not design variation.

  4. In time-series displays of money, deflated and standardized units of monetary measurement are nearly always better than nominal units.

  5. The number of information-carrying (variable) dimensions depicted should not exceed the number of dimensions in the data.

  6. Graphics must not quote data out of context.
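Principle 4 is the one that translates most directly into arithmetic. The sketch below shows one common way to deflate a nominal money series with a price index before plotting it. The function name `to_real`, the spending figures, and the index values are all made up for illustration.

```python
def to_real(nominal, price_index, base_index):
    """Convert a nominal amount into base-year units using a price index."""
    return nominal * base_index / price_index


# Hypothetical nominal spending and a hypothetical price index, keyed by year.
nominal_spending = {2000: 100.0, 2004: 120.0, 2008: 150.0}
price_index = {2000: 100.0, 2004: 110.0, 2008: 125.0}
base_year = 2008

# Express every year's spending in base-year (2008) units.
real_spending = {
    year: to_real(amount, price_index[year], price_index[base_year])
    for year, amount in nominal_spending.items()
}

for year in sorted(real_spending):
    print(year, round(real_spending[year], 2))
```

In nominal units this series rises 50 percent; in deflated units it rises only 20 percent, which is the kind of difference a nominal plot hides.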

5 Principles in the Theory of Data Graphics:

  1. Above all else, show the data.

  2. Maximize the data-ink ratio.

  3. Erase non-data-ink.

  4. Erase redundant data-ink.

  5. Revise and edit.
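The data-ink ratio is the share of a graphic's ink that is devoted to the data itself: data-ink divided by the total ink used to print the graphic. A rough sketch of how erasing non-data-ink raises it follows; the ink amounts are hypothetical and in arbitrary units (say, pixels).

```python
def data_ink_ratio(data_ink, total_ink):
    """Proportion of a graphic's ink that presents data (between 0 and 1)."""
    if total_ink <= 0:
        raise ValueError("total_ink must be positive")
    return data_ink / total_ink


# Hypothetical chart: 300 units of ink draw the data, 700 units draw
# grid lines, borders and decoration.
data_ink = 300.0
non_data_ink = 700.0
before = data_ink_ratio(data_ink, data_ink + non_data_ink)

# Erase 500 units of non-data-ink (heavy grid, box, shading) and recompute.
after = data_ink_ratio(data_ink, data_ink + (non_data_ink - 500.0))

print(before, after)  # 0.3 0.6
```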

"Graphics can be designed to have at least 3 viewing depths: 1. what is seen from a distance, an overall structure usually aggregated from an underlying microstructure; 2. what is seen up close and in detail, the fine structure of the data; and 3. what is seen implicitly, underlying the graphic--that which is behind the graphic." p155

"Different visual angles for different aspects of the data also organize graphical information. Each separate line of sight should remain unchanged (preferably horizontal or vertical) as the eye watches for data variation off the flat of the line of sight. For multivariate work, several clear lines can be created." p 155

Small Multiples:

  1. inevitably comparative

  2. deftly multivariate

  3. shrunken, high-density graphics

  4. usually based on a large data matrix

  5. drawn almost entirely with data-ink

  6. efficient in interpretation

  7. often narrative in content, showing shifts in the relationship between variables as the index variable changes (thereby revealing interaction or multiplicative effects). p175

"Maximize data density and the size of the data matrix, within reason." p168

"Graphical elegance is often found in simplicity of design and complexity of data." p177

"Visually attractive graphics also gather their power from content and interpretations beyond the immediate display of some numbers. The best graphics are about the useful and important, about life and death, about the universe. Beautiful graphics do not traffic with the trivial." p 177

Attractive displays of statistical information:

  1. have a properly chosen format and design

  2. use words, numbers and drawing together

  3. reflect a balance, a proportion, a sense of relevant scale

  4. display an accessible complexity of detail

  5. often have a narrative quality, a story to tell about the data

  6. are drawn in a professional manner, with the technical details of production done with care

  7. avoid content-free decoration, including chart-junk. p177

"A table is almost always better than a dumb pie chart...Given their low data density and failure to order numbers along a visual dimension, pie charts should never be used." p178

"Tables also work well when the data presentation requires many localized comparisons...one supertable is far better than a hundred little bar charts." p179

The principle of data/text integration is: Data graphics are paragraphs about data and should be treated as such. p181

"Tables and graphics should be run into the text whenever possible." p181

"Words should tell the viewer how to read the design and not what to read in terms of content." p182

The Friendly Data Graphic:

  • words are spelled out, mysteries and elaborate encoding avoided

  • words run from left to right, the usual direction for reading occidental languages

  • little messages help explain data

  • elaborately encoded shadings, cross-hatching, and colors are avoided; instead labels are placed on the graphic itself; no legend is required

  • graphic attracts viewer, provokes curiosity

  • colors, if used, are chosen so that the color-deficient and color-blind can make sense of the graphic

  • type is clear, precise, modest; lettering may be done by hand

  • type is upper-and-lower case, with serifs

Unfriendly data graphics:

  • abbreviations abound, requiring the viewer to sort through text to decode abbreviations

  • words run vertically, particularly along the Y-axis; words run in several different directions

  • graphic is cryptic, requires repeated references to scattered text

  • obscure codings require going back and forth between legend and graphic

  • graphic is repellent, filled with chart-junk

  • design insensitive to color-deficient viewers; red and green used for essential contrasts

  • type is clotted, overbearing

  • type is all capitals, sans serif p183

"Graphics should tend toward the horizontal, greater in length than in height" p186

The "Golden Rectangle" is 1:1.618, or a/b = b/(a+b), where a is the shorter side and b the longer. p189
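As a quick check that the two descriptions agree, the sketch below builds a rectangle with sides in the golden ratio and confirms that a/b equals b/(a+b). Treating a as the shorter side and b as the longer is my reading of the formula.

```python
import math

# The golden ratio, the positive solution of x**2 = x + 1.
PHI = (1 + math.sqrt(5)) / 2

a = 1.0      # shorter side
b = PHI * a  # longer side, about 1.618

print(round(b, 3))                 # 1.618
print(a / b, b / (a + b))          # both about 0.618
print(math.isclose(a / b, b / (a + b)))
```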

General rules:

  • If the nature of the data suggests the shape of the graphic, follow that suggestion

  • otherwise, move toward horizontal graphics about 50 percent wider than tall. p190
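The second rule is easy to turn into a sizing helper. This is only a sketch: the function name `graphic_size` and the 4-inch height are made up, and the 1.5 default is just "about 50 percent wider than tall" written as a ratio.

```python
def graphic_size(height, width_to_height=1.5):
    """Return (width, height) for a graphic with the given width-to-height ratio."""
    if height <= 0 or width_to_height <= 0:
        raise ValueError("height and ratio must be positive")
    return (height * width_to_height, height)


# Hypothetical 4-inch-tall graphic at the default ratio.
print(graphic_size(4.0))  # (6.0, 4.0)

# If the data suggests another shape, pass that ratio instead.
print(graphic_size(4.0, width_to_height=1.0))  # (4.0, 4.0)
```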