L579 Project 2: Tree Visualization

L579 Project 2: User-Centered Design, Discussion, and Evaluation of a Tree Visualization

Jay Askren and Mark Meiss

Data Set Selection

Because we shared a common interest in genealogy and Jay had some previous experience in writing genealogical software, we decided to work with family tree data. Jay also had some code for reading and parsing GEDCOM, a standard genealogy format that almost all genealogy programs are able to import and export, so we hoped that we could easily adapt it to the IVC when it came time to create a concrete rendering of our visualization. On the other hand, we agreed not to let the desire to use this code dictate our ideas, as we had been warned in advance of this design pitfall.

We also thought that genealogical data would be a good fit for the project description's recommendation of trees with 10 to 500 nodes. (To our surprise, we learned during our user interviews that it is not uncommon for serious genealogists to construct family trees of as many as 10,000 nodes!)

User and Task Analysis

Our target user group consisted of people who use genealogical software on a regular basis to keep track of their family tree data. This is a very loose description, so to get a better idea of the traits of this group, we interviewed the staff of the genealogical center at a local church. The interviewees are familiar with a variety of genealogy programs, use them often themselves, and help others on a regular basis. They have become intimately familiar with the user population, the tasks they try to accomplish, and the problems that arise.

We quickly learned that the group demographics would provide us with a special challenge. The majority of users of genealogical software are elderly and have little experience with computers. Many are not interested in computers for any other reason than to work on genealogy, and they are unlikely to feel comfortable with any excessively technical view of the data. A common complaint is resentment at the rapid changes in operating systems and user interfaces; these users feel betrayed by this instability, and they have become suspicious of new products. This attitude is fostered by their computers themselves, as they often have old or low-end systems that most users would consider obsolete.

There is a single visualization central to every major genealogical program: the pedigree chart, which is a two-dimensional layout centered on the root of the tree. Our interviewees repeatedly emphasized its importance in working with the application; not only is the pedigree chart the primary visualization for reports, but it is also a navigation interface. In many programs, a user may click on an individual in a pedigree chart to draw a new chart centered on that person.

This view of the data affects not only how these users see their data and work with it, but also how they conceptualize it in the first place. They are accustomed to reading these charts and can describe various family relationships with only a glance at the tree structure. The interviewees emphasized that the user community considers the pedigree chart to be intuitive, familiar, and extremely useful. They were completely uninterested in a global view of the data set (many trees contain thousands of nodes) because they felt it would be uninformative and difficult to read. Almost all of the users' time is spent entering data, not viewing it. Most "fancy" genealogical visualizations are actually just pedigree charts printed on large paper, decorated with scanned photographs, or superimposed on the image of a tree.

Since it was clear that the user community would not accept any basic changes to this visualization, we tried to identify their greatest needs that were going unmet by existing software. Among these key challenges were the following:

We were intrigued by the mention of geography, so we pressed for a more complete explanation. After some discussion, we understood that the users had a large and unfulfilled need for a visualization that would help them to tell a story. Story-telling is clearly the greatest source of entertainment and emotional satisfaction in the study of genealogy; it unites the data with oral history and makes the data mean something to users and their families. The interviewees felt that maps were essential to this story-telling, since they add an important (and more tangible) dimension of space to time-based data.

The most important need of the user group was thus for a visualization that still used a pedigree chart for navigation, but could also show geographic information. This visualization of the tree would have to be intuitive to a non-technical audience and provide cues that promote discussion of the data. The goal of the user in our scenario is to look at the data and become inspired to talk about it with others--a social goal rather than a technical one--so clarity and visual appeal would be essential.

Design of Visualization

This was our first sketch of a geographical visualization of the tree data. The panel to the right shows a pedigree chart that would be used for navigation. The current root of the tree is shown in the largest font, and the size of boxes and text then decreases for each generation away from the root. If the user clicks on an individual's name, there would be an animation showing the pedigree chart transforming so that individual becomes the root of the tree. The "siblings" and "spouse" buttons to either side of the root would allow lateral navigation across a single generation.

On the left side of the sketch, we have a combination map and timeline for the individual at the root of the tree. Significant events in the person's life are marked by a letter in a circle: "B" for birth, "M" for marriage, "C" for the birth of a child, and "D" for death. This information is then repeated on the map below; each event is positioned where it occurred, and there is a shaded area around the event that reflects how long the person stayed at that location. The arrows establish the sequence of events. We wanted an interface in which the user could "play" the timeline as an animation; a pointer would move to the right along the timeline as more events appear on the map. The checkboxes at the bottom would allow the user to decide whether to see modern state borders or to have the boundaries change according to history. Our example shows the United States, but a finished application would create a montage of maps based on the individual's particular genealogy.

Because our opportunity to meet with the user experts was limited, we had to perform the initial evaluation of this design ourselves. We first decided to discard the arrows from the map, as they would give a misleading impression about travel and intermediate events not in the record. The shaded areas went away as well, since genealogical data rarely includes reliable "length of stay" information. After much deliberation, we also decided to shed the proposed animation altogether, since the entire display seemed too focused on a single individual rather than the family as a whole. We needed to find a way to involve all the people shown on the pedigree chart in the visualization itself. Animation was also inappropriate in that the users didn't want to do their story-telling while sitting at a computer; they wanted a printout to show others. Finally, we thought that the collection of letters used to mark events on the timeline was arbitrary and confusing. It would be better to show less information than to cause the timeline to be ignored entirely.

We experimented with several different ways of showing connections between the pedigree chart and the map and timeline. Simple labels containing people's names detracted from the ability to understand the visualization at a glance; our visual system is not well suited to matching text on the map to text in the pedigree chart. We tried making the connection more explicit with lines connecting the pedigree chart to map locations, but the results were confusing and unreadable, since following one line in a tangle of them is difficult. We also noticed that the location indicated by a verbal label is somewhat ambiguous in the absence of any other marker.

These observations led us to design our most promising candidate, two views of which are shown to the left. The left image shows the birth locations of each person in the pedigree chart. Instead of a simple black-and-white view, we use color to distinguish the generations, following the familiar rainbow progression (with yellow omitted) from top to bottom. We place each individual on the map using a colored male or female icon with the name floating just above it. These icons are duplicated in the pedigree chart for visual reference. The selection of a human-shaped icon also helps to disambiguate location, since we tend to see a person as standing on a particular spot on the map. We also include birth and death dates in the pedigree chart and augment the "spouse" and "siblings" buttons with small icons.
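The color coding by generation can be illustrated with a small lookup. This is only a sketch under stated assumptions: the class name GenerationColors, the hex values, the choice of red for the root generation, and the wrap-around for deep charts are all ours for illustration and are not taken from the mockups.

```java
/** Hypothetical helper: maps a generation index to a display color. */
class GenerationColors {
    // Rainbow order with yellow omitted, as in the design.
    // The specific hex values are assumed, not measured from the sketches.
    private static final String[] PALETTE = {
        "#FF0000", // red
        "#FF7F00", // orange
        "#00A000", // green
        "#0000FF", // blue
        "#4B0082", // indigo
        "#8F00FF"  // violet
    };

    /**
     * Generation 0 is assumed to be the root of the pedigree chart.
     * Colors wrap around if the chart has more generations than colors.
     */
    static String colorFor(int generation) {
        if (generation < 0) {
            throw new IllegalArgumentException("generation must be non-negative");
        }
        return PALETTE[generation % PALETTE.length];
    }

    public static void main(String[] args) {
        for (int g = 0; g < 4; g++) {
            System.out.println("generation " + g + " -> " + colorFor(g));
        }
    }
}
```

The same lookup would be shared by the map icons, the pedigree chart, and the timeline so that a person keeps one color in every part of the display.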

In the second view, we have replaced the figure caption with a timeline that shows the lifespan of each person in the pedigree chart. The timeline uses the same color coding as the rest of the visualization and makes it easy to see at a glance which people were alive at the same time and for how long. We also repeat the rainbow progression so that the order of names in the timeline will seem natural.
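What the timeline lets a reader judge by eye, namely which people were alive at the same time and for how long, amounts to an interval overlap. The following is a minimal sketch of that calculation; the Lifespan class, its year-only granularity, and the sample people are hypothetical and are not part of our mockups.

```java
/** Hypothetical record of one bar on the timeline, with year granularity only. */
class Lifespan {
    final String name;
    final int birthYear;
    final int deathYear;

    Lifespan(String name, int birthYear, int deathYear) {
        if (deathYear < birthYear) {
            throw new IllegalArgumentException("death year precedes birth year");
        }
        this.name = name;
        this.birthYear = birthYear;
        this.deathYear = deathYear;
    }

    /** Number of years both people were alive; 0 if the lifespans do not overlap. */
    static int yearsInCommon(Lifespan a, Lifespan b) {
        int start = Math.max(a.birthYear, b.birthYear);
        int end = Math.min(a.deathYear, b.deathYear);
        return Math.max(0, end - start);
    }

    public static void main(String[] args) {
        // Made-up people and dates, used only to exercise the method.
        Lifespan grandparent = new Lifespan("Grandparent", 1800, 1870);
        Lifespan grandchild = new Lifespan("Grandchild", 1850, 1900);
        System.out.println(yearsInCommon(grandparent, grandchild) + " years in common");
    }
}
```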

Visualization with IVC

We were able to port some GEDCOM genealogy data to the XML tree format supported by the IVC Software Framework. To accomplish this, we used a Java program to convert the GEDCOM to XML. The Java code can be found here. To run the Java code, use the following command:

java -cp GedComConverter.jar GedcomFile OutputFile

Here, GedcomFile and OutputFile should be replaced with the appropriate file names. We then created an XSLT stylesheet to convert the XML data to the Peruse tree format used by the IVC. This stylesheet can be found here. Unfortunately, few visualizations have been implemented for the Peruse format, and only the radial graph visualization worked for this tree. This graph for Abraham Lincoln's family tree is shown here. This data can be downloaded here and loaded into the IVC. Not satisfied with this visualization, we contacted the IVC development team to find out how to show the data in the other visualizations supported by the IVC. After exchanging several emails with the development team, we were finally able to hand-code this file with 33 individuals from Jay's family tree to test in the IVC. Using this alternate file format, the IVC can display the following visualizations:

Tree Visualization, Tree Map, Radial Graph, Balloon Graph

More than likely, the above visualizations would scare our user base. For this reason we decided not to use these visualizations with our users. The IVC also does not appear to support the type of visualization we would like to do. A "live" display would require simultaneous display of the pedigree chart, the timeline, and GIS data. We settled on using xfig and GIMP to generate our sketches by hand. While it's not reasonable to expect the IVC to handle a visualization as complex as this one, support for working with GIS data would be a wonderful addition to the system.
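The GEDCOM-to-XML step described earlier in this section can be illustrated with a short sketch. It is not the converter we used, and its element and attribute names (gedcom, id, value) are made up for illustration; they are not the Peruse format or the output of our stylesheet. It relies only on the basic GEDCOM line layout, a level number, an optional @...@ identifier, a tag, and an optional value, and it assumes well-formed input in which each level increases by at most one.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

/** Illustrative only: not the converter used in this project. */
class GedcomToXmlSketch {

    /** Escapes the characters that are unsafe inside an XML attribute value. */
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }

    /** Converts well-formed GEDCOM lines into nested XML, one element per line. */
    static String convert(List<String> lines) {
        StringBuilder out = new StringBuilder("<gedcom>\n");
        Deque<String> open = new ArrayDeque<>(); // tags of elements still open
        for (String raw : lines) {
            String line = raw.trim();
            if (line.isEmpty()) continue;
            String[] parts = line.split(" ", 3);
            if (parts.length < 2) {
                throw new IllegalArgumentException("Bad GEDCOM line: " + line);
            }
            int level = Integer.parseInt(parts[0]);
            String xref = null;
            String tag = parts[1];
            String value = parts.length > 2 ? parts[2] : null;
            if (tag.startsWith("@")) {
                // Record line such as "0 @I1@ INDI": the id comes before the tag.
                if (value == null) {
                    throw new IllegalArgumentException("Missing tag: " + line);
                }
                xref = tag;
                String[] tagAndValue = value.split(" ", 2);
                tag = tagAndValue[0];
                value = tagAndValue.length > 1 ? tagAndValue[1] : null;
            }
            // Close every element at the same or a deeper level than this line.
            while (open.size() > level) closeTop(out, open);
            indent(out, open.size() + 1);
            out.append('<').append(tag);
            if (xref != null) out.append(" id=\"").append(escape(xref)).append('"');
            if (value != null) out.append(" value=\"").append(escape(value)).append('"');
            out.append(">\n");
            open.push(tag);
        }
        while (!open.isEmpty()) closeTop(out, open);
        return out.append("</gedcom>\n").toString();
    }

    private static void closeTop(StringBuilder out, Deque<String> open) {
        String tag = open.pop();
        indent(out, open.size() + 1);
        out.append("</").append(tag).append(">\n");
    }

    private static void indent(StringBuilder out, int depth) {
        for (int i = 0; i < depth; i++) out.append("  ");
    }

    public static void main(String[] args) {
        System.out.print(convert(Arrays.asList(
            "0 @I1@ INDI",
            "1 NAME Abraham /Lincoln/",
            "1 BIRT",
            "2 DATE 12 FEB 1809")));
    }
}
```

Keeping a stack of open tags lets the nesting follow the GEDCOM level numbers directly, which is why a further transformation such as an XSLT stylesheet can then treat the result as an ordinary XML tree.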

Usability Study

For our usability study, Jay went back to the family history center at a local church and asked the same two staff members for feedback on the visualization mockups. Flora and Beryl gave several interesting insights into the visualization.

The first thing they mentioned is that they thought it would be better if arrows were drawn on the map between locations to show how the families migrated. They mentioned that it would be interesting to overlay on the map migration patterns and old roads that people would have traveled on. Examples given were the National Road, the Oregon Trail, or specific railroads of the day. The users wanted to see how the routes taken by their ancestors compared to traditional routes taken by other travelers. It was later mentioned that it would be nice if the most likely type of transportation could also be shown along with the route. For instance, if a family moved in the mid-1800s, it may be appropriate to show a covered wagon or perhaps oxen or horses. If the route was along a river or across the ocean, it would be appropriate to show a boat. If it was in the mid-1900s, it would be appropriate to show a car, and so on. This could give the user an idea of not only where families moved, but also when they moved. Unfortunately, showing this information is difficult, since genealogy data does not typically record specifically when families moved and what routes they took. Some approximations may be possible, though.

Another problem that the users pointed out was readability. They were looking at a paper copy of the visualization, and thus the names of the people on the map were very tiny. Not being able to actually read these names, both users assumed that they were county names. Along the same lines, the users were asked if they could tell which individuals were male and which were female. After looking at the diagram for some time, they finally noticed that the icons were different for the different genders. They suggested possibly using the male and female symbols so that they could be differentiated more easily. Since the typical genealogy software user is older and often doesn't have the best eyesight, it is very important to make sure that words are big enough to be read and that the symbols are different enough to be easily distinguished. They also mentioned that it would be nice if the county in which each individual lived could be highlighted so that it would be easier to match a person with a county on the map.

Finally, the users were asked what additional features would be nice to add to the visualization. They mentioned they would like to zoom in and out and see the individual counties and townships. In each county they wanted to be able to see the cemeteries, churches, county seats, etc., so they could know where to go to do genealogy research. They also felt it would be useful to be able to see what the surrounding counties are. It's quite common that an individual might have birth records in one county and baptism records in a church in a neighboring county because they lived so close to the county line. It was also mentioned that it would be nice to show plat maps, which show ownership information for each plot of land. Another feature that was mentioned was the ability to see maps in different years and especially to see the different county names in different years. Quite often, an individual appears in a different county not because they moved but because the county name or the county boundary changed.


In the interests of privacy, the names shown in our pedigree chart are fictional, as are their birth and death dates. They were randomly generated from first and last name lists distributed by the U.S. Census Bureau.


  1. Jay Askren and Mark Meiss. Interview with Flora Barker and Beryl Poteat, 2005.
  2. Jay Askren. Informal Usability Study with Flora Barker and Beryl Poteat, 2005.
  3. Katy Börner. Lecture on user and task analysis in visualization. Indiana University School of Library and Information Science, 2005.
  4. Pat Hanrahan. To draw a tree. Stanford University Computer Science Department, 2001.

Last modified: Wed Feb 9 18:34:13 EST 2005