"The soul never thinks without a picture." Aristotle

Interactive visualizations

Visualizations

I added a chart of unemployment in Connecticut, using data from the Connecticut Department of Labor. I also added a couple of interactive charts of childhood fatalities in Arizona (see links to the left). More information is available from the Arizona Child Fatality Review Program (http://www.azdhs.gov/phs/owch/cfr.htm). As with the other charts, pull the slider at the bottom to move across time.

Recently I attended a talk at Massachusetts General Hospital about correlations between obesity and breast cancer. The speaker showed a visualization from the Centers for Disease Control and Prevention (CDC) website that used color to show the change in the percentage of obese people in each state over 13 years. The color change is interesting because it lets us see which geographic regions changed more than others. But color alone doesn't show how much the percentages changed, because the color scale is arbitrary: we can, for example, assign a large color change to a small change in value. My visualizations (see list on the left) show the change on a vertical axis so that we can see the relative change in percentages.

The data covers the percentage of people who are obese in each of the United States from 1995 to 2007, and comes from this page at the CDC website. I chose this dataset not because I have a connection to it, but because it is simple to use and understand as I develop the software to visualize datasets. It's 50 states, 13 years. It contains the percentage of people with a Body Mass Index in the obese range (30 or higher) in each state, in each year.

Drag the triangle on the time slider at the bottom to move across years. Mouse over the dots or bars to see which state each one represents.
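To make the point about color scales concrete, here is a minimal Python sketch. It is not the code behind the CDC map or my charts; the function name `lerp_color`, the white-to-dark-red ramp, and the sample percentages are all assumptions chosen for illustration. It maps the same two values to colors under two different scale ranges.

```python
def lerp_color(value, vmin, vmax, low=(255, 255, 255), high=(128, 0, 0)):
    """Linearly map value in [vmin, vmax] to an RGB color between low and high."""
    t = (value - vmin) / (vmax - vmin)
    t = max(0.0, min(1.0, t))  # clamp values that fall outside the scale
    return tuple(round(a + t * (b - a)) for a, b in zip(low, high))


# Hypothetical obesity percentages for one state, not taken from the CDC data.
before, after = 15.0, 25.0

# Same data, two color scales: one fitted tightly to the data, one spanning 0-100.
for vmin, vmax in ((10.0, 30.0), (0.0, 100.0)):
    print(f"scale {vmin:.0f}-{vmax:.0f}:",
          lerp_color(before, vmin, vmax), "->", lerp_color(after, vmin, vmax))
```

With the tight scale, the two colors end up far apart; with the 0 to 100 scale, they are close together. The underlying change is the same 10 percentage points in both cases, which is why color by itself says little about magnitude.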

  • The Misleading Scatterplot is misleading because the data shown is a percentage, but the maximum value on the Y axis is 35, so the change appears larger than it actually is.
  • This Scatterplot of Obesity Data is an improvement because the maximum value on the Y axis is 100, so you get a better sense of how much change there is.
  • The BarChart of Obesity Data is better still: since we're talking about percentages, it makes sense to fill a space that represents 100 percent. This gives us a better feel for the change in obesity rates over time.
I'll post more examples as I develop them.
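The effect of the Y axis maximum in the charts above can be checked with a little arithmetic. The Python sketch below is only an illustration under assumed numbers: `visual_fraction` is a hypothetical helper, and the start and end percentages are made up rather than read from the dataset. It computes how much of the chart's height a given change occupies for an axis that tops out at 35 versus one that tops out at 100.

```python
def visual_fraction(start_pct, end_pct, axis_max, axis_min=0.0):
    """Fraction of the axis height covered by a change from start_pct to end_pct."""
    if axis_max <= axis_min:
        raise ValueError("axis_max must be greater than axis_min")
    return abs(end_pct - start_pct) / (axis_max - axis_min)


# Hypothetical values for illustration, not taken from the CDC data.
start, end = 15.0, 25.0

for axis_max in (35.0, 100.0):
    share = visual_fraction(start, end, axis_max)
    print(f"axis max {axis_max:.0f}: change fills {share:.1%} of the chart height")
```

On the axis that stops at 35, a 10-point change takes up more than a quarter of the chart; on the axis that runs to 100, it takes up a tenth, which matches what the number actually means as a percentage.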

Visualization in general

When I was an undergraduate, one day a painted line appeared on the path in the middle of campus. The line stretched far across the campus, so I followed it to its end. When I got there, it turned out to be the longest bar in a bar chart of the current United States budget. It represented the defense budget relative to some other items in the budget. It struck me that without a picture, it's really hard to understand the size of a number that large. Some schools collect 6 million paper clips, or pennies, so that students can understand the size of the number of people killed in the Holocaust. The number itself is hard to internalize.

When I first started working with gene expression data, I didn't understand it until I pulled it into our interactive visualization environment and explored. Any time there is a big pile of overwhelming data, visualization helps in understanding what data is there and what truths are contained in it. Visualization is useful for exploration, education and understanding. Scientific studies result in papers filled with thousands of words; sometimes the right visualization can make that data clear in a way that makes it accessible and useful.

I work at the Institute for Visualization and Perception Research lab at the University of Massachusetts, Lowell. We wrote a visualization environment for data exploration. It's desktop based and generic enough to be used with any dataset. It's a great starting place for data exploration. But to show a specific dataset on the internet, accessible to anyone, a visualization that is simple and designed for the right subset of the data can provide powerful insight and understanding. Adobe Flash Player is a good tool for interactive visualization and is installed on almost all connected computers.

Let me know if you have questions or suggestions for making it clearer.