The Poster
The poster layout was chosen as a compromise between essential text, visual appeal and clarity. The broad nature of the analysis required a significant proportion of the poster to contain explanatory text orienting the reader to the detail of the analysis. Bold, coloured text was used to allow a 'tl;dr' precis to be extracted.
The title and subtitle, 'Accident Hotspots: 2009-2016, Seven Years of Road Traffic Accidents in Swansea', were chosen because they are brief and draw interest without ambiguity. They sit in a banner at the top of the poster to afford prominence and to prompt the reader to begin at the top left. The sentence at the top of the page, 'Swansea has fewer accidents...', is included as a 'tagline' to give a brief cue to the nature of the findings.
The layout is portrait and split into thirds, with the reader encouraged to proceed left to right in three stages. This process is facilitated through numbering and a horizontal lane splitting the sections into three. This approach attempts to reduce the risk of information overload when encountering a new analysis by producing 'packets' of information to consume in a step-wise manner. The analysis proceeds in progressively decreasing time frames - from year to minute and then culminates in an amalgam of spatial and temporal data in the form of a choropleth.
Radial Bar Plot and Regular Bar Plot
Before addressing the choice of calendar heatmap, I want to first discuss the middle third. The reasoning behind this is that, through these 'conventional' visualisations, I will justify the deviation from convention in this context. In terms of accuracy of observer estimates, utilising length, height or position in plotting is generally preferred over area, angle, weight or colour. This is underpinned by evidence from visual science, which has demonstrated that position along a common scale, as seen in bar and scatter plots, is more easily interpreted than slope, direction, angle or area. The bar plot sits at the top of the hierarchy of elementary perceptual tasks (fig 1.). The Week in Accidents bar plot (fig 2.) was chosen for simplicity and ease of interpretation. A rapid analysis of differences between days, and of the change from 2009 to 2016, is facilitated by this plot. It follows logically from the calendar heatmap seen above and is drawn from the same data. Paired bars were used as they are more easily interpreted than stacked columns and emphasise the reduction in the rate of accidents both per day and across the entire week. The minimalist theme avoids distraction without compromising essential detail, and the colour palette aligns with the rest of the poster. In contrast to the bar plot, the radial column plot that sits in the centre third of the poster was chosen for a number of reasons despite its limitations in perceptual terms. Plots based on angle and length sit within the bottom half of the hierarchy; in this case, the angle of the bar and its location in the circle represent time, and the length represents the variance present within the bin. This may appear a poor choice where a traditional time series graphic might ease interpretation.
However, methods of visualisation based on lower parameters may be appropriate when the aim is not to facilitate precise judgements in data interpretation but to reveal general patterns relating to the dimensions of the data. The radial map resembles a traditional analogue clock face, and is therefore extremely familiar to those viewing it. The decoding required of the observer is simple and corresponds with a skill learned early in life (telling the time). The graphic emphasises the peak periods of road traffic accidents around 9 am and 5 pm. The colour scheme also serves to highlight the differences from day to night, with two transition periods in late evening and early morning. It is worth noting that the binning here is in 30 minute aggregates. This arose out of necessity, due to the sampling method in the STATS19 dataset. The original chart was designed as a minute by minute visualisation of 24 hours, i.e. 24 * 60 = 1,440 columns. When charted, it revealed a cyclical variation in accident occurrence which was at first difficult to interpret. A large number of accidents corresponded with whole divisions of the hours of the day, i.e. on half hour intervals, with smaller numbers occurring within these periods (see fig 2). Without separate evidence to corroborate the finding that accidents are more common on the hour or half hour throughout the day, it would appear that those individuals collecting the data were rounding up or down when recording the time of the event. Accidents that occurred at 11:57, for example, were disproportionately recorded as 12:00. The thirty minute interval bins aggregated these together and served to limit the column to column variability without affecting the overall accuracy of the graph. It should also be noted that a log scale was used in the creation of this graph. The wide variation in accident number was difficult to plot on a linear scale and obscured the larger pattern.
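The half hour aggregation and log scaling described above can be sketched as follows. This is a minimal illustration and not the code used for the poster: the function names, the 'HH:MM' input format, bin edges on the hour and half hour, a 24-hour dial, and the log10(1 + n) transform are all assumptions made for the example.

```python
import math
from collections import Counter

BIN_MINUTES = 30  # bin width described in the text


def half_hour_bin(time_str):
    """Return the index (0-47) of the 30-minute bin containing an 'HH:MM' time."""
    hours, minutes = map(int, time_str.split(":"))
    return (hours * 60 + minutes) // BIN_MINUTES


def bin_counts(times):
    """Count accidents per 30-minute bin; bins with no accidents are kept as zero."""
    counts = Counter(half_hour_bin(t) for t in times)
    return [counts.get(i, 0) for i in range(24 * 60 // BIN_MINUTES)]


def bar_geometry(counts):
    """Map each bin to (start angle in degrees from midnight, log-scaled bar length)."""
    step = 360 / len(counts)
    # log10(1 + n) keeps empty bins at zero length while compressing large counts.
    return [(i * step, math.log10(1 + n)) for i, n in enumerate(counts)]


# Hypothetical recorded times, with heaping on the hour and half hour.
times = ["08:30", "08:30", "08:47", "09:00", "09:00", "09:00", "17:00", "17:12"]
counts = bin_counts(times)
geometry = bar_geometry(counts)
```

Each heaped value (for example 09:00) lands in the same bin as the less frequent times that follow it, which is what smooths the column to column variability seen in the minute by minute chart.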
Calendar Heatmap
Most calendars are laid out in tabular format, with days read from left to right, top to bottom. This traditional structure has no intuitive visual relationship with the units of time that it represents - the month, week and day. However, it is so pervasive that most individuals would have no problem in orienting themselves to a calendar without any direction or cues. Much like the clock face in the radial bar example, an intuitive understanding of calendar structure allows its form to be exploited in graphing variations where day, week and month differences are important. Here trends in the data are more important than precision, and the format permits a rapid orientation to the graph's take-home message. It allows the observer to 'relate' to the message in a manner that brings association with lived experience, e.g. winter driving. Furthermore, it is a visually appealing format that attracts interest from viewers. Poster displays can often contain a great number of posters, and average viewing times are very low. Distinctive imagery is advantageous in this setting, but is difficult to achieve when the work does not generate visual data through experimentation, e.g. microscopy images.
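The tabular structure that a calendar heatmap relies on can be sketched with the Python standard library. This is an illustrative arrangement only, assuming weeks run Monday to Sunday and that daily accident counts are already available; the function name and the sample counts are hypothetical.

```python
import calendar
from datetime import date


def month_grid(year, month, daily_counts):
    """Arrange daily counts into calendar rows (weeks) and columns (Mon-Sun).

    daily_counts maps datetime.date -> count. Days with no record count as
    zero; cells that fall outside the month are None.
    """
    grid = []
    # monthcalendar() returns one list per week, with 0 for days outside the month.
    for week in calendar.monthcalendar(year, month):
        grid.append([
            daily_counts.get(date(year, month, day), 0) if day else None
            for day in week
        ])
    return grid


# Hypothetical counts for two days in January 2016.
counts = {date(2016, 1, 1): 4, date(2016, 1, 4): 2}
grid = month_grid(2016, 1, counts)
```

Each cell of the grid would then be filled with a colour according to its count, so that the familiar calendar layout carries the day, week and month structure without further labelling.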
Choropleth Map of Change in Accident Rate
The choropleth approach is useful in visualising ordinal, ratio or interval data with geographic variation - for example, accident quantiles. Road traffic accidents are recorded by Lower Layer Super Output Area (LSOA) of accident location, as well as by latitude and longitude. The LSOA is a small unit of population, so there are both benefits and limitations to its use in choropleths. For small or middle sized metropolitan scale maps LSOAs work well, capturing granular data at a level that is visually interpretable on the small maps used in poster presentations. Furthermore, unlike maps built from large geographical units, the map is not dominated by single regions, e.g. the U.S. or Russia on a world choropleth. At larger geographical map sizes, LSOAs can prove difficult to interpret due to the excess of detail. This is particularly true if the patterns of relationships are more subtle than rural versus urban. I opted to focus on Swansea; however, an all-Wales choropleth would have enhanced the poster by allowing visualisation of Swansea within the greater context. Space was at a premium, so it was not included - an option would be to include a link to the larger map.
Change in accident rate was chosen as the metric because the aim was to emphasise the relative incidence of road traffic accident injuries in Swansea during the period of interest. Maps plotting absolute counts or relative rates are useful in identifying areas of high incidence and have been widely produced with the STATS19 data. They can, however, mask areas of improvement or relative decline that are useful in forming future road policy.
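The difference between mapping counts and mapping change can be illustrated with a small sketch. The exact definition of 'change in accident rate' is not stated here, so the code assumes one plausible version: accidents per 100 residents in each year, with the change expressed in percentage points. The function names and all figures are hypothetical.

```python
def rate_per_100(accidents, population):
    """Accidents per 100 residents (an assumed definition of 'rate')."""
    return 100 * accidents / population


def rate_change(start, end):
    """Percentage-point change in rate between two (accidents, population) pairs."""
    return rate_per_100(*end) - rate_per_100(*start)


# Hypothetical LSOAs: (accidents, population) in 2009 and in 2016.
lsoas = {
    "LSOA A": ((30, 1500), (18, 1500)),  # high count, but improving
    "LSOA B": ((3, 1600), (8, 1600)),    # low count, but deteriorating
}
changes = {name: rate_change(start, end) for name, (start, end) in lsoas.items()}
```

On a map of counts, LSOA A would stand out as the problem area in both years; on a map of change, it appears as an improvement while LSOA B is flagged as deteriorating.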
Classification of data is an important part of the creation of choropleth maps and can have a significant impact on their appearance. Examining the distribution of the data, we can see that it is positively skewed, with over 60% falling into the lower bins (minor improvement or stability). In choosing classes of 1% there is a risk of losing intra-class discriminating detail, because so many LSOAs are within 1% of each other. Despite this risk, the 1% binning was chosen as it emphasises improvement versus deterioration. Selection based on quantiles would also have been a useful approach. Cutting the data into a number of equal sized bins would better demonstrate differences that lie within the 1% bins. The result, however, would have emphasised very small differences between classes, and would have binned small increases, stable LSOAs and small decreases into single classes.
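The trade-off between the two classification schemes can be seen in a small sketch. The values below are hypothetical and chosen only to mimic a distribution clustered just below zero; the helper names are illustrative, and the quantile cut points come from the standard library's statistics.quantiles with its default method.

```python
import math
import statistics


def fixed_width_class(value, width=1.0):
    """Index of the fixed-width class; class k covers [k*width, (k+1)*width)."""
    return math.floor(value / width)


def quantile_class(value, cut_points):
    """Index of the quantile class containing value, given ascending cut points."""
    return sum(value > cut for cut in cut_points)


# Hypothetical changes in accident rate (percentage points), clustered near zero.
changes = [-2.4, -0.9, -0.7, -0.6, -0.4, -0.3, -0.2, -0.1, 0.1, 1.3]

fixed = [fixed_width_class(c) for c in changes]
cuts = statistics.quantiles(changes, n=5)  # four cut points, five equal-sized classes
by_quantile = [quantile_class(c, cuts) for c in changes]
```

With 1% classes, seven of the ten values share the single class just below zero, but every class boundary respects the split between improvement and deterioration. With quintiles, each class holds two values and the clustered values are separated, but the top class groups a near-stable value (0.1) with a clear increase (1.3).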
A divergent colour scheme was chosen in order to emphasise distance and direction of deviation from zero change. These changes should be interpreted with caution, as they are often based on very small numbers of accidents per LSOA. The near uniformity of the direction of change, however, represents a visual analogue of the significant difference in accident rate across the entire region from 2009 to 2016.
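A divergent scheme of this kind can be sketched as a mapping that is symmetric about zero, so that equal distances from 'no change' receive equal colour intensity in either direction. The palette (blue for decreases, red for increases), the function names and the clamping limit are assumptions for illustration and are not the poster's actual colours.

```python
def lerp(a, b, t):
    """Linear interpolation between a and b for t in [0, 1]."""
    return a + (b - a) * t


def diverging_colour(value, limit,
                     low=(33, 102, 172), mid=(247, 247, 247), high=(178, 24, 43)):
    """Map value in [-limit, limit] to a hex colour, with zero at the neutral midpoint."""
    t = max(-1.0, min(1.0, value / limit))  # clamp so outliers take the end colours
    end = high if t > 0 else low
    rgb = [round(lerp(m, e, abs(t))) for m, e in zip(mid, end)]
    return "#{:02x}{:02x}{:02x}".format(*rgb)


no_change = diverging_colour(0, 3)
large_increase = diverging_colour(3, 3)
large_decrease = diverging_colour(-5, 3)  # beyond the limit, so clamped
```

Using the same limit on both sides of zero keeps the legend honest: a pale cell always means a small change, whatever its direction.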
Future Work
The STATS19 dataset is recorded each year, so the analysis can be repeated in coming years to see whether these trends are maintained. Of particular interest would be establishing the reasons for the Friday peak. Literature in this area is mixed; however, the suggestion is that the increase relates to driver fatigue as the week progresses. One analysis could look at whether accidents on this day occur at different times to the rest of the week, involve a disproportionate number of distractions (recorded in self-reporting), or
References
1. Cleveland and McGill. Graphical Perception: Theory, experimentation and applications to the development of graphical methods. JASA 79 (387): 531-554; 1984.
2. Cairo A. The Truthful Art: Data, Charts, and Maps for Communication. New Riders; 2016.