By Chris Douce
This article highlights some of the research papers and books cited during an interesting discussion that began with the seemingly innocuous question:
This original posting was later expanded, defining 'harder to read':
The word readability was key. Readability can be read in different ways.
On the one hand, readability can be synonymous with that equally difficult term, comprehensibility; on the other, it can be viewed as perceptibility, particularly in the case of programming tools such as text editors and development environments. This raises interesting questions, especially since many of our software tools still use the somewhat impoverished VT100 terminal environment.
The subject of readability (in perceptibility terms) is one that immediately begins to cross boundaries. One of the earliest posts introduced the topic of typography.
Is there something special about reading from the screen as opposed to reading from the page? HCI practitioners have been aware of such differences for a considerable time. After all, we usually read source directly from the screen before choosing to print, so that we can annotate our printouts with notes and arrows (using our highly viscous pens and pencils).
One of the most appropriate references to be suggested was:
A related paper is:
Other interesting papers include:
Frank Wales unearthed an interesting link that explores the origins of a programming convention that we know as CamelCase:
The immediate question is: is there empirical research that tells us definitively that CamelCase is more useful than writingtextlikethis? One hypothesis is that it enhances readability because of the different shapes of upper- and lower-case letters and, of course, because it allows us to identify word boundaries more easily.
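The word-boundary part of that hypothesis can be made concrete with a small sketch. It assumes a simple convention in which each capital letter (or run of capitals, as in an acronym) starts a new word. The function name `split_camel_case` and the regular expression are illustrative choices, not something taken from the discussion.

```python
import re
from typing import List


def split_camel_case(identifier: str) -> List[str]:
    """Split an identifier into words at its case boundaries.

    Assumed convention: a capital letter starts a new word, and a run
    of capitals followed by a capitalised word is treated as an acronym.
    """
    pattern = (
        r"[A-Z]+(?=[A-Z][a-z])"  # acronym directly before a capitalised word
        r"|[A-Z]?[a-z]+"         # capitalised or all-lowercase word
        r"|[A-Z]+"               # trailing run of capitals
        r"|\d+"                  # run of digits
    )
    return re.findall(pattern, identifier)


print(split_camel_case("writingTextLikeThis"))  # ['writing', 'Text', 'Like', 'This']
print(split_camel_case("parseHTTPResponse"))    # ['parse', 'HTTP', 'Response']
print(split_camel_case("writingtextlikethis"))  # ['writingtextlikethis']
```

The last call shows the point of the hypothesis: without case changes, neither a program nor a reader's perceptual system has any boundary cue to work with, and the words have to be recovered from knowledge of the language alone.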
During this discussion, Derek Jones provided useful pointers to his commentary on the C99 specification. Two links are particularly appropriate, providing us with a set of very interesting references:
In this section from his forthcoming book, Derek describes what is called 'early vision', the phase of vision performed without apparent effort, and goes on to describe the rules of gestalt perception, edge detection, the processes of reading (eye fixation), and some interesting models of reading.
The second link explores the issue of identifiers and their processing in greater depth, providing a wealth of associated information that is again well referenced.
Particularly relevant are the sections on identifier spelling, identifier spelling choices and further explorations relating to human language, memorability, confusability (my favourite) and usability.
Derek also provides us with a reference to research surrounding extreme case alternation (which could descend into altercation if used in anger):
There is an interesting distinction that can be made here between intended and unrelated case changes. However, in terms of understanding the perceptual system that programmers constantly use, this research is particularly pertinent.
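To illustrate the distinction, here is a minimal sketch of what 'unrelated' case changes look like when applied to an identifier. The function `alternate_case` is a hypothetical helper written for this article; it is not the stimulus-generation procedure used in any of the cited studies.

```python
def alternate_case(text: str) -> str:
    """Alternate the case of successive letters, ignoring word structure."""
    result = []
    make_upper = False  # the first letter is lower-cased
    for ch in text:
        if ch.isalpha():
            result.append(ch.upper() if make_upper else ch.lower())
            make_upper = not make_upper
        else:
            # Non-letters pass through and do not affect the alternation.
            result.append(ch)
    return "".join(result)


print(alternate_case("identifier name"))  # iDeNtIfIeR nAmE
print(alternate_case("camelCase"))        # cAmElCaSe
```

In `camelCase` the single case change is intended and marks a word boundary; in `cAmElCaSe` the case changes carry no information about the words at all.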
Programmers use secondary notation (such as indentation), and using particular typefaces to highlight program syntax (or role, or slice) could potentially support programming (and its related activities).
CamelCase could be perceived as a form of secondary notation for identifiers, where the case alteration distinguishes between useful identifier elements.
We are reminded of two other useful and related references:
An interesting point was raised when somebody asked: 'surely there are more interesting/useful topics to perform research on!' This is undeniably true. CamelCase (and identifier naming in general), however, is the stuff of programmer arguments, which indicates that programming style is a topic that will remain in fashion for some considerable time.