Psychology of Programming
Editor: Chris Douce
Welcome to what has now become the Summer 06 edition of the PPIG newsletter. Quite a lot of time has passed since the previous issue, and quite a lot has happened too. This edition is jam-packed with useful information, discussions and developments.
This issue begins with some preliminary information about the forthcoming 2006 workshop which will be held in September in the sunny seaside resort of Brighton, UK. This is followed by a review of the 'unroll your ideas' workshop held back in January.
This newsletter contains a report on an 'ethnographies of code' workshop run at Lancaster University. Due to the cross-over of interests, this event was attended by a number of delegates who also attend PPIG events.
Ramanee Peiris has kindly written a review of ITiCSE 2006.
Martin Sustrik has written a short paper that explores the relationship between programming languages and natural languages – a subject that many readers have an interest in.
Alan Blackwell answers a frequently asked question, namely 'whatever happened to ESP (Empirical Studies of Programmers)?' Alan provides a clear answer.
Chris Douce considers what aspects of programming make him happy and angry (sometimes at the same time) by addressing what role emotion may play in the activity of programming.
Many thanks are extended to all contributors – your input is warmly welcomed.
The next issue of the newsletter will be published in late September and edited by Francoise Detienne. Its publication will coincide with a special issue of the CSI (Computer Society of India) Journal. The focus of the newsletter will be collaborative and group software development.
The eighteenth Annual Psychology of Programming workshop is to be held from September 7-8 in Brighton, UK. It is co-located with the IEEE Symposium on Visual Languages and Human-Centric Computing (VL/HCC'06) and SoftVis'06.
Brighton is a lively cosmopolitan seaside city that affords participants an excellent environment for scientific discussion. The conference will be held at the Holiday Inn - Brighton Seafront, currently being refurbished to a very high standard.
Virtual Brighton provides a wealth of information about Brighton, the south coast's 'London by the Sea'. Brighton provides a wide range of shops, restaurants, a royal palace, museums, cafes, galleries, theatres and a wide range of leisure activities in and around the city. All meeting venues, accommodation, shops, restaurants and other tourist attractions are within easy walking distance of each other and Brighton itself has easy access by road, rail, air and sea.
Virtual Brighton & Hove Tourism provides a wide range of information on Brighton accommodation, attractions & walks, eating out, pubs, shopping, maps and routes, the seafront, and Brighton’s history and heritage.
Visit Brighton has a detailed list of attractions and places to visit, activities and events, accommodation, shopping and various interactive maps to help plan out a visit to Brighton & Hove.
More information is available through the PPIG 2006 page at Sussex University.
A limited subsidy for students who wish to attend this event is available. Please contact Maria Kutar for further information.
January 12-13 2006
Coventry School of Art and Design, Coventry University, UK
This was a fascinating and very lively SmallPig workshop. A large proportion of the participants were new to PPIG, yet the ethos of a PPIG workshop, with its focus on lively constructive discussion, was achieved.
Thanks should be given to Prof. Stephen Payne who was invited to give the opening address. He gave a talk entitled 'Deciding how to spend your time....' which presented the results of three very different experiments, interpreted as showing: People allocate their time and effort across a set of texts, or tasks, by monitoring their 'gain' and switching between texts / tasks whenever their gain drops below threshold. When reading multiple texts, or time-sharing across similar tasks this strategy is quite adaptive. However during exploratory learning of a multi-function device it leads to maladaptive thematic vagabonding.*
Highlights of the discussion papers included Richard Bornat and Saeed Dehnadi (Middlesex), 'Separating the sheep from the goats', in which we were presented with research indicating that they have devised a test able to pick out, from a group with no previous programming experience, the population that can program, before their programming course begins. We look forward to hearing more about this at future workshops.
The second day began with an invited talk from Prof Bob Newman (Associate Dean Faculty of Engineering and Computing, Coventry University). In his discussion on 'Coders vs Programmers' we thought that we had reached the Holy Grail of a definition of a PPIG programmer when he told us that 'Programmers are creative problem solvers, interested in doing clever difficult things'. Alas, he completed the sentence with the words 'even if they do them badly', and so our search continues.
Sarah Mount (Coventry), in her paper 'The usability of compiler output for learner programmers' observed that the problem with teaching programming is neither the staff nor the students, rather it is the University's lack of commitment to funding an army of highly trained ninja parrots who could take tutorials. Two speakers (Ruba ab Husna (Coventry) and Richard Bartlett (Mapperley Games)) discussed IT in schools. This was a welcome focus, given that PPIG is more accustomed to discussion of issues in Higher education.
The obligatory Cognitive Dimensions session provided an introduction to the topic for the new PPIGers who were then well placed to receive Luke Church's (Cambridge) papers which used cats and dogs to illustrate his Cognitive Dimensions analysis of security issues, and contained a unique blend of usability and security.
The final papers from David King (Open) and Sarah Mount (Coventry) were open discussion papers, inviting comments on research in progress. These were well received and sparked interesting debate. The primary aim of the SmallPig workshops is to provide a forum in which people are able to discuss research in progress and so it is encouraging to see that this is happening in practice.
Finally, a huge thank you must go to Andree Woodcock, (School of Art and Design, Coventry University) and her team for organising such an enjoyable and successful workshop.
* with thanks to Prof. Payne for this summary
March 30-31 2006
University of Lancaster, UK
by John Rooksby
The Ethnographies of Code workshop was held at Lancaster University on the 30th and 31st of March 2006. It attracted about fifty participants, with sixteen full papers and eight position papers. It was a very international affair, featuring a good mix of computer scientists, sociologists, psychologists and philosophers. The proceedings of the workshop have been published.
The idea behind the workshop was to provide a forum for researchers who are trying to bring social analysis beyond its usual sticking points of 'human factors' or 'culture', and into a position where it can be brought to bear on technology design. The title 'Ethnographies of Code' might not seem to make a great deal of sense (Ethnography being a term that literally means writing about people) - unless you accept that code is saturated with, and indivisible from social phenomena. The subtitle 'Computer Programs as the Lived Work of Computer Programming' borrows from Eric Livingston's writings on mathematics (see for example his book The Ethnomethodological Foundations of Mathematics), using his notion of 'lived work' to stress that the program is never free from the practices of programming.
The keynote speakers were Adrian Mackenzie from Lancaster University and Tom Rodden from Nottingham University. Adrian Mackenzie is a lecturer in cultural studies and is well known for his work on 'cultures of code'. His talk covered diverse themes from code-art, to representations of programming in films, to the commodification of code and formations of communities of programmers. He has recently published a book called Cutting Code. Tom Rodden's talk was on mixed reality games and players' development of strategies. He described the problems for ethnographers of observing these games and discussed 'technology probes' as being useful means of investigation.
The first paper session began with a presentation by Dave Martin, one of the workshop organisers. This was on the practical knowledge of programmers in finding and working their way around a large code base. Monika Büscher, also from Lancaster then presented a video of finding and fixing a networking problem and discussed how technical resources are made palpable by the workers involved in order to achieve this. The final paper in this session was by Steinar Kristoffersen from The University of Oslo on how programming entails certain kinds of design decisions and on how an 'epistopics of design' might account for this.
The second session began with Christian Greiffenhagen and Wes Sharrock from Manchester University discussing video of a maths lecturer working at the blackboard. Barry Brown from Glasgow University discussed how it is that programmers work from line to line. Stuart Reeves from The University of Nottingham then presented an ethnography of himself (much in the style of Livingston) as he developed a program.
The third session began with Julia Prior from the University of Technology in Sydney discussing the role of infrastructure in code production. Catalina Danis from the IBM TJ Watson Research Center discussed her study of collaboration in developing code for high performance computers. Phillipe Rouchy from Blekinge Institute of Technology then took a historical approach in a study of PROLOG and shifting professional dynamics.
The first session of the second day had related papers from Sebastian Jekutsch and Frank Schesslinger, both from the Free University in Berlin. The first paper detailed an annotation scheme for programming analysis, and the second presented software for use in doing analysis. The room looked on in envy at this software. Marjahan Begum then discussed a cognitive study of the strategies of novice programmers.
Christophe Lejeune from The University of Technology in Troyes presented his study of the development of an open source web directory. Chris Douce then presented his paper on the practices of programmers in rooting out useful information in other programmers' blogs. Gabriele Gramelsberger from the Free University in Berlin then presented her research on the status of code in some scientific research as being the embodiment of scientific theory, and the kinds of practices entailed when working with such code.
The closing talk was by Morana Alac from the University of California, San Diego. She presented a phenomenological study of developers attempting to get the arm movements of a robot interviewer to look human.
by Ramanee Peiris
A very warm end of June in the beautiful city of Bologna was the setting for the ACM-SIGCSE ITiCSE 2006 conference. Over 200 delegates enjoyed the food, wine and a range of presentations covering Computer Science Education, under the heading Freedom in Teaching Computer Science.
These included new slants on old topics, such as design-patterns-first and puzzles-first. There were also ideas to engage school pupils by having an after-school Java programming club for 12-13 year olds. Their projects developed the popular idea of having learners read and develop someone else's code. The school pupils went away with their own game running on their mobile phones.
Six working groups studied subjects in more detail including whether the objects-first debate is grounded in research, whilst posters covered subjects from story telling to various aspects of whether gender is or isn’t a factor in CS Education.
Panel discussions included the "Programming Languages" course, and six views on its content. Should one teach students from a historical perspective, study one language in detail, or provide a taster of several?
There were two keynote presentations, Roberto Di Cosmo spoke on "Educating the e-Citizen", where he felt that free software is the key, and warned us to never stop questioning the technology. Alison Young & Logan Muller gave a thought provoking presentation on their project in Peru, and they challenged Computer Science not to follow the footsteps of Nuclear Science - we need to question our priorities and build on the knowledge of other communities.
Full papers will appear in the next issue of the SIGCSE Bulletin; working group reports will follow in a later edition. All contributions are available from the ACM Digital Library.
Thanks to the organisers - Renzo Davoli, Michael Goldweber and Paola Salomoni. Next year, why not join us in Dundee?
Ramanee is a Lecturer in the School of Computing at the University of Dundee, Scotland.
by Chris Douce
by Kent Beck
Addison Wesley, 2001
Test Driven Development by Example is one of those books that I've been threatening to read for a while. From time to time I'm faced with writing code of a kind that I have never written before. I can usually come up with a solution (with varying amounts of API reading and head scratching) but sometimes I'm not as confident in the solution as I would like to be.
To get over the feeling of uncertainty, and following one of the ideas of Extreme Programming, I have occasionally used two 'xUnit' modules – JUnit to unit test some elements of Java programming, and NUnit to test some of my .NET code (apparently there's also something called HttpUnit too!).
The idea of test driven development is that you write a test before writing the code. Here are some of the key phases: add tests (after learning about the requirements), run the tests to see them fail, write the program code, run the tests again to see them succeed, then refactor to remove duplication. Sounds easy! But not all types of software can be written this way – you'll see what I mean in a later paragraph.
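The cycle can be sketched in miniature. (The book itself works in Java with JUnit; the Python below, and the `Dollar` class in it, are my own illustrative sketch, loosely modelled on the book's money example rather than code taken from it.)

```python
import unittest

# Step 1: write the test first. At this point the Dollar class does
# not yet exist, so running the test fails - the 'red bar'.
class TestDollar(unittest.TestCase):
    def test_multiplication(self):
        five = Dollar(5)
        self.assertEqual(Dollar(15), five.times(3))

# Step 2: write just enough code to make the test pass.
class Dollar:
    def __init__(self, amount):
        self.amount = amount

    def times(self, multiplier):
        return Dollar(self.amount * multiplier)

    def __eq__(self, other):
        return isinstance(other, Dollar) and self.amount == other.amount

# Step 3: run the tests again - the 'green bar' - then refactor,
# re-running the tests to check nothing has broken.
if __name__ == "__main__":
    unittest.main(exit=False)
```

Running the file before step 2 produces the red bar; afterwards, green – which is the whole of the rhythm the book describes.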
My adoption of unit testing has always been based on pragmatics – the need to ensure that a series of functions are as correct as they can be. Plus, writing the tests within the code itself gives one great advantage: you write a form of test specification that can also, incidentally, be checked programmatically against your code. It must be said that my practice has always been of the 'test last' variety.
TDD by Example is divided into three parts: 'the money example', 'the xUnit example' and 'patterns for test-driven development'. My favourite part of the book is the third section, the part about patterns (the money example doesn't inspire, but I can see how it shows the ideas behind the tests very clearly).
The 'pattern' section is explicitly related to the idea of software design patterns in one section, but it would have been better called 'idioms' or 'tips' instead. These 'tips' encapsulate development heuristics or tactics that can be used to obtain a greater understanding of a code base. I can see the motivation for many of the 'red and green bar' patterns (ways to use the testing tools), but being an empiricist, I would have liked to have seen them grounded in firmer study.
TDD is clearly related to the idea of cognitive dimensions – particularly to viscosity. By adding tests you construct a parallel representation of your existing codebase. In doing so you increase its viscosity – you ensure that elements of the system are more reluctant to change. If you make a minor feature enhancement, the 'red bar' may indicate that you have violated the rules of the existing test cases allowing the compiler and codebase to protect you from yourself.
I have a favourite testing pattern, one that is clearly related to the notion of a 'secondary notation'. With the Broken Test pattern, Beck writes, 'how do you leave a programming session when you're programming alone? Leave the test broken.' His motivation is simple: get the codebase to tell you what you were doing before you had to 'break off' to do something else. 'A broken test doesn't make the program any less finished, it just makes the status of the program manifest. The ability to pick up a thread of development quickly after weeks of hiatus is worth that little twinge of walking away from a red bar'.
The use of xUnit-based testing is essentially a simple and pragmatic idea. Whilst reading it there were some nagging questions that the author clearly acknowledges towards the end of the book: you can't test GUIs, distributed objects or database schema, for example. You certainly can't test the essential validity of your requirements – are you building the right product, for example (although you may be led to construct some tests that may help you to question them).
You can't currently use a 'test first' approach to determine how your software sits within a larger ecosystem of software products. Final pre-release or beta testing might be dependent on devices, operating systems, web servers and virtual machine frameworks.
Beck throws down the gauntlet to us. Does TDD create software with reduced defects? 'Where do I get off claiming such a thing?', he writes. 'Do I have scientific proof? No. No studies have categorically demonstrated the difference between TDD and any of the many alternatives in quality, productivity or fun.'
One interesting thing about Beck's writing is that the book is filled with emotion. The preface refers to courage. Testing can reduce programmer stress and tests should be written 'until fear is transformed into boredom'. Another quote: 'having a green bar [on the test program] feels completely different from having a red bar… you can refactor from there with confidence'. Testing seems to be as much about the programmer as it is the resulting software artifacts.
I can't help but notice a growth in the number of studies that explore the phenomenon that is Extreme Programming. I hope that as researchers tease apart elements of the XP ideals, they will also look into TDD, how it is being used and whether programmers' views change as they gain more practice.
Will I give the idea of TDD a try? Perhaps, but only on parts of a system that lend themselves to it. I think the idea of TDD is worth a look.
Do you know of a journal that may be of interest to fellow PPIG members? If so, please tell us about it. Interested in writing a review, or perhaps you are an editor and would like to introduce your journal to us? Please feel free to send a message to chrisd(at)fdbk.co.uk
by Chris Douce
Occasionally, if I am asked the essential question of 'why' I wrote a collection of functions in a particular way, I may not immediately have an answer. I have an instinct to reply, 'because it feels as if this is the best approach', especially if some time has passed between discovering a design and the request for its justification.
When developing software, I seem to have acquired a sense of code aesthetics: of what good software should look like in terms of function construction and naming, its use of memory and processor resources, and how a problem has been decomposed. Here, we can return to the earlier concept of 'hacking and painting' that has been touched upon in a previous newsletter. The emotion inherent in successful programming is connected to the idea of refactoring, where programmers can find evidence of less than satisfactory coding in terms of 'bad smells'. The refactoring idea introduces the programmer to the world of program aesthetics through appropriate justification.
The use of 'feelings' within the discipline of computer programming or software engineering, a discipline that demands that programmers conform to the cold, hard, unforgiving logic of a machine, seems to be a paradox. Perhaps the most fundamental question to ask is 'what is emotion for, and can it in any way help our quest to create truly great software?' Again, I have very few firm answers – only questions, and a number of interesting examples. My first example comes from immediate experience:
Some days I wonder whether my chosen occupation is one that really makes me happy. I seem to chase a continual whirlwind of software releases, upgrades and security fixes (not to mention the coming and going of what might be called software fashions). Some days, rather than feeling happy, I seem to feel perpetually confused. But confusion is good. This 'feeling' of confusion is a metacognitive state: it means that you do not have a full comprehension of a situation or a system.
Emotion is a guiding force that complements our rational capabilities. We are made to feel uncomfortable if things are not understood. When sub-goals are achieved we may become happy. Eventually we may feel 'proud' of what we have achieved. Uncomfortable feelings can be motivators – they can be harnessed to produce positive results, providing, of course, they are recognised.
Memory researchers have known for a long time that emotions strengthen memory. When searching for a solution that 'feels right' (and does the job it is supposed to), not only do we remember facts about programming language semantics and the functions within an API, we also recall situations that were successes and failures: code coloured by memories of past frustrations and successes. Expertise isn't only the possession of domain-specific facts and knowledge; it is facts and knowledge retained alongside the events that gave rise to particular feelings.
Emotion comes into the equation from another perspective. If a project you have been working on for two years is cancelled, those who invested time in developing solutions to problems are bound to be upset, bewildered and angry. Even though we may try to recall Weinberg's concept of the 'egoless programmer', I believe such a notion is noble, albeit idealistic.
In egoless programming, the artefact and the programmer should be separate. You should not 'put yourself' into your program – it should not have your soul. I have to say that when I write a program I am somewhat guilty of 'putting myself into it'. The argument that allows me to make this justification is that it is my time that has allowed the program to be constructed. This said, it may be possible to get more out of a program than what I have put in, especially if my work is constructively critiqued – something that every programmer should welcome. (All coders have some daemons they might want to confront).
Working in programming pairs, we are duty bound to consider the sensibilities of our coding partners and have an appreciation of their knowledge, expertise and sense of code ownership. 'Is it alright to change this?' a member of a pair may ask, 'if you don't like it, we can change it back, I just want to show you something…', a dialog may go. I'm looking forward to finding out more about ethnographic studies in this area.
Let's now turn our attention to the emotional-sounding job title of 'evangelist'. Where can I get one of these jobs from? Being an evangelist sounds great fun – it sounds as if there is a lot of shouting to do, preferably in front of large numbers of people who are likely to be willing listeners. The word 'evangelist' causes other words to spring to mind. An evangelist must be an 'enthuser' and a 'persuader'.
Software companies know all about programmer emotions. One famous operating system vendor is particularly good at using all the tricks in the marketing book. You can see the joy on the faces of those programmers. Look at how happy they are as they discover how to work more efficiently! Look at how they love intellisense! Isn't it great!
Other vendors are not immune to displaying the smiles of the happy coder, but instead apply the youthful conception of 'cool' in an attempt to allocate a maximum amount of developer_happy_space.
An interesting contrast can be seen if we view one of the greatest open source development tools: Eclipse. Why is there such a marked difference in style? Perhaps the focus of emotion lies with collective success rather than individual happiness or achievement.
As well as attributing emotion to software organisations, emotion has a role to play when it comes to forming 'mid-level' judgements: choosing whether or not to use a library, package or language to solve a particular task. Micro-level emotional judgements could be construed as judgements about groups of individual programming constructs, notation within drawing or design tools, or textual explanations within documents. Macro-level judgements are those relating to organisations, vendors and/or programming paradigms.
Our emotions play a key role when it comes to deciding who we trust. We trust people more if we have benevolent feelings towards them, and this is applicable just as much to groups as it is to individuals.
Trust is a particularly interesting issue due to the availability of source code and libraries on the net. Should we trust the executive summary or the comments at the top? Does the code do what it says it will? More importantly, what are other people saying about it? If we don't know a person or an institution, we make trust judgements based on reputation, what institutional values we detect, and our sense of aesthetics. Does it look right? Does it feel right?
There is another dimension of trust too. Just as we may take a leap in the dark and (almost) blindly trust the code of others (open or otherwise), we have tools at our disposal that increase our own levels of personal trust. Nowhere is this more evident than with the explosion of unit-based testing frameworks. We have HttpUnit, JUnit and NUnit, to name but a few. It's your decision whether to use them, if you feel they are right for your particular project or organisation.
The cold, hard logic of syntax and semantics is one thing. The feelings that emerge from the activity of programming are something else – but the two are connected.
Not so long ago, I stumbled upon a paper authored by Keith Oatley and Philip Johnson-Laird. Its title: 'Towards a theory of emotions'. It was published in a journal called Cognition and Emotion. Imagine how excited I felt at this discovery!
Is thinking about the 'emotion of programming' a worthwhile endeavour? I have a feeling it might be.
With the advancement of the World Wide Web, networked computing has become an essential determinant on how people access and exchange information. The integration of human factors in networked computing has the intrinsic goal of improving the effectiveness of computer-to-human interaction and, ultimately, of human-to-human communication.
Whilst the HCI community looks predominantly at the application layer and the telecommunications community at the lower end of the ISO OSI stack, little work has been published in bridging the gap between these two communities. Indeed, the human element is often neglected in Quality of Service negotiation protocols. Not only does this have a negative and undesirable impact on the user's experience of networked computing, it also discards the potential for more economical resource allocation strategies. With the proliferation of ubiquitous multimedia in predominantly bandwidth-constrained environments, more research is needed towards integrating and mapping perceptual/human factors considerations across the protocol stack and building truly end-to-end communication solutions.
Topics of interest include, but are not limited to:
More information can be found in the Computers in Human Behavior journal.
Inclusive Education in Computer Science
The 12th Annual Conference on Innovation and Technology in Computer Science Education
25th-27th June 2007, University of Dundee, Scotland
The program will consist of:
The University of Dundee has a strong reputation in developing computer systems for users with special needs, accessibility of digital media and widening participation in higher education.
For more details, and to register your interest visit the ITiCSE website
University of Sussex, Department of Informatics, September 11-12th
Central themes include:
More information can be found from the HCTW web site
October 10-12, 2006
The London Convention Center
London, Ontario, Canada
Future Play will be running panels on 'The Future of Women in Game Design and Game Development', 'Navigating University and College Administrations with New Game Curricula' and on 'Indigenous Games'
For more information and registration please visit Futureplay.org
by Alan Blackwell
Those familiar with PPIG would probably have heard of the Empirical Studies of Programmers (ESP) series of workshops. The following brief history was written in response to a query on the PPIG mailing list, asking whatever happened to ESP.
The ESP series was managed by the USA-based Empirical Studies of Programmers Foundation. The last published list of the Board of Directors of that Foundation (in 1997) was Deborah Boehm-Davis, Wayne Gray, Thomas Moher, Jean Scholtz and James Spohrer.
There were seven ESP conferences, all held in the USA. The research coverage of the series was very similar to the European (UK-based) PPIG series, which is the host organisation for this newsletter. Many people considered ESP and PPIG to be sister organisations. All ESP conferences except ESP 3 published formal proceedings volumes. Until ESP 6, the publisher of those proceedings was Ablex. The proceedings of ESP 7 in 1997 were published by the ACM Press. An attempt was made to convene an ESP 8 meeting that would have been held in 1999, but insufficient submissions were received for the meeting to be viable. The papers received were instead published as a special issue of the International Journal of Human-Computer Studies (Volume 54, Number 2, published February 2001).
Because many Americans find it difficult to attend meetings in Europe, especially if they publish in ways that fall below the tenure horizon, several of us PPIG folk have wished to support a continued sister series like ESP.
The most recent approach to this has been to create symposia within the IEEE Human-Centric Computing series that would focus on empirical studies of programming issues. The names of these symposia (in 2001, 2002 and 2003) have tended to focus on end-user programming and usability. Nevertheless, it seems that the IEEE series comes closest to taking on the mantle of ESP nowadays. It now meets under the name "Visual Languages and Human Centric Computing" (VL/HCC), and proceedings are published by the IEEE.
The next meeting of VL/HCC will in fact be in the UK (for the first time), and co-located with PPIG, in order that Americans can attend both events. This will certainly be the best opportunity within the next year to meet those people who would once have attended the ESP meetings.
Alan Blackwell is a senior lecturer in computing at the University of Cambridge, UK, and a regular PPIG contributor
by Martin Sustrik
It is quite common to use computers to analyse natural languages. Although we are not yet able to accomplish the task convincingly, the problem is being worked on in the hope that one day we will be able to communicate with computers in natural language.
The other way round, programmers have to speak 'computerish'. We are able to 'speak' C, Pascal, SQL or even machine code. We learn a computer language using the same faculties as we learn our own human language, in an intuitive manner, without a profound understanding of what is going on in our brain.
It looks like both approaches have the same goal: to allow people and computers to communicate freely. However, both are quite extreme. The former approach wants to teach computers to speak English, not taking into account that computers are not humans and therefore cannot speak English, or at least cannot speak proper, complex, human English, full of subtleties and ambiguities. The latter approach wants humans to speak completely computerish (at least when talking about machine code; other programming languages are a bit more human-friendly) and fails to recognise that humans are not computers and often have problems dealing with things that look really simple from the computer's point of view.
It is clear that the solution will lie somewhere in between these extremes. We will not speak to computers in hexadecimal machine code, nor will computers answer in Shakespearean English. The compromise will be natural-language-like enough for people to use it freely, without spending large amounts of time working out what the code should look like, and at the same time exact enough for computers to parse it unambiguously in reasonable time.
The goal of this article is not to propose a language of this kind. If such a language ever exists, it will emerge through a long process of evolution, just as natural languages did. It is not within a single person's capacity to design it, nor will it be a single unique language. However, we can look at computer languages as well as natural languages, compare them, and try to work out what the differences are, why they are there and what kinds of problem they point to.
We may do that simply by looking at problems we have in computer language design and asking: "How is this problem solved in natural language? Does that give any hint on how to solve it in a computer language?" Although this approach alone would yield many interesting results, the trouble with it is that there are many problems in computer languages that we are not even aware of, that we do not think about, that we consider somehow unavoidable.
So we may go the opposite way. Looking at natural language we may ask: "What is this construction used for? Do we need to express something similar in a computer language? How do we express it, then? Is the computer language construct easy to write and understand, or is it just an annoyance compared with the corresponding construct in natural language? If so, is it possible to introduce the natural-language construct into the programming language?"
In this article I will use a mixed approach: a simple show-and-tell. I will take a simple programming construction, analyse it from a linguistic point of view and try to suggest possible improvements. I will call this hypothetical 'improved' C language 'C-ish' to emphasise its ties to natural languages.
Although a more comprehensive study could contain more information and cover the area more precisely, I believe that giving very straightforward examples like the ones below, without loads of theory, is essential for computer scientists without a linguistic background to get a grip on the subject. So here we go:
Identifying semantic and syntactic roles in a method call
The 'semantic role' of a phrase in a sentence is its relationship to the overall meaning of the sentence. For example, a phrase may be an 'action', i.e. what is going on in the sentence. Or it may be an 'agent', the performer of the action. In the following example 'Peter' is an agent while 'walks' is an action.
English: Peter walks.
Each phrase also has a syntactic role. In the case above, 'Peter' is the subject and 'walks' is the predicate. However, it is necessary to keep in mind that there is no one-to-one correspondence between semantic and syntactic roles.
There are some intuitively felt common roles in a natural language sentence and in a method call statement in a programming language. First of all there is the concept of 'action'. In natural languages this role is performed by a verb (the predicate). In programming languages, the same role is taken by the method name.
English: Peter walks.
C: walk (peter);
In the English version of the sentence, 'Peter' plays the role of actor (the 'subject' in linguistic terminology). In C, 'peter' has no special role: it is just one of the arguments of the function 'walk'. However, when the example is rewritten in C++, we see that there is a special syntactic construction to mark the subject. In fact, one of the biggest differences between object-oriented and non-object-oriented programming is the ability to identify the actor of an action by purely syntactic means.
C++: peter.walk ();
As for syntactic roles, there are no others in programming languages. In natural languages, on the other hand, we can find quite a lot of them. Now the question is: are there other semantic roles (apart from 'action' and 'actor') in programming languages that would profit from being formalised via explicit syntactic constructions? I would say there are. Take an 'index', for example. Arguments of this type appear in quite a lot of functions.
C++: container.insert (position, value);
If there were a special syntactic role for 'index' in programming languages, the call might look like this:
C-ish: container.insert [position] (value);
Compare it with the equivalent construction in English. The preposition 'in' plays the same role as the square brackets - namely identifying the place where the action takes place.
English: Peter walks in the park.
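For what it is worth, a bracketed 'index' role of this kind can already be emulated in an existing language. The following Python sketch (the Container and Indexed names are purely illustrative, not part of any C-ish proposal) abuses indexing syntax so that the position argument is visually marked off from the ordinary argument:

```python
class Indexed:
    """Wraps a two-argument method so that the 'index' role can be
    supplied in brackets: obj.method[index](value)."""
    def __init__(self, fn):
        self.fn = fn

    def __getitem__(self, index):
        # Fix the index role; the remaining call supplies the object role.
        return lambda value: self.fn(index, value)

class Container:
    def __init__(self):
        self.items = []

    def _insert(self, position, value):
        self.items.insert(position, value)

    @property
    def insert(self):
        return Indexed(self._insert)

c = Container()
c.insert[0]("value")    # reads as: c . insert [0] ("value")
c.insert[0]("another")
print(c.items)  # ['another', 'value']
```

Whether such visual separation of roles actually helps readers is, of course, exactly the kind of empirical question PPIG readers might ask.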
We can identify an arbitrary number of semantic roles this way. (A few examples: destination, source, range-beginning, range-end, etc.) However, we should keep in mind that not all semantic roles, even in natural languages, have syntactic counterparts. Sometimes several semantic roles are expressed using the same syntactic role, with context resolving the ambiguity; sometimes descriptive phrases are used to express a particular semantic role. The number of syntactic roles should be kept low so as not to overcomplicate the grammar of the language.
Now let's have a look at syntactic roles in natural languages. Apart from subject and predicate (which, as we have seen, are already present in object-oriented languages), there is one special role present in all of them, one that plays a key part in the grammar (together with subject and predicate). It is the role called 'object', i.e. the entity the action is performed on.
English: Peter hits David.
Peter picks a key.
Most of the functions encountered in programming languages have a clear object:
C++: my_file.open ("log.txt", ios_base::openmode::in);
(What is the action 'open' performed on? It is performed on the 'log.txt' file. It follows that this is the object of the statement.)
At this point I will leave the subject of syntactic roles. However, in what follows I will need the 'object' role, so let's devise the following construction to formalise it. I chose the colon sign as it resembles the dot used to separate subject and predicate. This way, a standard subject-verb-object statement can be written as follows:
C-ish: peter . hit : david ;
In natural languages, phrases of the same kind can be grouped using conjunctions.
English: Peter and Jack hit David.
Peter hits and injures David.
Peter hits David and Jack.
There is no equivalent of this in programming languages; however, every programmer who taught himself to program has probably attempted to use a natural-looking but disallowed conjunction at some point:
C: *if (a > 0 && < 10) ...
(Note: in linguistics, an asterisk in front of a phrase marks it as ungrammatical.)
The line cannot be compiled, of course.
Looking in the manual, or perhaps just experimenting with the statement, he found out that the construct should look like this:
C: if (a > 0 && a < 10) ...
He corrected the line without giving it much thought and forgot about the problem completely.
If asked about the former construct later, when he had grown more experienced, he would probably say that the construction is not allowed because it is somewhat vague and ambiguous but that, if needed, it could be built into the compiler without any problems, since it is only 'syntactic sugar': something that makes the programmer's life easier but has no special meaning by itself.
However, if he were asked to implement this feature, he would run into really serious problems.
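As an aside, one restricted form of this 'natural' conjunction has in fact been implemented: Python's chained comparisons let the shared operand be written only once, much as the novice expected. This is a special case hard-wired for comparison operators, not a general conjunction mechanism:

```python
a = 5
# Chained comparison: 0 < a < 10 desugars to (0 < a) and (a < 10),
# with a evaluated only once - close to the 'a > 0 && < 10' the
# novice reached for.
in_range = 0 < a < 10
print(in_range)  # True
```

The general problem, as the following examples show, is much harder than this one special case.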
We would expect the following three statements to be semantically equivalent:
C-ish: if (a < b & a < c) ...
if (a < b & < c) ...
if (a < b & c) ...
But what should the compiler do to interpret such expressions? Have a look at the conjoined clauses. How would we interpret them in terms of a programming language? The first conjoined clause has clear semantics, but what about the second and the third? What is the conjoined subexpression '< c' supposed to mean? Is it a function that compares its argument to c? And what about 'b & c'? Is it a way to construct a set? If so, the operator '<' would have different semantics in each of the examples. In the first it is a standard comparison between two numbers. In the second it is a way to create a unary function. In the third it is a comparison between a number and a set of numbers. The same applies to '&'. In the first case it is a standard logical AND. In the second it is a way to combine two unary functions. In the third it creates a set of numbers.
All these subtleties seem too complex for so simple a concept as joining two parts of an expression with 'and'. Are we going down the wrong path? Thinking in terms of orthodox compiler design yields no sensible answer.
So let's look at the problem from a different point of view. But first, let's make an equivalent example with explicit function calls (bear in mind that '<' is only shorthand for the function 'operator <'):
C-ish: list & another_list . insert : x ;
list . erase & insert : x ;
list . insert : x & y ;
The same parsing problem we described above persists in this example. What we should focus on is the fact that natural language cannot be described solely by means of a context-free grammar. In fact, a context-free grammar plus a set of (non-context-free) transformation rules is used to perform the task. Once we drop the context-freeness constraint, the problem begins to make sense. (Note: 'non-context-free' sounds scary, suggesting greater parsing complexity than we can afford. However, there are many simple transformations that can be parsed quickly even though they are inherently non-context-free.)
First of all, let's think of '.' as a prefix indicating that the subsequent identifier is the predicate. The same goes for ':': this prefix identifies the following identifier as the object. Identifiers without a prefix (Φ-prefix) are considered subjects.
Next, '&' means that the subsequent identifier has the same role as the preceding one in the conjunction.
Using this transformation we get the following sequences (with a minus sign joining each prefix to the identifier it applies to):
C-ish: Φ-list Φ-another_list .-insert :-x ;
Φ-list .-erase .-insert :-x ;
Φ-list .-insert :-x :-y ;
Now we interpret sequences like these as standing for all possible combinations of their subjects, predicates and objects. So, for example, a statement with two subjects, three predicates and four objects would be expanded into 2 x 3 x 4 = 24 separate statements, each having a single subject, predicate and object.
C-ish: Φ-list .-insert :-x ;
Φ-another_list .-insert :-x ;
Φ-list .-erase :-x ;
Φ-list .-insert :-x ;
Φ-list .-insert :-x ;
Φ-list .-insert :-y ;
In plain C-ish (without the prefixes attached to their identifiers) it would look like this:
C-ish: list . insert : x ;
another_list . insert : x ;
list . erase : x ;
list . insert : x ;
list . insert : x ;
list . insert : y ;
And that is exactly what was meant by the original example, this time without the use of conjunctions.
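The expansion step just described is easy to make mechanical. As a sketch (assuming the statement has already been split into its role lists; the function name expand is mine, not part of any C-ish proposal), the Cartesian product of subjects, predicates and objects yields exactly the simple statements listed above:

```python
from itertools import product

def expand(subjects, predicates, objects):
    """Expand one conjoined statement into a simple statement per
    subject/predicate/object combination."""
    return [f"{s} . {p} : {o} ;"
            for s, p, o in product(subjects, predicates, objects)]

# The three conjoined C-ish statements from the text:
print(expand(["list", "another_list"], ["insert"], ["x"]))
print(expand(["list"], ["erase", "insert"], ["x"]))
print(expand(["list"], ["insert"], ["x", "y"]))

# Two subjects, three predicates, four objects -> 24 statements.
assert len(expand(["a", "b"], ["p", "q", "r"], ["w", "x", "y", "z"])) == 24
```

This sketch deliberately ignores evaluation order and side effects, which a real compiler would have to pin down.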
The problem is more complicated than this, of course. The simplistic algorithm used above certainly does not match the way conjunctions are parsed in natural languages. We should, for example, be able to conjoin whole predicate-object pairs (i.e. 'verb phrases', as they are known in linguistics), not just simple predicates.
In fact, the predicate-only conjunction we see in the C-ish example is considered an advanced language feature (called 'right node raising') compared with standard verb-phrase conjunction, which the example does not support. However, problems like this one are outside the scope of this article.
Adding conjunctions to a standard programming language looks trivial, but once they are introduced, other natural-language concepts that seem to have no counterpart in programming languages begin to make sense. Take, for example, passivisation. In natural language you can rephrase a sentence using the passive voice:
English: I took it.
It was taken by me.
There is really nothing similar in programming languages, but once you start to use the conjunction construct just introduced, you will encounter the following problem. When you want to convert string s to uppercase and append string t to it, you can do it in the following way:
C-ish: s . upper & append : t ;
However, when you want to convert s to uppercase and append it to t, you have to split the action into two statements:
C-ish: s . upper ;
t . append : s ;
Now let's suppose there is a passivisation operator '@' that modifies a function so that it exchanges the syntactic positions of subject and object. The following statements would then be equivalent:
C-ish: t . append : s ;
s . @ append : t ;
This kind of construction allows the upper/append example to be expressed in a single statement:
C-ish: s . upper & @ append : t ;
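As a rough illustration of what the '@' operator does, here is a Python sketch in which passivisation is just a higher-order function that swaps the first two argument positions (the names passive and append are illustrative only, not an API of any real library):

```python
def passive(method):
    """Sketch of the hypothetical '@' operator: exchange the syntactic
    positions of subject (first argument) and object (second)."""
    return lambda subject, obj: method(obj, subject)

def append(receiver, arg):
    # Stand-in for 't . append : s': append arg onto receiver.
    receiver.extend(arg)
    return receiver

s, t = [1, 2], [3]
# Active voice:  t . append : s   ->  append(t, s)
# Passive voice: s . @append : t  ->  the same effect, but s occupies
# the subject position, so it could be chained with other actions on s.
passive(append)(s, t)
print(t)  # [3, 1, 2]
```

The point of the swap is purely syntactic: the computation is unchanged, but s now sits where a conjunction such as 's . upper & @ append : t' needs it to sit.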
This example demonstrates how constructs of natural language may be introduced into a programming language. However, it is also an example of the risks faced in doing so.
Although passivisation in this form looks very simple and elegant, it does not occur in any natural language I am aware of. Normally the object becomes the subject; the subject, however, is not turned into the direct object. It is instead moved to some other syntactic role:
English: *It was taken me.
It was taken by me.
In English the preposition 'by' is used to identify the actor. In most Slavic languages a special case (the instrumental) is used for the same purpose. In some other languages, such as modern Arabic, an idiomatic expression is used instead ('min qiblii' = 'from my side').
This may mean that the elegant swap-the-roles passivisation construction, as I have introduced it, would be felt to be unnatural and hard to understand by programmers.
It may be even worse than that. Looking at natural languages, we can see that although the passive with a specified actor exists, it is not commonly used and is often viewed as a purely literary construct. Most occurrences of the passive in everyday language express the fact that the actor is either unknown or irrelevant or, in syntactic terms, turn a transitive verb into an intransitive one.
English: It was broken.
The point here is that the careless introduction of natural-language traits into programming languages may bring not only benefits but problems as well.
As we have seen, applying the analogy with natural languages to programming languages can yield quite interesting results. I have deliberately restricted myself to syntax, as this is the only field of linguistics felt to be an integral part of computer science. However, other fields - semantics, morphology or lexicology, for example - can give equally interesting results.
Here are some more problems of interest:
Many of these questions are still to be explored. Along the way we may obtain a completely different view of what a programming language might be and of how languages could be analysed and improved.
Martin is from Bratislava, Slovakia. He holds a master's degree in computer science from Comenius University in Bratislava and has an interest in the linguistics of programming languages.
Would you like to tell other PPIGers who you are and what you are doing through the newsletter? If so, please e-mail chrisd(at)fdbk.co.uk.
We (Richard Bornat and I) seem to have a test that predicts the ability to program with very high accuracy before the subjects have ever seen a program or a programming language: it picks out the population that can program before the course begins. We can pick apart the double hump.
We don't know exactly how or why it works, but we have some good theories. A draft paper, which incorporates some of the criticism we received at the Coventry mini-PPIG in January and at the University of Kent, is available on my website along with all other relevant files.
We are planning to conduct the test on a broader scale. Please contact us if you are interested in collaborating.
I am a first-year researcher using qualitative narrative-communicative research techniques to investigate the community of Agile Systems practitioners. I am particularly looking at notions of community identity, communication, boundary and space as demonstrated in conversational storytelling and personal narratives from this practitioner group.
I have recently conducted a pilot study, incorporating narrative interviews supported by field observations, of a small software development company in the South of England. Preliminary analysis supports the view that this approach, when further applied to the community of Agile Systems developers, will provide potentially interesting results.
Any suggested references, problems, questions, concerns, suggestions, or just general interest from PPIGers would be greatly appreciated. Please feel free to contact me at j (dot) m (dot) hunt (at) sussex.ac.uk and let me know any thoughts you may have!
Brad Myers developed and presented a new overview of End User Programming (EUP) as an Invited Research Overview at the ACM SIGCHI conference, with help from Margaret Burnett and Andrew Ko.
The presentation slides are available
The extended abstract is available
The abstract of the paper is summarised below:
In the past few decades there has been considerable work on empowering end users to be able to write their own programs, and as a result, users are indeed doing so. In fact, we estimate that over 12 million people in American workplaces would say that they "do programming" at work, and almost 50 million people use spreadsheets or databases (and therefore may potentially program), compared to only 3 million professional programmers.
The "programming" systems used by these end users include spreadsheet systems, web authoring tools, business process authoring tools such as Visual Basic, graphical languages for demonstrating the desired behavior of educational simulations, and even professional languages such as Java. The motivation for end-user programming is to have the computer be useful for each person's specific individual needs.
While the empirical study of programming has been an HCI topic since the beginning of the field, it is only recently that there has been a focus on the End-User Programmer as a separate class from novices who are assumed to be studying to be professional programmers. Another recent focus is on making end-user programming more reliable, using "End-User Software Engineering."
This paper gives a brief summary of some current and past research in the area of End-User Programming.
The package I wrote about last year, "Sidebrain: a programmers' memory aide", is now available for people to try at SourceForge.
Also, I'm working on a project to provide higher-level editing facilities, with the aim of exploring the level at which people think about program changes (by looking at which of the new facilities they use, and which ones they ignore in favour of other techniques). The software for this is available at SourceForge, where a series of screenshots is also available.
I recently completed my Ph.D. in Computer Science at Wayne State University, Detroit, MI, USA under the supervision of Prof. V. Rajlich. My dissertation is titled "Cognitive Aspects of Software Engineering Processes". The abstract is below:
Software engineering activities involve processing a large amount of knowledge, so cognitive processes are heavily involved. Studying the cognitive processes involved in software engineering can greatly help us to understand the whole process and to improve software engineering research.
In order to study the cognitive process, we developed an empirical method that combines a dialog-based protocol and self-directed learning theory. The dialog-based protocol is based on analysis of the dialog that occurs between programmers during pair programming. It is an alternative to the common think-aloud protocol and may minimize the Hawthorne and placebo effects. The self-directed learning theory is based on constructivist learning theory and the Bloom taxonomy. It captures the specifics of the programmer's cognitive activities and provides an encoding scheme used in analyzing the empirical data.
We conducted a case study of expert and intermediate programmers during incremental software development. Compared to intermediate programmers:
We also conducted a case study on program debugging, and found that programmers apply cognitive activities at all six Bloom levels, moving from lower levels to upper ones in order to make an update. Program debugging is a more complex activity than incremental software development.
A case study on pair programming in software evolution class projects was also performed. The results showed that paired students completed their change request tasks faster and with higher quality than individuals. They also wrote fewer lines of code and used more meaningful variable names.
The related publications are as follows:
Xu, S. and Rajlich, V., 2005. Dialog-Based Protocol: An Empirical Research Method for Cognitive Activity in Software Engineering. Proceedings of the 4th ACM/IEEE International Symposium on Empirical Software Engineering, pp. 397-406, November 17-18, 2005, Noosa Heads, Queensland, Australia.
Xu, S. and Rajlich, V., 2005. Pair Programming in Graduate Software Engineering Course Projects. Proceedings of the 2005 IEEE Frontiers in Education Conference (FIE 2005), pp. FIG-7-FIG-12, October 19-22, 2005, Indianapolis, Indiana.
Xu, S., Rajlich, V. and Marcus, A., 2005. An Empirical Study of Programmer Learning during Incremental Software Development. Proceedings of the 4th IEEE International Conference on Cognitive Informatics, pp. 340-349, August 8-10, 2005, Irvine, California.
Xu, S. and Rajlich, V., 2004. Cognitive Process during Program Debugging. Proceedings of the 3rd IEEE International Conference on Cognitive Informatics, pp. 176-182, August 16-17, 2004, Victoria, Canada.
Rajlich, V. and Xu, S., 2003. Analogy of Incremental Program Development and Constructivist Learning. Proceedings of the 2nd IEEE International Conference on Cognitive Informatics, pp. 142-150, August 18-20, 2003, London, UK.
If you need more information, please feel free to contact me at Simon dot Xu (at) algomau.ca. Simon is an assistant professor in the Department of Computer Science, Algoma University College, Laurentian University, Canada.
[ top ]
Not so long ago, a question was asked on the discussion forum as to whether anyone had references on the topic of 'slips in programmer practice'.
Below is a tiny bibliography of papers related to this topic. It may be useful to anyone interested in probing this particularly interesting area, which crosses into software engineering (and programming in the large) as well as individual programmer performance.
Adelson, B. (1981) Problem solving and the development of abstract categories in programming languages, Memory and Cognition, vol. 9, 422-433.
Adelson, B. (1984) When Novices Surpass Experts: How the difficulty of a task may increase with expertise, Journal of Experimental Psychology: Learning, Memory and Cognition, vol. 10, 483-495.
Jadud, M. C. (2005) A First Look at Novice Compilation Behaviour Using BlueJ, Computer Science Education, vol. 15, 25-40.
Ko, A. and Myers, B. (2005) A Framework and Methodology for Studying the Causes of Software Errors in Programming Systems, Journal of Visual Languages and Computing, vol. 16, 41-84.
VanLehn, K. (1990) Mind Bugs: The origins of procedural misconceptions. MIT Press.
Reason, J. (1990) Human Error, Cambridge University Press, Cambridge, UK.
[ top ]
CHI'06 was a great conference for people interested in end-user programming, end-user software engineering, and human aspects of programming. There were a number of excellent papers relevant to this area, and the WEUSE II Workshop (Workshop on End-User Software Engineering) held at CHI was also a big success.
An extended browse of the CHI proceedings, including at least these programming-related papers, is recommended:
You can read about the WEUSE workshop at the WEUSE website
Margaret Burnett, Project Director, EUSES Consortium (End Users Shaping Effective Software)
[ top ]
by Patrick O'Beirne
EuSpRIG is an interest group of academia and industry that promotes research into the extent and nature of spreadsheet risks, methods of preventing and detecting errors, and ways of limiting damage.
We bring together researchers and professionals in the areas of business, software engineering and audit to actively seek useful solutions.
To find out more about EuSpRIG, feel free to visit the EuSpRIG website
[ top ]
by Chris Douce
One of the recurring questions in computer science education is whether students should be taught to program using an IDE and, if so, whether they should use an 'educational' IDE such as BlueJ or one that prepares them for the world of work, such as Eclipse or Visual Studio.
Slashdot discussion: should we teach using an IDE
Relating to an earlier section of the newsletter, which introduces EuSpRIG and the work performed by the EUSES consortium, the following links are particularly pertinent:
Most spreadsheets have critical errors in one percent of their cells
News stories relating to spreadsheet usage
One enterprising group of developers has made good use of recent developments in digital photography and kindly shared a picture of their programming bookshelf (it is more interesting than it sounds - trust me!)
Perhaps newsletter readers would like to share the contents of their bookshelves? (Submissions from psychologists particularly welcome!)
How to write comments
I have seen the error of my ways. These days I prefer to refactor code in such a way that the result barely needs any comments (other than ones like 'see page n of this book on the programmer's bookshelf' - though I really must get into the habit of making explicit references to the edition number!)
I like comments, especially those that go, 'I think this works but I'm not quite sure why'.
I am rapidly becoming a Wikipedia addict. I recently found the following:
List of software engineering topics
List of psychological topics
List of software development philosophies
I have no idea where I found this one from, but it was (at the time) loosely related to work!
[ top ]
Many thanks go to the reviewers of this edition of the newsletter for their efforts. All your comments and words of wisdom are appreciated.