I certify that the work in this thesis has not previously been submitted for a degree nor has it been submitted as part of requirements for a degree except as fully acknowledged within the text.

I also certify that the thesis has been written by me. Any help that I have received in my research work and the preparation of the thesis itself has been acknowledged. In addition, I certify that all information sources and literature used are indicated in the thesis.





Signature of Student





By Andrew Polaine



STUDENT NO. 10131908



Thesis Submitted For The HIGHER Degree of

Doctor of Philosophy


Supervisor: Prof. Ross Gibson



University of Technology, Sydney




Given the rather lonely task of writing a doctoral thesis it turns out there are a large number of people to thank.


The first thanks must go to my principal supervisor, Professor Ross Gibson, for his guidance and input throughout the process and for taking me on as his PhD student. He has been a stimulating sounding board for my ideas. Perhaps more importantly as an experienced writer he has given me a great deal of encouragement and guidance in the process of telling the story of those ideas. He has always treated me as a peer and in the darker moments of the writing journey he guided me to the light at the end of the dissertation. Without this the document in your hands would have joined the many other unfinished pieces of work languishing in the dungeons of my hard drive.


I am very grateful to the members of UTS’s Creativity and Cognition studios under the guidance of Professor Ernest Edmonds, in particular Lizzie Muller, who was instrumental in getting my Time Sketches work into beta_space, and Brigid Costello who unfathomably took on the task of interviewing the participants whilst I was out of the country. Their own research mirrored much of mine and the rigor of our conversations along with other members of the studio’s “Pub Club” helped me refine my ideas. Greg Turner (Dr. Greg, no less) also helped me set up the work and took the excellent photos of the two boys playing that are used in this thesis. Thanks also go to David Low who proofread my thesis and thus generously provided me with hours of typo crrections [sic].


All the authors and researchers upon whose shoulders I have stood have my eternal gratitude and they will find their names in the absurdly long list of references at the end. Mark Pesce merits an individual mention for letting me steal the phrase “The Invitation to Play”, which forms a central principle of this thesis. Thanks also to Mark Pesce and the other examiners, Prof. Anne Cranny-Francis and Dr. Melanie Swalwell, who all provided generous and useful feedback and suggestions.


Thanks are due to my family and friends as well as my colleagues at Antirom – the environment that formed the foundation of my approach to interactivity – and the other studios in which I have worked. Thanks too to my colleagues at the College of Fine Arts, UNSW (who also provided the grant funding for Time Sketches) and the Bauhaus University in Weimar.


One of the chief sources of inspiration has been my students who continue to inspire me as well as being one of the most demanding and questioning audiences. They have unwittingly let me inflict my untested ideas on them and humoured me that I knew what I was talking about. The readers and commenters on my blog, Playpen, have been similarly gullible and generous, thank you.


Most importantly, immeasurable love and gratitude are due to my wife, Karin, who has put up with bouts of frustration, inordinate piles of books and long monologues from me for much longer than anyone should have to. She also pointed me towards much of the psychology literature contained in this thesis all whilst writing her own Masters dissertation. I am more than a little jealous that she started after me and finished before me.


It goes without saying that any errors and omissions are entirely due to someone else or my cats, Leo, Oskar and Archie who kept me company during the long hours at the computer and added their own input by walking over my keyboard. All the best bits are mine.



In a world increasingly influenced by interactive interfaces, devices and services both in the commercial and non-commercial spheres, understanding interactivity and its underpinnings is essential. We have moved into a state of flux in which both culture and technology are in constant interplay and the only constant is change itself. The only future-proof approach to designing for and dealing with an environment of constant change in these systems and forms is to look for a mechanism and theoretical framework that underpins them all.


Drawing upon a range of disciplines – including design, art, cognitive science, linguistics and more – this thesis argues that play is such a fundamental building block of culture, society, technology and cognition that it is the ideal lens through which to examine the interactive experience. It is versatile enough to cross boundaries and fundamental enough to be understood intuitively. Through an understanding of the intersection between movement, embodied cognition, metaphor and play, a set of principles of interactivity is developed, flexible enough to analyse and be applied to a broad spectrum of interactive experiences, from interactive artworks to services to individual user interface elements. Finally, it is proposed that these principles provide a way to examine the phenomenal growth of social networks and the fundamental cultural shifts we are experiencing today as a result of the friction generated between emerging networked technologies and the industrial age structures they are dismantling.


ACKNOWLEDGEMENTS

ABSTRACT


PART I - PLAY, FLOW, METAPHOR & COGNITION

Introduction

      What is interactivity?

      Interactivity and the HCI field

      Towards a language of interactivity

CHAPTER 1 - The Role of Play in Interactivity

      Play as a design process, interactivity as content

      Playful interaction and direct manipulation

      Cameras and the body as affordance

CHAPTER 2 – Flow, Interactivity & Suspension of Disbelief

      Interactive Flow versus Narrative Flow

      Willing Suspension of Disbelief and Interactivity

CHAPTER 3 - Interactivity, Metaphors and the Mind

      The role of metaphors

      Interfaces are both metaphorical and physical

      A metaphor too far

      Direct control and manipulation in interfaces

CHAPTER 4 - Thinking Outside of the Brain

      Scripts, Schemata, Mental Models and Automotive Action

      Scripts

      Schemata

      The embodiment of AI

      Pay no attention to that man behind the curtain

      Are we automata?

      Not today, I’m not in the mood

      The Mojave Experiment


PART II - Developing the Principles of Interactivity

CHAPTER 5 – The State of Play

      Homo Ludens

      Les Jeux et les Hommes

      Love is

      From Physics Lab to Living Room to the Bus

      Game Studies

      The Magic Circle

CHAPTER 6 – Principle 1: The Invitation to Play

      Form, function, medium and message

      Case Study: Time Sketches

      Case Study: Body Movies – public play and interaction

CHAPTER 7 – Principle 2: The Playing Field & the Rules

      The Rules of Play

      The Lusory Attitude

      Using the lens of gameplay

      We are all hackers

CHAPTER 8 – Principle 3: Challenge, Boredom and Anxiety

      The border between boredom and anxiety

CHAPTER 9 – Principle 4: Triviality, Open-endedness, Promises

      Playing in the Gallery

      Delivering the promise

      Case Study: Eavesdrop – an opportunity missed

CHAPTER 10 – Social Interaction and Playing with Friends

      APIs as invitations


PART III - Conclusion

CHAPTER 11 - Understanding Interactivity Through Play


POSTSCRIPT

Social Play & Cultural Disruption


REFERENCES


Appendix A – Time Sketches Research Interview Instructions

      Interview Questions


Appendix B – Time Sketches Research Interviews

      Subject 1 - Interview

      Subject 2 - Retrospective Report

      Subject 2 - Interview

      Subject 3 - Retrospective Report

      Subject 3 - Interview

      Subject 4 - Retrospective Report

      Subject 4 - Interview

      Subject 5 - Retrospective Report

      Subject 5 - Interview

      Subject 6 - Retrospective Report

      Subject 6 - Interview

      Subjects 7 & 8 - Retrospective Report

      Subjects 7 & 8 - Interview




Figure 1: Dan Saffer’s definition of Interaction Design

Figure 2: A screenshot from the 1994 Antirom CD-ROM

Figure 3: The "Production" section of the Levi's kiosk

Figure 4: A simple ‘soundspace’ audio mixer from the Antirom CD-ROM

Figure 5: Rimbalzo by Fabrica

Figure 6: The ‘Madness’ opening screen of We Feel Fine

Figure 7: The ‘Mounds’ screen of We Feel Fine

Figure 8: The view-by-weather screen of We Feel Fine

Figure 9: The Bumptop Interface

Figure 10: Apple’s Exposé in action

Figure 11: VisiCalc running in MS-DOS

Figure 12: The City Wall project in Helsinki

Figure 13: A Guitar Hero player striking a ‘power stance’

Figure 14: Wii Sports players in action

Figure 15: Classification of games from Caillois, Man, Play and Games

Figure 16: Tennis for Two by William Higinbotham

Figure 17: Spacewar! running on the Computer History Museum's PDP-1

Figure 18: Tennis (Pong) on a 1975 German videogame console

Figure 19: SEGA’s Super Monkey Ball game for the iPhone

Figure 20: Antirom’s Installation for the JAM exhibition

Figure 21: Time Smear

Figure 22: Detail of the Time Smear slice line indicator

Figure 23: Time Slicer

Figure 24: Time Slicer installed in beta_space

Figure 25: Rafael Lozano-Hemmer's Body Movies

Figure 26: Samuel van Hoogstraten's 1675 engraving, The Shadow Dance

Figure 27: Rafael Lozano-Hemmer’s Body Movies

Figure 28: Tokyo Life

Figure 29: Dan Saffer’s definition of Interaction Design

Figure 30: The interactive experience




What makes an interactive experience engaging? Why are some simple interactions so satisfying and others so dull? Why do technological marvels sometimes fail to satisfy whilst a cardboard box can provide hours of entertainment? These questions seem so innocently simple yet answering them turns out to be surprisingly complicated and opens several cans of worms. This thesis examines how we can use an understanding of play as a lens through which to unpick these complex experiences.


The reason why I believe play and interactivity can and should relate to each other so intimately is directly related to my own role and experience in the development of the language of interactive media. During my days as an undergraduate student in the early 90s, “multimedia” – as it was then known – was emerging on the media scene. My lecturer and later friend and colleague, Andy Cameron, encouraged us as photography, film, video and “digital imaging” students to explore this new world and try to uncover its language and syntax. Armed with tools such as Photoshop 2.0, Macromedia Director 3.0 and a new-fangled video technology from Apple called QuickTime, my colleagues, Andy Cameron and I spent many hours playing.


We were playing with the affordances of the tools, playing with our understanding of narrative structure, playing games and, most importantly, playing at making interactive ‘things’. We were not terribly sure what one should call these ‘things’ that we were making. Sometimes it was tempting to refer to the medium itself, such as CD-ROMs, but this only described the storage format, not the interactive experiences. Partly because many of these interactive experiments were incomplete and partly because we felt they were experiments in play in both their creation and usage, we settled on calling them ‘toys’ for the most part. Sometimes we would take the code guts of one ‘toy’ and re-use or re-combine it with another and these guts we would call an ‘engine’.


Thus, within the same breath, we would use both non-industrial and industrial terminology for the same ‘thing’. It soon became clear to us that the “interactives” (as we now describe them) we were making marked a departure from the media forms that had gone before and that a language to describe them, think about them and create them also had to be invented. The media theory we had imbibed as students, which had helped us deconstruct the complex layers of semiotics in traditional media, was rendered weak and impotent.


As we generated many of these interactives and published them (see Allenson et al., 1994) we realised that although there was a lot of theory being generated about these new, interactive media, not many people were actually making anything that explored interactivity in its own right rather than as an extension of old media. We also realised that the making of interactives was key to understanding what makes interactivity ‘work’ or not. What interested us was why some interactives that took many weeks of planning, production and technical brilliance turned out to be boring experiences once finished, whilst something knocked up on a Friday afternoon as a jokey experiment hooked the attention of all who played with it. Whatever it was, we knew this process was important and that putting these toys out there in the world was having an impact. As interaction designer and Chair of the New York Art Director’s Club’s interactive jury, Brendan Dawes (2006), later wrote, “these guys changed things. Nothing was the same in the world of so called multimedia ever again”.


To unravel the deceptively simple question of what constitutes interactive engagement through the lens of play, and to attempt to re-can at least some of the worms, it is necessary to examine the ambiguous term, interactivity, as well as what we mean by engagement and play. I also look at the production, exhibition, dissemination and contexts of many of these interactive experiences.


Interactive media – objects and media forms that have interactive elements – are part of a rapidly changing field. In the time it has taken to write this thesis, both the technological advances and the cultural shifts among those who consume these technologies have been significant. Taking Apple’s iPod as an example that is used within this thesis, we have seen its development from the first generation version to the current multi-touch iPod Touch and iPhone, which has seen three iterations already. Multi-touch interfaces have become increasingly popular, opening up a whole new realm of interaction design challenges yet to be fully grappled with (see Saffer, 2008, for a thorough investigation into this area).


Due to the rapidly changing nature of this area and the rather glacial process of writing a doctoral thesis, it is necessary to place some delimiters on the present exploration. This thesis examines the moment and experience of interaction in the hope that some of these instances will remain constant regardless of, or perhaps because of, the onward march of technology. Although various technologies are discussed, specific advances in technology are less relevant and are often a distraction from the underlying experience. The following arguments also couch interaction as well as the design of interactions within a framework of play and playfulness, an area that has been central to the development of many human and animal experiences for millennia and is the subject of continual exploration, research and, not least, enjoyment.


The thesis is structured in much the same way as the initial principles of interactivity that are later described. The initial chapters invite the reader to play and to be open to reconciling theoretical differences and paradoxes. I cover as much groundwork of various frameworks as possible, such as interactivity and play in the design process, the psychology and philosophy of play, metaphors and cognition. I shall argue that metaphor, and in particular Lakoff and Johnson’s (1999; 2002) embodied realism account of metaphor, is central to that framework and the understanding of the body and interaction. These provide the playing field and rules for the discussion.


Finally, I attempt to set out this interactive moment and experience in terms of a set of principles that may help us rise above the tangle of disciplines that are continually hoping to lock down a singular definition of interaction and interface design, new-media art and human-computer or computer-human interaction (HCI). These are then explored within the context of play and interaction within social networks and applications. A postscript notes the importance of these cultural shifts for the future. Dealing with the future is not only tricky, but usually proves the writer to be a fool once we actually get there and can look back on those predictions. As such, the postscript is deliberately and somewhat necessarily less rigorous and more speculative than the rest of the thesis.


Concerning the tangle of disciplines described above, it is important to the author and perhaps more important for the reader that the cross-disciplinary nature of interactive media be taken into account. Interactivity, especially in the digital arena, is still a developing form and lacks a consensus of nomenclature, process and categorisation. As will be demonstrated later, part of this problem is that each discipline brings its own flavour of meaning to terminology that is used loosely, commonly and interchangeably.


I consider myself both an interaction designer and an artist and there are no neat boundaries to these divisions. I am, after all, a complete person. When I create an interactive artwork I tend to feel that I am really designing it. It is, for me, a very different feeling to creating a traditional artwork such as a drawing or a painting. When it is installed and someone asks me a question about its ‘meaning’ (whatever that term might mean in a question), then I feel a little more like an artist. But the same work could be more clearly a design piece if installed in a shop window for a client. At Antirom (see Allenson et al., 1994; Polaine, 2004b) and still now, I prefer not to make the distinction in terms of my own processes and I enjoy the creative mobility that this blurriness affords.


This blurriness is, however, frustrating when trying to explain what I or others developing interactives actually do. The nearest over-arching title that I have found I can live with is interaction designer, which seems to cover most aspects of working in this area. Dan Saffer, in Designing for Interaction (2006), makes a valiant attempt to define interaction design with a Venn diagram. Naturally, as soon as something is defined, one sees what is missing and Saffer’s diagram ends up showing just how badly defined the term interaction design is. His diagram requires the corralling of several other disciplines to make sense of it and it does not even begin to delve into the realms of performance, exhibition, media and net art, to name but a few:


Figure 1: Dan Saffer’s definition of Interaction Design. Re-Drawn from Saffer, 2006


Thus, this thesis approaches a developing discipline and necessarily borrows from a wide range of existing, established and not-so-established disciplines. Art and media historians and theorists may feel it does not draw upon enough cultural theory, nor does it draw on the often opaque language of those disciplines. HCI proponents will want more empirical and technical data. Behavioural psychologists may also wish for more direct experimental data and linguists a deeper examination of the semiotics, semantics and syntax of interactivity.


Quite apart from making the field enormously broad in terms of research and literature review, there is also an inherent tension between these different disciplines aiming to find a definitive theory of interactivity. HCI tends to define interactivity in terms of a tool-based view of the world; user-experience and usability experts follow a similar view (Nielsen, 1993; Shedroff, 2001; Nielsen & Tahir, 2002; Garrett, 2003; Saffer, 2006; O'Reilly, 2005). Artists tend to be searching for the answer to interactivity in terms of spectatorship and authorship (Cornock & Edmonds, 1973; Laurel, 1993; Manovich, 2001; Cameron, 2004; Edmonds, Turner, & Candy, 2004; Haque & Pangaro, 2006), just to mention two areas. Cultural theorists, art and media historians and curators are also a significant influence in the effort to try and define interactive media, of course (Weibel, Klotz, Duguet, & Shaw, 1997; Manovich, 2001; Weibel & Shaw, 2003; Huhtamo, 2004; Paul, 2005; Bullivant, 2006; Polaine, 2006a). In the other corner, fighting for academic recognition and a theoretical position of its own, free of previous textual analysis, is videogame theory (Herz, 1997; Frasca, 1999; Salen & Zimmerman, 2003; Aarseth, 2004; Poole, 2004; Raessens, 2006; Rodriguez, 2006). In short, everyone wants the study of interactivity to sit within their own area of expertise rather than treat it on its own terms.


If this thesis is read within the narrower remit of any one of those disciplines not only might it be found lacking (though I hope not), but also the reader will miss the primary point of a cross-disciplinary enquiry. To approach interactivity from all of these disciplines thoroughly in one attempt would require the writing of several theses, some of which have already been written by others, and would lose the overall analysis of the assemblage of traits that interaction design has inherited (or stolen out of necessity) from all those disciplines in order to develop into the area it has become.


One of the reasons for using play and metaphor as the main pillars for much of the overarching argument and approach presented here is that they cross almost all disciplines and aspects of life. Metaphors are indeed concepts we live by (Lakoff & Johnson, 1980) and the use of metaphor crosses boundaries of usage from pre-birth to afterlife, high art to hard science. Play can be encountered in every cultural sphere, even if sometimes only in its absence. Although it is such a universal aspect of the human condition, it remains elusive to define even though we almost all know it when we see it (Sutton-Smith, 1997). The same is true of engaging interactivity and the reason it is often so hard to define is because we are often too keen to try and couch it exclusively within the terminology of a single discipline, an approach that usually fails to satisfy the enquirer and almost always fails to satisfy those from other disciplines. Finally, I, and many colleagues with whom I have worked, have found playful interaction and playful approaches to creating interactive work successful over the past 17 years or so. An examination of why that should be the case is, for me, the most interesting question of all.

What is interactivity?

Since the late 80s and early 90s the number of screen-based environments, from art installations to wireless devices, has dramatically increased and many are interactive in some shape or form. They form integral parts of our daily lives and this has been mirrored by a steady increase in the study of usability and information architecture (Nielsen, 1993; Rheingold, 1993; Norman, 1998; Nielsen & Tahir, 2002; Garrett, 2003; Polaine, 2004a; Pesce, 2004). There is a body of literature that discusses new media, from art installations to the Internet and mobile phones, in terms of their cultural or linguistic forms (Plant, 1998; Lovink, 2002) and social significance (Haraway, 1991; Rheingold, 1993; Johnson, 1997), to name but a few areas that these technologies have touched. These discourses, however, tend to either concentrate on the user's ability to access information or complete tasks as the prime objective of interaction (generally within the field of Human Computer Interaction), or concern themselves primarily with the socio-cultural impact of technologies rather than examining the moment and possible pleasures of interaction in and of itself.


The gap between these two forms is not surprising. It is difficult to empirically research an interactor’s experiences and internal states when interacting with artworks and interfaces in isolation from the surrounding content, intention and environment. Whilst not wanting to invoke a discussion of ‘interactive narrative’ at this point (see Willing Suspension of Disbelief and Interactivity later) there is a useful comparison to be made to the art of story-telling. In the development of narrative the structure – quite separate from the content, language and subtext of the story – is crucial to overall engagement and coherence (McKee, 1997). Without a well-structured foundation, it is difficult for a story to engage, regardless of how enticing the writing might otherwise be. Yet by the same token, a linguistic or syntactical analysis of a text largely kills off any enjoyment of the story as an entertaining experience. Much of the cultural theory examining new media referenced above seems to do the equivalent of examining the use of language, subtext and story (and Foucauldian power relationships of language) whilst skimming over the underlying foundation and critical difference in terms of the experience – the experience of interactivity.


Although the interactive experience may be difficult to analyse, residing as it does inside the consciousness of the interactor, it should be noted that the problem is perhaps more to do with the use of the wrong tools for the job rather than the object of study itself. As noted in the introduction, disparate (rival, even, considering the shrinking pool of research grant funds) academic disciplines compete to own a definition of interactivity. Espen Aarseth, writing on the debate over narratological relevance between stories and computer games, likens this to a land-rush:

One crucial aspect of this debate is whether games can be said to be ‘texts’, and thereby subject to a textual-hermeneutic approach. Here we find the political question of genre at play: the fight over the games’ generic categorization is a fight for academic influence over what is perhaps the dominant contemporary form of cultural expression.

[...] The great stake-claiming race is on, and academics from neighbouring fields, such as literature and film studies, are eagerly grasping ‘the chance to begin again, in a golden land of opportunity and adventure’ (to quote from the ad in Blade Runner). As with any land rush, the respect for local culture and history is minimal, while the belief in one’s own traditions, tools and competence is unfailing (Aarseth, 2004, p. 45).

In the meantime HCI has, up until recently (Gaver, Beaver, & Benford, 2003), been bogged down in behavioural-response analysis and tool-based thinking, which is the interactive equivalent of a kind of linguistic analysis of syntax devoid of much acknowledgement of emotion or phenomenal experience. The gap between these two approaches to interactivity is the focus of this thesis. A key argument is that our understanding of interactivity is enhanced by a deep understanding of play and of how play feeds into both the creation and the experience of interactive works, be they interfaces, installations, artworks or something in-between. The thesis also asserts that interactivity is the important “new” part of “new media” – at least in terms of the theoretical frameworks mentioned above – and that the physical experience of interactivity cannot be fully explained or understood by existing theories of ‘old’ media. That said, this ‘newness’ itself presents a whole host of problems, some of which this thesis intends to unpack.


One of the issues with the term ‘new media’ is that it has not enjoyed a common understanding or definition. ‘Interaction’ and ‘interactivity’ are perhaps even more problematic since they have definitions in semiotics and cultural theory that are quite different from the one proposed here. Central to the argument of this thesis is the idea that physical interaction gives rise to a psychological and phenomenological experience of absorption and flow (Merleau-Ponty, 1964b; Csikszentmihalyi, 1990; Csikszentmihalyi, 1975), rather than interaction being a purely psychological affair or the mingling of meanings in a viewer’s mind (Manovich, 2001). The contention is that through understanding this process and its relationship to play we may be able to develop a set of principles of interactivity that help us both analyse and create interactive experiences.


Tying these strands together – interactivity (which often incorporates metaphor), play and the embodied experience – is the work of George Lakoff and Mark Johnson (1980; 1999). Lakoff and Johnson’s work on metaphors, the embodied mind and de-throning the Cartesian dominance of Western philosophy provides the bridge between the concerns of critical media art history and theory and those of HCI, whilst remaining focused on our embodied experience and understanding of the world. Of course their own view is heavily indebted to the phenomenological philosophy of Merleau-Ponty that also informs this thesis (Merleau-Ponty, 1964a; Merleau-Ponty, 1964b; Merleau-Ponty, 1964c; Merleau-Ponty, 1990; Matthews, 2002; Merleau-Ponty, 2004).


Brenda Laurel’s Computers as Theatre (Laurel, 1993) took a different direction from that of HCI very early on by exploring the parallels between human-computer interaction and dramatic representation and the emotional engagement this entails. More recently, the development of ‘experience design’ (Shedroff, 2001) has begun to address the entire user experience in the field of design, but the processes and ‘language’ of interactivity – the underlying structures that help to create interactive engagement – are relatively under-explored and ill-defined (Johnson, 1997; Schwartz, 2004; Johnson, 2006).


In the meantime the development and discovery of ‘good’ and ‘bad’ interaction has tended to be confined to individual experimentation within the realms of art, design and everything in-between. Since beginning this research, we have also witnessed an explosion of ‘Web 2.0’ applications that explore social networks, communities and, essentially, group play – some of which have become enormous cultural and commercial successes. Websites and web applications such as MySpace, YouTube, Flickr and Google Maps, along with the rise of blogging and Twitter, have seen consumers become socially networked producers (O'Reilly, 2005). But perhaps more significant in relation to this thesis is the growing culture of community play. This suggests a further avenue of exploration and forms the final chapter of this thesis.


The crucial question of “What is interactivity?”, however, remains problematic. Lev Manovich’s book, The Language of New Media (Manovich, 2001) explores new media’s heritage of cinema and computer technologies and documents a broad history of work, but his view and definition of new media is a narrow reading of its genesis. Crucially, Manovich denies that interactivity is a fundamental, defining component of new media, instead arguing that all texts and art are interactive as they require the “psychological process of filling-in, hypothesis formation, recall, and identification, which are required for us to comprehend [them]” (Manovich, 2001, p. 57). Manovich avoids using the term ‘interactive’ because he suggests, “there is a danger that we will interpret ‘interaction’ literally.” That is, that interaction will relate to the physical aspects of interaction (with buttons, mouse and the screen) “at the expense of psychological interaction” (Manovich, 2001, p. 204). Huhtamo (2004) points out in his defence of “interactive” that many traditional art critics dismiss the term as irrelevant by arguing that all art can be “active” in some sense, and Manovich takes the same path as these critics. “However, interactive art added to the mental activity a haptic dimension: the visitor was not only allowed, but required to touch the work” (Huhtamo, 2004, p. 3).


Yet it is precisely Manovich’s literal definition of new media’s origins, of the physical coming together of cinema and computers with Konrad Zuse’s use of old film as a punch-card programming system, that prevents him from being able to include interactivity within his theoretical framework. Part of the difficulty arises from Manovich drawing so heavily upon cinematic theory, which concerns itself primarily with the psychological relationship between viewer and moving image.


The nature of story structure and plot, character and action has been explored since the days of Aristotle’s Poetics (Pesce, 2000) in order to understand how to engage the audience in the emotional and spiritual journeys of the characters. Cinema uses editing and sound as well as mise-en-scene, the relationship of the camera to the subject, the lighting and the art direction to enhance the emotional subtext of the story and draw audiences into the world of the film (Bazin & Gray, 1967). In this framework there is little or no space for physical (as opposed to psychological) interactivity. As Mark Pesce, author of The Playful World (Pesce, 2000), suggests, “Cinema is the true lean-back medium because you suspend everything else – it is storytelling and you don’t talk over the storyteller” (Polaine, 2004a).


When we discard the physical of course we are left with psychological interaction, yet it is exactly this physical interaction that is the “new” in new media. Screens and projection surfaces may change radically each year as new technology arrives, but sequential frames running through time are still videos (in the broadest sense), whether on a mobile phone or a television screen. By using cinema as a starting point we come no closer to defining interactivity, leading Manovich to admit:

Although it is relatively easy to specify different interactive structures used in new media objects, it is much more difficult to deal theoretically with users’ experiences of these structures. This aspect of interactivity remains one of the most difficult theoretical questions raised by new media (Manovich, 2001, p. 56).

This thesis suggests a different definition of interactivity to that of Manovich, one in which the physical interaction is a key component. In this physical definition of interactivity an ‘interactor’ makes a change to the device presented to them (usually to elements on a screen, but not exclusively) which in turn changes the interactor’s own behaviour. Edmonds and Cornock used the term participant in their influential paper, The Creative Process Where the Artist Is Amplified or Superseded by the Computer, when describing an ‘art system’ (1973), but I use the term ‘interactor’ because I feel it more accurately describes the role than user, viewer or participant (a participant can be passive or coerced, after all).


In a complete interaction the interactor’s changed behaviour creates another change in the device’s reaction, which results in another change in the interactor’s behaviour, thus producing a feedback loop of interaction, a term Cornock and Edmonds also employed when describing this ‘dynamic situation’ as a ‘matrix’ (Cornock & Edmonds, 1973).


This idea of a physical-virtual feedback loop allows us both to escape Manovich’s purely psychological definition of interaction and to look beyond the concerns of interface and usability by referring back to action, as Laurel (1993) suggests. This form of action, reaction and interaction is also a fundamental aspect of play and game playing (Freud, 1950; Huizinga, 1955; Caillois, 1961; Derrida & Bass, 1987; Sutton-Smith, 1997). Indeed, in describing their 1969 installation, Interplay, Cornock and Edmonds wrote: “We would therefore consider this system as one leading to amplified play situations that are potentially creative” (1973, p. 14).


In this system, embodiment is crucial, for without it there is no interactive system and it defies any analysis beyond the textual-hermeneutic that Aarseth describes (Aarseth, 2004). Simon Penny notes,

The embodied, enacted dimension of interactive entertainment has not been adequately considered. In particular, embodied interaction with representation, where bodily action changes the representation in a way which is analogous to, and is designed to be analogous to, human action in the world of physical objects, raises scenarios which conventional critiques of representation, and those aspects of art theory that remain influenced by traditional psychology of visual experience, are not well equipped to deal with (Penny, 2004, p. 73).

This creative feedback loop, and the physical component of it, also contributes to the flow experience (Csikszentmihalyi, 1992). This state of complete absorption in a task in and of itself is also a critical element to interactive engagement as it is evoked throughout this thesis. As we shall see in the coming chapters, these various themes and approaches can be tied together to develop a more coherent understanding of the interactive experience through the notion of play.


Interactivity and the HCI field

From navigational menus to videogames, interactivity is often conceived as part of an interface to other content. This is indeed the purpose of many interfaces and their design is often critical in the success or failure of products and services (Nielsen, 1993; Nielsen & Tahir, 2002; 37 Signals, 2006). This has been a primary focus of the field of Human Computer Interaction (HCI) up until relatively recently (Gaver et al., 2003). Unfortunately, this field frequently ignores the emotional experience of the moment of interaction and relegates interaction to a function of a control mechanism at best and something to be mastered and 'got through' at worst – such interfaces are not “honestly interactive systems” as Haque and Pangaro (2006) describe. Laurel argues that we have this relationship the wrong way around:

Action is indeed the primary component of human-computer activity – not environments, interfaces, or objects. But environments, interfaces, and objects are traditionally much easier to conceive of and represent than a quality that is fundamentally invisible, and the structure of which is contested at best (Laurel, 1993, p. 135).

In design, this inevitably leads to the focus on the technology and interface rather than the emotional experience of the interaction. Laurel continues:

What if we were to define the action of information retrieval, not as looking for something, but examining or experiencing it? This seemingly innocuous shift in point of view puts the emphasis in an entirely different domain: the action involved in perceiving, interpreting, and experiencing information (Laurel, 1993, p. 140).

Taking this view, the focus of design shifts from designing interfaces to designing interactions and interactive experiences. Indeed, this is what we are currently seeing in the new generation of online applications such as YouTube, MySpace, Flickr, Twitter, et al. They provide a space in which people create their own content and in which the experience of that process is as important as (if not more important than) the end result (Coates, 2006; O'Reilly, 2005).

Towards a language of interactivity

Without a clear understanding of the language and nature of interactivity the analysis and design of interactive media will continue to be a fractured, confusing and potentially emotionally barren landscape. Although literature concerning video game culture is useful and interesting (e.g. Herz, 1997; Kent, 2002; Wolf, 2002; Wolf & Perron, 2003; Poole, 2004) and is explored in later chapters, video games set up a number of conditions, such as competitive strategies and visceral experiences, that bring interaction into a rather different, hyperactive zone. Competition and goal-based gameplay tend to overshadow the interaction itself and interface is again reduced to a functional role. This is essentially the progress rhetoric of Huizinga’s Homo Ludens (Huizinga, 1955) which focuses on play as the development of civilisation. The notion of play applied to the realm of interactivity provides a way forward beyond the limits of media forms and usability and a possible solution to the theoretical difficulties with interactivity that Manovich discusses.


Merleau-Ponty's ethos of remaining a perpetual beginner (Matthews, 2002), Winnicott's (2001) notion of play and the almost naïve meditative state that it induces can be useful design guides along with Csikszentmihalyi’s notion of flow (Csikszentmihalyi, 1975; Csikszentmihalyi, 1990), a notion that will be examined in a later chapter of this thesis. By narrowing our focus to the moment of interaction and developing interactive “toys” in which the interface is dissolved and interactivity is paramount we are a little closer to grasping the principles of engaging interactions. It is hoped that these explorations can help the field of interactivity move beyond the duality of technological obsession versus cultural/media theory. In this way we can develop a coherent theory of interactivity that mirrors the theories of cinema without being a simple transposition of one medium’s theoretical framework onto another cultural form.


In turn, these principles may then be applied to a wider range of objects and interfaces in which interaction plays a role. As the developed Western world escalates our options for leisure pursuits whilst simultaneously increasing the amount of stress that making these choices involves (Schwartz, 2004), developing playful diversions and understanding how playfulness can be woven into everyday interactions may prove to be an ever more important consideration for designers and the basis for future research.


CHAPTER 1 - The Role of Play in Interactivity

There is a body of material from sociology and psychology (Huizinga, 1955; Caillois, 1961; Sutton-Smith, 1997; 2001; Brown & Vaughan, 2009) concerning the experience of play that is pertinent to understanding the pleasure of play in interaction. Winnicott's (2001) view of play is that it operates in a halfway world between our inner and outer worlds. More importantly, he defines play as a meditative space and highlights its associated physical movements in his summary of children at play:

The area of playing is not inner psychic reality. It is outside the individual, but it is not the external world […] Playing involves the body because of the manipulation of objects […] Playing is essentially satisfying (Winnicott, 2001, p. 52).

Mark Pesce, author of The Playful World (2000), believes that Sony's camera-based games using the EyeToy on the PlayStation 2 represent a fundamental shift in interactivity where the body becomes the interface. Pesce argues that this kind of playful interaction connects with us at a deep level in the same way that Aristotle's dramatic arcs seem hard-wired into our psyches (Polaine, 2004a).


Brian Sutton-Smith, in his exhaustive review of the field of play, The Ambiguity of Play (1997), argues that the theory of play is far from unified and that its ambiguity comes from several ‘rhetorics’ of play that influence any use of it within related fields. These rhetorics are explored later in this thesis. However, it is perhaps not too dangerous to suggest that as children we learn about the world through play: we come to understand how language, human relationships, physics and our own bodies work – to name but a few of the enormous learning tasks that face an infant – all of which involve making mistakes by necessity. It is important to note that these may not be the purpose of play, for, as Sutton-Smith points out, this remains ambiguous:

Play is used in so many ways in the arts, literature and semiotics, and is made to enter into so many relationships with metaphor, that there is no way these varieties can be contained in one simple ludic category (Sutton-Smith, 1997, p. 149).

Although these kinds of explorations and processes may not be the purpose of play they are, however, features of play. In terms of designing for interaction and playfulness as a design process, it is useful to consider the rhetoric of childhood play – as we mature play becomes an activity dismissed as childish and not serious and the need to stop making mistakes is drummed into us via our educational systems. These are systems that privilege results over process, goals over intrinsic experience – they are the rhetorics of animal progress and the rhetorics of child play that Sutton-Smith (1997) explores. Yet to understand interactivity fully it is necessary to be willing to make plenty of mistakes on the way, for there is still plenty of unknown territory to explore.


One of the rhetorics of play is that it is held to be 'not serious' and thus easily dismissed, but because it is 'not serious' it matters less when one makes mistakes (Sutton-Smith, 1997). Most designers will be aware of the power of prototyping, role-playing and brainstorming, all of which require a suspension of judgement and the willingness to 'play along' in order to discover something new and break free of traditional thinking (Schrage, 2000; Kelley & Littman, 2001).


In discussing Antirom’s work practice, Cameron (1998) describes it as an essentially modernist approach of uncovering the "essence" of interactivity. This, when viewed through the philosophical lens of Merleau-Ponty’s (1990) phenomenological philosophy, advocates, not a total disconnection with experience in order to understand it in an empirical, Cartesian sense, but rather a "distancing" that allows us to step back from our entanglement and re-create our naivety, to remain the "perpetual beginner" as Merleau-Ponty scholar, Eric Matthews (Matthews, 2002), describes it. In this view human beings "are part of the world that they experience, and who experience it, not in the form of pure contemplation, but in the course of active involvement with it" (Matthews, 2002, p. 35). We are not simply brains observing the world passively, we are complete, embodied beings that experience the world through our physical bodies.


It is via this approach that it is possible both to explore the structures of engagement (as one might discuss the story and character in a traditional narrative) and to remain open to the possibilities and experiences of new and interesting modes of interactivity. There appears to be a direct parallel here between Merleau-Ponty's (1990) attempts to escape the dry, logical Cartesian view of being via phenomenological philosophy and the traditional battle lines often drawn between programmers or engineers (who often literally work within Cartesian co-ordinates) and designers' desires to utilise the emotive properties of objects, graphics, spaces and interactions. At the very least, this ‘messy’ approach (Cameron, 1998) is another way to escape the emotionally bare empiricism of HCI.


It is appropriate, therefore, that a playful approach to design can also help us understand how to design for playful interaction in a media form which is still in its infancy.

Play as a design process, interactivity as content

Throughout this thesis the role of play and interactivity is discussed in depth, but before moving onto play’s relationship to interactivity itself, let us examine the role of play and playfulness as a design strategy for interaction for they are, in my experience, deeply entwined. As discussed previously, the bulk of the practical research into interactivity and interaction design has been from artists and designers experimenting in both personal and commercial work and discovering what ‘works’ and what does not.


In 1994 I co-founded the new media collective Antirom whose first work was an eponymous CD-ROM that was intended as a "critique of the poverty of contemporary multimedia" (Allenson et al., 1994). Antirom's interests, research and projects were fuelled by a desire to try and ‘discover’ the interactive equivalents to cinematic language, very much in the modernist tradition of attempting to discover the 'essence' of a medium, as Antirom co-founder Andy Cameron explains (Cameron, 1998).


Significantly, the process of 'discovering' these interactions at Antirom was through an iterative design process that required the willingness to make many mistakes:

We have not discarded lines of enquiry merely because they appeared to be ridiculous or stupid, and we have made as many mistakes as we could, as quickly as we could (Cameron, 1998, p. 6).

The Antirom members felt that many of the early 90s commercial interactive multimedia outpourings (mainly kiosks and CD-ROMs) consisted of "ill-conceived point-and-click 3D interfaces" (Allenson et al., 1994) grafted onto re-purposed old media content such as video, text, images and audio. The Xplora CD-ROM by musician and artist Peter Gabriel (1993) is one such example. Although Xplora was a polished and popular production, the interactivity within it mainly consisted of navigational elements that aided (or in the case of the VIP area, hindered) access to video and audio samples of Gabriel's musical works. As such, the interactivity was little more than database access and VCR button functionality.


Figure 2: A screenshot from the 1994 Antirom CD-ROM. Users click and re-mix audio and rotating body sections in an interactive rendition of the traditional 'Misfits' children's book. (Allenson et al., 1994)


Antirom sought to address this by creating interactive works in which the interface was the content and the purpose of the interaction was the experience of the interaction, not a vehicle to access another, old-media, experience. The original Antirom CD-ROM (and subsequent projects such as an installation for the Barbican's JAM exhibition and several installations, kiosks and CD-ROMs for Levi Strauss and Co. Ltd.) consisted of a collection of simple interactive experiments (Allenson et al., 1994; Allenson et al., 1995; Allenson et al., 1996b).


Figure 3: The "Production" section of the Levi's kiosk. This section allowed users to interact directly with a simplified version of the manufacturing process with each process being turned into an interactive "toy" (Allenson et al., 1995).


These interactives became known as ‘toys’ because of the nature of the audience's interaction with them – most of these toys were playful and created an engaging diversion, without having the kinds of goals or competitiveness of games. The pleasure was in the playing. Each interactive deliberately utilised only one or two simple forms of interaction, such as moving the mouse around the screen or rolling over different elements, so that the interactor might become absorbed by the experience of interaction rather than being frustrated by the interface.


Difficulties arise when trying to explain exactly why one toy or 'interactive' is more engaging than another. Engagement in this sense uses Laurel's comparison to "willing suspension of disbelief" in theatre, that is to say how engrossed in the experience and action of the interactivity the interactor becomes. Closely related to ‘flow’ (Csikszentmihalyi, 1990), this is the fundamental “difficult theoretical question” that Manovich (2001, p. 56) raises and then avoids. This thesis intends to examine precisely these issues.

Playful interaction and direct manipulation

The work at Antirom (and by its members in other institutions since) found two main factors that contribute to an engaging interaction: playfulness, and making the experience of an interaction the content or primary aim (Cameron, 1998; Allenson et al., 2001). Although graphics, sounds, spectacle (and occasionally narrative) may all play their part in the overall experience of an interactive work, they were often secondary to the interactivity. Interactive ‘engines’ (code with placeholder graphics) were usually designed first and appropriate content inserted after several prototypes were made. This process is much closer to those used in product design and engineering than graphic design (which has since become the more dominant design discipline of the Web) or more traditional fine-art approaches. This is not to say that the audio-visual treatment of an interactive is inconsequential. It is an essential factor in an engaging piece, but the interactivity was the more important starting point. A beautiful but tedious interaction remains a tedious interaction regardless of the graphical treatment.


One example is the ‘soundspace’ audio mixer (see Figure 4). Rather than re-creating a virtual version of a real audio mixer complete with 3D rendered buttons and sliders, the Antirom designers simply assigned sounds to matching graphical elements on the screen. In one version this was a scan of a toy horse, a dollar sign and a sheriff’s badge. Each represents a sound (horses galloping, Western music and cowboy dialogue) and each can be dragged around the screen. The volume of each sound is determined by the proximity of the mouse to the object and the user interacts directly with the content to create the audio mix of their choice.


Figure 4: A simple ‘soundspace’ audio mixer from the Antirom CD-ROM (Allenson et al., 1994).
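The proximity principle behind the soundspace can be illustrated with a minimal sketch. This is not Antirom's original code: the linear fall-off, the max_distance parameter, the screen co-ordinates and the function names are all assumptions made purely for illustration.

```python
import math

def proximity_volume(mouse, obj, max_distance=400.0):
    """Return a volume between 0.0 and 1.0 that rises as the mouse nears the object.

    The linear fall-off and max_distance are assumed; the source only states
    that volume depends on the proximity of the mouse to the object.
    """
    distance = math.hypot(mouse[0] - obj[0], mouse[1] - obj[1])
    return max(0.0, 1.0 - distance / max_distance)

def mix(mouse, objects, max_distance=400.0):
    """Compute one volume level per on-screen object for the current mouse position."""
    return {name: proximity_volume(mouse, position, max_distance)
            for name, position in objects.items()}

# Hypothetical positions for the three draggable objects described above.
objects = {"horse": (100, 100), "dollar": (500, 100), "badge": (300, 400)}
levels = mix((100, 100), objects)
print(levels)
```

Because the only input is the position of the mouse relative to the content itself, there is no separate control surface to learn, which is the point of the design.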


Starting from a basic interactive principle (rolling the mouse around the screen, for example) and stripping back extraneous elements allowed the Antirom designers to explore interactivity in its own right. This means there is almost no interface learning process for the interactor, but another kind of learning does take place: learning to create more pleasing ‘mixes’.


The principle of dissolving the interface and allowing people to interact directly with the content (albeit by using a mouse, keyboard or audio input) is clearly demonstrated in a simple interactive from Fabrica, the Benetton Research and Development Communication Centre, in Italy. Rimbalzo (Fabrica, 2001) is an interactive sound toy that uses an image of a moulded concrete wall as both its ‘interface’ and its graphic content. When the interactor clicks on one of the moulding indents, it falls from the wall like a ball and bounces on the ground producing a simple kalimba (an African thumb piano) note (see Figure 5). The “ball” bounces back to its original position and falls again, bouncing and producing the sound ad infinitum. The time interval between bounces and the pitch of the note are defined by the height of the ball on the wall. When several balls are bouncing at once they produce a phased rhythm and musical sequence entirely dependent on the moment in time each ball is set in motion.



Figure 5: Rimbalzo by Fabrica in Italy. Each moulding recess becomes a bouncing ‘ball’ when clicked (Fabrica, 2001).
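The way a phased rhythm emerges from height alone can be sketched as follows. This is an illustrative model under stated assumptions, not Fabrica's implementation: it assumes a perfectly elastic bounce under constant gravity, and the pitch mapping (higher recesses sound higher, spanning one octave over an assumed wall height) is hypothetical, since the source says only that height defines both interval and pitch.

```python
import math

GRAVITY = 9.81  # assumed constant; the original's units and physics are unknown

def bounce_period(height):
    """Seconds between ground hits for a ball that always returns to `height`."""
    return 2.0 * math.sqrt(2.0 * height / GRAVITY)

def note_frequency(height, wall_height=3.0, base_hz=220.0):
    """Hypothetical pitch mapping: one octave spread across the wall height."""
    return base_hz * 2.0 ** (height / wall_height)

def hit_times(height, start, until):
    """Times at which a ball released at `start` strikes the ground before `until`."""
    period = bounce_period(height)
    times = []
    n = 0
    while True:
        # The first fall takes half a cycle; each later hit is one full cycle on.
        t = start + period / 2.0 + n * period
        if t >= until:
            break
        times.append(round(t, 3))
        n += 1
    return times

# Two balls from different heights, the second released slightly later.
low = hit_times(1.0, start=0.0, until=5.0)
high = hit_times(2.0, start=0.3, until=5.0)
print(sorted(low + high))
```

Since each ball keeps its own fixed period, the combined pattern depends only on the heights chosen and the moments of release, which is what the interactor learns to ‘play’.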


The interaction itself is extremely simple; one can click on the balls or click on the reset button. Again, the pleasure and engagement come not from learning how this interaction works but from trying to ‘play’ the interactive as an instrument, to try and become better at creating more ordered or desirable phasing patterns rather than a random sequence (which is also possible).


In the late 1990s Antirom started to experiment with removing the physical interface to the computer by removing the mouse and keyboard and monitoring one small, but significant, attribute of the users and their environments: the volume of the sound entering through the computer’s microphone. Limited as this was, the interactions became much more intuitive and organic as we discovered through Tota Hasegawa’s CD-ROM, Microphone Fiend (Hasegawa, 1996).


Microphone Fiend utilised the microphone sound level to drive some simple interactives that had previously been interacted with via the mouse. In one interactive the amount an iris opens up on screen depends on the sound level. Inside or behind the iris a man turns to the camera, again dependent on the sound level, and mouths “Shut it!”. The interaction is very simple but strangely compelling because of its organic reaction to audio input.
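A sketch of how a sound level might drive such an iris is given below. It is an assumption-laden illustration rather than a reconstruction of Microphone Fiend: the use of a root-mean-square level, the smoothing factor and the function names are all hypothetical, and the samples are taken to be already captured and normalised to the range -1.0 to 1.0.

```python
import math

def rms_level(samples):
    """Root-mean-square level of audio samples normalised to -1.0..1.0."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def iris_opening(level, current, smoothing=0.5):
    """Ease the iris (0.0 closed, 1.0 fully open) towards the clamped sound level.

    The easing is an assumed way of producing a gradual, 'organic' response
    rather than an abrupt jump on every audio frame.
    """
    target = min(1.0, max(0.0, level))
    return current + (target - current) * smoothing

# Simulate a few audio frames: silence, a loud burst, then silence again.
frames = [[0.0] * 4, [0.8, -0.8, 0.8, -0.8], [0.8, -0.8, 0.8, -0.8], [0.0] * 4]
opening = 0.0
for frame in frames:
    opening = iris_opening(rms_level(frame), opening)
    print(round(opening, 3))
```

The interactor needs no instruction here: any noise in the room produces a visible reaction, and the reaction fades when the noise stops.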


The removal of mouse and keyboard makes this kind of interaction perfect for installations. Interactors do not have to learn an interface, nor do they have to grapple with any kind of physical device and risk breaking it. This last point, whilst not empirically researched here, has been my observation of many installations. People tend to treat a mouse on a pedestal with extreme, almost animal-like caution. They will first observe others using the interactive from a distance, circle the area and then, when no one else is looking, make a first tentative stab at an interaction. If they go unrewarded in the first instance, many people walk away. By “people”, I really mean adults; children dive in, unafraid to make mistakes or to break the installation, and usually have a much more engaging experience, judging by the looks on their faces. The removal of the mouse and keyboard removes the computer nature of an interactive work and allows it to be more magical – the reaction and interaction “just happens”.


Interactions that require no effort on the part of the interactor to start them, such as the reactive microphone level toy, avoid this problem altogether. These interactions start with an initial reaction from the computer causing a reaction from the interactor. If this is successful and engaging, the cycle repeats and the interactor’s reaction creates another reaction in the computer and the interactive feedback loop occurs.


Cameras and the body as affordance

The next step in this process is to use the body, but data gloves and body suits all need to be hooked into the computer via an umbilical cord of wires. Cameras, on the other hand, allow for immediate reaction-interaction scenarios and can be as intuitive as standing in front of a fun-house mirror. Mark Pesce (2000) points to Don Norman’s usage of interface affordance (Norman, 1998) when it comes to using Sony’s EyeToy camera that plugs into their PlayStation 2 videogame console. He believes camera-based systems like the EyeToy are the alternative to VR and are immediately intuitive.

The interesting thing about [the EyeToy] is that if you show it to a three year-old, who doesn’t have fine motor skills, they immediately understand how to [interact] and they are falling in love with the EyeToy and the games. On the other end of the scale, you show the EyeToy to a fifty year-old, someone who would never play a videogame for a whole bunch of different reasons (one of which has to do with the controller) they start playing it immediately. It has removed the idea of the interface, it has made the interface completely invisible as far as they are concerned and it is also acting now as a bit of a mirror and a camera so they understand that their body has become the affordance (Polaine, 2004a).

Once the need to master a fiddly and complex game controller, or to remember complex combinations of special moves, is removed, the nature of the games that are played also changes. Pesce suspects this shift is conceptual rather than technical (Polaine, 2004a). As always, the tools used affect the content and vice versa; sometimes the conceptual leap is far more difficult than the technical.


Given the usually highly competitive nature of most videogames, it is interesting to see that the EyeToy comes bundled with a set of games that, whilst mildly competitive, are rather more openly playful and fun in nature than the usual videogame fare. Most of the games in Play 2 (London Studios, 2005), the second release for the EyeToy, are staged in such a way that players can compete against each other in a round-based structure. Whilst there is a clear winner at the end of the game, a large part of the enjoyment comes from the act of playing the game, much more so than the winning. Other games follow the “keep with the beat” idea of Dance Dance Revolution (Konami Corporation, 1998), an arcade game in which the player dances on pressure pads and tries to keep in time with the music. The experience is more like Karaoke than the sports field or racing track. The Air Guitar game in Play 2, for example, is not particularly difficult; one must simply strum and place the hands in the correct positions, but a large part of the entertainment value comes from watching someone else play. Watching your friends move around physically in a ludicrous manner is clearly part of the appeal.


Pesce also notes that “there are this whole set of games that aren’t games, they are just things that you do”. Of course, as he points out, none of this is new. Media artists have been experimenting with video cameras from their early days. Myron Krueger’s Videoplace (1974), which he started working on in the early 1970s, is one of the pioneering examples. Videoplace uses cameras to track participants in a space and combines their live image (or silhouette) with computer generated graphics on screen. (For an examination of Krueger’s role in the development of interactive installations see Cameron, 2005.)


Mine-Control, a U.S.-based artistic collaboration between Zachary Booth Simpson, Jim Greer and several other programmers, has also been creating camera-tracking installations for a number of years. One of their most famous, Shadow Garden (Mine-Control, 2004), is a selection of interactives which respond to the shadow that the participant casts onto a projected scene. Elements in the scene react to the shadow, often displaying realistic physical behaviour. In one scenario translucent coloured ‘sand’ is pouring from top to bottom of the screen. When the interactor stands in front of the projection and casts a shadow, the sand flows around them as it would in real life. If the interactor makes a cup shape with their shadow, the sand fills the cup and is released again when they remove the shadow.


None of these works make any real comment in the tradition of the fine arts about gardens or video or even interaction. What they do provide is an engaging experience for the interactor and that is both a challenge and an accomplishment in itself. As Cornock and Edmonds write:

The system can be very rich, though the speed with which the participant may exhaust the set of possibilities means that the result could lack substantial interest or value (Cornock & Edmonds, 1973, p. 13).

It remains to be seen whether interactivity will move beyond these simple experiences and move people emotionally in the way that cinema and other traditional narratives do, or whether that is not the role of interactivity at all.


Stripping back interactive works to their essentials and attempting to understand what principles cause the state of play and flow may be a good place to start. Uncluttered by the weight of deep meaning and the Wow! factor of the technology, simple interactions and observation of the interactors at play may allow us to develop the tools and techniques that story-tellers have had at their disposal for years. Once games systems like the EyeToy become more commonplace (there are already millions around the world) the novelty factor of camera input and seeing oneself on television may wear off and the development effort may shift towards the conceptual. Already the EyeToy is a good example of the shift from audience to performer. Many of the games and similar interactives described above rely on the notion that it is more interesting and exciting to become the performer than simply to stand on the sidelines and watch others play.


Perhaps the most commercially successful example of the disappearing interface in videogames has been Nintendo’s Wii console. Nintendo opted out of the 3D graphics power arms race with the Wii, focussing on the interface and gameplay instead. The most groundbreaking feature of this console is its wireless controller, the Wii Remote (or Wiimote). The Wiimote tracks movement and acceleration on all axes as well as rotation. It can also be used as a pointer for the screen. This means that the controller is, much like camera interaction, untethered from the screen and the console and has led to a completely different demographic playing videogames. There have been several reports of the Wii becoming increasingly popular in retirement homes, particularly the bowling game (Wischnowsky, 2007). In Australia it became the fastest selling games console of all time (Moses, 2006).


The legendary games designer, Will Wright, expresses his view thus:

The PS3 and the Xbox 360 feel like better versions of the last, but pretty much the same game with incremental improvement. But the Wii feels like a major jump – not that the graphics are more powerful, but that it hits a completely different demographic. In some sense I see the Wii as the most significant thing that's happened, at least on the console side, in quite a while (Johnson, 2007).

As we will explore later, it is usually the gameplay, not the graphics, that makes the game, and the magic gameplay ingredient is a surprisingly hard thing to define. For now, let us explore another aspect of the interactive experience that is crucial to its functioning at all – the suspension of our disbelief, that is, of our awareness that what we see on the screen is not really there.

CHAPTER 2 – Flow, Interactivity & Suspension of Disbelief

Interactive Flow versus Narrative Flow

Narrative works are usually concerned with immersing the reader or audience in the story and the narrative suffers when interaction is simply grafted onto it. The characters become flat either because they are repetitive, pre-recorded elements triggered by the user or because they are capable of acting in so many different ways (corresponding to interactive options) that they cease to have a distinctive character.


Artists and designers working with interactive media might be creators of objects, media or even environments that audiences view, but ultimately the interactive elements change the traditional relationship of author and audience. Typically interactivity has been applied to narrative in variations of the “choose your own adventure” style where the user (for there is often little actual interaction) charts a course through a set of forking or circuitous paths (Gabriel, 1993; Anderson, 1995; Gabriel, 1996). The other common manifestation of interactive art is a virtual gallery or journey, although the space and environment may well be quite abstract (Gabriel, 1993; Harwood, Fuller, Edgell, Ashworth Hospital Authority, & Ashworth Arts, 1996).


Even the more technically advanced work of artists such as Jeffrey Shaw (Weibel et al., 1997; Shaw et al., 2001; Weibel & Shaw, 2003) tends towards experiencing a virtual space as in the pieces The Legible City, ConFIGURING the CAVE (Weibel et al., 1997) and Eavesdrop (Shaw & Pledger, 2004). These installations use sophisticated technology (at least in respect to the date of their creation) to immerse the participant in the world, but they still follow a similar line of thought; the participant is there to view or navigate through an authored piece of work and perhaps temporarily change it a little. However well disguised, the interaction is still one of discovery of elements in a maze or puzzle.


Whilst these pieces are sometimes successful in terms of exploring ideas of narrative and content, the interactive experience is less fulfilling. The viewing apparatus involved detracts from the interaction. (In The Legible City the participant rides an exercise bike that moves a virtual camera around the scene and in ConFIGURING the CAVE the participants must don polarising goggles in the CAVE video environment). One is constantly made aware of the physical world because of the special apparatus required to view the virtual. The result is that, whilst the physical senses might be “immersed” in the virtual, authored world to some degree, the mind is constantly made aware of the physical world through the effort of the interaction.


Brenda Laurel (1993) points out that this interruption of the flow exists even outside of the narrative world when referring to user interfaces and operating systems:

Instructions, error messages, or dialogue boxes [...] are almost always destructive of our engagement in the activity by forcing us to “pop out” of the mimetic context into a metacontext of interface operations (Laurel, 1993, p. 102).

Videogames are one genre in which narrative and interaction have been relatively successfully combined to engage audiences. One only has to experience the hours passing in a flash when absorbed in the playing of a good game to appreciate the immersive effect. Yet although videogames are usually equipped with a back-story as in Tomb Raider (Eidos Interactive, 1998) or Prince of Persia: Sands of Time (Ubisoft Entertainment, 2003) they are essentially maze and puzzle games loosely wrapped within a narrative setting. Ultimately the story is just the dressing to a goal-based game and it is reaching the goal – the next level, the highest score, the fastest time – that provides the hook, not the story, which is frequently a poor copy of clichéd cinematic tropes.


In a real-time 3D-rendered game, you may be able to do what you like because it is not pre-rendered, but this control is an illusion of interactivity. The player can, if they wish, leave Tomb Raider’s Lara Croft in a room for a year or force her to jump off a cliff and kill herself, but actually one can only move around inside the constraints of the gameplay. We cannot make Lara stop and negotiate with her enemies instead of shooting them, which might show the growth in character so crucial to traditional narratives.


As Steven Johnson describes in his innovative reading of contemporary culture, Everything Bad is Good for You (2005), the mistake that is often made with videogames is to judge them by their content, not their structure. Yet it is their structure that is the real innovation and that has become increasingly complex over the last three decades, rather than becoming ‘dumber’ as the popular press would have it. Johnson turns to McLuhan (2001) to explain:

McLuhan believed that this rate of change shed light on the hitherto invisible ways in which media shaped a given society's worldview; let us see the impact of the medium, and not just the message. When your culture revolves exclusively around books for hundreds of years, you can't detect the subtle ways in which the typographic universe alters your assumptions. But if you switch from cinema to radio to television in the course of a lifetime, the effects of the different media become apparent to you, because you have something to measure them against (Johnson, 2006, p. 176).

Johnson continues examining the structure of interaction, gameplay and learning within videogames as well as (and compared to) the multi-threaded plot structures of popular contemporary television shows, including reality television. Some of this we will encounter later with regard to understanding the rules of play, but Johnson’s thoughts on the delicate balance between narrative and interactive flow also shed some light on the narrative/interactivity issue. In particular the ‘aimless’ aspect of games (such as walking Lara Croft around the virtual space for hours, not doing much) explains some of the reasons why narrative and interactivity tend not to “play well” together:

This aimlessness, of course, is the price of interactivity. You're more in control of the narrative now, but your supply of information about the narrative – whom you should talk to next, where that mysterious package has been hidden – is only partial, and so playing one of these games is ultimately all about filling in that information gap. When it works it can be exhilarating, but when it doesn't – well, that's when you start shelling out the fifteen bucks for the cheat sheet (Johnson, 2006, pp. 30-31).

Indeed, poor game or interactive design pulls one out of the immersive experience of the game world as much as the error alert box that Laurel describes. Immersion begins with the willing suspension of disbelief.

Willing Suspension of Disbelief and Interactivity

Laurel compares engagement in interactive experiences to the theatrical notion of “willing suspension of disbelief” introduced by the early nineteenth-century poet and critic Samuel Taylor Coleridge (Laurel, 1993, p. 113). Willing suspension of disbelief in traditional narrative is the audience’s willingness to “play along” and pretend that the fictional world presented to them really exists in order to become engrossed in the story.


This willing suspension of disbelief is broken by having to grapple with an interface (whether real or virtual) that is part of the real world (the screen, the computer), not that of the narrative. In either scenario the flow of the narrative is interrupted. Conversely, interactivity in narrative works is often unsatisfying because it is so restricted by the requirements of the author to satisfy the needs of the narrative, the technology or both. Artist and theorist Simon Penny explains the problem of illusion with regard to the CAVE VR system:

Little of the proprioceptive or perceptuo-motor correlation characteristic of bodily movement in real space is simulated or accommodated. In the case of the CAVE, for instance, such correlation is utterly scrambled in paradigmatically mechanistic style by the disconnection of forward movement from turning, of "drive" from "steering." One can bodily turn, but one cannot bodily walk, lest one rend the screen or wrench the gear off your head. The illusion of forward movement is achieved by dragging the world under one's feet using codified button clicks. This is a laughable example of the way that such systems often inhere awkward and paradoxical user constraints as a result of hardware limitations. More surprising is that this inconsistency seems not to have been found problematic, even in the professional literature (Penny, 2004, p. 78).

At first glance there appears to be a strong argument for the role of the narrative in interactivity. Narratives have been engaging audiences for thousands of years, most of the time through this suspension of disbelief. Is the same not true of interaction and interface? After all, suggests Laurel, when using a computer’s operating system we know that the windows, folders and files are not ‘really’ there, they do not really exist as the physical objects the icons represent, but in order to engage with either the task in hand or the pleasure of the experience of a game, for example, we willingly pretend that they are (Laurel, 1993).


Surely we can utilise this suspension of disbelief to create an engaging and absorbing interaction in the same way that plot, character and mise-en-scène engage the audience in the emotional arcs of story and disengage them from the reality of their world? The problem arises when we try to transpose the tools and techniques of narrative to interaction. What, for example, is the equivalent of a camera angle or mood lighting in an interactive experience? What kinds of interactions absorb the interactor? The transposition fails because creating interactivity requires an approach that is quite different from that of a storyteller.


If we change our approach and think about action instead of narratives or tools, as Laurel suggests, then the artist’s or designer’s task is to create an experience, not an object that is a representation of their experience or of something they are trying to communicate. It is possible to induce a willing suspension of disbelief, but one in which the interactor is engaged with the activity at hand, rather than the interface to the activity. This engagement is common in play situations, especially those involving make-believe, and in Csikszentmihalyi’s flow state (Csikszentmihalyi, 1975; Csikszentmihalyi, 1990; Csikszentmihalyi, 1992; Csikszentmihalyi, 1996; Csikszentmihalyi, 1997a).


We then need to look for equivalents to cinematic language in interactivity rather than simply applying film theory to interactivity because they both tend to be screen-based. Conventional narratives attempt to mask the structure of the story (plot, characterisation, dramatic turning points) by using that very structure to create emotional hooks on which to hang our disbelief. When the structure starts to crumble we become aware of the printed page or the fact that we are in the cinema and we start to withdraw from the world of the story. This is often because the fine-tuning has gone astray; perhaps a typographic error, an unconvincing visual effect or an infeasible coincidence in the plot. When we find an interactive or an interface confusing and frustrating (and it is not deliberate) we have a similar experience; we are jettisoned back into real-world emotions and removed from those that we were experiencing in the represented world. More importantly, if the interactor has an unintended unpleasant experience or none at all then the piece has fallen at the first hurdle of engagement and they will often give up and not experience any deeper aspects of the work or content, whether an interface or an artwork.


One area of our daily lives in which we regularly suspend our disbelief is in the use of metaphors. Metaphors require a kind of dual processing of meaning – one at the superficial level of the metaphor’s imagery, the other at the level of the underlying concepts. Some metaphors become so commonly used that they do not feel like metaphors at all and unpicking them provides excellent insights into interactivity.

CHAPTER 3 – Interactivity, Metaphors and the Mind


In order to try to establish a theoretical and practical framework that might underpin the language of interactivity, it is necessary to develop a set of principles that might be applied both in the deconstruction and analysis of existing interactive works and in the construction of new ones. What principles can be applied that, if they cannot guarantee success, at least bring one far closer to it than groping around in the dark?


In traditional story-telling terms these might be dramatic arcs, turning points in the plot, characterisation, etc. (Johnson, 2005). In interactivity, such principles are necessarily drawn from a wide range of disciplines, each tending to be concerned with its own niche. Although essentially non-linear and often random, the process of interacting is still something that happens in the world (real or virtual) and is governed by time. We can begin by breaking down this process into stages and facets that an interactor may experience.

The role of metaphors

Metaphors have long been part of interface design, from well before the dawn of digital technology. As we will see in a moment, metaphors form part of our language and thus even things like physical switches or pumps rely on conceptual metaphors in their operation. They might be simple and often invisible to our conscious thought, but when they go wrong we notice them. In Germany, for example, a light switch is usually turned on by switching it to the 'up' position, the complete opposite to the UK. It is a tiny difference, but as an Englishman my mental map of light switches is “up = off”. Understanding how metaphors and play interrelate beyond these kinds of control systems, which have already been well documented (Norman, 1998; Norman, 2003), can help lay the groundwork for a set of principles for developing interactivity.


George Lakoff and Mark Johnson’s work on metaphors (1980; 1999) has important implications for the relationship between interactivity and play. Lakoff and Johnson’s Metaphors We Live By (1980) brought to prominence the central role of the metaphor in the way we conceptualise the world around us and showed that metaphors were not just a poet’s flourish, but embedded in our daily thoughts and actions. We cannot escape their use any more than we escape our bodies.


For interaction designers and artists, this is an important step because almost every screen-based (and often physical) interface is based on metaphors, even if they are relatively abstract. As we have already seen from Brenda Laurel (1993), the role of our ‘suspension of disbelief’ when it comes to dealing with interfaces is paramount; otherwise we suddenly become aware of the façade of the interface – the lines on the screen are not really a folder or a file or a tool.


Using Lakoff and Johnson’s theory of metaphors we can start to unravel the relationship between physical actions (however small) and interaction, and understand why play and playfulness might be key factors in interactive engagement. First, however, we need to examine the background to metaphor and the embodied mind according to Lakoff and Johnson.


Philosophy in the Flesh (Lakoff & Johnson, 1999) builds upon their previous work and argues that the major empirical findings of cognitive science shake the foundations of a priori philosophical speculation (an argument that itself stands on the shoulders of Merleau-Ponty). If that were not enough, they argue that this creates radical changes in our understanding of reason and of ourselves:

It is surprising to discover, on the basis of empirical research, that human rationality is not at all what the Western philosophical tradition has held it to be. But it is shocking to discover that we are very different from what our philosophical tradition has told us we are (Lakoff & Johnson, 1999, p. 4).

Lakoff and Johnson argue that reason is not disembodied but that it “arises from the nature of our brains, bodies, and bodily experience” (1999, p. 4). They continue:

For example, there is no Cartesian dualistic person, with a mind separate from and independent of the body, sharing exactly the same disembodied transcendent reason with everyone else, and capable of knowing everything about his or her mind simply by self-reflection. Rather, the mind is inherently embodied, reason is shaped by the body, and since most thought is unconscious, the mind cannot be known simply by self-reflection (Lakoff & Johnson, 1999, p. 5).

Here, of course, we hear a clear echo of Merleau-Ponty’s phenomenological account of our being in the world:

[R]ather than a mind and a body, man is a mind with a body, a being who can only get to the truth of things because its body is, as it were, embedded in those things (Merleau-Ponty, 2004, p. 56).

This is a significant insight for the field of interactivity because much research focuses solely on the cognitive aspects of an interface or an interaction. Although product and industrial design have been influenced by ergonomics over many years, there remains a separation between the hardware and software interfaces of many commercial interactive systems – a gap that plenty of artists working at this crossover threshold have explored. In fact the growing number of artists and designers working in this area (Debatty, 2009) perhaps shows that experimentation with ‘physical computing’ is still an open field of discovery (Sullivan & Igoe, 2004; Saffer, 2006; Moggridge, 2006; Igoe, 2007).


Merleau-Ponty, in his essay/lecture on Space (2004), uses the example of Cézanne (and painters since) rejecting classical perspective because this analytical view “controls the movement of [details in a painting] unfolding yet also kills their trembling life” (2004, p. 53). He continues:

They [Cézanne, et al.] have been reluctant to settle for an analytical overview and have striven to recapture the feel of perceptual experience itself. Thus different areas of their paintings are seen from different points of view (Merleau-Ponty, 2004, p. 54).

Without wanting to digress too far into a discussion of traditional fine arts versus interactivity, it is worth noting that interactive artworks usually escape the fixed nature of painting as Cézanne was attempting to do. If a generalised view of a work of art is that it is an expression of the artist’s perception of the world, the spectator always has to view this through the medium of the work. That is, the experience that the artist is trying to convey is always second-hand. Often (but by no means always) interactive works are a kind of set-up experience – the artist or designer invites the interactor to directly experience the work. Here, as we discussed previously, the interactive experience is the work itself. Of course each person brings different backgrounds and readings to an interactive artwork, just like any other, but if the interaction is the experience and the experience is the interaction then this is as close to viewing the world through the artist’s own eyes and experience as is perhaps possible. This is most simply put by Michael Carter and Adam Geczy:

Making an interactive work of Art, then, is making a program through which the user can have an Art experience. With interactivity one is not simply a viewer of the work, but both a generator and participant of the work. Some would even say that it is only as a result of audience participation that the work of Art is ultimately realised; and realised over and over again (Carter & Geczy, 2006, p. 227).

Interfaces are both metaphorical and physical

Interactive interfaces, especially those that are screen-based, occupy an interesting middle ground between action and intention, conceptual and embodied action. Hundreds, if not thousands, of linguistic or conceptual metaphors are physical in origin. “How do the pieces of this theory fit together” (Lakoff & Johnson, 1999, p. 51) is one example, which Lakoff and Johnson describe in their terminology as “Organization is Physical Structure” (1999, p. 51). Obviously, in the case of a typical operating system, the information is organised and stored physically on the computer’s hard-drive (or in some other accessible memory system) in ways that most of us, certainly non-technicians, would neither understand nor be able to conceptualise easily.


From the UNIX command-line to Graphical User Interfaces, metaphors have been employed to make it easier for us to conceptually grasp the ‘meaning’ and organisation of information in a computer as well as to manipulate it. This, of course, is not news. What Lakoff and Johnson’s theory of metaphor and embodiment enables us to examine is the in-between space, that is, the idea that screen-based interfaces are both metaphorical and literal at the same time. To explore this, let us return again to a familiar and simple act for most computer users, dragging a file into a folder.


When I drag and drop a file into a folder on screen I am utilising the metaphor of the desktop and files and folders to organise relationships between batches of binary information. As previously discussed, at some level I am aware that this is not a real, physical piece of paper that I am moving around and that I am allowing myself to suspend that disbelief in order to make use of the metaphor.


Lakoff and Johnson argue, with empirical evidence to support their claims, that the parts of the brain used for the real, physical action of putting a paper file into a folder are the same as those used to conceive and utilise the metaphor of dragging a file into a folder on my computer’s desktop:

Contrary to long-standing opinion about metaphor, primary metaphor is not the result of a conscious multistage process of interpretation. Rather it is a matter of immediate conceptual mapping via neural connections (Lakoff & Johnson, 1999, p. 57).

Yet dragging a file to a folder on the computer’s desktop also requires me to move my body. In my case I press the mouse button to select the file and then drag, both physically with the mouse and virtually on the screen, to move the file into the folder. At this point the metaphor is both literal and conceptual, both of which reinforce the neural connections involved with that action.


From Lakoff and Johnson’s conceptual mapping it should be no surprise that even the best designed, logical but virtual interfaces can fall down in terms of user experience if the physical mapping is even slightly wrong. Even a tiny delay – perhaps less than a tenth of a second – on a touchscreen can be disconcerting, creating a kind of dissonance between the physical and the virtual as we found whilst developing many installations at Antirom (Allenson et al., 1995; Allenson et al., 1996a; Allenson et al., 1996b). Finding ways to overcome or hide the technical “time lag” is a common aspect of interaction design, whether it be about bandwidth, processing speed or perception.


In more complex interactive experiences and interfaces, understanding the metaphor, however abstract, is crucial to the ‘successful’ experience of them. Frequently complex metaphors are used to organise, navigate and make sense of a large amount of data, as in Jonathan Harris and Sepandar Kamvar’s wonderful We Feel Fine (Harris & Kamvar, 2005), which harvests weblogs in order to capture a rolling archive of people’s emotions. The initial view, called ‘Madness’, is made up of dots, representing weblog posts, flying around the space like the particles of smoke in a Brownian motion experiment:


Figure 6: The ‘Madness’ opening screen of We Feel Fine (Harris & Kamvar, 2005)


Already we see a metaphor of ‘information space’ and colour and shape mapping as well as the clustering in the bottom-left corner (for those posts that cannot be classified from metadata). Added to this is the overlay of the blog quote and the menu bar at the top.


We Feel Fine allows for several different views of the same data. It is possible to drill down (another metaphor) and select, for example, all females between the ages of zero and 18 who felt happy on 11th May 2005 in Texas when it was raining. Harris (who designed and developed the interface) manages to combine some very clever data manipulation and a beautiful, playful environment with many elements that encourage simply playing with the interface, quite apart from enjoying the content. It is an excellent example of these two realms working in concert – the interface is as much part of the experience of the piece as the content itself. Here the most popular emotions expressed are shown in wobbling ‘mounds’. Swishing the mouse over the mound makes it wobble as if it were jelly, which is pleasurable in its own right, but it is also a useful way of displaying comparisons of volume of posts:


Figure 7: The ‘Mounds’ screen of We Feel Fine (Harris & Kamvar, 2005)


In this area one can select blog posts that are defined by the weather (using the date, time and location of the blog post We Feel Fine can then work out the local weather at the time of the post). Here we see several metaphors combined:


Figure 8: The view-by-weather screen of We Feel Fine (Harris & Kamvar, 2005)


It is difficult to see in this still image, but the dots representing the blog posts are animated according to the weather. For example, the posts from when it was rainy in the third column from the left are animated to look like falling rain. The sunny section, on the left, is a rotating cloud of orange dots representing the sun. Here we see metaphors of movement, of form, of spatial relationships, of weather, as well as text (the question mark is made up of dots whose weather cannot be determined). These are just the top-level metaphors; beneath them we have ideas of columns and spatial divisions more common to screen interfaces.


In their analysis of the literature regarding embodied concepts and embodied cognition theory, Lakoff and Johnson suggest how concepts, and thus metaphors, arise from our bodily perception of the world rather than being arrived at by pure reason, divorced from our physical natures:

From a biological perspective, it is eminently plausible that reason has grown out of the sensory and motor systems and that it still uses those systems or structures developed from them. This explains why we have the kinds of concepts we have and why our concepts have the properties they have. It explains why our system for structuring and reasoning about events of all kinds should have the structure of a motor-control system.

[...]Philosophically, the embodiment of reason via the sensorimotor system is of great importance. It is a crucial part of the explanation of why it is possible for our concepts to fit so well with the way we function in the world. They fit so well because they have evolved from our sensorimotor systems, which have in turn evolved to allow us to function well in our physical environment. The embodiment of mind thus leads us to a philosophy of embodied realism. Our concepts cannot be a direct reflection of external, objective, mind-free reality because our sensorimotor system plays a crucial role in shaping them. On the other hand, it is the involvement of the sensorimotor system in the conceptual system that keeps the conceptual system very much in touch with the real world (Lakoff & Johnson, 1999, pp. 43-44).

The process of a metaphorical concept, from this view, can be mapped in this way: a physical action/perception gives rise to the metaphorical use of the action/perception. This usage, in turn, reinforces the neural connection between the action/perception and the metaphor. What is important about the use of metaphors in interactivity and interfaces is that they are both metaphorical and physical at the same time and that the mapping is simultaneous and completes the loop. An action/perception in the physical world is used as a way of organising and conveying the relationships of information and experiences through a ‘virtual’ interface. The interaction with this virtual interface can then reflect – indeed be exactly the same as – the original action that gave rise to the metaphorical concept utilised by the interaction designers in the first place. Hence, dragging an icon of a file into a folder matches, to some extent, the physical action of dragging a physical file into a physical folder.


Examining the balance and relationship of this duality that much interactivity is based upon – the simultaneous metaphorical conceptualisation and physical action – opens up new avenues of enquiry. Understanding and using play theory can help guide this enquiry because play is structured with a similar duality of concept and action and because it is “outside the individual, but it is not the external world,” as Winnicott (2001, p. 52) describes. One can easily see how counter-intuitive interactive metaphors can be dissonant simply by reversing the axes of a computer’s mouse so that a physical movement to the left moves the screen cursor to the right and the same with up and down. The ‘rules’ of the metaphor are broken in much the same way that the rules of play can be broken, either by cheating in a game or simply by not engaging in the imaginary construct of the play space.
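The reversed-mouse thought experiment can be made concrete with a minimal sketch. It assumes a simple screen coordinate pair and hypothetical function names (cursor_delta, move_cursor) that are introduced here purely for illustration and do not come from any particular operating system or toolkit:

```python
def cursor_delta(mouse_dx, mouse_dy, invert_x=False, invert_y=False):
    """Map a physical mouse movement onto an on-screen cursor movement.

    With both flags False the mapping is the conventional one: the cursor
    moves in the same direction as the hand. Setting a flag reverses that axis.
    """
    sign_x = -1 if invert_x else 1
    sign_y = -1 if invert_y else 1
    return (sign_x * mouse_dx, sign_y * mouse_dy)


def move_cursor(position, mouse_delta, **inversion):
    """Return the new cursor position after one mouse movement."""
    dx, dy = cursor_delta(*mouse_delta, **inversion)
    return (position[0] + dx, position[1] + dy)


start = (100, 100)          # hypothetical starting position in pixels
hand_moves_left = (-10, 0)  # the hand moves ten units to the left

# Conventional mapping: the cursor follows the hand.
conventional = move_cursor(start, hand_moves_left)

# Reversed axes: the same physical movement sends the cursor to the right.
reversed_axes = move_cursor(start, hand_moves_left, invert_x=True, invert_y=True)
```

The only difference between the two mappings is a change of sign, yet it is enough to break the ‘rules’ of the metaphor: the interface remains perfectly logical while contradicting the physical action it is supposed to mirror.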


Furthermore, the primary metaphors – the basic ‘building blocks’ of more complex conceptual metaphors – are learned during childhood play and learning about the world when the concept and the metaphor are conflated, according to Lakoff and Johnson (1999). For example, the metaphor of having a close relationship relates to being physically close to those we are intimate with. For an infant these two are conflated – physical closeness is closeness of relationship and vice versa. Later we are able to separate out the metaphorical understanding of being close to someone from the physical actuality of it.


Extending this idea, it may well be that interactions (in the sense that we are examining here) that require physical movement reflecting the conceptual metaphor re-conflate the metaphorical concept with the real-world concept. It is these interactions that feel intuitive or natural and are thus likely to engage, unless of course the idea of the interaction is to deliberately confuse and disorientate, just as one might deliberately make oneself dizzy during play (Huizinga, 1955; Caillois, 1961).


The more one examines these simple, primary metaphors within interface and interaction design, the more often they appear to have arisen through experimentation and accidental discovery rather than by design, as was the case with many of the pieces in Antirom (Allenson et al., 1994). Some examples are audio pitching up in a sound toy being ‘up’ on the screen with lower pitch matched to downward motion; the mouse being near an object that ‘contains’ a sound means that this sound is louder; moving the mouse to the right scrolling a screen object in the opposite direction, giving the feeling of ‘travelling past’ the object, whilst the mouse ‘embodies’ one’s presence in or on the screen space.

A metaphor too far

Since metaphors are so powerful in aiding people to understand something new by making reference to something they already know, it is no surprise that they are used all the time by interface and interaction designers. On our computers we 'push buttons', 'rotate dials', 'open windows', 'surf links' and so on. When an interface is well designed we call it intuitive because it feels like 'second nature'. Almost all of this is based on tacit and sometimes explicit understandings of these metaphors.


Even the excellent and often abstract interactive works by Toshio Iwai (Iwai, 1992; Iwai & Nintendo, 2005), which inspired much of Antirom's output (Allenson et al., 1994), have some kind of metaphorical basis of insects or objects bouncing off each other. Indeed, the success of Iwai's work is often due to a relatively tenuous mapping of the metaphor. Sometimes using a metaphor too literally can make the whole process fall apart, both in language and in interaction design. We may use the metaphorical mapping Affection is Warmth when we say "they greeted us warmly" (Lakoff & Johnson, 1999, p. 50), but we do not normally expect to have a pre-warmed blanket wrapped around our shoulders when we go to someone's house for dinner.


A great deal of technological development in computer science has been driven by 3D modelling and rendering and the Holy Grail of trying to reproduce the physical world faithfully using the computer. Putting aside debates surrounding the Uncanny Valley (Mori, 1970) – where it has been suggested that computer generated people and creatures become more unsettling and less real the more 'realistic' they become – directly representing a graphical user interface (GUI) in a three-dimensional space often destroys the very flexibility and 'poetic licence' of the metaphor.


Figure 9: The Bumptop Interface - a 3D representation of files on a desktop. (Agarawala & Balakrishnan, 2006)


A clear example of this is the Bumptop interface (Agarawala & Balakrishnan, 2006), which utilises a three-dimensional rendering of the desktop with files and folders as small blocks that can be moved around the desktop by dragging and dropping but also throwing or tossing. Objects then pile up and collide with each other just like real objects on a real desk. The rationale of this approach appears to make good sense at first glance:

Despite the metaphor, current virtual desktops bear little resemblance to the look or feel of real world desktops. A workspace in the physical world typically has piles of documents, binders and other objects arranged in a way that provides considerable subtle information to the owner. For example, items are often casually placed but their spatial position and orientation are usually meaningful. Closer items can indicate urgency, and piles of items are “automatically” ordered because new items are typically placed on top. This casual organization, prevalent in the real world, differs greatly from the GUI desktop which forces users to immediately file their documents into a rigid hierarchy (Agarawala & Balakrishnan, 2006).

Yet we soon see that this approach has some severe limitations. Piles and piles of papers on a desk form some kind of organisational structure, but are frequently irritating to have to sift through when one is trying to find something quickly. The ability to search documents by title, content or tags is a decided advantage of a digital system and does not necessarily require the kind of rigid hierarchy the Bumptop creators are suggesting. What would be wonderful with a pile of papers on a real desk would be to see them all separated out for a moment, be able to choose the correct one, and then have them zoom back to their piles. This already exists in Apple's OS X interface with their innovation called Exposé.


Instead of the visual metaphor of the desktop being given greater attention and detail as in the Bumptop example with little book icons bouncing around with simulated physics, Exposé breaks the desktop metaphor without breaking the flow of intention. For those who have not seen Exposé in action, it is perhaps easier to show or experience than to explain in words (like many interactions).



Figure 10: Apple’s Exposé in action - each of these windows above was previously a full-size window or document overlapping other windows on my desktop.


Exposé allows the user to press an assigned hot-key or place the mouse in a certain corner of the screen to trigger three different possible views of active files. One option is to see all open files and folders displayed on the screen without any overlapping windows. Regardless of how many windows are open they all scale to fit on the screen. Another option is to display only the windows of the foremost application and a third option is to clear the screen of all windows and just show the desktop. In the first two cases, the desktop is ‘dimmed’ by a semi-opaque black overlay to signify that it is not the focus. Additionally, items can be dragged from anywhere onto one of the shrunken windows – for example, a picture from the desktop onto an e-mail message.


In this situation the metaphor of a physical desktop has been broken – in the physical world there is no magic ability to float all of one’s physical papers and documents in the air, choose one, and then have them all go back into place in real life. Yet Exposé manages not to interrupt the flow of intention. In other words, breaking the metaphor does not matter in this case because it does not pull the user out of their flow. It takes great effort and innovation to make such a conceptual leap in terms of interface design and ensure that it still makes intuitive sense as an interface. In fact, when Apple made some changes in an update to their operating system the minor graphical and user-interface tweaks tipped that delicate balance. So many users complained that Apple ended up changing the interface back to its previous state (Frakes, 2008).


When I first saw Exposé my initial reaction was to wonder why something so simple had not already been developed. I would suggest that it is precisely because it breaks the desktop metaphor and thus had been out of the imaginative field of view for many interface designers (along with the fact that technological developments made it more possible).


Exposé’s metaphor is comparable to Lakoff and Johnson's conceptual metaphor, but is more a metaphor of intention. Why does this metaphor of intention work and what is it a metaphor for? Without being able to ask Apple’s design team directly (who may not know the answer to that question anyway) or access specific research in the area we can only hypothesise. I believe Exposé metaphorically maps the action of rifling through papers or documents, half lifting the edges of a stack of magazines, for example. You get enough information to recognise the item that you are looking for, which may be just the colour of a cover or a part of an image or a single word or two. You drag it out of the middle of the pile without unravelling everything and drop the edges back down again.


This physical action is metaphorically similar to the action of displaying all your open documents in Exposé, albeit that it is abstracted in the GUI, which is the point of a GUI after all. The ability to ‘clear your desk’ of everything magically and then have it all back again in the order it was is one of intention – something you would love to be able to do in real life, but cannot.


Bumptop takes a wrong path because it tries to emulate the physicality of the piles of paper instead of abstracting the intentionality of the physical cues (i.e., that something important is at right-angles on the top of the pile). The result is that the authors have to start breaking their own metaphor with options such as ‘Disable Physics as Necessary’ (Agarawala & Balakrishnan, 2006). Ultimately, quite apart from having to develop a whole new set of user actions to use it, Bumptop ends up having the same problems as a physical desktop, thereby negating much of the benefit of a metaphorical, digital one:

Like the GUI desktop, our prototype runs into problems when the number of items gets large. As Whittaker et al. (2001) found, “the main limitation of [piling] was that it did not scale well: pilers found difficulties accessing information once piles had begun to multiply”. We intend to explore extensions that might deviate somewhat from the physical piling metaphor but benefit from leveraging the underlying computer (Agarawala & Balakrishnan, 2006).

A large part of the skill of interaction design is developing these metaphorical and physical relationships and working out which part of the metaphor must be kept intact and which parts are flexible. The more complex the structure the more complex, and in some cases abstracted, the metaphor can end up being and it is easy for both the designer/artist and the interactor to become confused. As we will see in the next chapter, playful interactions and play in general sometimes involve complex “metaphorical idioms” (Lakoff & Johnson, 1999, p. 68) that become hard to describe in words, even though they are highly familiar, which makes developing principles of interactivity based on play difficult. By examining both how play and metaphors can either succeed or break down, however, it becomes possible to navigate this terrain (to use yet another metaphor).


Direct control and manipulation in interfaces

The interface and controls for manipulating either screen-based, physical or hybrid interactive works, elements and devices are the primary signifiers of any potential action, reaction and interaction. Their affordances prime the interactor to approach their use in a specific way and this is, of course, the purpose of using metaphors in interface design.


In the majority of contemporary interaction and interface design, conceptual and visual metaphors are employed to transfer knowledge from a familiar domain into an unfamiliar one. It is a standard technique that interaction designers use (Krug, 2006; Saffer, 2006) and thus we have ‘files’ and ‘folders’ on a virtual desktop or rotary dials, buttons and sliders on digital interfaces that, being digital and screen-based, could have been anything the designer desired.


We have already examined the value, theory and problems of metaphors, but it is worth examining the notions of control and direct manipulation in interactive situations. Much of the interactive experience is about the extension of the self or a feeling of remote control. Of particular interest are interfaces that allow the manipulation of content directly rather than through mediating dials, buttons, etc., since this direct interaction can collapse the need to “learn” the interface entirely. In essence, this is the utopian interface that Norman’s design principles aim for – an object that needs no introduction, no manual, but whose properties and functions are immediately recognisable from the first glance or touch with no possibility of using it ‘incorrectly’ (Norman, 1998).


Ben Shneiderman’s essay, Direct Manipulation (1983), is subtitled a “step beyond programming languages” (Ibid., p. 57). Many of the ideas expressed in 1983, and certainly the wonder at user responses, now seem a little quaint. What is interesting is how much of his account of the “direct manipulation” of data is based on his prior experience of programming in single-line text editors. The single-line text editors, as the name suggests, only allowed the user to view and edit one line at a time, “like seeing the world through a narrow cardboard tube” (Ibid., p. 58). Shneiderman then explains display editors that “display the document in its final form” (Ibid., p. 58) and introduces us to VisiCalc, the world’s first electronic spreadsheet program, invented by Harvard MBA student Dan Bricklin in 1978 (Bricklin, 1999).


Figure 11: VisiCalc running in MS-DOS


Later in the essay Shneiderman uses some more recent examples (from the early 80s) of interfaces with rudimentary graphic elements reminiscent of early Atari console and arcade games. In all of these examples the key ingredient, in Shneiderman’s view, is that they allow users to “directly” interact with and manipulate the data to hand. For someone used to manipulating data via lines of arcane syntax this will, of course, feel like one is “directly manipulating” the data. Twenty-five years later, it is obvious that this was never really the case and that all the examples Shneiderman describes are really about directly manipulating metaphors of data representation.


What starts to emerge from reading this essay is how much one’s previous experience of doing something without the use of a metaphor-based interface affects the acceptance of any attempt to make it more visual, however basic. In the case of some of the examples from the late 70s and early 80s, the ‘visual representations’ that users are directly manipulating are blocky arrays of pixels. Yet, of course, we see this process repeated again and again with new technologies and their display mechanisms, from mobile phones to the displays on cameras and DVD player/recorders. Some of the more basic interfaces that still exist (entering an address in the contacts on a mobile phone, for example) we find relatively frustrating, even though in the time of Shneiderman’s essay they would have been seen as intuitive direct manipulation.


What is happening here to make yesteryear’s super interfaces no longer feel very usable at all today? Why is it that an interface metaphor can seem intuitive at one time and then 25 years later feel awkward and clumsy? What actually happens when we ‘get used to’ an interface or a way of doing things? Why does metaphor work at all as an interface device? In order to try to answer these questions, it is useful to take a detour through behavioural psychology, perception and cognition and cognitive science, which explains the way we decide which tasks are worth spending our precious cognitive machinations on, which ones we can ignore and how we parcel up these various activities in our minds. Along the way we take a look at how we take on board and subsume metaphors into our perception of the world and this brings us back into the realm of linguistics and Lakoff and Johnson’s “embodied realism” (Lakoff & Johnson, 1999; Johnson & Lakoff, 2002).


CHAPTER 4 - Thinking Outside of the Brain

Scripts, Schemata, Mental Models and Automotive Action

Habituation is one aspect of interaction that plays a significant role in our perception and use of an interactive interface. It is how interaction design conventions arise and why they are very useful design elements to pay attention to (Krug, 2006; Saffer, 2006), such as an element being highlighted when ‘rolled over’ by the cursor or buttons looking like 3D ‘physical’ buttons. This is true even when the aim of a piece is to disorient and/or surprise the interactor – you have to know the rules in order to break them. Of course, conventions are also a critical part of product design and we interact with them every day, from the red and blue badges on hot and cold taps to the symbols and layout of a car dashboard.


Part of the difficulty with exploring conventions of interaction design – or of design in general, for that matter – is that it is difficult to pinpoint a moment in time when invention becomes convention. Most of us notice a convention in the negative, when it is not there. Most people who drive have had the experience of operating the windscreen wipers instead of signalling a turn when they are driving an unfamiliar car. Even after doing it the first time, habit means we must hope the driver in the car behind notices the wipers come on as we turn a corner on a sunny day, at least two or three times, until we manage to re-program our own internal habits to the new behaviour. The fun starts all over again when we get back into our own car.


Most drivers will also recognise the slightly disturbing experience of driving a familiar, daily route and suddenly arriving home with little memory of the journey. The entire process has become automatic (or automotive, as the theory is called with, I presume, no pun intended – cf. Bargh & Barndollar, 1996). This is an area examined in detail below.


Cognitive science research – not just human-machine relationships (Suchman, 1987) – has produced an abundance of theories about how we store patterns and oft-repeated actions in our brains. This research tends to be broken down into increasingly smaller parts in order to try to unravel the atomic complexity of everyday actions and thoughts. The research on action and behaviour – from goals and motivations to the conscious and unconscious cognition involved – fills volumes and incorporates debate that is both engaging and highly detailed (Gollwitzer & Bargh, 1996). A full review of the action, behaviour and cognition literature is therefore outside the scope of this thesis. It would be remiss, however, to explore the language of anything without taking a look at how our brains think about language and structure, store and represent our conceptions of the world.


There are several models that can help answer the above questions about interfaces and interactivity: scripts and inference (Schank & Abelson, 1977; Schank, 1980), schemata (Schank & Abelson, 1977; Schank, 1980; Alba & Hasher, 1983) and action motivation, specifically the automotive theory (Bargh & Barndollar, 1996). All of them deal with the brain filling in missing information and most contain a theory of economy of cognition in order to free up cognitive resources for a more important task in the foreground. Most of these theories of memory can be placed under the umbrella of schema-theory. Kant (1855), it should be noted, introduced the notion of schemata as innate structures 200 years earlier. Much contemporary research stems from the work of Bartlett (1932).


Similar to Minsky’s frame theory (Minsky, 1975), Schank and Abelson’s script theory provided an explanation of how we comprehend commonplace events (and reports of those events) and infer sequences of events and actions as well as relationships of objects, people and places to each other (Schank & Abelson, 1977; Schank, 1980). I say provided because the theory was born out of research aimed at developing natural language processing systems for artificial intelligence systems, an approach that is no longer in vogue. However, although it is a schema-theory that is considered somewhat problematic, central to its conception of memory and language is the idea that we infer a significant amount of information from partial knowledge of a situation. This aspect of scripts is worth expanding upon.


For example, when we are told, “Karin went to the dentist and had her tooth pulled. She argued over the bill,” we infer that she took a journey to the dentist, registered at reception, sat in the waiting room, was asked into the consulting room and then, hopefully after being given an anaesthetic, the dentist commenced with the procedure of extracting her tooth. We also infer that she was later presented the bill and argued with someone, probably the dentist or the receptionist, about the amount of the bill, not that she was either hovering or standing over the bill at the time.


Lakoff and Johnson would, naturally, examine this in terms of the metaphors used (i.e., an argument is a war, wars are usually fought over a principle or claim, a principle is an object that can be owned, and so forth). Schank explains this cognitively in terms of a series of higher level knowledge structures, mainly through scripts, which are “a kind of key to connecting events together that do not connect by their superficial features but rather by the remembrance of their having been connected before” (Schank, 1980, p. 253). The prototypical example used is telling a story of a visit to a restaurant:

In a story involving the setting of a restaurant, we cannot infer the causal connection to either ordering or paying from hearing simply that someone has entered a restaurant. However, speakers assume that you know this connection and they do not bother to mention it. There is a causal chain there, but inferring it bit by bit is impossible, which makes scripts necessary (Schank, 1980, p. 253).

Scripts are prototypes for events that allow us to piece together the memory of events because we have experienced something similar before, such as going into a restaurant. Scripts are expanded by plans, which allow us to infer the motivations and intentions of actors in an event; goals, which we also draw inferences from (such as why the goal was chosen or when it might be abandoned); and themes, a higher level overview of the events, actions and actors (from which we might infer other goals and conflicts, for example). Thus, according to Schank, we are able to respond to partial information by applying what we already know and filling in the gaps. We are, as with Minsky’s frames (1974), also able to make sense of new experiences by slotting them into existing knowledge prototypes or generalisations and/or modifying our existing scripts (e.g., going to a dentist is like going to the doctor), which will often contain pointers to other scripts in memory (Schank, 1980, pp. 255-256).
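The gap-filling described above can be illustrated with a deliberately simplified sketch. It is not Schank and Abelson's implementation; the Script class, the fill_gaps function and the event labels are all hypothetical, and the dentist script is reduced to a fixed sequence of events based on the Karin example:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Script:
    """A prototype for a commonplace event: a name and its usual order of events."""
    name: str
    events: tuple


# Hypothetical, much simplified script for a visit to the dentist.
DENTIST = Script(
    name="visit to the dentist",
    events=(
        "travel to dentist",
        "register at reception",
        "sit in waiting room",
        "enter consulting room",
        "receive anaesthetic",
        "tooth pulled",
        "receive bill",
        "pay and leave",
    ),
)


def fill_gaps(script, mentioned):
    """Reconstruct the story up to the last mentioned event.

    Each event is labelled 'stated' if the storyteller mentioned it and
    'inferred' if the listener supplies it from the script.
    """
    stated = set(mentioned)
    last = max(script.events.index(event) for event in mentioned)
    return [
        (event, "stated" if event in stated else "inferred")
        for event in script.events[: last + 1]
    ]


# The story only tells us that Karin went to the dentist, had a tooth pulled
# and (by arguing over it) received a bill.
story = ["travel to dentist", "tooth pulled", "receive bill"]
reconstruction = fill_gaps(DENTIST, story)
inferred = [event for event, status in reconstruction if status == "inferred"]
```

Here the listener supplies four events that were never mentioned (registering, waiting, entering the consulting room and the anaesthetic). Plans, goals and themes, and the pointers between scripts, are left out of the sketch entirely.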


One of the key ideas behind scripts is the concept of the mind as a computer and thus scripts are not considered to be “simply static data structures” in memory (Schank, 1980, p. 281), but rather more likely to be reconstructed when required from smaller pieces that are statically stored. As Schank explains, we would not expect duplicates of the same packet of information to be stored in multiple places and this is a more efficient way of using and accessing memory (storage efficiency being a central plank of the mind-as-computer conception). An imperfect memory is the pay-off for this efficiency:

People have imperfect memories because they are looking to make generalizations about experience that will serve as a source of useful predictions in understanding. That imperfect memory is a by-product of predictive understanding capabilities is a very important point for those of us working in computer modelling (Schank, 1980, p. 282).

This can also be said of Minsky’s frame theory (Minsky, 1974), but there are some problems with the original conception of scripts. One is that they are predicated on the idea of having a pre-existing script that is then modified when we encode new information and it is here that we ‘lose’ information. This naturally brings up the question of where the scripts come from in the first place, which is not adequately explained (Alba & Hasher, 1983). Later versions of the theory add goals, themes and plans, which are considered to be organised hierarchically – some created on-the-fly – by Memory Organisation Packets (Schank, 1980), a concept we do not need to expand on here.


Another theory of memory and cognition is that of the mental model, something that Norman examined in Mental Models (Norman, 1983) and expanded upon in The Design of Everyday Things (1998) (where he called them conceptual models). The latter work has had an enormous influence on the disciplines of interaction design (Saffer, 2006; Moggridge, 2006), usability (Nielsen, 1993; Krug, 2006), information architecture and user experience (Shedroff, 2001; Garrett, 2003). Mental models are predictive and thus evince the expectations of the user. They usually simulate systems and are open to sudden change in the light of new knowledge as well as ad-hoc adjustments, combinations of models and rationalisations to describe the associated behaviour (Kempton, 1986).


Mental models are often developed by people to explain the world around them, even in the absence of any technical knowledge. In his classic example, Kempton (1986) examined the “folk theories” people have of how the controls of their home heating systems affected the actual mechanism of the system. From interviews with a number of Michigan residents, Kempton inferred their “folk theories” using methods developed by Lakoff and Johnson (1980) and then compared these models with the users’ behaviour.


One of the chief reasons for choosing a home heating system as the theme for the experiment was that it gives rise to folk theories that are not in the domain of the expert. That is, instead of dismissing folk theories because they fail to match up to expert theories of physics, for example, it is instead important to look at the problems folk theories solve correctly and then examine the explanations behind them (Wiser & Carey, 1983).


Kempton’s (1986) interviewees had developed two folk theories of how thermostats worked – a “feedback model” and a “valve model” – sometimes combining elements from both. The feedback model is closest to how a thermostat actually works (though not entirely correct) and posits that the thermostat turns the furnace on and off based on the temperature of the room in comparison to the temperature set on the thermostat. Thus a user would set the setting at the desired end temperature and then leave it.


According to the valve model the thermostat controls the rate at which the furnace operates, much like an accelerator pedal on a car or a tap controlling the flow of water. In this model the user constantly (hourly, in most cases) adjusts the thermostat based on their desired temperature needs. For example, a user might turn it up very high when they first come in from the cold to warm the place up quickly “by getting a lot of air out fast” (Kempton, 1986, p. 83) and then turn it down again once he or she is warm enough.
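The difference between the two folk theories can be sketched as two small functions, each expressing what a holder of that theory believes the furnace does in response to the dial. This is an illustration of the users' mental models rather than of any real heating system; the function names, the dial range of 10 to 30 degrees and the output scale of 0 to 1 are assumptions made for the example:

```python
def feedback_furnace(room_temp, setting, max_output=1.0):
    """Feedback folk theory: the furnace runs at a single fixed rate and is
    switched on while the room is colder than the set temperature, else off."""
    return max_output if room_temp < setting else 0.0


def valve_furnace(room_temp, setting, lowest=10.0, highest=30.0, max_output=1.0):
    """Valve folk theory: the dial position sets the rate of heat output,
    like a tap, regardless of the current room temperature."""
    fraction = (setting - lowest) / (highest - lowest)
    return max_output * min(1.0, max(0.0, fraction))


cold_room = 12.0  # hypothetical room temperature on coming in from the cold

# Under the feedback theory, turning the dial higher does not heat any faster:
feedback_at_20 = feedback_furnace(cold_room, 20.0)
feedback_at_30 = feedback_furnace(cold_room, 30.0)

# Under the valve theory, a higher setting is believed to give more heat per hour:
valve_at_20 = valve_furnace(cold_room, 20.0)
valve_at_30 = valve_furnace(cold_room, 30.0)
```

A user holding the feedback theory therefore has no reason to touch the dial once it is set, whereas a user holding the valve theory has every reason to keep adjusting it, which matches the two patterns of behaviour described above.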


Although the feedback model is closer to the technically correct model, Kempton points out that the valve model not only functions perfectly well for the user, but also results in behaviour that, in fact, saves more energy. Kempton estimated that this approach, when used across American households, would save $5 billion per year – and that was back in 1986.


This is possibly the most important idea to transfer into interaction and interface design – what matters is not what the system’s model really is, but what users think it is. The designer does not usually have the luxury of sitting next to the user/interactor and explaining how the system actually works, so the affordances and feedback of the system should clearly signal the underlying system in a user-friendly interface. Of course there are situations where the underlying system is so complex that it is irrelevant to the needs of the user and the ‘magic black box’ is an adequate explanation (Google is a good example of this). There are also situations where the system is deliberately obscured in order to preserve its mystique (as is common with interactive artworks) or because the challenge to the user and the pleasure of the interaction is about trying to work out the rules of the system or game, for example.


Traditionally design has often aimed to hide the workings of machines, mechanics and gadgets in the name of minimalism or simply ‘good design’. In many cases, such as the oft-cited iPod, this has been highly profitable. Yet in contemporary culture, with deep concerns about climate change and energy consumption, making the invisible visible with the aid of ‘smart’ thermostats and power meters is an increasingly important aspect of interface design (Polaine, 2005b).


An overarching theory of memory and cognition is that of schemata, or schema theory, of which frames and scripts are essentially subsets. As we have seen, schemata are also knowledge structures that allow us to reduce the amount of cognitive processing in order to focus on, usually, more important tasks. Problematically, as Alba and Hasher (1983) note in their comprehensive review of the field, the term schema lacks a fixed definition. They propose a prototypical schema theory of memory, assuming the following functions of memory:

[S]election—a process that chooses only some of all incoming stimuli for representation; abstraction—a process that stores the meaning of a message without reference to the original syntactic and lexical content; interpretation—a process by which relevant prior knowledge is generated to aid comprehension; and integration—a process by which a single, holistic memory representation is formed from the products of the previous three operations (Alba & Hasher, 1983, p. 203).


They also argue that any theory of memory needs to account for three fundamental observations: accuracy, incompleteness, and distortion.


The central idea is that if there is no prior knowledge of a new domain of information encountered, there is no prior schema for it to be integrated into and thus retention and recall are difficult. This might be considered an explanation for why metaphor and analogy work so well, because they transfer knowledge of one domain into another.


The danger of this is that our selective encoding of data and our scripts and schema for patching it all back together again leave us with significant blind spots. The system could be described as ‘quick and dirty’ but fallible. Friedman (1979) showed that familiar objects in familiar scenes are paid far less attention than unexpected objects and that the removal of familiar objects was often not perceived at all by viewers, whereas changes to unexpected objects were usually noticed. In essence, we are involved in what Friedman calls a ‘gistification’ process:

Remembering the theme or gist of an event is tantamount to remembering the types of things which were most likely to have occurred; the undetected type-high to high changes and deletions attest to the fact that commonplace (i.e., thematic) episodic information was not especially noted or tagged (Friedman, 1979, p. 344).

The upshot of this is that if the things we think should be in a scene are in fact not there, we quite often invent the fact that we have seen them. It is this process that conjurers and sleight-of-hand magicians use to their enormous advantage (Brown, 2007), and it has implications for interaction design, where it is a double-edged sword. On the one hand, making something stand out helps the novice find the required button or control element; on the other, it can produce visually ugly results (witness the flashing multi-coloured text of many an early homepage) and clutter the visual field for experienced users who, we hope, do end up with a clear mental schema of how the interactive system works.


Alba and Hasher, however, point out several issues and note that whilst schema theory is very good at explaining distortions (incorrect recall), those situations are not very common. They argue that schema theory is not so strong when accounting for accurate memory of past events (Alba & Hasher, 1983). (That the distortions are not so common perhaps explains why conjurers get away with so much misdirection. Presumably if the distortions were more commonplace, we would be more aware of them and thus less likely to create them in the first place.)


Alba and Hasher suggest that Johnson and Raye’s theory of “reality monitoring” (Johnson & Raye, 1981) provides a useful explanation for this discrepancy and “confusion between memories for thoughts and memories for perception” (Ibid., p. 67). That is not to say that schema theory is without merit. Many parts of schema theory still remain pertinent to theories of memory and cognition, but, as Alba and Hasher explain, “more is stored in memory than any version of a schema-based selective encoding mechanism can account for” (Alba & Hasher, 1983, p. 222). Indeed, it seems that we store much more detail in memory than previously thought and in the right circumstances we are able to recall this detail without confusion.

People rarely make synonym substitution errors, are better able to identify sentences when they contain their original words rather than synonyms, and show greater interference in sentence verification when stimulus sentences all contain a common word than when they contain different synonyms of the word (Alba & Hasher, 1983, p. 215).

Experimental psychologist Steven Pinker, whose research focuses on language and cognition, offers some clues as to why this might be the case. In The Stuff of Thought (2008) Pinker picks apart the semantics of language in order to try to gain insight into the human mind and what makes us tick. He examines verbs in particular and how semantically ‘fussy’ they are, explaining why many synonym substitutions should logically make sense, but do not, or at least grate when we hear them. Pinker explains:

Since pour says something about the motion, it can be used in the construction that is about motion; hence we can say pour water into the glass. But pour doesn’t care about how or where the liquid ends up. You can pour water into a glass, all over the floor, or out the window of an airplane, dispersing it into a mist. Nothing predictable happens to the destination of a poured liquid, and so the verb is inconsistent with a construction that specifies how the state of a container has been changed. And thus we can’t say she poured the glass with water (Pinker, 2008, pp. 49-50).

In contrast, he continues, the verb fill is “all about the state of the container: no fullness, no filling. But fill is apathetic about how the container became full” (Pinker, 2008, p. 50). Fill is “the syntactic mirror image of pour” (Ibid., p. 50). Thus, we can say fill the glass with water but not fill water into the glass. Pinker uses Talmy’s concept of “force-dynamics” in language (Talmy, 1988) to explain why this should be. Briefly, force-dynamics suggest that language taps into intrinsic or naïve notions of causation. It is these that allow us to not conflate the syntactic meanings of pour and fill and may well explain why people “are better able to identify sentences when they contain their original words rather than synonyms” (Alba & Hasher, 1983, p. 215).


Pinker also examines metaphors and sets up Lakoff as something of a straw man, even whilst agreeing with most of what he has to say (this is rather ironic since Pinker explicitly makes reference to the straw man tactic in an earlier chapter). Although Pinker brings a perhaps needed cold shower to some of Lakoff’s (and Johnson’s) more enthusiastic claims, his summary of Lakoff’s position as being pure relativism is unfounded, certainly in relation to the use of scientific data. Indeed, Lakoff and Johnson make a point of calling for “a philosophy informed by an ongoing critical engagement with the best empirical science available” (Lakoff & Johnson, 1999, p. 552). Their entire position is based on bringing the findings of neuroscience and cognitive science together. The tool for this, in their view, is the metaphor:

Conceptual metaphors permit the use of sensorimotor inference for abstract conceptualization and reason. This is the mechanism by which abstract reason is embodied.

 [...] By allowing us to project beyond our basic-level experience, conceptual metaphor makes possible science, philosophy, and all other forms of abstract theoretical reasoning (Lakoff & Johnson, 1999, p. 556).

Even the slightly weaker dismissal of Lakoff and Johnson’s position as “species relativism” (Pinker, 2008, p. 247) (his italics), that is “our knowledge is nothing but a tool suited to the interests and bodies of Homo sapiens” (Ibid., p. 247), holds little water. For someone who has written so much about language and its insights into the human mind, Pinker fails to see his own “species relativism”. How could human knowledge be anything other than a suite of tools suited to our interests? Certainly Pinker does not support the idea of an innate dictionary of concepts, nor the idea that language is somehow God-given. Given that humans are the only species known to possess the tools of language as we know it and the discussion is about language as an insight into the human mind, such “species relativism” would seem difficult to avoid.


As it turns out, Pinker is more in agreement than disagreement with Lakoff and Johnson. The force-dynamics model suggests that “conceptual metaphors can be learned and used only if they are analyzed into more abstract elements like ‘cause,’ ‘goal,’ and ‘change,’ which make up the real currency of thought,” argues Pinker (Pinker, 2008, p. 259). All of those are terms that sit readily within a phenomenological account of the embodied mind.


Later Pinker turns to Boroditsky’s (Boroditsky, 2000; Boroditsky & Ramscar, 2002) work on time, spatial metaphors and the mind. Her work showed that the physical experience of motion, and not just the words used in metaphors for motion, can influence the interpretations people give of ambiguous metaphorical expressions of time, such as to bring a meeting forwards. Depending on their primed physical condition they will either interpret this as being set to a later date or an earlier date (Boroditsky & Ramscar, 2002; Pinker, 2008, p. 267). Boroditsky and Ramscar (2002) do temper the strong embodied view of Lakoff and Johnson somewhat, suggesting that “abstract thinking is built on representations of more experience-based domains, and not necessarily on the physical experience itself” (Boroditsky & Ramscar, 2002, p. 185). It is, however, difficult to see how this does not fold in on itself, for if people understand an abstraction like time through the concrete experience of space that is, in turn, experienced through the body, time must be understood in terms of the experience of the body in space.

 The embodiment of AI

By now, the astute observer of citation dates will have noticed that the literature on schemas, frames and scripts appears to fade out around the mid-80s. When we recall that a large driver of Schank’s (Schank & Abelson, 1977; Schank, 1980) work on scripts, language and memory was the development of natural language systems for Artificial Intelligence (AI), we can see why this theoretical framework took a 90-degree turn.


In The Playful World Mark Pesce (2000) recounts how the now MIT Professor of Robotics, Rodney Brooks, joined MIT in 1984 as a junior faculty member with very different ideas about AI. Brooks held strong convictions about the direction AI should take (Brooks, 1986) and was frustrated by the slow progress that AI had made over the preceding 25 years (Brooks, 1992). He also observed the funding crisis that faced the discipline and forced it into becoming relevant, moving on from the ‘toy’ block-worlds that previous approaches had used to test out their systems. Prior to this, approaches to AI had largely been about formalising structures of cognition and perception as well as language (Minsky, 1969). AI ended up using a “brute force” (Pesce, 2000, p. 47) approach to solving rule-based problems, such as playing chess, by simply computing every possible move – an approach that was certainly artificial, but not very intelligent in terms of the way humans think. Brooks instead argued for a vision of “intelligence from the bottom up” (Brooks, 1986) and putting simple layers of intelligence inside simple robots was the key. Pesce (2000) sums up Brooks’s approach thus:

Brooks thought that embodiment was an essential detail in artificial intelligence – not an afterthought, but the main event. Without experience in the real world, Allen [Brooks’s first robot at MIT] couldn’t learn (Pesce, 2000, p. 50).

Allen’s behaviour was to avoid static and dynamic objects. The robot had no pre-programmed map of the space, but when approached it would “scurry away, avoiding collisions as it went” (Brooks, 1991a, p. 7). Programmed with relatively simple ‘layers’ of intelligence, it also had a layer that sent it off to randomly wander about every 10 seconds or so. Combined with the obstacle avoidance, Allen appeared to be an artificial intelligence robot that did not seem to think that much at all – this was the exact opposite of previous approaches to AI, but much more successful. From these simple, embodied systems, a form of intelligence emerged.
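The layering described here can be sketched as a simple priority scheme. The code below is an illustrative assumption only and not Brooks's implementation: his robots ran on physical hardware with real sensors, whereas this sketch reduces the idea to two hypothetical functions (avoid_layer and wander_layer) and an arbitration step in which avoidance always wins. The distances, the tick period and the action names are invented for the example.

```python
import random


def avoid_layer(distance_to_obstacle, safe_distance=1.0):
    """Avoidance layer: back away when something is too close."""
    if distance_to_obstacle < safe_distance:
        return "retreat"
    return None  # nothing to contribute on this tick


def wander_layer(tick, rng, period=10):
    """Wander layer: pick a new random heading every `period` ticks."""
    if tick % period == 0:
        return "turn " + rng.choice(["left", "right"])
    return None


def control_step(tick, distance_to_obstacle, rng):
    """Arbitrate between layers: avoidance first, then wandering."""
    proposals = (avoid_layer(distance_to_obstacle), wander_layer(tick, rng))
    for action in proposals:
        if action is not None:
            return action
    return "forward"  # default when no layer has an opinion


# A short run over hypothetical sensor readings (distance per tick).
rng = random.Random(0)  # seeded so that the run is repeatable
readings = [5.0, 5.0, 0.4, 5.0]
actions = [control_step(tick, d, rng) for tick, d in enumerate(readings)]
print(actions)
```

Neither function holds a map or a plan, yet a sequence of calls to control_step can yield movement that looks purposeful to an observer, which is the point Brooks makes about behaviour emerging from simple layers.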


To achieve this Brooks developed the physical grounding hypothesis. Dispensing with complicated symbolic representations of the world (and typed input) a system “has to express all its goals and desires as physical action, and must extract all its knowledge from physical sensors” (Brooks, 1991a, p. 5). When describing the ease of debugging, Brooks observes that “[T]he world is its own best model (as usual) [...] one can see at a glance how [the system] is interacting. It is right before your eyes.” (Ibid., p. 12). The resulting array of robots (Brooks has since built many) all demonstrated an interesting feature of growing intelligence:

A key thing to note with these robots is the ways in which seemingly goal-directed behavior emerges from the interactions of simpler non goal-directed behaviors (Brooks, 1991a, p. 7).

So, AI research, and thus the research into the structures of the mind, took a turn that pushed it towards embodiment and the idea of emergence of complex intelligence from simple concepts grounded in the physical world. It should be clear after this detour through cognitive science that the obvious place for this line of research to head is an enquiry into the understanding of the world through physical experience. For AI concerned with language, this also requires understanding how language is framed through metaphors of the physical world. This research, coupled with emergent intelligence and neural nets based on the findings of neuroscience, is exactly where we encounter Lakoff once more in his collaborations with Jerome Feldman and his students, particularly David Bailey and Srini Narayanan (Bailey, Feldman, Narayanan, & Lakoff, 1997).


Without wanting to delve even deeper into a detailed description of this research, the overview in Philosophy in the Flesh (Lakoff & Johnson, 1999, p. 576) will suffice. Briefly, the doctoral research of Bailey (1997) and Narayanan (1997) entailed an accurate computerized model of a human body (called Jack) – complete with muscles, joints and correct physical movements – being made to produce a series of hand movements with similar, but semantically distinct flavours such as slide, slap, push, pull, press, hit, yank, poke and more (Bailey et al., 1997, p. 135). Videos of these movements were then labelled by informants with verbs. The system then had to learn the verbs “so that it could both (1) recognize an action and name it correctly and (2) perform the correct action, given the verb” (Lakoff & Johnson, 1999, p. 578). The resulting system was around 80% accurate and worked across a variety of different languages. If this seems rather mundane, Lakoff and Johnson explain the philosophical importance:

The ‘conceptual structure’ is the system for controlling the body [...] In short, the fundamental conceptual roles for making the right linguistic distinctions among the verbs are played by features of the motor system (Lakoff & Johnson, 1999, p. 578).

In other words, the same neural structure is being used to understand the concept of the verb as is being used to actually act it out. At the neural level there is no difference, leading Lakoff and Johnson to conclude, “the same neural circuitry that can move the body can be used to reason with” (Lakoff & Johnson, 1999, p. 583).


Whether Lakoff and Johnson arrive at a complete theory of how the mind conceives of the world, or whether neuroscience and robotics get there first, for the purposes of interaction and interaction design it should by now be clear that embodiment is a key component. Not only that, but our necessarily embodied impression of the world is lossy, selective and constantly changing:

Humans construct an understanding of the world that is very different from the analogue flow of sensation the world presents to them. They package their experience into objects and events. They assemble these objects and events into propositions, which they take to be characterizations of real and possible worlds. The characterizations are highly schematic: they pick out some aspects of a situation and ignore others, allowing the same situation to be construed in multiple ways (Pinker, 2008, p. 428).

 Pay no attention to that man behind the curtain

In his analysis of metaphor and critique of Lakoff and Johnson’s (1980; 1999) position, Pinker (2008) shows that the ‘metaphor part’ of metaphors can be subsumed into our everyday experience. That is, we no longer perceive the underlying metaphor for concepts that we regularly use because they become conventional and we no longer take account of the metaphor. It is as if we see through the metaphor, like the wipers on a rainy windscreen, to the underlying concept. In interface terms, we “get used to it” and we “get” the schema that the metaphor explains.


This goes some way to explaining the issues of interface metaphors being half ‘real’ and half metaphorical and why Apple’s Exposé was able to break the desktop metaphor without it jarring. Because operating a computer is both physical and virtual the process gets blurred – at some point in the usage of a system that retains its metaphorical conventions fairly rigorously the ‘desktop’, with its ‘files’ and ‘folders’, ceases to be a metaphor for its users. It is as though the willing suspension of disbelief is not just suspended, but dispensed with. The desktop really is the desktop and our files really are our files and not just metaphorical representations – something that anyone who has experienced a hard drive crash and lost all their data will appreciate.


On the one hand, the conventions keep the metaphor intact precisely so we can ignore it: when it is even slightly tweaked it can cause a jarring effect, as is often the case after an upgrade to the latest operating system on a computer. On the other hand, because the metaphor has been so successful that we no longer think about it as a metaphor it is possible to add to or extend the system without it feeling like the metaphor is broken, as Apple did with the Exposé feature discussed previously.


Dag Svanæs in his PhD thesis on Understanding Interactivity (2000) invokes Winograd and Flores’s (1986) interpretation of Heidegger’s (1962) hammering in a nail example. During the time one is focussed on a task, the tool used for the task ceases to exist. Whilst the nail is an object for the carpenter, the hammer becomes part of the being of him as the actor (in our case, interactor) and is “invisible” or “transparent” (Svanæs, 2000, p. 45). So it would seem with metaphorical interfaces providing they conform to conventions to which we have become habituated. As Svanæs explains, Winograd and Flores use Heidegger’s terminology of breakdown and readiness-to-hand to describe the fact that we only notice the tool – in our case the interface – when it is either not ready-to-hand or is broken.


Such is the nature of design and interaction, that when done well it feels ‘intuitive’, but when it breaks, crashes or is badly designed, the metaphor becomes all too apparent and we are completely aware of it again. In Csikszentmihalyi’s (1988; 1990) terms the flow is broken and in play terms the rules of the game or the boundaries of the play space have been violated.


This ability to subsume and see through the metaphor may also explain why an older interface for the same task that used to feel like an amazing breakthrough now feels clumsy and out of date. If the newer interface enhances the metaphor or replaces it with a new one that is better, it is compelling enough to erase our X-ray vision of the previous one, so that when we return to it we no longer see through to the concept but see the metaphor for what it is. We see the old man behind the curtain and not the grand Wizard of Oz that we thought our interface was after all (Baum & Granger, 1978).

 Are we automata?

Another explanation for “getting used” to an interface by subsuming the metaphor and internalising the higher-level concept is the auto-motive model (Bargh & Barndollar, 1996). The central premise of the auto-motive model is an argument “for the existence of unconsciously generated motivations and automatic action – for a conception of the unconscious as an implicit repository of a person’s long-term experience and history of past choices” (Ibid., p. 475). There are two aspects to this that Bargh and Barndollar (1996) point out, one fairly obvious from anecdotal experience, the other perhaps less so.


The classic example of auto-motive action, as mentioned previously, is driving a car or typing, where “[w]ith experience, these processes come to operate autonomously; once started in motion, they interact with the complex environment as automated strategies” (Bargh & Barndollar, 1996, p. 460). This allows the car driver to carry on a conversation or daydream whilst driving and yet still manage to respond to a large range of stimuli. The word autonomously is important here because it is not that we are simply following a set of identical procedures like an assembly line worker, but that a whole “automated strategy” (Ibid., p. 461) is at work responding to changes in things such as other vehicles on the road, weather conditions, etc. – the drive home, for example, is never exactly the same each time. (In fact, it’s likely that an assembly line worker is also responding to minute changes.) The strategy, however, is carried out unconsciously once it has been consciously kick-started (by the decision to drive home, for example) and we tend to only become aware of it when something goes wrong – the car in front suddenly swerving without warning or some such cause of an accident.


The less intuitively obvious finding of Bargh and Barndollar (1996) is that,

behavioural as well as cognitive goals can be activated directly by the environment without conscious choice or awareness of the activation [...] and that the activated goals operate autonomously, bypassing the need for any conscious selection or choice of them, and even producing outcomes different from what the individual would choose if the goal were not primed (Bargh & Barndollar, 1996, p. 475 – my italics).

Additionally, Bargh and Barndollar note that the unconscious has generally been given a “bad rap” (1996, p. 49) because

in order to demonstrate the existence of nonconscious or unconscious phenomena, researchers have had to demonstrate irrationality – judgements and behaviour that cannot be explained in terms of the conscious goals and intentions of the person (Bargh & Barndollar, 1996, p. 458).

This issue of irrationality is, incidentally, a problem in competing theories of play, which all tend to try and find a rational reason for its existence with confusing results (Sutton-Smith, 1997).


This has at least two implications for interactivity and interaction design. The first is that it adds to the evidence that interface and interaction metaphors can become integrated into a series of unconscious behaviours, assumptions, actions and reactions and that these then affect subsequent interactions consciously or unconsciously. Thus, if one is designing an interaction with the idea of inducing a flow state in the interactor, or even just a suspension of disbelief in order to engage with the experience more deeply, the more those unconscious motivations and goals can be primed and activated by the trappings of the interaction, the more they can become transparent to the experience.


The second implication is that an interactive experience might be so governed by conventions that it becomes a struggle on the part of the designer to overcome them. At an artistic level, where a novel interaction or interface might be the entire experience or theme of the piece, the challenge is to pull interactors out of their usual autonomous interaction mode and into the new one, which may or may not be a jarring experience depending on the goal of the artist. On an interface design level, it may be (and often is) difficult to override the conventions that an interactor has internalised as part of their interaction process, leading them to complain that something “should” be designed one way or another, even if the designs are comparable or the new mode of interaction is better than the old.


Bargh and Barndollar’s (1996) studies bring us to another element pertinent to interaction: priming, a technique commonly used in experimental psychology. Priming does what its name suggests: it primes the subject of an experiment in order to influence his or her behaviour and cognitive processes. This allows experimenters to compare primed and non-primed states and isolate areas of cognition and behaviour that they wish to study. An interesting example of this is affective mood states and their influence on strategies of information processing (Schwarz & Bohner, 1996).

 Not today, I’m not in the mood

Although Schwarz and Bohner (1996), in their examination of the field, suggest that the impact of mood is probably more complex (also triggering people’s sense of self-achievement and satisfaction), the overall evidence is that people in bad moods tend towards procedural and analytic reasoning:

Moreover, people may be unlikely to take risks in a situation that is already considered problematic and gives rise to an increased control motivation. As a result, they may avoid simple heuristics and the playful exploration of novel solutions under negative affect, relying instead on more systematic, effortful, and detail-oriented processing strategies (Schwarz & Bohner, 1996, p. 121).

By contrast, those in good or elated moods tend towards more heuristic and “playful exploration of novel solutions, which fosters creative problem solving during good moods” (Ibid., p. 123). The approaches also require less cognitive effort, especially if the general situation is considered safe:

Thus, simple heuristics may be preferred to more effortful, detail-oriented judgemental strategies; new procedures and possibilities may be explored; and unusual, creative associations may be elaborated. Moreover, diverse types of procedural knowledge may be equally accessible (given that no specific procedure has been activated to deal with a problematic situation), thus further increasing the potential for unusual solutions (Ibid., p. 121).

Norman (2005) uses the example of a crowd of people dying in a burning theatre because the exit doors open inwards instead of outwards (and this is why all emergency exits now open outwards). In a normal, non-stressed situation, after pushing the door and finding it does not open, a person would try to pull instead and find it opened inwards. In a high-stress panic situation, such as escaping a fire, a person (also fuelled by the crowd mentality) will push on a door and, finding that it does not open, panic, thinking that it is locked, and keep pushing to no avail.


Norman explains a study in which the same functional interface for an automated teller machine was left bare and basic compared to one that had been designed to look and feel more pleasant. Although the actual functionality of the interface was the same, users reported that the more pleasantly designed interface was easier to use. The conclusion drawn was that the designed interface primed users to seek out alternative approaches to using it. Those that used the functional but non-designed interface simply found any errors or lack of understanding of the interface frustrating and irritating, which focussed them even more on a procedural process; this meant they simply repeated the same approach over and over. (See Ashby, Isen, & Turken, 1999, for more detail on the neuropsychological theory of positive affect and its influence on cognition.)


This evidence would appear to confirm what most people working in creative professions have known all along, that an environment that is playful, pleasantly designed and relaxing leads to the ability to develop more creative ideas. It is a pity that the work-play dichotomy set up by the Industrial Revolution (Kane, 2004) has meant that most office and educational spaces are drab, soulless environments that are supposedly devoted to ‘serious work’ and that, by and large, only those working in ‘creative’ professions are given licence to work in playfully designed spaces in order to do ‘serious play’ (Schrage, 2000). In the contemporary cultural and economic environment that cannot continue in the same mode. In the face of climate change, novel, creative solutions are paramount.


For interaction design it shows that interfaces that have an element of playfulness to them – or at least put interactors in a good mood through their design – stand a much higher chance of being played with and used, even if they might be functionally more challenging than non- or badly-designed simpler interfaces. One might make an analogy to evolution here – the more attractive the markings, the more likely the chance of a sexual interaction and the passing on of genes. In the case of situations such as bees and flowers, pollination is a by-product of the attraction of the bees to the flowers’ markings and nectar. Indeed, Schwarz and Bohner (1996) refer to the playful exploration of creative solutions due to elated moods as a “flip side” (Ibid., p. 123), which might tell us something about the worldview of behavioural psychologists.


Of course, priming can also work on a social behavioural level, affecting not only our perception of the environment (including other people) but also our behaviour. This is known as the “common-coding hypothesis” (Prinz, 1990). It is what causes us to behave in the same way as a trait that we are looking out for in other people, because a behavioural schema is activated – e.g., perceiving hostility causes us to behave in a hostile way – according to Carver et al. (1983). Could the same process apply to interactions, or even entire brands of operating systems?

The Mojave Experiment

The Apple versus Microsoft Windows battle is well-known to almost anyone who uses computers extensively. Both camps have their passionate users and a quick survey of any computer-related forums will produce an abundance of forum flame battles arguing the various merits and failings of both sides. Up until the recent past – arguably until Apple’s resurgence as a major player in the consumer electronics market due to the success of the iPod and iPhone – the culture of the two companies has been played out across their computer products. Whilst Apple courted the creative community and set great store by the design of its products and operating system, Microsoft focussed on the larger, more profitable and amorphous ‘business’ market. Setting aside the fact that the creative industries are, in fact, also profitable businesses (Barnes-Powell, 2006), we see here the typical work-ethic versus play-ethic dialectic (Kane, 2004).


Microsoft’s approach to much of its software and interface design has been task-oriented, functional but not beautiful, with work prioritised. Apple, by contrast, has focused on aesthetics – sometimes at the cost of functionality – and lifestyle more than ‘business’ in the sense of mundane tasks. Of course, the reality is not as clear cut as that, but it is a relatively sound generalisation of the two ‘camps’ up until relatively recently. Microsoft has also developed their media centres, Xbox and Zune MP3 player, all firmly in the vein of entertainment, but in terms of their operating system, their core focus has been business customers.


In the past few years Apple’s culture of play has permeated its products and started to make inroads into the Windows world. A belief that aesthetics counts, combined with an operating system in OS X that they managed to bring to market before Microsoft could finish Windows Vista, enabled Apple to paint Windows – and PCs in general – as the boring grey suits of the world, whilst Apple was the good-looking fun-lover. This has been skillfully and literally played out by the “‘Hi, I’m a Mac’ ... ‘And I’m a PC’” advertising campaign from Apple. Through this campaign, in part, Apple managed to persuade a large number of Windows users to ‘switch’ to the Apple operating system, OS X, which of course meant them also buying Apple hardware.


The iPod, with its massive success across all operating system users, was in some ways the Trojan horse. It enabled Windows users to get a taste of Apple’s style and led them into Apple’s world. Apple’s ‘world’ is, some argue, very hermetic, and it is not as easy to get ‘under the hood’ with an OS X machine as it is with a Windows PC. On the other hand, plenty of people are happy for things to ‘just work’ without having to delve into the arcane workings of the operating system. Apple managed to bridge these two worlds by building OS X on top of the UNIX platform, which allows those with the know-how to dig around with a command line terminal and the rest of their users to use a slick operating system.


A large part of OS X’s design – indeed, a considerable share of computing resources – is dedicated to what would seem frivolous to a die-hard functionalist: bouncing icons in the Dock, swish animations of windows opening and closing, slick graphic and interface effects such as Exposé. All of these require processing which, in the earlier days of computing when the computer was regarded as a tool, would have consumed precious computational resources that needed to be dedicated to the task at hand. Nicholas Negroponte (2006) has argued that today’s fastest laptop is just as slow running, say, a word-processing application as a laptop from 10 years ago because the resources used by the application have increased in line with the speed of the machine, so that the net speed increase is negligible. However, the experience of the interactions offered by the operating system is very different – a well designed interface can feel more luxurious, as it should given the extra computing resources commanded to provide glitz and glamour. Here we see the culture of play and playfulness taking hold. It is not in any way functionally necessary to have semi-transparent widgets or animations of windows sliding, genie-like, back to folders, but it does add to the playfulness of the experience. Taken together, the effect of all these design niceties is generally satisfying and it is fair to say that Apple fans have, in general, a much stronger emotional bond with the Apple brand than Windows users, who might be equally passionate about Windows and Microsoft as a company.


Apple’s CEO, Steve Jobs, once famously commented that the biggest problem with Microsoft was that “they have absolutely no taste [...] and I don’t mean that in a small way, I mean that in a big way” (Cringely et al., 1996). Whilst this sounds (and is) elitist, it has driven Apple’s design philosophy meaning that regardless of the extra expense of their machines, making their products look, behave and function beautifully was highly important and has paid off handsomely.


Yet it is not that Microsoft have not been successful – they remain one of the most successful corporations in the world (Microsoft Corp., 2008) – but when Microsoft have attempted to compete in the same market as Apple and on the same terms, the results have been mixed. Windows Vista, their answer to OS X (although they had been developing it for longer), has not had the warmest response (Pogue, 2006; McAllister, 2008; Stross, 2008). In the development of Vista, however, Microsoft put into practice a great deal that it had learned through users of its previous operating systems and also ‘borrowed’ from Apple. Vista, like OS X, has glassy and glossy see-through interface elements and ‘widgets’; it also has many visual interface effects as part of its ‘Aero’ look. However, so many people disliked using it that many requested companies like Dell to ‘downgrade’ and pre-install Windows XP, the previous version of Windows, instead (Dell Inc., 2008).


It is likely – and one would imagine this is why millions of advertising dollars have been spent by Apple and Microsoft – that behavioural priming is playing a part. Apple’s advertising campaign has in part described OS X’s ease of use compared to the hassles of using Vista. This has, of course, neatly glossed over any problems that OS X has and placed Microsoft in a position where they cannot simply reply with the same tactic, which could seem a weak response and backfire (especially amid accusations that Vista has copied many features from OS X). So, all Microsoft has really been able to do is to sing the praises of Vista. Unfortunately, fairly or unfairly, this has not worked very well and, indeed, Microsoft have themselves admitted as much in a leaked internal e-mail from CEO Steve Ballmer (Ballmer, 2008).


In 2008 Microsoft launched an ad campaign to try and combat this negative priming, called The Mojave Experiment (Microsoft Corporation, 2008). Much like a social outcast who is unable to reflect on their own behaviour, the premise of The Mojave Experiment was that Vista users – or potential Vista users – would really love Vista if only they got to know the real Vista. In other words, people were disliking Vista not because of its problems, but because everyone else was saying how bad it was and thus priming them to dislike it.


The “method” of The Mojave Experiment was to ask a group of Vista users what they thought of Vista and videotape the interviews. Then, when those users had aired all their grievances, they were shown the ‘new’ Windows operating system being developed by Microsoft, ostensibly called Mojave. Having been taken through all the great features of this new operating system, the interviewees were asked what they thought and, naturally, they were completely enthusiastic about it. One woman changed her score out of 10 from zero to 10 (Microsoft Corporation, 2008). It was then revealed that the operating system that they had just been shown was not, in fact, a new system called Mojave but Windows Vista – an apparently clear demonstration that without the negative priming, people actually find Vista great, not awful as they had previously stated.


In this context, it is interesting to see a company like Microsoft turn to experimental psychology as a marketing campaign, because it plainly recognises the role that psychology plays in the acceptance, enjoyment and use of an operating system. Unfortunately they chose to pay attention to user psychology from the wrong side of the screen. Through The Mojave Experiment Microsoft essentially claims there is nothing wrong with the design of the operating system (something they could change), but that their users are too stupid to perceive its virtues (something they cannot change). This was something that many prominent bloggers immediately picked up on (Miller, 2008; Wilcox, 2008), along with the ‘bad science’ (Shipley, 2008) of the experiment itself, which was conducted in a controlled environment in terms of the computer used and the set-up as well as only offering a short experience of the ‘new’ operating system, when most of the major complaints were about long-term usage issues:

My point is that the problems that Vista has become famous for are not the kinds of problems you encounter in a few minutes of playing with it in a controlled environment. Vista is known for people initially liking it, then after a while discovering it’s not working for them, and “downgrading” to XP. This study has told us exactly what we already knew: that, initially, people like Vista (Shipley, 2008).

It also highlights the fact that both good functionality and careful interaction design need to converge for an interface to really be successful; either one without the other is an unsatisfying experience (this is an issue explored later when discussing ‘delivering the promise’).


Before we leave the discussion of metaphor, cognition and behaviour, let us return to some of the contention surrounding the common-coding hypothesis and the activation of behavioural schemas. In Carver et al.’s (1983) experiment, the researchers primed some subjects for hostility in an apparently unrelated first experiment. In a later experiment, when these subjects were asked to shock a ‘learner’, they delivered greater and longer shocks.


Bargh (1996) discusses this issue with regard to the priming of automatic motivation in his experiments and explicitly sets out to separate the behavioural priming from the motivational priming (motivational priming, incidentally, continues and even improves after a delay, whereas the behavioural priming ‘wears off’ reasonably quickly). One of the questions he raises, and suggests is a “controversy that has raged for some time” (Bargh & Barndollar, 1996, p. 470), is this:

Are the same mental structures involved in perceiving the behaviour of others and in producing that same behaviour oneself? [...] The question is whether perceptual representations and action representations are separate and distinct, requiring some kind of translation of information from one code to the other, or whether the same single code is used both to perceive and to engage in that type of behaviour [...] whether perception and behaviour share a common coding system at the symbolic level (Ibid., p. 470 – my italics).

This is, of course, exactly the issue that Lakoff and Johnson have been tackling for over twenty years and have come to answer with a definitive yes, based on their own research (Lakoff & Johnson, 1980; Lakoff & Johnson, 1999; Johnson & Lakoff, 2002) and that of their collaborators and supporters (Regier, 1996; Bailey et al., 1997; Narayanan, 1997; Rohrer, 2001). An important aspect of embodied realism is that it rejects “the rationalist–empiricist dichotomy in favor of the evidence indicating a third alternative that allows both inborn and learned aspects of our conceptual systems, as well as many that cannot clearly be called either inborn or learned” (Johnson & Lakoff, 2002, p. 248). On the view of embodied realism, which regards all experience as embodied and argues that there is “no ultimate separation of mind and body” (Ibid., p. 249), there can be no other answer than that perception and behaviour do share a common mental structure, although this is at the neurological level, not a symbolic level as argued by a linguistic interpretation (Pinker, 2008).


Interestingly, Carver (1996) gives us a good example of this view of mind-body separation when arguing that he sees all behaviour as goal-directed, “even though the goals are sometimes so small scale and concrete that they are of interest to motor control researchers [...] rather than to personality and social psychologists” (Ibid., pp. 645–646). From the embodied realism point-of-view, this makes no sense at all – the motor control is the personality and the personality is the motor-control mechanism.


If we accept this view of embodied realism and the evidence supporting it, it means that conceptual metaphors become central to interactivity and they allow us to escape, or rather integrate, both Manovich’s purely psychological account of interactivity (Manovich, 2001) and the physical, playful aspect of interaction:

Experience is always an interactive process, involving neural and physiological constraints from the organism as well as characteristic affordances from the environment and other people for creatures with our types of bodies and brains (Johnson & Lakoff, 2002, p. 248).

Metaphor and physical interaction are central aspects of play experiences and thus they become crucial to the analysis of interactivity we will explore in the following chapters. They also form two of the three overlapping spheres that make up the interactive experience as it is being framed here. The third sphere is play.

PART II - Developing the Principles of Interactivity

CHAPTER 5 – The State of Play

We all play occasionally, and we all know what playing feels like. But when it comes to making theoretical statements about what play is, we fall into silliness (Sutton-Smith, 1997, p. 1).

As Sutton-Smith argues above, there are multiple rhetorics, theories and explanations for play, none of which is without ambiguity. Rather like love and pornography, we all know play when we see it but it is very difficult to precisely define. Most of us have some understanding of when play is no longer playful but something else, like a professional competition or something ‘serious’. What of games? Are they playful? Yes, sometimes, but not always. Competition? Yes, sometimes, but not always. Without such a definition how might we set about creating playful experiences for others? How is it that we seem to be able to recognise an ‘invitation to play’ (Pesce, 1996) almost instinctively and what does this mean for artists and designers working in this area? What can guide the creation of their work? Let us examine the state of play.

Homo Ludens

In 1938 the rector of the University of Leyden, Johan Huizinga, published an attempt to provide a comprehensive definition of the “play element in culture” in Homo Ludens (Huizinga, 1955) drawing greatly upon the work of Karl Groos (Groos, 1898; Groos, 1901) and influenced by Piaget (1962, first published 1923). It is also perhaps the most un-playful and dry account of the topic, given the subject matter, yet it remains a spring-board for much contemporary play theory (Rodriguez, 2006) and is important in that it took play seriously, if only to define it later as not-serious.


Huizinga’s rhetoric of play as progress and a civilising force reflects the mindset of the era in which he was writing. For him, the instinctive “need to decorate things [...] cannot be doubted [to be] a play-function of low order akin to the child’s playing in the first years of life” (Huizinga, 1955, p. 168). Huizinga writes that his “whole point is to show that genuine, pure play is one of the main bases of civilisation” (Ibid., p. 5). But, as Rodriguez (2006) points out in his analysis of Homo Ludens, for Huizinga “culture is playful from the start” and thus play is extremely serious for Huizinga who cites several examples (such as players risking their lives) to show that seriousness and play are not mutually exclusive.


Huizinga’s analysis of play-forms in art is largely confined to what he considers the high-arts, such as ballet, poetry and music. Although he comments on the “almost instinctive, spontaneous need to decorate things” in the ‘plastic arts’ (Huizinga, 1955, p. 168), his analysis of both the plastic arts and fine arts is confused and certainly out-dated. On the one hand, play is an innate driving force to “decorate things”, yet on the other “it is impossible to assume that the aimless meanderings of the hand could ever produce such a thing as style” (Ibid., p. 168). One would imagine that Jackson Pollock, to name but one abstract expressionist, would have taken exception to this.


The plastic arts are, for Huizinga, devoid of play (apart from the moment when an object is “conceived”) because of the skill and craft required to bring it into being. This process, he argues, becomes such a serious and responsible task to the “man who is commissioned [that] any idea of play is out of place” (Ibid., p. 167). Yet countless designers and artisans (digital or otherwise) talk of their process and work as being playful (Schrage, 2000; Kelley & Littman, 2001; Cameron, 2004; Kane, 2004; Polaine, 2004d; Polaine, 2004e; Moggridge, 2006).


Even more of a non-sequitur is Huizinga’s argument comparing the plastic arts to architecture and painting:

To derive art wholly from some hypothetical ‘play-instinct’ obliges us to do the same for architecture and painting. It seems preposterous to ascribe the cave-paintings of Altamira, for instance, to mere doodling – which is what it amounts to if they are ascribed to the ‘play-instinct’. As to architecture the hypothesis is flatly absurd, because there the aesthetic impulse is far from being the dominant one, as the constructions of bees and beavers clearly prove (Huizinga, 1955, p. 168).

The analysis ties itself in knots in at least four ways. Firstly, the definition of art is very narrow and deeply rooted in notions of art as a high-water mark of civilisation. This, of course, precludes ‘bad art’, which must be included in the definition of art whether Huizinga likes it or not. Secondly, there is no reason why the ‘play-instinct’ might not lead to cave-paintings through them being artefacts of sacred rituals or storytelling (or both). Thirdly, it is circular because it is Huizinga who has dismissed “mere doodling” as not art, yet beyond his notion of civilised art, doodling need not be “mere” in any way. Indeed, Soda Creative’s Ed Burton has developed what is both an interactive artwork and an educational environment from doodling (Burton, 2004; Polaine, 2004c). Finally, it seems to be precisely the “aesthetic impulse” in architecture that distinguishes it from the construction of bees and other animals who are, it is supposed, building from instinct and genetic programming, if evidenced only by some buildings that are architecturally magnificent but functionally awful.


There is an interesting grain to be salvaged from his analysis of the ‘bounded-nature’ of the plastic arts, which he considers only come to life (and have a chance to be playful) where there is “visible action” (Huizinga, 1955, p. 166). Unwittingly this predicts contemporary interactive works:

Once finished their work, dumb and immobile, will produce its effect so long as there are eyes to behold it. The absence of any public action within which the work of plastic art comes to life and is enjoyed would seem to leave no room for the play-factor (Ibid., p. 166).

This is the interactive art equivalent of the trope of the tree falling in the forest when no-one is there to see it. There appears, now, to be a growing consensus that for many interactive artworks there is no artwork unless there are interactors. That is to say, the interactive experience is the work, or at the very least, the work requires interaction to blossom. Interactive artworks require completion by the user perhaps in ways that other interactive ‘design’ objects and interfaces do not, for they are tools waiting to be used and stand up as such on their own as much as a spade is still a spade when it is standing propped against the wall (Cornock & Edmonds, 1973; St Clair, 2005; Carter & Geczy, 2006; Polaine, 2006b). (Although the spade could also be an artwork, a readymade installation, it is still functionally a spade – this is the inverse to the relationship of the experience of interaction in an interactive installation).


Despite Huizinga’s confusing account of play in art, he provides a definition of play which we can start from and with which Caillois (1961) (whom we will encounter in a moment) also begins:

Summing up the formal characteristics of play, we might call it a free activity standing quite consciously outside ‘ordinary’ life as being ‘not serious’, but at the same time absorbing the player intensely and utterly. It is an activity connected with no material interest, and no profits can be gained by it (Huizinga, 1955, p. 13).

In sum, although Huizinga’s Homo Ludens is an important stake in the ground for play and its consideration in culture – to be “taken seriously”, as it were – it falls short of moving play beyond something that, whilst important to the building of civilisation, is, paradoxically, unproductive. Naturally it also does not take into account the more recent “ludification of culture” (Raessens, 2006) that we see in areas such as videogames, online spaces and other new media technologies.

Les Jeux et les Hommes

Twenty years later French philosopher and writer, Roger Caillois, building on Huizinga’s work in Man, Play and Games [Les Jeux et les Hommes] (1961), tightens Huizinga’s definition, in particular to include games of chance or gambling, which, he argues, are largely left out of Huizinga’s account of play because Huizinga does not include material interest in his definition. Caillois feels it is important to acknowledge gambling, but argues that it still leaves play free from material value as gambling is essentially a loss-making activity except for those that own the casino or gambling house (and for whom it is business, not play). Here, though, we still see the rhetoric of play being devoid of value:

A characteristic of play, in fact, is that it creates no wealth or goods, thus differing from work or art […] Nothing has been harvested or manufactured, no masterpiece has been created, no capital has accrued. Play is an occasion of pure waste: waste of time, energy, ingenuity, skill, and often of money… (Caillois, 1961, pp. 5-6).

Caillois initially argues that the “spirit of play is essential to culture, but games and toys are historically the residues of culture” (Ibid., p. 58). That is, most games survive as future shadows of past ‘seriousness’ often because as one empire dissolves and is subsumed or conquered, games are ignored in the process precisely because they’re not serious (Ibid., p. 81). For example, mask play represents previously sacred ritual, chess is a relic of the tombs of the Pharaohs where one might play for eternal salvation. Children playing with bows and arrows and swords are playing with now outmoded forms of warfare.


Yet Caillois goes around in circles on this point noting that these games cannot only be with the remnants of a bygone era because children play in ways that are contemporary too; they play with pistols and cars and aeroplanes and mimic much of what their parents do. Had he been writing now, Caillois would have had to include mobile phones and laptops in amongst the various ‘adult’ activities mimicked by children. One wonders what he would have made of adults who play videogames of warfare across all historic eras or games that involve playing as childlike characters in childlike settings (as many Nintendo games are set), or even babysitting virtual babies, as in the Nintendo DS game Babyz by Ubisoft (2007).


Children’s games, Caillois decides, “are not a degradation of serious activities, but rather that the two different levels are simultaneously involved” (Caillois, 1961, p. 62). The difference for adults involved in the “real” activity is that they do not have the choice to say, “I’m not playing anymore” (Ibid., p. 63). This get-out clause (or get out of jail free card in the game of Monopoly) is what sets play apart from the everyday:

[P]lay is not at all a meaningless residue of a routine adult occupation, although it eventually perpetuates a counterfeit of adult activity after the latter has become obsolete. Above all, play is a parallel, independent activity, opposed to the acts and decisions of ordinary life by special characteristics appropriate to play (Ibid., p. 63).

Separateness is one of the key aspects of Caillois’s definition of play, one that he takes from Huizinga (who terms this the ‘magic circle’ – one of the ideas that has become central to videogame theory) and extends to this set of qualities:

1. Free: in which playing is not obligatory; if it were, it would at once lose its attractive and joyous quality as diversion;

2. Separate: circumscribed within limits of space and time, defined and fixed in advance;

3. Uncertain: the course of which cannot be determined, nor the result attained beforehand, and some latitude for innovations being left to the player’s initiative;

4. Unproductive: creating neither goods, nor wealth, nor new elements of any kind; and, except for the exchange of property among the players, ending in a situation identical to that prevailing at the beginning of the game;

5. Governed by rules: under conventions that suspend ordinary laws, and for the moment establish new legislation, which alone counts;

6. Make-believe: accompanied by a special awareness of a second reality or of a free unreality, as against real life (Caillois, 1961, p. 10).

The rhetoric of play being unproductive appears to arise due to its cultural position as being “not work” and part of leisure time. Yet this feels like more of a cultural construct, particularly in the era before social networks and other online activities carried out during work hours when Huizinga and Caillois were writing. The need for non-play activities to have purpose and value – and thus for play to be a rehearsal for something ‘better’ or ‘higher order’ (i.e., self-improvement, progress, learning, wealth production, etc.) is tightly bound to the ideals of the Industrial Revolution and the parallel shift of the position of children in Victorian society as Pat Kane explores in his in-depth treatise, The Play Ethic (Kane, 2004). Children moved from being considered little dysfunctional adults who both worked and needed to be conditioned into being ‘proper’ adults, to a separate group within culture that had an ‘excuse’ to play, so long as it was progressing them towards correct behaviour and learning. Any play outside of this is supposedly worthless within this rhetoric (Sutton-Smith, 1997; Pesce, 2000; Kane, 2004).


That play can be productive and have material value is something that we now have ample examples of with the rise of online communities, social networks and the read/write web. As games designer, writer and researcher Celia Pearce notes when discussing the productive ludic culture of contemporary online massively multiplayer games (a theme examined towards the end of this thesis):

[T]he boundaries between play and production, between work and leisure, and between media consumption and media production are increasingly blurring [...] More important, productive play also challenges traditional capitalistic notions of ‘productivity’ versus ‘leisure’ (Pearce, 2006, pp. 18-19).

In some cases, the effort of playing these games turns them into work, especially those in which one must gain points to reach the next character level or where one is involved in setting up a virtual enterprise or leading a multiplayer guild (Yee, 2006). This communal play and its value and potential are discussed at the end of this thesis, for they form the probable future of a shift in attitudes towards play that has potentially far-reaching effects.


Returning to Caillois, for most of the rest of Man, Play and Games he examines play’s position in daily life and details examples of a categorisation of play that breaks down play across two axes. The first axis is made up of four categories: agôn (competition), alea (chance), mimicry (simulation) and ilinx (vertigo) (Caillois, 1961, pp. 11-36).


Agôn is the area that Huizinga (1955) covers in the most detail and encompasses competition – from sports and physical challenge to forms based on the intellect, such as chess. Critical to the notion of agôn is a principle of equality at the start of the game so that the two competitors can prove that their superiority is due to their own skills rather than chance.


Alea denotes games of chance that are based upon “a decision independent of the player” (Caillois, 1961, p. 17). Unlike agôn, which may require years of training to achieve winning status, alea negates all training and personal achievement. Interestingly, though, both agôn and alea set about “the creation for the players of conditions of pure equality denied them in real life” (Ibid., p. 19).


Mimicry is perhaps the play aspect that is most commonly recognised in both children’s play and entertainment in the adult world (by this, I do not mean the sex industry euphemism ‘adult entertainment’ although mimicry and illusion certainly form part of that industry). Mimicry is “the temporary acceptance, if not of an illusion [...] then at least of a closed, conventional, and, in certain respects, imaginary universe” (Ibid., p. 19). This can be our own mimicry, when playing a role or a character in a game or a play, as well as our acceptance of others doing the same when the play is entertainment – the willing suspension of disbelief. It is perhaps this idea that the play is ‘make-believe’ or ‘not real’ that fuels the notion of play being non-serious (because ‘it’s only a game’) and non-productive, for any production in the game remains in the make-believe realm. You cannot spend Monopoly money in the supermarket, for example. However, as noted above, this attitude to play is changing, because you can now sell your online multiplayer game character, equipment and treasures to others, as well as your time spent playing other people’s characters as a service for real money. Indeed there is an entire sub-culture of game player ‘farms’ set up to cater to the needs of people who want their character to be at a high level, but have neither the desire, time nor ability to get it there. They do, however, have the money to pay for someone else to do it for them (Yee, 2006).


Ilinx, Caillois describes, “includes those [games] which are based on the pursuit of vertigo and which consist of an attempt to momentarily destroy the stability of perception and inflict a kind of voluptuous panic upon an otherwise lucid mind” (Caillois, 1961, p. 23). These are, of course, mostly physical activities involving bodily movements such as spinning, turning, sudden starts and stops. Caillois uses examples from the whirling dervishes through to tobogganing and motor racing.


In the context of this thesis, especially of screen-based interaction, can ilinx play a role? I believe the answer to be yes, even if the physical movements appear to be small such as the flick of a thumb on a gamepad or a hand moving a mouse. The reason I argue for the role of ilinx here is because those seemingly small movements are usually accompanied by larger, often unconscious, movements of the body as well as the sensory experience of vertigo induced by a screen-based simulation (mimicry). Videogamers will recognise this in the movements of their body – especially hands and arms – when playing fighting or racing games. There is a tendency, when one is in the flow of the game, to move the body in sympathy with the player avatar on screen, be it a car or a person. When racing one often leans into the corners and when fighting the players often move the controller in jerking movements with their punches, kicks or sequences of both.


Even non-competitive, less intense, interactions lend themselves to a greater physical sensation and action than commonly supposed. Who has not pressed a lift call button harder or in a sequence of fast jabs in order to try to make it arrive earlier, even with the knowledge that it makes no difference? Most computer users will know the tendency to type harder when angry, even hitting the keys or mouse when the computer fails to function correctly. When we (re)discovered the scrolling-in-relation-to-the-mouse technique at Antirom (Allenson et al., 1994; Allenson et al., 1996a), a large part of the pleasure of the interaction was (and still is) the sweeping movements of the hand when controlling the scrolling via the mouse. When we then used touch screens for certain projects, we were denied this form of interaction and reduced to jabs again (dragging, although possible, did not work that well on early touchscreens). There is now a renewed pleasure in, and use of, this kind of interaction with the rise of multitouch interfaces. When one looks at examples of these (Buxton, 2007; Han, 2006; Saffer, 2008; Peltonen et al., 2008; Soro, Paddeu, & Lobina, 2008), the sweeping hand movements (or even the finger flicks and movements of the iPhone’s interface) are not an incidental part of the features shown, but are essential to the interaction.


Figure 12: The City Wall project in Helsinki - a large multitouch screen in which hand and arm movements are both part of the interaction and the delight (Harmonix Music Systems, 2007)


As discussed previously with regard to the body as affordance, other videogames such as Guitar Hero (Harmonix Music Systems, 2005) and Wii Sports (Nintendo Company Ltd., 2006) take this a step further and either require (Wii Sports) or encourage (Guitar Hero) quite physical interaction.



Figure 13: A Guitar Hero player striking a ‘power stance’ (Image source: boymeetswurld on Flickr).


Figure 14: Wii Sports players in action - without physical action, there is no interaction (Image source: mouton on Flickr)


Caillois places these four categories along another axis, a continuum running from paidia, “the spontaneous manifestations of the play instinct” (Caillois, 1961, p. 28) or unstructured activities without rules, to ludus, which are structured activities with rules and boundaries. Thus, in Caillois’s system, any game or play activity should be able to be placed along such a table of quadrants. This is most easily explained by Caillois’s own table:


Figure 15: Classification of games from Caillois, Man, Play and Games (Caillois, 1961, p. 36)


Translations of both Caillois (1961) and Huizinga (1955) use the terms play and game interchangeably for the most part, perhaps because of the difficulty of translating to English from the French (à jouer - to play, le jeu - game, le jouet - toy) and German (spielen - to play, das Spiel - game, das Spielzeug - toy) where the roots of both play and game are the same. Caillois draws upon other languages for his terminology for this very reason. In the context of the present thesis, which distinguishes between play, games and toys, the Caillois table allows us to place them more accurately in relation to each other. Play is, of course, the overarching verb of the activity – we play games, with toys and openly without any apparatus apart from ourselves. We also use play as a noun when we talk of there being play in a mechanical joint or in systems of all kinds.


It is this versatility of play as well as its overarching nature that makes it such a powerful, if ambiguous, tool for analysing interactivity. It allows us to avoid becoming trapped in any one particular discipline or theoretical standpoint. Interfaces can be complicated and rule-based and playful or unplayful; they can also be open and loose and playful or unplayful. They can range from functional, tool- and goal-based interfaces to an interactive artwork where interaction is a goal in itself. Once again, we can place any of these interactions not only within the Caillois table, but also along a spectrum from playful to non-playful. The level to which this classification can then be abstracted and generalised is what forms the principles of interactivity in this thesis.


Yet, it is difficult to draw play away from any particular rhetoric, whether one of progress and biological development, gaming and gambling or of competition and socialisation:

The constant modern tendency to think of play as simply a function of some other more important cultural process (psychological or sociological) tends to underestimate that the major obvious function of play is the enjoyment of playing or being playful within a specific culture of play (Sutton-Smith, 1997, p. 106).

Sutton-Smith continues, in consideration of the rhetoric of the self and the notion of play being characterised by “attention to means rather than ends”:

This is a notion derived from the history of the work ethic and the related abrupt distinction between work and play in that ideology. The dualistic Cartesian distinction that play is to be play and work is to be work, and that they are to be quite separate from each other, is collapsed in Csikszentmihalyi’s work on flow as well as the research on children playing (Ibid., p. 188).

Caillois is quite specific that “the proper function of play is never to develop capacities. Play is an end in itself” (Caillois, 1961, p. 167). His final words on the matter sum up the issues:

Play is a total activity. It involves a totality of human behaviour and interests. Various approaches – from psychology to mathematics and, in passing, history and sociology – by reason of their special biases have been unable to contribute anything too fruitful to the study of play. Whatever the theoretical or practical value of the results obtained by each of these perspectives, these results are still without true meaning or impact, unless they are interpreted within the context of the central problem posed by the indivisibility of the world of play. This is the primary basis for interest in games (Ibid., p. 175).

The boundaries of gaming, play and art are becoming increasingly blurred as artists and designers explore new technologies. Art has always had a playful element, of course, but there is still a tendency for galleries to stick to being the white-box “shrine for contemplating sacred objects” (Paul, 2005, p. 4).


The conundrum for those working in the game/play/interactive arena is exploring playfulness and games as an important aspect of culture whilst avoiding the common dismissal of the Big Fine Art world that they simply turn the gallery into a playground (Huhtamo, 2004; Polaine, 2005a). An area of interest to this research, if only to exclude it from the boundaries of reference, is game art – games that have an artistic goal or focus, or artworks that draw upon videogame culture. The danger when practitioners of game art try to subvert typical gaming categorisations and clichés to bring our attention to a particular issue is that they tend to fail to engage with the very medium they are trying to subvert.


Artists, along with the popular press, frequently make the mistake of judging videogames on their content rather than their structure, as Steven Johnson explores with great clarity in Everything Bad is Good For You (Johnson, 2006). Often art games (or game art) are neither terribly interesting games nor particularly successful artworks. It is hard to subvert something without being clear what defines it in the first place – put simply, it is difficult to make an ‘art game’ if you don’t take into account what makes a decent game.


Hidden in this is a point salient to the debate between narratology and ludology, the study of games from their own perspective of play (Frasca, 1999) (we will examine games theory in more detail in a moment). Aarseth (2004) points out that games are just as old as stories and probably older since they predate language.

It could be argued that games are older than human culture, since even animals play games. You don’t see dogs or cats tell each other stories, but they will play. And games are interspecies communication: you can’t tell your dog a story, but the two of you can play together (Aarseth, 2004, p. 46).

Of course you can tell your dog a story, but you cannot be guaranteed to be understood. Such is the nature of storytelling, especially encapsulated in a linear media form; you cannot be guaranteed that your audience is listening or watching. Joking aside, when playing with a dog, there is at least a common understanding of basic rules of play, even if it is just, “you throw the stick and I’ll fetch it and promise not to bite you”.


This thesis does not aim to resolve the debate between narrative and gameplay, although the arguments set forth here side with the ludologist point of view for the main part. However, the paradox of play being both ambiguous and commonplace parallels another example: love. With that in mind we return, once again, to metaphors.

Love is...

One of the main issues facing those wishing to create interactive works is that interaction design and artistic practice draw upon so many discipline areas. As Sutton-Smith examines in his entire review of the area, the same issues apply to the study of play (Sutton-Smith, 1997).


The likelihood is that for play, like love, there can be no single definition or answer to some of the questions posed by any of these disciplines, and most claims for one tend to be based on, and biased towards, the writer’s own discipline. There is no real way out of this and I include myself in this description, yet we need to have some way of grappling with the issues at hand. Continuing on from the deeper discussion of metaphors in the previous chapter, there is some value to be had from examining the relationship between how we conceptualise play and how we conceptualise love. Certainly this thesis does not intend to expound on the virtues (or problems) of love, but love does make for a useful analogy precisely because, like play, it is so well-known and yet so difficult to define.

Our experience of love is basic – as basic as our experience of motion or physical force or objects. But as an experience, it is not highly structured on its own terms (Lakoff & Johnson, 1999, p. 71).

The same can be said of play, as Pat Kane notes in his summary of Sutton-Smith’s rhetorics of play:

We are at play, in a sense, when we feel we are at play; our subjective experience is the crucial indicator that the ludic moment is happening. Play is an attitude before it is anything else (Kane, 2004, p. 48).

Although play can be highly structured, as we will see below, the psychological aspect of play is slippery and difficult, if not impossible, to describe without delving into complex metaphors, as we do for love:

For a rich and important domain of experience like love, a single conceptual mapping does not do the job of allowing us to reason and talk about the experience of love as a whole. More than one metaphorical mapping is needed (Lakoff & Johnson, 1999, p. 71).

When it comes to exploring play and interactivity, it is important to acknowledge and even embrace this ambiguity. In fact ambiguity, tension and metaphor are important ingredients in an engaging, playful interactive work, even one that we might think of as being tool-based. Part of the charm of Apple’s operating system is that it allows those elements to be present along with the functionality. In many interactive works, the ambiguity is essential to the longevity of the piece, which is one of the reasons we made so much use of randomness in our works at Antirom (Allenson et al., 1994). Far from frustrating our users/audience/interactors by them not quite knowing what was going on, we found they were often engaged by the fact that they had to work something out and did not know exactly what was coming next. Similar arguments are made by those in the story-telling trade. Giving away too much up front bores an audience. Making them do some work keeps them on the edge of their seats (McKee, 1997; Goldman, 2001).


The issue with trying to define interactivity in terms of a single field is that this tends to strip away the actual emotional experience and turn the analysis into one of sequences of actions (Svanæs, 2000). Although eye-tracking and detailed analysis of user behaviours and actions are useful in some fields – and such empirical approaches are prevalent in HCI – they take little account of the embodied experience, yet it is precisely this that draws an interactor in or sends them away in boredom. Taking a phenomenological view of the interactive experience into account is essential to designing and creating engaging interactive experiences.


Once again we can draw from Lakoff and Johnson’s work tackling love:

Take away all of those metaphorical ways of conceptualizing love, and there’s not a whole lot left [...] Without the conventional metaphors for love, we are left with only the skeleton, bereft of the richness of the concept.


The very notion of the aptness of a metaphorical concept requires an embodied realism. Aptness depends on basic-level experience and upon a realistic body-based understanding of our environment.


Our most important abstract concepts, from love to causation to morality, are conceptualized via multiple complex metaphors (Lakoff & Johnson, 1999, p. 73).

Interactivity is not simply about a user’s cognitive behavioural responses, nor is it only about psychological interaction; it is about an embodied experience of a complex interplay of motion, perception, reaction and emotion. To focus only on one area means only seeing a fraction of the picture. Play incorporates many of the same complex and paradoxical elements as love: metaphor (sometimes complex and contradictory), interaction, structure, rules, boundaries, freedom, heightened emotion, special language as well as physical and emotional forces. The range of play extends from simple, unstructured play – such as spinning oneself around – to highly complex events. Interactivity covers a similar range and, like play and games, struggles to find a place in our rather serious, work-focussed culture (Kane, 2004).


Yet, for all this talk of a work-focussed culture, in some areas it has never before been so playful. From the re-mixers and uploaders on YouTube and Facebook to the hyperreal battles of high-end videogame consoles and the performance gaming of the Nintendo Wii, Homo Ludens is alive and well and it all started, as Timothy Leary described to Douglas Rushkoff, when we could change what was on the screen (Rushkoff, 1996, p. 30). In an era where we are now used to being able to edit all sorts of media on our own computers and devices, it is perhaps hard to appreciate just how radical this was – radical enough for Leary to compare it to the Gutenberg printing press (Ibid.), although this now hackneyed comparison is used mostly for the Web these days.


This thesis is primarily concerned with the moment of interaction, play as an overarching mode and open, playful experiences of ‘toys’ rather than the competitive and goal-based experience of games. This is to avoid falling into an abyss of videogame theory, a large enough field in itself. Videogames are, however, one of the oldest forms in the history of interactive screen-based media as we understand it today. As Poole points out, “Russell and his friends [who invented the first computer game] designed [...] the first symbolic interface” (Poole, 2004, p. 17). Before we examine some of the theory of videogames, let us take a look at the history of how they began in practice.

From Physics Lab to Living Room to the Bus

The history of videogames is littered with different attempts to capture the hearts, minds and wallets of teenagers (and adults) and many of the successful ones have balanced interface innovations with gameplay (cf. Herz, 1997; Kent, 2002; Wolf, 2002; Wolf & Perron, 2003; Poole, 2004 for a thorough overview). Although Steve Russell et al.’s Spacewar! (1962) is usually credited as being the first videogame it was, strictly speaking, the first computer video game. The physicist, William Higinbotham, created the first videogame, Tennis for Two (1958), on an oscilloscope at the Brookhaven National Laboratory. Fortunately for everyone who now enjoys videogames, Higinbotham did not patent it, which would have meant the patent belonging to the United States government (Poole, 2004, p. 16).


Figure 16: Tennis for Two by William Higinbotham (Image source: http://www.bnl.gov/bnlweb/history/higinbotham.asp)


Figure 17: Spacewar! running on the Computer History Museum's PDP-1 (Image source: joli on Flickr)


Spacewar! was the first game that really fired interest in the idea of videogames on a wider scale. Even though few had access to the PDP computers that it ran on, those that did were avid players. Nolan Bushnell, an engineering student, joined up with a pinball company to make the first arcade game version of it (Russell et al. had decided against selling it and gave away the code). Although “the project bombed” (Ibid., p. 19), Bushnell went solo and decided to create a much simpler game. In 1972 the legendary Atari arcade version of Pong, with its simple “Avoid missing ball for high score” instructions, was born and was soon raking in six times as many quarters as the pinball machines (Ibid., p. 19).


Pong had, however, already been conceived as a living room game by a young television engineer, Ralph H. Baer, back in 1951 (Winter, 1996) and finally released by Magnavox on a home console, the Odyssey, six months before Atari in 1972. Magnavox sued Atari, but the pixel ball was out of the game bag and before long the market was flooded with copies of the game, almost bringing the whole trend to a halt (Poole, 2004). Counter to popular opinion that videogames moved from the arcade to the home, it was, in fact, the other way around, though for many the accessibility, power and social experience of going to the arcade were more compelling reasons to play.


Pong had very basic graphics: two white rectangles representing the “paddles”, a white square representing the “ball” and a line in the centre representing the “net”. How on earth could this be a compelling game?



Figure 18: Tennis on the 1975 German videogame console, the Interton Video 2000. One of many to copy the Pong concept.


I believe the answer is in the clarity of the interface and the rule-system of the game. The physical controller for the original Pong game was a potentiometer – a simple dial – and a button to serve the ball or make a selection in the menu. Turn the dial and the paddle moved up or down depending on the direction you turned it; press the button and the “ball” was released. That was it. There was nothing else that you could do with this controller and the link between what you were doing with the dial and what changed on screen was very clear. Not only that but, for short periods at least, this felt playful. Pong is still re-iterated every time a new technology arrives – there are hundreds of re-makes and emulators on the Web and there are versions for mobile phones even though these are capable of real-time 3D games. Even in the days of the PlayStation 3, which has computing power many thousandfold greater than the original Pong consoles, playing Pong is still engaging because it is both easy to understand and challenging enough to hold the attention, even more so when playing against a human opponent (which was required in the original version).
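The directness of this mapping can be made concrete in a few lines of code. What follows is only an illustrative sketch in Python, not a reconstruction of the original Pong hardware, which was built from discrete circuitry rather than software. The screen and paddle dimensions, the names `paddle_top`, `Controller` and `step`, and the normalised dial reading between 0.0 and 1.0 are all assumptions made for the purpose of illustration.

```python
from dataclasses import dataclass
from typing import Tuple

SCREEN_HEIGHT = 240  # assumed playfield height in pixels (hypothetical value)
PADDLE_HEIGHT = 32   # assumed paddle height in pixels (hypothetical value)


def paddle_top(dial: float) -> float:
    """Map a dial reading in [0.0, 1.0] to the top edge of the paddle.

    The mapping is one-to-one and linear: each dial position corresponds
    to exactly one paddle position, which is what makes the link between
    hand and screen so easy to read.
    """
    dial = min(1.0, max(0.0, dial))  # clamp out-of-range readings
    return dial * (SCREEN_HEIGHT - PADDLE_HEIGHT)


@dataclass
class Controller:
    """The whole input vocabulary of the sketch: one dial and one button."""
    dial: float = 0.5
    button: bool = False


def step(controller: Controller, ball_in_play: bool) -> Tuple[float, bool]:
    """Handle one frame of input: the dial sets the paddle, the button serves."""
    serve = controller.button and not ball_in_play
    return paddle_top(controller.dial), ball_in_play or serve
```

In this sketch the entire input handling fits into two small functions, and there is no state in which the dial or the button means something different from what it meant a moment ago. A modern controller, by contrast, would need a table of modes and button combinations before a comparable `step` function could be written at all.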


Videogames and computers have obviously become immeasurably more complex in the more than thirty years since Pong, but this has come at a price. On the one hand, real-time, photorealistic 3D graphics have opened up worlds of gameplay and interaction that were previously impossible and unheard of. On the other hand, the added complexity has led to the development of controllers that are equally complex. The Playstation Dual Shock 3 controller from Sony, for example, has 15 buttons and two joysticks (which can also be pressed) as well as sensing motion and acceleration along six axes.


Games that run on personal computers are able to utilise even more complex key combinations for they have the entire keyboard as well as the mouse and other peripheral devices (joysticks, gamepads, microphones, cameras) at their disposal. Perhaps the zenith of complexity in videogames – at least in terms of the array of controls – is the simulator, which is not really a game at all. For a dedicated flight simulator enthusiast the ultimate simulator is one in which the controls, response and experience of flying an aircraft are as close to the real thing as possible. The controls on even a light plane are complex and those on a jet airliner or fighter plane require years of training to learn in the real world. A good flight simulator emulates all those controls using the keyboard (and flight stick and pedals if you are really serious).


Time is also sometimes simulated realistically. If it takes seven hours to fly from Los Angeles to New York, then that is seven hours spent playing the flight simulator. As a ‘game’ rather than a hobby this becomes tedious very quickly. If you are not enthralled by the mechanics and ephemera of flying an aeroplane (conversations with the control tower, mapping out waypoints on the transponder), staring at the sky on screen is pretty dull. It is difficult to even get off the ground if you don’t know what you are doing in a good flight simulator. As with some of the online massively multiplayer games, the effort required means that it is not play anymore, but something else. It is training and simulation, work even. This is, of course, the intention of flight simulator developers, but for games developers things are very different.


There is a branch of videogames that has come full circle, giving up the idea that being more complicated or more visually realistic makes for a longer, more engaged gaming experience. The success of Nintendo’s Wii and the accompanying wireless, motion-sensing controller, the Wiimote, has little to do with ‘realism’ in the visual sense. The Wii’s graphics are typical Nintendo – rounded, cartoony, almost childish in style. The rise of games on mobile phones saw the re-invention of 1-bit and 8-bit games from bygone console eras. With smaller, slower processors, developers had to return to gameplay over graphics in order to entice players to play.


With the iPhone’s multitouch screen and tilt sensors, mobility and motion have now all been combined into one powerful device. The iPhone has few buttons other than its home, on/off and vibration mode switches. The rest of the physical interface consists of a multitouch screen and motion sensors – in some ways invisible interfaces. Many games and applications make use of the motion sensors as an integral aspect of the game, such as SEGA’s Super Monkey Ball (2008) in which the player must tilt a monkey inside a ball through a maze without him falling off the side. Of course, the location of a mobile gamer also comes into play – many reviews have noted that the game can be frustrating on public transport because the sensors are too sensitive to the motion of the train or bus (Cohen, 2008). The iPhone also uses a shake of the device as an undo function.


Figure 19: SEGA’s Super Monkey Ball game for the iPhone. (Image source: Mahoney, 2008)


There is also a proof-of-concept multiple iPhone multiplayer version of Pong (Akimoto, 2008) – multiplayer games are still fairly rare for many mobile devices beyond two or three people connecting via Bluetooth. The area of mobile multiplayer gaming, especially on a large scale with massively multiplayer online role-playing games (MMORPGs), is only just beginning, but it will tie together two key features of the major cultural-technological shifts of the last 15 years: games and online communities (Krikke, 2003). Of course, the connections between games and communities are nothing new in themselves as Hector Rodriguez (2006) reminds us in his re-reading of Huizinga:

[Huizinga’s] point is not that, in the course of human evolution, play was transformed into culture: this thesis would still imply that culture and play are two different phenomena. Rather, Huizinga claims that culture is playful from the start. This ludic event is pervasive and fundamental (Rodriguez, 2006).

What has changed is how these communities are formed and connected and how they play together, and this has wide-reaching effects that are examined in the final chapter of this thesis on social play. For now, let us turn back to the study and theory of videogames and the important contribution they make to a deeper understanding of both interactivity and the fundamentals of games.

Game Studies

Video and computer games are the most common site of interactivity and play in popular culture. Mainstream media often like to portray videogames as a frenetic, mindless, purposeless addiction, even if they do make huge sums of money (Johnson, 2006), and academia is still a battleground for the ‘ownership’ of the study of games (Aarseth, 2001; Aarseth, 2004).


Mainstream media commentary aside, the academic power-struggle to be the dominant theoretical field for the study of games is enlightening in the way that it highlights the fissures, doubts and problematic relationships between and within the various competing fields, particularly that of narrative and gameplay. Espen Aarseth, who argued that computer games studies’ “year one” was 2001 (Aarseth, 2001), is one of games studies’ most outspoken founders. Aarseth’s book, Cybertext: Perspectives on Ergodic Literature (1997), helped lay the groundwork for the study of computer games – a field that has given itself the name ludology (Frasca, 1999).


Earlier on in this thesis we encountered Aarseth’s framing of the study of videogames as a “land rush” (Aarseth, 2004, p. 45) complete with colonising forces. The main fight over control of the joystick, in his view, is between narratology and ludology. Aarseth and other ludologists argue against a literary or film studies reading of videogames as “texts” and in this he does not mince his words:

Are games texts? The best reason I can think of why one would ask such a crude question is because one is a literary or semiotic theorist and wants to believe in the relevance of one’s own training (Ibid., p. 47).

This snippet unfairly paints Aarseth as more polemical than he is for he has, quite rightly, argued that games analysis must proceed from a variety of perspectives and has argued for the development of a clear and appropriate methodology to do so (Aarseth, 2003). Citing Eskelinen’s critical observation, “in art we might have to configure in order to be able to interpret, whereas in games we have to interpret in order to be able to configure” (Eskelinen, 2004, p. 38), Aarseth argues for a methodology that is unique to computer games analysis and includes the necessity of being a player oneself. On this view, apart from cheating with cheat codes or walkthroughs, which taint any analysis (other than an analysis of cheating), for a player to understand a game they must become a competent player of the game. Or, rather, “how to play well and win, demands an analytical approach” (Aarseth, 2003, p. 5). He continues:

While the interpretation of a literary or filmatic work will require certain analytical skills, the game requires analysis practised as performance, with direct feedback from the system. This is a dynamic, real-time hermeneutics that lacks a corresponding structure in film or literature (Ibid., p. 5).

This crucial difference represents the same issue of embodiment and experience of interactive systems that we encountered previously and is why a textual analysis of an interactive system only tells half the story (for narratologists half the story is, in fact, the entire story, or at least better than none). A textual analysis of Pong, for example, does not tell you anything about the gameplay, nor why it is still pleasurable and engaging to play 35 years since its invention.


This is largely the thrust of the argument in Cybertext. Aarseth examines the difference between metaphorical structure and logical structure, arguing that the essential difference between traditional texts and cybertexts is the active process of constructing the semiotic sequence, “a work of physical construction that the various concepts of ‘reading’ do not account for” (Aarseth, 1997, p. 1). He describes the differences between reader and player thus:

A reader, however strongly engaged in the unfolding of a narrative, is powerless [...] He cannot have the player’s pleasure of influence: ‘Let’s see what happens when I do this’. The reader’s pleasure is the pleasure of the voyeur. Safe, but impotent (Ibid., p. 4).


Part of the confusion when traditional narrative theories are applied to cybertexts arises from confusing – or not perceiving – “the difference between metaphorical structure and logical structure” (Ibid., p. 4). Aarseth continues with this explanation:

The cybertext reader is a player, a gambler; the cybertext is a game-world or world-game; it is possible to explore, get lost, and discover secret paths in these texts, not metaphorically, but through the topological structures of the textual machinery. This is not a difference between games and literature but rather between games and narratives. To claim that there is no difference between games and narratives is to ignore essential qualities of both categories. And yet, as this study tries to show, the difference is not clear-cut, and there is significant overlap between the two (Ibid., pp. 4-5).

Aarseth argues that cybertexts are not about a radical departure from “old-fashioned textuality” but rather “a perspective on all forms of textuality, a way to expand the scope of literary studies to include phenomena that today are perceived as outside of, or marginalized by, the field of literature” (1997, p. 18).


Other theorists such as Moulthrop (2004), Pearce (2004), Ryan (2001), Jenkins (2004), and Juul (2001a; 2001b; 2005) are researching or have researched the two poles of the ludology/narratology spectrum as well as the space in-between. Pearce identifies a set of “narrative ‘operators’ that can exist within a game” (Pearce, 2004, p. 145). These operators are Experiential (emergent narrative that develops out of the game), Performative (as seen by spectators of the game), Augmentary (backstory, information, context), Descriptive (re-telling of the game’s point and progress), Metastory (an overlay that creates the context for the game) and Story System (a system of parts that allows players to create their own story – most often seen in ‘interactive narratives’) (Ibid., p. 145). Pearce continues this line of thinking with reference to Aristotle: “[a] good game, even one without an obvious ‘storyline’ (or metastory), while being played, will tend to follow something that resembles the emotional curve of a dramatic arc” (Ibid., p. 145).


In a similar fashion, Juul (2001b) believes the following of narrative and gameplay:

1) The player can tell stories of a game session. 2) Many computer games contain narrative elements, and in many cases the player may play to see a cut-scene or realise a narrative sequence. 3) Games and narratives share some structural traits. Nevertheless, my point is that: 1) Games and stories actually do not translate to each other in the way that novels and movies do. 2) There is an inherent conflict between the now of the interaction and the past or "prior" of the narrative. You can't have narration and interactivity at the same time; there is no such thing as a continuously interactive story. 3) The relations between reader/story and player/game are completely different – the player inhabits a twilight zone where he/she is both an empirical subject outside the game and undertakes a role inside the game (Juul, 2001b, p. 6).

Pearce (2005, p. 1) also argues against setting up an “agonistic debate” between narratology and ludology in her response to Gonzalo Frasca’s essay, Ludologists love stories, too: notes from a debate that never took place (2003). Both Frasca and Pearce appear to have misread each other’s essays to come to these conclusions, because the games studies field accommodates both ends of the spectrum in the quest to find some clearer definitions, underpinnings and further research into games.


Of key importance to a study of interactivity and play is the definition of games that games studies academics have tried to clarify. Zimmerman (2004) provides the following definition:

A game is a voluntary interactive activity, in which one or more players follow rules that constrain their behaviour, enacting an artificial conflict that ends in a quantifiable outcome (Zimmerman, 2004, p. 160).

This is largely an amalgamation of the definitions of Caillois (1961) and Huizinga (1955), especially with regard to rules:

To take part in a game is to submit your behavior to the restrictions of the rules.

Rules might not seem like much fun. But once players set the system of a game into motion, play emerges. And play is the opposite of rules. Rules are fixed, rigid, closed, and unambiguous. Play, on the other hand, is uncertain, creative, improvisational, and open-ended. The strange coupling of rules and play is one of the fascinating paradoxes of games (Zimmerman, 2004, p. 161).

Montfort (2004) uses the term “contests” to define games because it is more inclusive of other play experiences:

It is more sensible to define games as contests and also allow the existence of puzzles and problems that are not games. Defined this way, a crossword puzzle is a puzzle, not a game; ‘Let’s see who can finish the crossword puzzle first’ is a game. Similarly, chess is a game; the knights’ tour is a puzzle that uses the gaming equipment and rules for movement from the game of chess (Montfort, 2004, p. 314).

As we have seen above, the computer game is the grandfather of much of what we now understand as interactive media (mainly, but not exclusively, screen-based). It is also a field that has been studied more closely and rigorously in academia than interaction design in its own right. Interaction design has tended to be dominated by the related, but different, field of HCI. In an area that has a paucity of historical analysis in comparison to, say, the fine arts, we should take note of all the history we can get.


As the various theorists mentioned above have argued, games theory has also helped in the study of interactivity and play, drawing them away from the specimen case studies of other disciplines and emphasising the role of the researcher as a player. The importance of the embodied aspect of play and interaction is also reinforced time and time again in games theory since playing is an active, not passive, activity.


Interactivity, incidentally, is a problematic term for Aarseth, who has called it a “silly and abused term” (Aarseth, 2004, p. 52). There is no doubt that it is a term that has been abused, but it is no less abused than simulation, his preferred (and admittedly more accurate) term for what goes on in computer games. But his dual reading and understanding of both ‘old-fashioned’ textuality and gameplay leads him to this interesting communication theory analysis (after Eco, 1976):

[J]ust as the game becomes a text for the user at the time of playing, so, it can be argued, does the user become a text for the game, since they exchange and react to each other’s messages according to a set of codes. The game plays the user just as the user plays the game, and there is no message apart from the play (Aarseth, 1997, p. 162 – emphasis mine).

On this view, the play is the content, the interactivity is the content. Other readings of the text of the interactivity remain slippery to grasp, especially in terms of ascribing it meaning. Whilst there is no doubt that other content – visual, auditory, textual, for example – can and must co-exist with the interactivity, starting from the interactivity when creating interactive works enables the experience to be built up and out from the essential nature of that moment of interaction rather than it being bolted onto ‘old-fashioned’ content.


In their insightful and thorough tome, Rules of Play: Game Design Fundamentals, Katie Salen and Eric Zimmerman (2003) draw together a great deal of the literature on play and games in both computer and non-computer based settings. For them, “the goal of successful game design is the creation of meaningful play” (Ibid., p. 33). Perhaps the greatest insight that games theory has to offer us in this thesis is that designing an engaging game is hard. Designing a great game with meaningful gameplay is very difficult indeed and also a subtle process of balancing actions, interactions, player emotions, story (sometimes), environment, rules, testing and technologies, to name just a few of the variables. In their quest to both understand and teach others about how to develop games, Salen and Zimmerman have done a thorough job of picking apart both the process of designing as well as playing games.


Whilst their argument rests on games rather than open play or interactivity in its own right, their criteria for “meaningful play” lead us towards those for engaging interaction:

Meaningful play in a game emerges from the relationship between player action and system outcome; it is the process by which a player takes action within the designed system of a game and the system responds to the action. The meaning of an action in a game resides in the relationship between action and outcome.


Meaningful play occurs when the relationship between actions and outcomes in a game are [sic] both discernable and integrated into the larger context of the game (Salen & Zimmerman, 2003, p. 34).

From this view of meaningful play (in games, at least), Salen and Zimmerman go on to provide three definitions of interactivity, “one of those words that can mean everything and nothing at once” (Ibid., p. 58). One, from Brenda Laurel, we have already encountered, which describes interaction as participation between ‘agents’ (Laurel, 1993). The second is from my ex-Antirom colleague Andy Cameron’s essay, Dissimulations (1995), in which he examines the “central contradiction” of interactive narrative. Cameron’s view (which my own view follows, as should now be fairly clear) is that interactivity in the sense in which it is being discussed here is about intervention:

[I]nteractivity means the ability to intervene in a meaningful way within the representation itself, not to read it differently. Thus interactivity in music would mean the ability to change the sound, interactivity in painting to change colours, or make marks, interactivity in film the ability to change the way the movie comes out and so on. In its most fully realised form, that of the simulation, interactivity allows narrative situations to be described in potentia and then set into motion – a process whereby model building supercedes storytelling, and the what-if engine replaces narrative sequence (Ibid., p. 1).

Cameron also dresses down the dominance of literary theory and cautions against “the literal and utopian mapping of post-structuralist theory onto new technology [...] Just as theory is not praxis, interpretation is not interaction” (Ibid., p. 6).


The third definition of interaction Salen and Zimmerman use comes from games designer, Chris Crawford, who also defines it as a conversation (Crawford, 2000). For him it is an iterative process: Joe says something and Fred responds. Joe responds to Fred, and so on.


Rather than limit themselves to choosing one, or all three, of these quite reasonable definitions, Salen and Zimmerman propose a “multivalent model of interactivity” (Salen & Zimmerman, 2003, p. 59) that accommodates all three and extends them into four modes: Cognitive interactivity – the emotional, intellectual and psychological participation that Manovich (2001) argues for; Functional interactivity, which includes structural interactions with the system and interface and is defined in terms such as, “How ‘sticky’ were the buttons? How legible was the text on your high-resolution monitor?”; Explicit interactivity – or participation with designed choices and procedures – this is the physical, ‘actual’ interaction that this thesis primarily concerns itself with; Beyond-the-object-interactivity or participation within the culture of the object – this ranges from fan culture to social interaction (Salen & Zimmerman, 2003, p. 59).


Most importantly for our purposes, Salen and Zimmerman distinguish a “designed interaction” (2003, p. 60) from a general interaction that happens incidentally:

The dropping of an apple on the ground does not meet baseline criteria for interaction: there is a reciprocal relationship between the elements of the system. [...] Do we have any ideas about what dropping an apple might ‘mean’ as a form of interaction between a person and an apple?


No. All we know is that an apple has been dropped. [...] If we change the scenario a little by adding a second player and asking the two participants to toss the apple back and forth, we move towards a situation of designed interaction. If we ask the two apple-tossers to count the number of times in a row they caught the apple before dropping it, we add an even fuller context for the interaction. [...] Each element in the system is assigned a meaning: the toss, the catch, and the dropped toss. Even in the simplest of contexts, design creates meaning (Salen & Zimmerman, 2003, pp. 60-61 - my italics).

Salen and Zimmerman’s four modes of interactivity not only help to unpick the contradictory and overlapping meanings ascribed to the word “interactivity”, but also show a linear progression through an ever more complex experience of interaction. The cognitive mode is the first moment of interaction because it can happen at a distance through our senses of sight, hearing and smell. For most of us our first moment of contact with an interface is the way it looks, regardless of the usability and technology underlying it, which is why our first principle of interactivity examined in the following chapters is The Invitation to Play.


The second mode is slightly more complicated but maps to the setting up of the field of interaction and/or play. How well can we understand the rules and affordances of the interactive system, for example? The third mode of explicit interactivity is the central theme of this thesis – what does it feel like to interact? What is the phenomenological experience of the interaction and how can we enhance it to be more engaging?


The final mode of participatory interaction is the most contemporary of them all given the rise of social networking, messaging, life-streaming and presence technologies. If we are all connected all the time, what kind of social and technological interactions does that give rise to? At the time of writing we have just witnessed the largest financial and banking system crash since the Great Depression, due in large part both to the interconnectedness of computer-run financial systems and to millions of social and economic actions and reactions (Brooks, 2008). Social interaction is also what happens after we have experienced an interaction in the world ourselves – we share, we comment, we argue, we challenge to competition, and so forth. This social interaction is examined in Chapter 10 – Social Interaction and Playing with Friends.

The Magic Circle

The magic circle and rules are two other aspects of games and play that are central to their study and have been covered in great depth by the games theory community (and are the subject of much debate). The concept of the magic circle is one that Salen and Zimmerman (2003) borrow from Huizinga:

The arena, the card-table, the magic circle, the temple, the stage, the screen, the tennis court, the court of justice, etc., are all in form and function play-grounds, i.e. forbidden spots, isolated, hedged, round, hallowed, within which special rules obtain (Huizinga, 1955, p. 10).

Although only one of the many examples that Huizinga uses, Salen and Zimmerman use it because, as they say, “there is something genuinely magical that happens when a game begins” (Salen & Zimmerman, 2003, p. 65). The magic is that all the elements of the game – in a traditional boardgame, the playing pieces, the board, the dice, for example – take on potent meanings and power. To play a game is to agree to be bound by its rules and remain within its boundaries of play – the magic circle. To break the rules is to cheat and thus violate the ‘play’ of the game (if you are caught) and to leave the magic circle means leaving the game, to place yourself back into the ‘real’ world again.


Being ‘in’ or ‘out’ of the magic circle is usually clear – one is usually either playing or not. But it is also “fuzzy and permeable” (Ibid., p. 94). The edges of what constitutes play and an entering into or acceptance of the rules of a game are not always clear and sometimes deliberately muddled, either as part of the gameplay or because of the format of the game space, such as with locative and mixed-reality games (Harvey, 2006; Poremba, 2007).


What can the magic circle tell us about interaction design? The magic circle is a metaphor, even though that metaphor might be reinforced by physical space. A court room is just a room, but it takes on the special meaning assigned to it and consented to by those submitting themselves to its laws. The room is really a room, but the court itself is a metaphor for power, rules, societal agreements and law. One can be held in contempt of court – for not playing by the rules or respecting the magic circle of the space – but not in contempt of the room.


The magic circle and the rules of play suggest that all play is metaphorical or, at least, all games are metaphorical. As we will discover in the following principles of interactivity, this informs how we approach, interact with and leave (or break) interactive systems. Our starting point is the first principle of interactivity – Pesce’s invitation to play (Pesce, 1996).

CHAPTER 6 – Principle 1: The Invitation to Play

The premise behind this principle is simple in concept, but elusive in practice – how can we entice a potential user into engagement with an interactive work? Creating a seductive invitation to play is critical to those working with interactive media because without it audiences, users or interactors do not even begin to engage with the rest of the work, its meanings, depth and/or functions. If nobody ‘gets’ your interface or is intrigued enough to explore it, the battle is lost before it has even begun. Your audience – the interactors – have already moved on to something else.


In terms of game-art, going retro, simple, small and lowbrow often works better than trying to emulate the technical sophistication of contemporary videogames, which is beyond the resources and programming skills of all but a few artists. When artists do manage to gain access to the kinds of resources required to make a comparable experience on a technological level, the artistic and emotional impact of the content can be very great indeed. This was the case with Char Davies who, with the help of a team of engineers from Softimage in developing her works Osmose (1995) and Ephémère (1998), managed to create some of the very few interactive works that elicit poetic emotional responses akin to the very best of narrative journeys. As Pesce (1996a) described it:

The final scenes in Osmose bring you out, away from the world. As it recedes, you can almost imagine, "this is what death is like, departing from the physical, into the shadow." Many of Osmose's immersants have noted this feeling – virtual art as near-death experience. Yet each emerges, unafraid, refreshed, reawakened of the delights of the world of the living (p. 47).

One of the significant aspects of both Osmose and Ephémère is the use of an expansion belt around the chest of the interactor as an embodied interface. This simulates – and was based upon Davies’ own experience of – a scuba diver’s neutral buoyancy vest. Breathing in allows the interactor to ‘rise’ through the 3D space and breathing out brings them back down again. The interface is as natural as breathing.


Creating a spectacle of gaming activities and ephemera is another approach to this, such as Mary Flanagan’s [giantJoystick] work (Flanagan, 2006) for the Game/Play exhibition or PainStation by artist collective Fur (Kirschner, Morawe, & Reiff, 2002). In both of these examples gameplay is noticeably kinaesthetic – a defining aspect of play that is explored later in this thesis (Winnicott, 2001).


Flanagan’s piece takes on another dimension quite literally. Its oversized joystick in the middle of the room is a delightful invitation to play as well as an invitation to play with, or hinder, another player. It is a successful commentary on some of the conventions of videogame culture whilst remaining playful. Christiane Paul, Adjunct Curator of New Media Arts at the Whitney Museum of American Art, describes Flanagan’s piece thus:

It provides a much-needed artistic redefinition of technological conventions, which are revealed and re-engineered in a poetic and aesthetic way. The exaggerated scale of the installation emphasizes the physicality (and absurdity) of interfaces in their relationship to the human body and human interaction. Rather than treating the joystick as mere access point to the “other” of the virtual world on a screen, the project highlights the joystick’s role in and connection to the physical world and the social nature of play (Paul, 2006, p. 29).

Ultimately, like the secret to falling in love, perhaps the secret to play will forever remain mysterious. The fact that play is “inherently ambiguous and unpredictable” (Sutton-Smith, 1997, p. 156) is part of its charm. The challenge for artists and designers is to experiment and be willing to fail and willing to respond to their audience’s experiences and this should be a playful process in itself.


As I have argued in the past (Polaine, 2006a), a generation of artists and designers have grown up with videogames and, as videogames cease to be a novelty and become simply part of the palette of culture from which to draw, the boundaries will blur even further. This is already evident when we trace camera-based interaction from Myron Krueger’s pioneering Videoplace (Krueger, 1974) to the massive popularity of the EyeToy camera on Sony’s PlayStation 2 (London Studios, 2005). It is worth examining the role of the interface as a seductive entity in its own right, but our first ports of call are the tensions between purpose, function and form that arise in the field of Human Computer Interaction (HCI) research and ergonomics.

Form, function, medium and message

In well-documented research of human-computer interaction (Laurel, 1993; Nielsen, 1993; Norman, 1998; Svanæs, 2000; Shedroff, 2001) and human-world interaction such as ergonomics and product design (Norman, 1998) a great deal of emphasis is placed on the initial moment of encounter with an interface or an object. In fact, in the very first encounter, the perceived interface and the object are the same thing. This is what Norman (2003) refers to as the affordance of an object in product design terms, departing slightly from Gibson’s (1979) original view.


When applied to some form of play or interaction, this becomes the invitation to play. Pesce first used this phrase in an essay in which he describes finding a swing that someone had hung upon an oak tree in a secluded spot on Mount Tamalpais near San Francisco (Pesce, 1996). A swing is, he says, clearly an open invitation to play. But a swing is also an object with many affordances and, finally, simply an object – a piece of wood hanging from a tree.


Norman (1998; 2005) explores several ideas concerning how humans psychologically interpret products, from taps and light switches to revolving doors and teapots. As part of this process he examines how humans use mental schemas or patterns in order to try and understand and interact with the world. For example, we need to have a mental picture of how a pair of taps or single tap might mix the hot and cold water for us to be able to work out how to use them. Sometimes this schema already exists through previous learned experiences (which may lead to ‘intuitive’ guesses as in the case of the taps) and sometimes it is made obvious by the designer, either through subtle labelling (red for hot, blue for cold – based in part on cultural constructions) or by the physical design of the object using the Poka Yoke or ‘fool proofing’ principle in the design (Saffer, 2006). A contemporary example of Poka Yoke might be computer cables, which are often shaped in such a way that they will only fit in one socket, in one orientation, unless you break them. A screen-based interface example would be an online payment form that prevents you from proceeding further until the form fields are filled in correctly by making the submit or next buttons inactive.


If we have no existing mental schema for something that we have never encountered before, argues Norman, we take in the entire object and try and fit it into one of our existing patterns (the educated guess) or understand its nature and what we can do with it from the design. This is why interface and object are the same thing at this point in our encounter. Norman borrows and modifies J. J. Gibson’s (1979) term ‘affordance’ to describe this aspect of an object:

The word "affordance" was originally invented by the perceptual psychologist J. J. Gibson (1977, 1979) to refer to the actionable properties between the world and an actor (a person or animal). To Gibson, affordances are a relationship. They are a part of nature: they do not have to be visible, known, or desirable. Some affordances are yet to be discovered. Some are dangerous. I suspect that none of us know all the affordances of even everyday objects (Norman, 2003).

In Norman’s reading of affordance, being able to use a shoe as a hammer is one affordance of a shoe. It may not be the intended purpose on the part of the shoe designer, but if one has never seen shoes before it may be a perfectly sensible idea and many of us have probably used them to that effect when lacking a hammer or needing to kill a cockroach.


Norman (1998) also describes cultural constraints and conventions, which are similar to, but not the same as, affordances, for they do not exist independently in the world. The little ‘pointing finger’ icon that the cursor changes to when you roll over a hyperlink is not an affordance; it is a convention, and one that did not exist in the early days of CD-ROMs and Web browsers. Designers and multimedia authors have used many different conventions to signal interactive possibilities using audio, image rollovers and colour highlights, as well as changing the icon of the mouse arrow. Much of this forms the vocabulary of the developing ‘language’ of interactive media. The mouse metaphorically and literally represents our hand exploring the affordances of the representations of digital objects on the screen. In the relatively short history of interaction design, especially web design, many of these cultural conventions have quickly become set in digital stone.


Since these are cultural conventions, there is no reason why the finger icon should not be, say, a kangaroo signifying a leap to a new link, just as long as we all understand and agree to that convention. The rule of thumb, however, is that for everyday interfaces conventions should not be replaced unless you know that the replacement is far better than the convention it replaces (Krug, 2006).


Of course, icon design leans heavily on existing cultural conventions. The finger pressing a button is an echo of both the physical world where a real finger presses a real button as well as our metaphorical exploring hand on the screen. As we saw in the previous chapter, such metaphors are deeply rooted in our physical and cultural understanding of the world and hence determine many of the choices we make in terms of interface design.


It is at this point that Manovich’s (2001) parallels of new media and cinema begin to go awry. In cinema we see the cultural conventions of editing, one of which is the fade-to-black and back to picture again, usually signifying the passing of time. It has obvious ‘real-world’ connotations of night falling and a new day dawning (or the lights fading down in live theatre), but it is not an intrinsic affordance of cinema or the cinematic experience. The affordances of cinema for the audience are also related to the physical space – the room is dark because we need to see the projection (although this helps us to descend into the alternate reality of the story), for example. We could make shadows in front of the screen – a practice with a long human history and used to great effect by Rafael Lozano-Hemmer in his Body Movies (2001a) work – or any number of things that the physicality of the cinema theatre might enable.


Privileging cinema theory as the lens through which to analyse new media, as Manovich (2001) does, fails to take into account the viewing environment and physical experience of the cinema as opposed to reading film as texts. This focus on the content at the expense of the form is very similar to the issue of analysing videogames in terms of their content and not their structure as games (Johnson, 2005). It is obvious to anyone who has eaten their dinner whilst watching television that the same content (a film) can be a very different experience at home. Television is much more interactive than we tend to believe. Culturally perceived, television is still the poor cousin of cinema and still seen as a perilous, worthless, passive activity by many (Csikszentmihalyi also expresses this view – see Csikszentmihalyi, 1975). Television is still one of the most consumed media forms in the developed world and lies at the heart of contemporary cultural life (Johnson, 2005). Yet even television’s mainstream power is on the wane thanks to emerging online media forms and the cultural changes they are bringing. Television is also becoming more and more interactive, but only minimally in the technological sense. We shout at characters on the screen, we argue and discuss the content with our family and friends and generally do many things that would be culturally unacceptable in the cinema, but which would have been common in the days of Shakespeare when theatre was the television of the day (Cartwright, 1991).


Each media form and art form metamorphoses over time and Huizinga (1955) gives us an entertaining comparison between the role of music and musicians in the days of Louis XIV and today:

On the one hand the aristocratic public of those days must have been great connoisseurs, but their reverence for the majesty of art and their respect for its executants were, on the other hand, excessively small. Concert manners as we understand them to-day, with their absolute sacramental silence and magical awe of the conductor, are of very recent date. Prints of musical performances in the 18th Century show the audience engaged in elegant conversation. Critical interruptions aimed at the orchestra or conductor were a regular feature of musical life in France even thirty years ago. Music was still largely a divertissement and what was most admired about it was virtuosity. The composer’s creations were not by any means regarded as sacrosanct, as his own property to which he had inalienable rights. Executants made such lavish use of the free cadenza that steps had to be taken. Frederick the Great, for instance, prohibited singers from embellishing a composition to the extent of altering its nature (Huizinga, 1955, p. 163).

(As an aside here, we might note the comparison to open-source, creative commons and copyright issues today as well as participatory and co-creative media environments such as YouTube that continue to tangle many of those issues together.)


The experience of television shows us clearly that the physical environment we watch it in gives rise to a set of affordances that do not exist in cinema, even though the content might be exactly the same. Thus the entire, phenomenological experience must be taken into account when examining these media forms – exploring the content alone is not sufficient. The differences in physical context also have an impact.


When trying to develop a language of interactivity these differences are crucial because we are dealing with an interface: that is, the moment and place where technologies (computers, robots, devices) and humans encounter each other or through which human-to-human encounters are mediated. Affordances and conventions are essential parts of any interaction with the world, including language. With the spoken word the affordance might be the sounds humans can make with their mouths, while conventions are the meanings we ascribe to those sounds. Within new media, the physical affordances and virtual ones (most of which are actually conventions and logical constraints) are jumbled together, which is why there is such difficulty in picking out the needle of interactivity from the haystack of graphic information design, product design, usability, user experience design, service design and Human Computer Interaction theory, to name but a few disciplines that are concerned with interaction.


HCI has, up until relatively recently, only used the part of J. J. Gibson’s theory of affordances that lends itself to being easily digested and tested in an empirical framework. Gibson points out that affordances always need to be “measured relative to the animal” (Gibson, 1979, p. 127) and that they are not just abstract physical properties. This feeds easily into ergonomics, as does his description of perception of affordances making up the understanding, or meaning, of the object:

If a surface of support with the four properties [horizontal, flat, extended and rigid] is also knee-high above the ground, it affords sitting on. We call it a seat in general [...] Knee-high for a child is not knee-high for an adult, so the affordance is relative to the size of the individual [...] If it can be discriminated as having just these properties, it should look sit-on-able. If it does, the affordance is perceived visually. If the surface properties are seen relative to the body surfaces, the self, they constitute a seat and have meaning (Ibid., p. 128).

Norman takes these ideas and uses them as clear guidelines for industrial and interface design:

What matters in design is:

1.  If the desired controls can be perceived

   1.a. In an easy to use design, if they can both readily be perceived and interpreted

2.  If the desired actions can be discovered

   2.a. Whether standard conventions are obeyed

In a game, the designer may deliberately violate the A principles (1A and 2A). Sometimes, even in effective, mass-consumer designs, it is useful to violate these principles (Norman, 2003).

An important note here is that Norman acknowledges the value of deliberately violating the principles of “good design”, something that usability experts often fail to appreciate (Nielsen & Tahir, 2002). The deliberate violation of conventions, of our normal state of being, is central to games and play. As Sutton-Smith reminds us when speaking of the playful versus play (a point we will return to): “The key is that the playful is disruptive of settled expectations. It is the genre of comedians and tricksters, of wits and dilettantes” (Sutton-Smith, 1997, p. 148).


This is all well and good for a relatively one-way experience of control or usage of a tool or interface, but it leaves out the latter part of J. J. Gibson’s affordance theory, which explores the nature of ourselves and other creatures perceiving the world at large. Indeed, his book is titled The Ecological Approach to Visual Perception (Gibson, 1979) precisely because the laboratory methodologies for testing perception ignored many of its qualities ‘in the wild’:

Psychologists carry out elegant experiments in the laboratory to find out how and how well these qualities [of objects] are discriminated. The psychologists assume that objects are composed of their qualities. But I now suggest that what we perceive when we look at objects are their affordances, not their qualities (Ibid., pp. 134-135).

Throughout his theory Gibson embeds the observer in space and argues that affordances are relative to both the environment and the observer. Here, once again, we see a phenomenological account of perception brought into play:

[A]n affordance is neither an objective property nor a subjective property; or it is both if you like. An affordance cuts across the dichotomy of subjective-objective and helps us to understand its inadequacy. It is equally a fact of the environment and a fact of behaviour. It is both physical and psychical, yet neither (Ibid., p. 129).

In Gibson’s account of the affordances of other persons and animals, he moves on from the one-way human/animal-object experience and we encounter a strikingly clear definition of interaction:

When touched they touch back, when struck they strike back; in short, they interact with the observer and with one another. Behaviour affords behaviour, and the whole subject matter of psychology and of the social sciences can be thought of as an elaboration of this basic fact (Gibson, 1979, p. 135).

“Behaviour affords behaviour” avoids much of the struggle with defining interactivity via its various media forms and instead focuses on the process of interaction. Gibson goes on to discuss sexual and other social behaviours that require the interpretation (or misinterpretation) of the other’s behaviour, and we can easily include play in this discussion. We can also see human social interaction as well as human-computer interaction coming together in a playful space, as we will see with Rafael Lozano-Hemmer’s Body Movies (2001a) and in the final chapter examining social interaction and play.


Pesce (2000) argues that as the world becomes more and more complex technologically and almost everything we encounter has some kind of interface, play is the methodology by which we will manage to navigate such complexity. At present, this tends to manifest in obviously man-made objects that are devices – mobile phones, PDAs, computers, televisions, etc. Pesce fast-forwards into the future where nano-technology has left the limited realms of research laboratories and become embedded in the fabric of our worlds. Walls and furniture may move, make sounds, network, communicate or display information as readily as screens and computers do today.


We cannot possibly learn the interfaces to all these different interactive environments. Already we know that the average Web surfer will give up trying to achieve a task (paying a bill, buying an item) if it is more than a few clicks away (Nielsen, 2001). Yet, paradoxically, when exploring a theme or a thread of information, Web surfers will follow the forking path for hours on end through the delights of fascination and discovery (both terms intrinsic to play and flow). Pesce suggests that play will be more and more fundamental to the process of simply being in the contemporary world. If we do not understand an interface straight away, simply by playing with it, we will have neither the time nor the desire to explore it and use it further.


The deceptively simple “getting it” part of an interaction is the difficult part to design without a set of principles of interactivity. I use the colloquial term “getting it” here because it helps to avoid using the specific terminology of any particular discipline (user experience design, usability, information architecture, HCI, etc.) and describes the amorphous, intuitive sense of understanding (or not) that interactors describe. Affordance has a great deal to do with interactors understanding and engaging with an interface and much of this is already contained in interaction design literature. Introducing an element of playfulness is much more elusive, as many toy manufacturers know. Tim Walsh, in his beautiful and engaging book, Timeless Toys: Classic Toys and the Playmakers Who Created Them (2005), documents the history and development of classic games and toys such as Twister, Slinky, Play-Doh, Scrabble and many more. Throughout the history of games and toys a common feature is that many of the successful ones were stumbled upon by accident.


Ernö Rubik, a 29-year-old professor at the Academy of Applied Arts and Design in Hungary and the inventor of Rubik’s Cube, originally intended to create a moving sculpture out of blocks of wood connected together by elastic string in order to explain three-dimensional geometry, when he noticed the blocks twisting and lining up in rows. In his unpublished autobiography, Rubik describes the moment of discovery in a perfect example of “I know it when I see it”:

It was wonderful, to see how, after only a few turns, the colors became mixed, apparently in random fashion. It was tremendously satisfying to watch this color parade. Like after a nice walk when you have seen many lovely sights you decide to go home, after a while I decided it was time to go home, let us put the cubes back in order. And it was at that moment that I came face to face with the Big Challenge: What is the way home? (Walsh, 2005, p. 231).

In other cases, an odd material invented by accident turned out to be a hit, as in the case of Silly Putty and its unusual affordances. What constitutes a ‘hit’ game or toy still remains something of a gamble. Many failed to convince potential investors and publishers at first before going on to sell millions (over 250 million in the case of Rubik’s Cube).


Part of the process of experimenting with and exploring interactivity, at least in the digital arena, is exploring the affordances of the components, whether they are chunks of code or electronic components. In many cases new paradigms are discovered through playing with the parts. In some cases an old paradigm, such as a zoetrope, is ‘re-discovered’, as was the case with Antirom’s scrolling film interactives (Allenson et al., 1994; Allenson et al., 1996a).


Through the iterative, pass-the-parcel, collaboration process at Antirom, the code engine that moved graphic elements in a scrolling motion, with their speed related to the mouse position, had a set of images added to it that were scans of Super 8 film. The initial idea was that these would make for interesting graphical content because they were long strips. As soon as they were loaded into the code engine, however, we realised that when the scrolling speed matched the height of a frame of the film, the film would animate as if in a zoetrope or a film moving through the gate of a projector.
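
The alignment described above can be illustrated with a minimal sketch in Python. It is not Antirom's original code, which is not reproduced in this thesis; the names and the values for frame height and strip length are hypothetical. A film strip is modelled as frames stacked vertically, and a window one frame tall scrolls along it by a fixed number of pixels at each screen refresh.

```python
FRAME_HEIGHT = 100  # assumed height of one scanned film frame, in pixels
NUM_FRAMES = 8      # assumed number of frames on the scanned strip


def window_contents(offset, frame_height=FRAME_HEIGHT, num_frames=NUM_FRAMES):
    """Return (frame_index, pixels_into_frame) for the top edge of the window."""
    strip_length = frame_height * num_frames
    offset %= strip_length  # the strip loops, as a zoetrope does
    return offset // frame_height, offset % frame_height


def scroll(speed, refreshes):
    """Window positions over several screen refreshes at a given scroll speed."""
    return [window_contents(speed * i) for i in range(refreshes)]


# Scroll speed equal to the frame height: each refresh shows a whole new frame.
aligned = scroll(speed=FRAME_HEIGHT, refreshes=4)

# Any other speed: the window straddles frame boundaries and the strip just slides.
drifting = scroll(speed=30, refreshes=4)
```

In the aligned case the second value of every pair is zero, so each refresh lands exactly on a frame boundary and successive frames replace one another in place, which is what produces the illusion of animation.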


Figure 20: Antirom’s Installation for the JAM exhibition at the Barbican Centre, London (Allenson et al., 1996a)


Design of all kinds, but particularly interaction design, is a combination of experimentation, play, considered thought and iterative prototyping. In Norman’s (2003) design principles a large part of the design effort goes into highlighting some affordances and downplaying others so that the desired controls can be perceived and the desired actions are able to be discovered. Norman’s use of the words “controls” and “actions” signifies his focus on the design of objects that are used for a task of some kind, a means to an end. Of course in these situations these principles make perfect sense, but, as Norman also suggests with his comment on games, when these principles are violated something else may be the order of the day: the means and the end may be the same thing.


Discovering the action of something may be the purpose of it and nothing more. As Steven Johnson (2006) argues, the idea of many videogames is not to read the manual, learn the rules and then sit down to try to win. The idea is to work out what you have to do in the first place. Discovering the rules is half of the game, whether it is classically framed as a ‘game’ or simply an interface:

Just as we saw in the world of games, learning the intricacies of a new interface can be a genuine pleasure. This is a story that is not often enough told in describing our evolving relationship with software. There is a kind of exploratory wonder in downloading a new application, and meandering through its commands and dialog boxes, learning its tricks by feel. I've often found certain applications are more fun to explore the first time than they actually are to use – because in the initial exploration, you can delight in features that are clever without being terribly helpful. This sounds like something only a hardened tech geek would say, but I suspect the feeling has become much more mainstream over the past few years. Think of the millions of ordinary music fans who downloaded Apple's iTunes software: I'm sure many of them enjoyed their first walk through the applications, seeing all the tools that would revolutionize the way they listened to music. Many of them, I suspect, eschewed the manual all together, choosing to probe the application the way gamers investigate their virtual worlds: from the inside. That probing is a powerful form of intellectual activity – you're learning the rules of a complex system without a guide, after all. And it's all the more powerful for being fun (Johnson, 2006, pp. 122-123).

Games are perhaps not the best example here because we quite often do use tools and interfaces to enable or enhance our actions in a game. Anything from a tennis racket to a Playstation controller is an interface to the actions of the game. In the case of Nintendo’s Wii Sports the tennis racket is simultaneously the controller (Nintendo Company Ltd., 2007).


Before we even decide that we want to understand the rules or play the game, however, we have to be enticed and seduced into the process and this is the invitation to play. The rules and their boundaries form the next principle because, having made the decision to engage, the first activity is to find the edges of the playing field and the limits of the rules. It is, as Johnson points out, a pleasurable experience analogous to unpacking a present or a new purchase (an experience that Apple is particularly adept at designing).


Within any design project that also involves some kind of engineering (computer-based or industrial), there is often a tension between the designers and the engineers. Even if these roles may be played by the same person, that tension often still exists in an internal creative struggle. Complex programming does not always give rise to a deeply engaging interactive experience – often the simplest sketch can draw interactors in for a considerable time. On the other hand, something that should be simple can be boring or frustrating if the engineering and programming does not adequately support the idea.


The much-cited iPod from Apple is a remarkable example of these two principles aligning with great success. The original and subsequent iPods are beautiful and enticing as physical objects. They feel pleasant in the palm of the hand and the touch-sensitive scroll wheel (in the second generation version) invites play, as does the multi-touch screen in the latest incarnation of the iPhone and iPod Touch. None of these properties is intrinsic to the function of the device, which is to play music (and now display photographs, video and play games). The iPod’s designer, Jonathan Ive, could have chosen a number of other options for navigating the content on the device. Although there are plenty of other MP3 players on the market, some of which have had better technical specifications than the iPod, none have matched its popular appeal. Though buying an iPod is very much a lifestyle purchasing decision, it is interesting to note that Apple’s marketing has not relied on this, or on certain types of models, to lend their appeal to the iPod. The models in the early advertisements are in silhouette: the iPod is the sexy model and its white headphone cables and ear buds are the star. The invitation to play is integral to the iPod’s advertising and branding and the actual experience lives up to the promise, a crucial factor in its success.


Without wanting to digress too far into the marketing material of Apple, it is worth noting that for the release of Apple’s multi-touch iPhone and iPod Touch in 2007 the advertisements (and those released since) concentrated on the interface entirely (Apple Inc., 2008). A finger enters from off-screen to play with the interface and the entire experience of the interaction is the selling point peppered with generic captions such as “Music” or “Videos”. In these commercials, the content is subservient to the experience of accessing the content. Apple invite you to purchase an iPod Touch or iPhone and play (and pay them for the privilege, of course).


If the invitation to play is not part of an interactive work – whether an interface to a physical product, a game, an artwork or an MP3 player – then it is unlikely that potential interactors will experience anything else, and they will have neither the desire nor the time to learn the interface. In many more practical cases, such as a computer operating system, website or a database, the invitation to play is embedded in the desire to achieve a task and plenty of users/interactors will battle through a difficult interface to achieve their desired goal (Krug, 2006). In these cases the play usually ceases to be very playful and the ‘work’ or ‘task in hand’ takes priority over the experience. It is less an invitation to play than a demand to battle through the rules of the system. However, research has shown that users find that interfaces that are pleasurable to use are also easier to use (Norman, 2005; Krug, 2006; Saffer, 2006).


When the interface (and this might be as simple as a transformed video image of the interactor) invites immediate play, and the interactor can understand the interface through play, then they are learning the interface during the process of doing. This often does not feel like an explicit learning experience at all. This may either be the purpose of the work in its entirety or something that pleasantly draws the interactor into a more complex relationship with other content.


We can see in traditional media forms a similar parallel in the book cover, the movie trailer, and the television commercial, to name but a few. The decision to engage with any of these forms is almost wholly emotionally driven even though it might be informed by complex psychological patterns – a process most recently popularised by Malcolm Gladwell’s Blink: The Power of Thinking Without Thinking (2005). Paying close attention to the emotional response to the invitation to play is essential and one of the ways of eliciting it is to get rid of the interface entirely.

Case Study: Time Sketches

The Creativity and Cognition Studio (CCS) at the University of Technology, Sydney (UTS) is an interdisciplinary research environment that brings together “artists, technologists, curators, sociologists and everything in between” (Edmonds, 2007). One of the projects that the CCS runs is beta_space in collaboration with the Powerhouse Museum in Sydney. beta_space is an experimental exhibition area of the museum devoted to allowing the CCS researchers to examine audience experiences of interactive artworks. It is, as Muller and Edmonds describe it, a “living laboratory” (Muller & Edmonds, 2006). In early 2006 I was invited to place two works, called Time Sketches (Polaine, 2006b), in the space.


I had been developing the interactive engines for Time Sketches for my own experimentation as part of  some early explorations for this thesis, although they were not initially intended to be a central part of the final submission as works themselves. beta_space, however, proved too valuable an environment not to conduct some research on audience experiences of my interactive works, helping to confirm and inform my own anecdotal experiences from 15 years of working both as an interaction designer and artist.


The works themselves are intentionally simple because I wanted to try and use the least amount of interface to encourage the greatest amount of interactivity. Essentially the question here is, “What is the minimum interaction required to create a playful interactive experience?” This was particularly pertinent to a museum and/or gallery environment in which visitors are often encouraged to stand and look and remain in ‘reverent visitor’ mode. The advantage and potential challenge of the Powerhouse Museum is that it is already full of interactive exhibits in the vicinity of beta_space. This sets the stage and atmosphere well because audiences – I prefer the term ‘interactors’ –  are inclined towards interaction with exhibits and installations already. The drawback is that it is much more of a challenge to divert and gain their attention.


The Time Sketches comprised a primary piece called Time Smear and a smaller, secondary piece, called Time Slicer. Both deal with live video and the slicing of time as interactive building blocks and came out of my long fascination with time-lapse and slow-motion filming as an expression of time beyond the usual human experience of perception. They both use a camera and place the interactor on the screen as the content whilst manipulating the time and frames of the video stream, chopping it up into fleeting moments and stretching it out across space. The result is a digital hall of mirrors, where the interactors can see warped versions of themselves. As previously discussed, the use of a camera as the interface effectively dissolves the interface and thus the interaction – and in this case the interactor – becomes the content, the experience and the “work” itself.


Figure 21: Time Smear (Polaine, 2006b)


Time Smear works in a similar way to a scanner or photocopier. It freezes a vertical “slice” of space and then the next slice and so on across the screen width. The sample size can be altered as can the speed of the scan. This is similar to, though the inverse of, the slit-scan camera often used for ‘photo-finish’ photography at sporting events.


In a traditional slit-scan image, the shutter (the slit) remains in one place and constantly open whilst the film moves past it at a constant rate. In digital versions of this, a single line of pixels from a camera is joined to the previous one on screen, creating a similar effect (see Levin, 2005 for an informal catalogue of slit-scan artworks, including Time Smear).
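
The digital slit-scan technique described above can be sketched in a few lines of Python. This is a generic illustration, not code from any of the works catalogued by Levin; the function name `slit_scan`, the `slit_x` parameter and the stand-in "camera" frames are all hypothetical. Each frame is a list of rows of pixel values, and one column is taken from the same fixed position in every frame.

```python
def slit_scan(frames, slit_x):
    """Build an image whose column t is the column at slit_x taken from frame t."""
    height = len(frames[0])
    image = [[] for _ in range(height)]
    for frame in frames:
        for y in range(height):
            # The slit never moves; time is laid out along the horizontal axis.
            image[y].append(frame[y][slit_x])
    return image


# Stand-in for a camera: frame t is a 2 x 3 image filled with the value t.
frames = [[[t] * 3 for _ in range(2)] for t in range(4)]
result = slit_scan(frames, slit_x=1)
```

Because the stand-in frames are each filled with their own frame number, every row of `result` reads as a timeline from left to right, which makes the mapping of time onto space easy to check by eye.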


Time Smear works in reverse by freeze-framing a slice (a column, in this version) of pixels and then the next slice and the next over time leaving what usually appears to be a “smeared” image if there is movement in front of the camera, hence the name of the work. In the Time Smear image above, the boy on the left initially stayed still before moving his fist and wrist in line with the scan line and then staying still again once he had reached his brother’s jaw. In the small reflection in the middle of the screen (the screen was a back-projection onto acrylic) it is possible to see his face captured looking at the screen to measure his positioning.


Although the physical and screen-based elements of Time Smear are simple, quite some time was spent trying the work out on people to see what parts were essential or not and how necessary it was to make the ‘interface’ clear to interactors walking in fresh. Crucially, the video feed is live before it is frozen in a slice so that the interactor can orchestrate their movements. This means that the screen is partly live and partly frozen as the slice line wipes across the screen. A previous version of the work did not have this feature – the new frozen frame slices simply wiped over the previous static one – and as a result the interaction soon became a frustrating hit-and-miss affair because it was very difficult to work out exactly where the freeze frame would be and where to position yourself.
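
A minimal sketch of the partly live, partly frozen display described above follows. It is an illustration in Python rather than the code that ran in the installation; the function name `time_smear_pass`, the `slice_width` parameter (standing in for the adjustable sample size) and the stand-in frames are assumptions. For each incoming frame the columns under the scan line are frozen, the scan line advances, and everything it has not yet reached continues to show the live feed.

```python
def time_smear_pass(frames, slice_width=1):
    """Simulate one left-to-right pass and return the display after each frame."""
    height, width = len(frames[0]), len(frames[0][0])
    frozen = [[None] * width for _ in range(height)]
    scan_x = 0
    displays = []
    for frame in frames:
        if scan_x >= width:
            break  # the scan line has crossed the whole screen
        end = min(scan_x + slice_width, width)
        for y in range(height):
            # Freeze the columns currently under the scan line.
            frozen[y][scan_x:end] = frame[y][scan_x:end]
        scan_x = end
        # Left of the scan line is frozen; the rest is still the live feed.
        displays.append([
            [frozen[y][x] if x < scan_x else frame[y][x] for x in range(width)]
            for y in range(height)
        ])
    return displays


# Stand-in for a camera: frame t is a 1 x 4 image filled with the value t.
frames = [[[t] * 4] for t in range(4)]
displays = time_smear_pass(frames)
```

Replacing the live branch with the contents of the previous pass would model the earlier version of the work, in which the interactor had no live image with which to judge where the next slice would fall.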


Once the installation was put in place several other adjustments were required. The first problem was that the background in the video frame, the back wall of beta_space, was a plain colour and not moving. In my tests in my own workspace with students and friends, there was always some movement in the background so that it was obvious to an onlooker what was going on. In beta_space a person glancing in from the doorway (not in the video frame) would see a blank screen that looked like a live, albeit empty, video feed because there was no movement to be frozen, just the plain back wall. It was only when someone walked in front of the camera, and then only when they happened to walk in at the right moment in the middle of a slice scan, that they understood what the installation was about. Otherwise they tended to move on, perhaps thinking it was broken.


To remedy this, I added a tiny affordance in the form of a very small marker – a white line two pixels wide and 10 pixels high – that ran across the bottom of the screen. Although subtle, this served as enough of a visual cue for interactors to successfully engage with the piece as well as for potential interactors to see that the work was active and awaiting participation.


Figure 22: Detail of the Time Smear slice line indicator


This simple interaction usually manages to capture people’s attention and make them wave their arms in the air, move around strangely, and generally play with the possibilities of the piece. The longer they engage in this behaviour the more successful I feel the work has been, compared to the thirty seconds someone might spend in front of a well-known painting at a national gallery. Most importantly, the invitation to play is clear as soon as one wanders in front of the camera and sees oneself on the screen.

When I first went in here the thing that initially captured my eye was the big screen there, and the fact that there was that little line at the bottom made it fairly intuitive to get the hang of exactly what it did. But to begin with I was looking at... I guess the different kind of images that I could make just using this small amount of interactivity, for example walking around and seeing if I could give myself really long fingers or something like that or just the illusion of doing that. I thought it was pretty interesting just the way how, by standing still, the image would remain the same, but if you moved at all, it would warp in unexpected ways – Time-Smear Interactor.

The end results are, of course, reliant on the interactor’s movements but are often reminiscent of the paintings of Francis Bacon in the way they smear and distort features and elements. Many interactors used the piece in order to see their body from different angles and distortions in a pleasant reflection of  the earlier quote from Merleau-Ponty on the Cubists: “Thus different areas of their paintings are seen from different points of view” (Merleau-Ponty, 2004, p. 54).


A second piece, Time Slicer, works in a much simpler manner. The video stream from the camera is brought into the program at a relatively low frame-rate and each frame is then displayed next to the previous one forming a grid of 20 squares on the screen, each containing a frame from the video. When the refresh of these squares reaches the bottom right of the screen the process starts again from the top left. The entire screen refreshes around once every one or two seconds, depending on the speed the code engine is set to run at.
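
The grid logic described above can be sketched as follows. This is an illustrative sketch in Python, not the code used in the installation: the text specifies only that there are 20 squares, so the layout of five columns by four rows is an assumption, as are the function names.

```python
GRID_COLS, GRID_ROWS = 5, 4  # assumed layout; the text only specifies 20 squares
CELLS = GRID_COLS * GRID_ROWS


def cell_for_frame(n):
    """Return the (row, column) in which the n-th captured frame is displayed."""
    index = n % CELLS  # wrap back to the top left after the bottom right
    return index // GRID_COLS, index % GRID_COLS


def refresh_seconds(frames_per_second):
    """Time taken for the whole grid to be overwritten once."""
    return CELLS / frames_per_second
```

Under these assumptions, a full-screen refresh of between one and two seconds corresponds to a capture rate of roughly 10 to 20 frames per second, which is consistent with the relatively low frame-rate mentioned above.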


Fine-tuning the piece – creating the correct level and balance of challenge versus fulfilment (or boredom and anxiety in Csikszentmihalyi’s terms) – was critical and is a central aspect of interaction design and a process sometimes left out of the development of interactive artworks (cf. Edmonds, 2006; Muller & Edmonds, 2006; Muller, Edmonds, & Connell, 2006a; Muller, Turner, Khut, & Edmonds, 2006b). Through trial and error I adjusted the code until I found the best time delay to engage interactors with the work. Anything faster and it is too fast for them to see the results of their actions in previous frames; anything slower and it no longer has a sense of movement. Again, the aim here was to create as much interactive engagement from the minimum amount of interface as possible.

Figure 23: Time Slicer (Polaine, 2006b)


In the testing phase in my own studio, from which the above image is taken, a great deal of engagement and fun came from allowing the interactor to handle the camera. Due to the time delay a common ‘discovery’ was to play with trying to recreate one’s face from the various slices by moving the camera across it in time with the freeze-frames. This also allowed people to explore the rules and dynamics of the piece, which will be examined in further detail in the next chapter, Principle 2 - The Playing Field and the Rules.


In its set-up in beta_space Time Slicer was running on a plasma screen opposite the open entrance area to the space. This provided an immediate ‘attractor’ to passers-by and drew them into the space and, importantly, in front of the camera that Time Smear was using. Although the lighting in the space was not ideal, this layout worked relatively well in terms of bringing interactors into the space and kick-starting the initial action, reaction, interaction cycle.


Unfortunately I was unable to have the camera for Time Slicer available for the audience to play with due to concerns that it might get broken and time and budgetary constraints that prevented me from having an appropriate casing made for it. This meant that most of the interaction was fairly rudimentary in front of the camera as in the following photograph:


Figure 24: Time Slicer installed in beta_space


The chief advantage of beta_space over a conventional gallery space is that it is specifically set up for research into audience participation with interactive artworks and to ‘beta test’ those works (Muller & Edmonds, 2006). Additionally, researchers at the Creativity and Cognition Studios, UTS had already developed a methodology to interview interactors who were members of the public visiting the gallery (Costello, Muller, Amitani, & Edmonds, 2005; Muller et al., 2006b). The primary method used was video-cued recall, which involves filming a subject’s interaction and then asking them to describe their own emotions and thought processes as they watch the video playback immediately afterwards. This technique, supported by Suchman (Suchman & Trigg, 1991), prevents the interview process from interfering with the interaction and also prevents lapses of recall when interviewing subjects afterwards, which can be a problem when much of the interactivity is about often unconscious bodily movements (Ericsson & Simon, 1993). The Creativity and Cognition Studio researchers have had great success in terms of both accuracy and experiential data with this methodology (Costello et al., 2005).


The full transcriptions of these interviews are detailed in the appendices. They are not intended to form a full research project in their own right (the number of subjects is too small to form a coherent empirical analysis), but nevertheless the results contain insights that are highlighted here to illustrate elements of the principles of interactivity.


Here we see the initial experience of a father and son in front of the Time Sketches:

I was a bit nervous to start with. Callum [his son] didn’t know what we were doing. Then we saw the video. We didn’t really work out what it was at first so we just looked at both screens. It’s very interesting – Time Sketches Interactor.

In most of these cases the invitation to play was slightly forced because the experimenters explicitly asked people who were already showing a slight interest in the work whether they would take part in the research. It is quite possible that they would not have played with the works and simply continued onto other exhibits in the museum. The subsequent responses, however, tell us a great deal about the processes that people go through when initially exploring the boundaries of an interactive work and these are covered in more detail in the following chapter.


Most importantly, they help explain the experience of the interaction, the part of the work that I consider the artwork. The images themselves, whilst engaging and sometimes amusing, are less important to me as the artist/designer – they are ephemeral records of the moment. This is a common feature of interactive artworks, as Cornock and Edmonds note in their commentary about their 1969/1970 work, Datapack, a piece in which ‘participants’ (interactors in my terminology) are “allotted parcels of air space” and the “idea of a sculpture” rather than the actual object. In Datapack the participants can take away the typed record (an edition of multiples), but not the digital object itself:

Some measure of entertainment may be gained from the proceedings and from the edition of multiples which may be treated as having some aesthetic value or interest. But the burden of the exercise lies with the individual and his own conceptual behaviour when confronted with the art system using our chosen matrix (Cornock & Edmonds, 1973, p. 15).

Artist and theorist, Simon Penny, in his argument for a critical analysis of embodied interaction sums it up thus:


[T]he persuasiveness of interactivity is not in the images per se, but in the fact that bodily behaviour is intertwined with the formation of representations. It is the ongoing interaction between these representations and the embodied behaviour of the user that makes such images more than images. This interaction renders conventional critiques of representation inadequate, and calls for the theoretical and aesthetic study of embodied interaction (Penny, 2004, p. 83).

Case Study: Body Movies – public play and interaction

Before moving on to explore the second principle of interactivity, I want to examine a work that appears to function successfully across many levels. Rafael Lozano-Hemmer’s 2001 work, Body Movies (2001a), is perhaps one of the most monumental examples of public play and interaction.


Body Movies consists of an enormous projection on the side of a building. In the first installation of the work this was the façade of the Pathé Cinema in Rotterdam, which is 90 metres long by 22 metres tall. The cinema also faces onto a large public square. Projected onto the façade from above are sets of portraits (normally six or seven at a time) of over 1,000 people photographed in the streets of Rotterdam and other cities around the world. The projections of the figures are then bleached out by two 7000 Watt Xenon lamps placed at ground level. When passers-by walk in front of the lamps they cast a shadow and reveal the figures. When each figure is simultaneously completely revealed by the shadows a new set of figures is displayed and the process starts again.
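
The rule described above, that a new set of portraits appears only once every figure in the current set is simultaneously and completely revealed, can be expressed as a short sketch. This is purely illustrative Python and makes no claim about how Lozano-Hemmer's software is implemented. Portraits and shadows are modelled as sets of pixel coordinates, and the looping back to the first set after the last is an assumption made for the sake of a self-contained example.

```python
def all_revealed(portraits, shadow):
    """True when every portrait lies wholly inside the shadowed area."""
    return all(portrait <= shadow for portrait in portraits)


def step(portrait_sets, current, shadow):
    """Move to the next set of portraits once the current set is fully revealed."""
    if all_revealed(portrait_sets[current], shadow):
        return (current + 1) % len(portrait_sets)  # assumed wrap-around
    return current


# Two hypothetical sets of portraits, each portrait a set of (x, y) pixels.
portrait_sets = [
    [{(0, 0), (0, 1)}, {(5, 5)}],
    [{(2, 2)}],
]

partial = step(portrait_sets, 0, shadow={(0, 0), (0, 1)})
complete = step(portrait_sets, 0, shadow={(0, 0), (0, 1), (5, 5), (9, 9)})
```

In the partial case one passer-by covers the first portrait while the second remains bleached out, so nothing changes. Only when the shadows together cover every portrait does the work advance, which is why the rule rewards cooperation between strangers.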


Figure 25: Rafael Lozano-Hemmer's Body Movies (2001b). Photo: Jan Sprij


Though monumental in scale, the idea is extremely simple. As children most of us have played with our hand or other shapes held over a torch shining on the bedroom ceiling at night. Lozano-Hemmer explains that part of the original inspiration came from the 17th Century artist, Samuel van Hoogstraten, who was a master of optical illusion and trompe l’oeil and whose work “emphasises the performative role of the observer” (Lozano-Hemmer, 2001b). In particular he focused upon van Hoogstraten’s 1675 engraving, "The Shadow Dance" from his book Inleiding tot de Hogeschool der Schilderkunst (Hoogstraten, 1675) which shows a bright light source at ground level and the various figures casting shadows as demons or angels.


Figure 26: Samuel van Hoogstraten's 1675 engraving, The Shadow Dance (Hoogstraten, 1675)


Body Movies is a successful piece of work on many levels. It has been shown in several major cities internationally and is no small undertaking. It is clear from the writings about the work (Cameron, 2004; Bullivant, 2006) and from Lozano-Hemmer’s documentation (Lozano-Hemmer, 2001b) that ‘audiences’ also find the work extremely engaging and playful. I have placed ‘audiences’ in inverted commas because they are also the performers and interactors of the work and with the work. What Lozano-Hemmer manages to achieve is an interactive and video-based public installation that works on several levels, each one stepping the participant closer to the meanings and ideas contained within the piece. The very first step, however, is a very clear invitation to play.


By creating such a monumental work in a public square Lozano-Hemmer is immediately able to draw upon the passers-by for both his content and his interactors. The powerful lights and resulting giant shadows provide a compelling reason for passers-by to stop and engage with the work – this aspect is a giant invitation to play. Almost immediately people stop and play shadow shapes across the façade. The ability to ‘scale’ themselves larger and smaller is intuitive and very quickly complete strangers (and, of course, friends) begin to play with each other’s shadows, creating little scenes and scenarios, playing God or the devil, or giant puppeteers and marionettes. The sense of joy in the performative aspect is palpable from the video documentation, the shadows perhaps providing a sense of anonymity and remove that gives the interactor-performers license and safety to perform in public. At this point the interactors have not yet engaged with the ‘content’ of the work, with the projected images of people, but they have been caught, seduced and drawn into the playing space.


Often this is as far as the engagement goes; however, it is a noteworthy feat to create an environment in a public space where complete strangers play, interact and connect with each other, especially in the current climate of fear about strangers. (This need for human connection and interaction is explored in the final chapter on playing together.) The next stage in the audience engagement appears to be people starting to play with the projected portraits, sometimes ‘inhabiting’ them, as Lozano-Hemmer puts it. We can see this process in the image below from the installation in Duisburg, Germany:


Figure 27: Rafael Lozano-Hemmer, Body Movies (2003), Duisburg Akzente, Germany


At this point the interactors begin to engage with Lozano-Hemmer’s own content and not just his system or play space and this is the result of the interactors beginning to explore the setting and the rules of the play space. Many aspects of the piece’s rules and space exist in the affordances of the public square, the building and the nature of interpersonal interaction. A large part of Lozano-Hemmer’s artistry, however, is how he has used these affordances and combined them with his additional ones of the projections and shadows. The construction of these kinds of interactive, playful spaces, the explicit and tacit rules, and the ease with which interactors can find their boundaries forms the second principle of interactivity.


CHAPTER 7 – Principle 2: The Playing Field & the Rules

It was like a toy. When you first get a toy in [sic] to start playing with it, just play with it and just testing it out and see the limits and what you can do with it, and things like that, trying to work out how it does it as well.  That was the bit I enjoyed as well. Trying to work it out and see what it could do – Time Sketches interactor.

As the games theory analysis of Huizinga’s ‘magic circle’ demonstrated, games and play spaces as well as ‘open’ play utilise rules in order to release the players from everyday life and allow them to temporarily adopt another persona or way of acting. Often the rules are deceptively simple, such as keeping a kite in the air, playing within the lines on the field or scoring the most goals. Children’s games often have rules that are simple and easily understood, which belie the resulting complexities once play is in progress. The tacit or “unwritten rules”, as Stephen Sniderman (2005) describes them, can be literally infinitely complex but are an essential aspect of play, games and life.


When creating an interactive artwork, often the actual objects (the graphics on the screen or the sounds, for example) are the least important aspect of the work. Even in more design-based scenarios, the feel and the experience can make or break its usage. If we have managed to seduce the audience into becoming interactors by the means of a powerful invitation to play, the key to keeping them engaged in the interactivity is setting up the correct conditions and rules for perpetuating a playful flow experience (Csikszentmihalyi, 1990; Csikszentmihalyi, 1996). Of course, for more functional interfaces such as applications and operating systems the objects on the screen are important, but so is playfulness. As Mark Pesce points out (2000), the more we are confronted by interfaces everywhere, on every surface, the more it will be necessary to be able to learn how to use them through play because we have no time to read a manual. Those interfaces that do not invite us to play and, once we start playing, quickly reveal their rules and boundaries are likely to fall by the wayside regardless of how clever the underlying technology is. Those companies, designers and engineers who manage to combine the two – technical advances with simpler, more playful interfaces – are set to have an advantage as we have witnessed in Apple’s remarkably quick success with the iPhone (NPD Group, 2008).


Once the invitation to play has been successful, there are two key aspects to further engaging the player, interactor or user – the playing field (the “magic circle”) and the rules. These translate into the boundaries and system that the creator of an interactive work has decided upon (here I am deliberately using the more open term of ‘work’ to include the spectrum from artworks to task-focussed interfaces). Some of the boundaries and rules are physical such as the size of the screen or the size of the physical space in which a work is installed. Some are unknowable and sometimes accidental, such as the location and physical attributes of the participant (this is increasingly a feature of mobile and location-based interactive experiences). Some elements, such as what the interactive work can and cannot do, what sounds and images it may include or exclude, and the dynamics of the interaction are usually the focus of much trial and error. The ‘rules of the system’ may change over time based upon the actions of the interactor and this forms the core of the third principle, The Challenge and the Border between Boredom and Anxiety. In the first instance, however, having engaged with the work, the interactors immediately test the boundaries and affordances of the system and try to discover the rules of play.

The Rules of Play

At first glance rules and play appear to be antithetical. Is not play all about freedom from rules? Is not interaction all about choices? This would suggest that fewer rules and more freedom of choice would result in a better game. Barry Schwartz’s critique of consumer culture and the commonly held philosophy that more choice is always better in The Paradox of Choice (2004) explains just why too much choice can be overwhelming and have the opposite effect. Rather than make the wrong choice out of too many options, we often decide to make no choice at all.


Rules and constrictions are often paradoxical – they often lead to greater freedom by nature of their giving structure. Writing a story from scratch with a completely new style and structure from what has ever gone before is difficult, if not impossible, unless we lock ourselves away from the world at birth. As we have already discussed in reference to Aristotle, story arcs, character journeys, dynamic tension and release in individual scenes all help rather than hinder the process of telling a story. Many of the classic gurus on writing (such as McKee, 1997 and Field, 1984) emphasise how important structure is and that when the structure works, the writing itself is what brings the story to life. Conversely, no matter how spectacular the use of language, if the structure is not there the reader or audience will become lost or bored. There are, of course, exceptions to this approach and innovative writers seemingly break these rules all the time (although on closer inspection the rules are often more bent than broken). The paradox, in writing at least, is that the use of a fairly strict set of structural rules frees up the writer to write.


This paradox is not just true for writing. Artists and designers have long used a restrictive palette of colours, forms or words to drive their creativity and the limited power of early computers and consoles inspired great feats of imagination and ingenuity from both the user/player and the designers. We see this clearly in play and games as well – all games are predicated on rules, but the situation is more complicated than it at first appears.


The Lusory Attitude

Some games are extremely complex with many rules, sub-rules and exceptions such as the famous role-playing game Dungeons & Dragons (Gygax, 1979). Others, such as Noughts and Crosses (or Tic Tac Toe - the name of the game in the USA) seem simple: Players play on a three by three grid. Each player takes turns to mark a square of the grid with their own marker (a cross or a nought). The game is won by the player who places three of their markers in a row horizontally, vertically or diagonally.
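To show just how little the stated rules of Noughts and Crosses amount to, they can be written out as a short program. What follows is only an illustrative sketch in Python: the function names (new_board, place, winner, play) and the assumption that the cross always moves first are mine, not part of any canonical rule set.

```python
from typing import List, Optional, Tuple

# A board is a three by three grid; each square holds "X", "O" or None (empty).
Board = List[List[Optional[str]]]


def new_board() -> Board:
    """Return an empty three by three grid."""
    return [[None] * 3 for _ in range(3)]


def place(board: Board, row: int, col: int, marker: str) -> None:
    """Mark one square with a player's marker; a marked square cannot be reused."""
    if marker not in ("X", "O"):
        raise ValueError("marker must be 'X' or 'O'")
    if board[row][col] is not None:
        raise ValueError("square already marked")
    board[row][col] = marker


def winner(board: Board) -> Optional[str]:
    """Return the marker that fills a row, column or diagonal, or None."""
    lines = [list(r) for r in board]                                   # rows
    lines += [[board[r][c] for r in range(3)] for c in range(3)]       # columns
    lines.append([board[i][i] for i in range(3)])                      # diagonal
    lines.append([board[i][2 - i] for i in range(3)])                  # anti-diagonal
    for line in lines:
        if line[0] is not None and line[0] == line[1] == line[2]:
            return line[0]
    return None


def play(moves: List[Tuple[int, int]]) -> Optional[str]:
    """Alternate turns (assumption: X moves first) until someone wins."""
    board = new_board()
    for turn, (row, col) in enumerate(moves):
        place(board, row, col, "X" if turn % 2 == 0 else "O")
        result = winner(board)
        if result is not None:
            return result
    return None  # no winner from the moves given
```

Calling play with a list of (row, column) squares returns the winning marker, or None if nobody has completed a line. Everything the sketch encodes fits in a few dozen lines; nothing in it says how long a player may take over a move or what counts as fair behaviour at the table.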


Yet, as Sniderman (2005) points out, there are countless unwritten meta-rules at play in a game as simple as Noughts and Crosses. Some of Sniderman’s questioning includes: How much pressure can the other player put on his or her opponent to place their mark? What constitutes a valid distraction? A cough? A punch in the face? How long is it acceptable to wait to take your turn? What does it mean to let someone “take back” a foolish move, such as might be the case when an adult plays against a child? Countless rules are at play that we may scarcely be aware of and we would end up in  “the infinite-regress trap” (Sniderman, 2005, p. 492) were we to try and describe them all, for there is always a set of rules (social, legal, moral, etc.) above any that we might try to name. On the other hand, we can play games and reach a mostly un-stated consensus without disappearing down the relativist’s black hole:

Perhaps the single most important ‘rules’ that are literally unstatable, then, are those that define the context of the game and answer the question, ‘When is the game being played?’ None of us can say how we know that we are in fact playing a particular game (rather than, say, just practicing), but we generally have no trouble knowing that we are (Ibid., p. 484).

Sniderman’s explanation for this is simple – “it’s only a game” (Ibid., p. 495). With this, we return to the theme of the willing suspension of disbelief we encountered in Chapter 2:

Games feel like any life-event, so we can be immersed in them even though we may know intellectually that they are artificial constructions. Therefore, it makes no difference to us (emotionally) that a list of rules governing them cannot be completed, just as we can be profoundly affected by a joke or a piece of fiction or nightmare that is not logical, realistic, or ‘complete’.

We can suspend disbelief and rationality [...] and respond deeply to creations of the imagination – our own or others’.

[...] In other words, we can operate on (at least) two distinct levels of cognition at once. We can play any game as if it had an autonomous existence, even though we know perfectly well that the players create the game each time they agree to play and that any player at any time can destroy the game by quitting, by arguing, by stalling, or by any number of other spoilsport tactics (Ibid., p. 495).

It is this dual state of play that Winnicott describes as being an “inner psychic reality” that is “outside the individual, but it is not the external world” (Winnicott, 2001, p. 52). If we accept Sniderman’s account of the nature of rules in games and pair it with Winnicott’s description of play, we are led back into the world of metaphor once more.


Crucial to understanding the relationship between play, metaphor and interactivity is this: When a metaphor borrows from one sphere of experience to describe another, new, experience, it is functioning in both states simultaneously. In order for metaphors to function, at least those that are new and not subsumed into daily language, we must also suspend our disbelief. For example, I know that the stock-market is not really a bubble and that it has not literally burst, but I suspend my disbelief in order to profit from accepting the metaphor and understanding the situation. But I use both the mental image of an actual bubble bursting and the metaphorical understanding of its meaning at the same time. Without holding the metaphor and the new situation in the mind at once, the metaphor could not work.


Bernard Suits in his philosophical treatise on play and games, The Grasshopper (2005),  conducts a dialectical analysis of the definition of games with particular attention to rules. Rules, he argues, “are accepted for the activity they make possible” (Suits, 2005, p. 181) and in connection with the world outside of the game he has this to say: “In morals, obedience to rules makes the action right, but in games it makes the action” (Ibid., p. 182).


Suits describes this willingness to abide by rules that in any other situation outside of a game would make little sense as the ‘lusory attitude’ (Ibid., p. 185) and gives a final, concise definition of playing a game as “the voluntary attempt to overcome unnecessary obstacles” (Ibid., p. 190). In essence, without the players having a lusory attitude there is no game because the game relies on the players playing within the agreed rules precisely because they are deliberately inefficient or obstructive. The simplest way to ‘win’ a hurdles race, for example, would be to run alongside them and not bother to jump. The simplest way to ‘win’ a fight in a boxing match would be to shoot your opponent. Obviously these things do not (normally) happen and it is because all players maintain the lusory attitude or the willing suspension of disbelief that the game rules are not ‘real’. The rules – and, by extension, the playing ‘field’ or magic circle – are the game. Not submitting to these rules means there is no game.

Using the lens of gameplay

This is all well and good for playing games, but where does this leave us in terms of interaction design that is not explicitly gameplay oriented? Can it inform the study of interaction more broadly and, if so, how?


There are two key insights to be had through the lens of gameplay here. The first is that the integrity of the magic circle, the willing suspension of disbelief and the internal coherence of the rule system can be directly mapped to the coherence of interactions employing metaphors. In the same way that it is frustrating and incoherent in Tomb Raider that Lara Croft has a pair of pistols, but cannot shoot open a lock on a wooden door (and instead must find a key), interfaces and interactions whose metaphors lack coherence suffer similar problems. This is demonstrated time and time again in the literature on usability and interaction design (see Krug, 2006 and Saffer, 2006, for example). A graphic that looks like a clickable button but is not or, conversely, a button that reveals no affordances as a button, fails basic usability criteria.


Classic approaches to usability and HCI, however, can often become caught up in a list of rules that are very specific to the particular technology to hand. Web sites that used excessive bandwidth just seven or eight years ago were deemed to have terrible usability, yet now in the days of broadband (in most parts of the developed world), downloading video and music streams is the norm. The upshot is that, whilst designers can still push things to excess, it is sometimes worth spending the extra bandwidth ‘hit’ on a code or visual solution that makes the usability better, not worse.


When usability experts such as Jakob Nielsen (1993) prescribe a set of technologically or engineering defined rules for usability they become rapidly out-dated and they often fail to take into account the purpose and ‘playing field’ in question. Rules that are developed in the same way as game rules in order to maintain the integrity and coherence of the game are much more transferable. Thinking about maintaining the willing suspension of disbelief or the integrity of the magic circle avoids technological determinism in interaction design and can be more easily transported across the continuum from art installations to very functional interaction and interface design. The principle of maintaining the suspension of disbelief and the magic circle remains the same, it is just that the magic circle or illusory world itself changes depending on context.


The second insight that we can take from this analysis of gameplay is that it is essential to understand, predict and manipulate the lusory attitude of the interactor. In user-centered and experience design terms this is akin to working from the user’s perspective through techniques such as insights, field research and personas (Krug, 2006; Saffer, 2006; Mulder & Yaar, 2006). To understand the interactor’s lusory attitude is to understand the set of rules that the interactor or user is ‘playing’ by or what mental schemas they have of the system they are part of and interacting with.


It is common, for example, for novice web users to type the entire URL of a web site into the search engine box rather than the address bar. On the one hand, this is a fundamental misunderstanding of how the Web and Web browsers work. On the other, it is a perfectly legitimate way of conceiving of the Web given that most browsers come installed with a set ‘home page’ that usually includes a prominently displayed search engine box, and given the dominance of search online, particularly by Google. This is demonstrated by some advertisers who now simply say “search for...” and the product name or keyword rather than using their URL in their adverts (Tait, 2008).


By trying to understand both the task the interactor is wanting to complete and their mental model for doing so, we are essentially asking what game rules they are playing by. The classical approach to interaction design differs, however, in that the tasks of a ‘user’ are usually aimed to be completed by the most efficient means possible. This puts their activities outside Suits’s (2005) definition because in an efficient task-oriented interaction, unnecessary obstacles are not usually willingly overcome.


One way of thinking about gaming theory’s application to the interaction design of tool and task-based interfaces might be as a kind of inverse game design exercise. Here one could imagine that the aim of the designer is not to carefully introduce entertaining obstacles and constraining rules into the system, but to remove them as much as possible so that the ‘player’ can go directly from A to B as efficiently as possible. This approach aligns with many traditional HCI and usability approaches to interaction design, focussing, as they do, on the premise of interaction and interfaces being about tools and tasks. The problem with this approach from an interactive experience point of view is that it assumes that efficiency is the highest form of currency in interaction design and every other aspect of the experience is second-class. The results of this approach can often be efficient interfaces but uninspiring experiences.


If we focus on the user as a player instead of a worker or task achiever the approach is much more flexible. It may, for example, be beneficial and enjoyable for the user if there are deliberate obstacles in the process. This can be seen in “Simple Modes” in applications, in which the user sacrifices being able to do everything for a simpler, more minimal activity environment. It is also present in most “training” or “set-up” modes for applications where the user is unable to proceed to the next step until the current one has been completed. On a visual front, we see various effects in operating systems, such as windows morphing, fading in and out or sliding as animations. This is not the most efficient way to display or remove a window, but it can be more engaging and entertaining for the user. It also draws attention to key processes and provides some affordances where there are no ‘natural’ ones because of the on-screen nature of the environment.


By thinking of the user as a player, we are able to apply the same way of thinking (e.g. What are the rules they are playing by? What is their lusory attitude?) to a wide range of interactions, including more experimental and abstract ones. At Antirom we often made interactive sound ‘toys’ and it was noticeable that some were extremely simple, little more than audio playback buttons or mixers, and some were complex and more like a musical instrument (Allenson et al., 1994; Allenson et al., 1995; Allenson et al., 1996a; Allenson et al., 1996b). The technology of these toys was almost identical in each version, as was the audio-visual treatment. What changed was the complexity of interaction and, depending on the task or goal at hand (for example, a museum exhibit versus a tool for performance), they ranged from engaging to frustrating when one initially ‘played’ them. This process of experimentation highlighted the necessity to craft the right interactions for the right purpose, the right rules for the right game and player.


Taking the example of a real-world instrument such as a guitar, it is easy to see how important rules are because they relate to the entire mode of playing. A guitar has six strings, tuned in a certain way, with a number of frets along the neck. Those are the “rules” of a guitar. From this musicians can create all manner of combinations and evoke a wide range of musical emotions. The rules can be broken by de-tuning or post-processing the guitar sounds, but this is more of a bending of the rules. Breaking them completely renders the guitar less of an instrument than a box with a noise making capability.


Laurel suggests two rules of thumb for setting up good human-computer interaction. These are that “the potential for action in that particular universe is effectively laid out, and that the first incidents in the action set up promising lines of probability for future actions” (Laurel, 1993, p. 64). Although this advice is particularly relevant to interface design, successfully setting up the world of the interaction often requires stripping away any extraneous elements that might make the interaction confusing.


Again, Laurel turns to Aristotle for advice in this area:

An imitation of an action must represent one action, a complete whole, with its several incidents so closely connected that the transposal or withdrawal of any one of them will disjoin and dislocate the whole. For that which makes no perceptible difference by its presence or absence is not really part of the whole (Poetics, 1451a, 30-35 as cited in Laurel, 1993).

That is not to say that there is not a learning process involved and that every element should be labelled with what it does. In fact, that learning process is one of the key elements of engagement – it is part of the play. Trying to become “better” at the interaction, whether a sound toy or something that makes patterns from interaction with a camera, is an important feature of the interactive experience. Clicking a button labelled “click here to play a sound” and hearing a sound is predictably unsatisfying, unless it is a part of an interface in which that sound is crucial (such as a warning siren).


Interactive works that have an initial reactive element are successful because of the continuation of these “first incidents”. If the interactor does not even have to do anything except walk past a camera, touch the screen once or make a sound (all of which are elementary acts) to set a chain of events in motion the overhead of working out what to do in the first place is removed. Csikszentmihalyi has found that this clarity is essential to the flow experience. “The reason it is possible to achieve such complete involvement in a flow experience is that goals are usually clear, and feedback immediate” (Csikszentmihalyi, 1990, p. 54).

We are all hackers

A key aspect of play is hacking, prodding the system we are playing in, testing its boundaries and the rigidity, flexibility and integrity of its rules. Most people who have played any kind of game – even simple ones like Noughts and Crosses – have hacked it at some point. As we saw in Sniderman’s (2005) study of game rules, as adults we bend rules in order to help a child player. We add or remove rules in order to adjust the game to our skill levels or to add excitement to the routine.


I live very near to a public square where there is a children’s play area that includes a slide and a climbing frame and house. A slide, on its own, is simply a toy. Aside from the challenge of overcoming a fear of heights, there is no real challenge in using a slide; you simply sit at the top and push off until you slide to the bottom and enjoy the vertigo play experience. One day I observed two children playing on the slide. One climbed onto the slide part of it and stood with one foot on either side of the rim of it. He then challenged the other child to slide down and through the arch that his legs were making. At this point playing with the toy started to morph into a game. The little boy had “hacked” the normal role of the slide in order to create a more interesting play experience – a game.


As the Time Sketches (Polaine, 2006b) interactor quoted at the start of this chapter demonstrated, one of the things that we do when confronted with a new interface is to “just test it out” and “see the limits and what you can do with it”. Tagged onto this is that this is not just a process of working out the system’s rules and boundaries, but a process of “trying to work out how it does it as well”. When an interaction is framed as a playful experience, it triggers a lusory attitude in the interactor and encourages them to learn about the interactive system through play, so it doesn’t feel like learning at all, it feels like play. And play often feels intuitive.


Once we have invited the interactor to play and once they have understood the play space and the rules of the system, what then? Do they become bored eventually and leave or do they become increasingly more engaged or proficient? One of the central aspects of interaction is the interactive feedback loop. If one imagines interaction as the to and fro of a conversation, we need to ask how we keep the conversation going beyond the first greeting and talking about the weather. Within interaction design – from artworks to user interfaces – matching the demands on and of the interactor to their abilities is a matter of finding what Csikszentmihalyi (1975) calls the border between boredom and anxiety.

CHAPTER 8 – Principle 3: Challenge, Boredom and Anxiety


The psychologist Mihaly Csikszentmihalyi has written extensively on the state of complete absorption that one feels during certain activities, a state he calls flow (Csikszentmihalyi, 1975; Csikszentmihalyi, 1990; Csikszentmihalyi, 1996; Csikszentmihalyi, 1997b; Csikszentmihalyi, 1997a; Csikszentmihalyi, 2002). Many of his case studies have looked at artists and the way they work while others have examined the classic “in the zone” feeling that people like athletes and mountain climbers feel when totally absorbed by what they are doing. In his early work Csikszentmihalyi (1975) is rather scathing of television and videogames largely because they are not “real” activities. That is, the viewer or player lives out extreme situations or challenges vicariously through those of the characters on screen so the physiological and psychological stakes are different thus preventing real flow from occurring.


His language and tone suggest, however, both a lack of knowledge of videogames – which were in their infancy when he wrote his first book – and a dislike of television based on his misperception of its lack of cultural value (Csikszentmihalyi, 1975). Reading his theory differently, ignoring the rhetoric that TV and videogames are junk, it seems to perfectly sum up the processes going on when one is absorbed in a videogame and also in more minor interactive “experiences”. Since this original work, flow has become popular amongst experience designers and researchers who have used flow theory to analyse the experience and movement of web surfers through a site (Novak & Hoffman, 1997; Mathwick & Rigdon, 2004; van Gorp, 2008).


The relationship of interactivity to flow becomes even more apparent when coupled with an understanding of play, both in terms of games and playing for its own sake. Winnicott’s (2001) view of play is that it operates in a halfway world between our inner and outer worlds and that “[p]laying is essentially satisfying” (Winnicott, 2001, pp. 51-52).


A central aspect of Csikszentmihalyi's (1988) notion of flow or the optimal experience is his examination of activities that are autotelic or intrinsically satisfying. Approaching interactivity in this manner brings a very different psychological understanding to the definition of interaction posited by Manovich (2001):

The key element of an optimal experience is that it is an end in itself. Even if initially undertaken for other reasons, the activity that consumes us becomes intrinsically rewarding. […] The term ‘autotelic’ derives from two Greek words, auto meaning self, and telos meaning goal. It refers to a self-contained activity, one that is done not with the expectation of some future benefit, but simply because the doing itself is the reward (Csikszentmihalyi, 1990, p. 67).

When one is involved in a flow experience, one is totally absorbed in the activity at hand (Csikszentmihalyi, 1992). This flow experience is not dissimilar from being emotionally absorbed whilst watching dramatic action, except that the absorption comes from one’s own activity and is not experienced vicariously through actors or other agents. As I argued previously, however, interactive media instigate a collapsing of the viewer-spectator, subject-object relationship – something that flow theory does not take into account.


Csikszentmihalyi outlines the eight conditions for the flow experience based on hundreds of interviews over many years. Participants reported at least one and often all of the following (laid out here as a list for clarity):

First, the experience usually occurs when we confront tasks we have a chance of completing.

Second, we must be able to concentrate on what we are doing.

Third and fourth, the concentration is usually possible because the task undertaken has clear goals and provides immediate feedback.

Fifth, one acts with a deep but effortless involvement that removes from awareness the worries and frustrations of everyday life.

Sixth, enjoyable experiences allow people to exercise a sense of control over their actions.

Seventh, concern for the self disappears, yet paradoxically the sense of self emerges stronger after the flow experience is over.

Finally, the sense of the duration of time is altered; hours pass by in minutes, and minutes can stretch out to seem like hours. The combination of all these elements causes a sense of deep enjoyment that is so rewarding people feel that expending a great deal of energy is worthwhile to be able to feel it (Csikszentmihalyi, 1990, p. 49).

Most of us would recognise these through some of our more engaging and pleasurable life experiences, and many also correspond to the sensations of being absorbed in the activity of playing (Sutton-Smith, 1997; Winnicott, 2001; Brown & Vaughan, 2009). Laurel also suggests "a key point that Aristotle made is that patterns are pleasurable to perceive in and of themselves" (Laurel, 1993, p. 55). It is interesting to note here that Aristotle is not only referring to the pleasure of catharsis in terms of dramatic events, but also the simple patterns or 'melodies' of these events.


Interaction and narrative (particularly in this traditional sense) pull in opposite directions. As we saw earlier, even in the world of videogames, one of the most established traditions of screen-based interactivity, narrative and interactivity are not easy bedfellows. It would initially appear that the rules of narrative (at least in the sense that Aristotle and many others since have defined them) are the problem. Surely the ultimate in interactivity would be to create a completely open-ended environment, perhaps using an as yet uninvented technology that could replicate anything in the real world that the virtual environment needed? This was the failed dream of VR. Such an environment would be essentially “rule free”, but as we have seen in the theories of Caillois (1961) and Huizinga (1955) and, later, Suits (2005) and Sniderman (2005), games and even relatively open play need rules in order to function.


Flow is induced by a relationship between goals, competence and feedback. When Csikszentmihalyi speaks of the “border between boredom and anxiety” (Csikszentmihalyi, 1975) he is speaking of goals, feedback and challenges, which are what rules and boundaries provide.

The reason it is possible to achieve such complete involvement in a flow experience is that goals are usually clear, and feedback immediate (Csikszentmihalyi, 1990, p. 54).

It is no surprise that usability experts and user experience designers have made use of flow theory. However, the above definition of the border between boredom and anxiety also describes games, play and many learning and teaching theories (e.g., Ramsden, 1992; Brookfield, 1995; Gibbs, 1999; Biggs, 2003).


When the invitation to play and the rules and playing field are clearly discernible by the interactor, the goals usually become clear, as do the competencies required in order to achieve them. In some situations the aim is to keep the required competencies as low as possible so that an interface, for example, is accessible to as many people as possible. Since Csikszentmihalyi’s original research, his flow theory has been used to research the habits of Internet users and online consumers. Andrew King (2008) discusses several studies (Novak & Hoffman, 1997; Novak, Hoffman, & Yung, 2000) and uses them to explore the application of flow principles to web site design. One key aspect is speed:

Hoffman, Novak, and Yung (2000) found that the speed of interaction had a “direct positive influence on flow” on feelings of challenge and arousal (which directly influence flow), and on importance. Skill, control, and time distortion also had a direct influence on flow.


They found that speed had the greatest effect on the amount of time spent online and on frequency of visits for web applications. For repeat visits, the most important factors were skill/control, length of time on the web, importance, and speed.


So to make your site compelling enough to return to, make sure that it offers a perceived level of control by matching challenges to user skills, important content, and fast response times (King, 2008, p. 32).

King continues, summarising the research on flow in terms of goals and challenges for web users:


Less-experienced users tend to see the web in a hedonic, playful way, while more experienced users tend to view the web in a utilitarian way, or a means to accomplish tasks. The authors [Novak et al] found that telepresence/time distortion, exploratory behavior, focused attention, and challenge/arousal correlated with recreational web use, while skill/control, importance, and experience correlated with task-oriented activities, such as research, work, and shopping.

[...] In either case, to facilitate flow, as designers we must offer plenty of speed and “enough challenge to arouse the consumer, but not so much that she becomes frustrated navigating through the site and logs off” (Ibid., pp. 32-33).

The advantage of using flow theory in this context is that it is entirely user-centric. As King notes in the quote above, the actual experiences and actions that an interactor is engaged in can vary widely and still the flow experience can be attained. The important factor is the relationship of the activities to the competencies of the interactor.


Interestingly, in the example above, the playfulness of the web experience appears to ‘wear off’ over time and usage. The same phenomenon was noted by Nick Yee in his essay called The Labor of Fun (2006) in which he explores how computer games, especially online games that require excessive ‘levelling’ (building up of experience points, wealth or both), can become as tedious as work.


But what of interactive experiences that do not have specific goals and aims? Do they still require levels of skill? The answer is yes, but the levels of skill are also user-defined. This is most easily explained by the example of playing with a toy as opposed to a game. Toys tend to be simpler than games and they quite often have very little in the way of goals. A bouncy ball bounces, a soft toy can be cuddled, played with in role-play, thrown around, etc. Even complex toddler activity centres are usually fairly goalless (from the interactor’s point of view), focussed rather on multiple modes of stimulation. In the aforementioned playground slide, the goal, if it can be said to be one, is simply to have fun sliding to the bottom. That is, until the player adds some of their own rules, at which point it starts to become a game.

The border between boredom and anxiety

The enjoyment of such toys comes not from trying to achieve an extrinsic goal, but from discovering how they work and what things can be achieved with them – an intrinsic goal. An infant’s toy that simply squeaks when squeezed and is pleasurable to touch and manipulate is enjoyable for its own sake. Throwing stones into a lake is satisfying to me because of the smoothness of the stones, their weight in the hand, the satisfying “plop” as they drop into the water, and the ripples that the stones create.


Using a toy, even throwing a stone, can soon develop into a set of mini-challenges that enhance the experience and prolong the pleasure of it. In the stone throwing example, this might be trying to hit a target of floating debris, or it might be trying to make the smallest or biggest splash. It may become a game of skipping stones where the object of the game is to skip the stone across the water as many times as possible. This last example gives rise to the search for specialised “equipment” – the perfect flat, smooth, round pebble – and one can see how this process might quickly develop into a competitive game.


The mini-challenges, or intrinsic pleasures, of the toy sit at one end of a continuum, with intrinsic, “purposeless” playfulness at that end and highly specialised, competitive games at the other. Csikszentmihalyi’s flow can occur at any point along this continuum as long as the balance between boredom (not enough challenge or too great a skill level) and anxiety (too great a challenge or not enough skill) is maintained, “when the challenges are just balanced with the person’s capacity to act” (Csikszentmihalyi, 1990, p. 52).


Given that a person’s skills improve with practice, Csikszentmihalyi observes that the flow state is intermittent, particularly if the ratio of challenge to skill level does not remain constant. Flow, therefore, is a state that can travel up and down our play-game continuum. It is one of the reasons why Csikszentmihalyi’s rejection of videogames is unfounded – it is simply that he has failed to recognise the level of skill and challenge involved in many games and the typical “move through to the next level” design of videogames.
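The relationship between challenge and skill described above can be illustrated with a small sketch. This is my own simplification rather than anything drawn from Csikszentmihalyi: the function names, the idea of reducing challenge and skill to single numbers, and the width of the tolerance band are all assumptions made purely for illustration.

```python
# Hypothetical tolerance: how far the challenge/skill ratio may drift
# from 1.0 before the experience tips out of flow. Illustrative only.
FLOW_BAND = 0.2


def flow_state(challenge: float, skill: float, band: float = FLOW_BAND) -> str:
    """Classify a challenge/skill pairing as 'boredom', 'flow' or 'anxiety'."""
    if skill <= 0:
        raise ValueError("skill must be positive")
    ratio = challenge / skill
    if ratio < 1 - band:
        return "boredom"  # not enough challenge for the skill level
    if ratio > 1 + band:
        return "anxiety"  # too great a challenge for the skill level
    return "flow"


def practise(challenge: float, skill: float, sessions: int, gain: float = 0.5):
    """Yield the state at each session while skill grows and challenge stays fixed."""
    for _ in range(sessions):
        yield flow_state(challenge, skill)
        skill += gain  # assumed: practice improves skill by a fixed amount
```

Calling `list(practise(1.0, 0.5, 4))` yields `['anxiety', 'flow', 'boredom', 'boredom']`: with a fixed challenge, an improving interactor passes through flow only briefly, which is one way of seeing why a “next level” design that keeps raising the challenge helps to sustain it.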


In Antirom’s sound interactive works (Allenson et al., 1994; Allenson et al., 1995; Allenson et al., 1996a; Allenson et al., 1996b), the enjoyment often comes from trying to work out what movements affect which parameters, essentially trying to get inside the creator’s mind and uncover the ‘wiring’ of the programming – the way in which the elements of interaction and sound are patched together. The other pleasure to be taken is one of self-discovery. How well can one ‘play’ the toy? Again, in a sound interactive, this might be a case of trying to make something rhythmical or musical out of some simple building blocks supplied by the author of the interaction.


On the one hand, making a complex interactive (like a piano) with which it is possible to create a multitude of different combinations is powerful and gives great scope for personal expression, but the learning curve is likely to become tedious because of the dedicated practice required. On the other hand, making something very simple (like the ‘play’ button on a CD-player) can mean that it carries little interactive interest after the first one or two interactions.


In the sound example, making an interactive that has some scope for improvement but that does not produce a complete cacophony at the first attempt is a good balance of Laurel’s potentials for action. The act of learning needs to be pleasurable in itself (and perhaps this is more important than the final accomplishment) if the interactor is to remain engaged. Part of the skill of designing a successful interactive is balancing these various constraints and conditions to tune the level of boredom and anxiety and keep the piece engaging.


A web site, especially a commercial site, usually has a set of built-in goals, even if it is just trying to help the user navigate through the content clearly. Videogames, on the other hand, have to deliberately place obstacles in the path of the player precisely because they are games (Suits, 2005). Yet, at the same time, they face the problem of helping the player understand the controls and moves required to play the game. Game designers need to find ways to teach users (who invariably do not read the instructions, either on-screen or printed) without interrupting the flow of the game. King (2008) interviews Csikszentmihalyi, who is evidently more au fait with videogames these days. He asks whether speed of interaction plays a role in the flow experience. Csikszentmihalyi responds:

If you mean the speed at which the program loads, the screens change, the commands are carried out—then indeed speed should correlate with flow. If you are playing a fantasy game, for instance, and it takes time to move from one level to the next, then the interruption allows you to get distracted, to lose the concentration on the alternate reality. You have time to think: “Why am I wasting time on this? Shouldn’t I be taking the dog for a walk, or studying?”— and the game is over, psychologically speaking (Ibid., p. 33).

As Eric Zimmerman (2003) explains, computer game designers realised that reading manuals interrupted the entire flow and enthusiasm of the initial game excitement. How then to help players understand what they should do?


The initial response was the training level, a first level of the game that the player can (or must) go through in order to learn how the controls work and the moves a character can make. This was employed in games such as Tomb Raider (Eidos Interactive, 1998) and Tom Clancy’s Splinter Cell (Ubisoft, 2002). Speaking personally as a player of these games, the problem is that after the first run of these training levels they are intensely boring because they lack any actual gameplay and penalties. In a sense they are ‘play’ outside the ‘reality’ of the game and contain no risks. Splinter Cell is particularly irritating because the player can make a wrong move and have to start the training level again and is not allowed to proceed into the main game until the training level is complete. Even the time taken to complete the training level feels wasted. Tomb Raider at least allowed for ignoring Lara Croft’s guidance and open exploration of the training area. Neither of these adds to the flow experience in any way and both usually hinder it significantly because the challenges, rewards (feedback) and skill levels are mismatched.


The common solution to this in more recent videogames is to dispense with any special training and simply make the first level of the game – and sometimes the entire game – a training exercise. Prince of Persia - Sands of Time (Ubisoft Entertainment, 2003) is an excellent example of achieving this balancing act, along with a cheat built into the conceit of the game. The game itself is built around the story of a Prince who is tricked by an evil Vizier into releasing The Sands of Time with The Dagger of Time. The Sands of Time turn everyone in the Kingdom into monsters. Naturally, the Prince must rescue the Kingdom and return the Sands of Time to the Hourglass and this is the central driver to the game’s story.


The core of the gameplay itself is a combination of puzzle solving, agility (which is really puzzle solving in that the player must work out the correct sequence of moves for any physical task) and a labyrinth. In this sense, The Sands of Time is little different from Tomb Raider or, indeed, the arcade classic, Pac-Man (Namco, 1980). What is different is that the game starts immediately, with no training level. The first task is relatively simple and it teaches the player a single move (how to run and jump) in order to complete the task. The subsequent tasks tend to then require the learning of another type of move (the Prince can, for example, run horizontally up and along a wall in order to avoid a pit in front of him) and then, next, a task that requires a combination of the former two moves. In terms of flow this style of gameplay ensures that the challenges remain almost perfectly matched to the skill levels of the player – always ahead enough to be challenging and thus not boring, but not so difficult that the player has such a problem keeping up that they give up.


Should the challenge be too great – and this can happen when one has run low on health and then enters a battle – the game’s conceit of the shifting nature of time comes into play. Each time the player kills one of the monsters in the game, he or she can use the Dagger of Time to draw out the Sands of Time from the monster and gain a “charge” of time. These charges can then be used to rewind time by ten seconds in the game (and can also be used to freeze or slow down time). This means that should the game’s challenges overcome the player they can “rewind” and reattempt the task. For many situations this is enough to correct a foolish or mis-timed move and results not in the emotional let-down of cheating in a game, but in much more fluid gameplay. The fact that this aspect is built into the gameplay avoids jumping the player out of the gaming experience to a menu screen that asks whether or not they wish to continue, thus keeping the flow intact.
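To make the mechanism concrete, the following is a minimal sketch of how a rewind of this kind might be structured. It is not a description of how The Sands of Time is actually implemented: the class, its method names, the one-snapshot-per-second history and the simple health and position state are all hypothetical, chosen only to show that the “undo” lives inside the rules of the game rather than in a menu outside it.

```python
from collections import deque


class RewindableGame:
    """Keeps a short history of states so the player can spend a charge to rewind."""

    def __init__(self, history_len: int = 10):
        # Assumed: one snapshot per simulated second, so ten snapshots
        # correspond to the ten-second rewind described in the text.
        self.history = deque(maxlen=history_len)
        self.charges = 0
        self.state = {"health": 100, "position": 0}

    def tick(self, health_delta: int = 0, move: int = 0) -> None:
        """Advance one second, saving a copy of the current state first."""
        self.history.append(dict(self.state))
        self.state["health"] += health_delta
        self.state["position"] += move

    def defeat_monster(self) -> None:
        """Defeating a monster earns one charge of time."""
        self.charges += 1

    def rewind(self) -> bool:
        """Spend a charge to return to the oldest remembered state."""
        if self.charges == 0 or not self.history:
            return False  # nothing to spend, or nothing to return to
        self.charges -= 1
        self.state = self.history[0]
        self.history.clear()
        return True
```

In use, a player who has earned a charge and then makes a fatal move can call `rewind()` and carry on from the earlier state without leaving the game world. Because charges are finite, deciding when to spend one becomes part of the play.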


The time charges also add another player challenge to the game, because part of the gameplay is judging whether or not to use up a charge in order to turn back events (much like any other special move or power-up in a game) or simply let the character die and, if you have enough lives, be reborn and start again. This means the player is not just playing the obvious game, but also the game’s gameplay mechanism, which is made up of the rules. So we see several layers of play going on simultaneously; players are able to traverse these layers without jumping out of the mimetic context and without having to break their suspension of disbelief. Working out how to play is play itself and a powerful learning tool. To repeat the earlier quote from Steven Johnson:

That probing is a powerful form of intellectual activity – you're learning the rules of a complex system without a guide, after all. And it's all the more powerful for being fun (Johnson, 2006, pp. 122-123).

This has potent ramifications for interaction design right across the spectrum from artworks to functionally driven user interfaces. In general, usability theory and user experience design make an effort to make the steps required to achieve a task as obvious to the user as possible. Yet there is a tension here in terms of catering for a range of skill levels and abilities. “Power users” of operating systems often complain that a simplified interface does not allow them enough flexibility and customisation. Beginners find interfaces with too many options bewildering and don’t know where to start, which is one of the reasons why many people turn to search as their main way of navigating websites (Krug, 2006).


If the design effort focuses on a middle way – some options, but not too complicated – there is a danger that it can be unsatisfying for everyone. A layered approach can move a user or ‘player’ (this combination is best described by the word ‘interactor’) through the skill levels without them being aware of it. In this scenario the invitation to play is the first, obvious and simple layer, then the playing field and rules are exposed and later further challenges and options. Ideally these stages of the experience unfold without the interactor even being aware of the learning process and it is this that makes a ‘deep’ or complex interactive experience feel like an intuitive flow experience. Striking this balance is the essence of interaction design. Working out the weighting of the various elements in order to achieve this comes with experience and experimentation and varies from project to project, but the underlying approach remains the same, which is the significant advantage of using play theory as the foundation of the process.


Classically, play has a central aspect that is usually recognised as one of its defining features: it is not work. As Kane (2004) and Brown & Vaughan (2009), amongst others, have argued, this is a false dichotomy. Work can and should be playful, which does not mean it cannot be serious and meaningful at the same time. Focussing on getting tasks done can lead to forgetting that how the tasks get done is often more important, especially for the person doing the task. As Stuart Brown discovered in his research on play and people’s play histories, “the opposite of play is not work – the opposite of play is depression” (Brown & Vaughan, 2009, p. 126).


Triviality, apparent purposelessness, fun and open-endedness are some of the hallmarks of playful experiences and interactions. The next principle explores this area. There is no need to legitimise play through any kind of rhetoric of progress or incorporating work as a side-effect of play. Play has a value in and of itself. Instead, this principle examines the promise of the playful experience and what happens when the invitation to play turns out to be to a different kind of party than the one we were expecting.


CHAPTER 9 – Principle 4: Triviality, Open-endedness, Promises

As I discussed in Part I of this thesis, much of the work created at Antirom and many of the examples of playful interaction developed there were quite open-ended in their form. In many cases there is no specific goal to the interaction, no task to complete, no interface to navigate in order to reach the content proper. The interface is the content, the content is the interface. Hence most of the works were termed toys rather than games. Later, because these terms often blurred together, Antirom co-founder Andy Cameron wrote about these as ‘interactives’ in his book, The Art of Experimental Interaction Design (2004).


In The Art of Experimental Interaction Design Cameron collected some of the leading interactive projects and it was notable that most of them were intended for spaces outside of the gallery context. Even one of the largest-scale projects most easily classified as an artwork, Lozano-Hemmer’s Body Movies (2001a) discussed previously, sits literally outside the gallery. Placing interactive works in the gallery sets up a number of issues, some to do with the physical display of works and others to do with the context of works of art and a tension between the notion of ‘serious’ art and the triviality and purposelessness that some of the most successful interactive works exhibit. This tension arises from a mismatch of the promise and the delivery – the invitation to play and the playground and rules don’t add up.

Playing in the Gallery

I find [art galleries] very… there is no interaction. There is nothing to play with. It is just looking at things, so I don’t enjoy them as much, so I just choose not to go - Time Sketches Interactor.

Playful interactive content does not sit well with the ideals of serious commentary, contemplation and the hallowed white walls and respectful silence of the traditional gallery (Paul, 2005). This poses a problem for interactive works because their very purpose may be to create exactly that playfulness in the interactors. Erkki Huhtamo notes this tension in the history of interactive works making their way into galleries:

The active role of the spectator, turned into a ‘user’ or an ‘interactor’, was essential. The aim was to empower and challenge the visitor to go beyond the modes of usual spectatorship – the contemplating and “passive” attitude of the art lover standing in front of a painting or a statue. Of course, the reception of art itself can always be claimed to be “active” (an argument frequently used by traditional art critics in their invectives against interactive art). However, interactive art added to the mental activity a haptic dimension: the visitor was not only allowed, but required to touch the work (Huhtamo, 2004).

At best, such interactive works are tolerated and corralled into separate areas (often a “children’s gallery” as in the case of the National Gallery of Australia in Canberra, 2005), but even in these cases they are a diversion from the ‘real’ art, often not to be taken seriously. Yet these pieces often are not meant to be weighty or serious in the sense of imparting an obvious message. They are playful and when one plays one is allowed to make mistakes and transcend normal social behaviour precisely because one is playing. Giving play the respect it deserves, however, can be seriously motivated (Kane, 2004) and needs no special dispensation, no excuses – play is important in its own right.


I have argued that engaging interactivity is based on play and play is based in such ideas as physical movement (Winnicott, 2001; Brown & Vaughan, 2009), humour, noise, activity and transgressive behaviour – something set apart from ‘real world’ rules (Huizinga, 1955; Caillois, 1961; Sutton-Smith, 1997). Is there any hope for traditional fine art institutions to accept these works whole-heartedly into their realm, having spent decades elevating art to a position of high culture? In the traditional art world, most of these forms of behaviour will result in you being swiftly escorted from a gallery by security officers, as happened to author Chuck Palahniuk when he attempted to enter the Seattle Art Museum wearing a Dalmatian dog costume, although they let him buy a ticket before throwing him out (Palahniuk, 2004).


Interactive works lack a concrete object that can be hung on the wall or placed on the podium. In many cases the work itself is the experience of the interactor. Returning to the argument of Penny, “conventional critiques of representation, and those aspects of art theory that remain influenced by traditional psychology of visual experience, are not well equipped to deal with [embodied interactive experiences]” (Penny, 2004, p. 73).


There are occasional exceptions to this problem even while many traditional art galleries do indeed struggle to present interactive work in an appropriate manner. Large-scale works such as Char Davies’ Osmose (1995) and Ephémère (1998) are often only possible to exhibit in more major galleries, especially when the usage of space and context is taken into account. This does, however, owe much more to the ability of the artist to create an engaging interactive work (as in the case of Davies) in spite of, rather than because of, the gallery setting.


The real gallery of interactivity is outside of the fine art world and its hallowed spaces, on the streets, in living rooms, on buses and trains as people interact with toys and games and gadget pleasures in the in-between spaces of their lives, the downtime during which it is still permissible to play.


Indeed, much of the more interesting recent interactive work has occurred across social networks created online and wirelessly, mobile phone “toys” and entertainment played whilst on the bus, or in shop window installations and fringe exhibition and performance spaces. These spaces are less bound by the conventions of what I have termed “Big Fine Art” (Polaine, 2005a) and impose less of their own context on the work at present. A shop window passed by thousands of people each day may be a far better venue for a playful interactive or reactive work than a gallery – it is both more public and less onerous. It makes no attempt to be more than an engaging diversion, a moment of play. It may say nothing about anything very much, it may be trivial. On the other hand, as with Tomato’s Tokyo Life installation for Selfridges in London (Austin, 2002), it may drag people out of their daily drudgery for a sublime moment and elicit a playful interaction that they would be too self-conscious to attempt in a gallery. Trivial play can be meaningful and essential to a healthy existence (Brown & Vaughan, 2009).


Figure 28: Tokyo Life - An installation for Selfridges’ shop window by UK collective, Tomato. The figures inflate when a person walks past, setting up and dissolving spectator-participant relationships continuously (Austin, 2002).


Videogames are often corralled into galleries such as The Barbican Art Gallery’s Game On (2002) exhibition, where they remain curiosities. The real exhibition space for videogames, in particular, is the lounge room or arcade, not galleries and museums. When we extract games from this environment we risk viewing them as anthropological specimens examined out of context under glass in the museum. There a game is as exotic as a ceremonial tribal mask and equally without context.


Much game art – games either modified or created specifically as a work of art – tends to attempt social commentary using the very medium it is critiquing. As mentioned in the chapter on The State of Play, many of these fail at being games first and foremost because they tend to try and make the games ‘serious’ in order to justify their place in the gallery. This mixing up of form and content is a frequent failing of media art. Videogames are not just screen-based media that have the visual trappings of a 3D environment and a character running around with a gun in a first person view. The space, rules and goals of the game make it what it is – the look is just window dressing. Many art games take the visual elements of a game and alter them (this is usually the easiest thing to modify in an existing game engine) and then strip back the gameplay to suit the social commentary. From my experience of many of these, the results are not something that would make a regular videogame player think deeply about their gaming, but rather something they might laugh at for being so amateurish.


Games and play do not require the playfulness to be sucked out of them to make them ‘serious art’ nor does art need to be serious to say something important about the human condition. Games should not need to apologise for being games, nor playful works for seeming to have no other “purpose” or deeper meaning. Play is important to culture even if we struggle to define exactly why (Sutton-Smith, 1997; Kane, 2004; Brown & Vaughan, 2009). We can easily understand this by imagining a world without play. Galleries and museums can afford to let themselves become playgrounds without apology.

Delivering the promise

In many respects, delivering the promise is the culmination of all the other principles. Delivering the promise is the other end of an invitation – what is set up in those first principles of interactivity needs to be delivered throughout the process. There’s nothing worse than turning up to a party in fancy dress only to discover everyone is wearing chic suits.


Earlier we discussed the idea of interaction being something that happens between two agents – either human-computer, human-human or computer-computer (with ‘computer’ interpreted to mean anything with some kind of processing on-board). Failing to deliver the promise is recognisable in many situations and relationships between two ‘agents’, whether they are people or machines, in our everyday lives. It is the reason people have arguments with their partners, it is the reason that network transactions fail or why services and interfaces create dissonance. If one agent or interactor is expecting one thing and receives something different, then the experience is dissonant and usually unpleasant. Spoken and written language exchanges are full of these kinds of issues – one party thinks they’re making a joke, the other doesn’t get the sarcasm or doesn’t appreciate the sarcasm. Relationship tensions arise from one party feeling like an agreed deal was not honoured (often a deal that was unspoken or tacitly agreed). Devices and computers fail to connect because of mismatched protocols.


A large part of interaction design is about trying to align all these elements. In the parlance of the principles of interactivity laid out in this thesis, it is about ensuring that the interactive system and the interactors are playing by the same rules or, at least, agree on what those rules are and the boundaries of the playing field – the magic circle.


Traditional usability theory would approach this by trying to make everything as clear and usable as possible, but this misses several crucial, affective aspects of the interaction. Katie Minardo Scott explores this in her paper, Is Usability Obsolete?:

[M]any of the traditional usability methods quantify data that we no longer care about. Lab tests, heuristic evaluations, and computational models focus solely on goals like efficiency, accuracy, and initial ease-of-use. While these metrics were relevant early on, they are rudimentary at best. Common system-design techniques like use cases and scenarios should make fast, straightforward, and learnable UI design a given. Again, there are thousands of relevant, successful, timely examples to baseline against. And the “new” metrics like affect, stickiness, buy-in, loyalty, and engagement are nearly impossible to test within the confines of classic usability (Scott, 2009, p. 8).

Where the efficiency-of-task-completion ideology fails to deliver is in situations that are deliberately enigmatic, as is the case with many games and interactive art installations. Where many of those applications fall down, however, is in misunderstanding the interactor’s understanding of the rules of the system.


If an interactor is expecting some mystery and exploration because that paradigm was successfully set up in the Invitation to Play, then it is fine to deliver something that isn’t clearly stated and in which part of the engagement and delight comes from trying to work things out. Making all the tasks and interaction obvious does not always make sense. For example, if a player knows how to finish a puzzle from the outset it can hardly be considered a puzzle. Yet in another situation an interface that promises clarity and ease of use and ends up being totally confusing has clearly failed. Usability theory is unable to account for these two different paradigms. Thinking of them in terms of invitations to play, rules of the game and the play space allows us to integrate all the elements and deliver on the initial promise made in the invitation to play.


One of the reasons for critiquing the exhibition of media arts and game art in galleries is that, in my experience of nearly two decades of such exhibitions, many of the exhibits fail to deliver on their promises of interaction. In many cases the hardware simply fails too quickly because it has not been designed to withstand the rigours of public interaction. On a recent visit to ZKM – Zentrum für Kunst und Medientechnologie or the Centre for Art and Media Technology – in Karlsruhe, Germany, I counted at least seven exhibits that were broken. Often the reason they are broken is that the affordances of the exhibit are not clearly set up – novice interactors believe a dial, button or knob should turn in a certain way and break it because it is designed to do something else.


I also observed interactors believing the interactive rules and system to be one thing and giving up because there was no feedback to suggest they were interacting in the “wrong” way. Other exhibits were so unresponsive that either their programmatic systems were in a state of confusion or interactors thought they were not working even though they were.


Several papers published as the key texts for the REFRESH! Conference in Banff in 2005 raised questions about the direction of new media works within the gallery environment. In particular, Christiane Paul’s Challenges for a Ubiquitous Museum: Presenting and Preserving New Media (2005) and Erkki Huhtamo’s Trouble at the Interface, or the Identity Crisis of Interactive Art (2004) examine some of the problems with categorising and exhibiting interactive artworks.


The problem, however, goes beyond the simple physical or curatorial issues of exhibiting interactive works, such as technology and sound or audience understanding, and has enormous relevance to developing principles of interactivity. As will no doubt be evident by now, I suggest that interactivity is fundamentally at odds with the concepts of the gallery space, particularly in the larger, more conservative and traditional institutions, or “Big Fine Art” as I have called them elsewhere (Polaine, 2005a), which fail to appreciate the “triviality” or “purposelessness” aspect of interactivity.


Giving an overview of the situation Paul recognises video works as a precursor to interactive ones within the gallery space and notes:

[Video works] have for the longest time been an exception to the mostly object-based art world rather than the rule. After approximately three decades, video now seems to have found an established, safe place in the art world but the museums [sic] relationship to performance or sound as art forms remains a problematic one (Paul, 2005, p. 2).

Another problem that Paul explores is the difference in depth between interactive works and video works. One can see a snippet of a video work and still gain some sense of the complete piece, she argues, but with interactive, ever-changing works this may not be the case. Whilst it is true that many interactive projects have levels of detail and configuration that are potentially missed by the gallery viewer, it largely depends on the mode of the interactivity and the reason why interactivity has been used. She later makes a point that is central to this conundrum:

One of the biggest challenges for the presentation of new media art is to engage the audience for a period of time that is long enough to allow a piece to reveal its content[…] Moreover, new media art often requires a certain familiarity with interfaces and -- despite the fact that computers seem to have become ubiquitous -- one can still not presume that every audience member will be an expert in navigation paradigms (Ibid., p. 2).

Although Paul makes a strong argument for thinking of alternative ways to present interactive art and critiques the usual white-box “shrine for contemplating sacred objects” (Ibid., p. 4) she still refers to interactivity as a way into the “content” of the work, rather than simply being the work itself. I make this point not as a criticism of Paul, whose understanding of new media and digital art is not in doubt, but to underline a problem with the definition of interactive works and the common slippage of thought about them.


Both Paul and Huhtamo note the necessity of many interactive works to be “completed” by the audience, or interactors. They rightly point out that interactive works move beyond passive spectatorship and that the meaning of works may indeed not “exist” without interaction. There is still a sense here, however, that meaning and content are the aim of the interaction – that we should know or understand something about the artist’s comment on the human condition through the content that we arrive at during or after interaction.


Paul’s comment that, “one can still not presume that every audience member will be an expert in navigation paradigms” (Ibid., p. 2) highlights the issues of the invitation to play and the rules of the system not lining up with the context of the work and the delivery of the promise. Interactive artworks that successfully manage this relationship (as in Lozano-Hemmer’s Body Movies discussed previously) do not require audiences or interactors to have any prior knowledge of navigation paradigms – discovering the navigation paradigm is part of the experience of the work itself.

Case Study: Eavesdrop – an opportunity missed

Galleries and museums such as ZKM are filled with plenty of interactive works in which the interaction itself is less important than the content it leads to, even though the interface is the most prominent (and often expensive) part of the work. Jeffrey Shaw and David Pledger’s work, Eavesdrop (Shaw & Pledger, 2004) is a high-profile piece exhibiting this phenomenon.


Eavesdrop utilises a 360-degree wrap-around screen with a “turret” in the centre that the interactor stands upon. The turret houses a projector so that when the interactor turns it the video pans in the opposite direction to the panning square of projected video. Thus we appear to see a ‘window’ onto a panoramic video that is continuously looping around a nine-minute segment. The arrangement of the video is such that we are placed in the centre of a cast of characters seated in a circle at cabaret tables. The characters are involved in various dialogues.


Each character (including the members of a three-piece band) has their own soundtrack that is mixed in multi-channel surround sound. As the turret revolves and the view is zoomed in and out the sound mix widens and narrows appropriately. The device itself is an impressive piece of engineering and this part of the work is the most pleasurable in terms of interactivity in its own right. When I used the work and watched others interacting, most people appeared to enjoy zooming in the “camera” and excluding the other audio or simply spinning the turret around.


Yet the content of the work itself, the dialogues and internal mental spaces of the characters, represented by cutting from the panorama to a separate video sequence overlaid on the screen and no longer part of the panorama, did not really utilise the intrinsic pleasure of the interaction beyond the sound mixing. The performances and dialogue were underwhelming and the disconnect between the interactive possibilities and the content was ultimately disappointing. One’s taste for or against ham acting aside, the perceived need to place some kind of meaningful content into the work, rather than to explore the intrinsic qualities of this turret-like interaction, detracted from its interactive potential.


Unlike Char Davies’s works (1995; 1998), the video content of Eavesdrop would have worked equally well panning past the frame on a flat computer screen with a click instead of a zoom to enter the internal world (although zooming would also be possible). Even the sound mixing would work on a home theatre set-up or in stereo. When looked at this way, the work is little different from some of the early 90s “multimedia” classics such as Peter Gabriel’s Eve (Gabriel, 1996) that used similar scrolling panoramas.


Monash University’s Robert Nelson, art critic for The Age newspaper, provides this rather scathing but accurate analysis of the experience:

The rationale for the formidable machinery escapes me [...] The installation proposes a formal continuity, which is disappointed by the content [...] The revolving projector is no more transcendent than a horse who moves its head while wearing blinkers. You arrive at pre-recorded sequences and can move in and out; but none of the actors knows or cares what you do. The limited movements highlight your passivity in witnessing each scene. And you only move because you're dead bored.

This deceptive sop to interactivity tells us nothing about the actors and nothing about the organic movement of the eye in the perception of space. It isn't a critique of perceptual systems or cinema; nor does it deconstruct the mise en scene of theatre or painting. The machine suggests that you're in the centre and the stage is the periphery; but this mechanical inversion doesn't displace the viewer/screen relations of conventional cinema or perspective. It's a toy (Nelson, 2004).

Nelson’s final “it’s a toy” shows the degree to which the interactive system has been falsely set up in terms of the expectations and delivery. He uses the idea of an artwork being a toy in a pejorative sense to highlight its lack of critical value as an artwork. Of course I would argue that artworks can happily be toys and have value. However, Eavesdrop fails at the toy level because even the turret rotation – the most engaging part of the installation – is not very engaging for long. You can’t do anything much with it that allows you to develop your skills or challenges further. It is essentially ignored by the artist as an integral part of the work and is simply a technology that is somewhat in the way of the content.


With Eavesdrop one is left asking what the point of the interactivity is in terms of the relationship to the subject matter of the video. It could have so easily been something more playful and engaging based on other much simpler interactions that were already available. Remodelling and scaling it down in this way would no doubt have gained it less research funding as a result (it would have been cheaper in any case) and this is one of the issues with scaling up interactive projects to satisfy larger institutions’ penchant for grandiose works.


This is, however, a hallmark of Shaw’s work (who is by no means alone in this) as Haque and Pangaro explain:

One very influential work in the history of "interactive" art is Jeffrey Shaw's Legible City from 1989. Although it was considered ground-breaking in terms of technological implementation and its participation model inspired many subsequent projects, we do not believe that it is a useful model to follow if one is interested in designing interactive environments. The project has a novel interface solution: the use of a bicycle to navigate around streets of text. However, it makes no moves towards any notion of circularity or conversation that a truly interactive environment should have. The text contained in the streets is static, unchanging, limited. In effect, the system conceals the data that a participant might be interested in, and the participant is forced to navigate round the data in what is, frankly, a strangely difficult way. If you try to cycle beyond the boundaries of the city you step into nothingness. This mode of participation, concealing data and requiring people to manoeuvre around it in order to uncover that data, has become a dominant model of “interactive” art and design, but should more correctly be referred to as “reactive” art. It is also worth noting that ORIM [one-way, reactive interaction model] projects are quite firmly rooted in a 19th century deterministic philosophy that has been challenged even by mainstream science in the 20th century. (Haque & Pangaro, 2006)

Of course, there are plenty of interactive artworks, products, interfaces and interactives that use technology as a central pillar of their interactive experience. Dan Graham’s experimentation with mirrors and video time delays (Graham, 2000), for example, explores the essence of video and time and the viewer/participant’s perception. Graham’s installations are surprisingly simple but immediately engender relatively long periods of interaction. The works of other interactive media artists like David Rokeby, Rafael Lozano-Hemmer or Daniel Rozin actively explore interactivity in its own right rather than as an add-on to traditional video art mores (cf. Rokeby, 1995; Lozano-Hemmer, 2001b; Rozin, 2001) and the technology plays a critical role in both the engagement and the development of the interactive experience.


In commercial interaction design projects and products the combination and balance of technology and experience design are equally critical. All the more so because, unlike a work of art, the designers of these products and services do not have the luxury of artistic authorship with which to say, “Here is the work, take it or leave it, understand it or don’t.” Websites, exhibits, self-service machines, mobile phones, computer operating systems, household devices – the list of interactions in contemporary life is long – all require the design of their interactive experiences to be integrated with their physical designs and technologies. The mobile phone market, especially with regard to smartphones, is littered with the carcasses of those that have failed to integrate the product design with interaction design. Smart marketing and clichés aside, the examples of the iPod and, now, iPhone from Apple have demonstrated what a difference this can make in the marketplace. Interactive media artists could learn a great deal from these designers and, indeed, this iterative design rationale is the key premise of Creativity and Cognition Studio’s beta_space (Muller et al., 2006a) in the Powerhouse Museum in Sydney. In return, manufacturers could learn a lesson about presentation, enigma and delight from those artists who have created successful interactive artworks, as many of them have evolved through experimental play with interactive technologies and media forms.


The powerful aspect to these four principles – the invitation to play, the playing field and rules of play, the creation of flow and delivering the promise – is that they can be applied to such a diverse range of interactions, yet they can integrate other methodologies where the appropriate specificity is required. As we saw with the Lozano-Hemmer Body Movies (2001a) work, interactivity can also bring people together and when constructed well, playful interactions can encourage strangers to emerge from their protective public-space bubble and connect with others (Paulos & Goodman, 2004; Dekel et al., 2005). The ability of the Internet to connect strangers and develop communities has been documented since its early days and brought to mainstream attention most notably by Howard Rheingold in The Virtual Community: homesteading on the electronic frontier (Rheingold, 1993), his famous exploration of the WELL online community of the early 90s. What becomes clear throughout the book is that the moniker ‘virtual’ is a misnomer – the community may be separated by geographical distance, but the bonds and relationships are very real.


In the past few years, during the writing of this thesis, the explosion of social networks and communities brought these experiences into mainstream society – young and old alike have engaged in applications like Facebook and MySpace (Boyd, 2008). Although applications like Facebook, MySpace, Bebo, YouTube, Flickr and Twitter are now household names, they are still in their infancy in terms of the development of new cultural forms, which makes their potency all the more impressive. In their wake are hundreds of other copycat sites and services, each with their own variations on a theme, but only a few that have managed to cross generational and socio-economic boundaries and hit the pages and airtime of traditional media (Pesce, 2006; Boyd, 2006).


As I started to receive more and more invitations from less digitally savvy friends (and long and often happily lost acquaintances) to connect on Facebook, it became evident that social interactions – especially those mediated by networked interactive technologies – follow a similar pattern to the principles of interactions outlined above. In essence, online communities have continued and expanded from the military roots of the Internet and begun playing together (ironically, the military then had to clamp down on the use of such networks and has only recently decided to open them up again to its personnel – see Shachtman, 2009). Instead of deadly serious war games, however, people are playing social games (and, of course, non-serious war games). With the development of mobile interconnectivity beyond voice calls and text messaging only now starting to take place outside of Japan, we are seeing the early shoots of this kind of social interplay.


New media forms do not develop in a vacuum and there is a constant interaction between culture and the development of the technology. For McLuhan (2001) a “hot” medium was one that extended a single sense in high definition, such as film as opposed to TV. “Any hot medium allows of less participation than a cool one,” he wrote originally in 1964 (Ibid., p. 25). His contrary definition becomes difficult to apply in our Web age, despite his coining of the term “global village”. The Web is both high definition and low, highly participatory but also reactionary and sometimes passive. We connect with each other through ever changing keyholes of media forms. On one side, we can now have relatively high-resolution video calls with all the bandwidth, both in the digital and social communication sense, that this entails. On the other side, Twitter’s 140-character limit makes for a tiny portal through which we connect perhaps more personally than that video call. Perhaps Twitter’s giant conversation between thousands is more interactive than a video call between two people.


Some emerging media forms die off quickly, despite the hype of their developers. Others quickly become a staple part of cultural, connected life and almost invisible. Yet more become the flavour of the day before dying away into obscurity again, whilst others languish until a convergence of audience acceptance, processor speeds and cultural paradigms suddenly finds a use for them. Like slime moulds or suburbs (Polaine, 2007), independent technologies converge and become a single entity. One of the most powerful of these pertinent to this thesis is the social network – the manifestation of a dispersed society at play.


CHAPTER 10 – Social Interaction and Playing with Friends

During the Northern Hemisphere Summer of 2006 I noticed an interesting phenomenon. As someone who typically explores emerging media I am used to my friends and family playing catch-up with regard to online trends. I am the bore who replies that he has already seen it to those who e-mail the “latest” viral video. Facebook was created in 2004, I joined some time in 2005 and had been a member of Orkut, Google’s earlier social network, for a few years before that. I had left my Facebook account rather dormant, but that was not anything unusual – I have a whole host of social network and other ‘Web 2.0’ application accounts languishing in cyberspace. What was unusual was that in the space of a two-week period I experienced an exponential flood of requests to add me as a friend or invite me to join Facebook.


Notably these were from people who were completely new to the entire idea of online communities and social networks, some of whom had barely learned to use e-mail and most of whom I had had little contact with for years, even though they were acquaintances or distant relatives. I was receiving one or two friend requests a day at first and by the end of those two weeks I was receiving around 10 per day. Soon afterwards, I started receiving just as many requests from Facebook’s system of integrated applications – the infamous “Zombie” application that allows you to ‘bite’ and infect your friends; another that invites you to join a pirate band and sail the seas of Facebook and battle other pirates; people sending me virtual hugs or gifts and hourly messages written on my Facebook “wall”. Several simultaneous games of Scrabble were also underway. I could feel the network reaching its ‘tipping point’ – the notion that Malcolm Gladwell (2002) had popularised a few years before. The exponential rate of connections to connections that my own immediate social contacts extended into was triggering the cascade of odd people who were re-entering my digital life.


Bombarded with requests – often repeated ones – and occasional negative responses from those I had not ‘friended’ in return and who felt slighted, I reined back my privacy settings on Facebook and allowed only occasional e-mails to come through from those who had sent me messages within the Facebook system. This last channel of communication had to remain open because friends started to use Facebook’s messaging system like e-mail. The irony of this was that by the very fact of being my friend on Facebook, they all knew my e-mail address anyway, except that the Facebook messaging method meant that if they sent me a message, I received an e-mail from Facebook to tell me they sent me a message and then I had to visit Facebook in order to read the message. Thus Facebook added several steps into the process of a friend sending me a message (all the while generating Web traffic for Facebook, of course).


Within the space of a few months, Facebook had become mainstream and, much like blogging, had moved from those “in the know” (mainly teenagers and people working with online media) to being taken up by newspapers, TV chat shows and a demographic of users for whom this kind of online community was an entirely new experience. The whole world (at least those with Internet access) seemed to be “Facebooking” and spending extraordinary amounts of time nurturing their status and connections.


It was evident that these people were not only joining a social network, but that they were playing. Not only were they connecting with each other and playing the obvious games of Facebook, such as Scrabble or the pirate game, but they were also playing Facebook and its unstated, albeit scarcely hidden, social game as well. Examining this and other social networks and communities, a pattern started to emerge that is usually repeated every time a new, but similar technology arrives on the scene (such as Twitter, the most recent example).


The first stage of this pattern is quite literally an invitation to play. For early adopters this stems from an interest in these emerging technologies and forms and the appeal of the new ‘new thing’. For many the invite is the invitation to take part in the private beta-testing phase of an online application. Here a common process is to seed invites to a certain number of people and then give these people a number of invites to selectively distribute. This helps to gradually load and stress-test the system that is being developed and it also creates the queue-outside-the-nightclub effect that makes an online social network or application appear to be more popular and populated than it really is. Some social networks in these early stages really are the equivalent of an empty nightclub – everything is there except the crowds, which of course are the main reason for anyone going there in the first place. We go to see and be seen, not just to listen to loud music in a darkened room and drink expensive drinks.
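The seeding process described above can be illustrated with a small, purely hypothetical sketch. It assumes that every newcomer hands out all of their invites and that every invite is accepted, so the figures are an upper bound rather than a description of any real service; the function name and the numbers are invented for illustration only.

```python
def invite_growth(seed_members: int, invites_each: int, rounds: int) -> list[int]:
    """Return total membership after each round of invitations.

    Illustrative assumption: every new member passes on all of their
    invites and every invite is accepted (an upper bound, not a forecast).
    """
    totals = [seed_members]
    newest = seed_members  # members who joined in the most recent round
    for _ in range(rounds):
        newest = newest * invites_each  # each newcomer brings in `invites_each` people
        totals.append(totals[-1] + newest)
    return totals


# Ten seeded beta testers, three invites each, four rounds of invitations.
print(invite_growth(seed_members=10, invites_each=3, rounds=4))
```

Even with these small numbers the membership grows geometrically from round to round, while the limited supply of invites lets the developers pace how quickly the system is loaded and keeps the queue outside the nightclub visible.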


In the case of Facebook the invitations to play were invitations either to join Facebook, become someone’s friend (on Facebook) or to actually play a game. As less expert users started to join Facebook (and Facebook’s simplified interface appeared to appeal to this demographic) all sorts of other social etiquette questions started to arise – questions that are similar to those asked about the ever-connected nature of mobile phones and e-mail devices like Blackberrys. Should I always add someone as a friend if they add me as a friend? What is the social transaction and expectation of reciprocity? What am I stating publicly if I add person X as a friend but not her friend, person Y? What if someone sends me a virtual gift, should I send one in return? If I don’t take part in this game, will that person be offended or just think I am busy (even if the truth is that I don’t want to continue an online relationship with him or her)? Some added every application or group they were invited to, others became hostile to the incessant zombie bite requests and formed Facebook groups with names like, “No, I won’t add the Zombie application!”.


The different motivations of Facebook users point to a set of players in a game still working out what the rules and boundaries are. Several researchers (see Pesce, 2006; Ellison, Steinfield, & Lampe, 2007; Lampe, Ellison, & Steinfield, 2007; Boyd, 2008; Joinson, 2008, for example) have analysed the different kinds of social strategies in action in social networks. Some of these strategies are passive, including “social surveillance” or “virtual people watching” (Joinson, 2008, p. 1029), in which the players are really more like spectators of a game; others are far more active, right down to “perpetual contact” (Ibid., p. 1029) with ‘closer’ friends (or players).


Social questions arose (and in many cases are still being explored) connected with privacy issues and the effects of some Facebook applications that rank your popularity or attractiveness based on how your friends and friends of friends rated you (Gross & Acquisti, 2005; Strater & Richter, 2007; Strater & Lipford, 2008). Applying J. J. Gibson’s (1979) idea of affordances to the social network we can see Facebook users not only exploring the interface, application and technology, but also the social boundaries, interactions and signals that Facebook affords. That the privacy issues are not clear – social networks, by their very nature, need to have some public facing information otherwise they do not grow – and that attitudes towards privacy vary radically shows a set of users testing out the affordances of the system. These networks are yet another medium, but a social one as well as technical. Asking “What does it mean to change my relationship status (especially to show that I am no longer together with another Facebook user)?” is the social-network equivalent of asking yourself whether a fallen log would afford sitting on or what the affordances of a shard of flint might be when used as a tool.


APIs as invitations

The technologies can be complex in online applications, but they are frequently opened up in a limited but potentially powerful way by the API (Application Programming Interface), which gives third-party developers ‘hooks’ into the system. In many respects these are the real affordances – they allow third parties to extend the functionality and uses of a service, often into realms the original developers had not thought of. Facebook application developers also push social and technical boundaries and explore the possibilities of Facebook’s API, sometimes causing Facebook, its users or other companies to react to unforeseen consequences (especially with regard to privacy).


In essence, having been invited to play, the users and developers (who are also users of course) start to explore the magic circle of the game and to test and develop its rules. My own response of turning on or off certain privacy settings and ways strangers could contact me through Facebook was me deciding on the rules I am willing to play by and setting my own boundaries. I am only willing to allow certain people into my Facebook magic circle, as it were, and play the Facebook social game. Others remain, at best, spectators or create their own play spaces and find their own playmates.


This kind of heuristic probing and exploration is the hallmark of game-playing as Moulthrop explains:

Games – computer games in particular – appeal because they are configurative, offering the chance to manipulate complex systems within continuous loops of intervention, observation, and response. Interest in such activities grows as more people exchange e-mail, surf the world wide web, post to newsgroups, build web logs, engage in chat and instant messaging, and trade media files through peer-to-peer networks. As in various sorts of gaming, these are all in some degree configurative practices, involving manipulation of dynamic systems that develop in unpredictable or emergent ways.


It might be absurd to suggest that all interactive media are species of game, but games do seem to offer a useful way of thinking about such media (Moulthrop, 2004, pp. 63-64).

In many respects opening up an API to third party developers allows for a flow experience to develop, especially for those more active in a social network such as Facebook. As the users explore the application or service, so they become more competent and expert users. The addition of extra elements, such as Facebook’s many thousands of applications, creates not only a kind of Darwinian experiment in itself (Farnham, 2008), but also an increased complexity to match the increasing skills of the user. Interestingly, in a flow sense, the overload of status updates and applications leads to both boredom and anxiety or, rather, boredom with the anxiety of constantly needing to tend to one’s Facebook status and relationships. Whilst some users remain absorbed in Facebooking for many hours at a time, others simply get fed up and leave or let their accounts go dormant.


What appears to happen during this cycle of initial attraction and invitation, frenzied use and absorption and then descent into boredom and quitting of social networks (and other online applications) is also a case of not delivering on the promise. In some situations this is the fault of social and media hype about the application or service. Hype by its very nature oversells the realities of its object, which tends inevitably to lead to disillusionment (Wipperfurth, 2006), and this hype can just as easily come from family and friends for whom Facebook, for example, resonates strongly as it can from online and offline media reports. When the hype fades and one becomes more acquainted with the particular application – the rules are known, the playing field is clear and the thrill of the exploration of those is over – it might turn out to be a game you do not want to play anymore.


For others, it is not that the promise has failed to be delivered, but the game is over and there is a new game in town to try. For them, it is not the actual game or application that is the drawcard, but the exploration process that Johnson describes when he says,

I've often found certain applications are more fun to explore the first time than they actually are to use – because in the initial exploration, you can delight in features that are clever without being terribly helpful (Johnson, 2006, pp. 122-123).

Once you know how the application works and have tried it a few times, there is always something newer and more exciting to go off and explore. The delivery of the promise in this respect is about a constant supply of something new to explore. Reaching the limits of the system too quickly is the biggest disappointment, something that constantly plagued early branching narrative interactive games before the rise of evolving, autonomous simulation games like The Sims (Maxis, 2000). In fact, Will Wright, the creator of The Sims and regarded by many as the smartest game designer in the industry, noticed a similar process to Johnson when he was building his first game, a top-down bombing game called Raid on Bungeling Bay (1984):

It was a landscape of islands with roads and factories and things, and I created an editor for doing that, where I could scroll around and put down the roads and things. I found that I was having more fun designing and building these islands than I was bombing them in the game, so I took that editor and I kept working on it (Moggridge, 2006, pp. 364-365).

It was this that turned into the game, SimCity (Maxis, 1989) and set Wright on the journey that led to his creation of The Sims (Maxis, 2000), a game that is almost entirely about social play in a microcosm. The skills and engagement of tending to your Sim world and its characters cast a clear light over tending to one’s Facebook profile. And since the original Sims game, Wright has developed an online version, which closes the loop – real people create virtual people that interact with other real people’s virtual people. As with other online application APIs, players can also create their own designs and share them with the wider community.


In all of the examples above we see that, given the chance and the mechanism, people will play. This play leads to multiple interactions with each other and the interfaces of the systems, games, tools and toys that we create. When connected with each other through technology or a face-to-face situation, people will develop their own language, syntax, customs and rules. People and the communication between them are the content, not the games, items or graphics themselves. As Stuart Brown argues, “Humans are social animals, and play is the gas that drives the engine of social competence” (Brown & Vaughan, 2009, pp. 88-89). Play might be fun, it may sometimes be aimless or seemingly “purposeless”, but as we shall see in the closing chapter, it is anything but trivial and pointless. If anything, a deeper understanding of play and how it pervades everything we do highlights how much of what we consider serious, important work can be easily viewed as pointless. But first, let us see if we can finally place the experience of interaction within a coherent context and framework.

PART III - Conclusion

CHAPTER 11 - Understanding Interactivity Through Play

After this long journey through technology, play, interactivity, metaphors and cognition, are we any closer to developing a language of interactivity? As we saw with social and networked interaction, this language is still being discovered and developed. It seems as soon as we come to understand one form of interaction another technology comes along – such as multitouch gestural interfaces or a new Web service like Twitter – and these change the language again. The only thing that is stable is that everything is in constant flux and the pace of technological and cultural development is unlikely to slow to a halt. Recalling Saffer’s (2006) diagram of interaction design and its analysis of overlapping disciplines, we can now see that this will never, and can never, be complete (in fact Saffer has recently updated the diagram to include new areas and it is even more complicated).


Figure 29: Dan Saffer’s definition of Interaction Design. Re-Drawn from Saffer, 2006


In many respects one might place all of the above spheres inside interaction design, or certainly service design. As discussed in the introduction, trying to define interactivity from the point of view of specific disciplines is like trying to find one’s way out of a labyrinth whose passages are constantly changing. A once promising avenue turns into a blind alley whilst another apparent dead-end turns out to be, if not the way out, certainly a new cave of wonders.


The only future-proof approach to designing for and dealing with an environment of constant change in interactive interfaces, technologies and systems is to look for a mechanism and theoretical approach that underpins them all. A cross-disciplinary, inter-disciplinary or, perhaps, discipline agnostic approach. As play is such a fundamental building block of culture, society, technology and cognition, it is an ideal lens through which to examine the interactive experience. It is versatile enough to cross boundaries and fundamental enough to be understood universally, at least in terms of experience even when it defies concrete explanation (which is part of its power and charm).


A deeper understanding of play alone is a powerful tool with which to analyse the interactive experience, but combining it with the understanding of metaphor, embodied cognition and movement, themselves related to play, helps develop a more grounded and rounded theory upon which to draw. If we re-think Saffer’s Venn diagram and instead look at the common elements between play, embodied cognition and movement and metaphor, the interactive experience sits at the crossover point of the three:



Figure 30: The interactive experience at the centre of play, metaphor, embodied cognition & movement

Almost every interaction – whether machine-to-human or human-to-human, one-to-one or a socially networked service – is metaphorical either in representation or mental model. Understanding the physical relationship of embodied cognition to the construction and use of metaphors helps us to understand why certain interactions and interfaces feel more intuitive and natural than others. This is particularly pertinent in relation to play because of play’s pre-verbal nature. This, in turn, helps us to design intelligent computer-based systems of interaction and interface because an understanding of cognition from an embodied realism standpoint most accurately represents how intelligence develops (cf. Brooks, 1991b; Bailey et al., 1997; Narayanan, 1997; Lakoff & Johnson, 1999; Brooks, 2002; Johnson & Lakoff, 2002).


As we have seen, play is also highly metaphorical, requiring an agreed belief in the potency of rules and the boundaries of the “magic circle”. Sometimes these spaces are physical (such as lines on the ground), often they are metaphorical and intangible (such as the boundaries of good sportsmanship or social expectations), sometimes they are both (such as a child deciding a cardboard box is a boat and thus the kitchen floor is shark infested water upon which one cannot walk).


Like metaphors, rules of play are agreed forms and codes of understanding – the willing suspension of disbelief is essential to both. We must be able to understand, for example, that disbelief is not really a machine whose running can be suspended, but that we can understand it as such. We know there is no ultimate set of rules for every eventuality in a game, even though we play as if there is (Sniderman, 2005). Players also take on metaphorical roles, from being the operator or interactor through to being the lion or pirate, or even the judge or the devout lover in our cultural and social ‘games’.


Play begins as a physical and movement-based form. We explore the boundaries and workings of our own bodies, we explore the world around us and its affordances, including those of social relationships. Brown and Vaughan (2009) describe this with clear echoes of Lakoff and Johnson’s (1999) embodied realism:

Movement is primal and accompanies all the elements of play [...] even word or image movement in imaginative play. If you don’t understand and appreciate human movement, you won’t understand yourself or play. Learning about self-movement creates a structure for an individual’s knowledge of the world–it is a way of knowing. Through movement play, we think in motion. Movement structures our knowledge of the world, space, time and our relationship to others (Brown & Vaughan, 2009, p. 84).

In interactive interfaces, metaphor, movement, time, space and relationships all converge. We think of the patterns of light and dark and colour on a screen as objects. We manipulate these objects as if they were real objects, sometimes with magical, metaphorical properties. At the same time we are moving our bodies, either by the use of the mouse or, increasingly, directly with our hands and gestures, in order to “move” these “objects” on the screen. We take on and play the roles required by each software application, many of which were previously the domain of specialist technicians. We “build” social networks and “chat” with “friends”, even though all the while our physical reality and actions are no different from when we’re typing in a word processor or reading a blog post.


In particular, multitouch and gestural interfaces collapse the boundary between the physical action and the metaphorical action. Even though the metaphor is often still present (such as “pinching” an image to make it smaller) it has become invisible, subsumed by the physical action that is similar or identical to the one we would use if the virtual object were real. It is only when the system breaks that we are jolted out of the “mimetic context”, as Laurel describes it (1993, p. 102).


Interactivity comes in myriad forms and continues to become a feature of more and more aspects of our lives. Interactivity is a complex, many-sided, overlapping concept and it is no wonder conventional post-structuralist readings of interactive media (e.g., Manovich, 2001) or task-oriented, straight-line approaches to usability and HCI (e.g., Nielsen, 1993) have failed to encompass its lotus flower-like interrelated complexity. Using our understanding of play, which is highly intuitive but often escapes concrete definition, allows us to make sense of the complexity and still understand, or at least appreciate, each petal’s singular nature and still respect its relationship to the whole.


The principles of interactivity outlined above can be used as a lens through which to view everything from single, solidly task-oriented aspects of interactive interfaces through to complex, messy interactions such as social and service interactions, along with all the variations between those two ends of the spectrum. The principles also provide a toolkit from which to build interactives. How solid – or, perhaps, flexible – these principles are when applied to wider and wider fields is the remit of future research, especially as newer services and technologies develop that make the objects of analysis even more complex.


The growing interest in play and its role in redefining cultural rules and roles in a post-industrial society is not to be underestimated. Although this is not directly related to the central question of interactivity in this thesis, the impact of the shifting role of play and new technology within culture has profound implications for anyone working in the area. Due to the inevitable nature of long research projects examining emerging and shifting technologies, this will have to be the subject of future research dissertations. However, some issues as they relate to this research are explored in the following postscript.


Social Play & Cultural Disruption

Throughout my research into play, interactivity and emerging technologies a common theme kept arising with regard to the importance of their driving role within culture and the profound changes they create. Profound because the creation of many contemporary interactives is concerned with connecting people, whether through a mobile device’s interface, through a shared online experience and network or through a shared offline experience such as an installation or as part of a service. The following deals with many conjectures about the future, albeit by tracing trajectories from the past and present. It is necessarily less rigorous than the rest of this thesis because the future is, of course, unknown. However, many of the themes and issues described are already emerging, if not fully-formed, and are worthy of discussion as final thoughts and a bridge to future research.


The effects, shifting mind-states and resulting artefacts ripple out into the wider society as more and more of these interactive systems and interfaces have their inner workings opened up for others to add to. They affect everything from sustainable futures through to deep social equality issues and the foundations of the industrial age. Their warning signals come from a variety of directions, but all point to the same trend – the shift from a centralised, controlled, efficiency-obsessed Industrial Revolution model of manufacture and society to one that is decentralised but openly interconnected, allowing for spontaneous interactions between niche groups able to create waves of change.


Mark Pesce (2000) ends his treatise on the playful world with a vision of a future in which nanotechnology and the evolving ‘active’ intelligence of the computer meet, creating a world in which anything can be an interface and everything and everyone is connected. Some of these developments are already here, taking form in new Web applications, the blogosphere, services like Twitter, gestural interfaces, mobile ever-presence and smart surfaces (see Klooster, 2009, for more on smart surfaces). The exact level of these individual advancements in technology is somewhat less important than the timing of them converging and connecting together. For this Pesce borrows Whitehead’s (1978) term of concrescence. “As things grow together, they become a new, unique thing” (Pesce, 2000, p. 268). A recent and well-known example is “Web 2.0”, which is really just a collection and interlinking of several existing technologies and services (O'Reilly, 2005), but whose impact as a whole “entity” has been enormous. Pesce continues:

We are on the threshold of an era of active objects, designed from the ground up, atom by atom. They will contain their own intelligence, so they can listen and respond to our needs. Although solid, they will be changeable, able to transform themselves nearly instantaneously, to meet the requirements of the moment (Pesce, 2000, p. 270).

This gives rise to the question of how one designs an interface for, or interactions with, something that is constantly changeable and transformable. Although it does not physically change, in many respects the iPhone and similar multi-touch devices that only have a screen and virtually no buttons are good prototypes for this situation. An iPhone that is not turned on is an impressively designed glass brick. When it is turned on it can mutate into a telephone, a Web browser, a spirit level, a compass, a patient’s health analyser, or any one of myriad games and applications. And as we’ve already seen, the key feature of ‘learning’ how to use the iPhone is to simply play with it and discover. The future of mutable, physical objects and interfaces that Pesce describes will need to be playful in order for us to make sense of them. Those that do not will die out through lack of use in a world of thousands of alternatives, he argues. Naturally, the social and cultural implications of such a world are not lost in his argument:

Those who have grown up in a world where the physical persistence of objects is an assumed absolute might find the plastic reality of ever-mutable objects to be a disorienting nightmare; our children will know nothing but the possibility of change.


As the children of the playful world grow, as they internalize their new truths and translate them into new cultural values, an older generation is likely to react with shock and horror as they see the truths of their own culture ignored (or defied) by a generation who knows that the rules have changed, who have learned a different way of being in the world (Pesce, 2000, pp. 270-272).

It is difficult to think of play without thinking of children, of course, and that opens up concerns ranging from the future of education (cf. Polaine, 2007) to the future of society, nationhood and sustainability.


In The Play Ethic Pat Kane (2004) examines the changing nature of society and a break away from the Industrial Revolution’s Protestant remit of the work ethic. Kane argues that the mind-body split that developed along with the notion of the efficient factory worker makes no sense in an economy that has moved from producing goods to producing knowledge. If an innovation consultant has her best idea at home, who does it belong to, her or her employer? The idea of selling one’s labour in terms of physical time at the grindstone dissolves in a world where one is permanently connected. We see the tension between the old centralised idea of the corporation and the decentralised world of the Internet in action when employers might discipline employees for using Facebook or blogging during their working hours, yet expect them to be reachable and respond to e-mail or mobile during evenings and weekends.


We can replace the “mutable objects” in Pesce’s description with people in the medium of culture. People are the atoms that can re-configure to meet the requirements of the moment. Connected and mobile technologies facilitate this just as surely as chemical signals do in an ant colony (Johnson, 2001).


Both Kane (2004) and Pesce (2000) point to the shift in education that the Industrial Revolution created, from teaching in the home or immediate community to a regimented, very unplayful, mass education based on a theory of the transmission of knowledge from the broadcasting teacher to the receiving students. Playful discovery is largely killed off and certainly discouraged in this form of education – a form that has merely changed its technological clothes over the last 200 years but whose structure has essentially remained the same until now. Even now change is largely being driven by the generation of pupils and students rather than the governments, teachers or lecturers (Polaine, 2007). Yet as Stuart Brown (Brown & Vaughan, 2009) notes,

The neuroscience of play has shown that this is the wrong approach, especially considering that students today will face work that requires much more initiative and creativity than the rote work this educational approach was designed to prepare them for (Ibid., p. 99).

Brown points to neurological research (Gordon, Burke, Akil, Watson, & Panksepp, 2003; Panksepp, 2004) that shows play not only helps focus during learning, but also the retention of knowledge.


On a sustainability front, William McDonough and Michael Braungart’s ground-breaking book, Cradle to Cradle: Remaking the Way We Make Things (2002), rails against the Industrial Revolution’s obsession with efficiency. Whilst it has brought us many things, they argue, it has also filled the world with toxic effluent, unhealthy workplaces and homes and is unable to ever become a truly sustainable way of living. The more efficient cars become, for example, the less obvious the damage they do to the environment. But the damage is still there and it is in many ways worse because of its invisibility – it is all too easy for consumers to put it to the back of their minds.


Instead, McDonough and Braungart argue for effective rather than efficient systems using the example of a cherry tree as a metaphor. An efficient, Industrial Revolution style cherry tree would grow only one cherry and that one cherry would drop and take seed and grow another tree. By contrast, of course, a real cherry tree grows many cherries and most of them do not take seed, but the waste is food. Despite all the resources it may have used whilst growing, because of its conversion of renewables in the form of sunlight and rainwater, it leaves the environment a better place than before as well as providing beauty and a habitat for other creatures and plants.


Taking “waste equals food” as their mantra, McDonough and Braungart set about creating products and systems that are also effective systems. They successfully helped a fabric factory develop a manufacturing process that resulted in the effluent from the factory being cleaner than the influent and a sofa that could simply be composted at the end of its life. During their research for the project they found that most suppliers of materials were extremely resistant to such change, even though it was highly beneficial both environmentally and financially.


Almost thirty years ago, in The Third Wave (Toffler, 1981), his seminal study of the future, Alvin Toffler predicted most of the shifts in culture we are currently experiencing. Whilst his predictions on space exploration and mariculture have not come to pass and the shadow of the Cold War is no longer with us, his opening paragraph is eerily prescient:

In a time when terrorists play death-games with hostages, as currencies careen amid rumours of a third World War, as embassies flame and storm troopers lace up their boots in many lands, we stare in horror at the headlines. The price of gold–that sensitive barometer of fear–breaks all records. Banks tremble. Inflation rages out of control. And the governments are reduced to paralysis or imbecility (Ibid., p. 1).

Toffler’s insight was another example of concrescence. He connected the dots of seemingly chaotic events and change happening in the world and showed how they were caused by three successive waves, each one lasting for a dramatically shorter time than the last. Dramatic upheaval ensues as cultures and world visions clash and power relationships shift when and wherever a wave sweeps over the previous wave’s culture as it rises or recedes.


The First, agrarian, Wave changed society from a nomadic hunter-gatherer society to one rooted to a certain patch of land. The Second Wave was the Industrial Revolution built upon the “codes” of “standardization, specialization, synchronization, concentration, maximization and centralization” (Toffler, 1981) along with the regulatory power structures to manage them – central government (whether capitalist or socialist), the corporation and the regulators or executives of these structures. The Second Wave had far reaching effects:

Industrialism was more than smokestacks and assembly lines. It was a rich, many-sided social system that touched every aspect of human life and attacked every feature of First Wave past. It produced the great Willow Run factory outside Detroit, but it also put the tractor on the farm, the typewriter in the office, the refrigerator in the kitchen. It produced the daily newspaper and the cinema, the subway and the DC-3. It gave us cubism and twelve-tone music. It gave us Bauhaus buildings and Barcelona chairs, sit-down strikes, vitamin pills, and lengthened life spans. It universalized the wristwatch and the ballot box. More important, it linked all these things together–assembled them, like a machine–to form the most powerful, cohesive and expansive social system the world had ever known (Ibid., pp. 22-23).

Education, he suggests, was built on the same factory model and whilst reading, writing, arithmetic and a few other subjects were the “overt curriculum”, the “covert curriculum ... was far more basic. It consisted – and still does in most industrial nations – of three courses: one in punctuality, one in obedience, and one in rote, repetitive work” (Ibid., p. 29).


Critically, the Second Wave separated the consumer from the producer and created a psychic split in culture that is being re-joined by much of the social play described earlier, albeit in a different form. On the one hand in our producer role we are told to be disciplined, task-oriented, restrained and delay gratification, writes Toffler. On the other hand, as consumers, we are taught “to seek instant gratification, to be hedonistic rather than calculating, to abandon discipline, to pursue individualistic pleasures–in short, to be a totally different kind of person” (Ibid., p. 42).


It is possible to re-think many other products and ways of living and working using the cradle-to-cradle mentality. Thinking of higher education public funding as a fossil fuel that is becoming rarer and rarer and not ever coming back, for example, makes one realise that the “do more with less” maxim of the Industrial Revolution is not sustainable in this context (as most lecturers and students will confirm!). What is required is a radical re-think from the old way of doing things to something with a completely different culture, much of which is driven and supported by networked, social, decentralised and constantly morphing technologies.


The Third Wave, as Toffler describes it, echoes across many valleys of society just as the Second Wave rushed through them like raging torrents.

Third Wave civilization turns out to have many features–decentralized production, appropriate scale, renewable energy, de-urbanization, work in the home, high levels of prosumption [consumers as producers], to name just a few (Ibid., p. 336).

In many cases this is a society of players rather than workers. A cultural re-think from task and efficiency-oriented systems, in which the computer and other interactive interfaces are presented as tools to get work done, to a personal and playful culture. A culture in which the consumer is also the producer through open-source projects, blogs, reviews, YouTube, Flickr, Twitter and the many similar services. A culture in which media is a personalised, niche Long Tail (Anderson, 2006) selection as opposed to top-down, mass media monoliths. A culture that is using the tool of the Internet, ironically developed for one of the most regimented and hierarchical Second Wave systems – the U.S. military – as a means to re-connect and re-build smaller villages and communities. These are communities populated by shared interests irrespective of geographical boundaries, demolishing the Second Wave notion of nation states.


Yet this is not just play as an alternative to work or “seriousness”. This is social play that has serious consequences. Open-source thinking in politics (Rushkoff & Demos, 2003) threatens government authority whilst pro-ams playing with new technologies in their own garages and basements disrupt the traditional dominance of big institution research and development (Leadbeater & Miller, 2004).


At the time of writing, Twitter, at once a giant chat space, an instant messaging tool and a micro-blogging platform, has risen from back-channel to central news source for millions of Iranians – and their supporters across the globe – disenfranchised by a government which has allegedly rigged elections and made every effort to choke off mainstream media. So much so that people were turning to Twitter rather than CNN for their news coverage, even going so far as to blast CNN via the use of the #CNNfail hash tag, which peaked enough to provoke CNN to respond (Cashmore, 2009).


Twitter has emerged as a classic example of what happens when a simple interactive application that invites play and exploration, has simple rules and connects people and conversations gains enough players. It moves from being trivial to a powerful social force for change or, rather, it becomes this force whilst still retaining the so-called trivial, for this is what keeps it engagingly human.


Designing the tools, toys and spaces for the prosumer generation requires a different way of thinking about processes and usage. Instead of the straight line, task-based thinking advocated by most theoretical and methodological frameworks regarding usability and human-computer or even human-human and producer-consumer interaction, we need to think in terms of dynamic systems. Playful, networked and game-like systems encompass order and messiness, rules and free-for-all simultaneously. This kind of thinking ultimately leads to service design, which has grown out of product and interaction design. The key to service design is connected thinking, as Chris Downs, co-founder of pioneering service design agency Live|Work, says,

We are children of the Web. Before our generation, interaction designers were dealing with interfaces. We started making Web sites in 1994, so we have our heads wired to think about networks. When you deepen an interaction beyond an interface and think about the network, you eventually end up at the service (Moggridge, 2006, p. 413).

Service design is the natural meeting point for play, interaction, sustainability and experiences for they are all the tangible touchpoints of the intangible ‘service’. “As designers, we know how to inscribe cultural meaning into objects and interfaces, but we don’t know how to inscribe cultural meaning into a service,” continues Downs (Ibid., p. 421). We do, however, know how to inscribe cultural meaning into games and play and they form an excellent and versatile lens through which to understand the complete range of interactions in our complex contemporary lives. Through play we experience our first interactions with the world and through play we can design all the ones that follow.



The references below are in APA 5th format.


37 Signals. (2006). Getting Real (PDF eBook ed.). Chicago: 37 Signals.

Aarseth, E. (2001). Computer Game Studies, Year One. Game Studies, 1(1), 05-10.

Aarseth, E. (2003). Playing Research: Methodological approaches to game analysis. Paper presented at the Fifth International Digital Arts and Culture Conference, Melbourne.

Aarseth, E. (2004). Genre Trouble: Narrativism and the Art of Simulation. In N. Wardrip-Fruin & P. Harrigan (Eds.), First Person: New Media as Story, Performance, and Game (pp. 45-55). Cambridge MA: The MIT Press.

Aarseth, E. J. (1997). Cybertext: Perspectives on Ergodic Literature. Baltimore: The Johns Hopkins University Press.

Agarawala, A., & Balakrishnan, R. (2006). Keepin’ It Real: Pushing the Desktop Metaphor with Physics, Piles and the Pen. Paper presented at the Computer Human Interaction Conference 2006, Montréal, Québec, Canada.

Akimoto, A. (2008). iPong on multiple iPod Touch. Retrieved 1st September, 2008, from


Alba, J. W., & Hasher, L. (1983). Is memory schematic? Psychological Bulletin, 93(2), 203-231.

Allenson, A., Baumann, J., Blaha, T., Bhangal, S., Cameron, A., Clayden, J. et al. (2001). Director 8.5 Studio. Birmingham: Friends of Ed.

Allenson, A., Cameron, A., Quesne, R. L., Pendrell, L., Pendrell, S., Polaine, A. et al. (1994). Anti-Rom [CD-ROM]. London: Antirom.

Allenson, A., Cameron, A., Quesne, R. L., Pendrell, L., Pendrell, S., Polaine, A. et al. (1995). Levi's In-Store Interactive Kiosk.

Allenson, A., Cameron, A., Quesne, R. L., Pendrell, L., Pendrell, S., Polaine, A. et al. (1996a). Antirom, JAM.

Allenson, A., Cameron, A., Quesne, R. L., Pendrell, L., Pendrell, S., Polaine, A. et al. (1996b). Levi's White Tab Interactive Window.

Anderson, C. (2006). The Long Tail: How Endless Choice Is Creating Unlimited Demand. New York: Random House Business Books.

Anderson, L. (1995). Puppet motel [CD-ROM]. New York: Voyager.

Apple Inc. (2008). iPod Touch TV Ad. Retrieved 1st August, 2008, from http://www.apple.com/ipodtouch/ads/.

Ashby, F. G., Isen, A. M., & Turken, A. U. (1999). A neuropsychological theory of positive affect and its influence on cognition. Psychological Review, 106(3), 529-550.

Austin, J. (2002). The Graphics Book (D&AD Mastercraft). D&AD (British Design & Art Direction): Rotovision.

Bailey, D. (1997). A Computational Model of Embodiment in the Acquisition of Action Verbs. Berkeley CA.


Bailey, D., Feldman, J., Narayanan, S., & Lakoff, G. (1997). Modeling Embodied Lexical Development. Proceedings of the Nineteenth Annual Conference of the Cognitive Science Society: August 7-10, 1997, Stanford University.

Ballmer, S. (2008). Ballmer's Letter to Microsoft Employees: "FY09 Strategic Update". Business Week Retrieved 28th August, 2008, from http://www.businessweek.com/technology/content/jul2008/tc20080723_438404.htm.

Barbican Art Gallery. (2002). Game On: The history, culture and future of computer games.

Bargh, J. A., & Barndollar, K. (1996). Automaticity in action: The unconscious as repository of chronic goals and motives. In P. Gollwitzer & J. A. Bargh (Eds.), The Psychology of Action: Linking Cognition and Motivation to Behavior (pp. 457-481). New York: The Guilford Press.

Barnes-Powell, T. (2006). Blurring boundaries, extending horizons and creating new opportunities. Paper presented at the Centre for Learning and Teaching in Art and Design 2006 Enhancing Curricula Conference, Lisbon.

Bartlett, F. C. (1932). Remembering: A Study in Experimental and Social Psychology. Cambridge: Cambridge University Press.

Baum, L. F., & Granger, P. (1978). The Wizard of Oz. [London]: Armada.

Bazin, A., & Gray, H. (1967). What is cinema? Berkeley: University of California Press.

Biggs, J. B. (2003). Teaching for quality learning at university. Buckingham, UK: Society for Research into Higher Education & Open University Press.

Boroditsky, L. (2000). Metaphoric structuring: understanding time through spatial metaphors. Cognition, 75(1), 1-28.

Boroditsky, L., & Ramscar, M. (2002). The roles of body and mind in abstract thought. Psychological Science, 13(2), 185-189.

Boyd, D. (2008). Friendship. In P. G. L. Horst, D. Mahendran, K. Martinez, C. J. Pascoe, D. Perkel, A. Manion, S. Mitnick, D. Schlossberg, S. Yardi, & I. Lead (Eds.), Hanging Out, Messing Around, Geeking Out: Living and Learning with New Media. Cambridge, MA: MIT Press.

Boyd, D. (2006). Viewing American class divisions through Facebook and MySpace. Apophenia Blog Essay Retrieved 20th October, 2006, from http://www.danah.org/papers/essays/ClassDivisions.html.

Bricklin, D. (1999). Software Arts and VisiCalc: The Idea. Retrieved 8th August, 2008, from http://www.bricklin.com/history/saiidea.htm.

Brookfield, S. D. (1995). Becoming a Critically Reflective Teacher (Jossey-Bass Higher and Adult Education). New York: Jossey Bass.

Brooks, D. (2008). The Behavioural Revolution. The New York Times: Opinion Retrieved 31st October, 2008, from http://www.nytimes.com/2008/10/28/opinion/28brooks.html.

Brooks, R. A. (1991a). Elephants Don't Play Chess. Designing Autonomous Agents: Theory and Practice from Biology to Engineering and Back, 3-15.

Brooks, R. A. (1992). Intelligence without representation. Foundations of Artificial Intelligence, MIT Press, Cambridge, MA, 139–159.

Brooks, R. A. (1986). Achieving Artificial Intelligence through Building Robots. Memo from MIT Artificial Intelligence Department.

Brooks, R. A. (1991b). New Approaches to Robotics. Science, 253(5025), 1227-1232.

Brooks, R. A. (2002). Flesh and machines: how robots will change us. New York: Pantheon Books.

Brown, D. (2007). Tricks of the Mind. London: Channel 4 Books.

Brown, S., & Vaughan, C. (2009). Play: How It Shapes the Brain, Opens the Imagination, and Invigorates the Soul. Avery.

Bullivant, L. (2006). Responsive environments : architecture, art and design. London: V&A.

Burton, E. (2004). Moovl [Java Application]. London: Soda Creative.

Buxton, B. (2007). Multi-Touch Systems that I Have Known and Loved. Microsoft Research Retrieved 18th August, 2008, from http://www.billbuxton.com/multitouchOverview.html.

Caillois, R. (1961). Man, Play and Games (M. Barash, Trans. 2001 ed.). Urbana and Chicago: University of Illinois Press.

Cameron, A. (1995). Dissimulations: Illusions of interactivity. Millennium Film Journal, 28(4).

Cameron, A. (1998). The Medium is Messy. Eye, 8(30), 6-7.

Cameron, A. (Ed.). (2004). The Art of Experimental Interaction Design. Hong Kong: Systems Design Ltd.

Cameron, A. (2005). Dinner with Myron Or: Rereading Artificial Reality 2: Reflections on Interface and Art. In J. Brouwer, A. Mulder, & A. Nigten (Eds.), ARt & D: research and development in art (pp. 10-26). Rotterdam: NAi Publishers/V2-Organization.

Carter, M., & Geczy, A. (2006). Reframing art. Oxford; New York: Berg.

Cartwright, K. (1991). Shakespearean Tragedy and Its Double: The Rhythms of Audience Response. University Park, PA: Penn State Press.

Carver, C. S., Ganellen, R. J., Froming, W. J., & Chambers, W. (1983). Modeling: An analysis in terms of category accessibility. Journal of Experimental Social Psychology, 19, 403-421.

Carver, C. S. (1996). Some Ways in Which Goals Differ and Some Implications of Those Differences. In P. M. Gollwitzer & J. A. Bargh (Eds.), The psychology of action : linking cognition and motivation to behavior (pp. 645-672). New York: Guilford Press.

Cashmore, P. (2009). #CNNfail: Twitter Blasts CNN Over Iran Election. Retrieved 26th July, 2009, from http://mashable.com/2009/06/14/cnnfail/.

Coates, T. (2006). Greater than the sum of its parts. Paper presented at The Future of Web Apps, San Francisco.

Cohen, P. (2008). Review: Super Monkey Ball for iPhone. Retrieved 1st September, 2008, from


Cornock, S., & Edmonds, E. (1973). The Creative Process Where the Artist Is Amplified or Superseded by the Computer. Leonardo, 6(1), 11-16.

Costello, B., Muller, L., Amitani, S., & Edmonds, E. (2005). Understanding the experience of interactive art: Iamascope in Beta_space. Sydney, Australia.

Crawford, C. (2000). Understanding interactivity. San Francisco: No Starch Press.

Cringely, R. X., Gau, J., Segaller, S., Sen, P., Gates, B., Jobs, S. et al. (1996). Triumph of the Nerds [Documentary]. New York: Ambrose Video Publishing Inc.

Csikszentmihalyi, M. (1975). Beyond boredom and anxiety. San Francisco: Jossey-Bass Publishers.

Csikszentmihalyi, M. (1990). Flow : the psychology of optimal experience (First ed.). New York: Harper and Row.

Csikszentmihalyi, M. (1992). Flow : the psychology of happiness. London: Rider.

Csikszentmihalyi, M. (1996). Creativity : flow and the psychology of discovery and invention. New York: HarperCollins.

Csikszentmihalyi, M. (1997a). Finding flow : the psychology of engagement with everyday life. New York: BasicBooks.

Csikszentmihalyi, M. (1997b). Living well : the psychology of everyday life. London: Weidenfeld & Nicolson.

Csikszentmihalyi, M. (2002). Flow : the classic work on how to achieve happiness (Rev. ed.). London: Rider.

Csikszentmihalyi, M., & Csikszentmihalyi, I. S. (1988). Optimal experience : psychological studies of flow in consciousness. Cambridge: Cambridge University Press.

Davies, C. (1995). Osmose.

Davies, C. (1998). Ephémère.

Dawes, B. (2006). Amazon.com Author Blog. Retrieved 5th June, 2009, from http://www.amazon.com/Brendan-Dawes/e/B001IR19QS.

Debatty, R. (2009). We Make Money Not Art. Retrieved 7th June, 2009, from http://www.we-make-money-not-art.com.

Dekel, A., Simon, Y., Dar, H., Tarazi, E., Rabinowitz, O., & Sterman, Y. (2005). Adding Playful Interaction to Public Spaces. Paper presented at the INTETAIN 2005 Conference on Intelligent Technologies for Interactive Entertainment, Madonna di Campiglio, Italy.

Dell Inc. (2008). Windows Vista Downgrade Service & XP End of Life. Retrieved 28th August, 2008, from http://direct2dell.com/smallbusiness/archive/2008/05/01/windows-vista-downgrade-service-amp-xp-end-of-life.aspx.

Derrida, J., & Bass, A. (1987). The post card : from Socrates to Freud and beyond. Chicago: University of Chicago Press.

Eco, U. (1976). A theory of semiotics (Advances in semiotics). Bloomington: Indiana University Press.

Edmonds, E. (2006). New directions in interactive art collaboration. CoDesign, 2(4), 191-194.

Edmonds, E. (2007). Creativity and Cognition Studios Website. Retrieved 1st November, 2007, from http://www.creativityandcognition.com/.

Edmonds, E., Turner, G., & Candy, L. (2004). Approaches to interactive art systems. Paper presented at the GRAPHITE '04: The 2nd international conference on Computer graphics and interactive techniques in Australasia and South East Asia, Singapore.

Eidos Interactive. (1998). Tomb raider [Videogame]. San Francisco: Eidos Interactive.

Ellison, N. B., Steinfield, C., & Lampe, C. (2007). The benefits of Facebook "friends": Social capital and college students' use of online social network sites. Journal of Computer Mediated Communication, 12(4), 1143.

Ericsson, K. A., & Simon, H. A. (1993). Protocol Analysis - Rev'd Edition: Verbal Reports as Data. Cambridge, MA: MIT Press.

Eskelinen, M. (2004). Towards Computer Games Studies. In N. Wardrip-Fruin & P. Harrigan (Eds.), First Person: New Media as Story, Performance, and Game (pp. 36-44). Cambridge, MA: MIT Press.

Fabrica. (2001). Rimbalzo [CD-ROM]. Treviso, Italy: Fabrica.

Farnham, S. D. (2008). Facebook Application Ecosystem: Why Some Thrive and Most Don't. O’Reilly Radar Report, March 2008 Retrieved 17th June, 2009, from http://radar.oreilly.com/research/facebook-app-eco-report.html.

Field, S. (1984). Screenplay: The Foundations of Screenwriting; A step-by-step guide from concept to finished script. New York: Bantam Dell/Random House.

Flanagan, M. (2006). [giantJoystick][Interactive Installation].

Frakes, D. (2008). 10.5.2 update shows Apple listens to users. Retrieved 19th February, 2008, from http://www.macworld.com/article/131902/2008/02/1052applelistens.html.

Frasca, G. (1999). Ludology meets Narratology: Similitude and differences between (video)games and narrative. Parnasso, 3.

Frasca, G. (2003). Ludologists love stories, too: notes from a debate that never took place. Paper presented at the Level Up: Digital Games Research Conference, Utrecht.

Freud, S. (1950). Beyond the pleasure principle (J. Strachey, Trans. The International psycho-analytical library ; no. 4). London: Hogarth Press.

Friedman, A. (1979). Framing pictures: the role of knowledge in automatized encoding and memory for gist. Journal of Experimental Psychology: General, 108(3), 316-355.

Gabriel, P. (1993). Xplora I: Peter Gabriel's secret world [CD-ROM]. London: Real World Multi-Media.

Gabriel, P. (1996). Eve [CD-ROM]. London: Real World MultiMedia.

Garrett, J. J. (2003). The elements of user experience: User-centered design for the Web. New York: American Institute of Graphic Arts.

Gaver, W. W., Beaver, J., & Benford, S. (2003). Ambiguity as a resource for design. Paper presented at the CHI '03: SIGCHI conference on Human factors in computing systems, Ft. Lauderdale, Florida, USA.

Gibbs, G. (1999). Using assessment strategically to change the way students learn. Assessment matters in higher education, 41-53.

Gibson, J. J. (1979). The ecological approach to visual perception. Boston: Houghton Mifflin.

Gladwell, M. (2002). The Tipping Point: How Little Things Can Make a Big Difference. London: Abacus.

Gladwell, M. (2005). Blink: The Power of Thinking Without Thinking. London: Allen Lane.

Goldman, W. (2001). Which Lie Did I Tell? London: Bloomsbury.

Gollwitzer, P. M., & Bargh, J. A. (1996). The psychology of action : linking cognition and motivation to behavior. New York: Guilford Press.

Gordon, N. S., Burke, S., Akil, H., Watson, S. J., & Panksepp, J. (2003). Socially-induced brain 'fertilization': play promotes brain derived neurotrophic factor transcription in the amygdala and dorsolateral frontal cortex in juvenile rats. Neuroscience Letters, 341(1), 17-20.

Graham, D. (2000). Two-way Mirror Power: Selected Writings by Dan Graham on His Art. Cambridge, MA: MIT Press.

Groos, K. (1898). The Play of Animals (E. L. Baldwin, Trans.). New York: Appleton.

Groos, K. (1901). The Play of Man (E. L. Baldwin, Trans.). New York: Appleton.


Gross, R., & Acquisti, A. (2005). Information revelation and privacy in online social networks. Paper presented at the WPES ’05, Alexandria, Virginia.

Gygax, G. (1979). Dungeon Master's Guide: Essential Reference Information for Gamemastering Advanced D&D. Wisconsin: TSR Games.

Han, J. Y. (2006). Multi-touch interaction wall. International Conference on Computer Graphics and Interactive Techniques.

Haque, U., & Pangaro, P. (2006). Paskian Environments. Paper presented at the Game Set and Match II, Delft University, The Netherlands.

Haraway, D. (1991). A cyborg manifesto: Science, technology, and socialist-feminism in the late twentieth century. In Simians, cyborgs and women : the reinvention of nature. New York: Routledge.

Harmonix Music Systems. (2005). Guitar Hero [Videogame]. Santa Monica, CA: Activision Inc. & RedOctane.

Harmonix Music Systems. (2007). City Wall.

Harris, J., & Kamvar, S. (2005). We Feel Fine. Retrieved May 11th, 2006, from http://www.wefeelfine.org/.

Harvey, A. (2006). You Mean It’s Only a Game? Rule Structures, the Magic Circle, and Player Participation in Pervasive Mobile Gaming. Paper presented at the Canadian Games Study Association 2006 Symposium.

Harwood, G., Fuller, M., Edgell, S., Ashworth Hospital Authority, & Ashworth Arts. (1996). Rehearsal of memory.

Hasegawa, T. (1996). Microphone Fiend. CD-ROM.

Heidegger, M. (1962). Being and Time. Blackwell Publishing.

Herz, J. C. (1997). Joystick nation: how videogames gobbled our money, won our hearts and rewired our minds. London: Abacus.

Higginbotham, W. (1958). Tennis for Two [Unpublished Videogame]. New York: Brookhaven National Laboratory.

Hoogstraten, S. v. (1675). Inleiding tot de Hogeschool der Schilderkunst. Rotterdam.

Huhtamo, E. (2004). Trouble at the Interface, or the Identity Crisis of Interactive Art. Framework, The Finnish Art Review, 2004(2).

Huizinga, J. (1955). Homo Ludens: A study of the play element in culture. Boston: The Beacon Press.

Igoe, T. (2007). Making Things Talk: Practical Methods for Connecting Physical Objects. Sebastopol, CA: O'Reilly.

Iwai, T. (1992). Musical Insects.

Iwai, T., & Nintendo. (2005). Electroplankton [Nintendo DS Game].

Jenkins, H. (2004). Game Design as Narrative Architecture. In N. Wardrip-Fruin & P. Harrigan (Eds.), First Person: New Media as Story, Performance, and Game (pp. 118-121). Cambridge, MA: MIT Press.

Johnson, B. (2007). Q&A: Will Wright, creator of the Sims. Retrieved 2nd January, 2008, from http://www.guardian.co.uk/technology/2007/oct/26/willwright.


Johnson, M., & Lakoff, G. (2002). Why cognitive linguistics requires embodied realism. Cognitive Linguistics, 13(3), 245-263.

Johnson, M. K., & Raye, C. L. (1981). Reality Monitoring. Psychological Review, 88(1), 67-85.

Johnson, S. (2006). Everything Bad Is Good for You: How Popular Culture Is Making Us Smarter. London: Penguin Books Ltd.

Johnson, S. (1997). Interface culture : how new technology transforms the way we create and communicate. San Francisco: HarperEdge.

Johnson, S. (2001). Emergence : the connected lives of ants, brains, cities, and software. New York ; London: Scribner.

Joinson, A. N. (2008). Looking at, looking up or keeping up with people?: motives and use of facebook. Paper presented at the CHI 2008, Florence, Italy.

Juul, J. (2001a). A clash between game and narrative: A thesis on computer games and interactive fiction. Institute of Nordic Language and Literature Masters Thesis. Retrieved 17th September, 2008, from http://www.jesperjuul.net/thesis/.

Juul, J. (2001b). Games telling stories? Game Studies, 1(1).

Juul, J. (2005). Half-Real: Video Games between Real Rules and Fictional Worlds. Cambridge, MA: The MIT Press.

Kane, P. (2004). The play ethic: a manifesto for a different way of living. London: Macmillan.

Kant, I. (1855). Critique of Pure Reason (J. M. D. Meiklejohn, Trans.). London: Henry G. Bohn.

Kelley, T., & Littman, J. (2001). The art of innovation : lessons in creativity from IDEO, America's leading design firm (1st ed.). New York: Currency/Doubleday.

Kempton, W. (1986). Two theories of home heat control. Cognitive Science, 10(1), 75-90.

Kent, S. L. (2002). The Ultimate History of Video Games. New York: Three Rivers Press.

King, A. (2008). Website Optimization: Speed, Search Engine & Conversion Rate Secrets. New York: O'Reilly Media, Inc.

Kirschner, R., Morawe, V., & Reiff, T. (2002). PainStation.

Klooster, T. (2009). Smart Surfaces and Their Application in Architecture and Design. Basel: Birkhäuser Verlag AG.

Konami Corporation. (1998). Dance Dance Revolution [Arcade Game]. Tokyo: Konami Corporation.

Krikke, J. (2003). Samurai Romanesque, J2ME, and the Battle for Mobile Cyberspace. IEEE Computer Graphics and Applications, 23(1), 16-23.

Krueger, M. (1974). Videoplace.

Krug, S. (2006). Don't make me think! : a common sense approach to Web usability (2nd ed.). Berkeley, Calif: New Riders Pub.

Lakoff, G., & Johnson, M. (1980). Metaphors we live by. Chicago: The University Of Chicago Press.

Lakoff, G., & Johnson, M. (1999). Philosophy in the flesh : the embodied mind and its challenge to Western thought. New York: Basic Books.


Lampe, C., Ellison, N., & Steinfield, C. (2007). A familiar Face (book): Profile elements as signals in an online social network. Paper presented at the CHI 2007, San Jose, CA.

Laurel, B. (1993). Computers as theatre. Reading, Mass: Addison-Wesley.

Leadbeater, C., & Miller, P. (2004). The pro-am revolution : how enthusiasts are changing our society and economy. London: Demos.

Levin, G. (2005). An Informal Catalogue of Slit-Scan Video Artworks. World Wide Web. Retrieved 2nd August, 2006, from http://www.flong.com/texts/lists/slit_scan/.

London Studios. (2005). Play 2 [Playstation 2 DVD-ROM]. London: Sony Computer Entertainment Europe.

Lovink, G. (2002). Dark fiber : tracking critical internet culture. Cambridge, MA: The MIT Press.

Lozano-Hemmer, R. (2001a). Body Movies (Rotterdam) Rotterdam.

Lozano-Hemmer, R. (2001b). Body Movies documentation video.

Lozano-Hemmer, R. (2003). Body Movies (Duisburg) Duisburg Akzente, Germany.

Mahoney, J. (2008). Super Monkey Ball for iPhone Lightning Review: Addictive, Hard as Hell. Retrieved 1st September, 2008, from http://gizmodo.com/5024031/super-monkey-ball-for-iphone-lightning-review-addictive-hard-as-hell.

Manovich, L. (2001). The Language of New Media. Cambridge, MA: The MIT Press.

Mathwick, C., & Rigdon, E. (2004). Play, Flow, and the Online Search Experience. Journal of Consumer Research, 31(2), 324-332.

Matthews, E. (2002). The Philosophy of Merleau-Ponty. Chesham, Buckinghamshire: Acumen Publishing Ltd.

Maxis. (1989). Sim City [Computer & Video Game]. Eugene, Oregon: Brøderbund, Maxis, Nintendo Electronic Arts & Superior Software/Acornsoft.

Maxis. (2000). The Sims [Computer Game]. Redwood City, CA: Electronic Arts.

McAllister, N. (2008). Tech's all-time top 25 flops. Retrieved 28th August, 2008, from


McDonough, W., & Braungart, M. (2002). Cradle to Cradle: Remaking the Way We Make Things. New York: North Point Press.

McKee, R. (1997). Story : substance, structure, style and the principles of screenwriting (1st ed.). New York: ReganBooks.

McLuhan, M. (2001). Understanding Media (Routledge Classics). London: Routledge.

Merleau-Ponty, M. (1964a). Signs. Evanston, Illinois: Northwestern University Press.

Merleau-Ponty, M. (1964b). Primacy of Perception. Evanston, Illinois: Northwestern University Press.

Merleau-Ponty, M. (1964c). Sense and Non-Sense. Evanston, Illinois: Northwestern University Press.

Merleau-Ponty, M. (1990). Phenomenology of Perception. London: Routledge.

Merleau-Ponty, M. (2004). World of Perception. London: Routledge.



Microsoft Corp. (2008). Microsoft’s Annual Revenue Reaches $60 Billion. Microsoft Investor Relations News Release Retrieved 28th August, 2008, from http://www.microsoft.com/msft/earnings/FY08/earn_rel_q4_08.mspx.

Microsoft Corporation. (2008). The Mojave Experiment. Retrieved 29th July, 2008, from


Miller, P. (2008). Mojave Experiment Goes Live, Doesn't Fail To Annoy. Retrieved 29th July, 2008, from http://www.engadget.com/2008/07/29/mojave-experiment-goes-live-doesnt-fail-to-annoy/.

Mine-Control. (2004). Marketing Document. http://www.mine-control.com.

Minsky, M. L. (1974). A framework for representing knowledge. Cambridge, MA: Massachusetts Institute of Technology A. I. Laboratory.

Minsky, M. L. (1975). A framework for the representation of knowledge. The psychology of computer vision. New York: McGraw-Hill.

Minsky, M. L. (1969). Semantic Information Processing. Cambridge, MA: The MIT Press.

Moggridge, B. (2006). Designing Interactions. Cambridge, MA: The MIT Press.

Montfort, N. (2004). Interactive fiction as "story," "game," "storygame," "novel," "world," "literature," "puzzle," "problem," "riddle," and "machine". In N. Wardrip-Fruin & P. Harrigan (Eds.), First Person: New Media as Story, Performance, and Game (pp. 310-317). Cambridge, MA: The MIT Press.

Mori, M. (1970). The Uncanny Valley. Energy, 7(4), 33-35.

Moses, A. (2006). Wii breaks Xbox sales record. Sydney Morning Herald.

Moulthrop, S. (2004). From Work to Play: Molecular Culture in the Time of Deadly Games. In N. Wardrip-Fruin & P. Harrigan (Eds.), First Person: New Media as Story, Performance, and Game (pp. 56-69). Cambridge, MA: The MIT Press.

Mulder, S., & Yaar, Z. (2006). The User Is Always Right: A Practical Guide to Creating and Using Personas for the Web. Berkeley, CA: New Riders Press.

Muller, L., & Edmonds, E. (2006). Living laboratories: making and curating interactive art. SIGGRAPH 2006 Electronic Art and Animation Catalog, 147–150.

Muller, L., Edmonds, E., & Connell, M. (2006a). Living laboratories for interactive art. CoDesign: International Journal of CoCreation in Design and the Arts, 2(4), 195-207.

Muller, L., Turner, G., Khut, G., & Edmonds, E. (2006b). Creating Affective Visualisations for a Physiologically Interactive Artwork. Information Visualization, 651-657.

Namco. (1980). Pac-Man [Arcade Game]. Tokyo: Namco/Midway.

Narayanan, S. S. (1997). Knowledge-based Action Representations for Metaphor and Aspect (KARMA). University of California.

National Gallery of Australia. (2005). futurePlay from The House of Tomorrow.

Nelson, R. (2004, November 3rd). Sensesurround. The Age.

Negroponte, N. (2006). One Laptop per Child. Podtech Podcast Retrieved 14th July, 2009, from


Nielsen, J. (1993). Usability engineering. Boston: Academic Press.

Nielsen, J. (2001). E-commerce user experience. Fremont, CA: Nielsen Norman Group.

Nielsen, J., & Tahir, M. (2002). Homepage usability: 50 websites deconstructed. Indianapolis: New Riders.

Nintendo Company Ltd. (2006). Wii Sports [Nintendo Wii Videogame]. Kyoto: Nintendo Company Ltd.

Nintendo Company Ltd. (2007). Wii Sports Pack Combo [Videogame]. Kyoto: Nintendo Company Ltd.

Norman, D. (1998). The design of everyday things. New York: Basic Books.

Norman, D. (2003). Affordances and Design. http://www.jnd.org/dn.mss/affordances-and-design.html.

Norman, D. A. (1983). Some observations on mental models. In D. Gentner & A. L. Stevens (Eds.), Mental Models (pp. 7-14). Hillsdale, New Jersey: Lawrence Erlbaum Associates.

Norman, D. A. (2005). Emotional Design: Why We Love (or Hate) Everyday Things. New York: BasicBooks.

Novak, T. P., & Hoffman, D. L. (1997). Measuring the flow experience among web users. Interval Research Corporation, 31.

Novak, T. P., Hoffman, D. L., & Yung, Y. F. (2000). Measuring the Customer Experience in Online Environments: A Structural Modeling Approach. Marketing Science, 19(1), 22-42.

NPD Group. (2008