Revisiting the voice in media and as medium: New materialist propositions

by Milla Tiainen


Approached with varying attention to its sensory, auditory, and signifying dimensions, over the past decades the voice has attracted repeated investigation and theorisation in research and discussions about media. The voice in media has been explored and conceptually defined by a series of undertakings, from enquiries in film studies to recent examinations of sound in the digital era. Whether overtly or more implicitly, diverse approaches both under and beyond film and media studies have also speculated about the voice itself as a medium or as involved in crucial processes of mediation. To this extent, the voice has become implicated in the very concepts of ‘media’ and ‘mediation’ – terms whose uses span a famously multi-layered range. In their recent project, Sarah Kember and Joanna Zylinska remind their readers of the manifold usages of the term ‘mediation’, not just in media studies but also in a number of coexisting domains, from Marxist theory to psychology and sociology.[1] Eva Horn refers to the cacophonous understandings and postulations linked to the term ‘medium’ as a constituting and fertile feature of the discipline of media studies. Emphasising the infeasibility of any fixed definitions and consensus, she asserts how ‘[d]oors and mirrors, computers and gramophones, electricity and newspapers, television and telescopes … water and air, information and noise, numbers and calendars, images, writing, and voice – all these highly disparate objects and phenomena fall into media studies’ purview’.[2]

Existing approaches to voice, media, and mediation provide both the inspiration and perspectives in need of reappraisal for this article. However, the task I wish to engage does not involve accounting for all the various ways in which the voice has fallen into the purview of media studies. Revisiting the voice in media with inclusive references to the different media forms, technologies, and expressions, as well as theoretical angles in relation to which it has hitherto been explored, would be an impossible job for any one essay. Even less possible would it be to reassess the voice as medium against a comprehensive mapping of its previous associations with the concepts of media and mediation across a multiplicity of (inter)disciplinary settings.

The goals of this text are more modest and specific: I want to engage three themes that can be extracted from investigations into voice and media conjunctures. The main claim of this essay is that a further thinking of these themes can advance our grasp of how voices figure in/with media and how their operations might contribute to understanding media; also, how particular conceptualisations of media and mediation may, for their part, help to reappraise voices’ manners of existing and the occasions of reality in which they participate and work. The themes that this article singles out from examinations of the voice/media coalescence, with the wish of extending them, can be couched as follows: the first theme concerns the relationality of voice, while the second has to do with voice as a sensory, perceptual event. The third and concluding theme addresses the ontological elasticity of voice, which refers to its ever-evolving actualisations and reinvented possibilities in mediatised milieus.

The inflections of each theme I intend to propose are theoretically supported by aspects of new materialist thinking. Only a few surveys of sound, and next to none with voice as their focal concern, have so far made use of approaches associable with this rubric.[3] Notwithstanding these hitherto germinal appropriations in the study of sonic and auditory processes, new materialism is, on the whole, an expansively employed label. It currently gives a name to a host of projects and reoriented research agendas in a spectrum of humanities and social sciences fields, from media and cultural studies to social studies of science, feminist scholarship, and art theory. There is no way of reducing the variegated foci and (inter)disciplinary affiliations of these endeavours to a set of fixed, uniformly shared traits. As will be elaborated, I nonetheless find it feasible to excerpt from these enquiries such conceptualisations and tendencies of rethinking that encourage reconsideration of the themes this essay foregrounds.

As regards the relationality of voice, I will evoke a new materialist emphasis on emergence. The stress here lies on the interminable actualisation of reality that derives from connectivity across human and nonhuman, material, social, and semiotic elements and their ‘open series of capacities or potencies’.[4] Notions of emergence and relation that new materialist theories revisit and advocate are evoked in order to ponder whether the voice, instead of prompting interaction between terms (like self and other, human body and media technology) that are supposedly ontologically distinguishable from their relating, would be better understood as being repeatedly generated by relations while simultaneously generating further co-substantial becomings.

Constitutive relationality is also at stake in approaches to voice as a sensory, perceptual event that can be described as new materialist. Of particular import in this case is the conceptual pair of actual and virtual, which strands of new materialist thought have reinvoked as part of their interest in the sensing body and the relations underlying perceptual experience. The question I will pursue is how these reciprocally presupposing terms might retune conceptions about the voice’s appeal to the sensorium in the broader context of media and cinema studies and their renewed engagement with bodily perceptual processes and ‘deeply sensual, synaesthetic’ effects of the media.[5]

My propositions regarding the third theme, the overall ontological elasticity of voice, expand on the notions of emergence and relation introduced in the frames of the first theme. The argument I wish to advance is as follows. Many investigations within and beyond media and film studies have concerned themselves with the radical technological, historical, and socio-cultural mutability of vocalic expressions, including their connections to the body, signification, space, and, to a degree, nonhuman objects and forces. When addressing the transmutations of voice, increasing attention should be paid to their occurring on a continuum of relations between human and that which is more than human, whereby the latter appellation signals activities and ‘contingencies belonging to any number of categories’ as well as the excess of ‘currently human potential’.[6] In making some suggestions about this kind of shift in the analysis of voice and media coalescences, I argue for the usefulness and context-sensitive elaboration of theorisations of the posthuman. Following Rosi Braidotti among others, I take this term to designate a condition or heightened tendency of the contemporary world and a needed continual reorientation of critical (and creative) thought.[7] As will be elucidated, for present purposes it is justified to regard new materialist and posthuman investigations as closely-associated bodies of response to similar problems.

This article’s reflections on the chosen themes are not exclusively about theoretical reformulation. My explorations of the first two themes in particular gain impetus and shape from one selected example in each case. To paraphrase Thomas Elsaesser and Malte Hagener, the intention is to highlight and encourage processes whereby theory builds on media cultural examples while the dimensions and capacities of examples proliferate in contact with theory.[8] This contrasts with envisaging theory or notions of media on the one hand and ‘media practice’ on the other hand as two ontologically-distinct entities that can only be imposed on or offer illustrative support to one another.

Lodged in a predominantly cinematic context, the example intertwining with my new propositions about the voice’s relationality is the award-winning British feature film The King’s Speech (Tom Hooper, 2010). This movie renders a portrayal of the struggles of King George VI with his vocalic production in the face of required acts of public, media-disseminated speaking. It is, in turn, the vocal expressions of Armenian-American singer, composer, and voice experimenter Cathy Berberian that accompany my suggestions about reviewing the voice as a sensory, perceptual event. I will attend to Berberian’s renowned composition Stripsody (1966), which bears a peculiar relation to the visual medium of comic strips. My propositions on the concept of the posthuman are offered in this text mainly as extensions of the two preceding lines of argument. In the conclusions, I will demonstrate how this third theme grows out of the previous ones, helping to assess their implications and push them forward.

The undeniable difference of the examples I use in terms of medium and related scholarly perspectives will be acknowledged as much as the limits of the article allow. However, it is their intense focusing on the voice and the interlinked ways in which they encourage new materialist approaches to the concepts of voice and media that connect the examples, legitimating their co-figuring in the present text. Together, they also endorse new insights into the voice as medium that find resonance in Kember’s and Zylinska’s recent reconceptualisations of media and mediation. According to their formulation, mediation signals the primary, differing ‘process of media emergence’ whose potentiality stems from the interplay of heterogeneous factors (e.g. material, technological, institutional). Media, on the other hand, amount to ‘(ongoing) stabilizations’ of this processuality.[9] These delineations display affinities with notions of emergence, the generative character of relations, and a dynamic hybrid reality in new materialist theorising.

The next three sections of the article illuminate some of the significant ways in which the themes of relation, sensory experience, and ontological contingency have thus far featured in explorations of media concerned with the voice. The considerations of the previous handlings of each theme extend into preliminary expositions on what new materialist lines of approach may bring into their examination. The two subsequent sections then put those lines in further action through their example-based discussions of the essay’s themes and the concepts of media and mediation.

Voice and relation: From a mediator in-between to relational events

Along with such other areas as continental philosophy and feminist criticism, media and film theoretical engagements with voice have repeatedly interrogated the voice’s propensity and powers to establish relations. To offer a recent example, Norie Neumark, in her introduction to the edited volume Voice: Vocal Aesthetics in Digital Arts and Media, writes about ‘alterity’.[10] This figure conveys for her one of the key questions that the voice raises not only for media studies scholars but also across an interdisciplinary span. In Neumark’s usage, alterity signals the voice’s oft-noted fundamental disturbance of any notions of self-containment. As physical vibration and audible sounds and signs, voices irrevocably exceed the subject or body – whether organic or technological – that acts as their source. While doing so, they by the same token interrelate bodies, subjects, and spatial, material-social milieus. Paraphrasing Walter J. Ong’s early work in sound studies, Neumark thus highlights the voice as a dynamic binding vector ‘within a total sensual/spatial/temporal situation’. She furthermore defines its alterity as performative intersubjectivity: as the sounding out of ‘the physical, affective, signifying, and psychic spaces between subjects’.[11]

With respect to thematisations of relation in voice-centred investigations of media, Neumark’s formulations give rise to two remarks. Whereas the above-mentioned volume examines the voice’s alterity, as it were, within the various practices of digital media culture, from podcasting to Internet voice art, resonating acknowledgments of the dispersive and connective workings of voice can surely be found in earlier analyses of the voice in media. Such insights are discernible in Kaja Silverman’s inaugural feminist psychoanalytical study of cinematic voices, The Acoustic Mirror. In Silverman’s characterisation, voice is ‘capable of being internalised at the same time as it is externalised, to spill over from subject to object and object to subject, violating the bodily limits upon which classic subjectivity depends’.[12] While pioneering film theoretical considerations of voice with such closeness of focus and conceptual nuance that few subsequent projects in this field have paralleled, Michel Chion’s well-known mapping of cinematic voices and Silverman’s inspection of their gendered dynamics from Hollywood to experimental repertoires can, among other things, be seen as charting the specifically filmic renderings of the voice’s connective, relational character. This shows in Chion’s psychoanalytically-informed discussion of the ‘nurturing connections’ that voices (that are on these occasions often transmitted by media technology) at times provide for filmic characters. It marks his musings over the acousmatic voice’s traversing and suffusion of both the film characters’ and auditor-spectators’ experiential space. Silverman, in turn, demonstrates how cinematic manifestations of the acousmatic voice contribute to the maintenance of sexual difference as a system of hierarchical relations.[13]

The second observation to be made here is that these existing equations of voice with interrelatedness and the transgression of customarily-assumed borders echo several established notions of media and mediation. There are some resonances to be traced from Neumark’s alterity, with which she rearticulates bodily excess and spatio-temporal dispersion as the defining traits of voices, all the way back to Marshall McLuhan’s postulations about the voice as a medium and, therefore, as an extension of man.[14] Descriptions of the in-between spaces that the voice invokes bear, in turn, some traces of such traditional understandings of mediation where this term denotes negotiating or intervening ‘third’ factors between entities.[15]

Now, one of the signal features of new materialist and intellectually-proximate projects in the current humanities and social sciences consists of resuscitations and development of such views of relation that may, alongside inflecting other paths of research, expand ideas about the relationality of voice. Through returns to a set of thinkers from Gilles Deleuze and Félix Guattari to William James, Alfred North Whitehead, and Gilbert Simondon inter alia, the new materialist approaches initiated over the past two decades typically aim at dispelling an arguably persistent tendency regarding notions of relation that is rooted in the so-called substance metaphysical traditions of Western thought. This is the conceiving of relation as an occurrence that arrives after and merely supplements the primary individuality of the relating terms.

As a means of reconfiguring this ontological scheme of separateness, we can find in the writings of new materialist theorists – for example, Karen Barad, Rosi Braidotti, Jane Bennett, and Manuel DeLanda – an array of conceptual suggestions, from Deleuze and Guattari’s assemblage to Barad’s quantum physics-derived entanglement, that advance an understanding of relations as generative and regenerative of the relating entities’ very individuality.[16] To deploy Simondon’s vocabulary as another available and currently reused mode of conceptualisation, relations are thus reconceived as a pre-individual structuring condition for their participants’ co-substantial individuations, which are always provisionally attained states of being. Relations are where the entities’ qualities, capacities, and self-relations (connections to their respective pasts and unfurling toward the future) jointly emerge and modify. While having a situational and processual composition, relations are seen to constitutively inform the existence and potential for further existence of that which they relate.[17]

Despite the various theoretical impulses and foci of new materialist projects, their elaborations of the above views involve shared aims. This is what arguably lends utility to new materialist reappraisals of relation. The aims at stake concern efforts to conceptually and analytically reclaim the crucial reality-making role of materialities without assigning them a self-contained identity or ontological primacy. The already-known motto reiterated across strands of research that can be called new materialist is that humanistic and social scientific studies of contemporary reality and renewed considerations of ontology should pay growing heed to forms and modes of materiality, from the operations of technological systems to natural forces and ‘evolving corporeal practices’.[18] What the constitutive views of relation help to avoid is the positing of these modes as self-standing existential layers or determining factors. The features of material phenomena and their ways of affecting other aspects of the real rather occur through their complex co-implication – relationality – with precisely such other aspects, whether these are social, linguistic, representational, scientific, or artistic.

To return to the main concerns of this article, these conceptions of relation as productive of the very constitution of relating terms propose at least two reconfigured perspectives for the study of voice. In their wake it is not quite adequate to say, as Neumark puts it, that the voice sounds out ‘physical, affective, signifying, and psychic’[19] spaces between subjects, bodies, and milieus. To an extent, this phrasing still implies pre-constituted, only secondarily-connected identities. Therefore, the voice would be better regarded as a processual factor that takes part in and initiates relational events where the involved entities and their dimensions undergo veritable transition – where they become anew.

Furthermore, theorisations of relational emergence can enhance understandings of the voice itself as an event, in the sense that as a sonic phenomenon it has no permanence but repeatedly occurs as sounds. This aspect has preoccupied various accounts of voice in areas from philosophy and sound studies to film and media theory. Following new materialist notions of relation, vocal expressions must result from the co-constitutive interplay of many elements that are necessarily heterogeneous and, hence, also material in kind. These can encompass anything from the habits of the vocalising bodies to the media technological and institutional settings of the given vocal events. The perspectives suggested here have implications for what the voice could be seen to mediate and whether or how it could be considered a medium.

The relevance of approaching both the emergence of voice and the connections it induces as relational events will be probed below in more detail with The King’s Speech. This film’s engagements with voice easily invite associations to such ideas about the vocalising individual, the connective powers of voice, and the connections between voice and media, whose familiarity and conservative tones seem a far cry from new materialist stances. However, the film simultaneously, perhaps surprisingly, contains moments where the voice and its effects come across as a matter of hybrid relational emergence.

The virtual in the sensory?

Like relation, questions of sensation, sensory experience, and perception have figured frequently enough in previous film and media theory approaches to voice. It can indeed be argued that their importance was laid down at the formative stages of these enquiries. This is evident in Chion’s The Voice in Cinema when he announces the ambition of his examinations to lie in their attending to ‘the medium of the voice itself’, to its ‘materiality’.[20] This challenges the tendency of collapsing voice with speech or concentrating solely on its role as the carrier of dialogue with regard to linguistic expression and signification. Claudia Gorbman further notes in the English translation of Chion’s study how ‘it is the voice – not as speech, not as song’, but as all the qualities that are left afterwards, including the voice itself ‘as a technological medium’, that constitute the subject matter of Chion’s project.[21]

Quite clearly, the qualities indicated by Gorbman and Chion coincide with the sensorial aspects and (media-manipulated) characteristics of voice. They refer to the irreducibility of such facets of voice as manner of emission (screaming, whispering), sensual impressions (evoked by intonation, timbre, recording technique, etc.), and the spatial experiences that specific vocalisations invoke to its function as the vehicle of language and mediator of symbolic meanings. By implication, the stressing of these qualities underscores the capacities of voice to create experience – here, cinematic or mediatised experience – in both sensorial and signifying registers. In spite of their typical simultaneity and interdependent proceeding, these dimensions do not yield to a common identity.

A further issue highlighted by both Chion’s and later discussions is that attempts to foreground and analyse the material and sensory qualities of voice almost inevitably extend to such relations in which these features are embroiled. Existing accounts have often attended to two types of relation in this respect. Partly overlapping, these figure between voice and the body, and between the voice and (medium-specific) visual events. As Neumark notes, the relation of voice to embodiment readily ‘raises the question of aesthetics’.[22] The term aesthetics is defined by her in this instance as a form of cognition achieved through the ‘whole corporeal sensorium’ the different senses compose. To expand on Neumark’s argument, the aesthetic processes the voice instigates while operating in connection with particular media always already bear the marks or ‘writing’ of its source body.[23]

When considering the visual aspect of voice-body relations and the relationships of voice to wider fields of visual activity, it is again such pioneering studies as those by Chion and Silverman that can be credited with introducing influential notions in this regard. While arising from distinctive cinematic contexts, these projects illustrated the powerful ways in which sound-image synchronisation, the visible anchoring of voices to (on-screen) bodies or the lack thereof, affect the experienced sensuous qualities and consequent symbolic associations of voice. These insights have subsequently prompted applications in film studies and beyond.

The central role of materiality, corporeality, and sensorial impact in vocal events and the broader multi-sensory arrangements as part of which voices operate are, then, not unfamiliar concerns for examinations of voice in the media and its definitions as a medium. Moreover, the new surge of attention to cinema and other media as domains of the senses[24] – drawing from such thinkers as Deleuze, Maurice Merleau-Ponty, and Jacques Rancière – provides an available framework for further study into voice and sensory experience. New materialist theorisations, nonetheless, may contribute importantly to these developments. In the present context, the contributions in question pertain to the understandings characteristically advanced by neomaterialist engagements with the body and experience. What these approaches insist on is that reclaiming the irreducibility of the material and sensorial threads of phenomena and experience in view of their coding within signification does not entail that we consequently assume matter and sensuous experiencing as something fully unmediated and completely given at the level of their concrete actuality. By drawing on the concepts of actual and virtual most prominently developed by Deleuze and Henri Bergson, new materialist and related projects by the likes of Brian Massumi, Elizabeth Grosz, and others have reasserted, in contrast, the temporally and qualitatively intricate make-up of our embodied becomings and experiences in the material world.[25]

To put it in basic terms, virtual, here, stands for the real involvement of both past events and the as yet undetermined future in any perception that particular bodies/minds have of things in a given here and now. The past enriches the attributes of that which bodies/minds actually encounter. This occurs in the form of associations – subliminal or more consciously perceived – to the ‘archives’ of the involved perceivers’ previous experience and through the ways the past conditions their very capacities to experience. The future, meanwhile, involves the variation these experiential processes will undergo, as relations to what is sensed and experienced progress or new encounters take place. Over the course of such changes, each relational event, to cite Massumi, ‘takes up the past differently’.[26]

These reappropriations of the concept of virtual endorse the new materialist argument according to which matter – whether this is the materiality of human sensing bodies or, say, that of the media images and sounds they connect with – is ‘always something more than “mere” matter’.[27] In excess of their extensive and calculable properties, forms of materiality are endowed with open-ended activeness and productivity. These spring from the relations with other material entities as well as memory, thought, and more that they are able to enter, and from the affects they generate and receive. The vivid inhering of the past in the present that the Deleuzian-Bergsonian notion of virtual presupposes suggests fertile ways to explore how expressions that apparently pertain to one main sense modality, such as the auditory in the case of vocal emissions, can evoke virtual yet real intermodal experiences through activating our histories of perceiving vocal sounds in conjunction with such aspects as bodily gesture and visually-inferable affective states. The benefits of the conceptual pair actual/virtual for considering the links of vocalic expression to sensory experience, singular media cultural contexts, and the concept of media will be exemplified below with Cathy Berberian’s Stripsody.

Vocalising the posthuman

Let us finally turn to notions about the ontological elasticity of voice detectable in existing scholarship on voice and media. Be it the relations of voice to the body, space, and sexual difference, or the associated sensorial and signifying experiences that vocal enunciations elicit, previous accounts of voice in film and media studies (alongside those in musicology and cultural theory) have regularly stressed the radical historical, contextual, and technological contingency of these characteristics of voice. Some quick illustrations of this awareness must suffice here. Elsaesser and Hagener recapitulate in their new sense-orientated approach to film theory how the technological set-up of sound cinema, via separating the image track and the sound track, profoundly shattered the link between (vocal) sounds and their (human bodily) origin that may still appear obvious to us in everyday life. At the same time, this interrupted relation was subjected to diverse kinds of cinematic reconstruction.[28]

Through examining varying image-sound collaborations and the role of specific media and cultural devices from radio to portable mobile media in the production and consumption of vocal expressions, media and film theoretical approaches have also contributed to the study of what Steven Connor has termed the ‘vocalic space’. This term denotes the historically, technologically, and socio-culturally mutating ways in which the voice emerges as a spatial event and ‘actively procures space for itself’.[29] Neumark summarises diverse perspectives on the voice as a transforming phenomenon overdetermined by social, cultural, and material factors. Among these, she encourages analysis of the complex relationship between technologies of sound recording and mixing for broadcast and other media, and the development and effects of particular vocal techniques (modes of air expenditure, projection, and bodily activity).[30]

Hence, if the vibrations of voice as a sound event melt in the air, research on the voice and media has already shown that there is nothing solid either in the aesthetic guises the voice assumes, the milieus where it operates, or the ideas of the human it is connected to. Insofar as these kinds of understandings are widespread, it could be argued that media theoretical approaches to voice have by now established noteworthy ways of conceiving the voice beyond narrowly human-centric vistas, as varied processes of vocalising the posthuman. To draw on Philip Brophy, this concept refers to how the voice has, even before the rise of digital or earlier media apparatuses and cultures, possessed the capacity to ‘subsume nonhuman appellations and contort into multiple characterizations beyond itself’.[31] Vocalic expression and the experimentations with it have recurrently developed in relation to animal, natural, and technological, machinic (sonic) milieus. Concomitantly, the bodily capacities and organisations interlinked with voice production have always been fluid, amenable to technique-bound and environment-related changes.

New materialist approaches – now in tandem with notions of the posthuman – can again expand already existing insights. As Diana Coole and Samantha Frost’s edited volume New Materialisms demonstrates, new materialist lines of thought display an ethos definable as posthuman(ist) in at least two senses. Neither equals utopian or dystopian visions about the surpassing and annihilation of the human. Through their revised ontological models premised on constitutive relationality, these perspectives replace the elevation of the human into a higher plane traditionally legitimated by the supposedly unique human abilities for self-awareness, meaning, and culture with the full immersion of human practices and dispositions in interrelations among material, social, human, and nonhuman components.[32]

Also, by extending the ‘location and nature’ of agency[33] onto technologies and other nonhuman entities, new materialist thinking poses a stimulating challenge to previous examinations of voice and media. It compels the development of such analytical means which ensure that media and media technologies are conceived neither as mere instruments for human vocal performance and invention nor as frameworks that predetermine the set of possible forms voices may obtain within their confines. The suggested focus rather lies in how both specific media and voices exercise their capacities, individuate anew, and can actualise in unprecedented ways through the generative force of their relatedness. I will return to this idea in the conclusions.

The King’s Speech and the emergent medium of the voice  

In her book Cinema and Sensation, Martine Beugnet joins a number of theorists whose work has appeared over the past decade in pursuing anew the argument that the material qualities and sensory impact of film (compositions of movement, sound, light, colour, texture) crucially co-exist with and pre-empt its ‘construct as narrative process, system of representation, or articulation of an ideological discourse’.[34] While Beugnet, by studying a batch of recent French films, explores how these aspects accompany and occasionally overrule the narrative and representative functions of cinematic works, she puts forth a reminder that ‘a sensual apprehension’ of film is, despite its potentially powerful effects, an often sidelined dimension of audience engagement. This applies especially to mainstream feature films and our familiarity with their formats. As reflected in film theory too, we have become accustomed to mostly attending to these films in terms of plot, narrative logic, and character development.[35]

Usefully, both sides of Beugnet’s argument – the conventionalised primacy accorded to representative and narrative functions, and film as a material, sensuous entity, a sense of which can be regained even in relation to mainstream movies – lend support to examining the appearances of voice and media in The King’s Speech. It is within the intertwinement of these aspects with their overlapping but partly incompatible implications that the presences of voice and media in this film arise. To start from the film’s narrative and representational contents, the relationship between voice and technical sound media is central to its main storyline and the attendant actions of its protagonists. The King’s Speech tells the story of King George VI (played by Colin Firth), who became the ruler of the British Empire in the mid-1930s after the abdication of his brother Edward VIII due to the latter’s marriage to Wallis Simpson. In the beginning of the film, this main character is still the Duke of York. The film narrates the attempts of George – or, to use the nickname by which he is mostly referred to, Bertie – to overcome his stammer. As historical source material, including audio-visual recordings, indicates, this condition characterised the actual George VI’s voice productions.

The film approaches Bertie’s stuttering in psychologised terms occasionally coloured by popularised psychoanalytical notions. Not unsympathetically, his stammer is associated with his personal history and consequent anxieties about assuming a public, authoritative voice. At the same time, the stammer is portrayed starkly as an impediment, as a disruptive exception to the vocal norm, particularly in its (male) upper-class and publicly manifest guises. Important for the current analysis, the need to eliminate or manage this impediment is shown to become increasingly pressing in the case of Bertie because of his expanding responsibilities of addressing the national audience not only at localised ceremonial and political events, but also through the radio, the then relatively-new sound broadcast medium. Painfully aware of the extended spatial reach and socio-political weight that his vocal deliveries acquire because of this medium, Bertie, encouraged by his wife Elizabeth (Helena Bonham Carter), embarks on voice therapy and training sessions with speech therapist Lionel Logue (Geoffrey Rush). This is in order to rid his vocalisations of the involuntary repetitions and discontinuities comprising the symptoms of the stammer.

While centring on the evolving relationship between these two men and on a series of radio speeches Bertie prepares for, the story of the film assumes characteristics of a progress or redemption narrative. Punctuated by conventional setbacks (failed speeches after seemingly successful rehearsing, tensions between Bertie and Lionel), the movie tends toward what is envisaged as a gratifying resolution, which it at least partly reaches. By the end of the film, Bertie exhibits improved capacities of bodily and mentally controlling the sonic and signifying expressions of his voice. In the film’s late scenes, these capacities are moreover pictured as transforming the kinds of affect (changes of state) and experiences that Bertie’s vocalisations inspire in the radio audiences. Attentiveness, feelings of being moved, and growing trust in the performative nuances and nationalist messages of the King’s speech become the listeners’ dominant visualised states.

Quite often, the film’s depictions of voice and media appear to reiterate tradition-steeped or (previously) widely-established assumptions about both. However, I wish to suggest that specific features of the film encourage other, more minoritarian,[36] understandings of these phenomena and concepts. To start from the more familiar perspectives, certain aspects of the film call forth the premise famously excavated by Derrida from the Western metaphysical tradition. According to this idea, vocal enunciations – or hearing and feeling oneself speak – epitomise the individual subject’s unmediated presence to itself, or the potent illusion regarding this kind of self-derived and self-standing existence.[37] With respect to The King’s Speech, this linking of voice to the subject’s coinciding with itself is implied by the statement on the film’s DVD case about how the movie concerns ‘one man’s quest to find his voice’, and by Bertie’s exclamation ‘I have a voice!’ at a particularly challenging moment of his vocal training. From the perspective at stake, it is not just to acquire a voice capable of appropriately fluent expressive patterns and media-transmitted mediations of linguistic content that Bertie must strive to conquer his stammer. This is also crucial for his fuller acquisition of autonomous selfhood.

With regard to notions of medium, there are detectable signs of both determinist and instrumentalist views in the film’s portrayals of the relationship between Bertie’s deliveries and radio broadcasting. A technologically-determinist view seems to surface when Bertie’s father, King George V, dryly remarks in reference to the increased stress the rise of radio has put on skilled vocal performance that this medium is requiring that ‘we’ (royal family members and other holders of public political office) ‘become actors’. Toward the end of the film, the radio equipment and broadcasting situations become framed differently. Instead of their previous determining dominance, they now appear more like a vehicle for Bertie’s changed vocal performances – as a medium partly at his command. To expand on Kember’s and Zylinska’s discussion, these understandings of media, both of which The King’s Speech occasionally adheres to, are connected in that each tends to erect a relatively static model in which mediums, their users, and audiences pre-exist one another as originally separate entities. They are only brought to interaction by the intermediary layer of mediation, represented in our example by broadcast vocalisations and speech acts.[38] Moreover, The King’s Speech involves remarkably conservative and nostalgic visions about media audiences as a nation-wide ‘listening community’, the cohesion of which expertly-devised instances of (vocal) media content can strengthen.[39]

Nonetheless, particular details of the film’s narrative and aesthetics present the voice’s operations and connections to media in more emergent, constitutively-relational terms. From the angle of narrative logic, Bertie’s stammer may well occupy the function of an obstacle that the film’s protagonists have to encounter and work through in order for them to undergo transformation and for the portrayed circumstances to reach a new state that then provides closure. Alongside its narrative positioning, Bertie’s stuttering voice is constructed in The King’s Speech as an audio-visual event with such detail that the impact of these depictions gains a partial independence from the meanings the voice is assigned in the film’s story. This pushes the spectatorial experience in alternative directions. What these audio-visual constructions cumulatively highlight are the processes of emergence and the heterogeneous interacting factors – or constellations of pre-individual relationality – that condition the individuations of a voice. The aesthetic features that emphasise the first aspect include close-ups of Bertie’s parting and fluttering lips and visibly activated body before and during acts of enunciation – whether attempted or successful. These visual means signal how the voice necessarily and repeatedly takes form, how it arises into actuality. They intensify the effect of the vocal sonorities on the soundtrack.

Admittedly, these portrayals often seem to voyeuristically zoom into the insistent and socially-stigmatising bodily and vocal signs of Bertie’s stammer. Still, because of the irreducibility of their power to the film’s plot, the presentational techniques in question may have other effects. They can draw our attention more generally to the character of voices as emergent events that always involve a new – and partly unpredictable – coming together of body, subject, sound, the surroundings onto which the vocalisations spread, and the technical media potentially involved in modulating and disseminating them. This realisation can, in turn, inflect understandings about the political nature of voice beyond the ghost of the metaphysics of presence (the association of voice with autonomous self-expressing individuality) and any remainders of simplistic sender-medium-receiver models.

Through suggesting that the new skills Bertie obtains in terms of broadcast voice usage and the associated enactments of authority are desirable outcomes that call for elated reactions, The King’s Speech includes nostalgia for clearly-identifiable centres of socio-political power (and the role of media in establishing these). However, the film also gestures toward a politics of relational emergence. This occurs through invocations of the constitutive force of relations. Comprised again of particular audio-visual movements, these portrayals concern both the relationship between Bertie and Lionel and Bertie’s mediatised performances. Repeatedly, the camera follows the ways Bertie engages in or subjects himself to the physical exercises Lionel introduces and performs with him. On the soundtrack, the two men jointly try out different voice production techniques. Here, it is the moving relation between these characters and sound-making bodies with their respective pasts and inclinations – a relation that enhances new vocal capacities, but also disciplines and includes vulnerability that may diminish capacity – which is pictured as conditioning the very actualisations of Bertie’s voice.

In the speech scenes, the camera movements between Bertie’s face and body, the deployed microphones, radios, the voice therapist, sound engineers, audiences, and the socio-material settings of the performances implicate all these human and more-than-human components in an interlinked capacity. Also, albeit in an increasingly complex manner, the film’s aesthetic means promote the notion that the relations between the depicted agents and elements – material, social, and semiotic – form the condition for how Bertie’s vocalisations take shape and have impact. The camera movements serve to conjoin institutional expectation with the performing body and its accompanying mental processes. They interlink media apparatuses and knowledge with adjusted practices of voice training and the emergent experiences of listeners.

If it makes sense to consider the voice itself as a medium or instance of mediation, in these moments of The King’s Speech voice does not merely mediate linguistic meaning or the vocalising body’s material properties and labour. To draw on Kember and Zylinska, the processes of mediation from which the voice as a medium temporarily arises are more hybrid and complicated; they encompass technological, cultural, corporeal, psychological, and other interrelating determinants.[40] The King’s Speech contains readily-familiar narrative and aesthetic ingredients. Nonetheless, the ways some of its features resonate with new materialist propositions and recent media theoretical ideas in the proximity of neomaterialist thought enable fresher insights into the significance and ontology of voice as a medium and how these can be cinematically portrayed.

Stripsody and multi-sensory voice

Inasmuch as voices emerge from relations while the ensuing vocal expressions amount to further relational events, it can be claimed that Cathy Berberian’s (1925-1983) vocal practice specifically capitalised on the potential for newness and surprise (reconfiguring relationships) which the emergent character of the voice entails. Many vocal pieces in her repertoire that her performances made famous might indeed be described as an art of emergence in this regard. The extended vocal techniques to the development of which Berberian contributed are, overall, a case in point here. Their premise lies in the discovery of new vocalic expressions, particularly vis-à-vis the traditions of mainstream ‘classical’ and operatic singing, through newly-fashioned relations between the performer’s body and such modalities of sound that mostly fall outside of conventional Western definitions of musical vocalisations (e.g. speech, whispering, labial and guttural sounds).[41]

Against this backdrop, Berberian’s composition-performance Stripsody constitutes a rather singular example. While containing a cavalcade of elements that represent extended vocal techniques, its peculiarity pertains to what it drew its inspiration from, or the relations that underpinned its emergence. As the amalgamation of the words ‘rhapsody’ and ‘strip’ in the title intimates, Berberian’s blueprint when composing Stripsody was to give an actual and exclusively vocally-produced expression for a range of sounds that are implied by familiar scenarios in comic strips. These implied sonorities include yells, creaks, explosions, animal sounds (e.g. birdsong), and the acoustic signs of modern transportation technologies. To draw again on Philip Brophy’s reconsiderations of voice in posthuman terms, Berberian’s method with this piece could be portrayed as one of summoning forth new ways in which the (classically-trained) singing voice and vocal expression can more broadly ‘contort into multiple characterizations’ beyond their more customary, expected actualisations. Clearly, the making and performances of Stripsody in this respect also involve the subsuming of ‘nonhuman appellations’.[42]

Another manner of describing the piece in the current framework is that in it, Berberian endeavours to provide a soundtrack to visual presentations that typically imply dynamism: movement, energy, affective states. Although the attendant or potential sounds of these presentations are indicated visually, they do not possess an actual sonic form in the context of their original medium. However, it is not that the varied voice production styles, intonations, timbres, and dynamical shifts of Stripsody merely complement and thus serve a number of medium-specific visual scenarios. Arguably, their relation to such scenarios or visual perception is more active. All but inevitably, the piece’s vocals provoke actually absent, yet experientially real visual impressions as part of the perceptual processes of their listening audiences. These impressions may relate to scenes of action and emotion, movement styles, volumes and materialities of objects, bodily gestures and comportments. This image track, if you will, might be predicated in part on the references to sound in comic book visuals that Berberian originally worked from. Still, its qualities will ultimately depend on the wider potential to hear sounds in association with optic, kinaesthetic, tactile, and spatial characteristics that the listener’s multifarious histories of sensation and perception give rise to.

This leads to the second distinctive feature of Stripsody. Occasions of listening to this piece (versions by Berberian and others can be found on YouTube) are likely to demonstrate how the sensorial, experiential power of voices is not restricted to their much-studied cooperations with other sounds as well as visual (and accompanying tactile and further) qualities – whether we think of performing arts, film, video, and other technical media, or combinations of these. The voice also displays here a vibrant capacity to mobilise the inter-implication of the senses through its very sonorities alone. It is the virtual but effective persistence of past multi-sensory encounters in the perceptual present that enhances these instances of co-implication while contributing to their evocative force. At the moment, intermodality, synaesthesia, and the multi-sensory spectator/media user figure once again as important film and media theoretical concerns.[43] However, the benefits of these revitalised notions for (re)considering the voice in media and as a medium remain mostly unexplored. This applies particularly insofar as the perspectives at stake are interlinked with new materialist returns to the complexity of materially-based aesthetic experiences at the crossroads of actual and virtual.

What Berberian’s Stripsody exemplifies is how the process of mediation – in the sense of emergence – from which the voice arises as a temporary result, or medium, may include relationality between such areas that seemingly occupy mutually distant cultural locations and address different sense channels. In this case, these comprise avant-garde singing and the (partly-clichéd) expressive elements of comic book imagery. Simultaneously, the piece illustrates how any expressive practice that is apparently linked to a particular sense modality always already activates an entanglement of the senses.

While experimenting with the specific powers of voice to implicate other senses alongside hearing, Stripsody invokes and remixes, through sound, familiar ingredients from popular visual culture. It may help us realise our own capacities of seeing through sound and the embedded nature of these capacities in our past encounters with media imageries. Moreover, it encourages awareness about how it is not only the signifying but also the sensorial aspects of artistic and mediatised expressions that are mediated by past perceptions, and how the virtuality of sense perception means that the form of the object – like a voice – ‘is the way a whole set of active, embodied, potentials appear in present experience’.[44]


The aim of this article has been to engage with and introduce new ideas to discussions of voice and media through reconsidering three analytical themes. Each appears frequently in interrogations of voice in media cultural and theoretical settings. Important both for their relevance to voice’s coalescences with specific media and for their centrality in the wider literature about voice that traverses domains from film and media studies to musicology, philosophy, and cultural history, these themes concern the connective force of voices, the material and sensorial aspects of voice in excess of its role as a medium for linguistic signification, and the plasticity in how the voice relates to body, physical-social space and the very notions of the human in developing (media) cultural and technological milieus.

Through these angles, the essay’s intended contribution to studies of voice/media connectivity has sprung from new materialist theoretical perspectives previously unused in this context. I attempted to show how the prioritising of the productivity of relations over the supposedly pre-existing individuality of the relating terms, which neomaterialist approaches to matter and reality advocate, promises revamped tools for explorations of voice. These conceptions provide new ways to consider the intrinsic interconnections between bodies, selves, spaces, corporeality, and technology that voices as sound events entail. This interconnective power has preoccupied film and media scholars, whether with regard to the role of voice in media (e.g. cinema) or its workings on media audiences.

With my examples, I hoped to indicate how notions of relation as constitutive and generative may reveal fresh aspects about the media portrayals of voice (The King’s Speech), the development and perception of vocal expressions in contact with media (Stripsody), and the possibilities of conceiving the voice as medium. I also endeavoured to exemplify how the conceptual coupling actual/virtual, which neomaterialist and associated projects are reusing and which in itself implies a co-constitutive relation between these aspects, facilitates refined understandings of the currently revisited questions of sensory perception and of voice as a sensory medium.

Ultimately, the premises regarding the relational basis of individuation and the fundamental heterogeneity of interrelating factors propose posthuman understandings of voice and its connectedness with media. Insofar as relations are constitutive, the voice cannot be confined to exclusive or self-evidently dominating humanness in isolation from the connections its forms and affectivity bear with technological capacity and natural life. Though some investigations have surely acknowledged the embroilment of vocalic expression in these kinds of relations, posthuman(ist) theory enables more detailed, conceptually-rigorous examinations of this relationality.

Insofar as relations are ever-emergent, the voice is posthuman in a further sense: its manifestations have repeatedly exceeded the by-then actualised human vocal potential. Similarly, future vocal expressions will, with certainty – albeit unpredictably in terms of their specificity – exceed the ‘currently human potential’.[45] Even a mainstream narrative film such as The King’s Speech provides perspectives into how vocal sounds and a vocaliser’s bodily and mental processes intrinsically interlace with sound transmission media. Concomitantly, the bond between body and voice is displayed as something that is not given but rather emerges through dynamic interconnections with cultural, technological, and environmental factors. Meanwhile, Berberian’s Stripsody works to expand the human vocal potential in the company of media images and nonhuman sounds.

Propelled by neomaterialist and posthuman insights, my final theoretical and methodological proposition is that the voice could be explored increasingly in terms of scalar entanglement. While building on Félix Guattari’s three ecologies, this concept, recently introduced to media studies discussions by Sy Taffel, ‘encourages transversal thinking across’ the ontologically ‘relational scales’ of representational content and social meaning, organic bodies, and other forms of material agency (from media software and hardware to the natural resources exploited in their manufacturing).[46] Its analytical potential ranges from the types of cases highlighted in this article to the examination of more obviously posthuman actualisations, such as the well-known vocal performances of HTML code.[47] If attuned to specific instances, this kind of transversal thinking has much to offer for the study of voice in media and as a medium.


Trained as a musicologist (PhD, University of Turku, Finland), Milla Tiainen is Lecturer and Course Leader for Media Studies at Anglia Ruskin University. During 2013-2014, she is working as postdoctoral researcher in the Academy of Finland-funded project Deleuzian Music Studies. Tiainen’s current research interests include the voice in contemporary artistic practice, media culture and theory, theories of affect, rhythm and the body in movement, sound and performance studies, and new materialist approaches in cultural/media studies and feminist thought. She has published widely in the areas of music scholarship and cultural theory. Her work has recently appeared or is forthcoming in such publications as the edited volume Carnal Knowledge: Towards a ‘New Materialism’ through the Arts (I.B. Tauris, 2013) and Body&Society. She is working on a book about a new Deleuzian approach to musical performance (under contract with University of Minnesota Press).


Alaimo, S. and Hekman, S. Material feminisms. Bloomington: Indiana University Press, 2008.

Beugnet, M. Cinema and sensation: French film and the art of transgression. Edinburgh: Edinburgh University Press, 2007.

Biddle, I. and Thompson, M. (eds). Sound, music, affect: Theorizing sonic experience. London: Bloomsbury Academic, 2013.

Birdsall, C. Nazi soundscapes: Sound, technology and urban experience in Germany, 1933-1945. Amsterdam: Amsterdam University Press, 2012.

Braidotti, R. Transpositions: On nomadic ethics. Cambridge: Polity, 2006.

_____. The posthuman. Cambridge: Polity, 2013.

Chion, M. The voice in cinema, translated by C. Gorbman. New York: Columbia University Press, 1999.

Connor, S. Dumbstruck: A cultural history of ventriloquism. Oxford: Oxford University Press, 2000.

Coole, D. and Frost, S. (eds). New materialisms: Ontology, agency, and politics. Durham: Duke University Press, 2010.

Derrida, J. Of grammatology, translated by G. C. Spivak. Corrected edition. Baltimore: The Johns Hopkins University Press, 1998.

Elsaesser, T. and Hagener, M. Film theory: An introduction through the senses. New York: Routledge, 2010.

Goodman, S. Sonic warfare: Sound, affect, and the ecology of fear. Cambridge: MIT Press, 2009.

Gorbman, C. ‘Translator’s Note’ in The voice in cinema by M. Chion. New York: Columbia University Press, 1999.

Hemmings, C. ‘Invoking Affect: Cultural Theory and the Ontological Turn’, Cultural Studies, Vol. 19, No. 5, 2005: 548-567.

Herzog, A. Dreams of difference, songs of the same: The musical moment in film. Minneapolis: University of Minnesota Press, 2010.

Horn, E. ‘There Are No Media’, Grey Room, 29, Winter 2008: 6-13.

Kember, S. and Zylinska, J. Life after new media: Mediation as a vital process. Cambridge: MIT Press, 2012.

Kirby, V. Quantum anthropologies: Life at large. Durham: Duke University Press, 2011.

Leys, R. ‘The Turn to Affect: A Critique’, Critical Inquiry, Vol. 37, No. 3, Spring 2011: 434-472.

Manning, E. Relationscapes: Movement, art, philosophy. Cambridge: MIT Press, 2009.

_____. Always more than one: Individuation’s dance. Durham: Duke University Press, 2013.

Marks, L. The skin of the film: Intercultural cinema, embodiment, and the senses. Durham: Duke University Press, 2000.

_____. Touch: Sensuous theory and multisensory media. Minneapolis: University of Minnesota Press, 2002.

Massumi, B. Parables for the virtual: Movement, affect, sensation. Durham: Duke University Press, 2002.

_____. ‘The Thinking-Feeling of What Happens’, Inflexions, Vol. 1, No. 1, May 2008: 1-40.

_____. ‘Of Microperception and Micropolitics’, Inflexions, No. 3, October 2009: 1-20.

_____. ‘Prelude’ in Always more than one: Individuation’s dance by E. Manning. Durham: Duke University Press, 2013.

Neumark, N. ‘Introduction: The paradox of voice’ in Voice: Vocal aesthetics in digital arts and media edited by N. Neumark, R. Gibson and T. van Leeuwen. Cambridge: The MIT Press, 2010.

Potter, J. Vocal authority: Singing style and ideology. Cambridge: Cambridge University Press, 1998.

Silverman, K. The acoustic mirror: The female voice in psychoanalysis and cinema. Bloomington: Indiana University Press, 1988.

Sobchack, V. Carnal thoughts: Embodiment and moving image culture. Los Angeles: University of California Press, 2004.

Taffel, S. ‘Scalar entanglement in digital media ecologies’, NECSUS, No. 3, Spring 2013.

Tiainen, M. Becoming-singer: Cartographies of singing, music-making and opera. Turku: Turku University Press, 2012.

Vila, M. Cathy Berberian, cant’actrice. Paris: Fayard, 2003.

Wetherell, M. Affect and emotion: A new social science understanding. London: Sage, 2012.

[1] Kember & Zylinska 2012, pp. 19-21.

[2] Horn 2007, pp. 7-8.

[3] On explorations of sound that draw on or display affinities with new materialist theorising, see Thompson & Biddle 2013; Tiainen 2012; Goodman 2010.

[4] Coole & Frost 2010, p. 10.

[5] Beugnet 2007, p. 2. For both well-known and subtly-constructed examples of this revived interest in bodily, sensory experiencing and the materiality of media, see Herzog 2010; Del Rio 2008; Sobchack 2004; Marks 2002; Marks 2000.

[6] Massumi 2013, p. xxiii.

[7] Braidotti 2013, pp. 1-12.

[8] Elsaesser & Hagener 2010, p. 8.

[9] Kember & Zylinska 2012, p. 21.

[10] Neumark 2010, pp. xix-xx.

[11] Neumark 2010, pp. xix, xvi.

[12] Silverman 1988, p. 80.

[13] Chion 1999, pp. 17-29, 61-66; Silverman 1988, pp. 5, 38-41.

[14] McLuhan 1999, pp. 78-79.

[15] See Kember & Zylinska 2012, p. 19.

[16] See Braidotti 2006; DeLanda 2006; Barad 2007; Manning 2009; Bennett 2010; Kirby 2011.

[17] For recent appropriations of Simondon’s concepts in theorisations of relationality, see Manning 2013.

[18] Alaimo & Hekman 2008, p. 3.

[19] Neumark 2010, p. xvi.

[20] Chion 1999, p. 1.

[21] Gorbman 1999, p. xi.

[22] Neumark 2010, p. xvi.

[23] Neumark 2010, pp. xvi-xvii.

[24] Beugnet 2007.

[25] See Massumi 2002; Grosz 2005. It should be noted here that Massumi’s statements about sensation and affect as pre-conscious, pre-individual forces impinging on the body have drawn critique from several scholars across many disciplines. A key target has been the possible overemphasis Massumi places on the newness and autonomy of sensorial, affective forces in terms of their antecedence and ultimate irreducibility to the social and subjective meanings through which their effects are perceived and captured as part of the sensing body’s conscious, individualised experience. Critics have claimed that Massumi’s approach problematically reinstates a dichotomy between bodily happening/unrestricted potential that is equated with sensation and affect, and consciousness/socio-cultural narrowing down of said potential that is equated with the meanings and categories assigned to the registered effects of bodily becoming. These critiques have called for more continuity between previous and already signifying subjective experiences and new sensorial, affective becomings, as well as between the supposedly conscious and pre-conscious layers of our existence. See Hemmings 2005; Leys 2011; Wetherell 2012, pp. 53-67. There is some validity to this criticism. However, particularly in the more recent texts where Massumi elaborates his views on affect, sensory experiencing, and perception, clear attention is given to the ways in which bodies experience and become affected through complex relations between their past and present encounters, milieus, and states. See Massumi 2008 and 2009.

[26] Massumi 2009, p. 2.

[27] Coole & Frost 2010, p. 9.

[28] Elsaesser & Hagener 2010, pp. 129-133.

[29] Connor 2000, p. 12.

[30] Neumark 2010, p. xxii.

[31] Brophy 2010, p. 361.

[32] Coole & Frost 2010.

[33] Coole & Frost 2010, p. 9.

[34] Beugnet 2007, p. 14.

[35] Beugnet 2007, p. 5.

[36] See Deleuze & Guattari 1987, pp. 105-106.

[37] Derrida 1976.

[38] See Kember & Zylinska 2012, pp. 20-21.

[39] I borrow the illustrative term ‘listening community’ from Birdsall 2012.

[40] Cf. Kember & Zylinska 2012, pp. xv, 3, 21-22.

[41] On Berberian see Vila 2003. On extended vocal techniques see Potter 1998, pp. 54-55, 170, 178.

[42] Brophy 2010, p. 361.

[43] See Elsaesser & Hagener 2010, p. 130.

[44] Massumi 2008, p. 4.

[45] Massumi 2013, p. xxiii.

[46] Taffel 2013.

[47] See net artist Igor Stromajer’s project Oppera Teorettikka Internettikka, originally streamed live on the Internet in 1999.