Video essay, videographic criticism, polymedial essayism, polymodal essayism
by Sureshkumar Sekar
On the community of academic video essayists practicing videographic criticism and their definition of a video essay, Oswald Iten says,
our definition should never limit what someone could or should do in a certain field… we should remember we as video essayists are part of a wider movement in the media landscape and therefore, I think, if someone defines their own work as a video essay, we should keep our minds open enough to consider it, and if necessary, broaden our informal understanding of what a video essay can be according to it.[1]
Openness is letting go of the idea that a definition of videographic criticism could or should account for audiovisual essayistic practices in all academic disciplines. Openness is accepting a YouTube video featuring an astrophysicist reading a written essay in front of a camera as a video essay, if the author chooses to call it so and their intended audience accepts it as one. It may not be a videographic work, but that does not mean that it is lesser or lacking. Openness is, instead of saying that an apple is not an orange or that an apple is better than an orange, finding the basic characteristics that make them both belong to the same category – fruit.
In this paper, I propose that the authors of both the astrophysicist’s YouTube video and an academic video essay published in the journal [in]Transition are practitioners of polymedial essayism; and, as we shall see, the practice can also be called polymodal essayism. Using theories and concepts from intermedial and multimodal studies and with a detailed account of the decisions I made as a practitioner while arranging and orchestrating units of meaning potential in one of my video essays, I explain how I practice polymedial essayism.
Video essay and videographic criticism
A video essay is understood as ‘an audiovisual media object that critically reappropriates audiovisual media objects’.[2] Videographic criticism is the practice of ‘creating videos that serve an analytic or critical purpose, exploring and presenting ideas about films and moving images via sounds and images themselves’.[3] [in]Transition, the first peer-reviewed academic journal of videographic film and moving image studies, says: ‘Practitioners of these forms… explore the ways in which digital technologies afford a new mode of carrying out and presenting film and moving image research.’[4] Videographic practice is both carrying out research and presenting research, both ‘a methodology and communication mode’.[5] It is a method of analysing audiovisual media – not from a distance, but by directly, intimately, engaging with the media on the timeline of editing software, and while at it, producing an audiovisual narrative by appropriating, arranging, and rearranging elements of the same footage in such a way that the output presents the findings and failures of the analytical process.
The academic video essay is ‘an ontologically new scholarly form’,[6] and it is not about translating written film studies into audiovisual form. Jason Mittell, one of the founders of the annual workshop of videographic criticism at Middlebury College, says ‘don’t come into this trying to make a video adaptation of an essay you have already written, or research you have already done, because it really curtails the possibilities of discovery… this is not another way of expressing the idea… this is another way of thinking’.[7] This approach has been producing inspiring, innovative, evocative knowledge output, or knowledge effect, in audiovisual form in film and media studies. It is, however, interesting that the most nominated single work in the BFI Sight and Sound Best Video Essays of 2023 poll,[8] Richard Misek’s A History Of The World According To Getty Images,[9] is one that is the outcome of research the author had ‘already done’. Misek says:
My research process has followed several paths: research into the corporate history of Getty Images, including various past and present controversies around alleged fraudulent assertions of intellectual property and copyright trolling; research into the scope of Getty Images’ newsreel archive through an extensive search of its online catalogue; and research into the provenance of a small number of video clips that I purchased from its catalogue.[10]
All the research and discovery seem to have happened outside the editing software. While making the film, Misek discovered not what to communicate, but perhaps only how to communicate what he had already discovered. Similarly, in several other academic disciplines, a video essay can only be an audiovisual adaptation of a written text, a communicative outcome of research already done.
Video essay, but not videographic criticism
Practitioners of videographic criticism never claimed that their tentative definitions and tenets apply to all audiovisual work emerging from all academic disciplines. Videographic criticism is largely only about the subjects within the remit of film and media studies. The objects of analysis – ‘audiovisual media objects’, ‘ideas about film and moving images’, ‘film and moving image research’ – mentioned in all the definitions quoted above clearly situate the videographic practice within film and media studies. The basic exercise the workshop at Middlebury College begins with involves participants working with audio and video material from one film.
A typical PechaKucha is an oral presentation format that has strict parameters for the timing of slides: 20 slides lasting exactly 20 seconds, each auto-playing, resulting in a presentation lasting precisely 6:40… Our videographic variant consisted of 10 video clips of precisely six seconds each, coupled with a continuous minute-long audio segment, all from the same film.[11]
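The timings in the passage quoted above can be checked with a few lines of arithmetic. The following is only an illustrative sketch: the function names are hypothetical and are not part of the workshop's materials.

```python
def total_seconds(units: int, seconds_per_unit: int) -> int:
    """Total running time of equally timed units, in seconds."""
    return units * seconds_per_unit


def as_clock(seconds: int) -> str:
    """Format a duration in whole seconds as minutes:seconds."""
    minutes, rest = divmod(seconds, 60)
    return f"{minutes}:{rest:02d}"


# Classic PechaKucha: 20 slides of 20 seconds each.
classic = total_seconds(20, 20)

# Videographic variant: 10 clips of 6 seconds each, which matches
# the continuous minute-long audio segment.
videographic = total_seconds(10, 6)

print(as_clock(classic))       # 6:40
print(as_clock(videographic))  # 1:00
```

The ten six-second clips add up to exactly the length of the one-minute audio segment, which is why the image track and the sound track of the exercise end together.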
What is the basic exercise a linguist with little interest in films can do to learn how to disseminate their research in video essay form? Such scholars from other disciplines, who are interested in the video essay form, are most likely to stumble upon the scholarship emerging from film and media studies. These scholars, however, should know that these academic texts, tutorials, and ongoing discussions on videographic practice are not intended to limit what they could or should do when making video essays on subjects related to their discipline, subjects that do not involve analysing any audiovisual media object.
A linguist looking to make a video essay about the grammar of a written language need not consider using the video essay as a way of thinking or the process of making it as a method of discovery. For example, if Rishi Rajpopat, who made ‘Sanskrit’s “language machine” work for the first time in 2500 years’,[12] chooses to make a video essay about it, he can only translate or adapt his written thesis and will use the audiovisual form only as a ‘means of expression’ and not a ‘means of thinking’.[13] This video essay based on Rajpopat’s thesis may not be considered a videographic work, but it could still be a video essay that is effective, evocative, engaging, enriching, and entertaining to its intended audience and to audiences in general, and all such works are a part of the ‘wider movement in the media landscape’.[14] The audiovisual language, like written and spoken language, is common to all and can be used by scholars in all academic disciplines. So, the form and aesthetic of video essays will be different for different academic disciplines and for different communicative contexts. In the context of YouTube, a group dedicated to video essays (mostly published on YouTube) on Reddit defines a video essay – a video essay that is allowed to be shared or published in the group – as follows:
- A focused essay read aloud over relevant video accompaniment which seeks to argue a defensible position.
- A focused essay read aloud over relevant video accompaniment which seeks to explain and interpret a topic…[15]
Arun Maini, who reviews smartphones, laptops, and other gadgets on his YouTube channel Mrwhosetheboss (17.9 million subscribers), says that his ‘job is basically [making] video essays’.[16] Given the wide variety of topics on which video essays are being made and published on YouTube, there seems to be a wider movement of audiovisualisation of knowledge happening now. The written and spoken word seem to be an integral, indispensable part of this ever-accelerating audiovisualisation process. However, as James McDowell says, in YouTube video essays meanings are created not only by the written and spoken word, but by ‘the interplay between the words and the imagery, the narrative, the acting, shot composition, music, costume, lighting, editing, structure, visual metaphor and so on’.[17] These constituent elements are called semiotic modes and semiotic resources in multimodal studies, and basic and qualified media types in intermedial studies. Based on the terminology from intermedial and multimodal studies, I term the wider practice of producing a video essay on any subject Polymodal Essayism, if one prefers to use the terminology from multimodal studies, or Polymedial Essayism, if one prefers to use the terminology from intermedial studies.
What are polymedial essayism and polymodal essayism?
Polymodal essayism is the practice of orchestrating a media product in which the essay emerges from the interactions between units of meaning potential in multiple communicative modes temporally unfolding together in multiple layers, modes such as written and spoken words, still and moving images, and sound and music. Polymedial essayism is the practice of orchestrating a media product in which the essay emerges from the interactions between units of meaning potential in multiple media types temporally unfolding together in multiple layers, media types such as written and spoken words, still and moving images, and sound and music.
By this definition, even an audio essayist is a practitioner of polymedial essayism or polymodal essayism; they could orchestrate a media product using sound effects, music, spoken word, and multiple speaking voices. Before I explain the practice further, I shall first introduce Lars Elleström’s vocabulary of terms that helps to dissect any media product into its basic constituent building blocks. Henceforth, I use only the term polymedial essayism.
Modes, modalities, media types, and media products
The elements that come together to create meaning potential in a polymedial essay could be any of these: written word; typeface, font size, colour, and other effects applied to the written word; topography, layout, arrangement of information; spoken word, sonic attributes of the voice(s) and the speakers’ accents; charts, graphs, and infographics; still image and still image set in motion; moving image; posture, head movements, hand gestures, and facial expressions of people in the moving image; movement of the camera and movement of the objects within the moving image; speed and direction of movement of all that is moving in the moving image; mise-en-scène, camera angle, lighting, editing techniques and transitions; sound effects, silence, music, instruments used in a piece of music; and so on. These elements are categorised, and the categories named and defined, albeit differently, in intermedial studies and multimodal studies. Lars Elleström’s set of terms comprises media product, media modalities, modality modes, and media types.
A video essay on Vimeo, a film screened in a cinema hall, the film’s music streamed on Spotify, a rock music concert playing on television, an article in a newspaper, and this essay you are reading now on a flat surface – each of these is a media product. A media product is ‘a single physical entity or phenomenon that enables inter-human communication… that enables the transfer of cognitive import from a producer’s mind to a perceiver’s mind’.[18]
Television is not a media product; the wildlife documentary the device broadcasts is. Elleström gives this device a name: a technical medium of display, which is ‘any object, physical phenomenon or body that mediates sensory configurations in the context of communication; it realises and displays the entities that we construe as media products’.[19] The screen and the sound devices in a cinema hall, the device we use to stream the music available on Spotify, the screen and the speakers in a television, the moving bodies of the musicians in a concert hall are each a technical medium of display.
All media products have four fundamental traits. Elleström calls these traits ‘media modalities’ or ‘modalities of media’ and they ‘form an indispensable skeleton upon which all media products are built’.[20] The four traits/media modalities are: material modality, spatiotemporal modality, sensorial modality, and semiotic modality.
For something to acquire the function of a media product, it must be material in some way, understood as a physical matter or phenomenon. Such a physical existence must be present in space and/or time for it to exist; it needs to have some sort of spatiotemporal extension. It must also be perceptible to at least one of our senses, which is to say that a media product has to be sensorial. Finally, it must create meaning through signs; it must be semiotic.[21]
An object ceases to be a media product if it lacks any one of these four modalities. While these four media modalities are the ‘categories’ of basic media traits, modality modes, or simply modes, are the different possible traits in each category.
On semiotic modality, Elleström says:
the perceived sensory configurations are meaningless until one understands them as representing something through unconscious or conscious interpretation… all objects and phenomena that act as media products have semiotic traits… By far the most successful effort to define the basic ways in which to create meaning in terms of signs has been Charles Sanders Peirce’s foundational trichotomy icon, index, and symbol… Icons stand for (represent) their objects based on similarity, indices do so based on contiguity, and symbols rely on habits or conventions.[22]
A video essay acquires the function of a media product because it materially exists or is realised in solid mode, unfolds temporally on a two-dimensional surface (width and height), is perceptible through sight and hearing, and the sensory configurations create meaning in a perceiver’s mind through symbolic, iconic, and indexical signs. A written essay acquires the function of a media product because it materially exists in solid mode, unfolds temporally on a two-dimensional surface, is perceptible through sight, and the sensory configurations create meaning in a perceiver’s mind through mostly symbolic signs. There is a difference between the temporality of a video essay and a written essay. Elleström says:
some media types, such as visual, verbal (symbolic) signs on a flat but static surface (such as printed texts), are conventionally decoded in a fixed sequence, which makes them second-order temporal, so to speak: sequential but not actually temporal, because the physical matter of the media products does not change in time.[23]
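The comparison between the two media products above can be laid out as a small data structure. This is a minimal sketch only: the class name `MediaProduct`, its field names, and the short mode labels are hypothetical shorthand for the description given in this section, not Elleström's own notation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MediaProduct:
    """One media product described by modes in each of the four modalities."""
    name: str
    material: frozenset        # material modality
    spatiotemporal: frozenset  # spatiotemporal modality
    sensorial: frozenset       # sensorial modality
    semiotic: frozenset        # semiotic modality

    def has_all_modalities(self) -> bool:
        # Something functions as a media product only if none
        # of the four modalities is empty.
        return all([self.material, self.spatiotemporal,
                    self.sensorial, self.semiotic])


video_essay = MediaProduct(
    name="video essay",
    material=frozenset({"solid"}),
    spatiotemporal=frozenset({"width", "height", "time"}),
    sensorial=frozenset({"sight", "hearing"}),
    semiotic=frozenset({"symbolic", "iconic", "indexical"}),
)

written_essay = MediaProduct(
    name="written essay",
    material=frozenset({"solid"}),
    # Decoded in a fixed sequence: sequential but not actually temporal.
    spatiotemporal=frozenset({"width", "height", "fixed reading sequence"}),
    sensorial=frozenset({"sight"}),
    # "Mostly" symbolic in the text; simplified to one mode here.
    semiotic=frozenset({"symbolic"}),
)

# Modes available to the video essay but not to the written essay.
print(sorted(video_essay.sensorial - written_essay.sensorial))
print(sorted(video_essay.semiotic - written_essay.semiotic))
```

Under these simplifying assumptions, the set differences pick out hearing, and iconic and indexical signs, as what the video essay adds to the written essay.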
The meaning of the term mode, in intermedial studies or as Elleström defines it, is entirely different from what the term means to scholars in multimodal studies. Modes, in multimodal studies, are ‘systems of semiotic resources’.[24]
Semiotic resources are the actions, materials and artefacts we use for communicative purposes… together with the ways in which these resources can be organized. Semiotic resources have a meaning potential, based on their past uses, and a set of affordances based on their possible uses, and these will be actualized in concrete social contexts… [25]
For example, in the system of semiotic resources that is written text, the typeface, font size, and colour applied to the words are semiotic resources that have meaning potential. In an academic paper such as this, while the written word is the dominant mode, examples of semiotic resources are italicising a word for emphasis and a citation format that indicates that a certain passage is quoted from elsewhere.
What is referred to as mode in multimodal studies, Elleström (2021) calls media types – basic media types and qualified media types.[26] While image, moving image, sound, music (organised sound), and text are basic media types, qualified media types are those that are qualified and defined by ‘certain functions’ or how we use them ‘in a certain way at a certain time and in a certain cultural and social context’,[27] that is by ‘context, convention, and history’.[28] In rock music, for example, the basic media type music is qualified with the name of the genre rock that is defined by its context, convention, and history. In the academic essay, text (basic media type) becomes an essay and not a short story because it follows certain accepted conventions; this essay then becomes an academic paper and not a magazine article because it follows certain conventions agreed by the community of academic scholars. Then when this author records themselves reading this written work in front of a camera, they are utilising the new possibilities offered by the visual, temporal, iconic, and indexical modes of the audiovisual medium. This is a polymodal essay or polymedial essay because they could add more layers of meaning to the orchestration, that is, more elements and units of meaning potential – their embodied self, gaze, facial expression, attire, tone of voice, the pace at which they read the text, pauses between words, and emphasis on certain words, etc.
Having discussed all the necessary terms from intermedial and multimodal studies, it must be noted that these categorisations and their definitions are not without problems. Several communicative texts may not neatly fit into these categories. Elleström himself admits that ‘[r]easoning in terms of types can involve several pitfalls.’[29] Discussing these pitfalls, however, is beyond the remit of this paper. Furthermore, this is not a comprehensive literature review of all the concepts in intermedial and multimodal studies. I have discussed only the terms and concepts that together make an adequately consistent and coherent framework for the purpose of defining and demonstrating the practice of polymedial essayism.
I use the terms polymodal and polymedial and not multimodal and multimedial – both the prefixes poly (from Greek polus) and multi (from Latin multus) mean many or more than one – because there are several different meanings and definitions ascribed to the terms that come with the prefix multi. According to Elleström, all media products are by default multimodal because they always exist in four modalities: material modality, sensorial modality, spatiotemporal modality, and semiotic modality. In multimodal studies, however, a webpage is multimodal because it typically contains information in more than one mode (a system of semiotic resources), say written words and still images. The practice of arranging the elements of a written essay on a webpage, however, is entirely different from the practice of orchestrating what I call a polymedial essay. The units of meaning potential on a multimodal webpage are largely static and not temporally symbiotic, whereas in a polymedial essay units of meaning potential temporally unfold simultaneously. As already mentioned, a block of written text on a printed page or a webpage is ‘sequential but not actually temporal’, but written text in a video essay is temporal. Hence, to emphasise this difference, we need a new term. The terms polymodal and polymedial are relatively more distinctive, and these terms also help to suggest the proximity to the practice of writing polyphony in music, which means more than one unit of musical meaning (melody) temporally unfolding simultaneously. Just as a composer would orchestrate a musical idea considering the affordances of different instruments in an ensemble, in a polymedial essay the author orchestrates an essay considering the affordances of different communicative modes or media types.
With this set of terms and concepts, I will now illustrate, using an excerpt from one of my video essays, how I practice polymedial essayism. I will use a video essay on a subject that does not involve analysing any audiovisual media object. I have chosen an excerpt from my video essay Self-isolation: Ethics, and Extraction of the Self, in Auto-Netnography; it is a video essay about auto-netnography, a research method, and the ethical questions related to using one’s own social media data for auto-netnographic study. First, some context, for the communicative context does inform the narrative and creative choices made.
Context
I created this video essay on auto-netnography for an academic conference organised by the University of Birmingham, a conference that was held online from 8-10 September 2021. The title of the conference was Information Overload? Music Studies in the Age of Abundance.
I had already presented a paper in video essay form at another conference before, and as Eric Faden said in his A Manifesto for Critical Media, the video essay I presented ‘immediately injected some life into conference proceedings’.[30] It was received well. So, I decided to make another video essay for the Birmingham conference. I was certain that most in the audience would not have seen a presentation in the form of a video essay. I thought that the novelty of the form might help to create an impact, but this also meant that, formally, the video essay could not be too elliptical, experimental, or challenging. Some knowledge about the audience, their context, and their expectations is crucial to the process of constructing a video essay. For example, I assumed that, given the title of the conference, most of the scholars attending would have some basic knowledge about the ethical questions related to using public social media data for academic research. This is the reason I start the video essay the way I do, in medias res, from the middle of a sentence in a scholar’s talk.
Practicing polymedial essayism or polymodal essayism
The original video essay is twenty minutes long, but I will discuss here only the first two minutes and twenty seconds of it. I have divided this excerpt into four segments to make the analysis of the production process manageable. For each segment, to give a better sense of how I created or engaged with the material, I provide a screenshot of the timeline from the editing software. Each vertically stacked strip seen in the screenshot is a layer of media. In the analysis, for each layer of media, I identify the units of meaning potential, the modes or media types, and the semiotic resources I have created, manipulated, or used as already present (in the case of found footage); these are presented in a separate table for each segment. Then I offer an analysis of how the essay emerges from the interactions between the units of meaning potential in these multiple layers.
Segment 1: Intro (00:00 – 00:27)
The footage of Prof. Robert Kozinets speaking about the ethics of collecting and using public data is on the left of the visual frame, and the title and the details of the author of the video essay unfold on the right. The original footage – which is an online talk Prof. Kozinets delivered under the title Netnography in the age of COVID-19 – includes the faces of two other academic scholars who are listening to this talk.[31] I cropped the visual to keep only the part that is relevant to this video essay, and this also freed some space in the frame. The other elements in this footage are kept as is. It is one continuous clip that is not sped up or slowed down, nor are any of the visual attributes of the cropped clip modified. As I start immediately with the visual of an individual speaking, I use the remaining space in the frame and the time the speaker takes to make their point to provide the conference audience with a sort of opening credits: the title of the paper/video essay and the details of the author. The written text on the right does not appear all at once; it appears one word at a time. This is so that the audience can both hear what Prof. Kozinets is saying and read the written text that is unfolding.
The written word is orchestrated to play alongside the spoken word, and though the two layers are delivering different information, they are connected by the utterance and appearance of the word ethics in both and by the context of the conference. If the text on the right appeared all at once, it would seem cluttered and confusing, because the audience does not know how long the text will stay on screen. They might want to read all the words quickly before the text disappears, and in the process miss what Kozinets is saying. When it unfolds one word at a time, however, there is, at any moment, only one new written word they will have to read and process. After the first few words appear, the audience might intuitively guess the pace at which the words appear and adjust the pace at which they read accordingly. The overall speed of the unfolding text is set in such a way that the whole text appears a few seconds before Kozinets finishes his statement. So, for about four seconds towards the end of this segment, the audience can see all the written text on the visual. Though the colour of all the text appearing on screen is the same, the position of each piece of text on screen and its size are chosen in such a way that the audience can distinguish one type of information from the other. The name ‘Prof. Robert Kozinets’ is placed right below the visual of him speaking. The title of the paper appears at the top right and in a larger font size than that of the details of the author that appear at the bottom right. The unfolding of all the units of meaning potential is carefully calibrated to play alongside each other so that, while adding some energy and dynamism to the visual, the information delivered is as clear as it can be.
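The pacing described above, in which words appear one at a time and the complete text is held on screen for about four seconds, can be sketched as a simple schedule. This is an illustration under stated assumptions, not a record of the actual edit: the function `reveal_times` is hypothetical, the even spacing between words is assumed, and only the title is used as sample text, although the text on screen also includes the author's details.

```python
def reveal_times(words, segment_seconds, hold_seconds, start_seconds=0.0):
    """Return (word, appearance time) pairs for a word-by-word reveal.

    Words are spaced evenly (an assumption) so that the last word
    appears `hold_seconds` before the segment ends.
    """
    if not words:
        return []
    last_appearance = segment_seconds - hold_seconds
    if last_appearance < start_seconds:
        raise ValueError("hold time does not fit inside the segment")
    if len(words) == 1:
        return [(words[0], start_seconds)]
    step = (last_appearance - start_seconds) / (len(words) - 1)
    return [(word, start_seconds + i * step) for i, word in enumerate(words)]


# Segment 1 runs from 00:00 to 00:27; the full text stays up for ~4 seconds.
title = "Self-isolation: Ethics, and Extraction of the Self, in Auto-Netnography"
schedule = reveal_times(title.split(), segment_seconds=27, hold_seconds=4)

for word, seconds in schedule:
    print(f"{seconds:5.2f}s  {word}")
```

A regular interval of this kind is one way to let an audience anticipate when the next word will arrive; in an editing application the same result would normally be set by hand on the timeline.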
Segment 2: Social Media Data (00:27 – 00:57)
The last three words spoken by Prof. Kozinets from the previous segment are used as a linking unit of meaning potential to this segment. As he utters the three words, all the visual elements of the previous segment disappear and the static text ‘SOCIAL MEDIA DATA’ appears at the centre. As the other text that is about to appear will be in white, I made this text at the centre non-white to make it look more distinct and significant, for the video essay is in fact about social media data. The background is not blank or black. As the visual of this segment is mostly just static text appearing one after the other, I needed something to keep the visual slightly busier. I wanted some unobtrusive movement in the background. I chose stock footage that has blurred dots of white light of different sizes emerging at random and flying in different directions. This footage plays through the entire length of this segment. The text ‘SOCIAL MEDIA DATA’ is visible almost throughout.
Then I begin the voiceover narration with ‘well, it is not that simple’ as a response to Prof. Kozinets’ statement about using public data and social media data for academic purposes. Through voiceover, I introduce my research topic, the methodology I intend to use, and the potential ethical questions the chosen methodology raises. As I utter some of the key terms, the corresponding text synchronously appears on screen. To keep it legible and orderly, the terms that are research methods (Auto-Netnography and Netnography) appear above the text ‘SOCIAL MEDIA DATA’ and the terms that are about ethics (Informed Consent and Confidentiality) appear below. All the words are in white and are of equal size, but their positions on the screen are different. I end this segment deliberately with a question (what is netnography?) and it works as a linking unit of meaning potential to what follows in the next segment, with Prof. Kozinets’ explanation of netnography unfolding as an answer to my question.
Segment 3: Netnography (00:57 – 01:29)
In this segment, I have used a YouTube video featuring Prof. Robert Kozinets without any modification. All the visual graphics, animation applied on the written text, the use of still images in the background, and the background music come with the chosen footage. I only added the name ‘Prof. Robert Kozinets, Proponent of Netnography’ at the bottom. The visual in this block is much busier than the other parts of the video essay, and it might seem too vibrant and colourful, especially coming after the previous segment which has mostly static text in the visuals. However, variations in the visual composition every twenty to thirty seconds help to keep the audience engaged. For each sentence or phrase uttered by Prof. Kozinets, a relevant still image, written text, or some animation appears alongside the visual. No word (a symbolic sign) is left untranslated into signs of other types.
When Kozinets mentions that he founded Netnography, the covers of the books he has written on the subject appear in the background. When Prof. Kozinets says ‘more focused on meaning than on precision’, this comparison is emphasised in the visual with the first part of the text ‘more focused on meaning’ appearing on the left and the second part ‘than on precision’ appearing on the right, with Kozinets appearing in the middle and with the words ‘meaning’ and ‘precision’ italicised for further emphasis. I follow Kozinets’ description of netnography with my voiceover in which I say, ‘and auto-netnography is making meaning of one’s own interactions on the internet’. To make the transition to my voiceover seamless, I use in my definition of auto-netnography some of the same words (interactions, meaning) Kozinets uses in his last sentence. Moreover, I used the colourful animated graphics from Kozinets’ footage as the visual that accompanies my voiceover, and this too is to make the transition to my voiceover seamless.
Segment 4: My own data (01:34 – 02:20)
In this segment, the orchestration of the units of meaning potential in the multiple modes/media types is entirely based on the auditory text (voiceover), so the following presentation of the analysis is different from what I have done for other segments; with this, I explain how I orchestrated the visuals that accompany each of the seven parts of the voiceover.
1. Voiceover: ‘My research is related to film scores. And I founded a website where I have been writing exclusively about film scores for the past 20 years.’
Visual: The website being spoken about appears on screen, and it appears not as a static image. There is some movement in the visual; it is being scrolled up and down. This is to give an overall sense of what the website looks like. I did not intend for the audience to read any specific text (which appears small) on the website.
2. Voiceover: ‘My online presence elsewhere too is all about the music I love.’
Visual: The ‘elsewhere’ is a Facebook page. So, it appears on screen, and again, the page is being scrolled up and down. Some movement in the visual keeps the audience’s eyes engaged.
3. Voiceover: ‘So, my blogs, Tweets, Instagram, and Facebook posts have now become the data for my Auto-netnography.’
Visual: As the names of the social media platforms are mentioned in the voiceover, the icon representing the respective social media appears on screen. The icons are overlaid on the image that is already depicting the visuals of my website and the Facebook page. This makes the screen seem cluttered, a sludge of several signs and colours for a few seconds; this is to introduce some variation in the combination of elements in the visual. Not everything seen needs to be too precise and neat.
4. Voiceover: ‘This is my own personal data. I can download them, analyse them, quote them directly citing link to the original entry on the social media platform.’
Visual: While the icons continue to appear on screen, to make the visual quieter with minimal objects and movements, I reuse the stock footage I used in Segment 2 and let it play in the background. Below the icons some text appears. I introduce a variation in the interaction between the auditory text (voiceover) and the written text on screen. The text ‘I can do whatever I want with it’ on screen is intended to complement the voiceover saying ‘I can download them, analyse them, quote them…’. This makes the interaction between the words on screen and the words in the voiceover more dynamic. This is unlike what happens in Segment 2, where the words shown on screen are the same as the words spoken, and the words’ appearance on screen was precisely timed to synchronise with their utterance in the voiceover.
5. Voiceover: ‘It is all ethical. Isn’t it? I can do whatever I want with my own social media data.’
Visual: The text on screen is ‘It is ethical. It is all MY data.’ The text complements the words in the voiceover without repeating them.
6. Voiceover: ‘Or, Can I? Hmmm…’
Visual: A big question mark at the centre of the screen.
7. Voiceover: ‘Perhaps not. Well, again, it is not that simple.’
Visual: The question mark disappears. As I recapitulate the phrase ‘it is not that simple’, the phrase the video essay begins with, the central thesis of the paper appears as text on screen: ‘Your social media data is not yours alone’. This sentence is in the abstract and published in the conference programme.[32] It is this counterintuitive idea that I illustrate in the rest of the video essay. I, however, do not say these words out loud in my voiceover – until the end of the video essay.
With this, I have illustrated how I, a practitioner of polymedial essayism, orchestrated elements and units of meaning potential from different modes or basic and qualified media types (written text, auditory text, still image, moving image, original footage, found footage, stock footage) and semiotic resources (text animation, position of text on screen, position of image on screen, typeface and size and colour of written text, pauses between words and emphasis on selected words in the voiceover) to construct the polymedial essay Self-isolation: Ethics, and Extraction of the Self, in Auto-Netnography.
I could have also used a videographic essay to illustrate polymedial essayism. An example of me thinking and discovering entirely on the timeline of editing software is the video essay with no spoken words Screen Stars Dictionary: Aishwarya Rai Bachchan see under ethereal.[33] I imported some of my favourite music videos featuring Aishwarya Rai Bachchan into the editing software, added a music track I thought might work, and started editing. Everything in the video essay – the narrative, the sync points between the music and the moving images, the veil being a recurring motif in the selected videos, the only written text that appears on screen – emerged in the editing process. This videographic work is also a polymedial essay, for I construct the piece by arranging and orchestrating multiple elements of meaning potential: music, movements in Aishwarya’s eyes, movement of her body, movement of the camera in the shot, movement of a piece of cloth, the posture and stillness of the actor, the point where I cut from one clip to the next, and the written word.
Polymedial essayism, polymodal essayism, perceived quality of output
Polymedial essayism is what I practiced when I made my first video essay in 2009 using the video editing software Windows Movie Maker. This was a video essay in which I translated a written analysis into an audiovisual one. I made one mode or media type (auditory text in the form of a voiceover) interact with another (a film clip) to give the readers of my blog on background scores in films a new form of analytical text. These are some of the comments viewers left on the blog post: ‘Nice work. Sounds like a DVD commentary.’; ‘Thank you for making me listen to the music with concentration’; ‘For some reason, I had never paid so much attention to… bg scores… Your compilation was an eye opener. Now I could appreciate… better.’; ‘man pls explain sharp and crispy… mani ratinam padam pathi pesum podu ippadiya illuthu illuthu pesuv [so much stuttering, so many umms and ahhs… Is this the way to speak about a master filmmaker]’.[34] The video essay was useful to some of its intended audience members. Though I now find this work utterly unwatchable, it does not change the fact that the practice that produced the output is what I have now termed polymedial essayism. The same can be said about videographic criticism. That a videographic essay – an audiovisual media object that critically reappropriates audiovisual media objects – is uninteresting to its intended audiences does not change the fact that the practice that produced the output was videographic criticism.
The perceived quality of the output produced by the practitioner is irrelevant. Polymedial essayism is not about producing only good video essays, ones that are evocative and unforgettable; it is about producing video essays, ones that may even be incoherent and unwatchable. The art of making a good video essay or a polymedial essay is an entirely different topic. The number of possible ways of orchestrating multiple units of meaning potential in a polymedial essay is infinite. The orchestration techniques that are repeatedly used by the practitioners and the techniques that always yield effective and engaging outcomes will have to be explored further, and these techniques could be different for different subjects and academic disciplines, for the audiovisual possibilities available to a musicologist and a microbiologist are not the same. Therefore, irrespective of the perceived quality, the author who is translating or adapting a written essay into an audiovisual one – into a PechaKucha, an illustrated lecture, a TED talk on YouTube, or a recorded PowerPoint presentation – is a practitioner of polymedial essayism. This audiovisual translation need not always involve complex combinations of music, sound, animation, effects, and editing techniques and transitions.
Consider, for example, the videos on the YouTube channel Academy of Ideas; they can be made almost entirely using Microsoft PowerPoint. These are videos that examine ‘the ideas put forth by humanity’s greatest philosophers, psychologists, and economists’.[35] Most of the videos on this channel contain just three modes or media types: written word, spoken word (voiceover), and still image. Save for a few exceptions, the author has not used clips from any existing audiovisual media object. In the videos, the author reads a written essay aloud. When quoting others’ work, the quoted text appears on screen as static text, simple and clean with no animation or additional effects. When the author reads their own text, a painting or still image that is relevant to the subject appears on screen. They have been following this method of orchestrating the three modes or media types and the associated semiotic resources fairly consistently in almost all of their videos. At the time of this writing, the Academy of Ideas channel has 1.67 million subscribers and their videos have been watched over 114 million times.[36] There are thousands of such successful channels on YouTube producing video essays using a wide variety of combinations of elements and units of meaning potential in different modes/media types.
In the wider media landscape, audiovisual language is being used in a number of different ways to produce effective, engaging, and enriching video essays about multifarious subjects, even subjects that do not involve analysing any existing audiovisual media object. All this is to say that, videographic or not, a PechaKucha by itself (see Shawn Kanungo’s I Love the Spelling Bee[37] and Why Indians Always Win the Spelling Bee[38]) can be a video essay, and the author who produces it is a practitioner of polymedial essayism.
Author
Sureshkumar P. Sekar holds a PhD from Royal College of Music, London, an MA in creative writing (biography and creative non-fiction) from University of East Anglia, Norwich, and a B.Tech in mechanical engineering from National Institute of Technology, Trichy. His research interests are audiovisual culture, audience experience, video essay, fandom, and film music. His video essays have been screened at over 30 academic conferences and some of his peer-reviewed academic video essays – published in journals such as [in]Transition, Tecmerin, Alphaville, and Music, Sound and the Moving Image – have been included in the best video essays of the year poll in Sight and Sound magazine. His work has been nominated for the Learning on Screen Award and the Adelio Ferrero Award, and was runner-up in the Andrew Goodwin Memorial Prize 2022. His paper ‘Intense Affect, Feeling, and Emotions: Audience Experience in Film-with-Live-Orchestra Concerts’ was named runner-up for the Claudia Gorbman Graduate Student Writing Award 2023.
References
‘About [in]Transition’, [in]Transition: https://mediacommons.org/intransition/about (accessed on 4 February 2024).
‘About’, Academyofideas.com: https://academyofideas.com/about/ (accessed on 4 February 2024).
Academy of Ideas: https://www.youtube.com/@academyofideas (accessed on 4 February 2024).
Almeroth-Williams, T. ‘Solving grammar’s greatest puzzle’, Cambridge University, 2022: https://www.cam.ac.uk/stories/solving-grammars-greatest-puzzle (accessed on 4 February 2024).
Anon. ‘Information Overload? Music Studies in the Age of Abundance Conference Programme’, University of Birmingham, 2021: https://www.birmingham.ac.uk/documents/college-artslaw/music/events/2021/io-final-programme.pdf (accessed on 11 May 2024).
Bruhn, J. and Schirrmacher, B. Intermedial studies: An introduction to meaning across media. Oxon: Routledge, 2021.
Elleström, L. ‘The Modalities of Media II: An Expanded Model for Understanding Intermedial Relations’ in Beyond media borders, volume 1: Intermedial relations among multimodal media, edited by L. Elleström. Cham: Springer Nature, 2021: 3-86.
ENGAGE @ Salford University. ‘NETNOGRAPHY IN THE AGE OF COVID-19 – with Prof Robert Kozinets’, 2021: 45:29-45:50: https://www.youtube.com/live/Y47wk_P3jII?si=Pf0ZyuqwptvaSv83 (accessed on 4 February 2024).
Faden, E. ‘A manifesto for critical media’, Mediascapes, 8, 2008.
Grant, C. ‘The Shudder of a Cinephiliac Idea? Videographic Film Studies Practice as Material Thinking’, Aniki – Portuguese Journal of the Moving Image, 1, 1, 2014: 49-62; http://sro.sussex.ac.uk/id/eprint/47473/1/59-204-1-PB.pdf (accessed on 4 February 2024).
Iten, O. ‘Episode 32. Openness & Videographic Criticism [Audio podcast episode]’, The Video Essay Podcast, 2022: 39:00-41:00; https://podcasts.apple.com/in/podcast/the-video-essay-podcast/id1474512070?i=1000588250364 (accessed on 4 February 2024).
Jensen, S. Musicalized Characters: A study of music, multimodality, and the empiric child perspective on mainstream animation, PhD dissertation. Växjö: Linnaeus University Press, 2021.
Kanungo, S. ‘Why Indians Always Win the Spelling Bee’, Facebook, 2017: https://www.facebook.com/watch/?v=1305703526193747 (accessed on 11 May 2024).
Keathley, C. and Mittell, J. ‘Scholarship in Sound & Image: A Pedagogical Essay’, Videographic Essay, 2019: http://videographicessay.org/works/videographic-essay/scholarship-in-sound–image?path=contents (accessed on 4 February 2024).
Kiss, M. ‘Videographic criticism in the classroom: Research Method and Communication Mode in Scholarly Practice’, The Cine-Files, 15, 2020: http://www.thecine-files.com/videographic-criticism-in-the-classroom/ (accessed on 4 February 2024).
Kozinets, R. Netnography: Doing ethnographic research online. Sage Publications, 2010.
Lavik, E. ‘Notes on the Scholarliness of Videography’, The Cine-Files, 15, 2020: http://www.thecine-files.com/notes-on-the-scholarliness-of-videography/ (accessed on 4 February 2024).
Maini, A. ‘From Zero to 10 Million YouTube Subscribers – Mrwhosetheboss’, YouTube, 2021: 1:03:50 – 1:04:10; https://www.youtube.com/watch?v=TQqKOi8UGG4 (accessed on 4 February 2024).
Meadows, Q., Trocan, I., and Webb, W. ‘The best video essays of 2023’, bfi.org, 2023: https://www.bfi.org.uk/polls/best-video-essays-2023 (accessed on 4 February 2024).
Misek, R. ‘A History Of The World According To Getty Images’, [in]Transition, 10, 2, 2023: https://mediacommons.org/intransition/history-world-according-getty-images (accessed on 4 February 2024).
Mittell, J. ‘Episode 17. Jason Mittell & Christian Keathley [Audio podcast episode]’, The Video Essay Podcast, 2020: 36:10-47:30; https://podcasts.apple.com/in/podcast/the-video-essay-podcast/id1474512070?i=1000588250364.
_____. ‘Making Videographic Criticism’, 2015: https://justtv.wordpress.com/2015/07/01/making-videographic-criticism/ (accessed on 4 February 2024).
‘Netnography: Robert Kozinets’, USC Annenberg, 2018: 00:08-00:50: https://youtu.be/F8axfYomJn4?si=KhW4usxOmkoxQiGs (accessed on 4 February 2024).
o_higgy. ‘videoessays’, reddit: https://www.reddit.com/r/videoessay/wiki/videoessays/ (accessed on 4 February 2024).
Sekar, S. ‘Screen Stars Dictionary: Aishwarya Rai Bachchan see under ethereal’, Tecmerin: Journal of Audiovisual Essays, 12, 2, 2023: https://vimeo.com/866339957 (accessed on 11 May 2024).
_____. ‘Scoring moments of A. R. Rahman’, Backgroundscore.com, 2009 (accessed on 11 May 2024).
‘Shawn Kanungo at Pecha Kucha 14 in Edmonton’, Edmonton Journal, 2012: https://youtu.be/u9P_FoLFKXw?si=CppCTTh5cJkQzkYY (accessed on 11 May 2024).
The Lesser Feat. ‘Notes on YouTube Art (Part 1): ContraPoints, Aesthetics, and Artworlds’, 2019: 33:50 – 34:02; https://youtu.be/n7eTvr3ChRo?si=AO8JSlaQBz1ObdMh (accessed on 4 February 2024).
[1] Iten 2022, 39:00-41:00.
[2] Ibid.
[3] Mittell 2015.
[4] [in]Transition.
[5] Kiss 2020.
[6] Grant 2014.
[7] Mittell 2020, 36:10-47:30.
[8] Meadows & Trocan & Webb 2023.
[9] Misek 2023.
[10] Ibid.
[11] Keathley & Mittell 2019.
[12] Almeroth-Williams 2022.
[13] Lavik 2020.
[14] Iten 2022, 39:00-41:00.
[15] o_higgy.
[16] Maini 2021, 1:03:50-1:04:10.
[17] The Lesser Feat 2019, 33:50-34:02.
[18] Elleström 2021, pp. 8-13.
[19] Ibid., p. 34.
[20] Ibid., p. 46.
[21] Ibid., p. 47 (original emphasis).
[22] Ibid., pp. 20-21.
[23] Ibid., p. 49.
[24] Jensen 2021, p. 54.
[25] Theo van Leeuwen as quoted in Jensen 2021, p. 52.
[26] Elleström 2021, p. 54.
[27] Ibid., p. 55.
[28] Bruhn & Schirrmacher 2021, p. 4.
[29] Elleström 2021, p. 54.
[30] Faden 2008.
[31] ENGAGE @ Salford University 2021, 45:29-45:50.
[32] Anon 2021.
[33] Sekar 2023.
[34] Sekar 2009.
[35] Academy of Ideas.
[36] Ibid.
[37] Edmonton Journal 2012.
[38] Kanungo 2017.