Game engines: Optimising VFX, reshaping visual media
by Tom Livingstone
Introduction
Game engines, such as Unreal Engine and Unity (from Epic Games and Unity Technologies respectively), are a family of software platforms containing pre-programmed solutions for problems common to digital game development. These include gameplay mechanics, digital asset management, environment building, and the rendering of the 3D game environment as a 2D screen image, often in ‘real time’. Real-time rendering has sparked a proliferation of use-cases for game engines beyond the world of game development: in computer-animation, XR media, product development and – as shall be discussed in this article – the generation of In-Camera VFX (ICVFX).
ICVFX is a specific operation within the broader practice of Virtual Production (VP) and involves cinematographically recording a live-action scene within an LED volume displaying dynamic digital environments rendered in ‘real-time’, capturing generated images ‘in camera’. While not exactly an oxymoron, ‘in-camera VFX’ is nevertheless a phrase that encapsulates the way in which two distinct image-making procedures – the photographic and the computational – are brought into congruence in a novel assemblage of technologies and filmmaking practices as a result of the growing ubiquity of game engines across visual culture. This article aims to test the compatibility between ‘live-action’ and ‘real-time’ technologies in order to gain insight into our contemporary visual media landscape and its relationship to our wider digital milieu.
Whenever the phrase ‘real-time’ is used in conjunction with a computational function – e.g. real-time traffic data collection, real-time video conferencing – it augurs a process that is likely to have an amplified epistemic effect.[1] ‘Real-time rendering’ is no exception. The previously dominant epistemic paradigm of photographic visuality is – with the emergence of ‘real-time rendering’ – increasingly inflected by a computational paradigm: what I call game engine visuality. The core dichotomy at play, as noted by James Dobson in his history of computer vision,[2] is between images that depend on sensed data (i.e. ones that are generated photographically) and images that consist of simulated, or synthetic, data. Each category of image engenders a specific media-epistemic framework. To adopt language drawn from Vilém Flusser: photographs appear to us not as surfaces, but as windows onto reality,[3] and as such they shape how we understand the world around us. The past three decades of technical development in computer-animation and VFX have been shaped, as many critics have noted, by a teleology of naturalism.[4] Images featuring non-veridical content (dinosaurs, spaceships, Autobots, and so on) nevertheless have a persistent degree of photorealism. What this means (to borrow Flusser’s language again) is that computer-generated photoreal images, given their conformity to the standard aesthetic and spatial languages of analogue photography, are habitually misperceived as windows onto a reality that does not exist. ICVFX represents a moment in this process of aesthetic convergence and medial and ontological divergence when real-time rendered images have such a degree of photorealism that a camera can be pointed at an LED surface and capture an image of profound depth and dynamic dimensionality.
This article is interested in this logistical, aesthetic, and epistemological overlap of ‘live-action’ and ‘real-time’ image modes. What follows will explore the ways in which game engine visuality incorporates and re-mediates a great deal of the phenomenological and epistemic characteristics of photographic visuality, but in a way that de-prioritises visuality as a sensed and embodied experience. I will begin by offering an outline of the image pipelines of photographic and game engine visuality, emphasising the difference in their relations to external phenomena, profilmic reality, and processes such as data management and computational optimisation. I will then discuss the production logistics of ICVFX with specific reference to the ‘magic cutaway’, a trope that represents, in its ubiquity, the successful optimisation of both film production and ‘real-time’ rendering within the Virtual Production context. The article will then offer a case study to open up not just the aesthetic impact of ICVFX, but to reflect on the de-prioritisation of photographic, lens-based images – correlated as they are to embodied ocular perception – within visual media. Speculating on the epistemic impact of a predominantly computational, as opposed to ocular, visual regime, this article will conclude by drawing insights from critical algorithm studies and the broader field of digital humanities.
Trevor Paglen has noted the growing category of images made by machines for other machines to see, calling them ‘invisible images’. As the adoption of game engines across numerous visual media pipelines demonstrates, not all computational imagery is invisible. In widening my discussion beyond visual studies, my conclusion aims to gesture towards a mode of critical engagement commensurate to the computational technologies and imperatives that underpin contemporary image culture. Game engines are not visual media – they are computational media. Unlike photography, game engines do not depend on the registration of external phenomena. Even when they represent real objects, via processes such as photogrammetry, LIDAR scanning, and so on, the images they output are the derivative by-products of the game engine’s management of computational complexity. And it is these technological conditions, particularly the computational imperative of optimisation, that inform game engine visuality. It is with this emergent, ever-optimising visual regime, crystallised in ICVFX, that this article seeks to initiate a broader engagement.
Triangulating media studies, film studies, and game studies
Before discussing the particularities of the ICVFX process, it is worth summarising some relevant theory that distinguishes between the analogue photograph and the computer-generated image. John May makes a definitive distinction in his book Signal Image Architecture: ‘Appearances notwithstanding, the photograph and the image belong to fundamentally different technical categories…’[5] He clarifies that you cannot do anything with photographs, whereas images are always latently operative:
photographs, never intrinsically calculable, remain thoroughly visual. Images, structurally calculable, are only apparently visual.[6]
Alexander Galloway describes two different ‘contracts’ of visuality established by photographic and computational media. He writes,
Photographic vision fans out into the world, locating objects in proximal relation to the origin. Because of its putative resemblance to human vision… The photographic diagram has indeed been quite influential, playing an outsize role in philosophy and culture.[7]
By contrast: ‘if the photographic eye is, as it were, convex, like the prow of a ship jutting out into the world from the middle, then the computational eye is concave, glancing and encompassing the world from the fringe’.[8] Galloway elaborates on this diagrammatic inversion that does away with the origin point of seeing, asserting that
computational vision takes it as a given that objects and worlds can and will be viewable from all sides… Computational vision takes it as a given that point of view is not necessary for seeing.[9]
While this makes clear that visual (ocular, optical) perception is an increasingly insignificant component of computational vision, it does not account for the cultural impact of computational media and the way that a visual culture produced by an emergent technical paradigm relates to the visual regimes and epistemic frames that it supersedes. What is lacking from Galloway and May’s theorisation of the respective paradigms of photographic and computational visuality is a focus on the paradigm shift itself.
The same is true of film studies: there has long been a recognition that film aesthetics in the digital era are skeuomorphic, retaining an aesthetic link to a redundant set of technical determinants in order to not be wholly illegible to human eyes. Fleming and Brown exemplify this way of thinking, pointing out the redundancy of techniques such as editing in the context of ‘post-analogue’ cinema’s ‘gaseous’ possibilities.[10] Less abstractly, my focus is on what remains residual – technically, aesthetically, and epistemologically – within the computational regime. If visual media is only secondarily legible to human eyes, it is still the legible portion of visuality – the visual artefacts of digitality – that I am interested in and that, crucially, offers opportunities for critical engagement. In the context of AI-generated imagery, Alan Warburton has asked ‘is photography a zombie medium?’[11] For my project this leads to further questions, not least of which is the following: if we engage with a zombie medium as if it were still alive, what impact does that have on our framing of the world it purports to represent?
Finally, within digital game studies, there is important work that focuses specifically on game engines.[12] Within this field, as Eric Freedman’s book The Persistence of Code in Game Engine Culture recognises, there is a growing interest in the relationship between material and embodied epistemologies and game engine technology. Freedman argues that game engines ‘mediate between data and embodiment… they are a structural lynchpin between game-based allegories of space and place and a more pervasively experienced and culturally volatile algorithmic state.’[13] Drawing Freedman’s insight into the realm of screen studies: analyses of game engine-derived imagery will interrogate the co-existence of technically-discrete media regimes and identify emergent aesthetic characteristics and phenomenological qualia that require attention. Furthermore, in situating game engines as a lynchpin between algorithmic functionality and embodied experiences of space and place, Freedman identifies something that is crucial to my wider argument.
Game engines are deeply imbricated in what Wolfgang Ernst described as the ‘techno-epistemological momentum in culture itself’.[14] There is a feedback loop between our phenomenological experience of reality and the temporal and spatial characteristics of game engine visuality. Game engines are an increasingly central apparatus of visual media and the source of growing swathes of visual culture and experience. Like photography’s automation of perspectival representation and cinematography’s registration of time in 24 frames per second, game engines’ production of ‘real-time’ imagery through the computational management of complexity is set to inform our perceptual habits more broadly. This article aims to illuminate not just the technical conditions of the images themselves, but also how the images ‘mediate between data and embodiment’ and shape our techno-cultural, and by extension social and embodied, experience.
Live action cinematography and the real-time graphics pipeline
Despite becoming prominent only in the last five years, In-Camera VFX is commonly referred to as a modern-day version of rear projection. This is a useful genealogy – as are comparisons with, for example, painted backdrops and glass and mirror special effects shots – because it emphasises the degree to which a complex on-set apparatus produces a spatial illusion for the lens of the camera. The basic set-up of ICVFX and rear projection is the same: a camera records performers in front of a screened image representing the background of the scene. However, in ICVFX the screened image is not a recorded image but a real-time rendering of a digital environment. The real-time rendering solves the problem of parallax that plagued rear projection, as the physical camera is minutely tracked by a range of sensors so that its position and orientation can be mapped onto a virtual camera within Unreal Engine. When the physical camera moves, its perspective is rendered in real time. Here Galloway’s description of the contrasting ‘contracts’ of computational and photographic visuality is a useful descriptor for the practical set-up of a Virtual Production studio. The camera – the photographic eye – ‘jut[s] out into the world from the middle’, whereas the computational eye ‘encompass[es] the world from the fringe’.[15]
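To make this tracking relay concrete, the following sketch (in Python; all names are hypothetical rather than drawn from any tracking vendor’s or Unreal Engine’s API) shows the minimal per-frame logic: a pose sample from the tracking system is offset into the engine’s coordinate space and applied to the virtual camera. Production systems add lens calibration, timecode synchronisation, and latency compensation, none of which is modelled here.

```python
from dataclasses import dataclass

@dataclass
class Pose:
    """A position (metres) and orientation (quaternion) in stage space."""
    position: tuple[float, float, float]
    rotation: tuple[float, float, float, float]  # (w, x, y, z)

def update_virtual_camera(tracked: Pose, stage_to_world=(0.0, 0.0, 0.0)) -> Pose:
    """Relay one tracking sample to the in-engine camera.

    This runs every frame: sensors report the physical camera's pose,
    and the render nodes apply it to the virtual camera before the
    next frame of the digital environment is drawn.
    """
    x, y, z = tracked.position
    ox, oy, oz = stage_to_world
    return Pose(
        position=(x + ox, y + oy, z + oz),  # align stage origin with world origin
        rotation=tracked.rotation,          # orientation passes straight through
    )
```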
A key feature of ICVFX production is the viewing frustum, a rectangle that delineates the view of the physical camera into the digital environment. This rectangle moves across the LED volume to match the moves of the physical camera. Everything within the ‘inner’ frustum – i.e. what the cinematographer can see through the camera’s viewfinder – is rendered in ‘real time’, and everything beyond that frame plays into the scene as lighting. In the physical space of the studio, the mobility of this frustum on the LED wall is discombobulating, as within the frustum a parallax is being generated that is calculated exclusively for the viewing position of the camera. From anywhere else, this migrating rectangle contains an uncanny spatial arrangement.
Fig. 1: Unreal Engine Documentation – note the cut-off digital assets at the edge of the frustum.
Fig. 2: Image with a side-by-side of the camera in context and what it sees on the LED wall.
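The geometry of the frustum’s migration across the wall can be approximated as a ray-plane intersection: the rays through the four corners of the camera’s frame are extended until they strike the wall, and the quadrilateral they bound is the inner frustum. The sketch below assumes a flat wall and invented names (real volumes are curved, and this work happens inside the engine’s display pipeline); it is offered only to show that the rectangle is recomputed from the tracked camera pose rather than drawn by hand.

```python
import numpy as np

def inner_frustum_on_wall(cam_pos, corner_dirs, wall_point, wall_normal):
    """Project the camera's four image-corner rays onto a flat LED wall.

    cam_pos     -- tracked camera position (3-vector)
    corner_dirs -- unit vectors through the four corners of the camera frame
    wall_point  -- any point lying on the wall plane
    wall_normal -- the wall plane's normal vector

    Returns the four wall points bounding the inner frustum: only this
    region needs full-quality, camera-correct parallax rendering.
    """
    corners = []
    for d in corner_dirs:
        # cam_pos + t*d lies on the wall when
        # dot(cam_pos + t*d - wall_point, wall_normal) == 0
        t = np.dot(wall_point - cam_pos, wall_normal) / np.dot(d, wall_normal)
        corners.append(cam_pos + t * np.asarray(d))
    return corners
```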
The generation of parallax within ICVFX is an object lesson in computer graphics pipelines and the rendering process itself. It affords an opportunity to apprehend, even if obliquely, the computational processes that generate photoreal images in real time. Recapping Jacob Gaboury’s outline of the graphics pipeline is useful for demonstrating this and underlining the technical chasm that separates ‘real-time’ images from ‘live-action’ ones.
First, data must be translated out of memory into the ‘world space’ of the game engine environment. Given that in the game engine ‘a point of view is not necessary for seeing’, in the journey from data to image a point of view must not only be calculated but all the data relevant to the rendering of that point of view must be rationalised. Gaboury elaborates:
the final render will appear to us as a flat two-dimensional image, in three dimensions this viewing window is understood as a set of planes that extend outward from the rectangle that defines the two-dimensional boundaries of the screen, forming what is known as a viewing frustum, which marks the limit of what is visible to the viewing position of the observer.[16]
The boundaries of the frustum, then, mark the boundaries of another set of computational processes, all of which optimise the production of the final render. Information not within the frustum – the planes and edges of objects that will not be visible – is ‘clipped’. ‘This clipping enacts a logic similar to hidden surface removal… where that which is known but invisible to the viewer must be determined so as to pre-emptively erase these objects before they are rendered.’[17] Clipping demonstrates how rendering proceeds through minimising complexity, pre-emptively removing from the necessary calculations elements that will not be apparent in the final image. On the Virtual Production stage this clipping at the edge of the frustum also marks a point of spatial discontinuity between the environment as it is perceived from the position of the camera and the environment as it is displayed in the rest of the volume. A crucial irony is emerging here: the rendering process visible within ICVFX is one that supports the sovereignty of the camera’s perspective, even though the camera’s perspective must be limited to space that has been pre-emptively optimised and rendered in real time.
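The logic of this pre-emptive erasure can be illustrated schematically. In the sketch below (a toy version of frustum culling, not Unreal Engine’s actual implementation), assets are reduced to bounding spheres and tested against the six planes of the frustum; whatever falls wholly outside is discarded before any further per-vertex or per-pixel work is spent on it.

```python
import numpy as np

def cull_outside_frustum(objects, planes):
    """Drop objects that lie entirely outside the viewing frustum.

    planes  -- six (normal, offset) pairs with inward-pointing normals,
               so a point p is inside a plane when dot(normal, p) + offset >= 0
    objects -- (centre, radius) bounding spheres standing in for scene assets
    """
    visible = []
    for centre, radius in objects:
        # A sphere survives only if it is not fully behind any plane.
        if all(np.dot(n, centre) + d >= -radius for n, d in planes):
            visible.append((centre, radius))
    return visible  # only these proceed down the rest of the pipeline
```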
In the final step in the pipeline, the data within the frustum is
rasterized such that for each pixel in the render area, a colour value is derived from the corresponding object in the scene. This pixel data may then be passed through to the frame buffer that holds it until it is ready to be pushed to the screen of the display.[18]
Words like ‘hold’ and ‘ready’ with their temporal air of waiting and delay belie the speed of the process.
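Caricatured in a few lines of Python (with shade standing in as a hypothetical placeholder for the entire per-pixel shading calculation, and the nested loops standing in for work a GPU performs massively in parallel), this final stage looks like the following:

```python
def rasterise(width, height, shade):
    """Derive a colour for every pixel and hold the results in a buffer.

    shade(x, y) is a stand-in for sampling the scene: the colour of
    whatever object the pixel's viewing ray corresponds to.
    """
    framebuffer = [[(0, 0, 0)] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            framebuffer[y][x] = shade(x, y)
    return framebuffer  # 'held' here until the display is ready to swap it in
```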
This work of translating graphical data into simulated images is a complex and computationally intensive procedure and in contemporary real-time applications such as 3D gaming, these operations must take place thirty to sixty times every second. To express the sheer magnitude of this process numerically, in order to maintain seamless real-time moving images, a computer must perform trillions of calculations per second.[19]
The scale of LED volumes requires a multiplication of these numbers. Unreal Engine can render synchronously across a range of render nodes, each servicing a collection of display devices. The Dark Bay Virtual Production studio, for example, used 20 render nodes. When the physical camera moves in this volume, it initiates a computational process across an array of 20 computers, each performing a trillion calculations per second in order to push images across the 1,470 individual LED panels that covered a total area of 161 square metres.[20]
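A rough, order-of-magnitude calculation conveys the throughput involved. The 161 square metre figure comes from Dark Bay’s published specifications; the pixel pitch and refresh rate below are illustrative assumptions only, chosen as typical values for ICVFX volumes rather than Dark Bay’s actual hardware.

```python
# Back-of-envelope estimate of an LED volume's pixel throughput.
area_m2 = 161      # total panel area (Dark Bay specification)
pitch_mm = 2.8     # assumed LED pixel pitch
fps = 50           # assumed refresh rate

pixels_per_m2 = (1000 / pitch_mm) ** 2
total_pixels = area_m2 * pixels_per_m2   # ~2.1e7 pixels across the volume
updates_per_sec = total_pixels * fps     # ~1.0e9 colour values every second

print(f"{total_pixels:.2e} pixels, {updates_per_sec:.2e} updates/s")
```

Under these assumptions, roughly a thousand million colour values are pushed to the wall every second, each the end point of the rendering calculations described above – which is what distributes the load across the studio’s twenty render nodes.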
Gaboury’s breakdown of the ‘graphics pipeline’ illustrates the complexity of real-time rendering, as well as the several stages in the process where the calculations are optimised through the bracketing out of information not relevant to the final product. Ultimately, however, it underlines the fact that at a technical level computational image-generation is a process of managing high degrees of complexity very quickly, and is thus reducible to large numbers of individual calculations, packaged as a sequence of computational functions. As it is manifest within the frustum on the LED volume, this computational pipeline coalesces as visual information exclusively for the lens-based perspective of the camera. But the question remains: if the graphics pipeline is capable of producing images that are commensurate to cinematography, if ICVFX shots not only have dynamic VFX but also background acting and performances from digital extras, why not cut photographic media out of the pipeline entirely? In answering this question, the imperatives of optimisation return. The tropes that are emerging as the key aesthetic indicators of ICVFX in its first half-decade are ones that would be simply too computationally complex to generate without a camera. Indeed, as I shall discuss, the lens-based cinematographic capture of light bouncing around a profilmic environment is actively instrumentalised within a wider computational pipeline. In many ways, the use of a live-action camera within Virtual Production is simply another optimisation of calculative functionality and management of computational complexity.
To discuss this further, I would like to look at the principal benefits and affordances of ICVFX and ask why contemporary production practices and film forms make use of game engine-dependent pipelines. A strong culture of boosterism surrounds ICVFX,[21] but the following affordances are relevant to my discussion. First, in-camera capture of VFX reduces the post-production pipeline and the dependencies that VFX production has on the live action footage itself. The artistry and development that previously went into the creation and compositing of digital elements in post-production is transplanted into a pre-production schedule that allows for greater iterability and more collaboration between the director and the VFX teams. What is more, the environment and VFX have a greater degree of impact during the process of production itself, because the VFX are literally visible, up on the wall.
Second: during the global pandemic, Virtual Production promised to replace location shooting (a huge logistical advantage in the era of COVID restrictions). Add to this that an ICVFX production can freeze lighting conditions in a way that is impossible on location (in the real world). In an LED volume, the treasured ‘golden hour’ can be elongated, such that dawn and dusk last for as long as the production requires. Third, and related to the prevalence of the ‘golden hour’ in Virtual Production scenes, the lighting conditions generated by the computer-generated VFX on the LED wall interact with the physical scene. The challenges of seamlessly integrating elements shot against green/blue screen into digital environments made scenes with reflective costumes and golden hour lighting conditions incredibly difficult. In ICVFX, by contrast, the VFX elements of the final frame are visible on the LED panels that the camera is pointed at and are apparent as soft light reflecting off every surface that the live action camera records. One of the most prominent characters to emerge from an ICVFX-led production is the Mandalorian: it is precisely the shininess of the character’s helmet that integrates him into the varied environments featured in the series.
Taking these three advantages together, it is no surprise that a key trope of ICVFX is the magic cutaway. Magic cutaways transport a character – who has just won the lottery, or smelled a new shampoo, for example – from one environment to another. With ICVFX this effect can be achieved without any location work and, moreover, as anticipated by Fleming and Brown’s disdain for montage in the post-analogue era, the transition can occur without any editing. The camera simply records the actor as the environment that surrounds them switches from city to desert to jungle to mountain to iceberg, etc.
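Reduced to pseudologic (the function names below are invented for illustration and correspond to no actual API), the trope amounts to swapping which environment the engine renders between frames of a single continuous take; the ‘cut’ happens in the render stream, never in the edit.

```python
import itertools

ENVIRONMENTS = itertools.cycle(["city", "desert", "jungle", "mountain", "iceberg"])

def render_to_wall(environment, frame):
    """Stand-in for the real-time render pushed to the LED volume."""
    return f"{environment}@{frame}"

def unbroken_take(total_frames=600, switch_every=120):
    """One continuous shot: the camera records throughout, while the
    engine hot-swaps the world it is rendering behind the performer."""
    recorded = []
    current = next(ENVIRONMENTS)
    for frame in range(total_frames):
        if frame and frame % switch_every == 0:
            current = next(ENVIRONMENTS)  # the magic cutaway: new world, same take
        recorded.append(render_to_wall(current, frame))
    return recorded
```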
Current scholarship tackling these use cases suggests that the preponderance of the magic cutaway in ICVFX filmmaking indicates that, despite the infinite variability of environments available within Virtual Production, the technique has a limited spatial language when it comes to representing compelling embodied navigation of said environments.[22] Relevant to my wider exploration of game engine visuality, the magic cutaway as a technical procedure gives access to the relative status of computational and photographic modes, in the context of an ongoing paradigm shift. As a leading trope of ICVFX the effect maximises the impact of real-time rendering technologies, but it does so by leveraging the recording technology of the camera.
The magical transition from one environment to another is generated computationally, via the mathematical relay that pushes dynamic information within a rasterised 3D scene onto an LED volume. Inside the volume, live action cinematography becomes a tool for registering real time-rendered phenomena. Despite being prioritised within the production assemblage, the legibility of the cinematic point of view – its recording of the ephemeral and contingent, its indexical relationship to a profilmic reality – is only functional within the computational management of the digital environment. As noted above, the camera can only make legible what has been pre-emptively optimised and rendered exclusively for its perspective. Crucially, this instrumentalisation of cinematography’s ability to register and record subtle changes in light occurring within a profilmic environment also entails an instrumentalisation of our habitual ways of viewing moving images. ‘Live action’ does not just describe a way of recording a scene; it is an epistemic anchor, denoting a set of beliefs that we as viewers will have about the image. The residual modes of photographic visuality – the assumption that the image is a window onto reality, as opposed to merely a colourful surface – persist in the magic cutaway. Indeed, the magic cutaway effect depends on a habitual misreading of images as containing sensed as opposed to simulated data. Ordinarily, ‘live action’ images display continuous and contiguous times and spaces. The magic cutaway presents us with the opposite – a discontinuous, modularised space-time.
This switcheroo is not just a trick of representation but indicates a deeper inflection in the relationship between photographic and game engine visuality. ICVFX relies on both the legibility and the manipulability of the assumption that the camera passively records the continuous and legible space that lies in front of it. The continuity of 3D space is a given of photographic phenomenology and the epistemic frames that we have absorbed from photography. As the magic cutaway demonstrates, this presumption can be instrumentalised within the presentation of a new spectacular effect; the cutaway is magic precisely because it destabilises what was assumed to be immutable. It follows that this destabilisation of space and time as contiguous phenomena, involving as it does the instrumentalisation of previously dominant phenomenological presumptions, has consequences beyond visual culture, particularly as it relates to modes of embodiment and self-articulation within our wider digital milieux.
Case Study – ‘Live Again’
Shane Denson’s recent work on Post-Cinematic Bodies (2023) draws attention ‘to the ways that the user/viewer’s body is interpellated and imagined, constructed or deconstructed, engrossed or expelled’ across a range of media assemblages.[23] In explicitly focussing on embodiment, Denson extends his earlier work on how post-cinematic technologies and aesthetics represent a discorrelation between embodied perception and visual media. Where his previous work engages with the features and conditions of discorrelation either as post-cinema or as an ‘entangled and decentred ethical agency appropriate to a world of discorrelation’,[24] Post-Cinematic Bodies ties discorrelation (its technologies, aesthetics, and untested affordances) to an imminent process of (re-)correlation. ‘Every discorrelation’, Denson writes, ‘including those effected by computational technologies, inevitably leads to a new form of correlation – a new, seemingly unquestionable alignment of subjects and objects…’ This transition results in what Denson calls a multistability of correlative and discorrelative media regimes, which becomes ‘the basis for new phenomenological and political aesthetics of embodiment in a world of VR, AI, smartphones and robots’.[25] This critical framework is vital for addressing ICVFX. How are bodies and embodied positions interpellated and imagined within ICVFX, and in particular magic cutaways? What phenomenology is promoted within the technical assemblage? What forms of embodiment are discovered or disavowed?
The music video that accompanies the Chemical Brothers’ recent track ‘Live Again’ was directed by the duo Dom & Nic.[26] It tells the story of a woman who repeatedly awakens in her trailer, only to find that the world outside has miraculously transformed. She responds to each new environment through dance, by turns ecstatic, withdrawn, volatile, measured, returning to the trailer to collapse and re-awaken. The world around her transforms from a desert to a cave to a cityscape; the space above her is crowded, successively, with lights, a canopy of trees, a crane with a hook, a spacecraft of some kind, and finally, in the distance, miniature versions of herself and her trailer.
There are few edits: many of the wild transformations that take place in the world around the dancer occur in the unbroken duration of single shots. The video features things that would be, in turn, impossible to film (the spaceship), difficult to animate (the dance/physicality), and extremely costly to generate in post-production (the wholesale jumps in environment and changes of lighting on reflective surfaces). The trailer exterior is highly reflective, as is the dancer’s dress, and the colours of the environments play intricately across every surface. Taken as a whole, the video’s aesthetic of shiny costumes in modularised and extensible forms of time and space is an apotheosis of ICVFX circa 2023.
Figs 3-8.
Metaphorically speaking, the video outlines a perpetual and ongoing process of re-attunement. The lyrical refrain ‘Yeah, we live again’ describes the dancer’s predicament – she wakes up again and again in a mutable environment, but then what? The refrain is consistent across the sonic landscapes the song explores. We hear the words ‘we live again’ initially in a high-grain sample, then ‘we live again’ in the highly spatialised and layered synth and rhythm sections; ‘we live again’ repeats as the spaces expand and contract through deft modulation. The video’s narrative literalises this process of persistent environmental re-attunement. Through multiple re-awakenings the dancer performs different dances through the altered places and spaces of the video. The camera’s unwavering attention on the dancer establishes her as a point of suture for the spectator, a visual equivalent of the auditory refrain ‘we live again’, a hook to hang a shared identity from. And so, it is her ability to perpetually adjust to the visuals around her (and more than adjust – to enjoy!) that is offered as a way of navigating the otherwise overwhelming and incoherent visual experience.
As a character exploring her own embodiment through dance in an environment where the conditions of that embodied experience are constantly changing, the dancer does not succumb to disorientation; she resists disintegration through tenacious mode-switching. This ‘mode-switching’ is a contemporary media-epistemic phenomenon highlighted by Eric Jenkins in his book Special Affects: Cinema, Animation and the Translation of Consumer Culture (2014),[27] and is worth unpacking as it speaks not only to the spectator of the video but also to the conditions of Denson’s ‘new phenomenological and political aesthetics of embodiment’. Responding to a phenomenon in computer animation where multiple disjunctive styles are mashed together in a single film (most spectacularly in the Spiderverse films of 2018 and 2023), Jenkins argues that this navigation of contrasting registers is evidence of a set of epistemic strategies that in turn foment emergent consumer behaviours. Jenkins makes particular reference to Wall-E (2008) and its contrasting aesthetic tendencies: on the one hand, the retro-sensibilities of slapstick, Hello, Dolly! (1969), and Wall-E’s own acts of cultural caretaking; on the other, the facetious imagining of a human future of fully-automated luxury (spaceship) communism. Jenkins suggests that the dominant affective register of the film is not one or the other pole within its retro-futurism, but rather the restless oscillation between the two.
In Jenkins’ reading of the film, this invitation to mode-switching runs counter to the film’s more surface-level ecological messaging because, ultimately, the restless flitting between cultural frames encouraged by the film normalises an omnivorous, multi-platform mode of digital consumption, in which the renewal of affective pleasure – the restlessness of ‘mode-switching’ – is of more import than a stable orientation towards any given mode, platform, or commodity. Radical adaptability as a means of constantly renewing sources of affective pleasure is also key to understanding ‘Live Again’. The dancer’s re-invigoration renders the narrative arc of the video more or less flat. The dancer does not accumulate knowledge or grow as a character; she has momentum but no orientation. Ultimately, she metastasises, appearing as a dancer in her own background. This convolution of foreground and background is a neat display of the illusory affordances of ICVFX, but more than that it points to the radical modularity of time and space that the technique encourages. The image of the dancer dancing in her own background provocatively knots the temporal and spatial dimensionality being constructed. It also adds a degree of melancholy to the strange techno-purgatory that the dancer inhabits: rendering space discontinuous has a temporal impact, as the infinite proliferation of the present moment forecloses all possible futures. As a consummate ‘mode-switcher’ the dancer offers up one way of remaining embodied within a totalised media system that is nonetheless constantly in flux. She endeavours to resist the mise-en-abyme of the final images by sheer vitality. But Jenkins’ reading of Wall-E remains apt: ‘mode-switching’ is an epistemic adaptation to the volatile abundance of digital media, but one that excludes certain forms of embodied experience. Insofar as the dancer ‘lives again’, it is in a recursively repeating, futureless present: for all the golden hour cinematography, the dancer cannot dance off into the sunset.
The video is more than just a metaphor for or representation of the body besieged within a baffling hyper-mediated environment – its mode of production is an enactment of this scenario. The opening shot of ‘Live Again’ is an unbroken take that includes the first three ‘scenes’ of the narrative. The dancer emerges from the trailer into: a desert landscape at golden hour; a redwood forest at night; a corner of a neon-lit city. Given that these environments exist as digital renderings on an enormous LED volume, the transitions between real-time rendered scenes are recorded cinematographically, preserving the duration of the sequence if not its spatial contiguity. As such, the actual production conditions for ICVFX remain interestingly close to the surreal story that is being told. The dancer is not simulating the experience of dancing through these wild transitions; the dance takes place within the context of a technologically-enabled ‘real-time’ switch in digital environments. The profilmic conditions of the performance and its recording, then, dovetail with the story that is being told on screen. The dancer re-attunes as smoothly as possible as the environment around her switches in an instant.
In this reading, the photographic media-epistemological framing is dominant: we as viewers are looking through the image to a profilmic reality where a performance is taking place that is responsive to the profilmic environment. However, as my overview of the technical constellation of ICVFX has demonstrated, this is not so straightforward. Similarly, as my reading of media epistemology suggests, our engagement with the images should not be premised on the misperception of the spaces on display. Rather, it needs to be stressed that the digital environments that the camera captures are of a different ontological and medial order than the physical foreground of the recorded image. The processes of mode-switching and re-attunement that the sequence instigates, then, take place not just across jumps in time and space but across the gaps between visual media technologies and their respective means of recording and/or calculating an image, sensing or simulating data. The ‘mode-switching’ is not just a strategy for navigating the media environment; it is one that is necessitated by the co-presence of medially and ontologically-distinct image categories being brought into congruence within a single frame.
This tension is particularly acute where movement through space is being represented, without being enacted in physical space. There are two moments in the video where the dancer’s relationship to the hybrid physical-digital space around her becomes dynamic. The first involves a black and white city scene, where the dancer holds onto a hook dangling from a crane and is, seemingly, lifted upwards. The second occurs moments later when a large black object moves from the deep background straight overhead, as the dancer cowers beneath it. Both shots use a close-up on the dancer to imply action beyond the frame, and in doing so make maximum use of the profilmic materiality of the performer – including her shiny costume – to produce the computer-generated illusion of, in the first instance, her vertical movement through space, and in the second instance the rapid approach of a massive spaceship. This representation of movement is a notable inversion of how CGI and cinematography usually complement one another. In a standard VFX pipeline there is a sequential and chronological relationship between the profilmic and the computer-generated elements within the frame, insofar as the integration of CGI occurs in post-production and is therefore contingent upon the recorded profilmic material. This sequential relationship between live action filmmaking and post-production VFX work is not only a production standard, but is emblematic of the dominance of the structures of photographic visuality even within a visual landscape increasingly reliant on digital and computational processes.
Computer-generated imagery, in the main, is used to augment, enhance, and/or amplify the impact of the cinematographic capture of profilmic data. In ICVFX this sequential relation is reversed, and the medial and technical distinction between photographic and computational media is erased from the final frame. The computational image becomes a phenomenon of projected light within the profilmic environment. However, unlike plain LED panel lighting, the digital assets that are displayed on the LED volume and reflected on the performer’s skin have a tangible semantic content – signifying spaceships, for example, as well as depth and parallax – such that they create the illusion of the body of the performer moving through space even when the performer remains perfectly still.
It is worth unpacking how the semantics of the profilmic and computational portions of the image are relayed, as it gets to the heart of how the computationally-derived elements of the image are making photographic capture a single operation within a wider computationally-generated image pipeline. In traditional cinematography, the profilmic has a causal relationship with the final image which may or may not be digitally manipulated in a range of ways. By contrast, and as outlined above, a computationally-derived image is the result of a sequence of calculations that have been optimised in order to deliver a 2D image of a 3D scene. In ICVFX the digitally-generated VFX are a constitutive part of the profilmic reality that the camera captures. Simulated data becomes sensed data, such that we look through the final image as if it were a window on reality (albeit a reality enclosed in an LED volume in which simulated environments are rendered in real time). This overturns the teetering hierarchy between photographic visuality and game engine visuality, as within ICVFX what the camera can capture is contingent, sequentially and aesthetically, on the computationally-generated images that precede and envelop it.
This interpellation of computationally-derived images into the causally-generated image-making process of cinematography is indicative of the degree to which photographic media and its associated epistemic frameworks have been operationalised within both ICVFX and game engine visuality more broadly. The material, technical, and epistemic structures of photographic media are incorporated as a single function within an enclosed computational image generation pipeline. As such, photographic media is also imbricated within a broader process of optimisation, enabling the management of multiple forms of complexity, not least of which are the logistical challenges of location shooting, in environments with a sun that rises and sets all by itself. But it is also an example of how this process of optimisation, in step with the creation of new production standards dependent on game engines and the computational prowess of ‘real-time’ rendering, is re-shaping the prevailing epistemics of visual media. This can be illuminated by Denson’s notion of dis/correlation.
Where post-cinematic aesthetics mark the points at which digital visual media are discorrelated from normative modes of embodied perception, ICVFX strikingly manifests Denson’s ‘new form of correlation – a new, seemingly unquestionable alignment of subjects and objects’.[28] Here photographic visuality and its alignment of subjects and objects, as well as our habitual perception of photographic images as windows rather than surfaces, is fully instrumentalised within a new phenomenology of embodiment. This emergent form retains the photographic relationship between camera, performer, and profilmic environment precisely to undercut it and render spatial and temporal disjunctions in ‘real time’. Stressing the manipulability and modularity of spatial experience, the magic cutaway presents a new aesthetics of space and embodiment that devalues the concepts of time and space promoted by analogue cinematography and photographic visuality and posits ‘mode-switching’ as a means of affective renewal and forward momentum.
Conclusion
Thus far I have argued that cinematographic practices and the epistemic frame of photographic visuality have become instrumentalised within the pipelines of game engine visuality. Within ICVFX the production dispositif of ‘live action’ no longer has a leading role in the semantic operation of the image, and lens-based cinematography becomes principally a means of registering the significatory content of computationally-derived images and their conditioning of the profilmic scene (aka reality). This literalises Fleming and Brown’s contention that the virtual camera is functionally a skeuomorph that translates digital information for analogue eyes: the physical camera in ICVFX is a means of registering real-time rendered images as visual phenomena and translating them into an atavistic visual language legible to human perception. It is another tool within the extended graphics pipeline, functionally akin to operations like rasterisation, plane projection, and clipping: a data management operation that makes simulated data sensible.
It is at this point that I would like to open the horizons of my discussion from the discrete production scenario of ICVFX to think in terms more apt to my wider category of game engine visuality and the prevailing relationships game engines produce between data and embodiment. Following Denson, it is my contention that game engine visuality is in the process of dis/correlating human sensorial experience within a media regime that has de-prioritised visual experience in favour of more infrastructural mechanisms enabled by computational media. Following Freedman, I take game engine technologies and the ‘allegories of space and place’ they produce to be a lynchpin and a point of critical access in this ongoing process of re-correlation to the reality of the digital media regimes that we exist within today.
This conclusion aims to concretise the connection between visual culture and the infrastructural influence of digital media regimes, but in order to achieve this a potential critical impasse needs to be addressed. In the context of a paradigm shift in the constitution of the visual and its mediation of reality, is aesthetic analysis outmoded, a form of zombie criticism? As mentioned in my introduction, there are several theorisations of the sub-perceptual operativity of digital media in which the relationship between media flows and the perceiving body is integrated at a presubjective level, such that the conditions of visual perception are always at best penumbral to active apprehension. Theorists like Mark B. N. Hansen,[29] Bernard Stiegler,[30] and Wolfgang Ernst[31] situate the crucial dynamics of sensorial mediation as taking place at speeds that outstrip even the microtemporal increments of neurological latency, thus feeding forward into the act of perception. The difficulty that goes largely unaddressed in this body of work, but that is particularly relevant to my concept of game engine visuality, is the implication that critiquing the visual artefacts of an increasingly non-visual process of mediation is an impossible, and therefore redundant, exercise.
However, Steven Shaviro’s recent work demonstrates how the aesthetic qualities of new technological and cultural constellations (as manifest for Shaviro in music videos) are upstream of visual culture at large, and therefore serve as bellwethers of what is yet to be normalised and naturalised within our everyday visual experience. What Shaviro captures as the ‘allatonceness’ of contemporary music video productions is positioned as symptomatic of a broader redistribution of the ratio of the senses taking place within digital culture – the allatonceness of existing within an always-on, always-everywhere digital field. Shaviro’s methodology – with its careful re-purposing of Deleuzian film theory and McLuhanite media theory – holds out the possibility that a narrow aesthetic approach can produce an incisive critique of wider, not wholly apprehensible, media operations. The magic cutaway is not the core affordance of game engine visuality, nor an encapsulation of its technical potentials, but what it does signal, in its overt operationalisation of ‘live action’ cinematography within ‘real-time’ pipelines, is the transitional moment between media regimes and their epistemic frameworks. This in turn opens up the possibility of connecting the aesthetic tropes associated with game engine visuality to wider social and cultural phenomena. In short, what are the experiential socio-cultural analogues of these new tropes? What knowledge effect does this set-up inaugurate? What comes next in this particular story of technogenesis?
A recent essay by Greg Elmer focusses on the ‘first person perspective’ view that internet users have of the social platforms they use. This perspective is not the result of any natural embodied experience of networked sociality, but rather the product of personalisation algorithms that generate a ‘first-person perspective’ as a ‘hegemonic’ technique that ‘guides and clusters the individual’s entrée into [the wider] discriminatory, aggregated economy’. The first-person perspective, then, is essentially an aesthetic device that optimises the interaction between user and platform. It is a system that ‘searches for – and reproduces – commonalities, proximities, and affinities’, producing ‘friction-free user-friendly experiences’ in order to privilege ‘rationalized temporalities and the efficient circulation of business and capital.’[32] Elmer’s identification of the experiential qualities of online life as a direct product of processes of algorithmic governance is echoed in Eran Fisher’s ongoing work on algorithms – understood ‘as epistemic devices’ – which generate ‘knowledge without subjects, knowledge which leaves subjectivity redundant’.[33] It is within this broader digital humanities frame, recognising the residual functionality of atavistic forms such as ‘subjectivity’ and ‘first person perspectives’ within wider techno-cultural operations, that I would like to place ICVFX and game engine technology’s incorporation of photographic visuality.
As my analysis of the ICVFX pipeline has shown, game engine visuality has operationalised our residual habits of ocular perception (those that correlate to analogue photography) in establishing new correlations between the human body and the digital milieux. Within this context the ‘Live Again’ video becomes more than just a visual experience; it is an iteration of a problematic common to computational culture. The video visually enacts the flimsiness of Elmer’s ‘first person perspective’ and the necessity of ‘mode-switching’ within the spatio-temporal formations of digital media. As such, an aesthetic analysis of ICVFX does offer an opportunity to reckon with the wider destabilisation of modern subjectivity within a technological environment that privileges algorithmic knowledge over sensory experience. The music video is a visual artefact of an otherwise inaccessible process: the de-prioritisation of an embodied experience correlated to ocular media such as photography in favour of a computationally-generated image regime in which images legible to ocular perception are secondary to the optimised management of computational complexity. If ‘Live Again’ enacts new forms of embodiment commensurate and correlated to overwhelmingly digital environments, aesthetic analysis allows us to speculate on how these emergent and prevalent aesthetics reflect, entrench, and normalise operations of digital media and computational optimisation that are not exclusively screen-bound.
As mentioned in my introduction, computational media only produce images as a derivative outcome of processes of optimisation. However, that is not to say that a critique of those images cannot account for their technical underpinnings and consolidate the practice of aesthetic analysis within a wider critical engagement with digital technologies. Analysing the rapidly-standardising techniques of virtual production will extend the repertoire of post-cinematic theory and digital visual media critique. At the very least, granular approaches to the visual outputs of a media regime built to optimise the management of complexity will bring to light the urgent need for a critical apparatus capable of addressing game engine visuality with a sophistication commensurate to the intricacies of the ‘real-time’ technologies that are dominating visual experience.
Author
Dr Tom Livingstone is a Research Fellow at the University of the West of England (UWE), working within MyWorld, a UKRI-sponsored creative R&D programme. His research focuses on emergent media, with a particular interest in the impact of game engines on visual culture. He has published widely on film and digital media and his first book, Hybrid Images and the Vanishing Point of Digital Visual Effects, will be published by Edinburgh University Press in October 2024.
References
1899 Behind the Scenes, available online: https://www.youtube.com/watch?v=5vWt49oXINg (accessed on 29 January 2024).
Dark Bay, https://www.dark-bay.com (accessed on 29 January 2024).
Denson, S. Discorrelated images. Durham, NC: Duke University Press, 2020.
_____. Post-cinematic bodies. Lüneburg: Meson Press, 2023.
Dobson, J. The birth of computer vision. Minneapolis: University of Minnesota Press, 2023.
Elmer, G. ‘From the First to the Zero Person Perspective: Neutering the Mediated Life of Affinity’, Computational Culture, 9, July 2023: http://computationalculture.net/from-the-first-to-the-zero-person/.
Ernst, W. Digital memory and the archive, edited by J. Parikka. Minneapolis: University of Minnesota Press, 2013.
_____. ‘Media Archaeography: Method and Machine versus History and Narrative of Media’ in Media archaeology, edited by E. Huhtamo and J. Parikka. Berkeley: University of California Press, 2011: 239-255.
Fisher, E. ‘Do Algorithms Have a Right to the City? Waze and Algorithmic Spatiality’, Cultural studies, 36, no. 1, 2022: 74-95.
Fisher, E. Algorithms and subjectivity: The subversion of critical knowledge. London: Routledge, 2022.
Fleming, D. and Brown, W. ‘FCJ-176 A Skeuomorphic Cinema: Film Form, Content and Criticism in the “Post-Analogue” Era’, Fibreculture journal, no. 24, 2015.
Flusser V. Towards a philosophy of photography. London: Reaktion Books, 2013.
Freedman, E. The persistence of code in game engine culture. New York: Routledge, 2020.
Gaboury, J. Image objects: An archaeology of computer graphics. Cambridge, MA: MIT Press, 2021.
Galloway, A. Uncomputable: Play and politics in the long digital age. London: Verso, 2021.
Hansen, M. Feed forward: On the future of 21st century media. Chicago: University of Chicago Press, 2014.
Holliday, C. The computer-animated film: Industry, style and genre. Edinburgh: Edinburgh University Press, 2018.
Jenkins, E. Special affects: Cinema, animation and the translation of consumer culture. Edinburgh: Edinburgh University Press, 2014.
Livingstone, T. ‘The Spatial Languages of Virtual Production: Critiquing Softwarization with Aesthetic Analysis’ in Creative tools and the softwarization of cultural production: Creative working lives, edited by F. Lesage and M. Terren. Cham: Palgrave Macmillan, 2024: https://doi.org/10.1007/978-3-031-45693-0_3.
MacGowan, C. ‘LED Stages: The Boom in Equipment and Tech’, VFX Voice, April 2023: https://www.vfxvoice.com/led-stages-the-boom-in-equipment-and-tech/ (accessed on 28 January 2024).
May, J. Signal image architecture. New York: Columbia University Press, 2019.
Nicoll, B. and Keogh, B. The Unity game engine and the circuits of cultural software. Cham: Palgrave Macmillan, 2019.
Prince, S. ‘True Lies: Perceptual Realism, Digital Images, and Film Theory’, Film quarterly, 49, no. 3, 1996: 27-37.
Shaviro, S. The rhythm-image: Music videos and new audiovisual forms. London: Bloomsbury Academic, 2023.
Stiegler, B. Technics and time, 3: Cinematic time and the question of malaise. Stanford: Stanford University Press, 2011.
Warburton, A. The wizard of AI, 2023: https://alanwarburton.co.uk (accessed on 30 January 2024).
[1] Cf. Fisher, ‘Do Algorithms Have a Right to the City?’, 2022.
[2] Dobson 2023, p. 35.
[3] Flusser 2013, p. 15.
[4] Holliday 2018, p. 15.
[5] May 2019, p. 48.
[6] Ibid., p. 50.
[7] Galloway 2021, p. 52.
[8] Ibid., p. 53.
[9] Ibid.
[10] Brown & Fleming 2015.
[11] Warburton 2023.
[12] Nicoll & Keogh 2019.
[13] Freedman 2020, p. 170.
[14] Ernst 2011, p. 235.
[15] Galloway 2021, pp. 52-53.
[16] Gaboury 2021, p. 161.
[17] Ibid., pp. 161-162.
[18] Ibid., p. 162.
[19] Ibid.
[20] Dark Bay technical specifications: https://www.dark-bay.com
[21] Cf. MacGowan 2023.
[22] Cf. Livingstone 2024.
[23] Denson 2023, p. 117.
[24] Ibid., p. 219.
[25] Ibid., p. 35.
[26] Chemical Brothers, ‘Live Again’ (feat. Halo Maud). Video online: https://www.youtube.com/watch?v=pqU4g5iJk2Y (accessed on 30 January 2024).
[27] Jenkins 2014, pp. 189-206.
[28] Denson 2023, p. 35.
[29] Hansen 2014.
[30] Stiegler 2011.
[31] Ernst 2013.
[32] Elmer 2023.
[33] Fisher, Algorithms and subjectivity, 2022.