Improvisations


Introduction

Improvisations is an interactive, audio-visual installation that explores the theme of improvisation and allows visitors to directly affect the outcome of the piece. It consists of generative visuals projected onto the walls of the gallery and a generative ambient soundscape that plays through speakers set up around the room. Visitors can directly interact with the installation through sound and movement. Inputs from visitors are detected by motion sensors and microphones. Interactivity is a core element of the installation as it allows visitors to become the improvisors. Improvisations explores the unexpected ways that visitors behave in the environment and the unique ways that the installation responds to these inputs.

Background

The idea behind this project comes from my interest in and research into the history of improvisation in Western music. Improvisation is an essential creative tool for musicians, and improvisation and creativity go hand in hand. Improvisation is a pure form of creativity, and all creative works begin their lives as improvisations.

Improvisation exists not just in music but in all aspects of our lives. When we encounter an unexpected situation, we are forced to improvise through it, and the more creative we are, the better we can adapt to unforeseen circumstances.

Improvisations seeks to encourage visitors to get creative and interact with the installation in creative ways. Like musical improvisation, the final result is spontaneous rather than predetermined, and the outcome is never the same. The installation serves as a vehicle for visitors to become improvisors. It creates an environment that can be manipulated to achieve a variety of unique results. Creating a work that allows for improvisation while making something that is enjoyable to engage with was my primary goal for this project and something that I hope to pursue in future works.

Design

Improvisations consists of both an audio and a visual component. The installation is designed both to respond to the actions of visitors in the space and to create an environment that slowly changes and evolves over time. It invites visitors to experiment and interact with it in unique ways, and to spend time in the space to see how it develops.

Visual Design

The visual design of Improvisations is inspired by abstract painting. The basis of the design is allowing visitors to paint directly onto the walls through their movements. There are three layers to the visuals: the foreground, the midground, and the background. Visitors interact with the different layers based on their distance from the screen.

The foreground tracks the visitor’s hands, allowing up to three people at a time to paint directly onto the wall.

The midground creates outlines of visitors as they move through the space. These outlines are on a delay and heavily distorted, creating seemingly random, abstract shapes throughout the display.

The background consists of an evolving, noise-based design that is revealed as visitors move through the space. When no visitors are present, parts of the background are randomly revealed, creating visual activity even when the space is empty.

Audio Design

The sound design is an ambient soundscape, meant to enhance the experience without being overly intrusive or distracting. Sound is projected into the space through four speakers in the corners of the room, surrounding the visitor with sound. The sound design consists of three main elements: an ambient drone, randomized notes, and audio processing.

The ambient drone makes use of a variety of oscillators tuned to randomized pitches to create a constantly evolving soundscape. The soundscape develops by changing the oscillators used, changing the quantization and transposition of the pitches, and modulating the parameters of various filters and reverb effects used. It subtly changes as visitors become more active in the space.

The randomized notes are directly responsive to the amount of visitor motion and activity in the environment. This component increases the sonic intensity as visitors become more active in the space. Visitor movement also changes how the notes are generated and where they are heard.

The audio processing component records audio from the room and plays it back with intense processing. There are two parts to this component. The first responds to loud noises in the environment, playing them back with heavy delay and reverb and moving them around the room. The second part records sounds from the space and plays them back in response to visitor activity.

Implementation

In order to execute the design, the installation must take in data from a variety of sources, process this data, and use it to generate an output. Two sources of input data drive the installation: player position and movement data taken from an Xbox Kinect V2, and audio data recorded from a microphone. This input data is sent to a central computer for processing, and the resulting audio and video are sent to four speakers and two projectors in the gallery. The audio and visual processing and generation are done separately from each other.

Diagram showing the hardware layout and signal flows of the installation
Diagram showing the flow of software data processing

Video Processing

The visual processing and generation are done in TouchDesigner, a node-based development environment for creating interactive visuals. Using TouchDesigner, I created a processing chain that begins with raw input from the Kinect and ends with the completed visuals.

To start, camera, depth-tracking, player-tracking, and raw position data are taken into TouchDesigner and processed into a usable form. This processed data is used for visual generation and is also sent out of TouchDesigner to drive the audio generation.
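
The text does not specify how this data leaves TouchDesigner; a common approach is to stream it to Max as OSC messages over UDP (for example with TouchDesigner’s OSC Out CHOP). The sketch below, using the python-osc package, is only an illustration of what that per-player message stream might look like; the address names, port, and channel set are assumptions, not the installation’s actual configuration.

    # Hypothetical sketch: forwarding processed Kinect channels from TouchDesigner
    # to Max over OSC. Uses the python-osc package; addresses, port, and the
    # channel set are assumptions, not the installation's actual configuration.
    from pythonosc.udp_client import SimpleUDPClient

    client = SimpleUDPClient("127.0.0.1", 7400)   # Max assumed to listen on this port

    def send_player_data(player_id, x, y, depth, activity):
        """Send one player's normalized position and activity level to Max."""
        client.send_message(f"/player/{player_id}/position", [x, y, depth])
        client.send_message(f"/player/{player_id}/activity", activity)

    # Example: player 0 near the center of the room, moderately active.
    send_player_data(0, 0.5, 0.5, 2.1, 0.4)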

The input video streams are processed using a variety of techniques, including feedback loops, distortion, pixel displacement, and hue shifting. The final output is sent out of TouchDesigner and projected onto the wall by two projectors aligned to create a single continuous image, rendered in real time at 2560×800 and 30 FPS.

The TouchDesigner project that runs the installation. Each element of the project is contained in its own sub-patch. The left column handles data input, the center handles image generation, and the right handles post-processing and output.
Foreground

The foreground uses the player’s hands to paint on the wall. A circle is mapped to the XY coordinates of each hand and feedback effects are used to leave trails behind the hands as they move across the screen.
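
As a rough illustration of the trail-painting logic, the following plain-numpy sketch stamps a circle at each tracked hand position and lets a decaying feedback buffer leave trails. The real layer is built from TouchDesigner operators; the brush radius and decay rate here are assumptions.

    # Hypothetical numpy sketch of the foreground "painting" layer: a circle is
    # stamped at each tracked hand position, and a decaying feedback buffer
    # leaves trails as the hands move. Resolution matches the text; the brush
    # radius and decay rate are assumptions.
    import numpy as np

    H, W = 800, 2560
    canvas = np.zeros((H, W), dtype=np.float32)
    yy, xx = np.mgrid[0:H, 0:W]

    def paint_hands(canvas, hand_positions, radius=25, decay=0.95):
        """Fade the previous frame, then stamp a circle at each hand position."""
        canvas *= decay                               # feedback: old strokes fade slowly
        for u, v in hand_positions:                   # normalized (0..1) hand coordinates
            cx, cy = u * W, v * H
            mask = (xx - cx) ** 2 + (yy - cy) ** 2 <= radius ** 2
            canvas[mask] = 1.0                        # stamp the brush
        return canvas

    # Example: two visitors, one tracked hand each.
    canvas = paint_hands(canvas, [(0.25, 0.5), (0.7, 0.4)])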

Midground

This layer uses the Kinect’s player- and depth-tracking abilities to create a mask composed of the players at a certain distance from the wall. The outlines of the players are taken and heavily distorted to abstract their shapes. A significant delay is added so that player movement does not directly correspond to the action on the screen. Feedback is used so that the image gradually fades away.
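
A minimal sketch of that mask-delay-feedback idea, written in plain numpy rather than TouchDesigner operators, might look like the following; the depth thresholds, delay length, and fade rate are assumptions, and the heavy distortion step is omitted for brevity.

    # Hypothetical sketch of the midground layer: build a silhouette mask from
    # the depth image, take its outline, delay it by a few seconds, and let it
    # fade out through feedback. Thresholds and timings are illustrative only.
    from collections import deque

    import numpy as np

    DELAY_FRAMES = 90                                 # ~3 s at 30 FPS (assumed)
    history = deque(maxlen=DELAY_FRAMES)              # buffer of past outlines

    def midground(depth_frame, layer, near=1.5, far=3.0, decay=0.9):
        """Update the midground image from one depth frame (depths in meters)."""
        mask = ((depth_frame > near) & (depth_frame < far)).astype(np.float32)
        gy, gx = np.gradient(mask)                    # crude outline of the silhouettes
        outline = np.clip(np.hypot(gx, gy) * 4.0, 0.0, 1.0)
        history.append(outline)
        delayed = history[0]                          # oldest buffered frame (~3 s once full)
        if layer is None:
            layer = np.zeros_like(delayed)
        return np.maximum(layer * decay, delayed)     # feedback: the image gradually fades

    # Called once per frame: layer = midground(depth_frame, layer)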

The TouchDesigner component containing the foreground and midground image processing.
Background

The background is based on combining multiple layers of noise with displacement and heavy processing. A variety of parameters, including the noise period and harmonics and the displacement weights, are modulated using low-frequency oscillators (LFOs) to create a constantly changing visual. The background is revealed by players moving along the far wall of the gallery.
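
The following sketch illustrates the idea of LFO-modulated, combined fields; it uses simple sinusoidal fields in plain numpy as stand-ins for the actual Noise TOPs, and the modulation periods are assumptions.

    # Hypothetical sketch of the background layer: two slowly varying fields are
    # blended, with their period and blend weight driven by LFOs. Simple sine
    # fields stand in for the noise generators used in the actual project.
    import numpy as np

    H, W = 800, 2560
    yy, xx = np.mgrid[0:H, 0:W].astype(np.float32)

    def lfo(t, period_s, lo=0.0, hi=1.0):
        """Sine LFO at time t (seconds), mapped into [lo, hi]."""
        return lo + (hi - lo) * 0.5 * (1.0 + np.sin(2.0 * np.pi * t / period_s))

    def background(t):
        """Render one frame of the evolving background at time t (seconds)."""
        period = lfo(t, 600.0, 80.0, 300.0)           # spatial period drifts over 10 min
        weight = lfo(t, 240.0)                        # blend weight drifts over 4 min
        a = np.sin(xx / period + t * 0.05) * np.cos(yy / period)
        b = np.sin((xx + yy) / (period * 0.5) - t * 0.03)
        return weight * a + (1.0 - weight) * b        # combined, slowly changing field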

The TouchDesigner component for creating the background.

Audio Processing

The audio processing is done using Max/MSP. Like TouchDesigner, Max is a node-based development environment, but one that specializes in audio processing. Max takes the processed Kinect data sent from TouchDesigner, as well as audio input from the room, to create the final output.

Player data from TouchDesigner is used to track player movement and activity. The intensity of the generated audio is tied to the amount of player activity. An array of LFOs is used to create cycles of change over long periods of time (in the range of 20 minutes to 2 hours).
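
A minimal sketch of such an LFO bank, assuming an arbitrary set of periods within the stated 20-minute-to-2-hour range:

    # Hypothetical sketch of the long-cycle LFO bank: slow sine oscillators with
    # periods between 20 minutes and 2 hours whose outputs can be mapped onto
    # the audio parameters. The specific periods and count are assumptions.
    import math

    LFO_PERIODS_S = [1200, 2700, 4500, 7200]          # 20 min up to 2 h

    def lfo_bank(t):
        """Return one value in [0, 1] per LFO at time t (seconds)."""
        return [0.5 * (1.0 + math.sin(2.0 * math.pi * t / p)) for p in LFO_PERIODS_S]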

The user interface created in Max showing all the elements of the audio generation.
Ambient Drone

This patch works by combining 128 oscillators, each receiving its own pitch information. The pitches are quantized to one of three scales and transposed around the circle of fifths on a slow cycle. Modulating the reverb, filters, and waveforms creates an evolving drone that acts as the basis of the sonic environment.
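
The pitch logic might be sketched as follows; the scale set, note range, and plain sine waveforms are assumptions standing in for the richer oscillators and effects in the actual Max patch.

    # Hypothetical sketch of the drone's pitch logic: random pitches are quantized
    # to a scale, transposed around the circle of fifths, and summed as a bank of
    # 128 sine oscillators. The scale set and note range are assumptions; the real
    # Max patch uses richer waveforms plus filters and reverb.
    import numpy as np

    SR = 44100
    N_OSC = 128
    SCALES = {
        "major":      [0, 2, 4, 5, 7, 9, 11],
        "minor":      [0, 2, 3, 5, 7, 8, 10],
        "pentatonic": [0, 2, 4, 7, 9],
    }

    def quantize(midi_note, scale):
        """Snap a MIDI note number to the nearest pitch class in the scale."""
        octave, pc = divmod(midi_note, 12)
        return octave * 12 + min(scale, key=lambda s: abs(s - pc))

    def drone(seconds=5.0, scale_name="minor", fifths_steps=0):
        """Render a short drone excerpt at the given circle-of-fifths offset."""
        rng = np.random.default_rng(0)
        t = np.linspace(0, seconds, int(SR * seconds), endpoint=False)
        transpose = (7 * fifths_steps) % 12           # one step = up a perfect fifth
        out = np.zeros_like(t)
        for _ in range(N_OSC):
            note = quantize(rng.integers(36, 84), SCALES[scale_name]) + transpose
            freq = 440.0 * 2.0 ** ((note - 69) / 12.0)
            out += np.sin(2.0 * np.pi * freq * t) / N_OSC
        return out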

Random Notes

This patch generates a stream of random notes based on the amount of visitor activity in the space. As visitors become more active, more notes are generated. Movement in the space influences the panning of generated pitches.
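
A simple sketch of that behavior; the pitch pool, note rate, and the mapping from movement to panning are illustrative assumptions.

    # Hypothetical sketch of the random-note generator: note density scales with
    # visitor activity, and each note's pan position follows where the movement
    # occurred. The pitch pool, rates, and pan spread are illustrative only.
    import random

    def generate_notes(activity, centroid_x, window_s=1.0, max_rate=8.0):
        """Return (pitch, pan) events for one time window.

        activity   -- normalized visitor activity in [0, 1]
        centroid_x -- normalized horizontal position of the movement in [0, 1]
        """
        n_notes = int(round(activity * max_rate * window_s))
        notes = []
        for _ in range(n_notes):
            pitch = random.choice([60, 62, 64, 67, 69]) + 12 * random.randint(-1, 1)
            pan = min(1.0, max(0.0, centroid_x + random.uniform(-0.2, 0.2)))
            notes.append((pitch, pan))
        return notes

    # Example: a fairly active room, with movement toward the left side.
    print(generate_notes(activity=0.6, centroid_x=0.3))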

Noise Input

This consists of two separate patches that record audio from the room and process it in two different ways. The first of these patches records a very short snippet of sound whenever a threshold is reached and plays it back through the speakers with heavy reverb and delay. This works best with short impulses such as clapping.
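
A rough sketch of the capture-and-delay logic; the threshold, delay time, and feedback amount are assumptions, and the reverb and speaker movement of the real patch are omitted.

    # Hypothetical sketch of the first noise-input patch: when a block of room
    # audio exceeds a loudness threshold, the snippet is captured and run through
    # a crude feedback delay. The real patch adds reverb and moves the sound
    # around the four speakers; all values here are assumptions.
    import numpy as np

    SR = 44100
    THRESHOLD = 0.3                                   # RMS level that triggers a capture

    def maybe_capture(block):
        """Return a copy of the block if it is loud enough to trigger playback."""
        rms = np.sqrt(np.mean(block ** 2))
        return block.copy() if rms > THRESHOLD else None

    def feedback_delay(snippet, delay_s=0.35, feedback=0.6, repeats=6):
        """Sum progressively quieter, delayed copies of the snippet."""
        delay = int(delay_s * SR)
        out = np.zeros(len(snippet) + delay * repeats)
        gain = 1.0
        for i in range(repeats + 1):
            start = i * delay
            out[start:start + len(snippet)] += snippet * gain
            gain *= feedback
        return out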

The second patch records up to 30 seconds of audio and plays it back in a loop based on the brightness of the background layer. Heavy reverb is used to distort the sound.
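
A sketch of the rolling 30-second buffer and its brightness-driven playback level; the linear brightness-to-level mapping is an assumption.

    # Hypothetical sketch of the second noise-input patch: a rolling 30-second
    # buffer of room audio is looped back, with its playback level tied to the
    # brightness of the background layer. The linear mapping is an assumption.
    import numpy as np

    SR = 44100
    LOOP_S = 30
    loop_buffer = np.zeros(SR * LOOP_S)               # most recent 30 s of room audio
    write_pos = 0

    def record(block):
        """Write an incoming audio block into the circular 30-second buffer."""
        global write_pos
        idx = (write_pos + np.arange(len(block))) % len(loop_buffer)
        loop_buffer[idx] = block
        write_pos = (write_pos + len(block)) % len(loop_buffer)

    def playback_gain(background_brightness):
        """Map background brightness (0..1) to the loop's playback level."""
        return float(np.clip(background_brightness, 0.0, 1.0))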

Gallery

Acknowledgments

Thank you to Dr. Devin Arne for his invaluable mentorship and support. Thank you also to the West Chester University Summer Undergraduate Research Institute for the opportunity to see this project come to life, and whose funding and resources have made this project possible.