Form, meaning and language

This project is not meant to be a theoretical dissertation on matters of philosophy and aesthetics, but I think it is nevertheless interesting to touch upon the subjects of language, form and meaning in relation to the music developed in this project. This is especially so since I am, in a sense, using the form of spoken language as the meaningful content for my music. In this case the form of speech refers to the prosodic structures that have been identified as significant in linguistics and conversation analysis: melodic intonation contours that express the course of utterances or cue turn-taking, change of key to signal change of subject, accents and stress to mark new or important information, convergence of tempo etc. These underlying musical structures of speech have been starting points for the musical explorations. This exploration of the musical potential of speech is interesting because it can also say something about the semiotic potential of music and point at ways in which music might make sense.

In relation to the significance of prosodic gestures, I think it is very interesting to see how infants seem to develop this kind of underlying melodic framework even before they learn a single word. Baby-talk between infants and parents during the first months after birth seems to work as a fundamental coordination of vocal cues with actions, reactions, facial expressions, emotions, and intentions. This first step of language acquisition is apparently common to all cultures and involves not just learning simple vocal gestures, but combining them into complete melodic narrative structures that later provide the formal foundations for interaction and for constructing utterances of speech (S. N. Malloch, 1999; S. Malloch & Trevarthen, 2009; Miall & Dissanayake, 2003; Snow & Balog, 2002). Such melodic messages in baby-talk carry direct meaning by conveying the basic communicative intent of an utterance – for example whether its function is attention, prohibition, approval, comfort or play. This is particularly apparent in what linguists call infant-directed speech, where prosodic contours tend to be exaggerated, but it is also clearly identifiable in interaction between adults (Fernald, 1989). This is interesting as it shows how the musical foundation for language that we learn as infants constitutes a kind of underlying melodic vocabulary that we use and understand intuitively in speech, and that most likely provides a background for our perception and appreciation of music as well. This is also the reason I chose to focus on these pre-semantic aspects of speech rather than the concepts and ideas of words.

Language and meaning

One of the key features conveyed through prosodic gestures is intention. The significance of intention has also been stressed in several theories about language, as in the speech act theory developed by John L. Austin (Austin, 1962) and later extended by John Searle (Searle, 1969). This theory seeks to understand the meaning in speech not primarily from the semantic content of words, but from the performative function and intention of the utterance as an act – from what it is trying to achieve. This perspective on meaning-making is also related to Wittgenstein’s ideas about language-games, in which the meaning in language derives from its actual use and philosophy cannot deduce some kind of essential meaning in language separate from that use (Wittgenstein, 1953). According to Searle, the very act of speaking presupposes an intention, defined in his concepts of different illocutionary forces, such as declaring, demanding, ordering, warning, promising, inquiring, exclaiming, asserting etc. Seen in relation to the kind of intention conveyed by speech melodies in infant-directed baby-talk, these forces can be viewed as a kind of abstracted intentional meaning that it is reasonable to believe could play a part in musical experience as well, even without the particular propositional content of spoken utterances.

This inherently social function of speech as action and interaction can also be related to the ideas of Mikhail Bakhtin. In his view, the form of utterances is not only shaped by the intention of the sender, but by a dialogical relationship between sender, receiver and the social circumstances. This is the process whereby meaning is created, and why speech genres make up an important part of the meaning of utterances (Bakhtin, 1986). Since speech genres involve the use of certain prosodic patterns that can be described in terms of musical characteristics, this is also why I have found speech genres to be an interesting approach to exploring the musical content of speech. It is interesting because this layer of meaning forms a common reference – a kind of shared social musical language that can have as much precision as the specific words used. The features of some typical genres are often well defined and easy to tell apart, to the degree that their acoustic characteristics can even be reliably identified and classified by automatic computer analyses (Obin, Dellwo, Lacheret, & Rodet, 2010; Obin, Lacheret-Dujour, Veaux, Rodet, & Simon, 2008).

The significance of speech genres, and thus the hint of some kind of conventional musical meaning in speech, does not however mean that it is possible to point at one essential meaning of a musical utterance. Music does not represent the kind of formal communication system that defines languages. The regularity, and the seemingly orderly harmonic, melodic and rhythmic “rules” observed in certain styles of music, have nevertheless tempted many scholars to approach music as a formal language complete with rules for grammar and syntax. Examples range from the highly developed rhetorical figures of Baroque music, Rousseau’s view of music as “impassioned speech” (Rousseau, 1781), Leonard Bernstein’s Harvard lectures on musical semantics (Bernstein, 1976) and Jackendoff and Lerdahl’s musical adaptation of Noam Chomsky’s generative grammar (Lerdahl & Jackendoff, 1983) to various approaches to music as semiotic systems of signs derived from the tradition of Roland Barthes. Raymond Monelle gives a thorough account of how such linguistic theories have been applied in musicology (Monelle, 1992). While many interesting insights can be gained from such perspectives, they share an underlying assumption that all music can be explained with one scientific method. In the search for universality, such approaches overlook the multitude of ways music (and speech) makes sense at the same time. And while languages are well-defined, pragmatic communication systems that can be fairly easily described by rules of grammar and syntax, music is an open-ended poetic mode of expression in the aesthetic domain that produces meaning also by challenging such rules and conventions.

Regarding linguistic ideas, I have found the creative semiotic approach of the musician, filmmaker and semiotician Theo van Leeuwen to be more fruitful: Instead of the descriptive “what is” of scientific explanation, this approach offers a creative “what if”, treating sounds as untapped semiotic resources, structures with many layers of potential meaning (Leeuwen, 1999). Such potential includes for example how sounds with proportions similar to human breath periods (duration, dynamics etc.) can easily be perceived as intentional and communicative utterances. In fact, our perception seems so hardwired for interpreting meaningful outlines of such patterns that we end up seeing faces in clouds and hearing whispers in the wind as well, something that in gestalt psychology is known as illusions. Following this focus on communicative intent, the potential also includes how the characters of utterances, actions, movements and gestures are intuitively interpreted and empathically mirrored as signs of inner states, thoughts, emotions, intentions etc. This need not, of course, be interpreted literally, as our infinite ability to create metaphors can easily make these signs into poetic pictures of something else.

From the perspective of music psychology, John Sloboda has proposed that in the broadest sense, the perceptual background for experiencing the dynamic processes presented by music is our experience of the physical world in motion, and in that world – particularly the moving, living organism (Sloboda, 2012, p. 170). This is not meant as mimicry, but concerns on a deeper level how motions are initiated, experienced and mediated by a human agent. This could even be true in a more general sense for abstract thought as well, as in the way we tend to use spatial metaphors when talking about ideas behind concepts, thinking something through, being on top of things, looking further, viewing from another angle, against a background etc.

Form and meaning

Regarding the relationship between content, context and concept, it seems clear that my approach to music and speech is quite formalistic. But not relating to the semantics and concepts of words does not mean that I am not concerned with meaning. There is a popular myth that music without lyrics does not express anything specific beyond general emotions, but my experience as an improviser is that music certainly can make sense – and in very particular and nuanced ways as well. With Bakhtin, we could counter that words do not mean anything specific in themselves either, only in how they are used in actual utterances, which in turn can create meaning on several different planes, both in speech and music.

In linguistics, it seems easy to make a clear distinction between the form and content of language. Such a division has also been common in thoughts about art, for example in the view of art critic Clement Greenberg that form is the handle allowing content to be grasped (Kim-Cohen, 2009). That might sound easy enough, but according to philosopher Lars-Olof Åhlberg, this simplistic metaphor of form versus content can be very misleading, as it is far from obvious what actually separates form from content in different art forms, or even within the same art form (Åhlberg, 2014).

To relate this to my work in this project, I first experienced that it quickly became very monotonous to listen to speech when the semantic meaning of the words – the content – was filtered out. At the small scale of phrases, such abstract speech sounds were still musically interesting, but at a larger timescale the uniform rate of events and the overall lack of consequence made it too monotonous to work as music.

Dealing with storytelling in theatre improvisation, Keith Johnstone has described the need to reincorporate elements to create coherence; otherwise it just becomes a meaningless sequence of events that can start and end anywhere (Johnstone, 1981). The same, I believe, applies to musical improvisation. So it seemed that what was lacking was some kind of musical consequence, distinction or intended differentiation of the musical features. This is perhaps the kind of formal musical consistency I felt I needed to provide, to replace the semantic content that had been removed.

This view of form as content can be related to the role of theme and variation in art, as discussed by Nelson Goodman in “Languages of Art” (Goodman, 1976). According to Goodman, the modification, elaboration, differentiation and transformation of motifs and patterns are processes of constructive search, and such progressive variation is a typical way of advancing knowledge. This seems to be especially true of how artworks explore the world, constructing their own formal languages through such processes of differentiation, and thereby also providing a content expressed by these formal languages.

In a more general sense, progressive variation is perhaps how knowledge is expanded in other fields as well, including in science. It can also describe how a topic might be explored in conversations or in writing, but then these processes relate to the exploration of concepts, of thought ideas, while music is an organised exploration of sound ideas. Following this line of thought, improvised interplay can be viewed as a dialogical construction and collective exploration of such a formal language.

This, I think, is at the core of what I have been trying to do in this project and that is articulated as one of the aims: to investigate the musical potential of the sonic gestures of everyday speech, through formal differentiation and improvised exploration of its features as sound ideas.

If I should attempt some kind of conclusion to these thoughts on meaning, it must be that music and musical utterances have no single meaning, but can convey countless meanings at the same time. This is however also the case with language, which is certainly not as precise as we like to think, and where meaning is just as often inferred from the context and intonation (the utterance “–apple?” might for example be an opening line, an inquiry about hunger, a piece of nutritional advice, a reference to the laws of gravity, or to a computer company, or it can imply that you are a princess, or that I am a witch, or a serpent, or none or all of these at the same time). Not to mention that some words like swallow or hide refer to completely different concepts when used in different contexts. While some have seen this ambiguity of language as a flaw in an otherwise near-perfect communication system, it has been viewed in cognitive science as an absolutely necessary feature. Language would be far too cumbersome to use if one had to specify everything exactly and unambiguously all the time, as one has to do in computer programming languages. The same ambiguity also makes it possible to convey several things at once, as in puns or poetry. Like all utterances, gestures and actions performed by living (and imagined) beings, speech can be interpreted with a whole range of possible intended and unintended meanings. This, I think, must also be the case with music – it has a multitude of potential meanings, all at once.

As a final remark, it must be stated that the focus on speech in this project has not been an attempt to reduce the content or meaning of music in order to identify one underlying universal “explanation” of music. Many theoretical approaches to meaning in music have taken the form of universalistic generalisations, like Leonard Meyer’s “Emotion and Meaning in Music”, where meaning is viewed essentially as arising from tension and release relating to the expectations formed by learned styles (Meyer, 1956). The idea in this project was rather to broaden the experience of both music and speech, by shedding light on some interesting connections between these two universal phenomena of human nature. One artistic aim was to see if it was possible to make music that could show both the musicality of speech and the language-like logic of music. I hope that this work can also show how music can make sense as a way of thinking, and, like speech, make sense as a way of being together.



Austin, J. L. (1962). How to do things with words. Cambridge, Mass: Harvard University Press.

Bakhtin, M. M. (1986). The Problem of Speech Genres. In Speech Genres and Other Late Essays (pp. 60–102). Austin: University of Texas Press.

Bernstein, L. (1976). The Unanswered Question: Six Talks at Harvard. Cambridge, Mass: Harvard University Press.

Fernald, A. (1989). Intonation and Communicative Intent in Mothers’ Speech to Infants: Is the Melody the Message? Child Development, 60(6), 1497–1510.

Goodman, N. (1976). Languages of art: An approach to a theory of symbols. Indianapolis: Hackett.

Johnstone, K. (1981). Impro: Improvisation and the theatre. London: Methuen.

Kim-Cohen, S. (Ed.). (2009). In the blink of an ear: Towards a non-cochlear sonic art. New York: Continuum.

Leeuwen, T. van. (1999). Speech, Music, Sound. London: Macmillan Press.

Lerdahl, F., & Jackendoff, R. (1983). A generative theory of tonal music. Cambridge, Mass: MIT Press.

Malloch, S. N. (1999). Mothers and infants and communicative musicality. Musicae Scientiae, 3(1_suppl), 29–57.

Malloch, S., & Trevarthen, C. (2009). Musicality: Communicating the vitality and interests of life. In Communicative musicality: Exploring the basis of human companionship (pp. 1–11). Oxford: Oxford University Press.

Meyer, L. B. (1956). Emotion and meaning in music. Chicago: University of Chicago Press.

Miall, D. S., & Dissanayake, E. (2003). The poetics of babytalk. Human Nature, 14(4), 337–364.

Monelle, R. (1992). Linguistics and Semiotics in Music. Harwood Academic.

Obin, N., Dellwo, V., Lacheret, A., & Rodet, X. (2010). Expectations for Discourse Genre Identification: a Prosodic Study. In Interspeech-2010 (pp. 3070–3073).

Obin, N., Lacheret-Dujour, A., Veaux, C., Rodet, X., & Simon, A. C. (2008). A method for automatic and dynamic estimation of discourse genre typology with prosodic features. In Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH (pp. 1204–1207).

Rousseau, J. J. (1781). Essai sur l’origine des langues, où il est parlé de la Mélodie & de l’imitation Musicale. In Œuvres posthumes de J.J. Rousseau (le Pléiade, pp. 371–429). Genève.

Searle, J. R. (1969). Speech acts: An essay in the philosophy of language. Cambridge: Cambridge University Press.

Sloboda, J. (2012). Exploring the Musical Mind: Cognition, emotion, ability, function. Oxford: Oxford University Press.

Snow, D., & Balog, H. L. (2002). Do children produce the melody before the words? A review of developmental intonation research. Lingua, 112(12), 1025–1058.

Wittgenstein, L. (1953). Philosophical Investigations (2001 ed.). Blackwell Publishing.

Åhlberg, L.-O. (2014). On Form and Content. In Notions of the Aesthetic and of Aesthetics: Essays on Art, Aesthetics, and Culture (pp. 123–141). Frankfurt: Peter Lang.
