This is the complete list of references cited in the main documentation and the blog posts.

Ablinger, P. (n.d.). Voices and Piano program note. Retrieved December 1, 2017, from

Astrinaki, M., D’Alessandro, N., Reboursière, L., Moinet, A., & Dutoit, T. (2013). MAGE 2.0: New Features and its Application in the Development of a Talking Guitar. In Proceedings of the International Conference on New Interfaces for Musical Expression. Daejeon, Republic of Korea: Graduate School of Culture Technology, KAIST. Retrieved from

Atal, B. S., & Hanauer, S. L. (1971). Speech Analysis and Synthesis by Linear Prediction of the Speech Wave. Journal of the Acoustical Society of America, 50, 637–655.

Austin, J. L. (1962). How to do things with words. Cambridge, Mass: Harvard University Press.

Bailey, D. (2004). Free Improvisation. In D. Warner & C. Cox (Eds.), Audio Culture: Readings in Modern Music (pp. 255–265). New York: Continuum.

Bakhtin, M. M. (1986). The Problem of Speech Genres. In Speech Genres and Other Late Essays (pp. 60–102). Austin: University of Texas Press.

Bartel, D. (1997). Musica poetica: musical-rhetorical figures in German Baroque music. Lincoln: University of Nebraska Press.

Barthes, R. (1977). Musica Practica. In Image Music Text: Essays selected and translated by Stephen Heath. London: Fontana Press.

Beller, G. (2009). Analyse et Modèle Génératif de l’Expressivité, Application à la Parole et à l’Interprétation Musicale. (Doctoral thesis). Université Paris VI, Paris.

Beller, G. (2014). The Synekine Project. In ACM International Conference Proceeding Series.

Beller, G., Schwarz, D., Hueber, T., & Rodet, X. (2005). Hybrid Concatenative Synthesis On The Intersection of Music and Speech. In Journées d’Informatique Musicale (pp. 41–45).

Beller, G., & Aperghis, G. (2011). Gestural Control of Real-Time Concatenative Synthesis in Luna Park. In P3S, International Workshop on Performative Speech and Singing Synthesis. Vancouver, Canada.

Berio, L. (1974). A-ronne (author’s note). Retrieved from

Bernstein, L. (1976). The Unanswered Question: Six Talks at Harvard. Cambridge, Mass: Harvard University Press.

Bogert, B. P., Healy, M. J. R., & Tukey, J. W. (1963). The quefrency alanysis of time series for echoes: Cepstrum, pseudo-autocovariance, cross-cepstrum and saphe cracking. In Proceedings of the symposium on time series analysis (Vol. 15, pp. 209–243).

BOLOGNA – an overview of the main elements. (n.d.). Retrieved April 8, 2015, from

Bonds, M. E. (1991). Wordless Rhetoric. Musical Form and the Metaphor of the Oration. Cambridge, Mass: Harvard University Press.

Borgdorff, H. (2012). The conflict of the faculties: perspectives on artistic research and academia. Leiden: Leiden University Press.

Bosetti, A. (n.d.). Mask Mirror. Retrieved June 12, 2018, from

Brandt, P. A. (2006). Form and Meaning in Art. In M. Turner (Ed.), The Artful Mind (pp. 171–186). Oxford: Oxford University Press.

Brandtsegg, Ø. (n.d.). Cross Adaptive Processing as Musical Intervention; Exploring Radically New Modes of Musical Interaction in Live Performance. Retrieved June 13, 2018, from

Brandtsegg, Ø. (2007). New creative possibilities through improvisational use of compositional techniques – a new computer instrument for the performing musician. Norwegian University of Science and Technology, Trondheim. Retrieved from

Brandtsegg, Ø., Saue, S., & Lazzarini, V. (2018). Live Convolution with Time-Varying Filters. Applied Sciences, 8(1). Retrieved from

Brunson, W. (2009). Text-Sound Composition – The Second Generation. In Proc. of EMS-09 Conference on Electronic Music Studies.

Bunnell, T. H., Polikoff, J., & McNicholas, J. (2004). Spectral Moment vs. Bark Cepstral Analysis of Children’s Word-Initial Voiceless Stops. 8th International Conference on Spoken Language Processing, 73, 1999. Retrieved from

Cook, P. R., & Lieder, C. N. (2000). SqueezeVox: A new controller for vocal synthesis models. In ICMC.

de Cheveigné, A., & Kawahara, H. (2002). YIN, a fundamental frequency estimator for speech and music. The Journal of the Acoustical Society of America, 111(4), 1917–1930.

Degottex, G., & Obin, N. (2014). Phase distortion statistics as a representation of the glottal source: Application to the classification of voice qualities. In Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH (pp. 1633–1637).

Delalez, S., & D’Alessandro, C. (2017). Vokinesis: syllabic control points for performative singing synthesis. In Proceedings of the International Conference on New Interfaces for Musical Expression (pp. 198–203).

Edstrom, B. (2016). Arduino for Musicians: A Complete Guide to Arduino and Teensy Microcontrollers. Oxford University Press.

Fant, G. (1960). Acoustic theory of speech production: with calculations based on X-ray studies of Russian articulations. The Hague, Netherlands: Mouton.

Fasciani, S. (2014). Voice-Controlled Interface for Digital Musical Instruments. PhD Thesis, National University of Singapore.

Fels, S. (2004). Designing for Intimacy: Creating New Interfaces for Musical Expression. Proceedings of the IEEE, 92(4), 672–685.

Fels, S. S., & Hinton, G. E. (1998). Glove-Talk II – a neural-network interface which maps gestures to parallel formant speech synthesizer controls. IEEE Transactions on Neural Networks, 9(1), 205–212.

Fels, S. S., Pritchard, B., & Lenters, A. (2009). ForTouch: A Wearable Digital Ventriloquized Actor. In NIME (pp. 274–275).

Fernald, A. (1989). Intonation and Communicative Intent in Mothers’ Speech to Infants: Is the Melody the Message? Child Development, 60(6), 1497–1510.

Feugère, L., D’Alessandro, C., Doval, B., & Perrotin, O. (2017). Cantor Digitalis: chironomic parametric synthesis of singing. EURASIP Journal on Audio, Speech, and Music Processing, 2017(2).

Fletcher, H. (1933). Loudness, Its Definition, Measurement and Calculation. The Journal of the Acoustical Society of America, 5(2), 82.

Forrest, K., Weismer, G., Milenkovic, P., & Dougall, R. N. (1988). Statistical analysis of word‐initial voiceless obstruents: Preliminary data. The Journal of the Acoustical Society of America, 84(1), 115–123.

Frayling, C. (1993). Research in Art and Design. Royal College of Art Research Papers Series (Vol. 1). London: Royal College of Art. Retrieved from

Gazzaniga, M. S. (1967). The Split Brain in Man. Scientific American, 217(2), 24–29.

Gibbons, M., Limoges, C., Nowotny, H., Schwartzman, S., Scott, P., & Trow, M. (1994). The new production of knowledge. London: SAGE.

Godøy, R. I., & Leman, M. (Eds.). (2010). Musical Gestures: Sound, Movement, and Meaning. New York: Routledge.

Goodman, N. (1976). Languages of art: An approach to a theory of symbols. Indianapolis: Hackett.

Harnoncourt, N. (1982). Musik als Klangrede: Wege zu einem neuen Musikverständnis. Salzburg: Residenz.

Helskog, G. H. (2003). Den humanistiske dannelsen og 1990-tallets utdanningsreformer. Norsk Pedagogisk Tidsskrift, 87(01–02).

Hessels, L. K., & van Lente, H. (2008). Re-thinking new knowledge production: A literature review and a research agenda. Research Policy, 37(4), 740–760.

Howitt, A. W. (2000). Automatic Syllable Detection for Vowel Landmarks. (Doctoral thesis). Massachusetts Institute of Technology, Cambridge, Mass.

Janer, J. (2008). Singing-driven interfaces for sound synthesizers. Universitat Pompeu Fabra.

Johnstone, K. (1981). Impro: improvisation and the theatre. London: Methuen.

Kim-Cohen, S. (Ed.). (2009). In the blink of an ear: towards a non-cochlear sonic art. New York: Continuum.

Kälvemark, T. (2010). University Politics and Practice-based research. In M. Biggs & H. Karlsson (Eds.), The Routledge Companion to Research in the Arts (pp. 3–23). New York: Routledge.

Lane, C. (2006). Voices from the Past: compositional approaches to using recorded speech. Organised Sound, 11(1), 3–11.

Lane, C. (Ed.). (2008). Playing with Words. London: CRiSAP.

Leeuwen, T. van. (1999). Speech, Music, Sound. London: Macmillan Press.

Lerdahl, F., & Jackendoff, R. (1983). A generative theory of tonal music. Cambridge, Mass: MIT Press.

Lewis, G. E. (1996). Improvised Music after 1950: Afrological and Eurological Perspectives. Black Music Research Journal, 16(1), 91–122.

Lucier, A. (1980). Music on a Long Thin Wire [CD]. New York: Lovely Music.

Malloch, S. N. (1999). Mothers and infants and communicative musicality. Musicae Scientiae, 3(1_suppl), 29–57.

Malloch, S., & Trevarthen, C. (Eds.). (2009a). Communicative musicality: Exploring the basis of human companionship. Oxford: Oxford University Press.

Malloch, S., & Trevarthen, C. (2009b). Musicality: Communicating the vitality and interests of life. In Communicative musicality: Exploring the basis of human companionship (pp. 1–11). Oxford: Oxford University Press.

Malterud, N. (2012). Artistic research – necessary and challenging. INFormation, Nordic Journal of Art and Research, (1).

Margulis, E. H. (2014). On repeat: How music plays the mind. New York: Oxford University Press.

Martin, P. (2010). Prominence detection without syllabic segmentation. In Speech Prosody 2010-Fifth International Conference.

Mattheson, J. (1739). Der vollkommene Capellmeister. Hamburg: Christian Herold.

McPherson, A. P. (2012). Techniques and Circuits for Electromagnetic Instrument Actuation. Proceedings of the International Conference on New Interfaces for Musical Expression, 1–6.

Mermelstein, P. (1975). Automatic segmentation of speech into syllabic units. The Journal of the Acoustical Society of America, 58(4), 880–883.

Mermelstein, P. (1976). Distance measures for speech recognition, psychological and instrumental. Pattern Recognition and Artificial Intelligence.

Merton, R. K. (1973). The Normative Structure of Science. In The sociology of science: theoretical and empirical investigations (pp. 267–280).

Meyer, L. B. (1956). Emotion and meaning in music. Chicago: University of Chicago Press.

Miall, D. S., & Dissanayake, E. (2003). The poetics of babytalk. Human Nature, 14(4), 337–364.

Monelle, R. (1992). Linguistics and Semiotics in Music. Harwood Academic.

Moore, F. R. (1988). The dysfunctions of MIDI. Computer Music Journal, 12(1).

Moulines, E., & Charpentier, F. (1990). Pitch-synchronous waveform processing techniques for text-to-speech synthesis using diphones. Speech Communication, 9(5–6), 453–467.

NTNUs historie. (n.d.). Retrieved from

Obin, N., Dellwo, V., Lacheret, A., & Rodet, X. (2010). Expectations for Discourse Genre Identification: a Prosodic Study. In Interspeech-2010 (pp. 3070–3073). Retrieved from

Obin, N., Lacheret-Dujour, A., Veaux, C., Rodet, X., & Simon, A. C. (2008). A method for automatic and dynamic estimation of discourse genre typology with prosodic features. In Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH (pp. 1204–1207).

Obin, N., Lamare, F., & Roebel, A. (2013). Syll-O-Matic: an Adaptive Time-Frequency Representation for the Automatic Segmentation of Speech into Syllables. In 2013 IEEE International Conference on Acoustics, Speech and Signal Processing (pp. 6699–6703). Vancouver, Canada.

Obin, N., Rodet, X., & Lacheret-Dujour, A. (2008). French prominence: A probabilistic framework. In 2008 IEEE International Conference on Acoustics, Speech and Signal Processing (pp. 3993–3996). Las Vegas, NV: IEEE.

Obin, N., Rodet, X., & Lacheret-Dujour, A. (2009). A Syllable-Based Prominence Detection Model Based on Discriminant Analysis and Context-Dependency. In Speech and Computer (pp. 97–100). Russia. Retrieved from

Patel, A. D., & Daniele, J. R. (2003). An empirical comparison of rhythm in language and music. Cognition, 87(1), 35–45.

Prasanna, S. R. M., Reddy, B. V. S., & Krishnamoorthy, P. (2009). Vowel Onset Point Detection Using Source, Spectral Peaks, and Modulation Spectrum Energies. IEEE Transactions on Audio, Speech, and Language Processing, 17(4), 556–565.

Pritchett, J. (1996). The Music of John Cage. Cambridge University Press.

Radiolab. (2006). Musical Language [Audio Podcast]. New York: WNYC Radio. Retrieved from

Roads, C. (1996). The computer music tutorial. Cambridge, Mass.: MIT Press.

Robinson, D. W., & Dadson, R. S. (1956). A re-determination of the equal-loudness relations for pure tones. British Journal of Applied Physics, 7(5), 166–181.

Rodet, X., & Schwarz, D. (2007). Spectral Envelopes and Additive + Residual Analysis/Synthesis. In Analysis, Synthesis, and Perception of Musical Sounds (pp. 175–227).

Rousseau, J. J. (1781). Essai sur l’origine des langues, où il est parlé de la Mélodie & de l’imitation Musicale. In Œuvres posthumes de J.J. Rousseau (le Pléiade, pp. 371–429). Genève.

Rubidge, S. (2005). Artists in the academy: reflections on artistic practice as research. In Dance Rebooted: Initializing the Grid Conference Proceedings. Retrieved from

Sandell, S. (2011). Music inside the Language [CD]. Steninge, Sweden: LJ Records.

Sandell, S. (2013). På insidan av tystnaden: en undersökning. (Doctoral thesis). Konstnärliga fakulteten, Göteborgs universitet, Göteborg.

Schaeffer, P. (1966). Traité des objets musicaux. Paris: Le Seuil.

Schnell, N., & Battier, M. (2002). Introducing Composed Instruments, Technical and Musicological Implications. In Proceedings of the International Conference on New Interfaces for Musical Expression (pp. 156–160). Dublin, Ireland.

Schnell, N., Borghesi, R., & Schwarz, D. (2005). FTM: Complex Data Structures For Max/Msp. In Proceedings of International Computer Music Conference (ICMC).

Schnell, N., Röbel, A., Schwarz, D., Peeters, G., & Borghesi, R. (2009). MUBU & friends – Assembling tools for content based real-time interactive audio processing in MAX/MSP. In Proceedings of International Computer Music Conference (pp. 423–426).

Schwab, M. (2011). Editorial. Journal for Artistic Research, (0). Retrieved from

Schwab, M., & Borgdorff, H. (Eds.). (2014). The exposition of artistic research: publishing art in academia. Leiden: Leiden University Press.

Schwarz, D., Beller, G., Verbrugghe, B., & Britton, S. (2006). Real-time corpus-based concatenative synthesis with CataRT. In 9th Int. Conference on Digital Audio Effects (DAFx) (pp. 279–282). Montreal, Canada.

Scrivener, S. (2002). The art object does not embody a form of knowledge. Working papers in art and design. Retrieved from

Searle, J. R. (1969). Speech acts: an essay in the philosophy of language. Cambridge: Cambridge University Press.

Shklovsky, V. (1965). Art as Technique. In L. T. Lemon & M. J. Reis (Trans.), Russian Formalist Criticism: Four Essays (pp. 3–24). Lincoln: University of Nebraska Press.

Sloboda, J. (2012). Exploring the Musical Mind: Cognition, emotion, ability, function. Oxford: Oxford University Press.

Smith III, J. O. (2014). Cross Synthesis Using Cepstral Smoothing or Linear Prediction for Spectral Envelopes. Retrieved December 4, 2017, from

Snow, D., & Balog, H. L. (2002). Do children produce the melody before the words? A review of developmental intonation research. Lingua, 112(12), 1025–1058.

Stokes, D. E. (1997). Pasteur’s quadrant: basic science and technological innovation. Washington, D.C.: Brookings Institution Press.

Szczepek Reed, B. (2011). Analysing conversation: an introduction to prosody. Basingstoke: Palgrave Macmillan.

Tamburini, F. (2000). Automatic detection of prosodic prominence in continuous speech. In Proceedings of Eurospeech 2003 (pp. 129–132).

Trifonova, A., Brandtsegg, Ø., & Jaccheri, L. (2008). Software engineering for and with artists: a case study. In Proceedings of the 3rd ACM International Conference on Digital Interactive Media in Entertainment and Arts (DIMEA’08) (Vol. 349, pp. 190–197). Athens, Greece.

Vincent, M. (2010). Music & Language Interrelations. (Doctoral thesis). University of Toronto.

Wennerstrom, A. (2001). The Music of Everyday Speech: Prosody and Discourse analysis. Oxford University Press.

Wishart, T. (1994). Audible design: a plain and easy introduction to practical sound composition. Orpheus the Pantomime.

Wishart, T. (1996). On sonic art. New York: Routledge.

Wishart, T. (2012). Sound Composition. Orpheus the Pantomime.

Wittgenstein, L. (2001). Philosophical Investigations. Blackwell Publishing. (Original work published 1953)

Yegnanarayana, B., & Gangashetty, S. V. (2011). Epoch-based analysis of speech signals. Sadhana, 36(5), 651–697.

Zwicker, E., & Fastl, H. (1999). Loudness. In Psychoacoustics (pp. 203–238).

Ølnes, N. (2016). From Small Signs to Great Form – Analysis of the musical interplay in free improvisation, using the tools of Aural Sonology. Norwegian Academy of Music, Oslo. Retrieved from

Åhlberg, L.-O. (2014). On Form and Content. In Notions of the Aesthetic and of Aesthetics: Essays on Art, Aesthetics, and Culture (pp. 123–141). Frankfurt: Peter Lang.
