Personalized Face and Speech Communication over the Internet

Kshirsagar, S. and Joslin, C. and Lee, W. and Magnenat-Thalmann, N.

Abstract: We present a system for personalized face and speech communication over the Internet. The system consists of three parts: the cloning of real human faces for use as representative avatars; the Networked Virtual Environment System, which performs the basic tasks of network and device management; and the speech system, which includes a text-to-speech engine and an engine for real-time phoneme extraction from natural speech. Together, these three elements allow real humans, represented by their virtual counterparts, to communicate with each other even when they are geographically remote. All elements use MPEG-4 as a common communication and animation standard and were designed and tested on the Windows operating system (OS). The paper presents the main aim of the work, the methodology, and the resulting communication system.

  journal = {IEEE Signal Processing Magazine},
  author = {Kshirsagar, S. and Joslin, C. and Lee, W. and Magnenat-Thalmann, N.},
  title = {Personalized Face and Speech Communication over the Internet},
  publisher = {IEEE Publisher},
  volume = {18},
  number = {3},
  pages = {17--25},
  month = may,
  year = {2001},
  topic = {Facial Animation}