Audio Specifications



Make sure the audio being analyzed conforms to the following guidelines:

  • The audio file must be an uncompressed 16-bit PCM WAV or AIFF file.
  • The audio must be recorded at 16 kHz or greater.
  • The audio should be mono, though FaceFX will reduce the sound to one channel automatically.
  • The audio file should be between 0.5 and 90 seconds long. FaceFX will attempt to analyze longer WAV files by breaking the sound into chunks shorter than 90 seconds and analyzing the chunks without text. For audio files shorter than 0.5 seconds, FaceFX will create a placeholder animation (a simple mouth open and mouth closed animation). These default minimum and maximum values can be changed with the set command using the a_audiomin and a_audiomax variables.
  • For best results, speech should not start within the first 100 milliseconds of the audio file. This gives the speech detection algorithms time to initialize properly, and it also reduces the importance of how negative keyframes are dealt with in the FaceFX integration. If audio with speech starting immediately must be used, the beginning of some animations may be incorrectly recognized as silence. This can be fixed by setting the a_detectspeech variable to false with the set command.
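
The header-level guidelines above (bit depth, sample rate, length) can be checked before a file is handed to FaceFX. The following is a minimal sketch using Python's standard wave module. It is not part of FaceFX: the function name check_wav and the constants are hypothetical, the length limits are the documented defaults of 0.5 and 90 seconds, and only WAV files are handled (not AIFF). It does not check where speech starts in the file.

```python
import wave

# Documented defaults; these mirror a_audiomin and a_audiomax.
MIN_SECONDS = 0.5
MAX_SECONDS = 90.0
MIN_SAMPLE_RATE = 16000  # 16 kHz or greater


def check_wav(path):
    """Return a list of guideline violations for a WAV file (empty if none)."""
    try:
        wav = wave.open(path, "rb")
    except (wave.Error, EOFError) as exc:
        # The wave module only reads uncompressed PCM WAV data.
        return ["not a readable uncompressed PCM WAV file: %s" % exc]

    problems = []
    with wav:
        sample_width = wav.getsampwidth()  # bytes per sample
        rate = wav.getframerate()
        frames = wav.getnframes()

    if sample_width != 2:
        problems.append("expected 16-bit samples, found %d-bit" % (sample_width * 8))
    if rate < MIN_SAMPLE_RATE:
        problems.append("sample rate %d Hz is below 16 kHz" % rate)

    duration = frames / rate if rate > 0 else 0.0
    if duration < MIN_SECONDS:
        problems.append("%.2f s is shorter than %.1f s" % (duration, MIN_SECONDS))
    elif duration > MAX_SECONDS:
        problems.append("%.2f s is longer than %.1f s" % (duration, MAX_SECONDS))

    # The channel count is not checked: mono is preferred, but FaceFX
    # reduces the sound to one channel automatically.
    return problems
```

Calling check_wav on a conforming file returns an empty list. Files flagged only for length can still be analyzed, since FaceFX chunks long audio and creates placeholder animations for very short audio.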
Version Number: 2009