Automatic Speaker Recognition (ASR) has been an active area of research for the past four decades, with speech collected mostly in research laboratory environments. However, due to the growing applications and possible misuses of Voice over Internet Protocol (VoIP) networks, there is a need to employ robust ASR systems over VoIP networks, especially within the context of internet security and law enforcement activities. Yet there has been little systematic study of how the various artifacts of VoIP (such as the speech codec, packet loss, packet reordering, network jitter, and foreign cross-talk or echo) affect the performance of an ASR system. This chapter investigates each of these VoIP issues individually and assesses its impact on the performance of the ASR system. In this chapter, a narrowband 2.4 kbps mixed-excitation linear prediction (MELP) codec is used over a VoIP network.
|Original language||English (US)|
|Title of host publication||Forensic Speaker Recognition|
|Subtitle of host publication||Law Enforcement and Counter-Terrorism|
|Publisher||Springer New York|
|Number of pages||27|
|State||Published - Jan 1 2012|