INTERPOL recently announced
the successful completion of the final field test of the Speaker Identification
Integrated Project (SiiP).
Using a database of real audio recordings, the UK’s
Metropolitan Police Service and the Portuguese Polícia Judiciária demonstrated
how unknown speakers talking in different languages could be identified in
social media or lawfully intercepted audio through a fusion of key markers such
as gender, age, language and accent.
The four-year (May 2014 – April 2018) European Union-funded
research project is run by an international consortium of 19 partners, comprising Law
Enforcement Agencies or LEAs (INTERPOL and police forces from the UK, Italy,
Portugal and Germany), SMEs, industrial hi-tech companies and academic
institutes.
As a full project partner, INTERPOL focuses on ensuring that
the speaker identification technology meets the operational needs and
requirements of law enforcement agencies, while guaranteeing that the legal aspects
of the technology are compatible with existing national legislation,
INTERPOL’s Rules for the Processing of Data and safeguards for individual
privacy.
The technology and its utility
SiiP is a probabilistic, language-independent voice
recognition system that uses a novel Speaker-Identification (SID) engine and a
Global Info Sharing Mechanism (GISM) to identify unknown speakers
captured in lawfully intercepted calls, in recordings from crime or terror scenes, on
social media and in any other type of speech source.
The system’s speaker identification technology combines
multiple speech analytic algorithms (Speaker-model-Identification,
Gender-Identification, Age-Identification, Language-Identification and
Accent-Identification), which are provided by different vendors. This fusion
results in highly reliable, high-confidence detection, keeping false positives
and false negatives to a minimum.
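The idea of fusing several analyzers into one decision can be illustrated with a short sketch. The announcement does not describe how SiiP combines its engines, so the code below is not the project's method: it assumes a generic score-level fusion in which each analyzer reports a log-likelihood-ratio-style score and the scores are combined by a weighted sum passed through a logistic function. The names `AnalyzerScore`, `fuse` and `decide`, and all weights and scores, are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class AnalyzerScore:
    """Output of one (hypothetical) analyzer for a candidate speaker."""
    name: str      # e.g. "speaker_model", "gender", "language"
    llr: float     # log-likelihood-ratio-style score; > 0 supports a match
    weight: float  # assumed trust placed in this analyzer


def fuse(scores, bias=0.0):
    """Combine analyzer scores into a single match probability.

    Weighted sum of the scores followed by a logistic function.
    This is a generic technique, assumed here for illustration only.
    """
    z = bias + sum(s.weight * s.llr for s in scores)
    return 1.0 / (1.0 + math.exp(-z))


def decide(scores, threshold=0.5):
    """Report a match only if the fused probability reaches the threshold.

    Raising the threshold trades false positives for false negatives.
    """
    return fuse(scores) >= threshold


# Illustrative values only: all analyzers support the candidate.
agree = [
    AnalyzerScore("speaker_model", 2.0, 1.0),
    AnalyzerScore("gender", 0.8, 0.3),
    AnalyzerScore("language", 0.5, 0.2),
]

# Illustrative values only: the voice model and gender contradict the candidate.
disagree = [
    AnalyzerScore("speaker_model", -1.5, 1.0),
    AnalyzerScore("gender", -2.0, 0.3),
]

print(decide(agree), decide(disagree))
```

In a sketch like this, a weak or contradictory marker (for example a gender mismatch) pulls the fused probability down even when the voice model alone looks plausible, which is one way combining markers can reduce false matches.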
SiiP enables LEAs to overcome two main challenges they face
today:
- The Evasion
Challenge – The use of hidden, fake and arbitrary identities by terrorists
and criminals in telephony and on the Internet to avoid
lawful interception, identification and tracking by LEAs. Examples include
the use of arbitrary nicknames in various Internet VoIP applications
(e.g. Skype, Viber), the use of face masks on social media (e.g. YouTube) and
the frequent changing of SIM cards in cell phones.
- The second side
problem – The difficulty in identifying unknown participants in a lawfully
intercepted call of a known speaker.
Subject to an appropriate judicial warrant and in accordance
with the legal and ethical frameworks, the system can be run on any speech
source and channel (Internet, social media, PSTN (public switched telephone network), cellular and SATCOM) and
provide LEAs with better intelligence and improved judicially admissible
evidence.
It can use the speaker model as a search criterion on
social media to find more information about the speaker of interest. Each
speaker identity can be associated with rich metadata (identifiers used by the
speaker, personal details, location profiles, social connections and many
more), taken from a variety of sources on the web, on social media and in
telephony. Suspects' voice samples and metadata from Internet and telephony sources,
including social media (e.g. YouTube), can be added.
In accordance with INTERPOL regulations, the system establishes
a secure global information-sharing mechanism for speaker models and their associated
metadata between LEAs around the world, via INTERPOL, thereby expediting
the investigation process and achieving significant resource savings for the
LEAs.