Is there a way to detect individual voices with Video Indexer?

I would like to identify who is speaking and at what timestamp, given a video/audio file that has multiple speakers (seven speakers).

Comments

  •  
    Hi, you can get that information from the index of the video (JSON) under videoIndex > insights > speakers. Here is the call you need to make: https://api-portal.videoindexer.ai/docs/services/Operations/operations/Get-Video-Index?&pattern=index Thanks, Ori, Video Indexer Team
    Posted by Hidden Sun, 15 Mar 2020 16:47:01 GMT
  •  
    Hi Ori, what I meant was: is there a way to identify a speaker given an audio file? For example, if I index an audio file that has seven speakers, is there a way to train it to tell who is currently speaking? The response should return the speaker and the timestamp. Is this possible?
    Posted by Hidden Thu, 19 Mar 2020 16:48:59 GMT
  •  
    We currently don't have that ability in VideoIndexer. Thanks, Avner.
    Posted by Hidden Thu, 19 Mar 2020 17:33:06 GMT
  •  
    Hi Avner, if not with VideoIndexer, do you know of another way to achieve this?
    Posted by Hidden Fri, 20 Mar 2020 05:24:59 GMT
  •  
    This seems like what you are looking for: https://azure.microsoft.com/en-us/services/cognitive-services/speaker-recognition/
    Posted by Hidden Fri, 20 Mar 2020 07:32:43 GMT
  •  
    So is it possible to connect the Speaker Recognition service to the Video Indexer? Is there some documentation on how I could do this?
    Posted by Hidden Sat, 21 Mar 2020 06:19:19 GMT
  •  
    Unfortunately, no. There's no integration with VideoIndexer. This is something you can integrate on your own.
    Posted by Hidden Sat, 21 Mar 2020 07:06:41 GMT
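
For anyone landing here, the first reply (reading the speakers section of the index JSON) could be sketched roughly as below. This is only an illustration under assumptions: the path videos[0].insights.speakers, the field names (name, instances, start, end) and the H:MM:SS time format are assumed and should be checked against an actual Get-Video-Index response, and the sample data is hand-written, not real service output.

```python
import json


def to_seconds(ts: str) -> float:
    """Convert an 'H:MM:SS.fff' string to seconds (format is an assumption)."""
    hours, minutes, seconds = ts.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


def speaker_timeline(index: dict) -> list[tuple[float, float, str]]:
    """Return (start_s, end_s, speaker_name) tuples sorted by start time."""
    # Assumed location of the speaker list; verify against a real response.
    speakers = index["videos"][0]["insights"].get("speakers", [])
    timeline = []
    for speaker in speakers:
        for inst in speaker.get("instances", []):
            timeline.append(
                (to_seconds(inst["start"]), to_seconds(inst["end"]), speaker["name"])
            )
    return sorted(timeline)


# Hand-written sample with the assumed shape (not real service output).
sample = json.loads("""
{
  "videos": [{
    "insights": {
      "speakers": [
        {"id": 1, "name": "Speaker #1", "instances": [
          {"start": "0:00:00", "end": "0:00:04.5"},
          {"start": "0:00:09", "end": "0:00:12"}
        ]},
        {"id": 2, "name": "Speaker #2", "instances": [
          {"start": "0:00:04.5", "end": "0:00:09"}
        ]}
      ]
    }
  }]
}
""")

for start, end, name in speaker_timeline(sample):
    print(f"{start:7.1f}s - {end:7.1f}s  {name}")
```

Note that this only yields generic per-speaker segments with timestamps. As the later replies say, mapping those segments to known, trained identities is not something Video Indexer does, so that step would have to be integrated separately.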

