On-premise video indexing with visual data
Current video indexing is mostly done by converting speech to text and then indexing the transcript. This is mostly to save on compute resources, and because algorithms for indexing video by its visual content are not as stable as those for text.
For example, with speech-to-text indexing, a search for a video of an old man with red shoes wouldn't yield results unless the audio in the video happened to describe it.
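To make the gap concrete, here is a minimal sketch in Python, assuming a toy inverted index. The video ids, transcripts, and visual labels are all invented for illustration, and `build_index` and `search` are hypothetical helpers, not part of any real indexing product. The same query misses on the transcript-only index and hits once visual labels are indexed.

```python
from collections import defaultdict


def build_index(docs):
    """Map each lowercase token to the set of video ids whose text contains it."""
    index = defaultdict(set)
    for video_id, text in docs.items():
        for token in text.lower().split():
            index[token].add(video_id)
    return index


def search(index, query):
    """Return the ids of videos that contain every token in the query."""
    tokens = query.lower().split()
    if not tokens:
        return set()
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result


# Hypothetical data: transcripts as produced by speech-to-text, and labels
# that some visual tagger might emit per video (both made up for this sketch).
transcripts = {
    "vid1": "welcome to our weekly cooking show",
    "vid2": "today we talk about the city marathon",
}
visual_labels = {
    "vid1": "kitchen woman apron",
    "vid2": "old man red shoes street",
}

speech_index = build_index(transcripts)
visual_index = build_index(visual_labels)

query = "old man red shoes"
speech_hits = search(speech_index, query)  # nothing in the audio describes him
visual_hits = search(visual_index, query)  # the visual labels match vid2
```

A real system would need an actual model to produce the visual labels per frame or scene, which is where the compute cost comes from; the index lookup itself is the cheap part.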
It is said that even YouTube hasn't indexed all of its videos by visual data because of the sheer number of videos. There are APIs available, such as the one from Microsoft, that claim to enable video indexing with visual metadata, but they are not on-premise solutions, so using them might rack up bills if the number of videos is large.
There is a need gap for on-premise video indexing technology.