Mosami provides an API for operating on video streams and connecting them to the browser. So where is the “application” in our programming model?
In the websites you visit every day, you interact with buttons on a web page (“buy”), those buttons send messages to a server-side application, and that application updates a database (`quantity[item] -= 1`) that synchronizes information across users.
In a real-time communications application like Mosami, that backend isn’t a database anymore, but the structure remains similar: the user sees a webpage from your webserver that combines their video display with any non-video interactions they need (like buttons); those interaction events are processed by a server-side application on your webserver; and that application speaks to a backend (for video, the Mosami API) when it wants to make changes to other resources.
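As a concrete illustration of that flow, here is a minimal sketch of a server-side event handler. The `handle_browser_event` function, the event dictionary shape, and the in-memory `backend` are all hypothetical stand-ins, not Mosami API names; the point is only the pattern of browser event in, backend change out.

```python
# Hypothetical sketch of the classic web-app flow described above.
# Event shapes and the in-memory "backend" are illustrative only.

def handle_browser_event(event, backend):
    """Route an interaction event from the web page to the backend."""
    if event["type"] == "buy":
        # In a classic web app, the backend change is a database update.
        backend["inventory"][event["item"]] -= 1
        return {"status": "ok",
                "remaining": backend["inventory"][event["item"]]}
    return {"status": "ignored"}

backend = {"inventory": {"widget": 3}}
print(handle_browser_event({"type": "buy", "item": "widget"}, backend))
```

In a Mosami application the same handler shape survives; only the final step changes, from a database write to an API message about a video resource.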
A few things change in the programming style because the video has its own independent channel outside the webpage:
- Interaction events can come from the video, not just the user’s browser. Here, analyzer modules (DetectMotion, DetectFace, DetectSpeech, and others) watch a video stream and send messages to the server-side application. Your application processes the events and determines what action to take.
- Changes to the video happen in the cloud instead of the browser. Processing (multiple) video streams is hard work, and out of reach for many client devices. Instead, the browser is a “thin client” that only sends the webcam video up to the cloud and displays the video stream it receives back in the page. To make changes to the video, your server application sends messages via the Mosami API to filter and mixer elements, which modify and combine streams in different ways.
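The first point above, analyzer events driving application logic, can be sketched as a small dispatcher. The analyzer names (DetectMotion, DetectSpeech) come from the text; the handler registry, the `on` decorator, and the message fields are illustrative assumptions, not Mosami API details.

```python
# Hypothetical sketch: routing analyzer events to application handlers.
# The registry, decorator, and message shape are illustrative only.

handlers = {}

def on(event_type):
    """Register a handler function for one analyzer event type."""
    def register(fn):
        handlers[event_type] = fn
        return fn
    return register

@on("DetectMotion")
def motion(msg):
    return f"motion in stream {msg['stream']}"

@on("DetectSpeech")
def speech(msg):
    return f"speech in stream {msg['stream']}"

def dispatch(msg):
    """Hand an incoming analyzer message to its registered handler."""
    handler = handlers.get(msg["event"])
    return handler(msg) if handler else None

print(dispatch({"event": "DetectMotion", "stream": "cam1"}))
```

The design mirrors ordinary web routing: analyzer messages arrive like button clicks, just from the video channel instead of the browser.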
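The second point, the server application directing cloud-side filter and mixer elements, might look like the following sketch. The `api_send` function, the `"mixer.combine"` command name, and all message fields are hypothetical stand-ins for whatever control messages the Mosami API actually defines.

```python
# Hypothetical sketch: the server app asks the cloud to recombine streams.
# api_send and the command/field names are illustrative stand-ins.

import json

def api_send(command, **params):
    """Stand-in for posting a control message to the video backend."""
    return json.dumps({"command": command, **params})

# Ask a mixer element to combine two webcam uplinks into one output
# stream, which every thin-client browser then displays.
msg = api_send("mixer.combine",
               inputs=["alice_cam", "bob_cam"],
               output="room_view",
               layout="side_by_side")
print(msg)
```

Note that the browser never sees this message: it keeps displaying the same output stream, and the recombination happens entirely in the cloud.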