Vedantu is a LIVE online learning platform that enables personalised learning. Vedantu delivers quality education with cutting-edge technology that brings India's top teachers and students together in a LIVE interactive e-classroom. At Vedantu, teachers interact with students through a 2-way interactive whiteboard, which acts as a bridge to share LIVE audio and video feeds. We at Vedantu host more than 6000 LIVE sessions per day, attended by lakhs of students across all grades.
One of the major drawbacks of the traditional brick-and-mortar classroom model is that students are not able to revisit a classroom session more than once. Vedantu solves this challenge by providing replays of all LIVE sessions a student has enrolled for, enabling the student to revisit a session any number of times.
Initially, we integrated a third-party video streaming service with our platform to serve the session recordings to students. We hit numerous technical bottlenecks with this third-party service, including high latency, limited scalability, and no control over the edge locations. It was also a very expensive solution, since lakhs of students stream recordings every day.
We knew there had to be a better solution, and hence decided to build Vedantu's native video streaming player from scratch. A significant part of the feature involved streaming high-production-value video content, on demand, with very little lag. Given that streaming was fairly new to us, it was essential that we gained the required knowledge, used the right tools, and ran the correct tests to ensure that we delivered a top-notch experience upon launch.
How we delivered the content:
There are several approaches to content delivery, and a lot to consider when choosing the right one: we evaluated different potential strategies, looking for one that would be easy to integrate and to change on the fly.
After some research, we settled on an adaptive bitrate protocol developed by Apple, known as HTTP Live Streaming (HLS), to deliver our content.
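In HLS, a master playlist (the .m3u8 manifest) lists the available renditions, and the player switches between them based on measured bandwidth. A minimal master playlist looks something like the fragment below; the bandwidths and directory names are illustrative, not our actual ladder:

```
#EXTM3U
#EXT-X-STREAM-INF:BANDWIDTH=800000,RESOLUTION=640x360
360p/playlist.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=1400000,RESOLUTION=854x480
480p/playlist.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=2800000,RESOLUTION=1280x720
720p/playlist.m3u8
```

Each rendition playlist in turn points to short media segments that the player downloads one by one.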
The next task was to process MP4 files into HLS streams, and for this we used FFmpeg, a free and open-source package. We built a flow that takes the videos and a config as input and produces a directory containing the HLS stream as output. We stored this output in an Amazon S3 bucket; it contained a manifest with the .m3u8 file extension and one subdirectory for each format we produced.
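The transcoding step can be sketched as a small wrapper that builds an FFmpeg command for one rendition. This is a minimal sketch, not our production pipeline: the paths, bitrates, and segment length are illustrative assumptions.

```python
import shlex
from pathlib import Path

def hls_command(src, out_dir, height, v_bitrate, a_bitrate="128k", segment_seconds=6):
    """Build an ffmpeg command that transcodes one MP4 into one HLS rendition.

    All paths and ladder values here are illustrative, not the actual config.
    """
    out = Path(out_dir)
    return [
        "ffmpeg", "-i", str(src),
        "-c:v", "h264", "-b:v", v_bitrate,        # video codec and target bitrate
        "-vf", f"scale=-2:{height}",              # scale to target height, keep aspect ratio
        "-c:a", "aac", "-b:a", a_bitrate,         # audio codec and bitrate
        "-hls_time", str(segment_seconds),        # segment duration in seconds
        "-hls_playlist_type", "vod",              # complete playlist for replays, not live
        "-hls_segment_filename", str(out / "seg_%04d.ts"),
        str(out / "playlist.m3u8"),
    ]

cmd = hls_command("session.mp4", "hls/720p", 720, "2800k")
print(shlex.join(cmd))
```

In the real flow this command would be executed (e.g. via `subprocess.run`) once per entry in the configuration array, and the resulting directory uploaded to S3.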
There are multiple gears when it comes to video frames and quality, so the next question was: how many videos do we store for one single session? The answer is simple maths: the number of videos per session equals the number of video resolutions multiplied by the number of video formats. In our case, the formats were either MP4 or MKV, and since we had no 4K videos to store, resolutions up to 720p were more than enough. We multiplied the number of formats by the number of resolutions, defined all these configurations in an array, and operated on it.
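The configuration array described above can be sketched as follows; the rendition names and bitrates are assumptions for illustration:

```python
# Illustrative rendition ladder; actual names/bitrates may differ.
RESOLUTIONS = [
    {"name": "360p", "height": 360, "bitrate": "800k"},
    {"name": "480p", "height": 480, "bitrate": "1400k"},
    {"name": "720p", "height": 720, "bitrate": "2800k"},
]
FORMATS = ["mp4", "mkv"]

# One entry per (format, resolution) pair -- the "simple maths" above.
CONFIGS = [
    {"container": fmt, **res}
    for fmt in FORMATS
    for res in RESOLUTIONS
]

print(len(CONFIGS))  # 2 formats x 3 resolutions = 6 variants per session
```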
These configurations were crucial because a slow internet connection cannot handle a demanding video format in terms of quality.
Along with that, the array contains the bitrate, so that every time a request comes in for streaming, the player gets the desired quality, giving a lag-free streaming experience. While watching a full session recording, students don't have to worry about quality, as it is adjusted by continuous bitrate comparison.
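The "continuous bitrate comparison" is performed by the HLS player on the client; in essence it keeps picking the highest rendition the measured bandwidth can sustain. A simplified sketch of that selection logic, with an assumed ladder and headroom factor:

```python
def pick_rendition(measured_kbps, ladder):
    """Pick the highest-bitrate rendition the measured bandwidth can sustain.

    A simplified stand-in for a real HLS player's adaptive-bitrate logic;
    the 20% headroom factor is an illustrative assumption.
    """
    affordable = [r for r in ladder if r["kbps"] <= measured_kbps * 0.8]
    if not affordable:
        return min(ladder, key=lambda r: r["kbps"])  # fall back to lowest quality
    return max(affordable, key=lambda r: r["kbps"])

LADDER = [{"name": "360p", "kbps": 800},
          {"name": "480p", "kbps": 1400},
          {"name": "720p", "kbps": 2800}]

print(pick_rendition(4000, LADDER)["name"])  # plenty of bandwidth -> 720p
print(pick_rendition(1200, LADDER)["name"])  # constrained connection -> 360p
```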
Validating the process:
The main task after building a feature is to validate the process against a large number of requests, including corner cases. We wanted to ensure that requests on our side never fail, even in the event of a hardware failure. The process should be resilient: it should create HLS streams for every scheduled online session and never miss one. Consistency can take a hit here; since we run a large number of sessions daily, there can be cases where an EC2 recording starts late, or where the recording starts but the HLS streams never get created.
So, to keep the process honest, we wanted to know about every HLS miss. For that, we monitored it in our daily stats email, where we checked whether HLS streams existed for each session whose AWS EC2 recording had run as expected.
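The core of that check is a set difference between recorded sessions and the manifests present in the bucket. A minimal sketch, assuming a hypothetical `hls/<session>/master.m3u8` key layout (in production this would list the S3 bucket and feed the daily stats email):

```python
def find_hls_misses(recorded_sessions, manifest_keys):
    """Return session IDs that were recorded but have no HLS manifest in the bucket.

    The key layout 'hls/<session>/master.m3u8' is an illustrative assumption.
    """
    return sorted(
        s for s in recorded_sessions
        if f"hls/{s}/master.m3u8" not in manifest_keys
    )

recorded = {"sess-101", "sess-102", "sess-103"}
in_bucket = {"hls/sess-101/master.m3u8", "hls/sess-103/master.m3u8"}
print(find_hls_misses(recorded, in_bucket))  # sess-102 was recorded but never processed
```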
Handling the process of delivery:
Given that we needed this content to be downloaded with as little latency as possible to allow for a speedy, interactive experience, we used AWS CloudFront, a content delivery network (CDN). As our users request the content from different corners of India, the CDN copies the HLS stream directory from the S3 bucket into regional and local edge caches for a while, so users in nearby locations can access the content more quickly.
We used the CDN to make sure our content is served from as close to the user as possible: instead of every request going past the edge location to the origin, the edge first tries to serve it from its cache. The Vedantu client requests the content with signed tokens that carry an expiry time, so recordings stay accessible only to enrolled students.
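The idea of a token with an expiry time can be sketched with an HMAC over the path and expiry. Note this is a simplified illustration of the concept, not how CloudFront actually signs URLs (CloudFront uses RSA-signed policies with a key pair):

```python
import hashlib
import hmac
import time

SECRET = b"demo-secret"  # illustrative only; never hard-code real secrets

def make_token(path, expires_at, secret=SECRET):
    """Sign a path and expiry timestamp; returns '<expiry>.<signature>'."""
    msg = f"{path}:{expires_at}".encode()
    return f"{expires_at}.{hmac.new(secret, msg, hashlib.sha256).hexdigest()}"

def token_valid(path, token, now=None, secret=SECRET):
    """Reject the request if the token is expired or signed for another path."""
    now = int(time.time()) if now is None else now
    expires_str, _, _sig = token.partition(".")
    expires_at = int(expires_str)
    if now >= expires_at:
        return False  # token expired
    expected = make_token(path, expires_at, secret)
    return hmac.compare_digest(token, expected)

t = make_token("/hls/sess-101/master.m3u8", expires_at=2_000_000_000)
print(token_valid("/hls/sess-101/master.m3u8", t, now=1_900_000_000))  # True
print(token_valid("/hls/sess-999/master.m3u8", t, now=1_900_000_000))  # False: wrong path
```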
So this in-house video streaming platform helped us drop Vimeo as a dependency for streaming our videos and gave us full control over them. That's how we created a win-win situation for our users and our developers.
These are our first steps in this direction and we anticipate a lot more experimentation and development on this front. If this is something that sounds interesting to you, write to us at firstname.lastname@example.org.