📼 DESIGNING AN ON-DEMAND STREAMING SERVICE ON AWS ☁️
Streaming services are surging in popularity. The reason is simple: people love to binge-watch their favourite shows. Streaming has become the preferred way to view content, and faster internet connections together with an abundance of shows to watch have accelerated the decline of traditional cable viewership.
Understanding the system design and architecture of an on-demand streaming service can be a daunting task. In this article, I break the architecture down piece by piece so that you can identify a sound way of architecting this solution. To build an architecture that is highly available, resilient, and secure, and that can also ingest, store, process, and deliver video content on demand, this solution uses the AWS services shown in this diagram:
FEATURES
Upload & Search: The user should be able to upload a video file to Amazon S3 (Simple Storage Service) and upload its metadata (title, description and tags) to a database. That database must scale to millions of concurrent users with an unpredictable workload, because it is what users query when they search for a particular video. For that reason we will use a NoSQL offering, and the most suitable service here is Elasticsearch, which indexes the metadata for full-text search and integrates with Kibana dashboards for exploring the indexed data.
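As a rough illustration of the search side, the sketch below builds the metadata document for one video and a full-text query in the Elasticsearch query DSL. The field names (`video_id`, `s3_key`) and the title boost are assumptions made for this example, not part of the original design.

```python
from typing import Any


def build_video_document(video_id: str, title: str, description: str,
                         tags: list[str], s3_key: str) -> dict[str, Any]:
    """Shape of the metadata document indexed for one uploaded video.

    The field names are hypothetical; only title, description and tags
    come from the article.
    """
    return {
        "video_id": video_id,
        "title": title,
        "description": description,
        "tags": tags,
        "s3_key": s3_key,  # where the raw upload lives in S3
    }


def build_search_query(text: str, size: int = 10) -> dict[str, Any]:
    """Full-text query across title, description and tags."""
    return {
        "size": size,
        "query": {
            "multi_match": {
                "query": text,
                # "title^2" boosts title matches over the other fields
                # (an assumed ranking choice).
                "fields": ["title^2", "description", "tags"],
            }
        },
    }
```

The returned dictionaries would be sent as the request bodies of the index and search calls of whichever Elasticsearch client the service uses.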
Encoding: Whenever a user uploads a video, it should be converted up or down to multiple quality levels, such as 720p, 1080p and 4K. For this, we use AWS Elemental MediaConvert to automatically transcode videos uploaded to S3 into formats suitable for playback on a wide range of devices.
Parallel processing: When a video is uploaded to S3, the upload event triggers a Lambda function that starts a Step Functions execution, in which each of the video conversions runs in parallel.
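The upload-to-parallel-conversion flow could be sketched roughly as follows. The first function builds an Amazon States Language definition with one `Parallel` branch per rendition; the second is the body of the Lambda that reacts to the S3 event. The rendition names, the per-rendition transcode Lambda, and the function names are all assumptions made for illustration. The Step Functions client is passed in (in a real Lambda it would be `boto3.client("stepfunctions")`).

```python
import json
from typing import Any
from urllib.parse import unquote_plus

# Hypothetical rendition names; one parallel branch is generated per entry.
RENDITIONS = ["720p", "1080p", "2160p"]


def build_state_machine(transcode_lambda_arn: str) -> dict[str, Any]:
    """Amazon States Language definition with one branch per rendition."""
    branches = []
    for rendition in RENDITIONS:
        state_name = f"Transcode{rendition}"
        branches.append({
            "StartAt": state_name,
            "States": {
                state_name: {
                    "Type": "Task",
                    # Assumed: a Lambda that submits the conversion job.
                    "Resource": transcode_lambda_arn,
                    # Pass the upload location plus the target rendition.
                    "Parameters": {
                        "bucket.$": "$.bucket",
                        "key.$": "$.key",
                        "rendition": rendition,
                    },
                    "End": True,
                }
            },
        })
    return {
        "StartAt": "TranscodeAll",
        "States": {
            "TranscodeAll": {"Type": "Parallel", "Branches": branches, "End": True}
        },
    }


def handle_upload(event: dict[str, Any], sfn_client: Any,
                  state_machine_arn: str) -> list[str]:
    """Start one execution per uploaded object in an S3 event notification."""
    started = []
    for record in event["Records"]:
        payload = {
            "bucket": record["s3"]["bucket"]["name"],
            # Object keys arrive URL-encoded in S3 event notifications.
            "key": unquote_plus(record["s3"]["object"]["key"]),
        }
        response = sfn_client.start_execution(
            stateMachineArn=state_machine_arn,
            input=json.dumps(payload),
        )
        started.append(response["executionArn"])
    return started
```

Because all branches of a `Parallel` state start from the same input, each conversion receives the same bucket and key and differs only in its rendition.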
Viewership: We do not want to serve the videos directly from S3. If our bucket lives in Northern Virginia (us-east-1) and someone wants to watch a video from South Africa, that user will experience degraded service in the form of buffering. We can solve this by using CloudFront, AWS’s CDN, to deliver the videos from locations closer to the viewers, formatted for playback on a wide range of devices. Encoded videos are stored in HLS format, which works by breaking video files into smaller downloadable segments and delivering them over HTTP. This saves network bandwidth, delivers videos faster and keeps cost down.
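To make the HLS idea more concrete, here is a minimal sketch that generates an HLS master playlist, the small text file a player fetches first to choose between quality levels. The bitrate ladder and the `<name>/index.m3u8` path layout are assumptions for illustration; in practice the transcoding service writes these files itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rendition:
    name: str
    width: int
    height: int
    bandwidth_bps: int  # peak bitrate advertised to the player


# Illustrative ladder; real bitrates depend on the encoder settings.
LADDER = [
    Rendition("720p", 1280, 720, 3_000_000),
    Rendition("1080p", 1920, 1080, 6_000_000),
    Rendition("2160p", 3840, 2160, 16_000_000),
]


def build_master_playlist(renditions: list[Rendition]) -> str:
    """Return a master playlist listing one variant stream per rendition."""
    lines = ["#EXTM3U"]
    for r in sorted(renditions, key=lambda item: item.bandwidth_bps):
        lines.append(
            f"#EXT-X-STREAM-INF:BANDWIDTH={r.bandwidth_bps},"
            f"RESOLUTION={r.width}x{r.height}"
        )
        # Each variant points at its own media playlist of short segments.
        lines.append(f"{r.name}/index.m3u8")
    return "\n".join(lines) + "\n"
```

The player picks a variant based on its measured bandwidth and can switch between them mid-stream, which is where the bandwidth savings come from.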
Censorship: Users may upload adult videos that violate the community guidelines, and we don’t want such videos to be visible to end users. Amazon Rekognition will be used to analyze uploaded videos and images and flag inappropriate content, orchestrated with Lambda and Step Functions.
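One possible way to act on the moderation results is sketched below. It assumes the response shape returned by Rekognition's video content moderation (a `ModerationLabels` list whose items carry a `Timestamp` and a `ModerationLabel` with `Name` and `Confidence`). The 80% threshold and both function names are assumptions for this example.

```python
from typing import Any


def flag_moderation_labels(response: dict[str, Any],
                           min_confidence: float = 80.0) -> list[dict[str, Any]]:
    """Keep only the labels detected with at least min_confidence percent."""
    flagged = []
    for item in response.get("ModerationLabels", []):
        label = item["ModerationLabel"]
        if label["Confidence"] >= min_confidence:
            flagged.append({
                "timestamp_ms": item["Timestamp"],  # position in the video
                "name": label["Name"],
                "confidence": label["Confidence"],
            })
    return flagged


def should_hide_video(response: dict[str, Any],
                      min_confidence: float = 80.0) -> bool:
    """Hide the video if any label clears the confidence threshold."""
    return bool(flag_moderation_labels(response, min_confidence))
```

A Step Functions task could call `should_hide_video` on the moderation result and mark the video as hidden in the metadata store before it becomes searchable.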
DESIGN STRATEGIES
Resilient: S3 Cross-Region Replication (CRR) allows you to automatically copy your data to a bucket in a different region. Having your data in more than one region helps you prepare for and recover from data loss caused by unforeseen circumstances.
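A replication rule might look roughly like the sketch below, which builds the configuration dictionary that boto3's `put_bucket_replication` accepts as `ReplicationConfiguration`. The rule ID, role ARN and bucket ARN are placeholders, and replicating the whole bucket is an assumption of this example.

```python
from typing import Any


def build_replication_config(role_arn: str,
                             destination_bucket_arn: str) -> dict[str, Any]:
    """Replicate every object in the source bucket to one destination bucket."""
    return {
        # IAM role that S3 assumes to copy objects on your behalf.
        "Role": role_arn,
        "Rules": [{
            "ID": "replicate-all-videos",  # hypothetical rule name
            "Priority": 1,
            "Status": "Enabled",
            "Filter": {"Prefix": ""},  # empty prefix matches every key
            "DeleteMarkerReplication": {"Status": "Disabled"},
            "Destination": {"Bucket": destination_bucket_arn},
        }],
    }
```

Note that replication requires versioning to be enabled on both the source and the destination bucket.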
Security: We can approach security in two main ways: security of the data, and access authorization, that is, who is allowed to access the data. Security of the data can be further divided in two: data in transit, which is encrypted using SSL/TLS, and data at rest, which is encrypted with keys managed in AWS KMS.
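Both halves of data security can be expressed as configuration. The sketch below shows a bucket policy that rejects any request not made over TLS, and the extra arguments that ask S3 to encrypt an object with a KMS key on upload. Bucket names, key IDs and function names are placeholders assumed for illustration.

```python
import json
from typing import Any


def build_tls_only_policy(bucket_name: str) -> str:
    """Bucket policy (JSON string) denying all requests sent without TLS."""
    policy = {
        "Version": "2012-10-17",
        "Statement": [{
            "Sid": "DenyInsecureTransport",
            "Effect": "Deny",
            "Principal": "*",
            "Action": "s3:*",
            "Resource": [
                f"arn:aws:s3:::{bucket_name}",
                f"arn:aws:s3:::{bucket_name}/*",
            ],
            # aws:SecureTransport is "false" for plain-HTTP requests.
            "Condition": {"Bool": {"aws:SecureTransport": "false"}},
        }],
    }
    return json.dumps(policy)


def build_encrypted_upload_args(bucket_name: str, key: str,
                                kms_key_id: str) -> dict[str, Any]:
    """Keyword arguments for an S3 upload encrypted at rest with KMS."""
    return {
        "Bucket": bucket_name,
        "Key": key,
        "ServerSideEncryption": "aws:kms",
        "SSEKMSKeyId": kms_key_id,
    }
```

With boto3, the policy string would go to `put_bucket_policy(Bucket=..., Policy=...)` and the upload arguments would be spread into `put_object(Body=..., **args)`.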
Cost optimization: Videos that are not viral can be moved from the Standard storage tier to the Infrequent Access tier, where storage costs less. You can automate this either with a lifecycle policy that transitions objects after a specified number of days, or with S3 Intelligent-Tiering, which moves objects between tiers based on their access patterns.
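The lifecycle-policy option could be sketched as below. The function builds the dictionary that boto3's `put_bucket_lifecycle_configuration` accepts as `LifecycleConfiguration`. The `videos/` prefix, the rule ID and the 30-day default are assumptions for this example.

```python
from typing import Any


def build_lifecycle_config(prefix: str = "videos/",
                           days_until_ia: int = 30) -> dict[str, Any]:
    """Move objects under prefix to Standard-IA after days_until_ia days."""
    if days_until_ia < 30:
        # S3 does not transition objects to STANDARD_IA before 30 days.
        raise ValueError("days_until_ia must be at least 30")
    return {
        "Rules": [{
            "ID": "move-cold-videos-to-ia",  # hypothetical rule name
            "Status": "Enabled",
            "Filter": {"Prefix": prefix},
            "Transitions": [
                {"Days": days_until_ia, "StorageClass": "STANDARD_IA"},
            ],
        }]
    }
```

A lifecycle rule is purely age-based, so a video that goes viral after the transition stays in Infrequent Access. Intelligent-Tiering is the better fit when access patterns are hard to predict.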
SUMMARY
In this article, we came up with a high-level architecture for a streaming service and outlined security best practices as well as the choice of metadata store. In addition, we explored important tradeoffs that appear when the system is examined more closely. All solutions have flaws; our job is to evaluate each of them and pick the most efficient one that meets all the critical requirements.