
How to Design a Streaming System like Netflix

System design: Netflix Streaming



Netflix hosts thousands of movies, and millions of users watch them around the clock. Netflix has to stream those movies without buffering, and users should be able to watch them in high definition without any lag. Let's look at a possible design for a streaming service like Netflix.



There are three different APIs that we need to design for a streaming service like Netflix.

  1. Get the movie list
  2. Get the streaming metadata of a movie
  3. Stream the movie

The movie list can be obtained in several ways, but for this discussion let us assume we fetch the top 10 movies from a database that a separate process populates daily. The client then gets the metadata of a movie, which includes the details of the files that make up the movie.

Movies are usually not stored as a single file. When a studio uploads a movie to Netflix, it is usually stored as a set of files. A large movie is broken down into smaller files, usually by time slices or by scenes. These files are then encoded into different bitrates and resolutions. When a user requests a movie, the streaming service has to decide which file to serve to the user. To stream each file, Netflix has to make sure that the files are stored in a distributed manner and are served from a server that is close to the user.
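As a rough illustration of how a service might decide which encoded file to serve, the sketch below picks the highest-bitrate rendition that fits the client's measured bandwidth. The `Rendition` schema, the `pick_rendition` helper, the 0.8 safety factor, and the file paths are all assumptions made for this example, not Netflix's actual metadata format or selection logic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rendition:
    """One encoded variant of a movie segment (hypothetical schema)."""
    resolution: str
    bitrate_kbps: int
    url: str


def pick_rendition(renditions, bandwidth_kbps, safety_factor=0.8):
    """Return the highest-bitrate rendition that fits within a fraction of
    the measured bandwidth, falling back to the lowest bitrate available."""
    if not renditions:
        raise ValueError("no renditions available")
    # Leave some headroom so small bandwidth dips do not cause buffering.
    budget = bandwidth_kbps * safety_factor
    affordable = [r for r in renditions if r.bitrate_kbps <= budget]
    if affordable:
        return max(affordable, key=lambda r: r.bitrate_kbps)
    # Nothing fits: serve the smallest file rather than failing.
    return min(renditions, key=lambda r: r.bitrate_kbps)


# Hypothetical metadata for the first segment of a movie.
SEGMENT_0 = [
    Rendition("480p", 1500, "segments/0/480p.mp4"),
    Rendition("720p", 3000, "segments/0/720p.mp4"),
    Rendition("1080p", 6000, "segments/0/1080p.mp4"),
]

if __name__ == "__main__":
    print(pick_rendition(SEGMENT_0, bandwidth_kbps=5000).resolution)
```

Because the movie is split into small files, the client can repeat this choice for every segment and move between resolutions as network conditions change.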


  • Type

    Since the data we need to store is the movie list (such as the top 10) and the metadata of the movies, we need a database that supports heavy reads. We can potentially choose a NoSQL database to support horizontal scaling across multiple regions.

  • Replication: Leader follower

    Since the majority of the requests are reads, we can use a leader-follower replication strategy. Because writes are rare and can tolerate higher latency, we can write to the leader and replicate the data to the followers synchronously, which avoids complex conflict resolution issues.
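    The leader-follower idea can be sketched with in-memory stand-ins for the database nodes. The `Replica` and `LeaderFollowerStore` classes are hypothetical names for this illustration; a real database handles replication internally, along with failures and leader election, which this sketch ignores.

```python
class Replica:
    """An in-memory stand-in for one database node."""

    def __init__(self, name):
        self.name = name
        self.data = {}

    def apply(self, key, value):
        self.data[key] = value

    def read(self, key):
        return self.data.get(key)


class LeaderFollowerStore:
    """Writes go to the leader; reads are spread across the followers."""

    def __init__(self, leader, followers):
        self.leader = leader
        self.followers = list(followers)
        self._next = 0  # round-robin cursor for reads

    def write(self, key, value):
        # Synchronous replication: the write returns only after every
        # follower has applied it, so followers never serve stale data.
        self.leader.apply(key, value)
        for follower in self.followers:
            follower.apply(key, value)

    def read(self, key):
        if not self.followers:
            return self.leader.read(key)
        replica = self.followers[self._next % len(self.followers)]
        self._next += 1
        return replica.read(key)


if __name__ == "__main__":
    store = LeaderFollowerStore(Replica("leader"), [Replica("f1"), Replica("f2")])
    store.write("top10", ["Movie A", "Movie B"])
    print(store.read("top10"))
```

    The trade-off is visible in `write`: every write waits for all followers, which is acceptable here because the movie list is updated only by a daily process while reads dominate.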



To allow for ultra-fast reads, we can use a globally distributed cache between the server and the database. The server tries to read from the cache first; if the data is not found there, it reads from the database and then writes the value to the cache so that later requests are served from the cache.
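This read path is commonly called the cache-aside pattern. A minimal sketch follows, assuming plain dictionaries stand in for the distributed cache and the database; the `CacheAsideReader` class and its `db_reads` counter are hypothetical and exist only to make the behavior observable.

```python
class CacheAsideReader:
    """Read from the cache first and fall back to the database on a miss."""

    def __init__(self, cache, database):
        self.cache = cache        # dict-like stand-in for a distributed cache
        self.database = database  # dict-like stand-in for the database
        self.db_reads = 0         # counts how often we had to hit the database

    def get(self, key):
        if key in self.cache:
            return self.cache[key]
        # Cache miss: read from the database.
        self.db_reads += 1
        value = self.database.get(key)
        if value is not None:
            # Populate the cache so the next read is served from it.
            self.cache[key] = value
        return value


if __name__ == "__main__":
    reader = CacheAsideReader(cache={}, database={"top10": ["Movie A", "Movie B"]})
    reader.get("top10")  # miss, served by the database
    reader.get("top10")  # hit, served by the cache
    print(reader.db_reads)
```

In a real deployment the cached entries would also need an expiry or invalidation step so that the daily refresh of the movie list becomes visible to readers.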



When the movie is divided into smaller files, these files are initially stored in object storage. To get the best performance, the files are then distributed globally using a CDN. However, keeping every file on the CDN would be expensive, so we need a content expiry strategy that decides which files stay on the CDN and for how long.
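One simple expiry strategy is a time-to-live (TTL): an edge server keeps a file for a fixed period after fetching it and then goes back to object storage. The sketch below assumes TTL-based expiry, which is only one of several possible strategies; the `EdgeCache` class, the dictionary used as the origin, and the injectable `clock` are all hypothetical.

```python
import time


class EdgeCache:
    """A CDN edge node that keeps each file for a fixed time-to-live."""

    def __init__(self, origin, ttl_seconds, clock=time.monotonic):
        self.origin = origin      # dict-like stand-in for object storage
        self.ttl = ttl_seconds
        self.clock = clock        # injectable so expiry can be tested
        self._entries = {}        # key -> (content, expires_at)
        self.origin_fetches = 0

    def get(self, key):
        now = self.clock()
        entry = self._entries.get(key)
        if entry is not None and entry[1] > now:
            return entry[0]  # still fresh: serve from the edge
        # Missing or expired: fetch from the origin and restart the TTL.
        content = self.origin[key]
        self.origin_fetches += 1
        self._entries[key] = (content, now + self.ttl)
        return content

    def purge_expired(self):
        """Drop expired files to free edge storage; return how many were removed."""
        now = self.clock()
        expired = [k for k, (_, expires_at) in self._entries.items() if expires_at <= now]
        for key in expired:
            del self._entries[key]
        return len(expired)


if __name__ == "__main__":
    edge = EdgeCache({"movie/seg0": b"video-bytes"}, ttl_seconds=3600)
    edge.get("movie/seg0")
    edge.get("movie/seg0")
    print(edge.origin_fetches)
```

Popular files keep getting re-fetched and so stay on the edge, while files nobody requests expire and stop consuming CDN storage.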