Thursday, January 23, 2020

Gateway Routing system design pattern

Routing requests to multiple services using a single endpoint. This pattern is useful when you wish to expose multiple services on a single endpoint and route to the appropriate service based on the request. The pattern may come in handy in system design interview problems.

Context and problem

When a client needs to consume multiple services, setting up a separate endpoint for each service and having the client manage each endpoint can be challenging. For example, an e-commerce application might provide services such as search, reviews, cart, checkout, and order history. Each service has a different API that the client must interact with, and the client must know about each endpoint in order to connect to the services. If an API changes, the client must be updated as well. If you refactor a service into two or more separate services, the code must change in both the service and the client.

Despite the differences between them, all these API endpoints have some common requirements like a secure connection, session and/or authentication token management, rate limiting, request tagging and header validations, etc.

Solution

Place a gateway in front of a set of applications, services, or deployments. Use application Layer 7 routing to route the request to the appropriate instances.

With this system design pattern, the client application only needs to know about and communicate with a single endpoint. If a service is consolidated or decomposed, the client does not necessarily require updating. It can continue making requests to the gateway, and only the routing changes.

A gateway also lets you abstract backend services from the clients, allowing you to keep client calls simple while enabling changes in the backend services behind the gateway. Client calls can be routed to whatever service or services need to handle the expected client behavior, allowing you to add, split, and reorganize services behind the gateway without changing the client.

This system design pattern can also help with deployment, by allowing you to manage how updates are rolled out to users. When a new version of your service is deployed, it can be deployed in parallel with the existing version. Routing lets you control what version of the service is presented to the clients, giving you the flexibility to use various release strategies, whether incremental, parallel, or complete rollouts of updates. Any issues discovered after the new service is deployed can be quickly reverted by making a configuration change at the gateway, without affecting clients.

Issues and considerations

• The gateway service may introduce a single point of failure. Ensure it is properly designed to meet your availability requirements. Consider resiliency and fault tolerance capabilities when implementing.
• The gateway service may introduce a bottleneck. Ensure the gateway has adequate performance to handle load and can easily scale in line with your growth expectations.
• Perform load testing against the gateway to ensure you don't introduce cascading failures for services.
• Gateway routing is level 7. It can be based on IP, port, header, or URL.

When to use this system design pattern

Use this system design pattern when:
• A client needs to consume multiple services that can be accessed behind a gateway.
• You wish to simplify client applications by using a single endpoint.
• You need to route requests from externally addressable endpoints to internal virtual endpoints, such as exposing ports on a VM to cluster virtual IP addresses.
• This system design pattern may not be suitable when you have a simple application that uses only one or two services.

Example

The gateway is commonly implemented as a proxy layer at the same level where the load balancer sits. In Kubernetes for example you could use an Ingress controller as a gateway service.

Using Nginx as the router, the following is a simple example configuration file for a server that routes requests for applications residing on different virtual directories to different machines at the back end.

server { listen 80; server_name domain.com; location /app1 { proxy_pass http://10.0.3.10:80; } location /app2 { proxy_pass http://10.0.3.20:80; } location /app3 { proxy_pass http://10.0.3.30:80; } }

To better prepare for the system design interview, we recommend reading these articles about industry proven implementations of the gateway system design pattern at scale:

Fibonacci and Matrix Exponentiation

Problem statement
The Fibonacci numbers, commonly denoted Fn form a sequence, called the Fibonacci sequence, such that each number is the sum of the two preceding ones, starting from 0 and 1. That is
and
for n > 1.
The beginning of the sequence is thus:

A common coding interview question is to compute the n-th Fibonacci number.

Naive solution

A naive way of implementing F(n) is to use recursion:

def fibonacci(n):
if n <= 1:
return n
return fibonacci(n - 1) + fibonacci(n - 2)

Such an implementation leads to an exponential runtime. The issue is that we compute the same values over and over again. Let's take a look at the recursion tree:

We can quickly estimate an upper bound for the time complexity as O(2^n). A tight bound Θ((1+sqrt(5)/2)^n) = (Fib(n)) can be determined by using generating functions and the golden ratio.

Linear solution

We can easily improve the previous solution by using memorization to avoid recomputing the same value multiple times:

fib = {
0: 0,
1: 1
}

def fibonacci(n):
if n in fib:
return fib[n]

fib[n] = fibonacci(n - 1) + fibonacci(n - 2)
return fib[n]

The time complexity is now linear O(n), but we also using O(n) memory. Let's rewrite the solution iteratively:

def fibonacci(n):
fib = [0, 1]
for i in xrange(2, n + 1):
fib.append(fib[i - 1] + fib[i - 2])
return fib[n]

Note how at each step we only need to look at the previous two values. Thus we can achieve the same result with constant O(1) additional memory:

def fibonacci(n):
if n <= 1:
return n

tmp0 = 0
tmp1 = 1
for i in xrange(2, n + 1):
curr = tmp0 + tmp1
tmp0 = tmp1
tmp1 = curr
return tmp1

Optimal solution

The linear solution is the one described in most coding interview training resources. But can we do better? As it turns out we can actually solve the problem in logarithmic time. Let's see how.

We will rewrite the Fibonacci formula using simple matrix algebra:

which is also equivalent to

now all we need to do is compute the matrix exponentiation which can be implemented in O(logN) time. The pseudocode implementation is not too difficult either:

power(result_matrix, base_matrix, exponent):
if exponent == 1:
result_matrix = base_matrix.copy()
return

if exponent % 2 == 1:
power(result_matrix, base_matrix, exponent - 1)
result_matrix = result_matrix * base_matrix
else
power(result_matrix, base_matrix, exponent / 2)
result_matrix = result_matrix * result_matrix

fibonacci(n):
base_matrix = [ [1,1], [1, 0] ]
result_matrix = []
power(result_matrix, base_matrix, n)

return result_matrix[0][1]

Solving linear recurrence problems

linear recurrence relation is an equation that defines the $n^\text{th}$ term in a sequence in terms of the $k$ previous terms in the sequence. The recurrence relation is in the form:
$x_n=c_1x_{n-1}+c_2x_{n-2}+\cdots+c_kx_{n-k}$
Where each $c_i$ is a constant coefficient.

The Fibonacci problem is a particular case of a linear recurrence of a 2nd degree with both coefficients equal to 1. The matrix exponentiation solution can be used in solving any linear recurrence problems. For example if we had to solve:

xn=6xn112xn2+8xn3

then we can build the matrix:

[  6,  -12,  8  ]   [ Xn-1 ]   [ Xn   ]
[  1,   0,    0 ] x [ Xn-2 ] = [ Xn-1 ]
[  0,   1,    0 ]   [ Xn-3 ]   [ Xn-2 ]

For a recurrence of k-th degree the matrix multiplication takes O(k^3) and the exponentiation takes O(logN) time, therefore we can solve the general problem in O(k^3 * logN) time.

Note

The logarithmic solution is only viable when we need to compute a single value for a given n. If, for example, we are asked to compute all the Fibonacci numbers up to and including n, then the linear solution is the obvious right choice.

Saturday, January 18, 2020

How relevant are the coding interview questions?

"Have you ever used any of these fancy algorithms in any of your industry roles?" - It's a question that comes up every now and then. In most of the cases, the answer is "No".

So then, why do we ask such tough algorithmic problems? It turns out, there is a strong correlation between doing well in the coding interview and being a successful software engineer. Here's a few reasons why:

• Someone who can quickly solve such a complicated problem will find it easy to deal with the daily small tasks.
• The debugging skills required during the coding interview are transferrable to industry programming.
• Good test coverage and edge case handling is critical in both situations.
• Comparing different approaches in terms of time and space complexity is something you will frequently encounter in your career.
• Writing simple, readable, nicely spaced and properly aligned code is a must when working in a large project.
• Your interaction with the interviewer translates into your ability to pair program.
• the list goes on...
But why an algorithmic question? Why not a different type of question? Wouldn't a small app/project/task be more relevant in this context?

This has a lot to do with the cost per hire. Large companies spend over 50k USD per hire. Think 50 phone screens and maybe 10 on-site interview rounds only to get one engineer to pass the bar and accept the offer. It all adds up in terms of costs: recruiting efforts, engineering time, travel expenses, lodging, etc. We need a quick, yet good enough way to identify potential hires from prospects. The concept of a 40-minute algorithmic interview proved to be quite effective and this model has been widely adopted.

Unfortunately, the correlation between a good problem solver and a solid engineer only works one-way. There are many amazing engineers who simply do not perform well in solving algorithmic questions. This is where the process gets really flawed. An experienced interviewer should be able to pick up on this, maybe twist the question a bit, and still get enough relevant signals to make a hire/no hire decision. At the same time, a less experienced interviewer may be tempted to simply reject the candidate.

Is the current format the best option for the coding interview? Maybe. Maybe not. It's certainly far from perfect. But it's good enough and it is cost effective. And it will continue to play an important role in the early stages of the interview process. So make sure to take some time to brush up on those CS Fundamentals and do some coding interview practice every once in a while.