Lightweight Kubernetes Alternative: How to Get Scalable Infrastructure in Minutes with One Command
One of the main challenges for anyone working with cloud infrastructure is ensuring scalability and reliability. If you search for "how to make cloud infrastructure scalable and reliable," you'll run into one term again and again: Kubernetes. Enterprises use Kubernetes, cloud providers sell managed Kubernetes, and even small and medium-sized projects that don't run on Lambda functions or Cloudflare Workers discuss Kubernetes and try to adopt it. But is it truly the right fit for every project? Should you really spend days, weeks, or months setting up and managing Kubernetes, and pay significant money just to keep Kubernetes itself running?
In this post, I will introduce the main concepts of developing scalable and reliable infrastructure and demonstrate how you can achieve this without Kubernetes in a matter of hours or less.
Main Concepts of Scalability and Reliability in Cloud Computing/On-Premise
The core concept is straightforward: if a server hosting your web app runs out of computing resources or stops responding, you should either add resources to that server (if your provider allows resizing it) or start a new server and redirect requests to it. Let's explore your options for handling traffic spikes and server failures:
- Only One Server with the Possibility of Adding More Resources - vCPUs and RAM (Vertical Scalability)
This approach is the simplest starting point: you order a server from a cloud provider that lets you increase or decrease vCPUs and memory without shutting the server down, such as LocalCloud servers or OVH public cloud servers. Not all cloud providers offer this flexibility (Scaleway, for instance, does not). The process looks like this: deploy your web app or apps on a server with 2 vCPUs and 8 GB RAM, and keep that configuration while you have fewer than roughly 1000 active users (the exact threshold depends on your project and tech stack). When only 20% (or 15%, or 10%, whatever margin you prefer) of CPU or RAM remains free, you resize the server through the cloud provider's dashboard or API; adding resources typically takes less than a minute. A minimal monitoring sketch that automates this decision appears after the checklist below. While the resizing itself is straightforward, you still need a tool to manage deployments, TLS certificates, request proxying, and notifications about issues. LocalCloud (a source-available platform) can help here: you run a single command to install it on a server and then use an interactive tool to manage the apps on it. LocalCloud imposes no limits; you can install as many apps as your server can handle. The principal drawback of the single-server setup is that if the server fails, your apps stop working. As long as your user base remains modest, though, this option is perfectly viable.
Here’s what you should consider:
- Mitigating attacks such as DDoS
- Receiving notifications about server issues
- Maintaining valid TLS certificates for HTTPS
LocalCloud can address these challenges, or you can explore other open-source projects for each specific issue.
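To make the resize decision concrete, here is a minimal monitoring sketch in Python. The resize endpoint, server ID, and token are hypothetical placeholders; each provider (OVH, LocalCloud, etc.) has its own API, so treat this as an illustration of the pattern rather than a working integration.

```python
# Sketch: watch the CPU load average and ask a (hypothetical) cloud provider
# API for more vCPUs/RAM when less than ~20% headroom remains.
import json
import os
import time
import urllib.request

API_URL = "https://api.example-cloud.com/v1/servers/srv-123/resize"  # hypothetical endpoint
API_TOKEN = os.environ.get("CLOUD_API_TOKEN", "")                    # hypothetical token
CPU_COUNT = os.cpu_count() or 1

def resize(vcpus: int, ram_gb: int) -> None:
    """Ask the provider to hot-resize the server (no shutdown needed)."""
    req = urllib.request.Request(
        API_URL,
        data=json.dumps({"vcpus": vcpus, "ram_gb": ram_gb}).encode(),
        headers={"Authorization": f"Bearer {API_TOKEN}",
                 "Content-Type": "application/json"},
        method="POST",
    )
    urllib.request.urlopen(req, timeout=10).close()

vcpus, ram_gb = 2, 8
while True:
    load1, _, _ = os.getloadavg()   # 1-minute load average (Unix only)
    if load1 / CPU_COUNT > 0.8:     # less than ~20% CPU headroom left
        vcpus, ram_gb = vcpus * 2, ram_gb * 2
        resize(vcpus, ram_gb)       # hot-resizes typically finish in under a minute
    time.sleep(30)
```

In practice you would also cap the maximum size, watch RAM as well as CPU, and send yourself a notification (the checklist above) instead of resizing silently.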
- DNS Load Balancing
The most basic infrastructure for distributing requests across servers might consist of just two servers, each with its own IP address. In this scenario, DNS itself balances requests between them: you add two A records (yourdomain.com -> IP_of_server_1, yourdomain.com -> IP_of_server_2), and user requests are spread across the two servers. Plain DNS round-robin does not check whether a server is actually responding, which is suboptimal but still sufficient for many projects.

To improve reliability, you can set a low TTL (Time to Live, the duration for which DNS resolvers and client devices cache a record; a TTL of 60 seconds on an A record for yourdomain.com means the IP address is cached on the user's machine for 60 seconds) and update the A records based on server availability. Suppose you have two servers with IP_1 and IP_2. For effective DNS failover, you add the two A records in your domain registrar account with a TTL of 10 seconds (not all registrars permit TTLs that low) and monitor your servers, for example by sending a HEAD health request from one server to the other every 5 seconds. If a server stops responding to the health checks, you remove the A record with that server's IP address; a minimal sketch of this loop follows this paragraph. The main advantage of this technique is that it needs no dedicated server running a load balancer (HAProxy, CaddyServer, NGINX, etc.): DNS hands out the IP address of a target server directly, which noticeably cuts cloud costs by avoiding a centralized load balancer.
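Here is a minimal sketch of the health-check loop just described: a HEAD request to the peer every 5 seconds, and after a few consecutive failures the peer's A record is deleted through the registrar's API. The registrar endpoint, record ID, and token are hypothetical; every registrar and DNS host exposes a different API.

```python
# Sketch: DNS failover. Probe the peer server with HEAD requests and delete
# its A record via a (hypothetical) registrar API when it stops answering.
import os
import time
import urllib.error
import urllib.request

PEER_HEALTH_URL = "http://203.0.113.20/health"  # the other server's health endpoint
RECORD_URL = "https://api.example-registrar.com/v1/zones/yourdomain.com/records/rec-2"  # hypothetical
FAILURES_BEFORE_REMOVAL = 3

def peer_is_healthy() -> bool:
    req = urllib.request.Request(PEER_HEALTH_URL, method="HEAD")
    try:
        with urllib.request.urlopen(req, timeout=3):
            return True
    except (urllib.error.URLError, OSError):
        return False

def remove_a_record() -> None:
    req = urllib.request.Request(
        RECORD_URL,
        headers={"Authorization": "Bearer " + os.environ.get("REGISTRAR_TOKEN", "")},  # hypothetical auth
        method="DELETE",
    )
    urllib.request.urlopen(req, timeout=10).close()

failures = 0
while True:
    failures = 0 if peer_is_healthy() else failures + 1
    if failures == FAILURES_BEFORE_REMOVAL:
        remove_a_record()  # clients stop resolving the dead server within ~TTL seconds
    time.sleep(5)
```

Re-adding the record once the peer recovers (and running the same watcher in the other direction) is left out for brevity, but both follow the same pattern.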
- Load Balancers
Another strategy for improving the scalability and reliability of your infrastructure is to deploy a dedicated server running a load balancer in front of your application servers. All user requests go to the load balancer, which picks an available server currently hosting the app and forwards the request to it. The main disadvantages of this approach are that the load balancer adds latency, is an additional point of failure, and incurs extra cloud costs. If you run only one load balancer and it fails, your users stop receiving responses, as the toy sketch below makes obvious. You could, of course, deploy several balancers and use DNS load balancing among them, but that increases complexity and cost across the whole system.
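To see what a load balancer actually does, here is a toy round-robin HTTP proxy built only on Python's standard library; the backend addresses are placeholders. This is a teaching sketch, not a production tool: HAProxy, CaddyServer, or NGINX add health checks, connection reuse, TLS, and far better performance.

```python
# Toy round-robin load balancer: each GET request is forwarded to the next
# backend in the list. Illustration only.
import itertools
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

BACKENDS = itertools.cycle(["http://10.0.0.11:8080", "http://10.0.0.12:8080"])  # placeholder IPs

class ProxyHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        backend = next(BACKENDS)  # round-robin choice
        try:
            with urllib.request.urlopen(backend + self.path, timeout=5) as resp:
                body = resp.read()
                self.send_response(resp.status)
                self.send_header("Content-Type",
                                 resp.headers.get("Content-Type", "application/octet-stream"))
                self.send_header("Content-Length", str(len(body)))
                self.end_headers()
                self.wfile.write(body)
        except (urllib.error.URLError, OSError):
            self.send_error(502)  # backend unreachable (or returned an error status)

# If this one process dies, every backend becomes unreachable at once:
# that is the single point of failure described above.
ThreadingHTTPServer(("0.0.0.0", 8080), ProxyHandler).serve_forever()
```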
- Combining DNS Load Balancing and Load Balancers
The most dependable setup, if you don't control your own hardware, is to put several load balancers in front of your web app servers and combine them with DNS load balancing. For even more reliability you could look at DNS anycast, although that is beyond the scope of this post.
Kubernetes and the Lightweight Alternative
Having covered the basics of making your project scalable, let's return to Kubernetes and look at how it handles external requests (from your app's users). Kubernetes needs an external load balancer, which means running a separate load balancer setup in addition to Kubernetes itself unless you choose managed Kubernetes. So after spending considerable time reading documentation and setting up Kubernetes, you get to manage an external load balancer as yet another task, and remember that these extra layers come at a real cost. For more information on Kubernetes load balancers, see https://www.densify.com/kubernetes-autoscaling/kubernetes-service-load-balancer/, though I suggest skipping that guide and reading below how LocalCloud addresses these issues as a significantly lighter alternative to Kubernetes.
LocalCloud removes unnecessary layers of abstraction and is built on a different principle: every machine (whether a cloud or dedicated server) includes an integrated load balancer and proxy server. Once LocalCloud is installed, you simply add an A record with the server's IP address, and LocalCloud not only balances requests across the containers on that server but also manages TLS certificates (so your web projects are served over HTTPS) and proxies requests. If you want to balance requests across multiple servers in different data centers, you can use any of the methods described above, without lengthy configuration or weeks of work. You might ask, "What about securing internal traffic between servers, say from a server hosting a web app to a database server?" That is handled as well: communication between servers is protected by a Virtual Private Network (VPN) that works out of the box. To build this kind of scalable, secure infrastructure with LocalCloud, you run a single command on each server.
Still seems a bit complex? Try it now. It will take only 5-10 minutes.