Use Knative when you can, and Kubernetes when you must.
This post offers a fresh view of Knative: not as a solution for Serverless, but as an opinionated Kubernetes. The additional automation, simplification, and other benefits offered by Knative make deploying services via Knative a better choice than deploying the same services using Kubernetes directly. Knative extends Kubernetes and allows mixing and matching Knative services with Kubernetes microservices.
As an opinionated system, Knative requires each deployed service to work in a certain way. This post analyzes the benefits Knative brings to Kubernetes microservice users and concludes that many existing microservices are already built to run as Knative services, offering an immediate benefit to their users.
What does Knative offer on top of Kubernetes?
Knative eliminates many Kubernetes complexities by automating the entire microservice deployment process. Users need only provide a high-level service definition; from it, Knative automatically derives and creates the Kubernetes resources the microservice needs to become operational. Further, because all resources are derived automatically, misconfiguring any individual resource is no longer a possibility.
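To illustrate how little the user provides, here is a sketch of a complete Knative Service definition; the service name is a placeholder and the image is the sample from the Knative documentation. From this single object, Knative derives the underlying Kubernetes resources (Configuration, Route, Revisions, and the backing Deployment):

```yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: hello            # illustrative service name
spec:
  template:
    spec:
      containers:
        - image: gcr.io/knative-samples/helloworld-go
```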
In this way, Knative takes care of the details of networking, scaling, revision tracking, and more, eliminating the need for the Kubernetes user to deploy and later maintain these resources manually, as is commonly done in Kubernetes outside of Knative.
Knative lets users deploy Kubernetes services with a much simpler and more straightforward CLI tool (see `kn`) and hides the many implementation details normally exposed to Kubernetes users. The number of configuration knobs is drastically reduced, which means Kubernetes services deployed with Knative take much less effort to both deploy and then maintain. Last, using Knative does not require a full understanding of Kubernetes, which means a higher return on the time invested.
Knative introduces automated horizontal scaling to Kubernetes services out of the box, adjusting the number of pod replicas to the actual load as needed. Both the Knative Pod Autoscaler and the standard Kubernetes Horizontal Pod Autoscaler are supported. This eliminates the need for the user to pre-provision enough capacity to absorb peak-time loads. Consequently, Knative will use fewer pods on average compared to running the same microservices on Kubernetes without Knative, allowing many more services to be served by any given Kubernetes cluster.
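As a sketch, the autoscaler choice and its scaling targets are selected with per-revision annotations on the Service's template; the target and bound values below are illustrative, not recommendations:

```yaml
spec:
  template:
    metadata:
      annotations:
        # Select the Knative Pod Autoscaler (the default) or the Kubernetes HPA.
        autoscaling.knative.dev/class: "kpa.autoscaling.knative.dev"
        # Concurrent requests per replica before scaling out (illustrative).
        autoscaling.knative.dev/target: "100"
        # Upper bound on the number of replicas (illustrative).
        autoscaling.knative.dev/max-scale: "10"
```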
For Kubernetes offered as a service, auto-scaling enables users to pay less – pay only for what is actually used, without pre-provisioning resources based on peak-time requirement projections.
One complex aspect of microservice maintenance is deploying a new revision of the microservice. Rolling out a new revision without service downtime, and being able to roll back when issues are detected in the new revision, are essential in production environments. By default, Kubernetes performs rolling updates pod by pod, with slight performance degradation during the rollout.
Knative automates revision updates and offers the user a much more controlled process that safeguards against the hazards of deploying new revisions. For example, Knative allows the user to initially route only 1% of the traffic to the new revision, gradually shift more, and remove the previous revision once sufficiently confident that the new one is safe for use.
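Such a canary rollout can be sketched in the Service's traffic block; the revision names below are hypothetical (Knative generates them from the service name):

```yaml
spec:
  traffic:
    - revisionName: my-service-00001   # previous, known-good revision
      percent: 99
    - revisionName: my-service-00002   # new revision, receiving 1% of traffic
      percent: 1
```

Shifting more traffic to the new revision is then a matter of editing the two percentages until the old revision can be dropped from the list.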
An Application Backbone
Knative serves as the application backbone. It queues and distributes application events and thus ensures that each application layer can scale independently. Knative's ability to queue events helps decouple the different application layers and absorb the application load until the relevant layer is auto-scaled.
In practice, Knative handles all inter-microservice communications between services deployed on Knative, as well as all ingress communications, such that in most cases the application no longer needs to deploy, use, or configure any additional queuing/eventing systems – a significant benefit by itself.
The Serverless Pattern: Using Knative as an Opinionated Kubernetes
As is often the case, such benefits come at a cost. In the case of Knative, the cost stems from Knative being an opinionated Kubernetes: it compels all deployed services to be designed in what we will call here “the serverless pattern”. In other words, Knative offers automation, simplification, auto-scaling, controlled revisions, and an application backbone as long as the user abides by certain design patterns when building the application. As we discuss next, these patterns are extremely common in Kubernetes microservices, or at least in those designed as web services – better yet, as RESTful web services.
Web services only
The serverless pattern assumes “amorphic compute resources” that serve “events”. In Knative, the “amorphic compute resources” are in fact user containers, each running in a Knative-controlled Pod, and the “events” are simply web requests (HTTP or HTTPS). Hence, the first limitation Knative puts on the application is that it must serve web requests; it may not serve other TCP/UDP traffic if it is to qualify as a Knative Service. Note that both HTTP/1.1 and HTTP/2 are supported; in particular, Knative supports gRPC.
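The runtime contract this implies is minimal: the container listens for HTTP on the single port Knative passes in via the `PORT` environment variable. A sketch of such a service in plain Python, with no web framework assumed (the greeting text and helper names are illustrative):

```python
import os
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class Handler(BaseHTTPRequestHandler):
    """Serves plain HTTP -- the only ingress traffic a Knative Service receives."""

    def do_GET(self):
        body = b"Hello from a Knative-style service\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, fmt, *args):
        pass  # keep the sketch quiet


def knative_port(default: int = 8080) -> int:
    """Knative injects the single port to bind via the PORT env variable."""
    return int(os.environ.get("PORT", default))


def serve_in_background(port: int) -> HTTPServer:
    """Start serving on a background thread; a real service would block instead."""
    server = HTTPServer(("127.0.0.1", port), Handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

In a real container the process would simply run `HTTPServer(("0.0.0.0", knative_port()), Handler).serve_forever()`; Knative handles routing, revision tracking, and scaling around it.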
Knative does not prevent the application from initiating communication with other backend services. For example, the service may communicate with a Redis or SQL service, even though such communication is not constructed as web requests. The Redis or SQL service itself, however, will not run as a Knative service; it may, for example, be deployed as a regular Kubernetes microservice or outside the cluster.
Binding to a single ingress port
Knative expects each service to bind to a single port. Therefore, if an existing service was designed to bind to multiple ports, the user will need to redesign it as multiple standalone services, each serving a single port, before deploying them with Knative.
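That single ingress port is either left to default to 8080 or declared explicitly in the service definition; the image name below is a hypothetical placeholder:

```yaml
spec:
  template:
    spec:
      containers:
        - image: docker.io/example/my-service   # hypothetical image
          ports:
            - containerPort: 8080   # the one port Knative routes web requests to
```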
Knative is not meant to offer nearly as many configuration options as Kubernetes, yet many common requirements of deploying web services are already supported. Knative typically serves services that use a single user container, but it can be configured to support multiple user containers, where only one of the containers binds to receive and serve external web requests.
Mix and Match Knative Services and Microservices
Since Knative extends Kubernetes, a user may decide to deploy only part of an application's workload with Knative and deploy other parts – for example, services that do not abide by “the serverless pattern” – as regular microservices. The user still gains the described benefits for the services that are deployed on Knative.
Typical Web Services
It is common for microservices to serve web requests. Specifically, microservices designed to offer a RESTful service, or inspired by RESTful design patterns, are likely to rely on web requests. Further, developers following, or inspired by, the 12-factor app approach to microservices opt to assign “each type of work to a process type” – i.e., divide the monolithic application into as many microservices as needed so that each task is performed by an independent microservice. In many cases, this results in a single service port per microservice.
We may therefore conclude that many, if not most, well-designed Kubernetes microservices already meet “the serverless pattern” required by Knative.
Knative and Kubernetes pod scaling caveats
Using Knative as an alternative to deploying regular microservices on Kubernetes works reasonably well as long as the user is not tempted by Knative's ability to scale to zero. Scaling to zero, although an important serverless feature, introduces long cold-start latencies, which clients of Kubernetes microservices do not expect. It is generally recommended that users set the Knative service to a lower bound of one replica, to maintain regular microservice Service Level Agreements (SLAs). If, however, the SLA permits a 3-5 second delay in processing a web request under normal conditions, the Knative default lower bound of zero can be used.
Further, it is common practice for Kubernetes users to choose three or more replicas per microservice. This ensures service continuity during cluster node restarts and other transient cluster states. Since Knative ensures that web requests are queued even during transient states in which the number of replicas may drop to zero, Kubernetes users are assured that web requests will not get lost. Therefore, if the SLA allows a 3-5 second service latency during cluster transient states, setting a lower bound of one will suffice. If such a delay is not permitted during transient states, consider a lower bound of two or three.
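Keeping such a floor of warm replicas is a one-line annotation on the revision template; the value shown reflects the one-replica recommendation above:

```yaml
spec:
  template:
    metadata:
      annotations:
        autoscaling.knative.dev/min-scale: "1"   # never scale below one replica
```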
Setting a minimum scale larger than two or three is not advised, since Knative will automatically scale the service up as needed based on the actual load.
Use Knative when you can, and Kubernetes when you must.
Knative offers considerable advantages to Kubernetes users deploying microservices that serve web requests. It simplifies the use of Kubernetes, automates and unifies the way services are deployed and maintained, offers unique auto-scaling and revision-control features, and serves as the application backbone, connecting services and queuing client web requests at the ingress.
Knative is therefore a better, more modern way to deploy and manage services on Kubernetes. It is the right way to deploy microservices that abide by “the serverless pattern”.
A nice anecdote: Kubernetes is defined as “an open-source system for automating deployment, scaling, and management of containerized applications”. This definition fits Knative just as well, since in practice Knative is simply a more opinionated system, offering the same service with more automation and less flexibility.