AppOps - Cloud Native Application Operations

Offering DevOps as a service might initially sound counterintuitive, as the core idea of DevOps is to bring development and operations closer together. However, we believe that software developers should be able to focus on the continuous development of the product to create ongoing value for the business. We take on aspects of DevOps, starting with deployment review and engineering, and work closely with your developers. We proactively manage key operational tasks in production: we can respond to issues around the clock, prevent problems by maintaining the components involved, and prepare for incidents involving data loss or corruption by ensuring that data is backed up. Because it’s not our application, we collaborate with developers or software suppliers to write runbooks for incidents and maintenance. These runbooks define what VSHN can and should do, and when to collaborate with or hand over to the software specialists.

Included Services

Continuous Collaboration

We know your application deployments. Through shared access to code and configuration, jointly handled Merge Requests, and documented workflows and runbooks, we’re ready to work with you and provide support, whether that means changing the deployment, scaling it, or handling incidents. Optionally, you can call us 24/7.

Component Maintenance

Container (Docker) images are the way to ship software, but once built they should already be considered outdated. You build and deploy when you release, but the third-party software and libraries involved might already have security issues or worthwhile improvements. We have processes in place to automatically spot and update such components in a curated way, usually using Renovate and Git Merge Requests. We define together what VSHN does and what should stay with the developers. Usually, VSHN handles patch-level updates weekly and creates tickets to collaborate on major version updates or software end-of-life.
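The split between patch-level updates and larger updates can be illustrated with a small sketch. This is not how Renovate or VSHN tooling is implemented. The function names `classify_update` and `handling` are hypothetical, and the code assumes plain `MAJOR.MINOR.PATCH` version strings. The handling of minor updates is not stated above, so the sketch leaves it as something to be agreed.

```python
def parse_version(version: str) -> tuple[int, int, int]:
    """Parse a plain 'MAJOR.MINOR.PATCH' string (no pre-release suffixes)."""
    major, minor, patch = (int(part) for part in version.lstrip("v").split("."))
    return major, minor, patch


def classify_update(current: str, candidate: str) -> str:
    """Return 'major', 'minor', 'patch', or 'none' for a version change."""
    cur, new = parse_version(current), parse_version(candidate)
    if new <= cur:
        return "none"  # same version or a downgrade: nothing to do
    if new[0] != cur[0]:
        return "major"
    if new[1] != cur[1]:
        return "minor"
    return "patch"


def handling(update_type: str) -> str:
    """Map an update type to an illustrative (assumed) way of handling it."""
    if update_type == "patch":
        return "weekly merge request"
    if update_type == "major":
        return "ticket for collaboration"
    if update_type == "minor":
        return "as agreed with the developers"  # assumption: not specified
    return "no action"


for candidate in ("1.4.3", "1.5.0", "2.0.0"):
    kind = classify_update("1.4.2", candidate)
    print(candidate, kind, "->", handling(kind))
```

In practice, a tool such as Renovate makes this distinction itself. The sketch only shows why a patch bump can be merged routinely while a major bump calls for a conversation.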

Environment Base Monitoring & Alerting

We react and try to figure out what is wrong with a deployment if resource limits are hit, the number of Pods does not match the deployment definition, certificates in use expire, or internal services become unavailable (usually done via Kubernetes-level metrics and alert rules).
Includes backend services provisioned in the scope of the deployment, such as databases, search, and caches (usually done through AppCat).
Includes application-relevant base metrics such as CPU, memory, and persistent volume usage.
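The conditions listed above can be restated as a short sketch. Such checks are normally expressed as alert rules on Kubernetes-level metrics, not as Python. The `DeploymentSnapshot` type, the `find_alerts` function, the alert names, and both thresholds (14 days before certificate expiry, 90% of the memory limit) are assumptions made for illustration only.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class DeploymentSnapshot:
    """Hypothetical point-in-time view of one deployment."""
    desired_replicas: int
    ready_replicas: int
    memory_used_bytes: int
    memory_limit_bytes: int
    certificate_not_after: datetime


def find_alerts(
    snapshot: DeploymentSnapshot,
    now: datetime,
    cert_warning: timedelta = timedelta(days=14),  # assumed threshold
    usage_ratio: float = 0.9,                      # assumed threshold
) -> list[str]:
    """Return the names of all conditions that currently need attention."""
    alerts = []
    # Fewer ready Pods than the deployment definition asks for.
    if snapshot.ready_replicas < snapshot.desired_replicas:
        alerts.append("replicas_below_desired")
    # Memory usage close to (or at) the configured limit.
    if snapshot.memory_used_bytes >= usage_ratio * snapshot.memory_limit_bytes:
        alerts.append("memory_near_limit")
    # Certificate already expired or expiring within the warning window.
    if snapshot.certificate_not_after - now <= cert_warning:
        alerts.append("certificate_expiring")
    return alerts


now = datetime(2024, 1, 1, tzinfo=timezone.utc)
snapshot = DeploymentSnapshot(3, 2, 950, 1000, now + timedelta(days=3))
print(find_alerts(snapshot, now))
```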

Backup and Restore Testing

As engineered together, we make sure that backup jobs run, and we react if they fail (usually done with K8up and alerting rules). How we react is part of what we define together in runbooks. We can also help you with restore tests or even disaster recovery plans.
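As a minimal sketch of what "backup jobs run, and we react if they fail" can mean, the function below flags a deployment when the latest backup run failed or when no run finished recently. It does not use the K8up API. The `backup_needs_attention` name, the `(finished_at, succeeded)` tuple format, and the 26-hour maximum age are hypothetical choices for a daily backup schedule.

```python
from datetime import datetime, timedelta, timezone


def backup_needs_attention(runs, now, max_age=timedelta(hours=26)):
    """Decide whether someone should look at the backups.

    runs: list of (finished_at, succeeded) tuples, in any order.
    """
    if not runs:
        return True  # no backup has ever run
    latest_finished, latest_ok = max(runs, key=lambda run: run[0])
    if not latest_ok:
        return True  # the most recent run failed
    # The most recent run succeeded, but it may be too old.
    return now - latest_finished > max_age


now = datetime(2024, 1, 2, 6, 0, tzinfo=timezone.utc)
runs = [
    (now - timedelta(hours=28), True),
    (now - timedelta(hours=4), False),
]
print(backup_needs_attention(runs, now))
```

What happens after such a check fires (retry, escalate, involve developers) is exactly the kind of decision that belongs in a runbook.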

Guaranteed Availability of your Application

Optionally, with well-defined Service Level Indicators (SLIs) and Service Level Objectives (SLOs), we continuously and proactively invest in preventing problems and handling incidents to meet the SLO, for example our 99.99% uptime. If we violate the objective, you’re entitled to get money back (service credits). See Service Levels for more information.
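To give a feeling for what an uptime objective such as 99.99% means, the following sketch converts it into a downtime budget. The 30-day measurement period and the function names are assumptions. The actual measurement period and definitions are those of the VSHN Service Levels.

```python
def allowed_downtime_minutes(slo_percent: float, period_days: int = 30) -> float:
    """Downtime budget in minutes for an availability SLO over a period."""
    total_minutes = period_days * 24 * 60
    return total_minutes * (100 - slo_percent) / 100


def slo_met(downtime_minutes: float, slo_percent: float, period_days: int = 30) -> bool:
    """True if the observed downtime stays within the budget."""
    return downtime_minutes <= allowed_downtime_minutes(slo_percent, period_days)


print(f"{allowed_downtime_minutes(99.99):.2f} minutes")  # prints "4.32 minutes"
```

Under these assumptions, 99.99% over 30 days leaves a budget of about 4.3 minutes of downtime, which is why proactive maintenance and well-prepared runbooks matter more than fast manual reaction alone.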

Guaranteed Availability

Optionally (see Pricing), we guarantee the combination of Service Level Indicator (SLI) and Service Level Objective (SLO) defined in the Application Monitoring Workshop. The guarantee is based on the VSHN Service Level Guaranteed Availability (all definitions apply).

Additional Exceptions

In addition to the VSHN Service Level Exceptions:

Issues in the customer’s or third-party vendor’s code that lead to problems or outages.
Changes to the application’s code, deployment configuration, or dependent services and components in which VSHN was not actively involved.

Pricing

Base Fee: To continuously know your deployment, be ready to collaborate, and do our proactive part, we charge a base fee. The fee depends on the complexity of the deployment, which is estimated together with you and reviewed every six months, as things might change.
Guaranteed Availability: The prices are per production deployment of your own or a third-party application. Automatically managed, non-critical preview, test, and staging deployments are included.
Additional Effort: During our proactive work or when responding to incidents, we might discover issues that should be addressed proactively, such as end-of-life components, or we might have other improvement suggestions. For these, you or we create tickets, we estimate the effort, you approve, and we do the work. We bill by the hour for the effective effort at VSHN default rates (hour packages at lower rates are available).

See Pricing.

How We Work Together

After the Application Deployment Engineering, we’re almost ready to go into operations mode together.

After the Production Start, it’s good not to hear from us, as this means everything runs smoothly. But we’re there for you: you (or we) create and handle Merge Requests, and, depending on your Support Plan option, you create tickets in our ticket system, call us, or work with us in our chat system. If we see anything that needs your attention, we create tickets and inform you.

Onboarding

We collaborate and define the needed runbooks together: what do we need to do, and where do we need to involve your developers?
We assess together whether we need to define Application Service Level Indicators (Application Monitoring) and whether you need Guaranteed Availability for the application. We would do this in one or more Application Monitoring Workshops.

Production Start

After the Onboarding, we need to know when you consider your application "in production". From that point on (and once we have a valid Sales Order), we do our part of the collaboration to operate your application together.

Prerequisites

We run and can take responsibility for what we engineered in collaboration with the customer’s app owner or developer (Application Deployment Engineering). Conversely, we can’t take over the operations of deployments or environments that were not created in cooperation with VSHN or that do not meet our standards.

Enabling Conditions

See AppFlow Enabling Conditions, additionally:

We can’t handle alerts from external monitoring systems not controlled by VSHN or from checks unknown to VSHN, as we can’t directly influence false positives and might not be prepared to react to such alerts. Handling them would not benefit the customer.

Scope

The scope is one production deployment (including automated test and branch deployments). This offering needs to be ordered once per production deployment. Other deployments are out of scope.