Users of cloud services are presented with a bewildering choice of VM types and the choice of VM can have significant implications on performance and cost. In this paper we address the fundamental problem of accurately and economically choosing the best VM for a given workload and user goals. To address the problem of optimal VM selection, we present PARIS, a data-driven system that uses a novel hybrid offline and online data collection and modeling framework to provide accurate performance estimates with minimal data collection. PARIS is able to predict workload performance for different user-specified metrics, and resulting costs for a wide range of VM types and workloads across multiple cloud providers. When compared to a sophisticated baseline linear interpolation model using measured workload performance on two VM types, PARIS produces significantly better estimates of performance. For instance, it reduces runtime prediction error by a factor of 4 for some workloads on both AWS and Azure. The increased accuracy translates into a 45% reduction in user cost while maintaining performance.
Month: July 2017
Introducing PWK (play with K8s)
P[lay]W[ith]D[ocker] is currently being used for different things such as:
- Try new features fast, as it’s updated with the latest dev versions.
- Set up clusters in no time and launch replicated services.
- Learn through its interactive tutorials (training.play-with-docker.com).
- Give presentations at conferences or meetups.
- Run advanced workshops that would usually require complex setups.
- Collaborate with community members to diagnose and detect issues.
We are proud to present PWK (http://play-with-k8s.com), our first iteration of a fully functional Kubernetes playground.
Abstraction Levels in Public and Private Clouds
By operating at different planes of abstraction, public cloud services and private cloud infrastructure make it virtually impossible to have a coherent, seamless hybrid cloud design. As organizations mature in their understanding and use of public services like AWS and move beyond simply treating them as rentable virtual server farms, they will internalize the public-private dichotomy and see that their hybrid cloud strategy has flaws. Until the industry better addresses the abstraction-layer mismatch, I expect to see more and more organizations rethinking their hybrid cloud plans.
Calculating Service Availability
This article expands upon the topic of SLOs to focus on service dependencies. Specifically, we look at how the availability of critical dependencies informs the availability of a service, and how to design in order to mitigate and minimize critical dependencies.
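As a rough illustration of how dependency availability bounds service availability, here is a minimal Python sketch. It assumes that every dependency is critical (the service is down whenever any one of them is down) and that failures are independent; real systems rarely satisfy either assumption exactly. The function names and the sample figures are hypothetical and are not taken from the article.

```python
from math import prod


def serial_availability(own, dependencies):
    """Best-case availability of a service whose dependencies are all critical.

    Assumes independent failures, so the individual availabilities multiply.
    """
    return own * prod(dependencies)


def redundant_availability(replica, count):
    """Availability of `count` independent replicas when any one of them suffices."""
    return 1.0 - (1.0 - replica) ** count


def downtime_minutes_per_30_days(availability):
    """Expected downtime, in minutes, over a 30-day window."""
    return (1.0 - availability) * 30 * 24 * 60


if __name__ == "__main__":
    # Hypothetical numbers: a service and two critical dependencies, each at 99.99%.
    combined = serial_availability(0.9999, [0.9999, 0.9999])
    print(f"combined availability: {combined:.6f}")
    print(f"downtime per 30 days: {downtime_minutes_per_30_days(combined):.1f} min")
```

Under these assumptions each additional critical dependency lowers the ceiling on what the service can offer, which is why reducing the number of critical dependencies, or making one of them redundant, raises the achievable availability.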
Fast Feedback Enabling DevOps Practices
This post from Subbu Allamaraju, "DevOps is not DevOps," explains the value of fast feedback loops in enabling DevOps practices.
SDN Summary Graphics
This PDF from the Open Networking Foundation summarizes:
- Basic model of SDN
- The Architecture of Software-Defined Networks
- Role of the SDN Controller
Open Policy Agent
OPA is a lightweight general-purpose policy engine that can be co-located with your service. You can integrate OPA as a sidecar, host-level daemon, or library.
Services offload policy decisions to OPA by executing queries. OPA evaluates policies and data to produce query results (which are sent back to the client). Policies are written in a high-level declarative language and can be loaded into OPA via the filesystem or well-defined APIs.
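The query flow described above can be sketched in Python against OPA's REST Data API, where a client POSTs an `input` document to `/v1/data/<policy path>` and reads the decision from the `result` field of the response. This is only a sketch: the local address (OPA's default port 8181), the policy path `httpapi/authz/allow`, and the shape of the input document are assumptions made for illustration.

```python
import json
import urllib.request

# Assumption: OPA runs as a sidecar on the same host, on its default port.
OPA_URL = "http://localhost:8181"


def build_query(policy_path, input_doc, base_url=OPA_URL):
    """Build the HTTP request that asks OPA to evaluate a policy for this input."""
    body = json.dumps({"input": input_doc}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/v1/data/{policy_path}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def decision_from_response(raw_body, default=False):
    """Extract the decision; OPA omits "result" when the policy is undefined."""
    return json.loads(raw_body).get("result", default)


def is_allowed(policy_path, input_doc):
    """Send the query to OPA and return the decision as a boolean."""
    request = build_query(policy_path, input_doc)
    with urllib.request.urlopen(request, timeout=2) as response:
        return bool(decision_from_response(response.read()))


if __name__ == "__main__":
    # Hypothetical policy path and input; requires a running OPA instance.
    print(is_allowed("httpapi/authz/allow", {"user": "alice", "method": "GET"}))
```

Treating a missing `result` as a denial is a design choice in this sketch, so that an unloaded or undefined policy does not fail open.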
Iterating DevOps
The Ship Show #46 podcast, "The Epistemology of DevOps," originally published 13 August 2014, has many takeaways. Most concern organizational and process "debt" rather than the technical issues usually discussed in the "DevOps" context.
Participants:
12:52 Primary focus of the podcast starts about here
14:13 Kevin Behr makes a remark about sensing "restlessness about scaling" in the industry. This isn't directly about the organizational/process debt discussed later, but insightful since posts and tweets about scaling concerns are pervasive today and have clearly been a concern for some time.
16:53 Engineers want to know, "Is there an RFC for DevOps so I can use it like a tool?"
17:30 We don't need to standardize DevOps (so it can be productized and sold)
18:05 Developers: "If you ops people would just expose an API so we can interact with you like robots..."
18:25 DevOps is not about optimizing for developers
20:25 Culture is an abstraction we invent to represent interactions in a system
25:30 Taylor did atomistic science, focusing on the individual. Lean focuses on the system.
26:25 Science means we don't know, so we have to keep asking in a structured way. Once we know, we do engineering.
30:35 Almost no company teaches how to improve daily
37:10 The ability to transmit information among people is the limiting factor in most organizations
44:00 Coal miners cross-training for more productivity and safety through learning in a complex, dangerous environment
46:15 Increase response repertoire
47:00 ITIL is good for simple environments where things are constrained
51:00 Difficulties in creating emergent teams to deal with problems
51:40 Meetings to deal with problems diffuse responsibility, and people think they provide safety
52:00 Ephemeral crews to deal with dynamic capabilities; cross silos like cells with permeable walls
54:00 The pragmatic maxim
55:15 Explore It! by Elisabeth Hendrickson
56:45 Heresy in DevOps is ok
59:20 What's wrong with the enterprise today? Everything is a project. Actually, we're on a permanent change footing.
61:35 We're not resources, we're humans.
62:15 Conversation ends
Noisy Neighbors in IaaS / Cloud
A good metric for noisy neighbor identification is high CPU steal. Some references: