This article expands upon the topic of SLOs to focus on service dependencies. Specifically, we look at how the availability of critical dependencies informs the availability of a service, and how to design in order to mitigate and minimize critical dependencies.
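As a rough illustration of how dependency availability bounds service availability, the sketch below multiplies availabilities together. It assumes every critical dependency is a hard, serial dependency and that failures are independent; both are simplifying assumptions of this sketch, not claims from the article, and the function names are hypothetical.

```python
from math import prod

MINUTES_PER_YEAR = 365 * 24 * 60


def composite_availability(own, dependencies):
    """Best-case availability of a service with hard, serial dependencies.

    Assumes failures are independent, so availabilities multiply.
    `own` is the service's availability ignoring its dependencies.
    """
    return own * prod(dependencies)


def downtime_minutes_per_year(availability):
    """Expected unavailable minutes per (non-leap) year."""
    return (1.0 - availability) * MINUTES_PER_YEAR


if __name__ == "__main__":
    # A "four nines" service with three "four nines" critical dependencies.
    total = composite_availability(0.9999, [0.9999, 0.9999, 0.9999])
    print(f"composite availability: {total:.6f}")
    print(f"downtime: {downtime_minutes_per_year(total):.0f} min/year")
```

Under these assumptions, a service at 99.99% with three critical dependencies at 99.99% each lands at roughly 99.96%, which is why reducing the number of critical dependencies matters as much as improving each one.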
Fast Feedback Enabling DevOps Practices
A post from Subbu Allamaraju, DevOps is not DevOps, explains the value of fast feedback loops in enabling DevOps practices.
The Ship Show podcast #46, The Epistemology of DevOps, originally published 13 August 2014, has many takeaways. Most concern organizational and process "debt" rather than the technical issues usually discussed in the "DevOps" context.
12:52 Primary focus of the podcast starts about here
14:13 Kevin Behr makes a remark about sensing "restlessness about scaling" in the industry. This isn't directly about the organizational/process debt discussed later, but it is insightful, since posts and tweets about scaling concerns are pervasive today and have clearly been a concern for some time.
16:53 Engineers want to know, "Is there an RFC for DevOps so I can use it like a tool?"
17:30 We don't need to standardize DevOps (so it can be productized and sold)
18:05 Developers: "If you ops people would just expose an API so we can interact with you like robots..."
18:25 DevOps is not about optimizing for developers
20:25 Culture is an abstraction we invent to represent interactions in a system
25:30 Taylor did atomistic science, focusing on the individual. Lean focuses on the system.
26:25 Science means we don't know, so we have to keep asking in a structured way. Once we know, we do engineering.
30:35 Almost no company teaches how to improve daily
37:10 The ability to transmit information among people is the limiting factor in most organizations
44:00 Coal miners cross-training for more productivity and safety through learning in a complex, dangerous environment
46:15 Increase response repertoire
47:00 ITIL is good for simple environments where things are constrained
51:00 Difficulties in creating emergent teams to deal with problems
51:40 Meetings to deal with problems diffuse responsibility, and people think they provide safety
52:00 Ephemeral crews to deal with dynamic capabilities; cross silos like cells with permeable walls
54:00 The pragmatic maxim
55:15 Explore It! by Elisabeth Hendrickson
56:45 Heresy in DevOps is OK
59:20 What's wrong with the enterprise today? Everything is a project. Actually, we're on a permanent change footing.
61:35 We're not resources, we're humans.
62:15 Conversation ends
Noisy Neighbors in IaaS / Cloud
A good metric for identifying noisy neighbors is high CPU steal: the share of time a virtual CPU was ready to run but the hypervisor was running something else.
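A minimal sketch of measuring CPU steal on a Linux guest follows. It assumes the aggregate `cpu` line of `/proc/stat`, where steal is the eighth counter after the label; the function names and the sampling interval are hypothetical choices for illustration, and what counts as "high" steal is left to the reader.

```python
import time


def parse_cpu_times(stat_line):
    """Return the first eight counters (user..steal) from a /proc/stat 'cpu' line."""
    fields = stat_line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    values = [int(v) for v in fields[1:9]]
    # Very old kernels report fewer columns; treat missing counters as zero.
    values += [0] * (8 - len(values))
    return values


def steal_percent(before, after):
    """Percentage of CPU time stolen between two samples of parse_cpu_times()."""
    deltas = [b - a for a, b in zip(before, after)]
    total = sum(deltas)
    if total <= 0:
        return 0.0
    return 100.0 * deltas[7] / total  # index 7 is the steal counter


def sample_steal_percent(interval=1.0):
    """Sample /proc/stat twice, `interval` seconds apart (Linux only)."""
    def read():
        with open("/proc/stat") as f:
            return parse_cpu_times(f.readline())
    before = read()
    time.sleep(interval)
    return steal_percent(before, read())
```

Guest time is already counted inside the user and nice counters, so the sketch sums only the first eight fields to avoid double counting.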
Debugging with Events
Series on debugging with events, including:
Operability – Adrian Colyer
"the most important high level things are in Hamilton’s opinion:
- Expect failures to happen regularly and handle them gracefully
- Keep things as simple as possible
- Automate everything"
The Morning Paper on Operability
See also Internet Scale Services Checklist
Containerized CI Solutions in AWS – Part 1: Jenkins in ECS via Stelligent
In this first post of a series exploring containerized CI solutions, I’m going to be addressing the CI tool with the largest market share in the space: Jenkins. Whether you’re already running Jenkins in a more traditional virtualized or bare metal environment, or if you’re using another CI tool entirely, I hope to show you […]