Recently, I spotted this question on Reddit:
This is a good question – it indicates that the poster is realizing that the public cloud is a different place than on premises!
There are three things to address:
- What does Pets vs Cattle mean?
- Does the pets vs cattle concept mean you need to change how you manage your servers in Azure?
- Should VMs in Azure be domain joined?
What is Pets vs Cattle?
The slide above describes the concept. Fundamentally, rather than caring for your pet server (giving it a name, giving it regular checkups, and nursing it at the first sign of illness), we treat the server like one of a thousand cattle: give it a number, schedule regular health checks, and shoot it when it shows signs of illness. Application of this concept varies broadly across organizations! Depending on the scenario, it could mean a policy to rebuild a server if the OS starts acting up, or it could mean rolling over servers every 30-90 minutes, like Netflix, or it could be anything in between.
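As a rough illustration of the "number it, check it, replace it" policy, here is a minimal Python sketch. Everything in it is hypothetical: the `Server` class, the `run_health_checks` function, and the `max_failures` threshold are made up for this example, and replacing a server is simulated by handing out the next number rather than by calling any real cloud API.

```python
from dataclasses import dataclass
from itertools import count
from typing import Callable, Iterator, List


@dataclass
class Server:
    number: int             # cattle get a number, not a name
    failed_checks: int = 0  # consecutive failed health checks


def run_health_checks(
    herd: List[Server],
    is_healthy: Callable[[Server], bool],
    numbers: Iterator[int],
    max_failures: int = 2,
) -> List[Server]:
    """Return the herd after one round of scheduled health checks.

    A server that fails `max_failures` checks in a row is not
    troubleshot; it is dropped and replaced by a fresh one.
    """
    next_herd = []
    for server in herd:
        if is_healthy(server):
            server.failed_checks = 0
            next_herd.append(server)
            continue
        server.failed_checks += 1
        if server.failed_checks >= max_failures:
            # Stand-in for "destroy and rebuild from a standard image".
            next_herd.append(Server(number=next(numbers)))
        else:
            next_herd.append(server)
    return next_herd


numbers = count(1)
herd = [Server(next(numbers)) for _ in range(3)]  # servers 1, 2, 3
sick = {2}                                        # pretend server 2 is acting up
for _ in range(2):                                # two scheduled check rounds
    herd = run_health_checks(herd, lambda s: s.number not in sick, numbers)
print([s.number for s in herd])  # [1, 4, 3]
```

Server 2 fails two checks in a row and is swapped for server 4; nobody logs in to find out why it was sick. How aggressive the threshold is (one failed check, or a timer that recycles everything regardless of health) is exactly the part that varies between organizations.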
Does the Pets vs. Cattle Concept Change Server Management for Azure?
No.
Azure has a lot of top-level[footnote]By top level, I mean IaaS VMs, App Services, VM Scale Sets, SQL Database, etc.[/footnote] services, and even more as you go down the stack. If your application is developed in-house, chances are you can move off of IaaS VMs to one of the more managed services such as App Services, Functions, or even VM Scale Sets. In this scenario, most of your infrastructure and deployment can be fully automated, and in many cases there’s no need for AD Domain Services.
So – should I domain join my Azure VMs?
Probably.
Most of the applications I see being moved to the cloud either involve third-party vendors, with varying levels of support, or are being moved with a “forklift, then optimize” approach. In these scenarios, it’s a great idea to extend your Active Directory to the cloud, and domain join the VMs. One way to automate configuration of your VMs is, in fact, through Active Directory. It’s not the newest or most capable tool, and I don’t recommend relying on it as a configuration management tool, but it certainly makes management a lot easier. Centralized identity management alone goes a very long way towards making configuration better.
In general, good architecture would avoid using Azure IaaS VMs. If you use a scalable architecture (VM Scale Sets), you may want to domain join the VMs for manageability or common authentication, but that requires some automation around destroying VMs so that stale computer accounts do not pile up in the domain (VM Scale Sets will only recycle VM names every 10,000 VMs, if I recall correctly, but it will recycle; in any case, having 10,000 VMs without a home is not a great starting point).
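The cleanup automation mentioned above mostly comes down to a set difference: which computer accounts in the domain no longer have a running instance behind them? A minimal Python sketch of that comparison follows. The function name, the `web` prefix, and the sample names are all hypothetical; in practice the two lists would come from a directory query and from the scale set's instance list, and the actual deletion step is left out on purpose.

```python
from typing import Iterable, List


def orphaned_computer_accounts(
    domain_computers: Iterable[str],
    live_instances: Iterable[str],
    prefix: str,
) -> List[str]:
    """Computer accounts under `prefix` that no longer have a running VM."""
    # Computer names are compared case-insensitively.
    live = {name.lower() for name in live_instances}
    return sorted(
        name
        for name in domain_computers
        # Only consider accounts that match this scale set's naming prefix,
        # so unrelated servers are never flagged.
        if name.lower().startswith(prefix.lower()) and name.lower() not in live
    )


# Hypothetical sample data for illustration only.
domain_computers = ["web000000", "web000001", "web000002", "SQL01"]
live_instances = ["web000001", "WEB000002"]
print(orphaned_computer_accounts(domain_computers, live_instances, "web"))
# ['web000000']
```

Filtering on the prefix first is the important design choice here: a cleanup job that deletes anything it does not recognize will eventually delete a pet.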
If you’re using VMs as a stop-gap as part of a longer-term migration (and this is, indeed, a stop-gap approach), OR you have no on-premises Active Directory, you may be able to make use of Azure Active Directory Domain Services. This can be a confusing topic: AAD DS is a managed service that looks like Windows Server Active Directory to your VMs, and has your users synchronized into it already[footnote]It’s important to note that the user objects are not the same as your on-premises users – they just look the same. It’s a different domain, with the same name.[/footnote]. This means you get many of the benefits that Active Directory provides (such as single sign-on) without having to manage and pay for the AD DS servers, and you have an Active Directory for the application to use.
Does the pets vs cattle concept have relevancy on-premises?
Absolutely!
Troubleshooting operating system (and many application) issues is, in most cases, a waste of your time! Most IT organizations have already moved in this general direction, and applications increasingly support this mode of operation. Exporting data, configuration, and settings is generally quite well-supported now, and most organizations can start up a new barebones VM from a standard image in a few hours, or less. Setting up a clean VM, reinstalling the applications, and restoring the data is usually something that can be accomplished inside a day, whereas troubleshooting a server can take weeks, and you may never know what fixed the issue.