We went through an exercise this week to consolidate three SQL Server instances into two new ones. The old instances were a mix of SQL 2000 and 2005 systems running on either 32-bit or 64-bit hardware – the new instances are both x64, but we had to leave one at SQL 2000 because of a legacy application that doesn’t support 2005.
From my perspective as the development manager, this really wasn’t that big of a deal. With the exception of Microsoft CRM, all of our applications have their connection information centralized in their own config files. CRM is more annoying, in that you have to change a DSN and then run a configuration utility – and the utility requires a super-powerful account to run properly. We also have a number of custom frameworks and applications that “hang off” of our CRM implementation, each of which required its own configuration change.
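For the config-file-driven applications, a server move is mostly a one-line edit per app. Assuming .NET-style config files (the server, database, and connection names below are hypothetical, not our actual values), the relevant section looks something like this:

```xml
<configuration>
  <connectionStrings>
    <!-- Hypothetical entry: repoint "Data Source" at the new instance
         and any code that reads the config picks it up on restart. -->
    <add name="OrdersDb"
         connectionString="Data Source=NEWSQL01;Initial Catalog=Orders;Integrated Security=SSPI;"
         providerName="System.Data.SqlClient" />
  </connectionStrings>
</configuration>
```

Multiply that one-line edit by a dozen applications, though, and it’s easy to see how entries get missed or transposed.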
The process, predictably, took longer than expected. There were occasional problems restoring the databases, and we weren’t really able to start the configuration effort until around 9pm. And then the problems started. Accounts lost their permissions in the restore effort. SQL 2005 enforces case-sensitive passwords (SQL 2000 only did if you’d installed with a case-sensitive collation), and mismatches between the config files and the passwords caused accounts to get locked out temporarily. Some config entries were missed or transposed. Overall, kind of a mess, and the configuration effort took until around 1am. I ended up leaving around 2:30, and the database guys didn’t leave until 4 – and they just went home and resumed their work remotely.
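Accounts losing permissions after a restore is usually the classic “orphaned user” problem: the restored database’s users still reference the login SIDs from the old server, which don’t match (or don’t exist) on the new instance. On SQL 2000/2005 the standard fix is `sp_change_users_login`; a sketch, assuming a SQL login named `app_user` (a hypothetical name) that exists on both servers:

```sql
-- Run in the restored database.

-- List database users whose login mapping was broken by the restore
EXEC sp_change_users_login 'Report';

-- Re-link the database user to the login of the same name on the
-- new instance ('Auto_Fix' matches on name; on 2005 it can also
-- create the login if you supply a password argument)
EXEC sp_change_users_login 'Auto_Fix', 'app_user';
```

Running the `'Report'` step first as part of the restore checklist would have caught these before anyone tried to log in.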
On Monday we will have a retrospective to figure out what we can learn from this. My focus may be on how we centralize our configuration management, but that’s a bit of a double-edged sword: centralizing makes maintenance easier, but it also creates a single point of failure if something bad happens. There are of course ways to mitigate that. I’m just hoping we can turn this into a valuable experience.