High Availability: MassTransit 2.x with Clustered MSMQ – Part 2


Introduction

In this post, we’ll look at configuring a Windows Failover Cluster and then installing an MSMQ role onto the cluster. If you’re looking for how to configure MassTransit 2.x against an existing clustered MSMQ role, you might want to skip ahead to Part 3 (coming soon).

Disclaimer: I’ve personally configured MSMQ in four separate Failover Clusters to date, and the experience was completely different each time. I can’t cover every problem you might encounter, but I present here the consolidated experience I have had to endure, in case it saves you time and headaches.

Prerequisites – Per-Machine Configuration

In order to be successful, machines participating in the cluster need to be configured identically before clustered roles are installed. I’ve found that it’s better to have the configuration set before even adding them to a failover cluster.

Windows Features

We need the following to be installed on all nodes in the cluster:

  • Message Queuing
    • Message Queuing Services
    • Message Queuing Server
    • Directory Service Integration
    • Message Queuing Triggers
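
If you would rather script the feature installation than click through Server Manager, something like the following may work. This is a minimal sketch, assuming Windows Server 2012 or later (where Install-WindowsFeature is available) and assuming the feature names on your build match the ones shown; check them with the first command before installing.

```powershell
# Run in an elevated PowerShell session on every node.
# Assumption: Windows Server 2012+; feature names may differ on other versions.
Import-Module ServerManager

# List the MSMQ-related features and their current install state.
Get-WindowsFeature -Name MSMQ* | Format-Table Name, DisplayName, InstallState

# Install the MSMQ server, Directory Service Integration and Triggers.
Install-WindowsFeature -Name MSMQ-Server, MSMQ-Directory, MSMQ-Triggers
```

Running the same script on each node is one way to keep the per-machine configuration identical.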


Important! Message Queuing must be running in Directory mode, not Workgroup mode. Once MSMQ is installed and running, check the following Windows Registry value to ensure the mode is correct:

HKLM\Software\Microsoft\MSMQ\Parameters
Workgroup = 0
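
The same check can be done from PowerShell instead of regedit. This is a minimal sketch that only reads the value; the messages are illustrative, and the case where the value is missing is reported separately rather than interpreted.

```powershell
# Read the Workgroup value from the MSMQ parameters key (read-only check).
$key   = 'HKLM:\SOFTWARE\Microsoft\MSMQ\Parameters'
$value = (Get-ItemProperty -Path $key -Name Workgroup -ErrorAction SilentlyContinue).Workgroup

if ($null -eq $value) {
    # The value (or the key) was not found; this sketch does not guess the mode.
    Write-Warning "No Workgroup value found under $key; check manually."
} elseif ($value -eq 0) {
    Write-Output 'Workgroup = 0 (the value expected for Directory mode).'
} else {
    Write-Warning "Workgroup = $value; MSMQ does not appear to be in Directory mode."
}
```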


Shared Disk

You’ll also need a shared disk made available to all potential members of the Windows Failover Cluster. The clustered MSMQ role needs to be able to fail its shared disk over to an active node.

Installing the Failover Cluster

Install the Failover Clustering feature on each candidate node (each server which will participate in the cluster). 

Note that the user performing the installation and configuration must be a member of the Enterprise Admins and Domain Admins security groups within the domain. Don’t forget to validate the cluster once it is configured. Setup should create a new cluster AD object in the same OU as the member servers.
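
For reference, the feature install and the validation step can also be run from PowerShell. This is a rough sketch, assuming Windows Server 2012 or later; NODE1 and NODE2 are hypothetical node names, so substitute your own.

```powershell
# Run elevated on each candidate node: install the Failover Clustering
# feature together with its management tools (Windows Server 2012+).
Install-WindowsFeature -Name Failover-Clustering -IncludeManagementTools

# Run once from any node: validate the candidate nodes.
# NODE1 and NODE2 are placeholder names, not taken from a real environment.
Import-Module FailoverClusters
Test-Cluster -Node NODE1, NODE2
```

Test-Cluster writes a validation report; review it before moving on, as you would the report produced by the validation wizard.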


Each cluster member needs to be configured to access the appropriate shared disk resources (e.g. iSCSI, NFS mounts, etc.). The MSMQ clustered role requires a shared disk in order to function.

I’m not going to walk through the setup, as it is pretty straightforward.

Once all the cluster members are installed and configured, run Failover Cluster Manager on one of the nodes and you can proceed to installing MSMQ.

Configuring the MSMQ Cluster Role

Prerequisites

  • IP Address reserved within the cluster subnet (optionally, create a DNS A record to match the network name you wish to use)
  • Shared disk between cluster nodes

Once the failover cluster has been successfully created, there are a couple of permissions changes which need to be made before you can begin to install the MSMQ cluster role.

Permission Changes & Role Installation

In Active Directory, there should be a computer object which represents the cluster name, e.g. BUS10$

The cluster AD object needs to have “Create all child objects” permissions on all member server objects in Active Directory, i.e. under the Servers OU (if that’s where they are located).

Using the role wizard in Failover Cluster Manager, select MSMQ and follow the dialogs. You’ll need to provide the Client Access Point name (like a NetBIOS name) and an IP address.

The MSMQ role installation wizard tries to create a DNS A record for you (for the network name of the MSMQ queue), but requires permissions within DNS to do so. I find it’s cleaner to not have to set this permission, and just create the record manually (if the record exists, or the object doesn’t have permissions, it doesn’t break the role).


Next, select a shared disk for the resource and click through to finish. See the troubleshooting section for help if you encounter issues.

In one environment, the MSMQ Cluster Service wouldn’t install on either of the cluster nodes, even after many, many attempts. The only path forward was to temporarily add the queue’s AD computer object (BUSQ) to the Domain Admins group. Once the role had failed over to every node (31 and 32 in that environment), this membership could be safely revoked.
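
If you end up needing the same workaround, the temporary group membership can be scripted. This is only a sketch, assuming the ActiveDirectory PowerShell module (RSAT) is installed and that the queue's computer account is named BUSQ as in the example above; treat it as a last resort and revoke the membership as soon as the role has come online on every node.

```powershell
# Assumption: the ActiveDirectory module (RSAT) is available on this machine.
Import-Module ActiveDirectory

# Temporarily add the MSMQ network name's computer account (BUSQ$) to Domain Admins.
Add-ADGroupMember -Identity 'Domain Admins' -Members 'BUSQ$'

# ... bring the role online and fail it over to each node in turn ...

# Revoke the membership again once failover has succeeded on every node.
Remove-ADGroupMember -Identity 'Domain Admins' -Members 'BUSQ$' -Confirm:$false
```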

Troubleshooting

Generate Cluster Logs

Open an elevated PowerShell prompt on a cluster node and type: Get-ClusterLog -TimeSpan X

This writes the last X minutes of the cluster log to C:\Windows\Cluster\Reports\Cluster.log (on the system drive). It sometimes provides more accurate information, but beware: it can be incredibly verbose, so make sure you limit the span of the log with -TimeSpan. I’ve had this command (sans -TimeSpan) generate a half-terabyte log file before!
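
A slightly fuller version of the command is sketched below. It assumes you want the logs copied to a folder of your choosing (C:\Temp\ClusterLogs is a hypothetical path) and that searching for the text "MSMQ" is a useful first filter; both are assumptions rather than anything the cluster requires.

```powershell
Import-Module FailoverClusters

# Hypothetical destination folder; create it if it does not exist yet.
$dest = 'C:\Temp\ClusterLogs'
New-Item -ItemType Directory -Path $dest -Force | Out-Null

# Collect only the last 15 minutes of cluster log and copy it to $dest.
Get-ClusterLog -TimeSpan 15 -Destination $dest

# Show the first few lines that mention MSMQ (the pattern is an assumption).
Select-String -Path (Join-Path $dest '*.log') -Pattern 'MSMQ' |
    Select-Object -First 20
```

Keeping -TimeSpan small and reproducing the failure just before collecting the log keeps the output manageable.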

Summary

This leads into the next post – how to install and run MassTransit against the clustered MSMQ queue.  Coming soon.


About Rob Sanders

IT Professional and TOGAF 9 certified architect with nearly two decades of industry experience, 18 years in commercial software development and 11 years in IT consulting. Check out the "About Rob" page for more information.
