How to troubleshoot Service Broker problems
By Mladen Prajdić
on 21 August 2007
| 0 Comments
| Tags: Service Broker
Article Series Navigation:
In my first and second article about Service Broker
I've shown how to build a central data repository on one server with one
database that stores data (fist article) and across multiple servers with a
single server that stores all data (second article). In this one I'll discuss
some of the problems that can arise and how to troubleshoot them.
Tools at our disposal
Profiler
Profiler now has a whole section of events dedicated just to
Service Broker. Note that Service Broker conversations are always done between
two points. This means that you have to monitor both points with profiler to
get the whole accurate picture of what is happening.
These events are:
- Broker:Activation fires when a queue monitor starts an
activation stored procedure.
- Broker:Connection reports the status of a transport
connection managed by Service Broker.
- Broker:Conversation reports the progress of a
conversation.
- Broker:Conversation Group fires when a conversation group
is created or dropped.
- Broker:Corrupted Message fires when a corrupt message is received.
- Broker:Forwarded Message Dropped fires when a message
meant for forwarding was dropped.
- Broker:Forwarded Message Sent fires when a message is
successfully forwarded.
- Broker:Message Classify fires when routing for a message
has been determined.
- Broker:Message Undeliverable fires when a received message
that should have been delivered to a service in this instance can't be
retained.
- Broker:Queue Disabled fires when message poisoning was
detected. This means there were five consecutive transaction rollbacks on
a Service Broker queue. It contains the database ID and queue ID of the
queue that contains the poison message.
- Broker:Remote Message Acknowledgement fires when a
message acknowledgement is sent or received.
- Broker:Transmission fires when a transport error occurs in
the transport layer. The error number and state values indicate the source
of the error.
- Security Audit:Audit Broker Login reports audit messages
related to Service Broker transport security.
- Security Audit:Audit Broker Conversation reports audit
messages related to Service Broker dialog security.
Some of these events have a EventSubClass column that
provides more information about the event so be sure to include that column. I
usually simply select all of the events and all of the columns. This provides
all the info you can get from traces for better analysis.
Catalog Views and Dynamic Management Views
There are 4 DMV's for service broker
- sys.dm_broker_activated_tasks returns a row for each
stored procedure activated by Service Broker. It can be joined to
dm_exec_sessions.session_id via the spid column.
- sys.dm_broker_connections returns a row for each Service
Broker network connection.
- sys.dm_broker_forwarded_messages returns a row for each
Service Broker message that an instance of SQL Server is in the process of
forwarding.
- sys.dm_broker_queue_monitors returns a row for each queue
monitor in the instance. A queue monitor manages activation for a queue.
There are 11 Catalog views. These hold all the necessary information
to correctly diagnose problems. It's important to know what to search for and
where. That's why I'll go over each catalog view and describe in which
situation it's useful.
sys.transmission_queue
This catalog view is your first stop when troubleshooting
Service Broker. sys.transmission_queue is a crucial catalog view for Service
Broker operation because every sent message sits in it until the target sends
back a acknowledgement. If an acknowledgement returns successfully then the
message will disappear from the view. If not the transmission_status column
will hold the error information. If your message doesn't reach its
destination look at this catalog view to see what happened.
sys.conversation_endpoints
This catalog view holds a row for each conversation that the
Service Broker participates in. The conversation_id column has the same value
on both ends. Note here that if you want to close a conversation you have to
close it using a conversation_handle value which is different on each
conversation end. You have to look it up via the conversation_id value.
The state_desc column holds the status of the conversation. If
messages can't reach the other end look here so see the state of the
conversation which must be 'Conversing' for proper two way functioning.
sys.conversation_groups
This catalog view contains a row for each conversation
group. You won't be using this one much since the same information is available
in sys.conversation_endpoints
sys.remote_service_bindings
This catalog view contains a row for each remote service
binding. Remote service bindings are used for implementing dialog security.
This means that you can view which service is bound to which user and which
permissions. If you suspect problems with dialog security this is the 2nd place
to look in after sys.transmission_queue followed by running Profiler.
sys.routes
This catalog views contains one row for each created route.
Routes are used to locate the network address for a service. Here you search
for information about which route deals with which remote Service Broker and
the lifetime of the route. Expired routes are hard to debug so check here if
you suspect this. Route address can be LOCAL, TRANSPORT or the IP or DNS name of
the computer.
sys.service_contracts
This catalog view contains a row for each contract in the
database. Not much other help you can get from this one.
sys.service_contract_message_usages
This catalog view contains a row for each contract-message
type pair. This one is more useful than the previous one. If you suspect that
the service doesn't support the message type you expect it to support (for
example a typo when creating it) or if you set the wrong end of the
conversation to use it you can check here if you're right.
sys.service_contract_usages
This catalog view contains a row for each service-contract
pair. That's about it.
sys.service_message_types
This catalog view contains a row for each message type
registered in the service broker. If you suspect message validation issues this
is the place to verify.
sys.service_queue_usages
This catalog view returns a row for each reference between
service and service queue. A service can only be associated with one queue
while a queue can be associated with multiple services. Since one queue can
accept messages from more than one service you can check here which services
are bound to which queues.
sys.services
This catalog view contains a row for each service in the
database. Helps get a clearer view of service ownership and queue use.
How to go about it?
It's important to know how the full conversation between
endpoints occurs:
- Initiator sends the message.
- Target receives the message
- Target sends back the acknowledgement of successfull
message receive
- Initiator accepts the acknowledgement
The two main starting points are sys.transmission_queue and
sys.conversation_endpoints. The first one will tell you what errors
happened when sending messages. The other will tell you if the conversations
you're sending your messages on are valid and are being used properly.
When you suspect a problem, look into these two on the
Initiator side. If you don't find anything wrong so far, start up Profiler and
attach it to both the Target and Initiator with all of the events for a clearer
picture. When debugging send problems catalog views are your friend and when
debugging receive problems the profiler is your friend.
Also don't forget to look at your queues if they hold any
errors in them. After five failed concurrent transactions you will get what is
known as Message Poisoning after which your queue will get disabled and a
Broker:Queue Disabled event will be fired. If this happens you will have
problems like I did. So be sure to handle this in your
activation stored procedure on the target part of the conversation.
Conclusion
Hopefully, I shed some light on what each part of the
service broker helper system objects does and what to use on which occasion. It
may look hard at the beginning but as you go along you'll see that the same
patterns on how to troubleshoot usually repeat themselves.