From Reddit: "A single SQL query got stuck for millions of seconds and we only noticed it after it filled up the log partition, the disk usage alert was our only signal."
Does reading these lines feel familiar? This is not an edge case. It is a real problem experienced in IT environments, where teams suffer silently for weeks, only realizing there is an issue when the disk is full or users complain. And unfortunately, general infrastructure monitoring tools are full of blind spots for these types of problems and fall short.
In this article, we will examine both the technical anatomy of this problem and how the SolarWinds Database Observability family — particularly SQL Sentry — can detect SQL query and similar crises before they happen.
Anatomy of the ProblemWhy Wasn't the Monitoring Tool Used Enough?
Infrastructure monitoring tools like PRTG are designed to monitor OS-level metrics such as CPU, RAM, disk I/O, and network. While these metrics may look normal, a disaster could be growing inside SQL Server. This is because SQL Server has a highly complex resource management layer of its own.
From Database Monitoring to Database Mastery →Critical Blind Spot: When a SQL query is in a WAITING state, it does not consume any CPU. At the OS level, it looks like nothing is happening, but the transaction remains open, continues to hold locks, and continues to consume log space.
A SQL query stuck for minutes was most likely waiting in a lock chain. It is impossible to see this from PRTG, because nothing looks abnormal at the OS level.
Silent Death in SQL Server: Wait Statistics
SQL Server records the reason why a SQL query is waiting through wait statistics. These live in DMVs (Dynamic Management Views) such as sys.dm_exec_requests, sys.dm_os_wait_stats, and sys.dm_exec_sessions.
Running this query manually means nothing on its own — the problem must be monitored continuously and automatically, not just at a single moment. This is exactly where the difference between infrastructure tools and a true database observability platform becomes clear.
"The phrase 'we didn't notice until the disk started filling up' is not a monitoring problem, it is an observability problem. The difference between watching and understanding lies exactly here." — SQL Sentry Team, SolarWinds
What Does SQL Sentry See?
SQL Sentry is the SQL Server-specific component of the SolarWinds Database Observability family. Far beyond OS-level metrics, it monitors the inside of the SQL Server engine in real-time.
Dimensions Monitored in Real-Time
- Top SQL & Query Plans: Which query has been running for how long, has the query plan changed since the last execution?
- Blocking Chains: Which session is blocking whom? Where is the root cause of the block chain?
- Wait Statistics: Real-time analysis of wait types like LCK_M_X, PAGEIOLATCH, CXPACKET
- Deadlock Graph: Analyze deadlocks visually instead of XML, keep historical records
- Transaction Log Usage: Why the log filled up, which session consumed the log space
- Tempdb Pressure: Pre-detect tempdb allocation failures
- Index Usage Analysis: Missing index recommendations, unused index detection
-
01
DMV (sys.dm_exec_requests): The raw data source.
-
02
SQL Sentry Engine Analysis: Interpretation and correlation of the data.
-
03
Threshold Breach — Alert: Detection of an abnormal situation.
-
04
Automated Action or Notification: Alerting the relevant teams.
-
05
Root Cause Analysis: Finding and resolving the source of the problem.
Alert Configuration
In SQL Sentry, the rule "generate an alert for any query that takes longer than 30 seconds" can be created directly from the UI. Not only that — composite conditions such as who is at the root of the blocking chain, why a wait type is taking so long, or whether log usage has passed a critical threshold can also be defined. When an alert is triggered, you don't just receive a notification; the relevant query text, execution plan, and session details can also be viewed instantly.
| Criterion | Previous State (PRTG) | With SQL Sentry |
|---|---|---|
| Alert Trigger | When disk starts filling up (Reactive) | Instant alert when 30 sec threshold is breached (Proactive) |
| Query Detection | No detection of stuck queries | Blocking chain is visually monitored |
| Root Cause | Investigated manually | Wait type → automated root cause detection |
| Response Speed | Delayed response because OS metrics appear normal | Query plan and session details are ready when alert triggers |
Understanding the Blocking Chain: The Invisible Crisis
In the scenario on Reddit, here is what most likely happened: A query was running inside an explicit transaction and held an LCK_M_X (exclusive lock) on a row/page. Then a network disconnect, an application crash, or another reason caused the session to fall into a zombie state. The transaction neither committed nor rolled back. The lock continued to be held.
SQL Sentry displays this chain as a live visual graph. Who is the root blocker, how many sessions are affected, what is the total wait time — all of these are on a single screen. Moreover, past blocking events are also saved; you can answer the question "what exactly happened that night?" after the incident.
EcosystemDatabase Observability Family: Integrated Power
SQL Sentry is powerful on its own, but the SolarWinds Database Observability family goes far beyond that. Database Performance Analyzer (DPA) historically analyzes query-level wait time and shows exactly which query slowed down, when, and why, going back hours or days.
- SQL Sentry: Real-time monitoring, blocking, deadlock, query plan analysis — an X-ray of the inside of SQL Server
- Database Performance Analyzer: Historical wait analysis, trend detection, capacity planning
- SolarWinds Observability: Full-stack visibility — infrastructure + application + database on a single platform
- Database Mapper: Dependency map — which application is hitting which database with which query
When used together, this family doesn't just say "there's a problem right now" — it also answers questions like "why did this problem occur, when did it start, what query plan change triggered it, after which application deployment did it get worse". The shift from reactive monitoring to proactive observability is made possible by this exact integrated approach.
How Observability Makes a Difference in Database Status Monitoring → Practical Advice3 Things You Can Do Right Now
Whether SQL Sentry is installed or not, these three configurations will bring your SQL Server environment to a much healthier point:
- Enforce a query timeout:
CommandTimeoutmust absolutely be set at the application layer. No query should run indefinitely — this should be a common rule for both the application and the DBA side. - Blocked process threshold: Set SQL Server's own
blocked process thresholdsetting to 20-30 seconds. This activates SQL Server's own Extended Events mechanism and works integrated with SQL Sentry. - Transaction log monitoring: Monitor the size and growth rate of the log file at the SQL level, not in PRTG — knowing only the size without knowing which session is consuming the log space is insufficient.
How Long Have Your Queries Been Waiting Silently?
Bring true visibility to your SQL Server environment with SQL Sentry and the SolarWinds Database Observability family. Don't wait until the disk is full.
Request a Demo for SolarWinds Database Observability! →