<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Reviewing the Windows Event Logs to Find Hints That Can Cause Availability Issues &#8211; Edwin M Sarmiento</title>
	<atom:link href="https://www.edwinmsarmiento.com/reviewing-the-windows-event-logs-sqlserver-ha/feed/" rel="self" type="application/rss+xml" />
	<link>https://www.edwinmsarmiento.com</link>
	<description>Intentional Excellence</description>
	<lastBuildDate>Mon, 13 Apr 2026 21:00:49 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	
<site xmlns="com-wordpress:feed-additions:1">84283043</site>		<item>
		<title>Reviewing the Windows Event Logs to Find Hints That Can Cause Availability Issues</title>
		<link>https://www.edwinmsarmiento.com/reviewing-the-windows-event-logs-sqlserver-ha/</link>
		<comments>https://www.edwinmsarmiento.com/reviewing-the-windows-event-logs-sqlserver-ha/#respond</comments>
		<pubDate>Tue, 15 Nov 2016 18:19:13 +0000</pubDate>
		<dc:creator>Edwin M Sarmiento</dc:creator>
				<category><![CDATA[SQL Server]]></category>
		<category><![CDATA[SQL Server Availability Groups]]></category>
		<category><![CDATA[SQL Server Failover Clustered Instances]]></category>
		<category><![CDATA[Windows Event Logs]]></category>
		<guid isPermaLink="false">http://www.edwinmsarmiento.com/?p=3136</guid>

				<description><![CDATA[Troubleshooting Availability of SQL Server Workloads Running on Windows Server Failover Cluster - Part 3. SQL Server Failover Clustered Instances (FCI) and Availability Groups (AG) depend a lot on Windows Server Failover Clustering (WSFC). Understanding how the underlying WSFC platform works can help us maintain availability of our databases. In the first blog post in this series, I talked about how to use the Cluster Dependency Report to identify the potential [&#8230;]]]></description>
					<content:encoded><![CDATA[<p><em id="gnt_postsubtitle" style="color:#770005;font-family:'Helvetica Neue', Helvetica, Arial, sans-serif;font-size:1.3em;line-height:1.2em;font-weight:normal;font-style:italic;">Troubleshooting Availability of SQL Server Workloads Running on Windows Server Failover Cluster - Part 3</em></p> <img width="554" height="555" src="https://www.edwinmsarmiento.com/wp-content/uploads/2016/11/EventLogFiltered.jpg" class="featured-image wp-post-image" alt="" srcset="https://www.edwinmsarmiento.com/wp-content/uploads/2016/11/EventLogFiltered.jpg 554w, https://www.edwinmsarmiento.com/wp-content/uploads/2016/11/EventLogFiltered-150x150.jpg 150w, https://www.edwinmsarmiento.com/wp-content/uploads/2016/11/EventLogFiltered-300x300.jpg 300w, https://www.edwinmsarmiento.com/wp-content/uploads/2016/11/EventLogFiltered-35x35.jpg 35w, https://www.edwinmsarmiento.com/wp-content/uploads/2016/11/EventLogFiltered-399x400.jpg 399w, https://www.edwinmsarmiento.com/wp-content/uploads/2016/11/EventLogFiltered-82x82.jpg 82w" sizes="(max-width: 554px) 100vw, 554px" /><p>SQL Server Failover Clustered Instances (FCI) and Availability Groups (AG) depend a lot on Windows Server Failover Clustering (WSFC). Understanding how the underlying WSFC platform works can help us maintain availability of our databases.</p>
<p style="text-align: left;"><div style="background-color:#eeeeee;border:1px solid #D6D6D6;font-family:arial,helvetica,sans-serif;font-size:15px;line-height:20px;margin:8px 0 20px;padding:15px 20px;"><span style="color: #333333;"><em>This blog post is the third in a series that talks about the process that I follow when troubleshooting availability of SQL Server failover clustered instances and Availability Groups. Note that the focus of this series is primarily on availability &#8211; identifying and dealing with downtime. </em></span></div></p>
<p>In the <a href="https://www.edwinmsarmiento.com/exploring-the-cluster-dependency-report/" target="_blank">first blog post</a> in this series, I talked about how to use the Cluster Dependency Report to identify the potential cause of a SQL Server FCI or AG being offline. In the <a href="https://www.edwinmsarmiento.com/sqlserver-error-log-troubleshooting-ha/" target="_blank">second blog post</a>, I talked about how to use the SQL Server error log to find the two most common keywords that can cause availability issues.</p>
<p>I&#8217;ve gotten some feedback from several SQL Server experts who have read my blog posts. A common theme in the feedback was &#8220;<span style="color: #800000;"><strong><em>It&#8217;s not that simple.</em></strong></span>&#8221;</p>
<p>From my experience working with customers on SQL Server FCI and AG deployments, the biggest challenge I see when addressing availability issues is &#8220;the human aspect&#8221; (well, actually, it&#8217;s true for just about any issue you can think of). Either the support team does not know what to do, or they are confused about which issue to focus on first. That&#8217;s the reason behind simplifying the process of identifying the problems that cause availability issues: the overall goal is to bring the SQL Server FCI or AG back online as quickly as we possibly can.</p>
<h2>Navigating The Windows Event Log</h2>
<p>After looking at the Cluster Dependency Report, you can quickly identify the possible component that caused your SQL Server FCI or AG to go offline. If it&#8217;s the SQL Server resource, you can head over to the SQL Server error log to identify what caused the issue from the database engine point-of-view. But if it isn&#8217;t, this is where the Windows Event Log can help.</p>
<p>The Windows Event Log can be very overwhelming, especially when you have other stuff running on your WSFC such as monitoring agents, firmware, utilities, etc. To simplify the troubleshooting process, I start with the <strong>System</strong> event log, filter out all other messages and display only those that are marked <strong>Critical</strong>, <strong>Warning</strong> and <strong>Error</strong>.</p>
<p><img fetchpriority="high" decoding="async" class="aligncenter size-full wp-image-3141" src="https://www.edwinmsarmiento.com/wp-content/uploads/2016/11/EventLogFiltered.jpg" alt="eventlogfiltered" width="554" height="555" srcset="https://www.edwinmsarmiento.com/wp-content/uploads/2016/11/EventLogFiltered.jpg 554w, https://www.edwinmsarmiento.com/wp-content/uploads/2016/11/EventLogFiltered-150x150.jpg 150w, https://www.edwinmsarmiento.com/wp-content/uploads/2016/11/EventLogFiltered-300x300.jpg 300w, https://www.edwinmsarmiento.com/wp-content/uploads/2016/11/EventLogFiltered-35x35.jpg 35w, https://www.edwinmsarmiento.com/wp-content/uploads/2016/11/EventLogFiltered-399x400.jpg 399w, https://www.edwinmsarmiento.com/wp-content/uploads/2016/11/EventLogFiltered-82x82.jpg 82w" sizes="(max-width: 554px) 100vw, 554px" /></p>
<p>This makes it easy for me to focus only on the events reported at these levels that may have caused availability issues for my SQL Server FCI and AG.</p>
<p>As always, I try to minimize the use of the GUI especially for repetitive tasks. So, I use the PowerShell cmdlet <a href="https://msdn.microsoft.com/powershell/reference/5.1/microsoft.powershell.management/Get-EventLog" target="_blank">Get-EventLog</a> to do the trick.</p>
<pre class="brush: powershell; title: ; notranslate">

Get-EventLog -LogName System -EntryType Error,Warning

</pre>
<p>The beauty of this is that I can pass a list of server names to the Get-EventLog PowerShell cmdlet using the <strong>-ComputerName</strong> parameter. This means I can display all of the Error and Warning events for all of the member servers in my WSFC &#8211; in a single line of PowerShell code.</p>
<pre class="brush: powershell; title: ; notranslate">

Get-EventLog -LogName System -ComputerName WSFC-NODE1, WSFC-NODE2, WSFC-NODE3 -EntryType Error,Warning

</pre>
<p>Even better, I can display only those events that occurred within a specific time frame. That way, I can really zoom in on those specific events when the issue occurred (or when the customer decided to report it).</p>
<pre class="brush: powershell; title: ; notranslate">
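# Define the start and end of the time frame to review
$Oct31 = Get-Date 10/31/2016
$Nov13 = Get-Date 11/13/2016
# Show only the Error and Warning events logged between the two dates
Get-EventLog -LogName System -ComputerName WSFC-NODE1, WSFC-NODE2, WSFC-NODE3 -EntryType Error,Warning -After $Oct31 -Before $Nov13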
</pre>
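<p>When the filtered output is still long, summarizing it before reading individual entries can help. The following is a minimal sketch rather than a fixed recipe: it assumes the same example node names and date range used above (the <strong>$nodes</strong>, <strong>$after</strong> and <strong>$before</strong> variables are only illustrative) and counts the events per server, source and event ID so that the noisiest sources show up first.</p>
<pre class="brush: powershell; title: ; notranslate">
# Example node names and dates (assumptions); replace with your own.
$nodes  = 'WSFC-NODE1', 'WSFC-NODE2', 'WSFC-NODE3'
$after  = Get-Date '2016-10-31'
$before = Get-Date '2016-11-13'

# Count Error and Warning events per server, source and event ID,
# then list the 20 most frequent combinations.
Get-EventLog -LogName System -ComputerName $nodes -EntryType Error,Warning -After $after -Before $before |
    Group-Object -Property MachineName, Source, EventID |
    Sort-Object -Property Count -Descending |
    Select-Object -First 20 -Property Count, Name
</pre>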
<p>You might be wondering why the <strong>Critical</strong> events are not included in the -EntryType parameter. That&#8217;s because the Get-EventLog PowerShell cmdlet is old school &#8211; it&#8217;s been around ever since the Monad days &#8211; and its -EntryType parameter only accepts Error, Warning, Information, SuccessAudit and FailureAudit. The recommended approach is to use the <a href="https://msdn.microsoft.com/powershell/reference/5.1/microsoft.powershell.diagnostics/Get-WinEvent" target="_blank">Get-WinEvent</a> PowerShell cmdlet. The only reason I use Get-EventLog is because it&#8217;s easier to use and I know that I don&#8217;t have to worry about what version of PowerShell is running on the servers. For more complex PowerShell scripts, I use the <a href="https://msdn.microsoft.com/powershell/reference/5.1/microsoft.powershell.diagnostics/Get-WinEvent" target="_blank">Get-WinEvent</a> PowerShell cmdlet.</p>
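<p>For reference, here is a minimal sketch of what a Get-WinEvent equivalent could look like with Critical events included. It assumes the same example node names and date range used earlier; the <strong>$nodes</strong> and <strong>$filter</strong> variable names are only illustrative. In the <strong>-FilterHashtable</strong> parameter, level 1 is Critical, 2 is Error and 3 is Warning. Because Get-WinEvent takes one computer name at a time, the sketch loops over the nodes.</p>
<pre class="brush: powershell; title: ; notranslate">
# Example node names and dates (assumptions); replace with your own.
$nodes = 'WSFC-NODE1', 'WSFC-NODE2', 'WSFC-NODE3'

# Level 1 = Critical, 2 = Error, 3 = Warning
$filter = @{
    LogName   = 'System'
    Level     = 1, 2, 3
    StartTime = (Get-Date '2016-10-31')
    EndTime   = (Get-Date '2016-11-13')
}

# Get-WinEvent accepts a single computer name, so query each node in turn.
foreach ($node in $nodes) {
    Get-WinEvent -ComputerName $node -FilterHashtable $filter -ErrorAction SilentlyContinue |
        Select-Object MachineName, TimeCreated, LevelDisplayName, ProviderName, Id, Message
}
</pre>
<p>Note that -ErrorAction SilentlyContinue suppresses the error that Get-WinEvent raises when a node has no matching events, but it also hides connectivity errors. Remove it if a node unexpectedly returns nothing.</p>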
<h2>Failover Clustering Event Logs</h2>
<p>I&#8217;ve avoided talking about the failover cluster error logs up to this point. That&#8217;s because I don&#8217;t want support engineers and server administrators to look at these as their first option. <strong>I want it to be the last option, especially the cluster debug logs</strong>. I&#8217;ve seen engineers waste a lot of time trying to fix an availability issue with a SQL Server FCI or AG only to find out that the real issue is outside of the failover cluster. One very common example is when the virtual network name could not come online. It could be because there is a duplicate IP address on the network or because the virtual computer object was accidentally deleted in Active Directory. Had the troubleshooting process started with the Cluster Dependency Report, the issue could have been communicated a lot sooner to the Active Directory or DNS administrators to get them involved in solving it.</p>
<p>Now, on to the Failover Clustering Event Logs. Open the Windows Event Logs via Event Viewer and navigate to <strong>Applications and Services Logs -&gt; Microsoft -&gt; Windows -&gt; FailoverClustering</strong> to start with. Have a look at the different event categories that you can review to identify the cause of the availability issue.</p>
<p><img decoding="async" class="aligncenter size-full wp-image-3138" src="https://www.edwinmsarmiento.com/wp-content/uploads/2016/11/Windows-Event-Log-WSFC.jpg" alt="windows-event-log-wsfc" width="272" height="490" srcset="https://www.edwinmsarmiento.com/wp-content/uploads/2016/11/Windows-Event-Log-WSFC.jpg 272w, https://www.edwinmsarmiento.com/wp-content/uploads/2016/11/Windows-Event-Log-WSFC-167x300.jpg 167w, https://www.edwinmsarmiento.com/wp-content/uploads/2016/11/Windows-Event-Log-WSFC-222x400.jpg 222w, https://www.edwinmsarmiento.com/wp-content/uploads/2016/11/Windows-Event-Log-WSFC-82x148.jpg 82w" sizes="(max-width: 272px) 100vw, 272px" /></p>
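<p>The same list can be pulled from PowerShell instead of browsing the Event Viewer tree. This is a small sketch, assuming it runs on one of the WSFC nodes; it uses the <strong>-ListLog</strong> parameter of Get-WinEvent with a wildcard to show each Failover Clustering log, whether it is enabled and how many records it holds.</p>
<pre class="brush: powershell; title: ; notranslate">
# List the Failover Clustering event logs on the local node, along with
# whether each one is enabled and how many records it currently holds.
Get-WinEvent -ListLog 'Microsoft-Windows-FailoverClustering*' |
    Select-Object LogName, IsEnabled, RecordCount |
    Sort-Object LogName
</pre>
<p>Logs with a record count of zero are good candidates to skip on a first pass.</p>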
<p>Just by looking at the screenshot, you know that this in itself is still confusing. Now, you know why I didn&#8217;t start with this option first <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f642.png" alt="🙂" class="wp-smiley" style="height: 1em; max-height: 1em;" /></p>
<p>I usually ignore events that are irrelevant. For example, I ignore anything under <strong>FailoverClustering-CsvFs</strong> and <strong>FailoverClustering-CsvFlt</strong> if I don&#8217;t have cluster shared volumes (CSVs) configured in the WSFC. Instead, I go directly to the <strong>FailoverClustering -&gt; Diagnostic</strong> and <strong>Operational</strong> events to start digging deeper.</p>
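<p>As a rough sketch of the same idea in PowerShell, the Operational log can be queried directly with Get-WinEvent. The time window below is an assumption based on the earlier examples, and the <strong>$filter</strong> variable name is only illustrative; adjust both to match when the outage was reported.</p>
<pre class="brush: powershell; title: ; notranslate">
# Example time window (assumption); adjust to when the issue occurred.
# Level 1 = Critical, 2 = Error, 3 = Warning
$filter = @{
    LogName   = 'Microsoft-Windows-FailoverClustering/Operational'
    Level     = 1, 2, 3
    StartTime = (Get-Date '2016-10-31')
    EndTime   = (Get-Date '2016-11-13')
}

# Read the matching events on the local node and show them oldest first.
Get-WinEvent -FilterHashtable $filter -ErrorAction SilentlyContinue |
    Sort-Object TimeCreated |
    Format-Table TimeCreated, Id, LevelDisplayName, Message -AutoSize -Wrap
</pre>
<p>The Diagnostic log is a debug log, and Get-WinEvent reads analytic and debug logs only when the <strong>-Oldest</strong> parameter is specified, so it needs a slightly different query.</p>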
<h2>Digging Deeper</h2>
<p>At this point in the troubleshooting process, you&#8217;ve gone from quickly identifying why your SQL Server FCI or AG is unavailable to finding out the root cause. <span style="color: #800000;"><strong>Stop right there.</strong></span></p>
<p>Didn&#8217;t we say that the goal was to bring the SQL Server FCI or AG online as quickly as we possibly can? It&#8217;s not that I&#8217;m discouraging you from finding the root cause of the problem. It&#8217;s just that we need to <span style="color: #0000ff;"><strong>be focused on the main goal</strong></span>. If you&#8217;ve identified why your SQL Server FCI or AG is offline, make every effort to bring it back online as soon as you possibly can. Leave the &#8220;digging deeper&#8221; part for after everything is back to normal.</p>
<p>In the next blog post, I&#8217;ll talk about how to read the cluster debug log and how to identify the real root cause of the availability issue. And while I like getting really geeky about reading the cluster debug log or even crash dumps, I don&#8217;t want you to do that first when trying to resolve an availability issue unless you really have no choice.</p>
]]></content:encoded>
			

		<wfw:commentRss>https://www.edwinmsarmiento.com/reviewing-the-windows-event-logs-sqlserver-ha/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
				<post-id xmlns="com-wordpress:feed-additions:1">3136</post-id>	</item>
	</channel>
</rss>