IMPORTANT ANNOUNCEMENT FOR OUR READERS!
AskPFEPlat is in the process of a transformation to the new Core Infrastructure and Security TechCommunity, and will be moving by the end of March 2019 to our new home at https://aka.ms/CISTechComm (hosted at https://techcommunity.microsoft.com). Please bear with us while we are still under construction!
We will continue bringing you the same great content, from the same great contributors, on our new platform. Until then, you can access our new content on either https://aka.ms/askpfeplat as you do today, or at our new site https://aka.ms/CISTechComm. Please feel free to update your bookmarks accordingly!
Why are we doing this? Simple, really: we are looking to expand our team internally in order to provide you with even more great content, as well as take on a more proactive role with our readers in the future (more to come on that later)! Since our team encompasses many more roles than Premier Field Engineers these days, we felt it was also time our name reflected that expansion.
If you have never visited the TechCommunity site, it can be found at https://techcommunity.microsoft.com. On the TechCommunity site, you will find numerous technical communities across many topics, which include discussion areas, along with blog content.
NOTE: In addition to the AskPFEPlat-to-Core Infrastructure and Security transformation, Premier Field Engineers from all technology areas will be working together to expand the TechCommunity site even further, joining together in the technology agnostic Premier Field Engineering TechCommunity (along with Core Infrastructure and Security), which can be found at https://aka.ms/PFETechComm!
As always, thank you for continuing to read the Core Infrastructure and Security (AskPFEPlat) blog, and we look forward to providing you more great content well into the future!
Not sure where February went, but it sure is flying by in a hurry. During the month, another interesting question came my way. Don't forget that you can contact us using the Contact tile just to the right of this article when viewed from our TechNet blog.
When building a cluster, is it okay to use the same model network adapters for all interfaces on the same node?
Thoughts regarding the question
Technically, the answer to this question is that it is perfectly fine to use the same model network adapter for all interfaces in the cluster. Modern clusters (Windows Server 2008 and newer) must pass validation in order to be supported…and to avoid issues. Assuming validation passes with the network adapters chosen, you should be good to go. However, my conservative nature tends to take this a step further: not just what is supported, but what might be better and still supported. The validation process tells you that things are working as expected at that moment. But what about later, if the single driver that services all network adapters in the system malfunctions and prevents any communication? A failover cluster may then experience a failover that might otherwise have been prevented had communication remained possible through at least one network interface.
With a single driver for all network interfaces, it is possible for all communication to be impacted at once. I've seen that same scenario play out more times than I can count over the 16 years I've supported clusters at Microsoft. Those issues typically vanish, as if by waving a magic wand, once the offending network driver receives the proper update. Usually the adapters sharing the malfunctioning driver don't end up completely non-functional…but when there's a problem, it may trigger an otherwise unnecessary failover. Why? Because there are timing tolerances for node-to-node communication as well as for global updates, and an out-of-tolerance delay on all interfaces looks like a failure of all networks when the single network driver flakes out. As a result, the cluster has to try to recover from the situation to keep resources highly available. Thus, the single-network-driver approach can be vulnerable.
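The reasoning above can be sketched with a small thought experiment. This is a minimal model, not a real cluster API: the adapter names and driver filenames are hypothetical, and a single function stands in for "does any communication path survive when one driver malfunctions?"

```python
# Hypothetical sketch: if every interface on a node rides on one driver,
# a fault in that driver removes every communication path at once; with
# mixed drivers, at least one path can survive.

def surviving_paths(adapters, failed_driver):
    """Return the adapters still usable after the given driver malfunctions."""
    return [a for a in adapters if a["driver"] != failed_driver]

# Node where both cluster networks use the same (hypothetical) driver.
single_driver_node = [
    {"name": "Public",  "driver": "acmenet.sys"},
    {"name": "Private", "driver": "acmenet.sys"},
]

# Node where public and private networks use different drivers.
mixed_driver_node = [
    {"name": "Public",  "driver": "acmenet.sys"},
    {"name": "Private", "driver": "othernet.sys"},
]

# Same driver everywhere: nothing survives, so heartbeats on all networks
# miss their timing tolerances and the cluster must react.
print(surviving_paths(single_driver_node, "acmenet.sys"))  # []

# Mixed drivers: the Private interface still works, so node-to-node
# communication can continue on at least one network.
print(surviving_paths(mixed_driver_node, "acmenet.sys"))
```

The point of the sketch is simply that the empty list in the first case is what the cluster perceives as "all networks failed," even though the hardware itself may be healthy.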
When I build a cluster, or when someone asks me about building one, I suggest using slightly different models of network adapters within the same server for the public and private networks. They can even be from the same manufacturer, as long as they use different drivers. This way, you're using two different adapter drivers: if one of them fails and renders its corresponding adapters useless, the other adapters in the system can still potentially function on the other driver(s). One could argue that other single-driver situations, for storage or other devices, are failure points as well. When I've seen that happen, typically I/O operations get retried and access to storage may not completely fail; such incidents may be transient, recoverable, and noted in the event log. Failover cluster nodes, however, need to be able to communicate, which is why I hold the opinion I do about network adapters and their corresponding drivers. It is important to remove as many single points of failure as possible, and when it comes to communication, network adapters are typically inexpensive. Again…what I'm saying here is not a design requirement. It's just an opinion based on experience.
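If you want to check a node for this kind of exposure, the idea reduces to grouping interfaces by driver and flagging the case where one driver serves everything. The sketch below uses a made-up inventory list; on a real Windows Server node you would gather the equivalent information yourself (for example, from the adapter properties in Device Manager or PowerShell) rather than from this hypothetical structure.

```python
# Hypothetical single-point-of-failure check: group a node's network
# adapters by driver and flag the node when one driver serves them all.
from collections import defaultdict

def driver_spof(adapters):
    """Return (is_single_driver, {driver: [adapter names]}) for a node."""
    by_driver = defaultdict(list)
    for a in adapters:
        by_driver[a["driver"]].append(a["name"])
    return len(by_driver) == 1, dict(by_driver)

# Made-up inventory for illustration only.
node = [
    {"name": "Public",  "driver": "vendorA.sys"},
    {"name": "Private", "driver": "vendorA.sys"},
]

is_spof, groups = driver_spof(node)
print(is_spof, groups)  # True -> every interface depends on one driver
```

A `True` result here is exactly the configuration the article is cautious about: supported and valid, but with one driver standing behind every communication path.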
Circling back around to the original question: it is perfectly fine to use adapters in the same server that are all the same model and use the same driver. However, it might be worth considering slightly different network adapters to avoid a single network adapter driver becoming a single point of failure. It is also wise to keep hardware configurations as consistent as possible among all failover cluster nodes; consistency of hardware across nodes is always a plus in my opinion.
A really great post about cluster validation can be found below:
Until next time!