Hopefully, this article helps others who encounter this "fun," random issue that I recently experienced. Here's my story!
We received a new shipment of Dell R640s, the exact same order that we had been purchasing repeatedly for a couple of years. After racking them and setting up iDRAC, we went to PXE install ESXi 6.7 U3 (Dell custom image). Unfortunately, after the NIC received its DHCP offer and pulled the initial boot files over TFTP, the ESXi installer would show up and then hang. It would simply be stuck loading the kernel (usually at b.b00 or tboot.00).
I troubleshot literally everything I could think of, and we were still stuck. Here's what we tried:
- Take a dump of the older R640s' configs via iDRAC and import it into the new R640s so that the BIOS settings were identical
- Swap DACs between switch and R640
- Try various switch configs on the impacted ports
- Install ESXi via iDRAC's remote media feature with the same ISO that PXE was installing (which worked)
- Install ESXi inside a VM on the R640 via PXE (which also worked)
- Upgrade BIOS and NIC firmware
- Turn knobs in the NIC's BIOS to see if something was set incorrectly or oddly
- PXE install the same OS on another node (which worked)
So, as the abbreviated list above shows, I tried almost everything and was certain that a CPU or NIC option was causing the issue. I decided to downgrade the firmware on the new R640s' Broadcom 57412 (advanced 10/25Gb) NICs to match the older R640s' and...it worked! Do I know the root cause and why this solution works? Nope. I intend to read the changelogs and try to identify what triggered this, but for now, here's the solution!
- Download v22.214.171.124_03 of the Broadcom NetXtreme-E firmware from Dell's website. You'll want the Windows EXE. https://dl.dell.com/FOLDER05947739M/4/Network_Firmware_YK81Y_WN64_126.96.36.199_03.EXE
- Log in to iDRAC
- Navigate to Maintenance > System Update > Manual Update
- Browse locally to the file that you downloaded above
- Commit the update and tell the machine to start it now, with a reboot afterwards
- You should see the iDRAC upgrade utility boot up. This will take a few minutes.
- After it successfully flashes the older firmware and reboots, you can retry your PXE installation. It should work now!
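If you have more than a handful of nodes, it may help to first work out which ones actually differ from the known-good NIC firmware before flashing anything. Below is a minimal Python sketch of that comparison. It assumes you have already collected each host's NIC firmware version by hand (for example, from the iDRAC firmware inventory page). The hostnames, version strings, and function names are all hypothetical and only there for illustration.

```python
import re
from typing import Dict, List, Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Split a version like '2.1.0_03' on dots and underscores into ints."""
    return tuple(int(part) for part in re.split(r"[._]", version.strip()))


def nodes_needing_reflash(known_good: str, inventory: Dict[str, str]) -> List[str]:
    """Return the hosts whose NIC firmware differs from the known-good version."""
    target = parse_version(known_good)
    return sorted(
        host for host, version in inventory.items()
        if parse_version(version) != target
    )


# Hypothetical data: one older node that PXE boots fine, two new ones that hang.
EXAMPLE_INVENTORY = {
    "esx-old-01": "2.0.0_01",
    "esx-new-01": "2.1.0_03",
    "esx-new-02": "2.1.0_03",
}
```

Calling `nodes_needing_reflash("2.0.0_01", EXAMPLE_INVENTORY)` would list the two new hosts, which are the ones to run through the iDRAC steps above. The comparison is deliberately "not equal" rather than "newer than", since in my case the fix was to match the older nodes exactly.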
While I feel like many people won't hit this issue, I felt the need to post just in case it helps at least one other person. I was going crazy trying to figure it out!