Jumbo Frame Support In XenServer

XenServer – In depth investigation series

Background

Jumbo frames are always tricky because there is no standard for them.

In a XenServer environment, jumbo frames are often used on the network (storage network) carrying IP-based storage traffic. But not all NIC drivers support jumbo frames, and unfortunately the NIC driver (kernel module) documentation doesn't normally mention jumbo frame support.
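Since the driver documentation rarely says anything about jumbo frames, the first step is simply to find out which kernel module drives each NIC. A minimal sketch, run from dom0 (it only reads sysfs, so it is safe to run; the interface names it prints depend on your host):

```shell
# Map each network interface to the kernel driver behind it (via sysfs),
# so enic-driven NICs can be spotted before jumbo frames are enabled.
for iface in /sys/class/net/*; do
  drv="$iface/device/driver"
  if [ -e "$drv" ]; then
    printf '%s: %s\n' "$(basename "$iface")" "$(basename "$(readlink -f "$drv")")"
  fi
done
```

`ethtool -i <interface>` reports the same driver name, along with its version.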

Symptom

Recently, to my surprise, I discovered that jumbo frames are NOT supported by the Cisco VIC Ethernet NIC driver, enic. I would have thought that all Cisco NICs support jumbo frames simply because they carry the Cisco badge. BUT that is not the case.

If one insists on enabling jumbo frames on a storage network over enic-driven NICs or bonds, kernel panics and random host reboots are to be expected.

NOTE: a kernel crash dump generated by XenServer dom0 is different from a Linux kernel crash dump generated by kdump (kexec-tools) on bare metal. Of course, dom0 is the privileged first PV guest on a host.

Crash Dump Analysis

In the kernel crash dump generated in /var/crash, we should see the following in xen.log.

ip_fragment (defined in net/ipv4/ip_output.c) called ip_do_fragment when IPv4 tried to fragment a large datagram (packet) because it could not be sent in one piece. This indicates that the packet size exceeded 1500 bytes; in other words, jumbo frames were enabled.
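To put rough numbers on why ip_do_fragment was invoked, a back-of-the-envelope calculation (illustrative arithmetic only, not kernel code; the 9000-byte payload is an assumed jumbo-sized example, and a plain 20-byte IPv4 header is assumed):

```shell
# How many IPv4 fragments does a large datagram need at a given MTU?
mtu=1500
payload=9000                          # assumed jumbo-sized datagram payload
per_frag=$(( (mtu - 20) / 8 * 8 ))    # fragment payload must be a multiple of 8
frags=$(( (payload + per_frag - 1) / per_frag ))
echo "$frags fragments of up to $per_frag bytes"
```

So a single jumbo-sized datagram squeezed through a 1500-byte MTU turns into seven fragments, which is exactly the work ip_do_fragment was doing when the crash occurred.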

ip_do_fragment then called skb_copy_bits to copy bits from the skb (socket buffer) to a kernel buffer. During this process, memcpy caused a segmentation fault; the kernel mm called do_page_fault to handle the page fault (determine the address and the problem, then pass it off to the appropriate routine) BUT unfortunately failed.

Based on bad_area_nosemaphore and __bad_area_nosemaphore (defined in arch/x86/mm/fault.c), the fault appears to have occurred in an interrupt, with no user context (or while running in a region with page faults disabled); as a result, the page fault could not be handled.

Looking deeper into no_context (defined in arch/x86/mm/fault.c), it seems that the kernel tried to access a bad page, which triggered oops_begin and oops_end (defined in arch/x86/kernel/dumpstack.c), and then do_exit (kernel/exit.c) was called.

In dom0.log we saw a similar call trace and more information about the Oops.

If you look into kernel/exit.c, you will see that BUG() was called. The kernel was not able to handle the paging request error or recover, so the running kernel finally gave up and panicked ;-D

Conclusion

The conclusion of the investigation is that enic does NOT support jumbo frames. DO NOT enable them on storage networks on top of Cisco VIC NICs in XenServer.

I ended up changing the MTU of the storage network back to 1500 to fix the problem. The easy way is to remove the storage IP, change the storage network MTU (if you don't remove the IP first, the MTU field is greyed out), then reconfigure the storage IP afterwards on each host in the pool. Alternatively, use the xe command line (xe network-param-set uuid= MTU=1500) to change the MTU of the network; unplugging and re-plugging the corresponding underlying PIFs is then required. This is obviously a more complicated process; your choice.
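The xe route can be sketched as follows. This is a sketch only: the network name-label and the <...-uuid> placeholders are assumptions that must be replaced with values from your own pool, and the PIF unplug/plug must be repeated for the PIF on each host:

```shell
# Sketch: revert the storage network MTU to 1500 via the xe CLI (run in dom0).
xe network-list name-label="Storage Network" params=uuid   # find the network UUID ("Storage Network" is an example name)
xe network-param-set uuid=<network-uuid> MTU=1500          # set the network MTU back to 1500
xe pif-list network-uuid=<network-uuid> params=uuid,host-name-label
xe pif-unplug uuid=<pif-uuid>                              # repeat for each PIF on the network
xe pif-plug uuid=<pif-uuid>                                # re-plug so the new MTU takes effect
```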

IMPORTANT: for the Broadcom NetXtreme II driver, bnx2x, you may know that jumbo frames could be enabled with GRO on back in XenServer 6.2 SP1, as per CTX200270 (yes, I wrote it…). That is NOT the case any more. This has changed, probably due to the fact that the bnx2x driver keeps evolving.

The following Linux NIC drivers are known to support jumbo frames (some with conditions):

  • igb
  • ixgbe
  • e1000 (some cards may be affected due to errata)
  • e1000e (cards older than 82571 are affected)
  • bnx2 (not bnx2x)
  • be2net
  • bna
  • cxgb4
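As a convenience, the list above can be turned into a quick sanity check. A sketch only: check_jumbo_driver is a hypothetical helper, and the commented-out sysfs lookup assumes an example interface name (eth1):

```shell
# Sketch: check a driver name against the known-good list above.
good_drivers="igb ixgbe e1000 e1000e bnx2 be2net bna cxgb4"
check_jumbo_driver() {
  case " $good_drivers " in
    *" $1 "*) echo "$1: known to support jumbo frames (with the caveats above)";;
    *)        echo "$1: not on the list; do not assume jumbo frame support";;
  esac
}
# To check a real NIC, feed in its driver name from sysfs, e.g.:
# check_jumbo_driver "$(basename "$(readlink -f /sys/class/net/eth1/device/driver)")"
check_jumbo_driver enic
```

For enic this prints the warning branch, which is exactly the point of this post.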
