[Ntop-misc] PF_RING/TNAPI hang

Luca Deri deri at ntop.org
Sun Dec 13 15:05:20 CET 2009


George,
did you try to balance interrupts? If not, please read this: http://www.ntop.org/wordpress/?p=1 (it's a blog that I will announce once it's completed).
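
As a rough illustration of what balancing interrupts can involve, here is a minimal Python sketch. It assumes the NIC exposes one IRQ per queue in /proc/interrupts and that each IRQ is pinned by writing a hexadecimal CPU mask to /proc/irq/<n>/smp_affinity. The sample text, the IRQ numbers, the "eth14-rx-N" labels, and the helper names find_irqs and round_robin_masks are hypothetical and are not taken from the system described below; the actual labels depend on the driver version. The script only prints the commands and changes nothing.

```python
import re


def find_irqs(interrupts_text, device):
    """Return {irq_number: label} for rows whose last column starts with `device`."""
    irqs = {}
    for line in interrupts_text.splitlines():
        m = re.match(r"\s*(\d+):", line)
        if not m:
            continue  # header row or non-numeric entries such as NMI/LOC
        label = line.split()[-1]
        if label.startswith(device):
            irqs[int(m.group(1))] = label
    return irqs


def round_robin_masks(irqs, n_cpus):
    """Give each IRQ a single CPU in round-robin order; return {irq: hex mask}."""
    return {
        irq: format(1 << (i % n_cpus), "x")
        for i, irq in enumerate(sorted(irqs))
    }


# Hypothetical excerpt in the layout of /proc/interrupts (not real output).
SAMPLE = """\
           CPU0       CPU1
 16:        100          0   IO-APIC-fasteoi   uhci_hcd
 68:       5000          0   PCI-MSI-edge      eth14-rx-0
 69:       4000          0   PCI-MSI-edge      eth14-rx-1
 70:       3000          0   PCI-MSI-edge      eth14-rx-2
"""

if __name__ == "__main__":
    masks = round_robin_masks(find_irqs(SAMPLE, "eth14"), n_cpus=2)
    for irq, mask in sorted(masks.items()):
        # Printed only; review before running any of these as root.
        print(f"echo {mask} > /proc/irq/{irq}/smp_affinity")
```

On a real machine the text would come from reading /proc/interrupts and n_cpus from the number of online cores. The plain hex mask shown here is only sufficient for up to 32 CPUs, and a running irqbalance daemon may overwrite manually set affinities.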

Cheers Luca

On Dec 5, 2009, at 12:01 AM, George Wynes wrote:

> Long mail -- apologies.
> 
> I've run into problems trying to implement PF_RING/TNAPI to capture
> traffic off a 10GbE link.  My end goal is to run multiple instances of
> snort.  I need to be able to capture over 1Mpps for it to be any use.
> 
> I've been making slow progress but now I'm stuck.
> 
> Here's the relevant spec from my environment - I can provide more if
> necessary:
> 
>         4 x 4 Core Xeon E7338 @ 2.40GHz
>         Ubuntu 9.04 (kernel 2.6.28-11-server)
>         Intel 7300 Chipset
> 
> With PF_RING alone and with ioatdma and dca enabled (many thanks to this
> list), I am able to capture about 250K - 400K pps depending on the system
> tuning performed, but whatever knobs I twist I always hit the single
> CPU core bottleneck.
>  
> With TNAPI included though, pfcount no longer works.  It immediately hangs after
> the first line of output...
>  
> [...snipped...]
> gwynes@host-01:/var/tmp/PF_RING/userland/examples$ sudo ./pfcount -i eth14@12
> Capturing from eth14@12
> [...snipped...]
> 
> ...and eventually I get messages written to the console...
>  
> [...snipped...]
> [ 1554.779998] BUG: soft lockup - CPU#0 stuck for 61s! [tnapi(eth14.0):4851]
> [...snipped...]
> I get the same result from both the parent device and the individual
> queues.  I've also tried the different example applications and tcpdump.
> While this is going on there are tnapi processes running on all CPUs at
> about 50% except for the one on CPU0 which is at 100%.  Also, ethtool -S
> eth14 shows traffic being received on all of the RX channels while this is going
> on.
>  
> Any ideas where to look?  I've included the output I'm getting in
> /var/log/messages. I think this is showing me that dca, ioatdma and
> ixgbe are set up correctly. I've tried newer/older kernels, CentOS 5.4
> (TNAPI wouldn't compile) and Ubuntu 9.10 (same problem).
> Any pointers appreciated.
> 
> Thanks.
> 
> -g
> 
> [...snipped...]
> Dec  4 08:13:35 host-01 kernel: [ 1235.765771] ixgbe: Intel(R) 10 Gigabit PCI Express Network Driver - version 2.0.44.14-NAPI
> Dec  4 08:13:35 host-01 kernel: [ 1235.765775] Copyright (c) 1999-2009 Intel Corporation.
> Dec  4 08:13:35 host-01 kernel: [ 1235.765878] ixgbe 0000:0f:00.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
> Dec  4 08:13:35 host-01 kernel: [ 1235.865886] ixgbe: 0000:0f:00.0: ixgbe_init_interrupt_scheme: Multiqueue Enabled: Rx Queue count = 16, Tx Queue count = 16
> Dec  4 08:13:35 host-01 kernel: [ 1235.867748] ixgbe: eth14: ixgbe_probe: (PCI Express:2.5Gb/s:Width x8) 00:1b:21:20:dd:6b
> Dec  4 08:13:35 host-01 kernel: [ 1235.867832] ixgbe: eth14: ixgbe_probe: MAC: 1, PHY: 5, PBA No: e31879-002
> Dec  4 08:13:35 host-01 kernel: [ 1235.867834] ixgbe: eth14: ixgbe_probe: Internal LRO is enabled
> Dec  4 08:13:35 host-01 kernel: [ 1235.867836] ixgbe: eth14: ixgbe_probe: Intel(R) 10 Gigabit Network Connection
> Dec  4 08:13:35 host-01 kernel: [ 1235.867864] TNAPI: (C) 2006-09 ntop.org
> Dec  4 08:13:35 host-01 kernel: [ 1235.867866] TNAPI: using kernel threads for packet polling
> Dec  4 08:14:15 host-01 kernel: [ 1276.162906] TNAPI: init_tnapi(eth14)
> Dec  4 08:14:15 host-01 kernel: [ 1276.162915] ---- TNAPI: eth14 added at device slot 0
> Dec  4 08:14:15 host-01 kernel: [ 1276.163007] TNAPI: spawn thread [eth14] on CPU 7
> Dec  4 08:14:15 host-01 kernel: [ 1276.163061] TNAPI: spawn thread [eth14] on CPU 10
> Dec  4 08:14:15 host-01 kernel: [ 1276.163107] TNAPI: spawn thread [eth14] on CPU 7
> Dec  4 08:14:15 host-01 kernel: [ 1276.163160] TNAPI: spawn thread [eth14] on CPU 10
> Dec  4 08:14:15 host-01 kernel: [ 1276.163206] TNAPI: spawn thread [eth14] on CPU 7
> Dec  4 08:14:15 host-01 kernel: [ 1276.163256] TNAPI: spawn thread [eth14] on CPU 13
> Dec  4 08:14:15 host-01 kernel: [ 1276.163304] TNAPI: spawn thread [eth14] on CPU 7
> Dec  4 08:14:15 host-01 kernel: [ 1276.163349] TNAPI: spawn thread [eth14] on CPU 13
> Dec  4 08:14:15 host-01 kernel: [ 1276.163403] TNAPI: spawn thread [eth14] on CPU 9
> Dec  4 08:14:15 host-01 kernel: [ 1276.163449] TNAPI: spawn thread [eth14] on CPU 7
> Dec  4 08:14:15 host-01 kernel: [ 1276.163496] TNAPI: spawn thread [eth14] on CPU 7
> Dec  4 08:14:15 host-01 kernel: [ 1276.163539] TNAPI: spawn thread [eth14] on CPU 13
> Dec  4 08:14:15 host-01 kernel: [ 1276.163583] TNAPI: spawn thread [eth14] on CPU 7
> Dec  4 08:14:15 host-01 kernel: [ 1276.163626] TNAPI: spawn thread [eth14] on CPU 13
> Dec  4 08:14:15 host-01 kernel: [ 1276.163669] TNAPI: spawn thread [eth14] on CPU 7
> Dec  4 08:14:15 host-01 kernel: [ 1276.163716] TNAPI: spawn thread [eth14] on CPU 13
> Dec  4 08:14:15 host-01 kernel: [ 1276.226313] ADDRCONF(NETDEV_UP): eth14: link is not ready
> Dec  4 08:14:15 host-01 kernel: [ 1276.227861] ixgbe: eth14: ixgbe_watchdog_task: NIC Link is Up 10 Gbps, Flow Control: RX/TX
> Dec  4 08:14:15 host-01 kernel: [ 1276.234144] ADDRCONF(NETDEV_CHANGE): eth14: link becomes ready
> Dec  4 08:17:49 host-01 kernel: [ 1489.974496] [PF_RING] Welcome to PF_RING 4.1.0 ($Revision: 3982 $)
> Dec  4 08:17:49 host-01 kernel: [ 1489.974498] (C) 2004-09 L.Deri <deri at ntop.org>
> Dec  4 08:17:49 host-01 kernel: [ 1489.974502] NET: Registered protocol family 27
> Dec  4 08:17:49 host-01 kernel: [ 1489.974513] [PF_RING] Ring slots       4096
> Dec  4 08:17:49 host-01 kernel: [ 1489.974514] [PF_RING] Slot version     10
> Dec  4 08:17:49 host-01 kernel: [ 1489.974516] [PF_RING] Capture TX       Yes [RX+TX]
> Dec  4 08:17:49 host-01 kernel: [ 1489.974517] [PF_RING] Transparent Mode 0
> Dec  4 08:17:49 host-01 kernel: [ 1489.974519] [PF_RING] IP Defragment    No
> Dec  4 08:17:49 host-01 kernel: [ 1489.974520] [PF_RING] Initialized correctly
> Dec  4 08:17:49 host-01 kernel: [ 1489.974535] [PF_RING] registered /proc/net/pf_ring/
> Dec  4 08:17:49 host-01 kernel: [ 1489.975093] [PF_RING] successfully allocated 995328 bytes at 0xffffc20013562000
> Dec  4 08:17:49 host-01 kernel: [ 1489.975096] [PF_RING] allocated 4111 slots [slot_len=242][tot_mem=995328]
> Dec  4 08:17:49 host-01 kernel: [ 1489.975226] device eth14 entered promiscuous mode
> Dec  4 08:18:54 host-01 kernel: [ 1554.780009] Modules linked in: pf_ring ixgbe ioatdma msr isofs udf crc_itu_t video output input_polldev lp parport iTCO_wdt iTCO_vendor_support psmouse pcspkr serio_raw tpm_infineon tpm i7300_idle joydev tpm_bios shpchp dca ses enclosure usbhid usb_storage qla2xxx scsi_transport_fc scsi_tgt mptsas niu mptscsih mptbase e1000e scsi_transport_sas fbcon tileblit font bitblit softcursor [last unloaded: ixgbe]
> Dec  4 08:18:54 host-01 kernel: [ 1554.780009] CPU 0:
> Dec  4 08:18:54 host-01 kernel: [ 1554.780009] Modules linked in: pf_ring ixgbe ioatdma msr isofs udf crc_itu_t video output input_polldev lp parport iTCO_wdt iTCO_vendor_support psmouse pcspkr serio_raw tpm_infineon tpm i7300_idle joydev tpm_bios shpchp dca ses enclosure usbhid usb_storage qla2xxx scsi_transport_fc scsi_tgt mptsas niu mptscsih mptbase e1000e scsi_transport_sas fbcon tileblit font bitblit softcursor [last unloaded: ixgbe]
> Dec  4 08:18:54 host-01 kernel: [ 1554.780009] Pid: 4851, comm: tnapi(eth14.0) Not tainted 2.6.28-11-server #42-Ubuntu
> Dec  4 08:18:54 host-01 kernel: [ 1554.780009] RIP: 0010:[<ffffffffa0255c92>]  [<ffffffffa0255c92>] ixgbe_clean_rxtx_many+0x232/0x2d0 [ixgbe]
> Dec  4 08:18:54 host-01 kernel: [ 1554.780009] RSP: 0018:ffffffff80a9ae40  EFLAGS: 00000202
> Dec  4 08:18:54 host-01 kernel: [ 1554.780009] RAX: ffff88201ac82320 RBX: ffffffff80a9aea0 RCX: ffff88201ac82310
> Dec  4 08:18:54 host-01 kernel: [ 1554.780009] RDX: ffff88201ac82490 RSI: 0000000000000010 RDI: 0000000000000202
> Dec  4 08:18:54 host-01 kernel: [ 1554.780009] RBP: ffffffff80a9adc0 R08: 0000000000000010 R09: ffff88201ac82358
> Dec  4 08:18:54 host-01 kernel: [ 1554.780009] R10: ffff8800a75dd000 R11: 0000000000000000 R12: ffffffff80213668
> Dec  4 08:18:54 host-01 kernel: [ 1554.780009] R13: ffffffff80a9adc0 R14: 0000000000000001 R15: 0000000000000009
> Dec  4 08:18:54 host-01 kernel: [ 1554.780009] FS:  0000000000000000(0000) GS:ffffffff80aa3000(0000) knlGS:0000000000000000
> Dec  4 08:18:54 host-01 kernel: [ 1554.780009] CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
> Dec  4 08:18:54 host-01 kernel: [ 1554.780009] CR2: 00007f8a00d25540 CR3: 00000020140b8000 CR4: 00000000000006a0
> Dec  4 08:18:54 host-01 kernel: [ 1554.780009] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> Dec  4 08:18:54 host-01 kernel: [ 1554.780009] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Dec  4 08:18:54 host-01 kernel: [ 1554.780009] Call Trace:
> Dec  4 08:18:54 host-01 kernel: [ 1554.780009]  <IRQ>  [<ffffffff805b6ae4>] ? net_rx_action+0x104/0x240
> Dec  4 08:18:54 host-01 kernel: [ 1554.780009]  [<ffffffff80256a5c>] ? __do_softirq+0x9c/0x170
> Dec  4 08:18:54 host-01 kernel: [ 1554.780009]  [<ffffffff80213d8c>] ? call_softirq+0x1c/0x30
> Dec  4 08:18:54 host-01 kernel: [ 1554.780009]  [<ffffffff80214ffd>] ? do_softirq+0x5d/0xa0
> Dec  4 08:18:54 host-01 kernel: [ 1554.780009]  [<ffffffff802567dd>] ? irq_exit+0x8d/0xa0
> Dec  4 08:18:54 host-01 kernel: [ 1554.780009]  [<ffffffff802152c5>] ? do_IRQ+0xc5/0x110
> Dec  4 08:18:54 host-01 kernel: [ 1554.780009]  [<ffffffff80212bf3>] ? ret_from_intr+0x0/0x29
> Dec  4 08:18:54 host-01 kernel: [ 1554.780009]  <EOI>  [<ffffffff805c8411>] ? eth_type_trans+0x1/0x170
> Dec  4 08:18:54 host-01 kernel: [ 1554.780009]  [<ffffffffa0252123>] ? pkt_poll_thread+0x1d3/0x5d0 [ixgbe]
> Dec  4 08:18:54 host-01 kernel: [ 1554.780009]  [<ffffffff80268890>] ? autoremove_wake_function+0x0/0x40
> Dec  4 08:18:54 host-01 kernel: [ 1554.780009]  [<ffffffffa0251f50>] ? pkt_poll_thread+0x0/0x5d0 [ixgbe]
> Dec  4 08:18:54 host-01 kernel: [ 1554.780009]  [<ffffffff80268429>] ? kthread+0x49/0x90
> Dec  4 08:18:54 host-01 kernel: [ 1554.780009]  [<ffffffff80213979>] ? child_rip+0xa/0x11
> Dec  4 08:18:54 host-01 kernel: [ 1554.780009]  [<ffffffff802683e0>] ? kthread+0x0/0x90
> Dec  4 08:18:54 host-01 kernel: [ 1554.780009]  [<ffffffff8021396f>] ? child_rip+0x0/0x11
> [...snipped...]
> 