[Ntop-misc] Kernel panic with PF_RING on 2.6.14
Amit D. Chaudhary
amit_ml at rajgad.com
Wed Jun 21 23:47:52 CEST 2006
I got at least a couple of private emails asking whether the problem I ran
into was solved.
The problem was due to a faulty hardware platform. We could not pinpoint
which part of it was at fault (memory, processor, etc.); it was a regular P4
pizza box for data centers.
The problem did not occur on other systems, and we RMA'ed the faulty hardware.
Thanks
Amit
Amit D. Chaudhary wrote:
> Hi,
>
> I am sending this out in case it tells anyone something else. It happens
> when sending a lot of traffic and then killing pcount or another
> application reading from the interface. There are still some packets in
> the queue, so it is probably a race condition, but I could not nail it down.
>
> Also, note: should
> kfree(ptr);
> in ring_remove be
> kfree(entry);
> instead? It does not avoid the crash below, though.
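On the kfree(ptr) versus kfree(entry) question above, here is a minimal
userspace sketch, not the actual ring.c code. The struct layouts and helper
names are hypothetical. It illustrates that the list node pointer and the
containing entry share an address only when the list member is the first
field; in any other layout only the entry pointer is valid to free.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Minimal userspace stand-in for the kernel's struct list_head. */
struct list_head { struct list_head *next, *prev; };

/* Same idea as the kernel's list_entry(): recover the enclosing struct. */
#define list_entry(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical element layouts; the real struct in ring.c may differ. */
struct elem_list_first { struct list_head list; void *sk; };
struct elem_list_last  { void *sk; struct list_head list; };

/* Byte distance between the allocation start and its embedded list node. */
size_t node_offset_first(void) { return offsetof(struct elem_list_first, list); }
size_t node_offset_last(void)  { return offsetof(struct elem_list_last, list); }

/* Unlink a node and free the allocation that contains it, not the node. */
void remove_and_free(struct list_head *ptr)
{
    struct elem_list_last *entry;

    assert(ptr != NULL);
    entry = list_entry(ptr, struct elem_list_last, list);
    ptr->prev->next = ptr->next;
    ptr->next->prev = ptr->prev;
    free(entry);   /* free(ptr) would pass an interior pointer here */
}
```

If the element in ring.c keeps its list member first, the two calls free the
same address, which would fit the observation that the change does not avoid
the crash. That layout is an assumption and is not confirmed in this thread.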
>
> PS: I am unable to get objdump to give me assembly mixed with source
> (-DS), even after changing -O2 to -O. Any ideas?
>
> Thanks
> Amit
>
> The stack trace and kernel crash are below:
>
> Unable to handle kernel paging request at virtual address 00100108
> printing eip:
> f8a9449d
> *pde = 7e82c067
> Oops: 0000 [#1]
> SMP
> Modules linked in: ring(U) loop(U) ipv6(U) parport_pc(U) lp(U)
> parport(U) autofs4()
> CPU: 1
> EIP: 0060:[<f8a9449d>] Not tainted VLI
> EFLAGS: 00010216 (2.6.14-1.1653_FC4_GEN5smp)
> EIP is at skb_ring_handler+0x44/0x1c9 [ring]
> eax: f8a97140 ebx: f7038300 ecx: 00000001 edx: 00000001
> esi: 00100100 edi: f70ff480 ebp: 00000001 esp: f7e21e4c
> ds: 007b es: 007b ss: 0068
> Process swapper (pid: 0, threadinfo=f7e20000 task=f7e99560)
> Stack: f79a0380 018e0ddb 00000000 f8a94459 f79a0380 00002683 f70ff480
> c02b162a
> f79a0000 c0393090 f79a0000 f70ff480 0000000e f79a0380 00002683
> f70ff480
> f88e04b2 c0416e2c 00000080 00000080 00000080 f79a0380 f70b6630
> 00002683
> Call Trace:
> [<f8a94459>] skb_ring_handler+0x0/0x1c9 [ring]
> [<c02b162a>] netif_receive_skb+0x21/0x34c
> [<f88e04b2>] e1000_clean_rx_irq+0x170/0x4cb [e1000]
> [<c011ba85>] __wake_up+0x32/0x43
> [<f88dfcff>] e1000_clean+0x42/0x122 [e1000]
> [<c02b1b35>] net_rx_action+0xb7/0x1bb
> [<c0124522>] __do_softirq+0x72/0xdc
> [<c01245bf>] do_softirq+0x33/0x36
> [<c0105abe>] do_IRQ+0x1e/0x24
> [<c010437a>] common_interrupt+0x1a/0x20
> [<c0101b71>] mwait_idle+0x25/0x43
> [<c020cc23>] acpi_processor_idle+0xf0/0x291
> [<c0101a04>] cpu_idle+0x4e/0x63
> Code: 7b a9 f8 85 c0 0f 84 0d 01 00 00 8b 35 28 7b a9 f8 c7 44 24 08
> 00 00 00 00 8
> <0>Kernel panic - not syncing: Fatal exception in interrupt
> [<c011f448>] panic+0x45/0x1c4
> [<c0104c0f>] die+0x17b/0x185
> [<c0310530>] do_page_fault+0x0/0x60e
> [<c0310851>] do_page_fault+0x321/0x60e
> [<c0310530>] do_page_fault+0x0/0x60e
> [<c01044d3>] error_code+0x4f/0x54
> [<f8a9449d>] skb_ring_handler+0x44/0x1c9 [ring]
> [<f8a94459>] skb_ring_handler+0x0/0x1c9 [ring]
> [<c02b162a>] netif_receive_skb+0x21/0x34c
> [<f88e04b2>] e1000_clean_rx_irq+0x170/0x4cb [e1000]
> [<c011ba85>] __wake_up+0x32/0x43
> [<f88dfcff>] e1000_clean+0x42/0x122 [e1000]
> [<c02b1b35>] net_rx_action+0xb7/0x1bb
> [<c0124522>] __do_softirq+0x72/0xdc
> [<c01245bf>] do_softirq+0x33/0x36
> [<c0105abe>] do_IRQ+0x1e/0x24
> [<c010437a>] common_interrupt+0x1a/0x20
> [<c0101b71>] mwait_idle+0x25/0x43
> [<c020cc23>] acpi_processor_idle+0xf0/0x291
> [<c0101a04>] cpu_idle+0x4e/0x63
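A few numbers in the trace above can be cross-checked with simple arithmetic.
The sketch below only restates values printed in the oops, plus the two list
poison constants from the kernel's list.h of that era (list_del() writes
LIST_POISON1 into the removed node's next pointer). The helper names are made
up for illustration. It shows that the faulting address is esi + 8, that esi
holds the LIST_POISON1 value, and that EIP is 0x44 bytes into
skb_ring_handler.

```c
#include <assert.h>
#include <stdint.h>

/* List poison values as defined in the kernel's list.h (32-bit x86). */
#define LIST_POISON1 0x00100100u
#define LIST_POISON2 0x00200200u

/* Values copied from the oops text. */
#define OOPS_FAULT_ADDR 0x00100108u  /* "virtual address 00100108" */
#define OOPS_ESI        0x00100100u  /* "esi: 00100100" */
#define OOPS_EIP        0xf8a9449du  /* "EIP: 0060:[<f8a9449d>]" */
#define OOPS_FUNC_START 0xf8a94459u  /* "skb_ring_handler+0x0/0x1c9" */

/* Offset of an address relative to a base value (hypothetical helper). */
uint32_t offset_from(uint32_t addr, uint32_t base)
{
    return addr - base;
}

/* Nonzero if the value equals one of the list poison constants. */
int is_list_poison(uint32_t v)
{
    return v == LIST_POISON1 || v == LIST_POISON2;
}
```

A poisoned pointer in esi would be consistent with walking a list element
that had already been removed, which matches the race suspected above, but it
does not rule out the hardware fault reported at the top of this message. To
find the faulting instruction, plain disassembly (objdump -d) of the module
at skb_ring_handler + 0x44 should be enough; interleaved source with -S
generally needs the module built with debug information (-g).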
>
>
>
> Amit D. Chaudhary wrote:
>
>> Update: The problem went away when running on a different (our older)
>> server. The server with the problem was a new eval box with newer P4
>> chips, etc.
>> The kernel config still has P4\Xeon as the processor.
>>
>> Thanks
>> Amit
>>
>> PS: I do have a serial console and logging for the future.
>>
>>
>> Amit D. Chaudhary wrote:
>>
>>> John,
>>>
>>> I also have Xeon\P4 selected as the main processor. Thanks for the
>>> note, will try this out.
>>>
>>> Brad, this is lame to admit, but I do not have a serial console
>>> cable (Small Co), will get it soon.
>>>
>>> Regards
>>> Amit
>>>
>>> John Hally wrote:
>>>
>>>> I had a similar problem with the latest stable 2.6.15. As soon as I
>>>> fired off a snort instance that was compiled against the ring pcap
>>>> lib/includes, the kernel panicked, but left nothing in the messages
>>>> file. Running on Fedora Core 4, dual 3.4 GHz Xeons with 2 GB RAM using
>>>> the built-in gbit NICs. About the only thing I changed in the kernel
>>>> config is selecting the Xeon processor instead of the generic Pentium
>>>> default.
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> -----Original Message-----
>>>> From: ntop-misc-bounces at listgateway.unipi.it
>>>> [mailto:ntop-misc-bounces at listgateway.unipi.it] On Behalf Of Amit D.
>>>> Chaudhary
>>>> Sent: Wednesday, January 11, 2006 8:10 PM
>>>> To: ntop-misc at listgateway.unipi.it
>>>> Subject: Re: [Ntop-misc] Kernel panic with PF_RING on 2.6.14
>>>>
>>>> Brad,
>>>>
>>>> I will try unpatched kernels for more testing.
>>>>
>>>> We do not use filtering or clustering\transparent mode, etc., so I
>>>> would not notice any problems there.
>>>>
>>>> Thanks
>>>> Amit
>>>>
>>>> Brad Doctor wrote:
>>>>
>>>>
>>>>
>>>>> Never seen a panic like this, definitely not. But that isn't a
>>>>> pristine kernel and I don't know what Fedora is putting into their
>>>>> kernels these days.
>>>>> All of mine are from kernel.org and patched just a little, mostly
>>>>> drivers for the Syskonnects. sky2 presently since sk98lin was
>>>>> apparently EOL'ed since .13 or so. The 0.9 version of the driver
>>>>> (included in src) is good but the .10 driver is not so good.
>>>>> Anyway...:)
>>>>>
>>>>> I've been using 2.6.15 and have been far happier with it. Basically
>>>>> skipped serious testing from 2.6.11 to .15. .10 was good, .11 was
>>>>> horrible, and .15 is great so far!
>>>>>
>>>>> On the performance part below - great news! And the filtering is
>>>>> proper?
>>>>>
>>>>> -brad
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> Recently I upgraded a test system from 2.6.12 to 2.6.14
>>>>>> (2.6.14-1.1653_FC4smp to be precise). This is a single-processor
>>>>>> P4 box with e1000 as the net driver.
>>>>>> With high traffic coming in (650 Mbps for 2 minutes), the system
>>>>>> crashes with the stack trace ending in skb_ring_handler() with
>>>>>> error_code and do_page_fault behind it. Any ideas on what is wrong?
>>>>>> This is reproducible and happens every 2-3 times after capturing
>>>>>> part of the traffic.
>>>>>> /etc/modprobe entry:
>>>>>> options ring bucket_len=96 num_slots=72767
>>>>>>
>>>>>> Is anyone else running 2.6.14 with high-speed traffic? If yes, what
>>>>>> version?
>>>>>>
>>>>>> Nothing is changed in the ring code; it is the same as CVS.
>>>>>> I am using libpcap-0.9.4 and pcap_poll to read, but that should
>>>>>> have no impact on this.
>>>>>>
>>>>>> I do not have the complete stack trace handy because, for now, there
>>>>>> is no console logging.
>>>>>>
>>>>>> Brad, on the performance question you had earlier about the newer
>>>>>> libpcap and kernel: they are pretty close to each other, within 1%
>>>>>> either way between runs for the above traffic.
>>>>>>
>>>>>> Thanks
>>>>>> Amit
>>>>>> _______________________________________________
>>>>>> Ntop-misc mailing list
>>>>>> Ntop-misc at listgateway.unipi.it
>>>>>> http://listgateway.unipi.it/mailman/listinfo/ntop-misc