[Ntop-dev] RES: RES: RES: Ntop processing only one packet
Rafael Sarres de Almeida
rafael.almeida at bcb.gov.br
Tue Sep 7 00:46:08 CEST 2010
Hi Luca;
Today I had some time to continue debugging this problem. The workaround I made may work, but I want to fix this properly.
About your questions:
void* dequeuePacket(void* _deviceId) never seems to be executed (or I am doing something very wrong in gdb). I set several breakpoints in this function and none of them is ever hit; only the breakpoints in queuePacket are hit.
So I think the dequeue thread is not running. How can I verify this? I see a lot of THREADMGMT log entries; which one is the dequeue thread? I can see 10 threads when debugging:
(gdb) info thread
10 Thread 0x45a08940 (LWP 10165) 0x000000304ee9a1a1 in nanosleep () from /lib64/libc.so.6
* 9 Thread 0x45007940 (LWP 10149) queuePacket (_deviceId=<value optimized out>, h=0x45007050, p=0x2aaaad594042 "") at pbuf.c:2564
8 Thread 0x44606940 (LWP 10148) 0x000000304ee9a1a1 in nanosleep () from /lib64/libc.so.6
7 Thread 0x43c05940 (LWP 10147) 0x000000304eeccfc2 in select () from /lib64/libc.so.6
6 Thread 0x43204940 (LWP 10144) 0x000000304fa0aee9 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
5 Thread 0x42803940 (LWP 10143) 0x000000304fa0aee9 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
4 Thread 0x41e02940 (LWP 10142) 0x000000304fa0aee9 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
3 Thread 0x41401940 (LWP 10141) 0x000000304ee9a1a1 in nanosleep () from /lib64/libc.so.6
2 Thread 0x40a00940 (LWP 10140) 0x000000304ee9a1a1 in nanosleep () from /lib64/libc.so.6
1 Thread 0x2aaaab7e02c0 (LWP 10127) 0x000000304ee9a1a1 in nanosleep () from /lib64/libc.so.6
If you tell me where in the code this thread should be started, I can try to debug it and find out why it is not running (if that is the case).
Thanks.
Rafael Sarres de Almeida
-----Original Message-----
From: ntop-dev-bounces at listgateway.unipi.it [mailto:ntop-dev-bounces at listgateway.unipi.it] On behalf of Luca Deri
Sent: Tuesday, August 31, 2010 06:03
To: ntop-dev at unipi.it
Subject: Re: [Ntop-dev] RES: RES: Ntop processing only one packet
Rafael
thanks for debugging the code. The software works as follows:
- are we the only one processing packets? If so (i.e. no other threads
are doing this) then process the packet immediately. This turns into
  if(tryLockMutex(&myGlobals.device[deviceId].packetProcessMutex,
                  "queuePacket") == 0) {
    /* Locked so we can process the packet now */
    .....
    processPacket(_deviceId, h, p1);
    releaseMutex(&myGlobals.device[deviceId].packetProcessMutex);
    return;
  }
- if another thread is processing packets already, we need to queue the
packet
  /*
    If we reach this point it means that somebody was already
    processing a packet so we need to queue it.
  */
  if(myGlobals.device[deviceId].packetQueueLen >=
     CONST_PACKET_QUEUE_LENGTH) {
    ...
  }
In this second case ntop notifies the dequeue thread that there's a
packet to process
signalCondvar(&myGlobals.device[deviceId].queueCondvar);
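
The flow described above (try the lock, process immediately if it is free, otherwise queue the packet and signal the dequeue thread) can be sketched in a standalone form. This is only an illustration under assumptions, not the ntop code: it uses plain pthread mutexes and a condition variable instead of ntop's own wrappers, and device_t, QUEUE_LEN, device_init, process_packet, queue_packet, dequeue_thread and device_stop are hypothetical names made up for the sketch.

```c
#include <pthread.h>
#include <stddef.h>

#define QUEUE_LEN 4               /* hypothetical stand-in for CONST_PACKET_QUEUE_LENGTH */

typedef struct {
  pthread_mutex_t process_mutex;  /* held by whoever is processing a packet */
  pthread_mutex_t queue_mutex;    /* protects queue, head, len, dropped, stop */
  pthread_cond_t  queue_cond;     /* signalled when a packet is queued */
  int queue[QUEUE_LEN];           /* packets are plain ints in this sketch */
  int head, len;
  int processed, dropped, stop;
} device_t;

void device_init(device_t *d) {
  pthread_mutex_init(&d->process_mutex, NULL);
  pthread_mutex_init(&d->queue_mutex, NULL);
  pthread_cond_init(&d->queue_cond, NULL);
  d->head = d->len = d->processed = d->dropped = d->stop = 0;
}

/* Must be called with process_mutex held. */
static void process_packet(device_t *d, int pkt) {
  (void)pkt;
  d->processed++;
}

/* Returns 0 if processed immediately, 1 if queued, -1 if dropped. */
int queue_packet(device_t *d, int pkt) {
  if (pthread_mutex_trylock(&d->process_mutex) == 0) {
    /* Nobody else is processing: handle the packet right away. */
    process_packet(d, pkt);
    pthread_mutex_unlock(&d->process_mutex);
    return 0;
  }
  /* Somebody is already processing: queue the packet instead. */
  pthread_mutex_lock(&d->queue_mutex);
  if (d->len >= QUEUE_LEN) {
    d->dropped++;
    pthread_mutex_unlock(&d->queue_mutex);
    return -1;
  }
  d->queue[(d->head + d->len) % QUEUE_LEN] = pkt;
  d->len++;
  pthread_cond_signal(&d->queue_cond);  /* wake the dequeue thread */
  pthread_mutex_unlock(&d->queue_mutex);
  return 1;
}

/* Dequeue thread: sleeps on the condvar until a packet is queued. */
void *dequeue_thread(void *arg) {
  device_t *d = arg;
  for (;;) {
    pthread_mutex_lock(&d->queue_mutex);
    while (d->len == 0 && !d->stop)
      pthread_cond_wait(&d->queue_cond, &d->queue_mutex);
    if (d->len == 0) {            /* stop requested and queue drained */
      pthread_mutex_unlock(&d->queue_mutex);
      return NULL;
    }
    int pkt = d->queue[d->head];
    d->head = (d->head + 1) % QUEUE_LEN;
    d->len--;
    pthread_mutex_unlock(&d->queue_mutex);

    pthread_mutex_lock(&d->process_mutex);
    process_packet(d, pkt);
    pthread_mutex_unlock(&d->process_mutex);
  }
}

/* Ask the dequeue thread to exit once the queue is empty. */
void device_stop(device_t *d) {
  pthread_mutex_lock(&d->queue_mutex);
  d->stop = 1;
  pthread_cond_broadcast(&d->queue_cond);
  pthread_mutex_unlock(&d->queue_mutex);
}
```

In this sketch, if the dequeue thread is never started, or the processing lock is never released, every packet after the first ends up in the queue and nothing drains it.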
Now my question is: are you sure that for some reason the dequeue thread
isn't looping or isn't really awake? Can you please check what happens
in (pbuf.c)
void* dequeuePacket(void* _deviceId) {
}
Just enable the traces (around #ifdef DEBUG) to see what happens there.
Cheers Luca
On 08/30/2010 11:26 PM, Rafael Sarres de Almeida wrote:
> Hi Luca;
> Just to add more info to my previous mail:
> I stepped through the code in gdb for the first packet, and it seems that releaseMutex (pbuf.c:2538) is not releasing the mutex. The code calls releaseMutex after it processes the first packet, but on the next pass tryLockMutex (pbuf.c:2510) fails, so the program thinks the mutex is still locked. Here is the debug session:
>
>
> Breakpoint 4, queuePacket (_deviceId=<value optimized out>, h=0x45007050, p=0x2aaaad590042 "") at pbuf.c:2510
> 2510 if(tryLockMutex(&myGlobals.device[deviceId].packetProcessMutex, "queuePacket") == 0) {
>
> *********** It will process the first packet if the mutex is not locked.
>
>
> (gdb) step
> Mon Aug 30 18:09:35 2010 THREADMGMT[t1094719808]: SIH: Idle host scan thread running [p12154]
> _tryLockMutex (mutexId=0x2aaaab7e1150, where=0x2aaaaad8d59d "queuePacket", fileName=0x2aaaaad8d41a "pbuf.c", fileLine=2510)
> at util.c:2078
> 2078 return(pthread_rwlock_trywrlock(&mutexId->mutex));
> (gdb)
> [New Thread 0x45a08940 (LWP 12189)]
> Mon Aug 30 18:09:42 2010 THREADMGMT[t1168148800]: RRD: Started thread for throughput data collection
> Mon Aug 30 18:09:42 2010 THREADMGMT[t1147169088]: RRD: Data collection thread running [p12154]
> Mon Aug 30 18:09:42 2010 THREADMGMT[t1168148800]: RRD: Throughput data collection: Thread starting [p12154]
> Mon Aug 30 18:09:42 2010 THREADMGMT[t1168148800]: RRD: Throughput data collection: Thread running [p12154]
> 0x000000304fa0a760 in pthread_rwlock_trywrlock () from /lib64/libpthread.so.0
> (gdb)
> Single stepping until exit from function pthread_rwlock_trywrlock,
> which has no line number information.
> queuePacket (_deviceId=<value optimized out>, h=0x45007050, p=0x2aaaad590042 "") at pbuf.c:2514
> 2514 myGlobals.receivedPacketsProcessed++;
> (gdb) break 2538
> Breakpoint 5 at 0x2aaaaad595a7: file pbuf.c, line 2538.
> (gdb) continue
> Continuing.
>
> Breakpoint 5, queuePacket (_deviceId=<value optimized out>, h=0x45007050, p=0x2aaaad590042 "") at pbuf.c:2538
> 2538 releaseMutex(&myGlobals.device[deviceId].packetProcessMutex);
>
> ***************Releasing MUTEX
>
> (gdb) step
> _releaseMutex (mutexId=0x2aaaab7e1150, fileName=0x2aaaaad8d41a "pbuf.c", fileLine=2538) at util.c:2156
> 2156 return(pthread_rwlock_unlock(&mutexId->mutex));
> (gdb)
> 0x000000304eedfa10 in pthread_mutex_unlock () from /lib64/libc.so.6
> (gdb)
> Single stepping until exit from function pthread_mutex_unlock,
> which has no line number information.
> 0x000000304fa0a020 in pthread_mutex_unlock () from /lib64/libpthread.so.0
> (gdb)
> Single stepping until exit from function pthread_mutex_unlock,
> which has no line number information.
> 0x000000304fa0a0d8 in _L_unlock_766 () from /lib64/libpthread.so.0
> (gdb)
> Single stepping until exit from function _L_unlock_766,
> which has no line number information.
> 0x000000304fa0d5e0 in __lll_unlock_wake () from /lib64/libpthread.so.0
> (gdb)
> Single stepping until exit from function __lll_unlock_wake,
> which has no line number information.
> 0x000000304fa0a0e7 in _L_unlock_766 () from /lib64/libpthread.so.0
> (gdb)
> Single stepping until exit from function _L_unlock_766,
> which has no line number information.
> 0x000000304fa0a04e in pthread_mutex_unlock () from /lib64/libpthread.so.0
> (gdb)
>
>
> Any ideas?
>
> Rafael Sarres de Almeida
>
>
_______________________________________________
Ntop-dev mailing list
Ntop-dev at listgateway.unipi.it
http://listgateway.unipi.it/mailman/listinfo/ntop-dev