More numbers

Now that the cluster update is done, I was able to start experiments again. It appears that my strange numbers are still here. I installed some more precise monitoring tools, such as sar, and tried to figure out what is going on. Here is the CPU usage graph for my application, which simply receives UDP/SNMP heartbeat traps sent at a 1 ms interval.

CPU Usage graph

As you can see, the periodic signal is still there. To figure out what is going on, I profiled my Java code. Here is the result of the profiling.

rank self accum count trace method
1 99.74% 99.74% 295735 300260
2 0.03% 99.77% 95 300335 org.snmp4j.smi.VariableBinding.<init>
3 0.02% 99.79% 70 300330 org.snmp4j.smi.OctetString.<init>
4 0.02% 99.81% 63 300332 org.snmp4j.smi.OID.<init>
5 0.02% 99.83% 62 300329<init>

So my code is statistically doing nothing, just waiting for the next UDP packet to arrive. At the same time, I get CPU usage that goes up and down, including in supervisor mode. You can also see that the system-level and user-level CPU usages are correlated, so it is clearly UDP packet reception that triggers this stuff. The only good thing is that the problem clearly does not lie in my code.
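
The correlation between user and system CPU usage is read off the graph by eye here. One way to put a number on it would be to compute the Pearson correlation coefficient of the two series once they have been exported from the monitoring tool. The following is only a minimal sketch: the class name `CpuCorrelation`, the method `pearson`, and the sample values in `main` are all hypothetical, and the numbers are made-up placeholders rather than measurements from these experiments.

```java
public final class CpuCorrelation {

    /**
     * Pearson correlation coefficient of two series of equal length.
     * Returns a value in [-1, 1]; the result is NaN if either series is constant.
     */
    public static double pearson(double[] x, double[] y) {
        if (x.length != y.length || x.length < 2) {
            throw new IllegalArgumentException("need two series of equal length >= 2");
        }

        // Mean of each series.
        double meanX = 0.0;
        double meanY = 0.0;
        for (int i = 0; i < x.length; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= x.length;
        meanY /= y.length;

        // Sums of co-deviations and squared deviations from the means.
        double cov = 0.0;
        double varX = 0.0;
        double varY = 0.0;
        for (int i = 0; i < x.length; i++) {
            double dx = x[i] - meanX;
            double dy = y[i] - meanY;
            cov += dx * dy;
            varX += dx * dx;
            varY += dy * dy;
        }
        return cov / Math.sqrt(varX * varY);
    }

    public static void main(String[] args) {
        // Placeholder values for illustration only, NOT data from the cluster.
        double[] userPercent = {2.0, 6.0, 10.0, 6.0, 2.0, 6.0, 10.0};
        double[] systemPercent = {1.0, 3.0, 5.0, 3.0, 1.0, 3.0, 5.0};
        System.out.println("correlation = " + pearson(userPercent, systemPercent));
    }
}
```

A coefficient close to +1 would back up the visual impression that the two curves rise and fall together, while a value near 0 would suggest the apparent match is weaker than it looks. Note that a high coefficient only shows that the two series move together; it does not by itself identify the cause.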
