Many threads "stuck" when using async loggers


Many threads "stuck" when using async loggers

John.E.Gregg
Hi all,

I'm using Log4j2 2.11.2 and am evaluating switching from custom async appenders to async loggers.

I've been running scalability tests with async loggers. I sometimes get good results, but at high loads the results are mostly terrible. When that happens, I typically see many threads in this state:

http-bio-7114-exec-406 - priority:5 - threadId:0x00007f2f2c0cc000 - nativeId:0x4b3e - nativeId (decimal):19262 - state:TIMED_WAITING
stackTrace:
java.lang.Thread.State: TIMED_WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:338)
at com.lmax.disruptor.MultiProducerSequencer.next(MultiProducerSequencer.java:136)
at com.lmax.disruptor.MultiProducerSequencer.next(MultiProducerSequencer.java:105)
at com.lmax.disruptor.RingBuffer.publishEvent(RingBuffer.java:465)
at com.lmax.disruptor.dsl.Disruptor.publishEvent(Disruptor.java:331)
at org.apache.logging.log4j.core.async.AsyncLoggerDisruptor.enqueueLogMessageInfo(AsyncLoggerDisruptor.java:222)
at org.apache.logging.log4j.core.async.AsyncLogger.handleRingBufferFull(AsyncLogger.java:181)
at org.apache.logging.log4j.core.async.AsyncLogger.publish(AsyncLogger.java:166)
at org.apache.logging.log4j.core.async.AsyncLogger.logWithThreadLocalTranslator(AsyncLogger.java:161)
at org.apache.logging.log4j.core.async.AsyncLogger.logMessage(AsyncLogger.java:131)
at org.apache.logging.log4j.spi.ExtendedLoggerWrapper.logMessage(ExtendedLoggerWrapper.java:217)
at org.apache.logging.log4j.spi.AbstractLogger.tryLogMessage(AbstractLogger.java:2170)
at org.apache.logging.log4j.spi.AbstractLogger.logMessageTrackRecursion(AbstractLogger.java:2125)
at org.apache.logging.log4j.spi.AbstractLogger.logMessageSafely(AbstractLogger.java:2108)
at org.apache.logging.log4j.spi.AbstractLogger.logMessage(AbstractLogger.java:2002)
at org.apache.logging.log4j.spi.AbstractLogger.logIfEnabled(AbstractLogger.java:1974)

Is this a sign that the log rate is too high?

If I count the number of log messages per minute, they go from 70k (1x load) to 140k (2x) to 210k (3x) very briefly.  That's over 3k messages per second.  Then the log volume drops way down to under 1k per minute for a few minutes before rising to a few thousand per minute.  There were only a few hundred log messages per minute when the thread dump was taken.  CPU was 100% at the start of 3x (no surprise), then dropped to around 50% even though application throughput was almost 0.  Context switching was higher than I've ever seen it (hundreds of thousands).  It's interesting that CPU usage stayed that high even though application throughput was so low.  Could there have been a large backlog of log messages being processed?

The application did recover when I dropped the load to 1x.  CPU usage and log volume went back to normal.

The garbage collection logs also reflect that almost no work was being done.

We're using -Dlog4j2.contextSelector=org.apache.logging.log4j.core.async.AsyncLoggerContextSelector.

All of the log messages go to a single RollingFileAppender.  Disk space was about 2/3 used.
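
For reference, the configuration is essentially the following (a simplified sketch; file names, layout and rollover policy here are placeholders rather than our exact settings):

<?xml version="1.0" encoding="UTF-8"?>
<Configuration status="warn">
  <Appenders>
    <!-- Single appender; every async logger feeds this one file. -->
    <RollingFile name="app" fileName="logs/app.log"
                 filePattern="logs/app-%d{yyyy-MM-dd}-%i.log.gz">
      <PatternLayout pattern="%d{ISO8601} [%t] %-5level %logger{36} - %msg%n"/>
      <Policies>
        <SizeBasedTriggeringPolicy size="100 MB"/>
      </Policies>
    </RollingFile>
  </Appenders>
  <Loggers>
    <!-- All loggers are async because of the AsyncLoggerContextSelector property above. -->
    <Root level="info">
      <AppenderRef ref="app"/>
    </Root>
  </Loggers>
</Configuration>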

Another common thing that I see in the thread dumps when that happens is a thread like this blocking many others:

"http-bio-7114-exec-101" #2171 daemon prio=5 os_prio=0 tid=0x00007f2f5c083800 nid=0x1702 runnable [0x00007f2ea7e02000]
   java.lang.Thread.State: RUNNABLE
                at java.lang.Thread.setPriority0(Native Method)
                at java.lang.Thread.setPriority(Thread.java:1095)
                at java.lang.Thread.init(Thread.java:417)
                at java.lang.Thread.init(Thread.java:349)
                at java.lang.Thread.<init>(Thread.java:499)
              ...

We have a custom multi-threaded async appender that can handle this load.

Any suggestions?



Thanks

John

Re: Many threads "stuck" when using async loggers

Anthony Maire
Hi John

If you see this stack trace in the threads that are producing messages, then the disruptor's ring buffer is full because the appender is not able to write fast enough.

Since you have a single appender, it might be useful to switch to RollingRandomAccessFileAppender, which can batch writes to disk and reach a higher throughput if you are limited by I/O.

In your case I would probably run a profiling session with Java Mission Control, or any other sampling profiler you are used to, to see whether the slowness in the single appender thread is caused by I/O or by the processing/formatting of the messages themselves.
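
For reference, the change would just be the appender element, roughly like this (untested sketch; attribute values are placeholders):

<RollingRandomAccessFile name="app" fileName="logs/app.log"
                         filePattern="logs/app-%d{yyyy-MM-dd}-%i.log.gz"
                         immediateFlush="false">
  <!-- Writes go through an internal ByteBuffer; with async loggers the buffer
       is still flushed at the end of each batch even with immediateFlush=false. -->
  <PatternLayout pattern="%d{ISO8601} [%t] %-5level %logger{36} - %msg%n"/>
  <Policies>
    <SizeBasedTriggeringPolicy size="100 MB"/>
  </Policies>
</RollingRandomAccessFile>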

Best regards,
Anthony


Re: Many threads "stuck" when using async loggers

Remko Popma-2
The first stack trace indicates that the application produces log events faster than they can be consumed, so let's take a look at the components that sit downstream from the ring buffer.

You mentioned you're using a RollingFileAppender. That should usually not be a bottleneck, depending on your rollover policy. How often do you roll over?

(RollingRandomAccessFileAppender is a bit faster but not that much, see https://logging.apache.org/log4j/2.x/performance.html#tradeoffs ).

Another potential source of slowness is logging with location (https://logging.apache.org/log4j/2.x/performance.html#asyncLoggingWithLocation).
Are you using layout options like the HTML layout's locationInfo, or one of the patterns %C or %class, %F or %file, %l or %location, %L or %line, %M or %method?
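
To illustrate: with async loggers, the second pattern below forces Log4j to capture the caller's location (a stack walk) for every event, while the first does not. The patterns themselves are just examples:

<!-- no location converters: cheap -->
<PatternLayout pattern="%d{ISO8601} [%t] %-5level %logger{36} - %msg%n"/>

<!-- %C, %M and %L need caller location, which is expensive for every event -->
<PatternLayout pattern="%d{ISO8601} [%t] %-5level %C.%M:%L - %msg%n"/>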

Also, do you have any custom components like Filters or Lookups configured?

Finally, I don’t know yet what to think of that 2nd stack trace with setPriority... It could be significant, but it could also be a red herring. Let’s look further into that after excluding the usual suspects I mentioned above.

Remko.

(Shameless plug) Every java main() method deserves http://picocli.info


RE: Many threads "stuck" when using async loggers

John.E.Gregg
In reply to this post by Anthony Maire
Thanks Anthony,

I tried the RollingRandomAccessFileAppender but got the same results.

I think the issue is not I/O but rather the masking of sensitive data that we do.  The actual masking is done in an implementation of AbstractStringLayout, so it happens on the single background thread, Log4j2-TF-1-AsyncLogger.
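
For context, the layout is conceptually something like the sketch below (heavily simplified; the class name and the regex are placeholders, not our real masking rules). The point is that toSerializable() runs once per event on that single background thread:

import java.nio.charset.StandardCharsets;
import java.util.regex.Pattern;

import org.apache.logging.log4j.core.LogEvent;
import org.apache.logging.log4j.core.config.plugins.Plugin;
import org.apache.logging.log4j.core.config.plugins.PluginFactory;
import org.apache.logging.log4j.core.layout.AbstractStringLayout;

@Plugin(name = "MaskingLayout", category = "Core", elementType = "layout", printObject = true)
public final class MaskingLayout extends AbstractStringLayout {

    // Placeholder rule: mask anything that looks like a 13-16 digit number.
    private static final Pattern SENSITIVE = Pattern.compile("\\b\\d{13,16}\\b");

    private MaskingLayout() {
        super(StandardCharsets.UTF_8);
    }

    @PluginFactory
    public static MaskingLayout createLayout() {
        return new MaskingLayout();
    }

    @Override
    public String toSerializable(final LogEvent event) {
        // With async loggers, this runs on the single Log4j2-TF-1-AsyncLogger thread.
        final String text = event.getMessage().getFormattedMessage();
        return SENSITIVE.matcher(text).replaceAll("****") + System.lineSeparator();
    }
}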

Is there an OOTB way to use more threads for that purpose?  Previously we used our own async appender implementation that used multiple threads with good results.

Alternatively, is there a point where we can hook in some custom code to do that masking on the producer threads?

Thanks



Re: Many threads "stuck" when using async loggers

Gary Gregory-4
Maybe a custom org.apache.logging.log4j.message.MessageFactory2 ?

There might be a general solution, feel free to offer a PR.
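
An untested sketch of what that might look like (the class name and the masking regex are placeholders):

import java.util.regex.Pattern;

import org.apache.logging.log4j.message.AbstractMessageFactory;
import org.apache.logging.log4j.message.Message;
import org.apache.logging.log4j.message.ParameterizedMessage;
import org.apache.logging.log4j.message.SimpleMessage;

public class MaskingMessageFactory extends AbstractMessageFactory {

    private static final long serialVersionUID = 1L;

    // Placeholder rule: mask anything that looks like a 13-16 digit number.
    private static final Pattern SENSITIVE = Pattern.compile("\\b\\d{13,16}\\b");

    private static String mask(final String text) {
        return SENSITIVE.matcher(text).replaceAll("****");
    }

    @Override
    public Message newMessage(final String message) {
        return new SimpleMessage(mask(message));
    }

    @Override
    public Message newMessage(final String message, final Object... params) {
        // Format eagerly here, i.e. on the application (producer) thread,
        // then hand an already-masked, immutable message to the async logger.
        return new SimpleMessage(mask(new ParameterizedMessage(message, params).getFormattedMessage()));
    }
}

Because newMessage() formats eagerly, both parameter substitution and masking would happen on the calling threads before the event ever reaches the ring buffer. The factory can be passed per logger, e.g. LogManager.getLogger(MyClass.class, new MaskingMessageFactory()); there may also be a system property to make it the default message factory, but check the docs for the exact name.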

Gary
