This article explains the meaning of the BoltProtocolBreachFatality error in the Neo4j server and the possible causes of the exception.
This is an example of the exception:
2023-09-14 09:13:10.953+0000 ERROR [o.n.b.r.DefaultBoltConnection] Protocol breach detected in bolt session 'bolt-512490'. org.neo4j.bolt.runtime.BoltProtocolBreachFatality: Message 'ROLLBACK' cannot be handled by a session in the READY state.
at org.neo4j.bolt.runtime.statemachine.impl.AbstractBoltStateMachine.nextState(AbstractBoltStateMachine.java:155) ~[neo4j-bolt-4.4.23.jar:4.4.23]
at org.neo4j.bolt.runtime.statemachine.impl.AbstractBoltStateMachine.process(AbstractBoltStateMachine.java:98) ~[neo4j-bolt-4.4.23.jar:4.4.23]
at org.neo4j.bolt.messaging.BoltRequestMessageReader.lambda$doRead$1(BoltRequestMessageReader.java:93) ~[neo4j-bolt-4.4.23.jar:4.4.23]
at org.neo4j.bolt.runtime.DefaultBoltConnection.lambda$enqueue$0(DefaultBoltConnection.java:156) ~[neo4j-bolt-4.4.23.jar:4.4.23]
at org.neo4j.bolt.runtime.DefaultBoltConnection.processNextBatchInternal(DefaultBoltConnection.java:250) ~[neo4j-bolt-4.4.23.jar:4.4.23]
at org.neo4j.bolt.runtime.DefaultBoltConnection.processNextBatch(DefaultBoltConnection.java:187) ~[neo4j-bolt-4.4.23.jar:4.4.23]
at org.neo4j.bolt.runtime.DefaultBoltConnection.processNextBatch(DefaultBoltConnection.java:177) ~[neo4j-bolt-4.4.23.jar:4.4.23]
at org.neo4j.bolt.runtime.scheduling.ExecutorBoltScheduler.executeBatch(ExecutorBoltScheduler.java:257) ~[neo4j-bolt-4.4.23.jar:4.4.23]
at org.neo4j.bolt.runtime.scheduling.ExecutorBoltScheduler.lambda$scheduleBatchOrHandleError$3(ExecutorBoltScheduler.java:240) ~[neo4j-bolt-4.4.23.jar:4.4.23]
at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1700) [?:?]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [netty-common-4.1.86.Final.jar:4.1.86.Final]
at java.lang.Thread.run(Thread.java:829) [?:?]
What can cause a protocol breach?
The "Protocol breach detected" error is caused by misuse of the Neo4j driver. The most likely cause is a multithreaded client application sharing a session object across threads. A single connection then carries two interleaved streams of messages, which breaches the protocol.
The rules for using the driver in a threaded application are:
- Driver: thread-safe, can be shared, and should be long-lived.
- Sessions: not thread-safe, and should be short-lived.
Furthermore, sharing a transaction object could technically cause the same problem, as a session encapsulates and manages transactions.
Additionally, sharing a Result cursor object across threads can lead to a protocol breach error.
In general, anything capable of holding or utilising a connection, apart from the Driver, is considered non-thread-safe.
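The rule above can be sketched as follows. This is a minimal illustration, not a definitive implementation: it assumes the official Neo4j Python driver in version 5.x, where sessions expose `execute_read` (the 4.x driver calls the equivalent method `read_transaction`). The function names `count_nodes`, `worker`, and `run_concurrently` and the Cypher query are hypothetical. Only the driver is passed between threads; each task opens and closes its own session, and the result is consumed inside the transaction function.

```python
from concurrent.futures import ThreadPoolExecutor


def count_nodes(tx):
    # Transaction function: consume the result before the transaction ends,
    # so no Result object escapes to another thread.
    return tx.run("MATCH (n) RETURN count(n) AS c").single()["c"]


def worker(driver):
    # Each task opens its own short-lived session; only the driver is shared.
    with driver.session() as session:
        return session.execute_read(count_nodes)


def run_concurrently(driver, tasks=4):
    # The same driver object is handed to every worker thread.
    with ThreadPoolExecutor(max_workers=tasks) as pool:
        return list(pool.map(worker, [driver] * tasks))
```

With the real driver, the `driver` argument would typically come from `neo4j.GraphDatabase.driver(uri, auth=(user, password))`, created once at application start-up and closed at shutdown.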
What does the following error mean exactly?
Message 'ROLLBACK' cannot be handled by a session in the READY state
The Bolt server is in the READY state when it is waiting to receive either a begin-transaction request or an auto-commit run request. READY is the resting state of an authenticated Bolt session.
Once the server has received one of those two messages, its internal state machine checks that every subsequent message is one that is expected in the current state. If the client application uses a session from multiple threads, messages can arrive in an order the state machine does not expect, and the server then throws the protocol breach error. In the example above, a ROLLBACK arrived while no transaction was open on the session.
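To make the state check concrete, here is a deliberately simplified model of it. It is not the server's implementation: the transition table covers only the core request messages, it assumes each result stream is fully consumed by a single PULL or DISCARD, and the names `SessionModel` and `ProtocolBreach` are hypothetical stand-ins.

```python
# Simplified transition table: (current state, message) -> next state.
# Assumption: a result stream is fully consumed by one PULL or DISCARD.
TRANSITIONS = {
    ("READY", "BEGIN"): "TX_READY",
    ("READY", "RUN"): "STREAMING",
    ("STREAMING", "PULL"): "READY",
    ("STREAMING", "DISCARD"): "READY",
    ("TX_READY", "RUN"): "TX_STREAMING",
    ("TX_READY", "COMMIT"): "READY",
    ("TX_READY", "ROLLBACK"): "READY",
    ("TX_STREAMING", "PULL"): "TX_READY",
    ("TX_STREAMING", "DISCARD"): "TX_READY",
}


class ProtocolBreach(Exception):
    """Hypothetical stand-in for the server's BoltProtocolBreachFatality."""


class SessionModel:
    """Tracks the state of one Bolt session and rejects unexpected messages."""

    def __init__(self):
        self.state = "READY"

    def handle(self, message):
        key = (self.state, message)
        if key not in TRANSITIONS:
            raise ProtocolBreach(
                f"Message '{message}' cannot be handled by a session "
                f"in the {self.state} state."
            )
        self.state = TRANSITIONS[key]
        return self.state


session = SessionModel()
for message in ("BEGIN", "RUN", "PULL", "COMMIT"):
    session.handle(message)  # a well-formed explicit transaction ends in READY

try:
    session.handle("ROLLBACK")  # no transaction is open, so this is unexpected
except ProtocolBreach as error:
    print(error)
```

In this model, the final call fails because READY only accepts BEGIN or RUN, which mirrors the message in the log excerpt.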
Example Scenario causing a Protocol breach error:
This is an example of a simple scenario that can cause a Protocol breach:
- A Session is created and takes a connection from the pool.
- Thread one is given the session and starts a transaction function with a query.
- Thread two is also given the session and also starts a transaction function with a query.
- The Bolt server receives the first begin-transaction message over the connection, moves into the corresponding state, and expects a run message to come next.
- The Bolt server then receives a second begin-transaction message on that same connection. It throws the protocol breach exception because the protocol, which specifies that a run message should come next, has been violated.
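The scenario above can be replayed with a small simulation. This is a sketch under stated assumptions: a round-robin merge stands in for two threads writing to the same connection (real interleaving is nondeterministic), the message lists are a simplified picture of what a transaction function sends, and the helper names `interleave` and `first_breach` are hypothetical.

```python
from itertools import chain, zip_longest

# Messages each thread's transaction function would send (simplified).
THREAD_ONE = ["BEGIN", "RUN", "PULL", "COMMIT"]
THREAD_TWO = ["BEGIN", "RUN", "PULL", "COMMIT"]


def interleave(*streams):
    """Round-robin merge, standing in for threads sharing one connection."""
    merged = chain.from_iterable(zip_longest(*streams))
    return [message for message in merged if message is not None]


def first_breach(messages):
    """Return (index, message) of the first BEGIN that arrives while a
    transaction is already open, or None if the sequence is well formed."""
    in_transaction = False
    for index, message in enumerate(messages):
        if message == "BEGIN":
            if in_transaction:
                return index, message
            in_transaction = True
        elif message in ("COMMIT", "ROLLBACK"):
            in_transaction = False
    return None


# One shared session: both threads' messages land on the same connection.
shared = interleave(THREAD_ONE, THREAD_TWO)
print(first_breach(shared))

# One session per thread: each connection sees a well-formed sequence.
print(first_breach(THREAD_ONE), first_breach(THREAD_TWO))
```

The shared stream is rejected at its second message, the second BEGIN, while each thread's own sequence passes the check when it has a connection to itself.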
What to do if you get the BoltProtocolBreachFatality Error?
- Review the driver code in your Neo4j client application and ensure that it follows the thread-safety guidelines above.
- Enable Driver logs for further troubleshooting. Steps to enable the logging can be found here.
- Review the Neo4j server query.log to find out which transactions are running when the issue occurs. Make sure the transaction logging is enabled: