12c Oracle DCD – kernel call dive

As you probably read in the 12c documentation Oracle has changed the mechanism for DCD (Dead Connection Detection) from sending TNS packets to using TCP KEEPALIVE on the socket. Unfortunately, the documentation is extremely sparse about implementation details:

When I was tracking down a firewall configuration issue recently I was in need of a few more details. Specifically, I wanted to know what TCP_KEEPIDLE, TCP_KEEPINTVL, and TCP_KEEPCNT are set to for various values of SQLNET.EXPIRE_TIME.
Since Oracle reads sqlnet.ora and sets the socket options early in the process startup (spawned by the listener) I opted for SystemTap to caputre all calls to “setsockopt”. Unfortunately, on my 7.4 Oracle Linux I ran into a stupid SystemTap bug. So, DTrace it is:

syscall::setsockopt:entry
{
  printf("%s | socket=%d | level=%d | optname=%d | optlen=%d", execname, arg0, arg1, arg2, arg4);
  self->optval = *(int *)copyin(arg3, arg4);
}

syscall::setsockopt:return
{
  printf("optval=%d\n", self->optval);
}

I ran it on dedicated lab VM to minimize cluttering of the output as the D script captures all “setsockopt” calls, not just the ones from Oracle.
But, before I show the output we need to know a few things.

The function definition for “setsockopt”:
This is the kernel function to modify socket options like TCP_KEEPALIVE, etc.

int setsockopt(
  int sockfd,
  int level,
  int optname,
  const void *optval,
  socklen_t optlen);
sockfd  => file descriptor for the socket
level   => the level on which to set the option/value (socket or any protocol)
optname => the number representing the option name according to header files
optval  => the value to set for the given option

Note, the sequence of arguments matches arg0 .. arg4 in above D script.

The relevant socket level & options (from socket.h):

# level
1 => SOL_SOCKET
 
# options
9  => SO_KEEPALIVE
20 => SO_RCVTIMEO
21 => SO_SNDTIMEO

The relevant TCP level & options (from tcp.h)

# level
6 => SOL_TCP
 
# options
1 => TCP_NODELAY
4 => TCP_KEEPIDLE
5 => TCP_KEEPINTVL
6 => TCP_KEEPCNT

Now we’ve got all the information necessary to interpret the output from my D script.

SQLNET.EXPIRE_TIME = 1

dtrace: script 'sockopt.d' matched 2 probes
CPU     ID                    FUNCTION:NAME
  0    125                 setsockopt:entry tnslsnr | socket=14 | level=6 | optname=1 | optlen=4
  0    126                setsockopt:return optval=1
  
  1    125                 setsockopt:entry oracle_7090_dev | socket=14 | level=6 | optname=1 | optlen=4
  1    126                setsockopt:return optval=1
  
  1    125                 setsockopt:entry oracle_7090_dev | socket=14 | level=1 | optname=9 | optlen=4
  1    126                setsockopt:return optval=1
  
  1    125                 setsockopt:entry oracle_7090_dev | socket=14 | level=6 | optname=4 | optlen=4
  1    126                setsockopt:return optval=60
  
  1    125                 setsockopt:entry oracle_7090_dev | socket=14 | level=6 | optname=5 | optlen=4
  1    126                setsockopt:return optval=6
  
  1    125                 setsockopt:entry oracle_7090_dev | socket=14 | level=6 | optname=6 | optlen=4
  1    126                setsockopt:return optval=10

The output shows the following:

  • TCP_KEEPALIVE is enabled
  • TCP_KEEPIDLE is set to the value of SQLNET.EXPIRE_TIME in seconds
  • TCP_KEEPINTVL is set to 6
  • TCP_KEEPCNT is set to 10

OK, let’s gradually increase SQLNET.EXPIRE_TIME and see what happens.

SQLNET.EXPIRE_TIME = 10

... 
  0    125                 setsockopt:entry oracle_6400_dev | socket=14 | level=6 | optname=4 | optlen=4
  0    126                setsockopt:return optval=600
  
  0    125                 setsockopt:entry oracle_6400_dev | socket=14 | level=6 | optname=5 | optlen=4
  0    126                setsockopt:return optval=6
  
  0    125                 setsockopt:entry oracle_6400_dev | socket=14 | level=6 | optname=6 | optlen=4
  0    126                setsockopt:return optval=10
...

SQLNET.EXPIRE_TIME = 60

...
  1    125                 setsockopt:entry oracle_7147_dev | socket=14 | level=6 | optname=4 | optlen=4
  1    126                setsockopt:return optval=3600
  
  1    125                 setsockopt:entry oracle_7147_dev | socket=14 | level=6 | optname=5 | optlen=4
  1    126                setsockopt:return optval=6
  
  1    125                 setsockopt:entry oracle_7147_dev | socket=14 | level=6 | optname=6 | optlen=4
  1    126                setsockopt:return optval=10
...

SQLNET.EXPIRE_TIME = 120

...
  1    125                 setsockopt:entry oracle_7163_dev | socket=14 | level=6 | optname=4 | optlen=4
  1    126                setsockopt:return optval=7200
  
  1    125                 setsockopt:entry oracle_7163_dev | socket=14 | level=6 | optname=5 | optlen=4
  1    126                setsockopt:return optval=6
  
  1    125                 setsockopt:entry oracle_7163_dev | socket=14 | level=6 | optname=6 | optlen=4
  1    126                setsockopt:return optval=10
...

Conclusion
From this I conclude that Oracle always sets TCP_KEEPIDLE to the value of SQLNET.EXPIRE_TIME in seconds.
TCP_KEEPCNT and TCP_KEEPCNT are not adjusted based on SQLNET.EXPIRE_TIME but are always set to 6 and 10, respectively.

For the sake of completeness, here’s what happens when the connection closes.

 
  1    125                 setsockopt:entry oracle_7090_dev | socket=14 | level=1 | optname=21 | optlen=16
  1    126                setsockopt:return optval=0
  
  1    125                 setsockopt:entry oracle_7090_dev | socket=14 | level=1 | optname=20 | optlen=16
  1    126                setsockopt:return optval=0
  
  1    125                 setsockopt:entry oracle_7090_dev | socket=14 | level=1 | optname=9 | optlen=4
  1    126                setsockopt:return optval=0

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.