diff options
author | Eric Dumazet <edumazet@google.com> | 2023-10-20 12:57:47 +0000 |
---|---|---|
committer | David S. Miller <davem@davemloft.net> | 2023-10-23 09:35:01 +0100 |
commit | 614e8316aa4cafba3e204cb8ee48bd12b92f3d93 (patch) | |
tree | bc6f18ff35ef1e418218dc944a48aa6d143ae45a /net/ipv4/tcp.c | |
parent | af7721448a609d1912b57c825194ef6e17fc71a4 (diff) | |
download | linux-614e8316aa4cafba3e204cb8ee48bd12b92f3d93.tar.gz linux-614e8316aa4cafba3e204cb8ee48bd12b92f3d93.tar.bz2 linux-614e8316aa4cafba3e204cb8ee48bd12b92f3d93.zip |
tcp: add support for usec resolution in TCP TS values
Back in 2015, Van Jacobson suggested to use usec resolution in TCP TS values.
This has been implemented in our private kernels.
Goals were :
1) better observability of delays in networking stacks.
2) better disambiguation of events based on TSval/ecr values.
3) building block for congestion control modules needing usec resolution.
Back then we implemented a schem based on private SYN options
to negotiate the feature.
For upstream submission, we chose to use a route attribute,
because this feature is probably going to be used in private
networks [1] [2].
ip route add 10/8 ... features tcp_usec_ts
Note that RFC 7323 recommends a
"timestamp clock frequency in the range 1 ms to 1 sec per tick.",
but also mentions
"the maximum acceptable clock frequency is one tick every 59 ns."
[1] Unfortunately RFC 7323 5.5 (Outdated Timestamps) suggests
to invalidate TS.Recent values after a flow was idle for more
than 24 days. This is the part making usec_ts a problem
for peers following this recommendation for long living
idle flows.
[2] Attempts to standardize usec ts went nowhere:
https://www.ietf.org/proceedings/97/slides/slides-97-tcpm-tcp-options-for-low-latency-00.pdf
https://datatracker.ietf.org/doc/draft-wang-tcpm-low-latency-opt/
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Diffstat (limited to 'net/ipv4/tcp.c')
-rw-r--r-- | net/ipv4/tcp.c | 18 |
1 files changed, 14 insertions, 4 deletions
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c index 805f8341064f..b961364b4961 100644 --- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -3629,10 +3629,16 @@ int do_tcp_setsockopt(struct sock *sk, int level, int optname, tp->fastopen_no_cookie = val; break; case TCP_TIMESTAMP: - if (!tp->repair) + if (!tp->repair) { err = -EPERM; - else - WRITE_ONCE(tp->tsoffset, val - tcp_clock_ts(false)); + break; + } + /* val is an opaque field, + * and low order bit contains usec_ts enable bit. + * Its a best effort, and we do not care if user makes an error. + */ + tp->tcp_usec_ts = val & 1; + WRITE_ONCE(tp->tsoffset, val - tcp_clock_ts(tp->tcp_usec_ts)); break; case TCP_REPAIR_WINDOW: err = tcp_repair_set_window(tp, optval, optlen); @@ -4143,7 +4149,11 @@ int do_tcp_getsockopt(struct sock *sk, int level, break; case TCP_TIMESTAMP: - val = tcp_clock_ts(false) + READ_ONCE(tp->tsoffset); + val = tcp_clock_ts(tp->tcp_usec_ts) + READ_ONCE(tp->tsoffset); + if (tp->tcp_usec_ts) + val |= 1; + else + val &= ~1; break; case TCP_NOTSENT_LOWAT: val = READ_ONCE(tp->notsent_lowat); |