Getting random CONNECTION RST responses from Fastly

I’ve created this small repo to showcase how the problem happens:

Here are the logs from that program:

./connection_test --host="https://www.fastly.com"
Will attempt to connect 100 times to https://www.fastly.com
Get "https://www.fastly.com": read tcp 10.10.10.47:60884->151.101.193.57:443: read: connection reset by peer
Get "https://www.fastly.com": read tcp 10.10.10.47:60888->151.101.193.57:443: read: connection reset by peer
Get "https://www.fastly.com": read tcp 10.10.10.47:60890->151.101.193.57:443: read: connection reset by peer
Test complete.

I’ve also made a web-based test, though I’ve noticed it is a bit flakier: https://aaomidi.github.io/connection_test/

I’ve set a custom user agent on that CLI utility to make it easier to pinpoint where this problem is:

req.Header.Set("User-Agent", "https://github.com/aaomidi/connection_test")

I’m wondering if this is another case of: https://mailman.nanog.org/pipermail/nanog/2018-September/096871.html

The IP your connections are failing to is anycast, so this could very well be a similar problem of unstable ECMP.
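To illustrate why unstable ECMP hurts anycast TCP, here is a toy sketch. It is not how any particular router implements ECMP: the `flow` type, the `pickPath` function, and the FNV hash are assumptions chosen for illustration. With per-flow hashing, every packet of a connection takes the same path and reaches the same anycast node. If packets of one connection get spread across paths instead, some can land on a different node that has no state for the connection, and that node would typically answer with a RST.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// flow identifies a TCP connection by its addresses and ports.
// Hypothetical type for illustration.
type flow struct {
	srcIP, dstIP     string
	srcPort, dstPort uint16
}

// pickPath hashes the flow and maps it onto one of n equal-cost paths.
// The same flow always yields the same path as long as n is unchanged.
func pickPath(f flow, n int) int {
	h := fnv.New32a()
	fmt.Fprintf(h, "%s:%d>%s:%d", f.srcIP, f.srcPort, f.dstIP, f.dstPort)
	return int(h.Sum32() % uint32(n))
}

func main() {
	// Addresses and ports taken from the error log earlier in the thread.
	f := flow{"10.10.10.47", "151.101.193.57", 60884, 443}

	// Stable per-flow hashing: all packets of the flow share one path.
	for pkt := 0; pkt < 4; pkt++ {
		fmt.Printf("per-flow hash: packet %d -> path %d\n", pkt, pickPath(f, 2))
	}

	// Unstable balancing (modeled here as per-packet round-robin):
	// packets of the same flow alternate between paths.
	for pkt := 0; pkt < 4; pkt++ {
		fmt.Printf("per-packet:    packet %d -> path %d\n", pkt, pkt%2)
	}
}
```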

You could try a TCP traceroute to see what that shows, something like:

mtr -rwbzc 100 -T -P 443 151.101.193.57
HOST: Amirs-MacBook-Pro.local                                      Loss%   Snt   Last   Avg  Best  Wrst StDev
  1. AS???    10.10.10.1                                            0.0%   100    2.1   2.7   1.9   5.6   0.7
  2. AS40898  199.38.69.2                                           0.0%   100    4.0   7.7   3.1  66.5  10.8
        161.199.180.3
     AS40898  161.199.180.3
  3. AS13536  66-152-105-13.static.firstlight.net (66.152.105.13)   0.0%   100    8.4   8.0   5.9  11.4   1.2
        66-109-46-9.static.firstlight.net (66.109.46.9)
     AS13536  66-109-46-9.static.firstlight.net (66.109.46.9)
  4. AS???    be2.albnypscr1.ip.firstlight.net (66.109.52.57)       0.0%   100    6.7   8.4   5.8  16.0   1.8
        be30.bnghnyhecr1.ip.firstlight.net (66.152.98.225)
     AS13536  be30.bnghnyhecr1.ip.firstlight.net (66.152.98.225)
  5. AS13536  be24.albynypsbr2.ip.firstlight.net (66.152.97.41)     2.0%   100  3010. 825.6   6.1 5010. 1179.1
        be23.nycmnyqobr1.ip.firstlight.net (66.152.97.49)
     AS13536  be23.nycmnyqobr1.ip.firstlight.net (66.152.97.49)
  6. AS46887  eqix-ny1-1.fastly.com (198.32.118.104)                0.0%   100    9.6  10.6   9.2  13.6   1.0
        fastly-1.nyiix.net (198.32.160.22)
     AS???    fastly-1.nyiix.net (198.32.160.22)
  7. AS54113  151.101.193.57                                        0.0%   100   10.4  10.6   8.8  25.1   2.0

I’ve also noticed that this is happening to another CDN as well: Microsoft Edge Network.

I’m guessing this lends more credence to anycast handling being the culprit here, and I think it comes back down to First Light Fiber. I’m not sure what the best way to reach them is going to be…

The multi-pathing actually starts inside OEConnect (AS40898), so the problem could be there too.

I will try to reach out to their NOC referencing this discussion.

Perhaps you can contact their support too?


OEConnect has mostly just been forwarding these issues to First Light Fiber. But agreed that it could be starting there as well.

I’ve reached out to a few other stub ASes that use AS13536 as their only upstream to see if they’ve also been noticing this. If they’re not seeing similar patterns of problems, I’ll push back a bit more on OEConnect.

Cheers! The internet really is held together with glue sometimes :sweat_smile:

My ISP has solved this problem. Cheers all!
