The Tale of FTP at Travis CI
Our Build Infrastructure Team had an “adventure” recently, and we wanted to share the findings! If you also think NATs, FTP, and (spoiler) HTTPS are pretty neat, you might find this interesting…
A Bit of History
Until roughly mid-March of 2018, it was possible for any user of travis-ci.org or travis-ci.com to create a sudo-enabled Linux job that could communicate with a remote FTP server for effectively any purpose. Since that time, the use of FTP on sudo-enabled Linux has become… less than usable. Because many, many folks still use FTP every day for all sorts of awesome projects, we wanted to help guide our users towards solutions that should work consistently.
FTP PASV & Network Address Translation
When an FTP client issues a data transfer command to a remote FTP server, such as a directory listing or file retrieval, the default for most modern clients is to switch to “passive mode”. This is usually a good thing, given that the vast majority of FTP clients operate on private subnets or otherwise don’t have a publicly addressable IP address (which, incidentally, has a lot to do with why IPv6 is even a thing at all).
The goal of passive mode is to shift initiation of data transfer connections from the FTP server to the FTP client, so that the server never needs to connect back to the client’s real IP address. This is good because it means FTP clients could keep doing their thing even as the internet outgrew the public IPv4 address space.
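To make the passive-mode handshake a little more concrete, here is a minimal Python sketch of how a client could read the address a server advertises in its reply to PASV. It assumes the common RFC 959 reply shape, `227 ... (h1,h2,h3,h4,p1,p2)`; the helper name `parse_pasv_reply` and the sample reply (using a documentation-only IP address) are ours, purely for illustration.

```python
import re


def parse_pasv_reply(reply: str) -> tuple[str, int]:
    """Return the (host, port) a server advertises in a 227 passive-mode reply."""
    match = re.search(r"\((\d+),(\d+),(\d+),(\d+),(\d+),(\d+)\)", reply)
    if not reply.startswith("227") or match is None:
        raise ValueError(f"not a passive-mode reply: {reply!r}")
    h1, h2, h3, h4, p1, p2 = (int(group) for group in match.groups())
    # The port is sent as two bytes: high byte first, then low byte.
    return f"{h1}.{h2}.{h3}.{h4}", p1 * 256 + p2


# Illustrative reply only; 192.0.2.10 is a documentation address (RFC 5737).
host, port = parse_pasv_reply("227 Entering Passive Mode (192,0,2,10,195,80)")
print(host, port)  # 192.0.2.10 50000
```

The client then opens a second TCP connection to that host and port, which is the step that matters for the rest of this story.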
Here’s a look at standard passive-mode FTP, where the “client” is a build VM and the “ftp server” is the deployment target specified in a
Now, another thing that’s good (or, at least, should be) is the NAT layer we recently introduced into our sudo-enabled Linux infrastructure to make sure we have reasonably consistent outbound IP addresses and to improve our ability to mitigate misuse of free infrastructure. It was a pretty exciting change, though largely internal. This is more maintainable and, we think, generally provides a better experience.
Back to FTP, though: once an FTP client requests a switch to passive mode, the FTP server responds with an address to use for the separate data connection, expecting the same client to start the connection. However, with our current NAT, each TCP/UDP connection may be established through a different NAT host than any previous connections.
This means that the data connection is potentially being attempted by a client that, from the FTP server’s perspective, looks like a total stranger. As one might expect, FTP servers tend to get really worried about the appearance of such a stranger and flat-out reject the connection. Sorry, FTP servers. 😕
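Here is a toy Python model of that rejection, assuming a server that only accepts a data connection from the same IP address it saw on the control connection (a common strict default, though not every FTP server behaves this way). The NAT pool addresses, the strict alternation between NAT hosts, and the function name are all hypothetical; a real NAT picks hosts however it likes.

```python
import itertools

# Hypothetical pool of NAT egress addresses (documentation range, RFC 5737).
NAT_POOL = ["203.0.113.1", "203.0.113.2"]


def data_connection_allowed(control_peer: str, data_peer: str) -> bool:
    """Mimic a strict server: the data peer must match the control peer."""
    return control_peer == data_peer


# Assumption for illustration: consecutive connections alternate NAT hosts.
egress = itertools.cycle(NAT_POOL)
control_peer = next(egress)  # address the server sees on the control connection
data_peer = next(egress)     # address the server sees on the data connection

print(data_connection_allowed(control_peer, data_peer))  # False
```

When both connections happen to leave through the same NAT host, the check passes, which is why the failures can look intermittent rather than total.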
Here’s a look at what happens when passive-mode communications pass through different NATs between the build VM client and the deployment target FTP server: the server rejects the data transfer.
Now What? (i.e. Recommendations for Travis Users)
After much consternation and deliberation, we’ve decided to keep the NAT in place, even though FTP servers will continue to be sad when approached by unfamiliar NAT hosts. As such, FTP can no longer be used reliably on our hosted sudo-enabled Linux infrastructure. However, there are some other options that are known to work well. Yay!
In the simplest cases, we recommend fetching files over HTTP(S) whenever the server supports it, as is the case for many mirrored open-source resources. In cases where FTP must be used and SFTP is an option, we recommend making that switch. If SFTP is not an option, then we recommend tunneling the FTP traffic through something like a VPN connection.
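For the HTTP(S) route, a build script could swap its FTP download for something like the Python sketch below. It assumes the mirror serves the same path over HTTPS, which is common but needs checking for each server. The function names and the `mirror.example.org` URL are hypothetical placeholders.

```python
import shutil
import urllib.request
from urllib.parse import urlsplit, urlunsplit


def https_equivalent(ftp_url: str) -> str:
    """Rewrite an ftp:// URL to https://, assuming the same host and path exist."""
    parts = urlsplit(ftp_url)
    if parts.scheme != "ftp" or parts.hostname is None:
        raise ValueError(f"not an FTP URL: {ftp_url!r}")
    # Use only the hostname, so FTP credentials and the FTP port are dropped.
    return urlunsplit(("https", parts.hostname, parts.path, "", ""))


def download(url: str, dest: str) -> None:
    """Stream the response body to a local file."""
    with urllib.request.urlopen(url, timeout=30) as response, open(dest, "wb") as out:
        shutil.copyfileobj(response, out)


# Hypothetical mirror URL, for illustration only.
url = https_equivalent("ftp://mirror.example.org/pub/tool.tar.gz")
print(url)  # https://mirror.example.org/pub/tool.tar.gz
# download(url, "tool.tar.gz")  # uncomment once the HTTPS path is confirmed
```

A single HTTPS request uses one TCP connection, so it does not matter which NAT host it leaves through.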
If this blog post was interesting/relevant/curious, do let us know! We’re working on sharing more of our learnings and discoveries and would love to hear your perspective on what’s helpful. Give us a shout on Twitter (@travisci), start a conversation on travis-ci.community, or email the fine folk in Customer Success at email@example.com. Looking forward to chatting! 💖