Moving To A More Elastic Future: Upcoming Build Infrastructure Migration

Updated: 2015-12-15

We’ve discovered that some parts of our system for creating and destroying VMs on GCE are too aggressive for the various API quotas that GCE has in place. While we’ve been working with GCE Support over the last week to add better back-off and rate-limiting functionality to our services, we’ve not yet reached a level of stability at which we feel comfortable moving our entire workload onto GCE.

We’re pausing the migration in the following state:

travis-ci.com:

  • All Trusty beta builds run on GCE
  • All Legacy Precise builds still run on Blue Box

travis-ci.org:

  • All Legacy Precise and Trusty builds run on GCE, except certain users who’ve specifically opted into still running on Blue Box

We will resume the work with GCE and re-forecast the migration schedule in the first week of January 2016.

If you have any questions, please email support@travis-ci.com.

Thanks for all your patience and understanding as we work to finish out this migration.

Regards,

Brandon Burton
Infrastructure Manager


Over the past couple of years, our growth has been dramatic, and with it our demands on the infrastructure we use have changed. Daily build utilization has grown from some 7,000 jobs per day in 2012 to more than 270,000 jobs per day now.

As Travis CI grew, so did our need for computing capacity. In 2013, we found that the best option for evolving beyond a VirtualBox-based infrastructure was a private cloud infrastructure based on OpenVZ, and that infrastructure has helped us immensely in growing and expanding over the last 2.5+ years.

But as the ecosystem of public cloud providers has grown, so have the available options for purely elastic capacity. Earlier this year we began experimenting with Google Compute Engine (GCE) for running our fully virtualized Ubuntu Trusty builds, and we’ve had great success with it. Some of GCE’s features, like a 10-minute billing minimum with per-minute billing after that, and their preemptible instance support, have proven to be an excellent fit for Travis’s workload. Because of that, we’re taking the next step towards a more fully elastic world by migrating our sudo-enabled Ubuntu Precise builds over to GCE as well.
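To illustrate why that billing model suits short-lived CI jobs, here is a minimal sketch of the arithmetic. It assumes usage beyond the 10-minute minimum is rounded up to the next whole minute; the function name and the rounding rule are illustrative assumptions, not an official pricing calculator.

```python
import math

# Assumption: a 10-minute minimum charge per VM, then per-minute billing
# rounded up to the next whole minute.
MINIMUM_MINUTES = 10


def billed_minutes(runtime_seconds):
    """Return the number of minutes billed for one VM's runtime."""
    if runtime_seconds <= 0:
        return 0
    return max(MINIMUM_MINUTES, math.ceil(runtime_seconds / 60))


# A 90-second job still pays the 10-minute minimum.
print(billed_minutes(90))   # 10
# A job of 12 minutes 34 seconds is billed as 13 minutes.
print(billed_minutes(754))  # 13
```

Under these assumptions, a build VM that lives for roughly the length of one job is billed close to its actual runtime, rather than for a full hour.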

Transition

Starting in the first week of December 2015 (next week!) we’ll begin the process of migrating the entire workload off our OpenVZ platform and onto our new GCE setup. This migration will proceed approximately as outlined below:

Summary

  • Migration of travis-ci.org (public) builds will begin on Tuesday, Dec 1, 2015
  • Migration of travis-ci.com (private) builds will begin on Thursday, Dec 3, 2015

Details for travis-ci.org (public) builds

Date Action
01.12.2015 We’ll begin routing up to 10% of all Legacy Precise builds.
03.12.2015 We’ll begin routing up to 50% of all Legacy Precise builds.
07.12.2015 Our goal will be to have up to 100% of Legacy Precise builds running on GCE by the end of the business day on Dec 9th, Pacific time, depending on regression support requests.

Details for travis-ci.com (private) builds

Date Action
03.12.2015 We’ll begin routing up to 10% of all Legacy Precise builds.
07.12.2015 We’ll begin routing up to 50% of all Legacy Precise builds.
09.12.2015 Our goal will be to have up to 100% of Legacy Precise builds running on GCE by the end of the business day on Dec 9th, Pacific time, depending on regression support requests.
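The staged percentages in the schedules above can be pictured as a deterministic, hash-based split. The following is only a sketch of one common way to do a percentage rollout; it is not a description of how our scheduler actually routes builds, and the function names, the backend labels, and the use of a job ID as the hashing key are all hypothetical.

```python
import hashlib


def rollout_bucket(job_id):
    """Map a job ID to a stable bucket in the range 0..99 (hypothetical helper)."""
    digest = hashlib.sha256(str(job_id).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % 100


def choose_backend(job_id, gce_percent):
    """Route a job to GCE if its bucket falls below the rollout percentage."""
    return "gce" if rollout_bucket(job_id) < gce_percent else "bluebox"


# Count how many of 1,000 example jobs land on GCE at each stage.
for percent in (10, 50, 100):
    on_gce = sum(choose_backend(job, percent) == "gce" for job in range(1000))
    print(percent, on_gce)
```

Because each job's bucket is fixed, raising the percentage only ever adds jobs to the GCE side: anything routed to GCE at 10% stays there at 50% and 100%.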

Effects

This migration should be mostly transparent. As we begin to route builds to the new infrastructure, you’ll be able to tell that your build is running there because you will see something similar to the following near the top of your build log (note in particular the image name in the instance: field, e.g. travis-ci-python-precise-1448037712):

Worker information
hostname: travis-worker-gce-org-prod-2:011c873a-832c-4337-8f7b-33f9ef
version: v1.2.0
instance: testing-gce-f32f9cf5-8b5c-42a4-8fc7-7c5a61e0ae8e:
          travis-ci-python-precise-1448037712
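If you want to check this programmatically, for example across many saved logs, a small parser along these lines may help. It is a sketch that assumes the header looks exactly like the sample above, with the image name wrapped onto an indented continuation line; the function names are hypothetical and the real log layout may vary.

```python
import re

SAMPLE_LOG = """\
Worker information
hostname: travis-worker-gce-org-prod-2:011c873a-832c-4337-8f7b-33f9ef
version: v1.2.0
instance: testing-gce-f32f9cf5-8b5c-42a4-8fc7-7c5a61e0ae8e:
          travis-ci-python-precise-1448037712
"""


def worker_info(log_text):
    """Collect the hostname, version and instance fields from the log header."""
    fields = {}
    last_key = None
    in_block = False
    for line in log_text.splitlines():
        if line.strip() == "Worker information":
            in_block = True
            continue
        if not in_block:
            continue
        match = re.match(r"^(hostname|version|instance):\s*(.*)$", line)
        if match:
            last_key = match.group(1)
            fields[last_key] = match.group(2).strip()
        elif last_key and line.startswith(" ") and line.strip():
            # Indented continuation line, such as the wrapped image name.
            fields[last_key] += line.strip()
        else:
            break  # First line that does not belong to the header.
    return fields


def image_name(log_text):
    """Return the image name from the instance field, or None if it is absent."""
    instance = worker_info(log_text).get("instance")
    if not instance:
        return None
    return instance.rsplit(":", 1)[-1]


print(image_name(SAMPLE_LOG))  # travis-ci-python-precise-1448037712
```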

We’ve done our best to ensure that the software installed in the build images on GCE is identical to that in the existing build image on the previous infrastructure, but there may be some changes due to updates in Ubuntu-provided packages or in tools that were installed from GitHub by our Chef cookbooks.

IPv6 no longer present

The major change that is coming with this migration is that local and external IPv6 networking will no longer be present. While IPv6 was technically never considered a feature, it has been available on the previous infrastructure. With the move to GCE this will no longer work until such time as GCE adds IPv6 support. We do understand this may cause a disruption for some use cases, and while we have considered numerous ways to try to provide IPv6 in the cloud, none of the currently available options are suitable for a large production deployment. We ask for your understanding and patience with the fact that it will not be supported for the near future.
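If your test suite touches IPv6, one way to cope is to detect at runtime whether a local IPv6 address is usable and fall back or skip accordingly. The snippet below is a minimal sketch using only Python's standard socket module; the helper names are hypothetical, and it only probes the loopback address, not external IPv6 connectivity.

```python
import socket


def ipv6_loopback_available():
    """Return True if a socket can be bound to the IPv6 loopback address."""
    if not socket.has_ipv6:
        # Python itself was built without IPv6 support.
        return False
    try:
        with socket.socket(socket.AF_INET6, socket.SOCK_STREAM) as sock:
            sock.bind(("::1", 0))  # Port 0 lets the OS pick a free port.
        return True
    except OSError:
        return False


def loopback_host():
    """Pick a loopback address that works in the current environment."""
    return "::1" if ipv6_loopback_available() else "127.0.0.1"


print(loopback_host())
```

A test that strictly requires IPv6 could call ipv6_loopback_available() and skip itself when it returns False, rather than failing the whole build.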

Try it now?

You can opt in to trying out your builds on the new infrastructure by reviewing the steps outlined here.

If you see any issues

Our support and infrastructure teams will be giving primary focus to any regressions that may be experienced during this migration process.

If you see problems with the new infrastructure’s Precise image as we begin migrating, please open a GitHub issue with [precise-gce] in the subject, or email support@travis-ci.com and include [precise-gce] in the email subject line.

Updates to containerized builds?

Since the new Precise images on GCE do include some updates, we’re planning to update the images used in our containerized builds to match, shortly after we get to 100% in the migration. Look for a blog post announcing that and details on what changes will be included in the update.

Happy testing!