New Schedule For Sydney Datacenter Migration

We are continuing to experience intermittent issues on Adobe Business Catalyst's legacy Sydney datacenter. All dates/times in this post are in Australian Eastern Daylight Saving Time (+11 GMT). We have systems engineers working on the issue. I know this is the 3rd business day in a row and it is really wearing thin for Partners and site owners alike. Paul Gubbay, a VP of Engineering at Adobe, will be posting on the blog shortly to share some of his thoughts on this very serious situation.

  • Issue: Business Catalyst services hosted on the legacy Sydney Primus Datacenter are exhibiting slow response times. Webpages for these sites are being served slowly, and customers are reporting problems accessing the Admin UI or transferring data through FTP. This is the 3rd business day in a row this has occurred.
  • Time of Incident Start: 2 Feb 2011 11:18AM Australian Eastern Daylight Time
  • Time of Incident End: Ongoing - ETA is unknown
  • Technical Action: Although we installed an extra switch and an extra firewall into the environment yesterday and moved the OpenSRS migration onto the secondary switch/firewall, systems engineers suspect there's still too much HTTP traffic coming through the primary firewall. I mentioned that we were going to put in a load balancer for the 2 firewalls as well, but this was not required in the end because we put the second firewall on the second switch. Our plan now involves moving 2 web servers (out of 3) across to the network using the secondary firewall/switch combination to balance the load (resulting in 5-10 minutes of downtime).

New Schedule For Migration

On to the topic of datacenter migration: given all the feedback we've received in the comments below, we've now scheduled the migration to occur at 1:00AM Sunday 13 February (check local times here) to give the lowest customer impact possible for Australian businesses.

To give you some background, we originally chose 9am Saturday morning because we thought it would help those partners and site owners with externally hosted DNS make the switch in sync with the migration. However, this isn't required anymore. Additionally, you have all made it clear that the impact to your customers' businesses would be unacceptable if we were to do the migration at the original time. With this in mind, the updated details are as follows:

  • What's Happening?: We are migrating all sites and BC application infrastructure from Sydney Primus to Sydney Ultimo in one bulk-move
  • Target Start Date/Time: 1:00AM Sunday 13 February 2011 (Australian EDT) | 6:00AM Saturday 12 February 2011 (US Pacific) | 2:00PM Saturday 12 February 2011 (London) | check local times here
  • Target End Date/Time: 6:00AM Sunday 13 February 2011 (Australian EDT) | 11:00AM Saturday 12 February 2011 (US Pacific) | 7:00PM Saturday 12 February 2011 (London) | check local times here
  • How Long Will It Take? We will have a scheduled maintenance window of 5 hours, during which all sites hosted on Sydney Primus will be unavailable. Partner Portal access and new site creation will be unavailable at this time as well.
  • What are we doing? Simply put, we are going to replicate all databases between Sydney Primus and Sydney Ultimo. We will also set up a high-speed direct datalink between the 2 locations, to ensure databases are kept in sync prior to the migration. At the scheduled time of the migration we will reconfigure DNS settings and make other related BC architectural changes to point to the new Ultimo Datacenter. We will also need to restart all web servers.
  • Customer Impact - Worldwide: During the migration you will not be able to create new BC sites on any datacenter. You will not be able to access the Partner Portal during the maintenance window. No action is required from you.
  • Customer Impact - sites hosted on legacy Sydney DC with redelegated DNS: In addition to the above, all sites hosted on legacy Sydney DC will be offline for the maintenance window of 5 hours. There will be no front-end pages being served or Admin console access. No action is required from you.
  • Customer Impact - sites hosted on legacy Sydney DC with externally hosted DNS: In addition to the 2 points above, you will be required to change your DNS settings with your DNS host (e.g. MelbourneIT, GoDaddy etc.) to point to the IP address of the new datacenter after the migration has started.
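
For anyone who wants to double-check the maintenance window in their own timezone, here is a minimal Python sketch of the conversion. It is only an illustration: the fixed UTC offsets (AEDT +11, US Pacific -8, London +0) are assumptions that hold for February 2011, and the helper name format_window is made up for this example.

```python
from datetime import datetime, timedelta, timezone

# Fixed UTC offsets assumed for February 2011; a real tool would use a
# timezone database so that daylight saving changes are handled correctly.
AEDT = timezone(timedelta(hours=11), "AEDT")  # Sydney, daylight saving time
PST = timezone(timedelta(hours=-8), "PST")    # US Pacific, standard time
GMT = timezone(timedelta(hours=0), "GMT")     # London, winter time


def format_window(start_local, hours, zones):
    """Return (start, end) strings of a maintenance window for each zone."""
    end_local = start_local + timedelta(hours=hours)
    fmt = "%I:%M%p %A %d %B %Y"
    return {
        tz.tzname(None): (
            start_local.astimezone(tz).strftime(fmt),
            end_local.astimezone(tz).strftime(fmt),
        )
        for tz in zones
    }


# Migration start as scheduled above: 1:00AM Sunday 13 February 2011 (AEDT)
start = datetime(2011, 2, 13, 1, 0, tzinfo=AEDT)
window = format_window(start, hours=5, zones=[AEDT, PST, GMT])
for name, (begin, end) in window.items():
    print(f"{name}: {begin} -> {end}")
```

To check another location, add its offset to the zones list; the start time and 5 hour duration stay the same.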

Sites with Externally Hosted DNS - Action Required

There have been some questions around what happens for sites with their DNS externally hosted. The Engineering team is looking into an improved solution right now, which is to keep a proxy server in the legacy Sydney Datacenter so that all requests coming in to the old legacy IP addresses will get routed through to the new datacenter transparently. Likewise, the pages being served will come from the new DC through the proxy server back out to customers. This is not a permanent solution, but it gives you a longer window in which to make your DNS changes and also lessens the impact to your customers when you do make the change. We will likely keep the proxy server running for at least 30 days after the migration before we fully decommission the legacy datacenter.

For partners or site owners with externally hosted DNS, we advise you to set the TTL for your records down to 1800 seconds (30 minutes) during the next week in preparation for the migration, so that when you do make an IP address change following the migration, the settings will take a shorter amount of time to propagate.
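
To show why lowering the TTL ahead of time matters, here is a rough Python sketch of the arithmetic. The 24 hour "old" TTL and the assumption that the IP change is made exactly at the migration start time are hypothetical values chosen for illustration; check your own DNS host for the real figures.

```python
from datetime import datetime, timedelta


def latest_time_to_lower_ttl(change_time, old_ttl_seconds):
    """Resolvers may cache a record for its old TTL, so the lower TTL must be
    published at least that long before the IP address change."""
    return change_time - timedelta(seconds=old_ttl_seconds)


def worst_case_stale_until(change_time, ttl_seconds):
    """A resolver that cached the old IP just before the change may keep
    serving it until the TTL expires."""
    return change_time + timedelta(seconds=ttl_seconds)


change = datetime(2011, 2, 13, 1, 0)  # assumed time of the IP change (local)
OLD_TTL = 86400  # hypothetical existing TTL of 24 hours
NEW_TTL = 1800   # the 30 minute TTL recommended above

print("Lower the TTL no later than:", latest_time_to_lower_ttl(change, OLD_TTL))
print("Old IP may be served until (24h TTL):", worst_case_stale_until(change, OLD_TTL))
print("Old IP may be served until (30m TTL):", worst_case_stale_until(change, NEW_TTL))
```

The point is that the lower TTL only helps once the old, longer TTL has expired everywhere, which is why it should be set well before the migration rather than on the day.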

Thanks for reading and check back in a bit for Paul's post.
Eddy Chan
Business Catalyst Product Manager

Legacy Sydney Datacenter Issues Update

At the time of writing, we continue to experience issues on BC's legacy Sydney datacenter. We have systems engineers working on the issue and I am posting an official update on the situation. Please note that for the purposes of this post, all dates/times are posted as Australian Eastern Daylight Saving (+11 GMT) time.

To give you some background surrounding these issues, we originally had 2 Watchguard firewalls in place in our legacy (Sydney Primus) datacenter, one acting as the primary, the other as a backup. The primary firewall developed a hardware issue, causing last Friday's outage, and we failed over to the backup firewall.

Yesterday, we suffered another major outage from 11am to 5:30pm due to the backup firewall being unable to handle the load. To rectify this, we have installed an additional firewall with a load balancer to distribute the load across 2 firewalls and to try to stabilize the situation. We are also adding another network switch, which will take approximately 2 hours. We are working with the vendor to procure another primary firewall as soon as possible, giving us triple redundancy.

Other actions we are taking to improve stability in the Sydney Primus datacenter include:

  1. Rebooting the NAS server tonight (1AM Wed 2 February 2011) - this will result in 25 minutes of downtime during off-peak hours, however the reboot will free up system resources and improve performance of that server
  2. Throttling OpenSRS mail migration - given that we are experiencing load issues on our firewall, we have taken steps to throttle our OpenSRS migration from Sydney Primus. The legacy mail server was physically located in the same location behind the same firewall as the other servers. This has unfortunately extended our mail migration period by another 72 hours.

System engineers are monitoring the situation 24/7 and you can be assured they are doing everything possible to keep the system stable.

Plan for Migrating to Sydney Ultimo

Obviously, keeping the old DC stable isn't our final fix for these on-going issues. Our medium term goal is to migrate all sites from Sydney Primus to Sydney Ultimo as soon as possible, with the least amount of customer impact. I've just finished meeting with the Engineering and Systems teams, who have put together a technical plan which I'm sharing publicly to keep you informed of the situation. Please be aware that the following is subject to change over the next 10 days.

  • What's Happening?: We are migrating all sites and BC application infrastructure from Sydney Primus to Sydney Ultimo in one bulk-move
  • Target Date/Time: 7am Saturday 12 February 2011 (AEDT). This is 2 weekends from now.
  • How Long Will It Take? We will have a scheduled maintenance window of 5 hours, during which all sites hosted on Sydney Primus will be unavailable
  • What are we doing? Simply put, we are going to replicate all databases between Sydney Primus and Sydney Ultimo. We will also set up a high-speed direct datalink between the 2 locations, to ensure databases are kept in sync prior to the migration. At the scheduled time of the migration we will reconfigure DNS settings and make other related BC architectural changes to point to the new Ultimo Datacenter. We will also need to restart all web servers.
  • Customer Impact - Worldwide: During the migration you will not be able to create new BC sites on any datacenter. You will not be able to access the Partner Portal during the maintenance window. No action is required from you.
  • Customer Impact - sites hosted on legacy Sydney DC with redelegated DNS: In addition to the above, all sites hosted on legacy Sydney DC will be offline for the maintenance window of 5 hours. There will be no front-end pages being served or Admin console access. No action is required from you.
  • Customer Impact - sites hosted on legacy Sydney DC with externally hosted DNS: In addition to the 2 points above, you will be required to change your DNS settings with your DNS host (e.g. MelbourneIT, GoDaddy etc.) to point to the IP address of the new datacenter. More details and instructions on this in the near future.

Over the coming days I will be posting regular communications around this datacenter migration, including detailed instructions if action is required from you or your customers, and more technical details around the plan as well. We've learnt some important lessons from the mail migration communication process. Thank you for the feedback you've provided.

Finally, I want to thank all our partners for sticking with us through these trying times. I read the forums and the comments on this blog and I understand that many of you have built businesses on BC and that you're feeling pain. We know that this is disruptive to you and we are throwing everything we can at the problem to fix it. I will be posting daily updates to the blog on the situation and try to answer as many questions as possible via this channel.

Thanks for reading,
Eddy Chan
Business Catalyst Product Manager

Update: Blog Post was edited at 3pm Monday Pacific Time with details on how 'Email Only' users can update their passwords

With the ongoing intermittent IMAP service issues, our engineers have been working over the weekend to speed up completion of all the work necessary for a migration from MailEnable to OpenSRS. We're hoping this will happen towards the end of this week. Here's an update on what's happened so far:

  • Completed implementation of our DNS migration tool. This tool will be responsible for making the correct MX record and A record modifications only for sites which use Business Catalyst mail hosting. Code review and testing is currently in progress.
  • Because OpenSRS uses a different password tokenization algorithm, we've also had to make some modifications to our authentication layer so we're able to pass the correct password token to them when you log in to the Admin Console. Code review is in progress and we will begin testing this tonight.
  • As a consequence of the changes to the BC authentication layer, we also need to add a few UI changes to notify users of said changes when they log in. This notice will also provide them with steps they should take to update their password. I have covered this point in more detail below in my 'Customer Action Required' section.

These modifications are not insignificant, and it means we also need to prepare a medium sized release to push all of them out onto production servers so we can get the mail migration going. I would've liked to make all of this happen quicker, but please remember we're dealing with critical infrastructure level components. Our team needs enough time to make sure everything is reviewed and tested thoroughly, to ensure the disruption is kept to a minimum for you and your clients.

Customer Action Required

So what do all these changes mean from a customer's standpoint? After the release with the above-mentioned changes has been completed, the next time you or your clients sign in to Business Catalyst you will be shown a notice with instructions on how to update your password.

This is because we need to generate a new password token for the OpenSRS mail service based on the new password you provide. For security compliance we have never stored raw passwords, so we are unable to generate a new token automatically. This will affect ALL users, not just users with BC hosted email, as the authentication changes we made are system-wide. For sites with BC hosted email there are several extra points to take note of: (edited 3pm Monday Pacific Time)

  • If you are an 'Email Only' user, you will need to go to a special URL that we will provide where you can reset or change your password. You will need to enter your current email address, current password, new password and then confirm the new password before you can login to your new Webmail interface. The URL you need to go to will be provided on the new Webmail login interface that will continue to be accessed from 'mail.clientdomain.com'.
  • If you are a Site Administrator with 'Email Only' user accounts on your site, you should proactively notify your users that they will need to go to a special URL to reset/change their passwords so they can log in to their new OpenSRS email account.

Even though we are providing system-wide 'Please Update Your Password' notices on login, it's best if partners take steps to pre-warn their clients that they will need to change their passwords in the coming days, using the information provided in this blog post.
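
For the technically curious, the reason a password update is needed can be sketched in a few lines of Python. This is an illustration only, not our actual implementation: the two derivation functions (legacy_hash and mail_token) and the record layout are hypothetical stand-ins, since the real algorithms used by BC and OpenSRS are not described in this post. The idea it shows is that a one-way hash cannot be converted into a different token format, so the new token can only be derived at the moment the user types a password.

```python
import hashlib
import hmac
import os


def legacy_hash(password, salt):
    # Hypothetical stand-in for the existing stored credential format.
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)


def mail_token(password, salt):
    # Hypothetical stand-in for the mail provider's token format.
    return hashlib.pbkdf2_hmac("sha512", password.encode(), salt, 100_000)


def update_password(record, current, new):
    """Verify the current password, then derive both credentials from the new one."""
    supplied = legacy_hash(current, record["salt"])
    if not hmac.compare_digest(supplied, record["hash"]):
        return False  # wrong current password, nothing is changed
    salt = os.urandom(16)
    record.update(salt=salt, hash=legacy_hash(new, salt), mail_token=mail_token(new, salt))
    return True


# Only the salted hash is stored, never the raw password, so no mail token
# can be produced until the user supplies a password again.
salt = os.urandom(16)
record = {"salt": salt, "hash": legacy_hash("old-secret", salt), "mail_token": None}

print(update_password(record, "wrong-guess", "new-secret"))  # rejected
print(update_password(record, "old-secret", "new-secret"))   # accepted
```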

Latest Migration Schedule

To be completely transparent about what is happening at BC, below is the latest tentative schedule for mail migration (all times are US Pacific):
  • Thursday (early AM) 27 January: Release DNS and Security related changes onto Production servers
  • Thursday (afternoon) 27 January: Provision all existing mail accounts with a new OpenSRS mailbox
  • Friday (early AM) 28 January: Switch DNS MX records for all sites from MailEnable to OpenSRS - at this point your mail will be redirected to your new mailbox. By 7:00AM the DNS migration will be complete, and you will be able to log in to the new mail account after you have updated your password inside the Admin Console.
  • Friday (afternoon) 28 January: Start migrating existing mail across servers. Because we've had to accelerate this process, we haven't had enough time to test and verify how long this will take. This will continue into the weekend. We estimate it will take around 24 hours to move the mail across, but it could take up to 72 hours in the worst case.

Throughout the week I'll be posting regular updates as we get closer to the migration date, especially if there are any changes. Please watch out for these on the BC Blog. For ongoing status updates, please also make sure you follow the bc_obnw Twitter account.

Thanks for reading,
Eddy Chan
Business Catalyst Product Manager

Great news for our Business Catalyst Partner community in Australia! Earlier this week the systems and engineering team flicked the 'On' switch for our much anticipated Asia-Pacific Datacenter. Located in Sydney, Australia, it's now online and all new sites that are being created with "Australia" chosen as the data center location will reside on the brand new hardware there.

This is an important milestone in building a more reliable and scalable platform for you to deploy Business Catalyst sites on. Since the Adobe acquisition of Business Catalyst, we have now built and deployed 3 new data centers in 3 geographies across USA, Europe, and Asia-Pacific. They are now all accepting new customers.

Closing the Legacy Datacenters

With 3 of 3 planned datacenters online, one of our highest priority engineering goals shifts to closing down the legacy data centers. The engineering team continues to focus on developing the site migration tools that will enable us to move existing sites from the legacy datacenters to the new datacenters automatically. Building tools for moving sites for an all-in-one hosted application like Business Catalyst is complex, and the latest indications are that these will be available sometime in the first half of 2011. I will of course keep the community up-to-date with our progress along the way.

In the meantime we will continue to monitor and maintain the existing data centers with the same level of effort as our new data centers. Moving sites from legacy data centers to new ones will not be possible and Business Catalyst cannot assist with this until our migration tools are available.

Until next time, your friendly neighborhood Product Manager :)