Mark Mader's blog - August 26, 2013 - 5:57 pm

Many Smartsheeters experienced degraded performance this morning, including slow sheet loading and an inability to view their data. We want you to know that we do not consider this level of performance acceptable and are taking this issue very seriously.

An apology

Before we explain what happened, we want to apologize for the work interruption. We consider job #1 to be productivity, and clearly we did not meet our own high standards for productivity this morning.

What happened: The simple version

On Saturday we took the system offline in order to perform planned maintenance and add some new features. We announced this in advance and had planned to be offline for about two hours.

Unfortunately, the update did not go as quickly as planned. When we realized that we had exceeded an acceptable amount of offline time, we changed the plan in order to come back online more quickly. In the end we were offline for about four hours on Saturday afternoon PDT.

On Sunday we were able to add two high priority features (with no offline time), which are live now.

However, when traffic surged on Monday, our services were much slower than normal. It took us about two hours to find and fix the problem.

During the offline period, we posted live updates on all social channels (Google+, Twitter, Facebook) to inform everyone of the situation as quickly as possible.

What happened: The technical post mortem

On Saturday morning, we took the site down for a scheduled upgrade. As part of that upgrade, we applied several recommended database patches that appear to have prevented the database from coming back online. We immediately called our database vendor and worked together to bring the service back up, but once it was clear that a fix would not be immediate, we swapped over to our hot backup data center and came back up on Saturday afternoon.

As the Smartsheet workload spiked this morning, a number of people were not able to log in to the service. We traced the issue to an earlier networking upgrade in the backup data center that was incomplete, which left Smartsheet running with about half the expected throughput. The networking issue did not become apparent until service loads spiked this morning, and we were able to rectify it within about two hours.

What’s next

While no data was ever at risk, and our backup data center worked well throughout, we recognize that doesn’t help much when you are waiting to get back to work. We are already making big investments in service infrastructure to enable low-to-no-downtime updates. We will now also take a hard look at what happened and reconsider the way we handle planned maintenance and service updates going forward.

Thanks very much to everyone who took the time to notify us of the issues they were experiencing, and again, we sincerely apologize to anyone whose work was affected by our service issues today.

- The Smartsheet Team

Comments

Today is the first time I have not been able to log in to Smartsheet, and I am concerned because I wanted to export my data before my trial ran out (my local ICT department has a licence for me but is working out how to activate it and not lose my trial data). I believe I have 1 day left in the trial. How can I make sure I do not lose that data?

Hi Dena, Sorry to hear that! Our Customer Support team will be reaching out shortly to help troubleshoot, and we should be able to get you into Smartsheet in no time. In the future, you can always email support@smartsheet.com with similar questions. Thanks! - Kelly

That is a major concern for me... being able to access the data offline. Is there an alternative to having to be online? We work in areas where internet is not always available, and yet we need to update, call up charts, etc. Can we do that with Smartsheet? Thanks, Terra Agueda
