This past week Tweetboard experienced a downtime period that extended over four days. Given the service level we have committed to provide, we believe that an explanation is owed to our loyal users.
For a long time we have been planning to upgrade our database to use table partitioning, which would help us scale our storage hardware better. We scheduled the upgrade for this past weekend.
The first stage went without issue: the database was updated quickly and easily, with less than 5 minutes of downtime while all services were restarted. On Saturday, however, the partitioning work required us to drop the primary key index on our messages table and create a regular index in its place. During that step, myisamchk left us with a zero-byte MYI file for the messages table. This was a MAJOR PROBLEM, as it meant that all of the messages (over 200 million posts) we had built up over the past 7 months were gone. The same file was corrupted in our database backup during a simultaneous move due to space constraints.
A professional data recovery procedure would have required either physically removing the hardware and sending it out, or a remote attempt at recovering the data, which would have cost $4K to $9K. Our best option was to attempt to recover the data ourselves.
We then tried various “disk forensic” file recovery tools, including “Foremost”, “TestDisk”, “PhotoRec”, “ext3undelete”, “sleuthkit” and others, but none could fully recover the data because the file was so large, fragmented, and missing header/footer information.
We turned to manual analysis of the partition and even took the time to learn the low-level ins and outs of the ext3 filesystem. The drive had been unmounted as soon as we found out about the corruption, which prevented most blocks from being overwritten. That allowed us to reconstruct the chain of blocks, since we knew the location, size, and last deletion time of the file.
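To give a sense of what "the chain of blocks" means on ext3, here is a minimal sketch of the standard ext2/ext3 block-mapping scheme: an inode holds 12 direct block pointers, followed by one single-indirect, one double-indirect, and one triple-indirect pointer. It is not the procedure we actually ran. The function name `ext3_block_path` and the default 4096-byte block size are assumptions made for illustration only.

```python
from typing import List, Tuple

DIRECT_POINTERS = 12   # direct block pointers stored in an ext2/ext3 inode
POINTER_SIZE = 4       # each block pointer is a 32-bit block number


def ext3_block_path(logical_block: int, block_size: int = 4096) -> Tuple[str, List[int]]:
    """Return the indirection level and the index path for a file's logical block.

    The path lists the slot to follow at each level: for "direct" it is the
    slot in the inode, for the indirect levels it is the slot inside each
    successive indirect block.
    """
    if logical_block < 0:
        raise ValueError("logical_block must be non-negative")

    ptrs = block_size // POINTER_SIZE  # pointers that fit in one indirect block

    if logical_block < DIRECT_POINTERS:
        return ("direct", [logical_block])

    n = logical_block - DIRECT_POINTERS
    if n < ptrs:
        return ("single", [n])

    n -= ptrs
    if n < ptrs ** 2:
        return ("double", [n // ptrs, n % ptrs])

    n -= ptrs ** 2
    if n < ptrs ** 3:
        return ("triple", [n // ptrs ** 2, (n // ptrs) % ptrs, n % ptrs])

    raise ValueError("logical_block exceeds the maximum ext3 file size")
```

With 4096-byte blocks, only the first 48 KB of a file is reachable through direct pointers, so almost all of a very large file sits behind indirect blocks. That is part of why a large, fragmented file is so hard to piece back together by hand.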
By Monday morning we had managed to recover 93% of the corrupt file, which in turn meant that approximately 90% of the actual database could be recovered. This was brilliant news, but there was still a lot of work to be done, and it was still possible that the database would not validate.
The team continued to press forward, and yesterday (Thursday, 28th Jan) the recovery process was completed and we could resume the original task of partitioning the database and updating the Tweetboard scripts to work with it. Finally, yesterday Tweetboard was brought back up, and there was only one task left to do: write this blog post and tell you our story.
What do we expect moving forward? Well, the database is back up, as is Tweetboard, but we did still lose that 7-10%, so we will be running background scripts over the next few days to recover all tweets that were lost.
We thank you all for your support and we do apologize for the downtime.
