Monday, May 01, 2006

Lost Writes

One of the best-kept secrets of the storage industry is the "lost write". Some of you are probably not aware of it, mostly because it's a rare condition, but in my mind once is one time too many, especially since it compromises the integrity of your data.

There are cases where a drive signals to the application that a block has been written to disk when in fact it either hasn't been written at all or has been written to the wrong place. Yikes!

Most vendors I know of offer no protection against this, so a single occurrence has a direct effect on the integrity of the data and has to be followed by the necessary application recovery procedures.

The only vendor I know of that offers "lost write" protection is NetApp, with the Data ONTAP 7.x release. Again, the goodness of WAFL comes into play here. Because NetApp controls both the RAID layer and the filesystem, Data ONTAP is able to catch errors like this and recover from them. Along with the block checksum, Data ONTAP also stores WAFL metadata (such as the inode number of the file containing the block) that makes it possible to verify the validity of a block being read. If the block being read does not match what WAFL expects, the data gets reconstructed, which gives you a solid data protection scheme even for an unlikely scenario such as this.
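To make the idea concrete, here is a minimal Python sketch of why a checksum alone is not enough and how identity metadata stored with the block catches a lost write. This is not Data ONTAP code and says nothing about how WAFL lays out its metadata. The names ToyDisk, StoredBlock, verify_read, and drop_next_write are all made up for illustration, as is the use of CRC32 and a file block number alongside the inode number.

```python
import zlib
from dataclasses import dataclass


@dataclass(frozen=True)
class StoredBlock:
    """What the toy disk keeps per block (hypothetical layout)."""
    data: bytes
    checksum: int      # CRC32 of data
    inode: int         # owner recorded at write time
    file_block: int    # position within the owning file


class ToyDisk:
    """In-memory stand-in for a drive that can silently drop a write."""

    def __init__(self):
        self.blocks = {}
        self.drop_next_write = False

    def write(self, addr, data, inode, file_block):
        if self.drop_next_write:
            # Simulated lost write: report success, persist nothing.
            self.drop_next_write = False
            return True
        self.blocks[addr] = StoredBlock(data, zlib.crc32(data), inode, file_block)
        return True


def verify_read(disk, addr, expected_inode, expected_file_block):
    """Check a block against its checksum and against what the filesystem expects."""
    block = disk.blocks.get(addr)
    if block is None:
        return "missing"
    if zlib.crc32(block.data) != block.checksum:
        return "corrupt"
    if (block.inode, block.file_block) != (expected_inode, expected_file_block):
        # Checksum is fine, but the block belongs to someone else.
        return "lost-write"
    return "ok"


disk = ToyDisk()
# Address 7 previously held a block of inode 100.
disk.write(7, b"old data", inode=100, file_block=0)

# The filesystem reuses address 7 for inode 200, but the drive drops the write.
disk.drop_next_write = True
acknowledged = disk.write(7, b"new data", inode=200, file_block=3)

stale = disk.blocks[7]
checksum_ok = zlib.crc32(stale.data) == stale.checksum
result = verify_read(disk, 7, expected_inode=200, expected_file_block=3)
print(acknowledged, checksum_ok, result)
```

The stale block still passes its own checksum, because the old data and the old checksum are both intact. Only the comparison against the owner the filesystem expects exposes the problem. At that point a real system would rebuild the block from RAID redundancy, which this sketch leaves out.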
