Keep Data Safe By Storing It Everywhere
PROJECT OVERVIEW - CleversafeWiki:
The Dispersed Storage Project is the central point of development and idea exchange for developers around the world to contribute to innovative storage solutions leveraging dispersed storage methodology. The project uses Information Dispersal Algorithms (IDAs) to separate data into unrecognizable DataSlices™ and distribute them, via secure Internet connections, to storage locations throughout the world, creating a storage grid. With dispersed storage, transmission and storage of data are inherently private and secure. No single entire copy of the data is in one location, and only a subset of the nodes needs to be available in order to perfectly retrieve the data.
Data on the grid remains private and secure in the face of natural catastrophes, or failures of hardware, connection, facility, or IT management. Moreover, the individual data slices do not carry enough information for an unauthorized viewer to determine the original content.
The Cleversafe Dispersed Storage software includes client software with both a comprehensive command line interface (CLI) and a complete programming interface (DSAPI) to support any type of storage application. The software also includes grid server software for creating a dispersed storage grid, and it manages metadata for file systems stored on a Dispersed Storage grid. In addition, this project includes the multi-terabyte Cleversafe Research Storage Grid at eleven separate hosting facilities.
This feels like RAID on a global scale!
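To make the RAID comparison concrete, here is a minimal sketch of the "only a subset of the nodes" idea using plain XOR parity, the same trick RAID 5 uses. This is not Cleversafe's actual IDA: the function names `disperse` and `recover` are made up for illustration, the scheme is fixed at 3 slices of which any 2 suffice, and unlike a real IDA the first two slices are readable fragments of the data, so it demonstrates the availability property only, not the privacy one.

```python
from typing import List, Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def disperse(data: bytes) -> List[bytes]:
    """Split data into two halves plus one XOR parity slice (hypothetical helper)."""
    half = (len(data) + 1) // 2
    first = data[:half]
    # Pad the second half with zero bytes so both halves have equal length.
    second = data[half:].ljust(half, b"\x00")
    return [first, second, xor_bytes(first, second)]


def recover(slices: List[Optional[bytes]], length: int) -> bytes:
    """Rebuild the data from any two of the three slices; None marks a lost slice."""
    if sum(s is None for s in slices) > 1:
        raise ValueError("need at least two of the three slices")
    first, second, parity = slices
    if first is None:
        first = xor_bytes(second, parity)
    elif second is None:
        second = xor_bytes(first, parity)
    # Trim the padding added by disperse().
    return (first + second)[:length]


if __name__ == "__main__":
    original = b"Keep data safe"
    stored = disperse(original)
    stored[0] = None  # pretend the first storage location is offline
    print(recover(stored, len(original)))
```

Losing any one of the three locations still leaves the data recoverable; losing two does not. Real dispersal schemes generalize this to "any k of n" slices.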
6 Responses to “Keep Data Safe By Storing It Everywhere”
-
Joshua Says:
August 1st, 2007 at 12:23 pm
Awesome - until the world blows up! How safe is your data then? Yeah… that’s what I thought. That’s why I’m planning a project to store your data across the universe! Hahaha…
-
Robert H Says:
August 1st, 2007 at 2:11 pm
This sounds really similar to Freenet, except that its purpose is to protect your data rather than hide your dirty dirty porn.
-
Chavo Says:
August 1st, 2007 at 3:30 pm
First off, I love your site after happening upon it a few weeks ago.
Secondly, I did read the article and peruse the website.
My actual point: This architecture seems inherently flawed. It offers little protection against network/hardware failure and would seem to encourage data loss. For example, RAID uses striping to enhance performance (not security, although some may use it for that purpose). RAID uses mirroring for data retention in the event of a failure. This technology has very limited mirroring capabilities and definitely cannot propagate them in the same way as RAID. Look at DNS as an example: it can take up to 72 hours to propagate an authoritative DNS change across the globe. The ‘striping’ ability, which in theory could allow you much more effective transfer on the same principles as P2P networking, fails in the event that any of the parts is unavailable.
Imagine trying to access an important document that is located across 5 different servers. It is feasible to allow 0.1% downtime for any given server. However, if you need all five servers to be up to have access to the file, you have an effective downtime of up to 0.5% (please correct me if that math is wrong). While that doesn’t seem like a big deal in itself, you can see how it would be a major problem if your file is much bigger (meaning more slices and more servers) or distributed across a much larger network (which means more frequent downtime for any given path).
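The arithmetic in this comment can be checked with a short sketch. It assumes the servers fail independently of one another, which the comment does not state, and the function name `all_required_downtime` is made up for illustration. Under that assumption, the file is reachable only when every server is up, so the availabilities multiply.

```python
def all_required_downtime(per_server_downtime: float, servers: int) -> float:
    """Fraction of time a file is unreachable when every server must be up.

    Assumes independent failures: availability is the product of the
    individual availabilities, and downtime is whatever remains.
    """
    availability = (1.0 - per_server_downtime) ** servers
    return 1.0 - availability


if __name__ == "__main__":
    # Five servers, each down 0.1% of the time (0.001 as a fraction).
    downtime = all_required_downtime(0.001, 5)
    print(f"{downtime:.4%}")  # just under 0.5%
```

So the "up to 0.5%" estimate holds as an upper bound; the exact figure under independence is slightly lower because two servers can be down at the same moment. The catch is the premise that all five servers are required, which is the case for plain striping but not for a scheme that needs only a subset of the slices.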
If you have ever used Rapidshare to try to send a big file to a friend, you know you have to split it up into several pieces and upload them individually [to separate servers]. When your recipient goes to download the file as a whole, it is much more inconvenient and likely to fail than, say, a slow personal FTP server.
The only place I see this being useful is in the LAN environment, where fast propagation can make striping effective and downtime can be kept to a minimum. However, NFS solutions already exist that do this better (i.e., RAID arrays with clustering offsite).
-
Tim Fehlman Says:
August 1st, 2007 at 3:51 pm
I may be off base here, but this project seems to me to be like the Parchive files that we used to see before BitTorrent became popular.
Essentially, you would create the par files and then distribute them. When you collected enough of the par files to recreate your data, then you could proceed. This would not mean that you would need all of the par files, just a percentage of them. So, if you had a failure in one out of the five servers, all of your data would still be available.
Tim
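The par-file point can be put in numbers with a small sketch. It assumes each server is up independently with the same probability, and that any `k` of the `n` pieces are enough to rebuild the data; the function name `k_of_n_availability` and the 0.1% per-server downtime figure (borrowed from the earlier comment) are illustrative choices, not anything taken from the Cleversafe project.

```python
from math import comb


def k_of_n_availability(p_up: float, n: int, k: int) -> float:
    """Probability that at least k of n independent servers are up.

    Sums the binomial probabilities of exactly i servers being up,
    for i from k through n.
    """
    return sum(
        comb(n, i) * p_up**i * (1.0 - p_up) ** (n - i)
        for i in range(k, n + 1)
    )


if __name__ == "__main__":
    p_up = 0.999  # each server is down 0.1% of the time
    need_all = 1.0 - k_of_n_availability(p_up, n=5, k=5)
    need_four = 1.0 - k_of_n_availability(p_up, n=5, k=4)
    print(f"all 5 required:   downtime {need_all:.6%}")
    print(f"any 4 of 5 enough: downtime {need_four:.6%}")
```

Under these assumptions, tolerating a single failed server out of five cuts the expected downtime from roughly 0.5% to roughly 0.001%, which is why needing only a percentage of the pieces matters so much.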
-
JD Says:
August 7th, 2007 at 3:05 pm
I was thinking about this the other day, but in, say, an office setting and not over the entire internet. All our users have mid- to high-range desktops that they hardly utilize at all. Plus they have lots of unused disk space. What about an enterprise tool that you would put on all the desktops in an office to do this sort of thing? You could have a decent amount of distributed processing power as well as a pretty big file store. Anyone?
-
SMcG Says:
August 16th, 2007 at 8:29 pm
You’re a frikin genius. Tomorrow when I’m not drunk, I’m gonna read this again to confirm your genius status. BTW, are you ever on the AutoIt forum?
