Cloud Computing has arrived. Whether it's the new Windows Azure, Google's AppEngine, Amazon's S3/EC2, or something a little less obvious like the SalesForce platform, you have options available. Unfortunately, they all have one thing in common: none of them really take advantage of a cloud. In every single one of these offerings your app ultimately runs on a single server in a traditional datacenter. All they've done is abstract things away a bit so that you don't need to know which server it is. That's neat, but it's not really what I want from a cloud system.
What I'd like to see is a cloud platform that's designed for a smaller scale: your corporate LAN. As it is, a company provides each office worker with a desktop computer. After the next refresh at your company, these computers will all have at least a dual core processor and 120GB hard drives. That's a very powerful system, and it's at the low end. Unfortunately, most of this capacity sits idle. What would really be cool would be a system that lets you harness this idle power.
Let's think a minute about how this would work. You would still want a traditional server, but the purpose of this server would be coordination. You wouldn't ask it to do any heavy lifting on its own. Then you would need to deploy a client application to every computer in the company, and this class of application would likely need to be specifically supported by the operating system. It could be patched in via a mechanism like a virtual device driver, but that's messy. If it turns out kernel mode access is required, then operating system support would be preferred. Fortunately this already exists, in the form of virtual machine support.
Once installed and configured, the application would cordon off a segment of each computer's hard drive and make it available to the coordinating server. The server then turns around and exposes this space to the network as a normal file share. It needs to be smart enough to tell a remote system to send data to the desired location directly, rather than have to retransmit things itself, but that's just a simple logistical problem. So the first service I'm talking about is a SAN. I'll build on this to do other useful things, but let's talk about the characteristics of the SAN first.
Because individual desktops are unreliable, you would need large amounts of redundancy. You wouldn't want the failure of a switch serving a 20 node work group to cut off the entire company from parts of its data, for example. To solve this issue, two things have to happen.
The first is that the system needs to take a "local machine first" approach. If the data is already there, don't retransmit it over the network. Whenever a user requests data from the server's share that isn't already on the local machine, that data is forwarded to the local machine (all messages encrypted, of course). If the user needs that data again, it will already be there. This should keep the network bandwidth required for syncing data manageable, because for many requests data never needs to enter the network at all. This should also make access time much faster than existing SANs.
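To make the "local machine first" read path concrete, here is a minimal Python sketch. Everything in it is made up for illustration: the LocalFirstStore class, the fetch_remote callable (which stands in for an encrypted transfer from whichever peer holds the data), and the in-memory dict that plays the role of the cordoned-off disk segment.

```python
class LocalFirstStore:
    """Hypothetical client-side read path: check the local copy before the network."""

    def __init__(self, fetch_remote):
        # fetch_remote is an assumed callable (key -> bytes) standing in for
        # an encrypted transfer from a peer chosen by the coordinating server.
        self._fetch_remote = fetch_remote
        self._local = {}        # stands in for the reserved disk segment
        self.network_reads = 0  # how many reads had to leave this machine

    def read(self, key):
        if key in self._local:
            return self._local[key]     # served locally, no network traffic
        data = self._fetch_remote(key)  # first access: pull from the network
        self.network_reads += 1
        self._local[key] = data         # keep a copy for the next request
        return data


# Usage: the second read of the same file never touches the network.
remote = {"report.doc": b"quarterly numbers"}
store = LocalFirstStore(remote.__getitem__)
first = store.read("report.doc")
second = store.read("report.doc")
```

A real client would also need eviction and some way to learn that its local copy has gone stale, which this sketch leaves out.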
The second is that the server administrator will need to divide the clients into an appropriate number of groups during setup. Each client will need to belong to a group, and each group will need a complete copy of the data. A key here is that the groups are only for redundancy: a client can retrieve data from any group and should not have a preference for its own, except in the case where the data is on the local machine. This will allow the server to load balance the system, so that you should always get an efficient response for requests. And that will free up more bandwidth to use for syncing data across groups.
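The selection rule described here (local copy first, otherwise any holder regardless of group) could look something like the sketch below. The function name pick_source, the holders and load dictionaries, and the "fewest requests in flight" measure of load are all assumptions for the example, not part of any existing product.

```python
def pick_source(key, requester, holders, load):
    """Choose which node should serve `key` to `requester`.

    holders: dict mapping key -> set of node names that hold a copy (any group)
    load:    dict mapping node name -> number of requests currently in flight
    Both structures are hypothetical bookkeeping kept by the coordinating server.
    """
    nodes = holders.get(key, set())
    if not nodes:
        raise KeyError(key)
    if requester in nodes:
        return requester  # the local copy always wins
    # Group membership is ignored on purpose: take the least busy holder.
    # Sorting first makes ties resolve the same way every time.
    return min(sorted(nodes), key=lambda n: load.get(n, 0))


holders = {"a.txt": {"pc1", "pc7", "pc9"}}
load = {"pc1": 5, "pc7": 1, "pc9": 3}
remote_choice = pick_source("a.txt", "pc4", holders, load)  # pc4 has no copy
local_choice = pick_source("a.txt", "pc9", holders, load)   # pc9 has one
```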
For very small networks (up to, say, 20 nodes) one group would probably be enough. For larger networks many groups may be needed. Special care will be needed for WANs: your intuition would tell you to put a group at each WAN site, but this would be wrong. It would force every change to go through the WAN connection. Similarly, you would need to ensure that there is at least one group NOT represented at any WAN site, or a loss of the WAN connection would take the entire company down rather than just the remote site. The server would also need to be smart enough to notice when it's down to one complete group and be able to take steps to fix that situation.
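Noticing that only one complete group is left is mostly bookkeeping. A rough sketch, assuming the server tracks which nodes are online and which keys each node holds (the groups, online, and holdings structures and both function names are invented for this example):

```python
def complete_groups(all_keys, groups, online, holdings):
    """Return the names of groups whose online nodes together hold every key.

    all_keys: set of every key the share is supposed to contain
    groups:   dict mapping group name -> list of member node names
    online:   set of node names currently reachable
    holdings: dict mapping node name -> set of keys stored on that node
    """
    result = []
    for name, members in groups.items():
        covered = set()
        for node in members:
            if node in online:
                covered |= holdings.get(node, set())
        if all_keys <= covered:  # subset test: nothing is missing
            result.append(name)
    return sorted(result)


def needs_repair(all_keys, groups, online, holdings):
    """True when the system is down to one complete copy (or none)."""
    return len(complete_groups(all_keys, groups, online, holdings)) <= 1


all_keys = {"a", "b"}
groups = {"g1": ["pc1", "pc2"], "g2": ["pc3", "pc4"]}
holdings = {"pc1": {"a"}, "pc2": {"b"}, "pc3": {"a", "b"}, "pc4": set()}

healthy = complete_groups(all_keys, groups, {"pc1", "pc2", "pc3", "pc4"}, holdings)
degraded = complete_groups(all_keys, groups, {"pc1", "pc3", "pc4"}, holdings)
```

What "taking steps" means (copying data to spare nodes, alerting the administrator) is a separate question this sketch does not try to answer.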
I want to consider backups, as well. First of all, under this scheme it's not necessary to have historical backups. I'll mention it briefly later, but for what I envision, document versioning will be built in. That means you only need to make sure you have adequate redundancy on the current system. To make the backup itself, an administrator could simply set up a server with adequate hard disk space in its own group, and clone that server's disk at whatever frequency he wants.
I still have to overcome the limitation that write operations can be slow, because the coordination server needs to know about every write and coordinate with nodes from other groups to ensure redundancy. To make things worse, this would need to be transactional.
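One well-known way to get an all-or-nothing write across several nodes is a two-phase commit. To be clear, that is a technique I'm swapping in for illustration, not something the design above settles on, and the Replica class and transactional_write function below are hypothetical. The sketch also shows where the slowness comes from: every write waits on a round of "prepare" answers before anything is committed.

```python
class Replica:
    """Hypothetical stand-in for one node picked from a redundancy group."""

    def __init__(self, name, healthy=True):
        self.name = name
        self.healthy = healthy
        self.data = {}     # committed values
        self.staged = {}   # values prepared but not yet committed

    def prepare(self, key, value):
        if not self.healthy:
            return False   # this node cannot promise to store the write
        self.staged[key] = value
        return True

    def commit(self, key):
        self.data[key] = self.staged.pop(key)

    def abort(self, key):
        self.staged.pop(key, None)


def transactional_write(key, value, replicas):
    """Phase 1: ask every replica to prepare. Phase 2: commit only if all agreed."""
    prepared = []
    for replica in replicas:
        if replica.prepare(key, value):
            prepared.append(replica)
        else:
            for p in prepared:  # undo the partial work and give up
                p.abort(key)
            return False
    for replica in replicas:
        replica.commit(key)
    return True


a, b, c = Replica("pc1"), Replica("pc12"), Replica("pc30", healthy=False)
ok = transactional_write("memo.txt", "v1", [a, b])          # both nodes agree
failed = transactional_write("memo.txt", "v2", [a, b, c])   # pc30 refuses
```

A production version would also have to survive the coordinator crashing between the two phases, which is the hard part and is not handled here.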
So now we have a high performance SAN where data is often cached right on the local machine, and we only need a little setup and one simple server: better performance and more space at potentially a fraction of the cost. The only downside is the potential for more network traffic and slower writes. I don't really know how much impact these issues will have, but I believe they can be overcome.
Now a SAN is nice. If well implemented, this SAN system by itself would be worth a fortune. But we can do more. We want to be able to build real applications. Of course, these applications would need to be highly parallel in nature, but fortunately the three applications I have in mind just happen to fit the bill.
The first application is, of course, a database. By building a database directly into the platform you can make everything else much more efficient. All files and data streams are just records in this database, and they can be indexed, cached, or anything else you might use a database to do.
Next there's search. As long as you have individual machines shunting documents (records) all over the place, they might as well index them while they're at it. This work will of course be spread over individual clients, making the process much faster. We can even have the database structure all prepared. Then the client program can provide an interface into your new document repository, as well as allow you to attach categories (and versions!) to files. It's an instant enterprise document library and distributed source control.
Finally, there'd be a built-in web server. This is the platform on which developers would create their own applications, because web pages are naturally parallel: each page access can run in its own process. So much the better if that process happens to run on the exact computer that requested the page.
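Spreading the indexing work over clients could be as simple as each machine building a small inverted index of the documents it holds, with the partial indexes merged afterwards. This is a minimal sketch under that assumption. The function names are made up, and the whitespace tokenizer is deliberately naive.

```python
from collections import defaultdict


def index_documents(docs):
    """Build a partial inverted index (word -> set of document ids) on one client."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


def merge_indexes(partials):
    """Combine the partial indexes reported by individual clients."""
    merged = defaultdict(set)
    for partial in partials:
        for word, ids in partial.items():
            merged[word] |= ids
    return merged


def search(index, word):
    """Return the ids of documents containing `word`, in sorted order."""
    return sorted(index.get(word.lower(), set()))


# Two hypothetical desktops index their own documents independently.
pc1_index = index_documents({"memo.txt": "Budget review notes"})
pc2_index = index_documents({"plan.txt": "budget plan"})
merged = merge_indexes([pc1_index, pc2_index])
hits = search(merged, "Budget")
```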
So this is the cloud system I'm waiting for. Not something that runs in someone else's data center, but the ability to take advantage of the hardware infrastructure that's already available in the office. It provides instant redundancy and scales automatically with your organization.
It would ideally be implemented to run existing technologies (e.g., an open source version would adapt MySQL and Apache, Microsoft would adapt SQL Server/IIS) so that businesses could easily move their existing intranet to the system. There's some real power. There are a few admitted weaknesses: slow write times, and a single point of failure at the coordination server. The client application would need to be configured to not use too much CPU or memory. But I think these concerns can be overcome.
Now if only I had the low-level development experience to even try to implement it.
Comments
Incredible idea - you're right though, so much processing power goes to waste in the typical office setup.