Alternators are used in automobiles to charge the battery and to power the electrical system when the engine is running. Just as an alternator is an expected part of an automobile, object storage is an expected component of today’s cloud computing environments. In my opinion, as an alternator is to a car, so is object storage to a cloud platform. Before I explain, let’s do a basic review of storage.
Object storage has been around commercially for about fifteen years. It is the third leg of the three-legged storage stool, the other legs being block and file.
It’s said that “necessity is the mother of invention,” and this is certainly true of object storage. As data creation, collection and retention grew faster than corporate IT budgets, companies searched for a more cost-effective alternative to block or file storage. The advent of tiered storage utilizing different drive interfaces and spindle speeds helped some, but large-scale implementation was still a challenge. Flexible access, where you did not need to mount a file system, and the desire to access the storage from anywhere over an IP network were fundamental drivers for the creation of object storage.
Here is a high-level comparison of object storage to the other two storage types:
Object storage certainly has some positive attributes (namely cost, scale and interface).
When you go to purchase an automobile, you normally would not ask the dealer the kilowatt rating of the alternator under the hood. In most car buying experiences, you are not concerned about the amount of power required to drive the stereo speakers, navigation system, a backup camera, entertainment system or Internet-enabled connectivity. Assuming the proper alternator has been installed, you inherently expect those features to work. Although you don’t buy a car based on the alternator, if you were told that the car was sold without an alternator you would probably look for a different car that has all the necessary parts. Object storage is similar in today’s cloud computing environments.
Consider Amazon Web Services (AWS). Of the more than forty services available in the AWS console, Amazon’s object storage service, S3 (Simple Storage Service), was the second to be released (March 2006). Although the previously mentioned needs for object storage ignited the fire for the creation of S3, the plan that S3 would be a fundamental component used to enable other services was key to the overall vision. Today, a significant percentage of the forty-plus AWS services utilize S3 in some way behind the scenes. Just as an alternator powers the features provided in today’s automobiles, S3 enables the AWS suite of cloud-based services.
So, if object storage is a critical component in a cloud computing environment, why is it so challenging to create a viable object storage solution? The reasons are cost, time and long-term vision.
To have a viable and competitive (build vs. buy) object storage offering you must have several things:
- A technology and implementation that scales while utilizing commodity hardware
- Global deployment and connectivity
- A Content Delivery Network (CDN) option
- A large amount of storage capacity
- An API that is widely accepted and used
- Very efficient operations
These items require significant upfront cost. Once the offering is built, the profit margins are very tight, but more on that later.
A technology and implementation that scales while utilizing commodity hardware
There are some very good object storage technologies available. For example, Swift from OpenStack scales well and can utilize commodity hardware. Even larger corporate deployments, however, often require outside help for implementation and operations, which adds to the overall cost.
Global deployment and connectivity
Object storage is accessible over an IP network using REST-based API calls. In the global economy we live in, storing data in the right location to comply with government regulations or data sovereignty laws requires deployments in multiple countries on multiple continents. Once again, this drives up the cost of deployment.
A Content Delivery Network (CDN) option
With global deployments comes latency in moving objects, especially large objects which represent a typical object storage use case. A viable object storage offering must have an integrated CDN option to improve performance for content distribution when needed.
A large amount of storage capacity
One of the great things about object storage technology is that it scales. Users of object storage expect to have infinite storage to write to and utilize whenever and wherever they need it. Although the hardware (servers and storage) may consist of commodity components, at a large scale the cost is still significant, especially when you figure in the network cost. Also, for data protection, most solutions either use erasure coding or multiple copies of the data, which adds to the raw capacity required to run an object storage offering.
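To make the raw-capacity point concrete, here is a minimal sketch of the arithmetic. The function names and the example figures (1,000 TB of usable data, three copies, a 10+4 erasure coding layout) are illustrative assumptions, not numbers from any particular product.

```python
def raw_capacity_replication(usable_tb: float, copies: int) -> float:
    """Raw capacity needed when every object is stored `copies` times."""
    if copies < 1:
        raise ValueError("copies must be at least 1")
    return usable_tb * copies


def raw_capacity_erasure(usable_tb: float, data_fragments: int,
                         parity_fragments: int) -> float:
    """Raw capacity needed when each object is split into data fragments
    and protected by additional parity fragments."""
    if data_fragments < 1 or parity_fragments < 0:
        raise ValueError("invalid fragment counts")
    overhead = (data_fragments + parity_fragments) / data_fragments
    return usable_tb * overhead


if __name__ == "__main__":
    usable = 1000  # TB of customer data (hypothetical)
    print("3 copies:   ", raw_capacity_replication(usable, 3), "TB raw")
    print("10+4 coding:", raw_capacity_erasure(usable, 10, 4), "TB raw")
```

Under these assumptions, three copies need three times the usable capacity, while a 10+4 layout needs 1.4 times. Either way, the raw capacity you must buy, power and house is noticeably larger than what users see.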
An API that is widely accepted and used
Regardless of the technology solution used to create your object storage solution, due to the wide adoption and use of the AWS S3 API, your object storage offering must be able to emulate the S3 API. Although porting code from one object storage API to another might not be that significant of a task, S3 contains features (like multipart upload) that most object storage users have come to expect and depend upon.
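As an illustration of why multipart upload matters, the sketch below shows the client-side bookkeeping behind it: splitting a large object into numbered parts that can be sent independently and retried individually. It does not call any S3 API. The function name `plan_parts` and the 8 MiB default part size are assumptions for the example; the 5 MiB minimum part size (for all parts but the last) and the 10,000-part ceiling reflect S3's documented multipart limits.

```python
import math

MIN_PART_SIZE = 5 * 1024 * 1024  # S3 minimum for every part except the last
MAX_PARTS = 10_000               # S3 maximum number of parts per upload


def plan_parts(object_size: int, part_size: int = 8 * 1024 * 1024):
    """Return a list of (part_number, offset, length) tuples covering the object.

    Part numbers start at 1, as they do in the S3 multipart API.
    """
    if object_size <= 0:
        raise ValueError("object_size must be positive")
    if part_size < MIN_PART_SIZE:
        raise ValueError("part_size is below the 5 MiB minimum")
    if math.ceil(object_size / part_size) > MAX_PARTS:
        raise ValueError("too many parts; choose a larger part_size")

    parts = []
    offset = 0
    part_number = 1
    while offset < object_size:
        # The final part may be shorter than part_size.
        length = min(part_size, object_size - offset)
        parts.append((part_number, offset, length))
        offset += length
        part_number += 1
    return parts


if __name__ == "__main__":
    for part in plan_parts(20 * 1024 * 1024):
        print(part)
```

Any object storage offering that emulates the S3 API has to accept parts like these in any order, track them per upload, and assemble them on completion, which is part of what makes faithful emulation more than a simple rename of endpoints.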
Very efficient operations
Running day-to-day operations for a global object storage environment is challenging. As mentioned above, object storage designs normally use erasure coding or multiple copies to protect data, so storage usually needs to be added in specific quantities and in specific locations for the system to operate correctly. In addition, since object storage still uses physical disk drives, planning for data center floor space, power and cooling is critical. Break/fix events can also be a load on operations. In large deployments, disk drives die every day of the week. Object storage architectures are designed to handle this fact, so there is not a race to replace the drives; however, due to the number of replacements, vendor management for the break/fix events must be efficient. In some cases, stocking drives locally is considered even when stocking parts onsite is not a standard practice in the data center. Lastly, good event monitoring is essential because it can be challenging to monitor the infrastructure and the global network that connects it.
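A rough back-of-the-envelope calculation shows why drives fail daily at scale. The fleet size and annualized failure rate (AFR) below are hypothetical inputs chosen for illustration, not measurements from a real deployment.

```python
def expected_daily_failures(drive_count: int, annual_failure_rate: float) -> float:
    """Average number of drive failures per day for a fleet.

    annual_failure_rate is a fraction, e.g. 0.02 for a 2% AFR.
    """
    if drive_count < 0 or not 0 <= annual_failure_rate <= 1:
        raise ValueError("invalid fleet size or failure rate")
    return drive_count * annual_failure_rate / 365


if __name__ == "__main__":
    # Hypothetical fleet: 100,000 drives at a 2% annualized failure rate.
    per_day = expected_daily_failures(100_000, 0.02)
    print(f"Expected failures per day: {per_day:.1f}")
```

With these assumed inputs the average works out to roughly five or six failed drives per day, which is why replacement logistics and vendor management become routine operational work rather than an exception.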
With all that is required for creating a successful object storage environment and the costs involved, how can a company save money creating its own object storage environment, or a service provider make money selling object storage? This is where vision comes in. The object storage price wars are brutal and margins are somewhere south of “thin.” In my experience, the only way to have a viable object storage solution is to have as many services as possible utilizing it. These services usually create large cost savings (in the case of an internal DIY implementation) or have better margins (in the case of a service provider) and help make the overall business case work. This is where the wheels can fall off for some implementations. Since it takes time (years) to create these services, the probability of someone in finance being patient enough for this to occur is slim unless a strong vision is in place.
My recommendation? If you take on the challenge and expense of building a cloud object storage offering, make sure you have a long-term vision and the financial commitment that will make it sustainable.