We are living in an interesting time, with technology innovation happening at a much faster pace than ever. This is driven by the ever-growing need to do things faster, at extreme data volumes, more simply and at lower cost!
Many software-driven solutions have sprung up in recent years that leverage commodity hardware to provide cost-effective, easy-to-use infrastructure for running various workloads. While classical workloads (based on databases: OLTP, OLAP, Exchange etc.) still drive the enterprise data center, newer workloads (based on object stores, NoSQL etc.) have seen rapid adoption. Emerging scalable analytics solutions are providing deeper insights, thereby enabling better decisions from rapidly and massively growing data (Big Data). Hypervisors from various vendors have dramatically simplified the management of a variety of workloads and have maximized hardware utilization. Public cloud vendors (Amazon, Azure etc.) and private/converged cloud vendors (VCE, Nutanix etc.) have rolled out tightly integrated hypervisors and management apps, with scalability software IP, on off-the-shelf hardware to deliver infrastructure where workloads can be deployed and run with a few clicks, greatly simplifying the job of data center admins. The Software Defined Data Center is no longer just a buzzword; it is happening now. Users are shifting from building their own infrastructure (by independently buying servers, switches, storage and software) to either public or private clouds where resources are already integrated and ready to use!
These changes create exciting times for everyone in the data center!
But New Disruptive Innovations Are Happening in Hardware…
While this first wave of software-led innovation on the commodity hardware of ‘today’ is continuing and maturing, a fundamental shift has begun in the underlying hardware technologies. These new hardware technologies are quite disruptive. As they transition from mere ideas to real products, another wave of software innovation is inevitable. They are not only showing early signs of enormous benefits for the applications of today, but are also uncovering newer use cases. There is a great deal of excitement around the arrival of persistent memory (3D XPoint etc.), low-latency interconnect products/solutions (RoCE etc.), low-overhead container technologies and the recognition of new roles for FPGAs/GPUs. All these technologies are moving towards the same goal: accelerating workloads in a cost-effective way.
So… What Does It Mean for Software, and What Solution Opportunities Does It Present?
As most of these hardware components make their way into the ecosystem, they are also showing the need for the software stack to evolve. The software stack needs to adapt to consume one of these new components, or a ‘combination’ of them, in a meaningful way to dramatically improve workloads.
Let’s take a look at an example of two of these hardware innovations and the gaps in today’s software stack that may prevent their full exploitation. The new ‘persistent memory’ and ‘low latency network interconnect’ technologies promise that building a rack with the following ingredients will be possible in the near future:
Large persistent Memory (for storage) with ‘1µsec’ latency
Network interconnect with ‘1µsec’ latency
That is well over an order of magnitude better than the combined latencies (100s of µsec) that exist today for equivalent components within a rack. So, imagine the impact when access to persistent data, both ‘within’ and ‘across’ compute nodes, can be super-efficient. It is very disruptive! These technologies have the potential to accelerate many of today’s workloads (5X/10X/20X acceleration?), irrespective of whether they are single-threaded (queue depth of 1) or multi-threaded (with higher queue depths). That means a rack built with these capabilities can run many more workloads, and run them faster, than is possible today in an equivalent footprint. This has significant implications for business agility, power savings, real estate etc. But that’s not all. The new storage access models (persistent memory and low-latency network interconnect) also promise to dramatically improve and simplify the programming of quite a few applications. These innovations will have a larger impact on workloads than all-flash arrays had when they arrived in the data center!
However, the software stack of today is not ready to truly leverage the benefits offered by this upcoming disruptive hardware. The overhead of the current system software stack (in the IO path and the data services path) masks the benefits these technologies offer. A research paper from the Georgia Institute of Technology (Systems and Applications for Persistent Memory) notes:
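To put rough numbers on the latency argument above, here is a small back-of-the-envelope sketch. The 100 µs figures for today's storage and network are assumptions chosen to stand in for "100s µsec", and the helper names are hypothetical. The speedup estimate is a simple Amdahl-style model in which only the I/O wait portion of a workload gets faster; real workloads will behave differently.

```python
# Illustrative numbers only. The post gives "100s µsec" for today and about
# 1 µs each for future persistent memory and interconnect; the 100 µs values
# below are assumptions standing in for "100s".
TODAY_US = {"storage": 100.0, "network": 100.0}
FUTURE_US = {"storage": 1.0, "network": 1.0}


def remote_access_us(latencies):
    """Latency of one access to persistent data on another node in the rack."""
    return latencies["storage"] + latencies["network"]


def workload_speedup(io_fraction, old_io_us, new_io_us):
    """Amdahl-style estimate: only the I/O-wait share of runtime shrinks."""
    if not 0.0 <= io_fraction <= 1.0:
        raise ValueError("io_fraction must be between 0 and 1")
    remaining = (1.0 - io_fraction) + io_fraction * (new_io_us / old_io_us)
    return 1.0 / remaining


if __name__ == "__main__":
    old = remote_access_us(TODAY_US)
    new = remote_access_us(FUTURE_US)
    print(f"remote access: {old:.0f} µs today vs {new:.0f} µs in the future rack")
    for fraction in (0.80, 0.90, 0.95):
        gain = workload_speedup(fraction, old, new)
        print(f"{fraction:.0%} of time waiting on I/O -> about {gain:.1f}x faster")
```

Under these assumed numbers, workloads that spend 80%, 90% and 95% of their time waiting on I/O come out roughly 4.8x, 9.2x and 16.8x faster, which is in the same range as the 5X/10X/20X question raised above.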
“…Research has shown that, as storage becomes faster, software overheads tend to become the most dominant source of wasted effort, therefore necessitating rethinking of the software stacks. As discussed earlier, traditional storage stacks assume that storage is in a different address space, and operate on a block device abstraction. They implement intermediate layers such as page cache to stage the data. When using PM (persistent memory), such a layered design results in unnecessary copies and translations in the software. It is possible to eliminate these overheads by completely avoiding the page cache and the block layer abstraction. Providing low overhead (but managed) access to PM is critical to ensure that applications harness the full potential of PM…”
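The point in the quote can be illustrated with simple arithmetic: if the software stack adds a roughly fixed cost per I/O, its share of the total grows as the device gets faster. The sketch below assumes a hypothetical 10 µs of software overhead per operation and illustrative device latencies; none of these figures come from the paper.

```python
# Assumed, illustrative figures: a fixed software cost per I/O and three
# device latencies. None of these numbers are taken from the quoted paper.
SOFTWARE_OVERHEAD_US = 10.0
DEVICE_LATENCY_US = {
    "spinning disk": 5000.0,
    "flash SSD": 100.0,
    "persistent memory": 1.0,
}


def software_share(software_us, device_us):
    """Fraction of total per-I/O latency spent in the software stack."""
    return software_us / (software_us + device_us)


if __name__ == "__main__":
    for device, latency in DEVICE_LATENCY_US.items():
        share = software_share(SOFTWARE_OVERHEAD_US, latency)
        print(f"{device}: software is {share:.1%} of each I/O")
```

With these assumptions the same software path is a rounding error on a disk, noticeable on flash and the dominant cost on persistent memory, which is why the IO path itself has to change.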
Given that these hardware components are coming and will become ‘commodity’ at some point, solving today’s software stack problems (especially in the IO path and the data services path) is a significant opportunity. Furthermore, because these components and software are not yet available in ‘usable overall product’ form, innovation that provides these capabilities in an integrated product is a tremendous opportunity. Someone needs to take a step back and build a solution that glues together these discrete but related pieces of innovation in a usable ‘finished product’ form: essentially, a user-consumable end product that integrates these new components with innovative changes in the software stack!
Well… Quite a Bit of Research and Effort Is Already in the Works…
Several open source initiatives are in play, and many companies are collaborating to standardize interfaces and show the benefits for various workloads. Many possible solutions and workload transitions are being discussed.
Persistent Memory Programming Model
- “For many years computer applications organize their data between two tiers: memory and storage. We believe the emerging persistent memory technologies introduce a third tier. Persistent memory (or pmem for short) is accessed like volatile memory, using processor load and store instructions, but it retains its contents across power loss like storage.”
Georgia Institute of Technology
- SYSTEMS AND APPLICATIONS FOR PERSISTENT MEMORY
- “Emerging non-volatile (or persistent) memories bridge the performance and capacity gap between memory and storage, thereby introducing a new tier. To harness the full potential of future hybrid memory systems coupling DRAM with PM, we must build new system software and application mechanisms that enable the optimal use of PM as both fast storage and scalable low cost (but slower) memory”
- “A new programming model for persistent memory (PM) – NVM hardware designed to be treated by software similarly to system memory”
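The load/store access model described in the quotes above can be approximated with an ordinary memory-mapped file. This is only an analogy and a minimal sketch: a regular file mapping still goes through the page cache, whereas real persistent memory would typically be exposed through a DAX-capable filesystem and a dedicated library. The file name, region size and helper names are hypothetical.

```python
import mmap
import os
import tempfile

REGION_SIZE = 4096  # arbitrary size for the sketch


def create_region(path, size=REGION_SIZE):
    """Create a zero-filled backing file that stands in for a PM region."""
    with open(path, "wb") as f:
        f.truncate(size)


def store(path, offset, payload):
    """Update the region with memory-style stores, then flush for durability."""
    with open(path, "r+b") as f:
        with mmap.mmap(f.fileno(), 0) as region:
            # Slice assignment writes into the mapping; no write() call is made.
            region[offset:offset + len(payload)] = payload
            # Ask the OS to push the modified pages to the backing file.
            region.flush()


def load(path, offset, length):
    """Read bytes back through a fresh mapping, as a later run would."""
    with open(path, "r+b") as f:
        with mmap.mmap(f.fileno(), 0) as region:
            return bytes(region[offset:offset + length])


def demo():
    """Store a value, drop the mapping, and read it back from a new one."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "pmem_region.bin")  # hypothetical file name
        create_region(path)
        store(path, 128, b"hello pmem")
        return load(path, 128, 10)


if __name__ == "__main__":
    print(demo())
```

The application touches the data as if it were memory, and the explicit flush is the only step that resembles traditional storage I/O.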
So… What Products/Solutions and Markets Are We Talking About?
Momentum is building, and recognition of the existence and potential of these innovations is growing as they make their way into the market; the expectation is that they will be the ‘commodity’ hardware of the future. It takes time for the data center ecosystem to embrace a change unless the change is transparent, dramatically improves existing architectures and is available for easy experimentation.
Accordingly, to really shake things up and move things at a faster pace, there is a very attractive opportunity to roll out a solution/product (a software stack packaged with these innovative components) that:
- Tangibly and transparently delivers the benefit to existing workloads and tremendously accelerates them.
- Creates opportunity for easier experimentation and deployment of newer workloads by supporting the new open standards, so that newer applications can be developed on the platform.
I am sure there are many like-minded people out there who are already working on a product to accomplish similar goals. I would love to hear from you about your progress or your thoughts in general 🙂
In my next blog post, I will cover a few other hardware innovations, their impact, why they need to be part of the overall solution too, and how they may affect the converged solutions of today.
This blog post is a product of research paper reviews and my past experience, but most importantly of the discussions I had about my initial idea with several experts in my network. I am thankful to all of them for their time in brainstorming. Further, thanks to all the researchers who are publishing great papers in this area and keeping everyone enlightened with their results.