Alluxio, a company that traces its roots to the 2014 open source project Tachyon at UC Berkeley’s AMPLab, has launched Alluxio Enterprise AI (AEAI), a platform focused on deep learning and generative AI applications. The new platform runs on the established Alluxio data platform and adds GPU optimization along with specific machine learning (ML) and deep learning library integrations.
From serious start to AI destination
While the original Alluxio data platform has always seemed similar to a data virtualization platform, it is actually quite different from most platforms that carry that label. Adit Madan, director of product management at Alluxio, said in a briefing with The New Stack: “We have a lot of features like distributed caching and global access to data, regardless of where it comes from,” he continued, “[…] with people starting to adopt hybrid cloud and multiple object stores […] for people with a multicloud strategy.”
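The distributed caching and global-namespace idea Madan describes can be sketched in a few lines. The following is a purely illustrative toy, not Alluxio's actual architecture or API: a read-through cache that fronts several remote stores and routes requests by URI scheme, so repeat reads never touch remote storage. All class and store names here are hypothetical.

```python
# Illustrative sketch only: a minimal read-through cache fronting several
# "remote" stores, in the spirit of distributed caching plus a global
# namespace. The stores are plain dicts standing in for S3 and HDFS.

class CachingDataLayer:
    """Serve reads from a local cache; fall back to the owning store on a miss."""

    def __init__(self, stores):
        self.stores = stores          # scheme -> {path: bytes}, e.g. "s3", "hdfs"
        self.cache = {}               # unified local cache keyed by full URI
        self.backend_reads = 0        # how often we had to hit remote storage

    def read(self, uri):
        if uri in self.cache:                      # cache hit: no remote I/O
            return self.cache[uri]
        scheme, _, path = uri.partition("://")     # route by URI scheme
        data = self.stores[scheme][path]           # miss: fetch from owning store
        self.backend_reads += 1
        self.cache[uri] = data                     # populate cache for next time
        return data


layer = CachingDataLayer({
    "s3": {"bucket/train.parquet": b"cloud bytes"},
    "hdfs": {"/data/train.csv": b"on-prem bytes"},
})
layer.read("s3://bucket/train.parquet")   # first read goes to the backend
layer.read("s3://bucket/train.parquet")   # second read is served from cache
print(layer.backend_reads)                # -> 1
```

One access path regardless of where the data lives, plus repeat reads served locally, is the essence of the hybrid- and multicloud appeal Madan points to.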
Madan explained that Alluxio Enterprise AI works similarly in this regard, but has a particular focus on deep learning and large-scale model training. It provides optimizations for ML and deep learning libraries such as Spark, PyTorch and TensorFlow, and uses Alluxio’s new Distributed Object Repository Architecture (DORA) to deliver high-performance I/O on commodity storage. DORA works with both cloud object storage (helping you avoid egress fees in the process) and open source and commercial on-premises storage platforms such as the Hadoop Distributed File System (HDFS) and MinIO. DORA also provides workload-specific optimizations for ML training and analysis. Other features include a Kubernetes operator and data preloading for fast model deployment, which Alluxio says can cut production deployment times by a factor of two to three.
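The data-preloading idea amounts to warming a cache with a training job's objects before the job starts, so the first epoch never stalls on remote reads. A minimal sketch, with hypothetical names that are not Alluxio's actual API:

```python
# Hypothetical sketch of data preloading: copy the objects a training job
# will need from remote storage into a local cache ahead of time. The
# remote store is a plain dict standing in for an object store.

def preload(cache, remote_store, keys):
    """Warm the cache with each listed object, skipping ones already present."""
    for key in keys:
        if key not in cache:
            cache[key] = remote_store[key]   # one remote read per object
    return cache


remote = {f"shard-{i}": f"data-{i}".encode() for i in range(4)}
cache = preload({}, remote, list(remote))
# Training can now read every shard from the cache instead of remote storage.
print(sorted(cache))   # -> ['shard-0', 'shard-1', 'shard-2', 'shard-3']
```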
In doing all this, Alluxio Enterprise AI can accelerate end-to-end AI pipelines for workloads such as large language models (LLMs), natural language processing and computer vision. Alluxio can handle complex pipelines in this regard, including, for example, combining cloud and on-premises training jobs that draw data from both cloud object storage and HDFS, deploying new or updated models to the cloud, and serving inference requests (calls from downstream applications to a model deployed in production).
GPU optimization
Beyond that, however, AEAI does more than just accelerate performance. It also optimizes the use of GPUs (graphics processing units, the specialized chips that power demanding AI workloads). The trick is being aware of the ML and deep learning libraries that access the data, as well as the multiple data source platforms underneath. Madan said: “We are tightly integrated with higher-level compute applications, so we are aware of the PyTorch program, for example, and in it we have the concept of a DataLoader, which is responsible for I/O access. We can intelligently detect if and when data is being accessed and connect neatly to that infrastructure to provide this benefit.”
Alluxio says its AEAI platform reduces the share of training execution time spent in PyTorch’s DataLoader from more than 80% to less than 2%, which the company says boosts GPU utilization from under 20% to over 90%. Alluxio Enterprise AI can also train on cloud-based GPUs using on-premises training data. GPUs are expensive and may be in short supply within certain organizations, so this is a very strategic approach. Optimizing GPU usage can reduce the cost of both training and AI model inference, but Madan says the main benefit of GPU optimization is added capacity, allowing more model experimentation and training.
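The arithmetic behind those figures is easy to check under a deliberately simplified model: if a training step is split between DataLoader time and GPU compute, GPU utilization is roughly the compute share of the step. (Real pipelines overlap I/O and compute, so this is a back-of-the-envelope sketch, not a benchmark.)

```python
# Simplified model: a training step = DataLoader time + GPU compute time,
# so GPU utilization ~ the compute share of the step. Real training
# overlaps I/O and compute, but the rough numbers line up with the claim.

def gpu_utilization(dataloader_share):
    """Fraction of step time the GPU is busy, given the DataLoader's share."""
    return 1.0 - dataloader_share


before = gpu_utilization(0.80)   # DataLoader eats >80% of each step
after = gpu_utilization(0.02)    # DataLoader reduced to <2% of each step
print(f"{before:.0%} -> {after:.0%}")   # -> 20% -> 98%
```

Under this model, cutting the DataLoader's share from 80% to 2% moves utilization from about 20% to about 98%, consistent with the under-20%-to-over-90% jump Alluxio cites.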
LLM dividend
While this alone would be good enough, there is another benefit that is unique to the generative AI context. According to the company, the Alluxio Enterprise AI platform can also accelerate the fine-tuning of pretrained foundation models. Fine-tuning tailors a generalized LLM to the particular data and context of an individual organization — effectively a customized training path that is key to leveraging LLMs in a corporate environment.
With the introduction of these new capabilities, Alluxio has found a use case for its approach to data platform architecture that, for some AI adopters, will outweigh the data access benefits it was originally known for. Rather than serving as the query and analytics caching and acceleration layer that some people confuse with a data virtualization platform, Alluxio Enterprise AI is now a processing layer that makes machine learning, deep learning, LLM training, inference and tuning more practical and economical. This applies to large corporate organizations as well as small, cost-conscious ones.
Enabling AI at enterprise scale
It is difficult to predict whether AEAI will further increase Alluxio’s market traction. But even if it doesn’t, the approach is laudable and could set a new paradigm for AI platforms. Enterprises can’t just meander along, combining AI cloud services with general-purpose data analytics layers. Instead, they need platforms that eliminate the extra layers of processing, compute and resource usage that such hodgepodge approaches create. That’s the only way to scale AI, and without that scale, AI can’t become ubiquitous and practical.
While the cloud is an economical place to store the large volumes of data needed to train today’s AI models, it can be an inefficient place from which to access that data and carry out training. By highly optimizing file access to cloud object storage and making data available to GPUs wherever they sit, AEAI turns efficiencies that merely seemed elegant on the Alluxio data platform into tangible economic and time-to-market benefits in the growing world of applied AI.