Major new enhancements in Alluxio 2.9 include scaling out with tenant-level isolation, simplifying DevOps on Kubernetes, and strengthening the security of the S3 API
SAN MATEO, CA – November 16, 2022 – Alluxio, the developer of the open source data orchestration platform for data-driven workloads such as large-scale analytics and AI/ML, today announced the immediate availability of version 2.9 of its Data Orchestration Platform. This new release strengthens its position as the key layer between compute engines and storage systems by delivering support for a scale-out, multi-tenant architecture with a new cross-environment synchronization feature, enhanced manageability with significantly improved tooling and guidelines for deploying Alluxio on Kubernetes, and improved security and performance with a strengthened S3 API and POSIX API.
“We are running one thousand nodes of Alluxio to optimize model training jobs and interactive queries,” said Peng Chen, Engineer Manager in the Big Data Team, Tencent. “Alluxio has become the de-facto choice for large internet companies to accelerate the development of their data analytics and AI applications. We are excited about the enhanced Kubernetes feature of the new release, which will make managing Alluxio even easier.”
“We have been using Alluxio as the data cache layer on top of multiple data centers to speed up data access,” said Luo Li, Director of Data Infrastructure, Shopee. “Alluxio’s architecture enables us to support data ‘servitization.’ Furthermore, Alluxio has reduced our data infrastructure team's management overhead, especially for data distributed in multiple data centers, or even across countries.”
“Tenant-dedicated satellite clusters have become more common while architecting data platforms,” said Adit Madan, Director of Product Management, Alluxio. “Alluxio’s ability to actively synchronize metadata across multiple environments is significant, making the adoption of such an architecture easier than ever.”
Tenant isolation provides the scale and economic benefits of a multi-tenant architecture while rigorously preventing different teams from competing for access to shared data lake storage. With the new cross-environment synchronization feature, Alluxio evolves its architecture with significantly improved scalability and manageability, enabling data platform teams to deploy multiple per-tenant Alluxio clusters between compute and storage clusters across any environment, based on workload capacity.
Running Alluxio on Kubernetes helps standardize deployment methodologies across cloud, multi-cloud, hybrid-cloud, and on-premises environments. This new release introduces the Alluxio operator, which simplifies deploying, configuring, provisioning, and managing multiple Alluxio clusters, reducing DevOps complexity. Alluxio on Kubernetes also makes the data stack portable to any environment, preventing vendor lock-in.
Lastly, in Alluxio 2.9, authentication and access policies are now centralized in the communication between compute engines and Alluxio via the S3 API. As a result, Alluxio provides a unified security experience across heterogeneous storage, whether on premises or in the cloud.
“Alluxio’s data orchestration platform aims to simplify, secure, and accelerate data access in heterogeneous analytics environments,” said Kevin Petrie, VP of Research, Eckerson Group. “These v2.9 enhancements seek to give new analytics users, applications, and projects the resources they need, with less effort and higher confidence in meeting SLAs. Alluxio does this by helping enterprises manage metadata, containerized deployments, and the security of its APIs more effectively.”
Alluxio 2.9 Community and Enterprise Editions feature new capabilities, including:
Multi-Environment Cluster Synchronization
Alluxio 2.9 introduces the new cross-environment synchronization feature. This feature makes one Alluxio cluster aware of another by automatically syncing metadata between the clusters. Alluxio clusters deployed across any environment can therefore achieve tenant-level isolation while keeping their metadata in sync at scale. This feature is particularly useful when adopting a satellite architecture in which compute clusters are segregated across team-level tenants for isolation. With this new feature, a multi-tenant architecture with Alluxio allows the platform to scale out and onboard new use cases without a central resource bottleneck, ensuring SLAs and simplifying metadata management operations.
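For readers who want a concrete picture of what keeping metadata in sync between per-tenant clusters means, the following is a minimal toy sketch in Python. It is not Alluxio's implementation or API: the `ToyCluster` class, its method names, and the invalidate-on-write approach are all assumptions made purely for illustration. Each toy cluster caches file metadata from a shared store and tells its peers when a path changes, so a peer never serves a stale entry.

```python
# Toy model only: every name here is hypothetical and is NOT part of Alluxio.
class ToyCluster:
    """A per-tenant cluster that caches file metadata from shared storage."""

    def __init__(self, name):
        self.name = name
        self.metadata = {}  # path -> version this cluster has cached
        self.peers = []     # other clusters this cluster is aware of

    def connect(self, other):
        """Make two clusters aware of each other."""
        self.peers.append(other)
        other.peers.append(self)

    def write(self, path, storage):
        """Write through to shared storage, then notify peers of the change."""
        version = storage.get(path, 0) + 1
        storage[path] = version
        self.metadata[path] = version
        for peer in self.peers:
            peer.invalidate(path)

    def invalidate(self, path):
        """Drop a cached entry so the next read reloads it from storage."""
        self.metadata.pop(path, None)

    def stat(self, path, storage):
        """Return the cached version, loading it from storage if absent."""
        if path not in self.metadata:
            self.metadata[path] = storage[path]
        return self.metadata[path]


shared_storage = {}
team_a = ToyCluster("team-a")
team_b = ToyCluster("team-b")
team_a.connect(team_b)

team_a.write("/warehouse/events", shared_storage)
first_seen_by_b = team_b.stat("/warehouse/events", shared_storage)

team_b.write("/warehouse/events", shared_storage)
latest_seen_by_a = team_a.stat("/warehouse/events", shared_storage)
```

In this sketch, `team_a` sees the update written by `team_b` because the write invalidated its cached entry; without the peer notification it would keep returning the older version.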
Extended Manageability for Kubernetes
Alluxio 2.9 adds the Alluxio operator for Kubernetes. Administrators can now deploy and manage Alluxio on Kubernetes easily through the operator and its custom resource definitions (CRDs). The operator offers configuration management for deployment, connections to under storage, configuration updates, and uninstallation. Using the Alluxio operator removes the burden of deploying Alluxio in different environments, greatly reduces the amount of manual work, and simplifies DevOps when managing multiple instances of Alluxio.
Enhanced S3 API Security with Better User Experience
Alluxio 2.9 further strengthens its S3 API, providing a unified security model to applications with a better user experience. By adopting the open authentication protocol for the S3 API, Alluxio verifies users before their requests are processed. This new feature allows data platform teams to connect to more advanced identity management systems (such as PingFederate) and leverage single sign-on (SSO) to enhance the user experience. With a uniform authentication and authorization model, applications connected to Alluxio are portable across on-premises, hybrid, and multi-cloud environments.
Availability
Free downloads of Alluxio 2.9 open source Community Edition and trials of Alluxio Enterprise Edition are immediately available here: https://www.alluxio.io/download/
Supporting Resources
- Check out these case studies to learn more about the multi-environment data platform architecture:
  - Expedia Group has implemented Alluxio to federate cross-region data lakes in AWS. Alluxio unifies geo-distributed data silos without replication, enabling consistent, high performance with about 50% lower costs: https://www.alluxio.io/blog/unifying-cross-region-access-in-the-cloud-at-expedia-group-the-path-toward-data-mesh-in-the-brand-world/
  - A Fortune 50 technology company has successfully implemented Alluxio to achieve a hybrid-cloud strategy, become multi-cloud ready, cut costs, and boost agility: https://www.alluxio.io/app/uploads/2022/10/Fortune-50-Case-Study-2-pager.pdf
- Join the webinar to get a deep dive into Alluxio 2.9: https://us06web.zoom.us/webinar/register/WN_OhwAKADTQbenc9AZIErXFA
- Come visit Alluxio at PrestoCon
  - Register now: https://events.linuxfoundation.org/prestocon/register/
  - Join the session from 1:45 pm to 2:20 pm PT on Thursday, December 8: https://sched.co/1CzYl
Tweet this: @AlluxioIO re-imagines architecture for multi-tenant environments at scale #cloud #opensource #analytics #AI https://bit.ly/3NqQIub
About Alluxio
Alluxio, a leading provider of the high performance data platform for analytics and AI,
accelerates time-to-value of data and AI initiatives and maximizes infrastructure ROI. Uniquely
positioned at the intersection of compute and storage systems, Alluxio has a universal view of
workloads on the data platform across stages of a data pipeline. This enables Alluxio to provide
high performance data access regardless of where the data resides, simplify data engineering,
optimize GPU utilization, and reduce cloud and storage costs. With Alluxio, organizations can
achieve magnitudes faster model training and serving without the need for specialized storage,
and build AI infrastructure on existing data lakes. Backed by leading investors, Alluxio powers
technology, internet, financial services, and telecom companies, including 9 out of the top 10
internet companies globally. To learn more, visit www.alluxio.io.
Media Contact:
Beth Winkowski
Winkowski Public Relations, LLC for Alluxio
978-649-7189
beth@alluxio.com
Alluxio provides a powerful unified framework for managing AI workloads across various distributed environments. By leveraging its capabilities, organizations can optimize their data access and processing, ensuring that AI models are trained efficiently and effectively.
Alluxio Enterprise AI serves as a robust distributed filesystem designed to streamline the management of AI workloads across various infrastructure environments. This platform enables seamless data sharing across business units and geographical locations, effectively eliminating the bottlenecks typically associated with data lake silos.