Alluxio learning center
Beginner to advanced topics on analytics, AI/ML, storage, and cloud concepts
Presto was originally designed at Facebook to run fast, interactive queries against Hadoop data warehouses storing petabytes of data.
A typical Presto deployment will include one Presto Coordinator and any number of Presto Workers. In practice, you might deploy Presto in the cloud or on-prem.
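To make the coordinator/worker split concrete, here is a minimal sketch that renders the `etc/config.properties` file for each role. The property names are standard Presto settings. The helper `presto_config`, the host name, and the port are assumptions chosen for illustration, and a real deployment would normally also set memory limits and other options.

```python
def presto_config(is_coordinator, discovery_uri, port=8080):
    """Render a minimal etc/config.properties for one Presto node (illustrative helper)."""
    lines = [
        "coordinator={}".format("true" if is_coordinator else "false"),
        "http-server.http.port={}".format(port),
    ]
    if is_coordinator:
        # Keep query work off the coordinator and run the embedded discovery service on it.
        lines.append("node-scheduler.include-coordinator=false")
        lines.append("discovery-server.enabled=true")
    # Every node, coordinator or worker, registers with the same discovery URI.
    lines.append("discovery.uri={}".format(discovery_uri))
    return "\n".join(lines) + "\n"


# Hypothetical host name, for illustration only.
DISCOVERY_URI = "http://coordinator.example.com:8080"

coordinator_props = presto_config(True, DISCOVERY_URI)
worker_props = presto_config(False, DISCOVERY_URI)

print(coordinator_props)
print(worker_props)
```

The same worker file can be reused on any number of workers, since each one only needs to know where the coordinator's discovery service lives.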
What is a query engine, and more specifically, a SQL query engine? Learn about the benefits of using one, along with examples.
Amazon Elastic MapReduce (EMR) is a tool for processing and analyzing big data quickly, using tools like Spark, Hive, HBase, and Presto along with storage (like S3) and compute capacity (like EC2).
The key differences between Amazon EMR and EC2, and how EMR works.
Amazon EMR provides scalable compute in the cloud, including interactive queries with Presto, for big data in S3 storage.
GPU acceleration, or graphics processing unit acceleration, is a computing technique that uses graphics processing units (GPUs) alongside central processing units (CPUs) to accelerate the performance of data-intensive applications.
Computer vision is the ability of computers to recognize, analyze, and process visual content the way humans do. With AI technologies and algorithms, computers can learn to understand the patterns and traits of visual data.
Apache Spark is an open source analytics framework for big data, AI, and machine learning best used for large-scale data processing.
Apache Spark includes Spark Core and four libraries: Spark SQL, MLlib, GraphX, and Spark Streaming. Individual applications will typically require Spark Core and at least one of these libraries.
Hadoop Distributed File System (HDFS) is the primary data storage system used by Hadoop applications. It is a distributed file system and provides high-throughput access to application data.
Learn basic HDFS commands in Linux, enabling you to create and list directories; move, delete, and read files; and more.
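As a rough illustration of those basic commands, the sketch below builds `hdfs dfs` invocations for creating and listing a directory and for uploading, reading, moving, and deleting a file. The paths and the helper names `hdfs_cmd` and `run_all` are hypothetical. By default the script only prints the commands (a dry run), so it does not need a running cluster.

```python
import shlex
import subprocess


def hdfs_cmd(action, *args):
    """Build the argument list for `hdfs dfs -<action> <args...>`."""
    return ["hdfs", "dfs", "-" + action] + list(args)


# Hypothetical paths, for illustration only.
commands = [
    hdfs_cmd("mkdir", "-p", "/user/demo/input"),            # create a directory
    hdfs_cmd("ls", "/user/demo"),                           # list a directory
    hdfs_cmd("put", "local.txt", "/user/demo/input/"),      # upload a local file
    hdfs_cmd("cat", "/user/demo/input/local.txt"),          # read a file
    hdfs_cmd("mv", "/user/demo/input/local.txt",
             "/user/demo/archive.txt"),                     # move a file
    hdfs_cmd("rm", "/user/demo/archive.txt"),               # delete a file
]


def run_all(cmds, dry_run=True):
    """Print each command, or execute it when dry_run is False."""
    for cmd in cmds:
        if dry_run:
            print(shlex.join(cmd))
        else:
            subprocess.run(cmd, check=True)


run_all(commands)
```

Calling `run_all(commands, dry_run=False)` on a machine with the Hadoop client installed would execute the same commands in order and stop at the first failure.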