MySQL HeatWave Lakehouse

MySQL HeatWave Lakehouse (beta)

MySQL HeatWave enables users to process and query hundreds of terabytes of data in the object store in a variety of file formats, such as CSV, Parquet, and Aurora/Redshift export files. The data remains in the object store, and customers can query it with standard SQL syntax. With this capability, MySQL HeatWave provides one service for transaction processing, analytics across data warehouses and data lakes, and machine learning, without ETL across cloud services. There is no additional cost for this capability beyond the cost of storing the data in the object store.

Faster than Snowflake and Amazon Redshift

In a 400 TB TPC-H benchmark, the query performance of MySQL HeatWave Lakehouse is 17X faster than Snowflake and 6X faster than Amazon Redshift. Loading data is also significantly faster: in the same benchmark, the load performance of MySQL HeatWave Lakehouse is 8X faster than Amazon Redshift and 2.7X faster than Snowflake.

Fast lakehouse analytics on all data

Customers can query transactional data in MySQL databases, data in various formats in object storage, or a combination of both using standard MySQL commands. Querying the data in the database is as fast as querying data in the object store, as demonstrated by 10 TB and 30 TB TPC-H benchmarks.


MySQL HeatWave Lakehouse lets users process and query hundreds of terabytes of data in the object store.

Scale-out architecture for data management and query processing

HeatWave's massively partitioned architecture lets MySQL HeatWave Lakehouse scale out. Query processing and data management operations, such as loading or reloading data and node recovery, scale with the size of the data. Customers can query up to 400 TB of data with MySQL HeatWave Lakehouse, and the HeatWave cluster scales to 512 nodes.
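As a rough illustration of what the two stated limits imply, the arithmetic below divides the maximum data size by the maximum cluster size. It is only a back-of-the-envelope sketch: it assumes data is spread evenly across nodes, uses binary units (1 TB = 1024 GB), and ignores compression and in-memory encoding, none of which are specified here. The function name is hypothetical.

```python
# Stated limits from the text above.
MAX_DATA_TB = 400   # maximum queryable data size
MAX_NODES = 512     # maximum HeatWave cluster size


def per_node_share_gb(data_tb, nodes, gb_per_tb=1024):
    """Average raw data per node, assuming an even split (an assumption)."""
    return data_tb * gb_per_tb / nodes


share = per_node_share_gb(MAX_DATA_TB, MAX_NODES)
print(f"{share:.0f} GB of raw data per node at full scale")
```

Under these assumptions, halving the cluster to 256 nodes doubles the average share per node, which is the sense in which work per node depends on both data size and node count.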

Increase performance and ease of use with machine learning–powered automation

MySQL Autopilot capabilities such as auto provisioning and auto query plan improvement have been enhanced for MySQL HeatWave Lakehouse, further reducing database administration overhead and improving performance. The following new MySQL Autopilot capabilities are also available for MySQL HeatWave Lakehouse:

  • Auto schema inference automatically infers the mapping of file data to data types in the database. As a result, customers don’t need to manually specify the mapping for each new file to be queried by MySQL HeatWave Lakehouse, saving time and effort.
  • Adaptive data sampling intelligently samples portions of files in object storage, collecting accurate statistics with minimal data access. MySQL HeatWave uses these statistics to generate and improve query plans, to determine the optimal schema mapping, and for other purposes.
  • Auto load analyzes data to predict the load time into HeatWave, determines the mapping of data types, and automatically generates loading scripts. Users don’t have to manually specify the mapping of files to database schemas and tables.
  • Adaptive data flow dynamically adapts to the performance of the underlying object store. As a result, MySQL HeatWave gets the maximum available performance from the underlying cloud infrastructure, which improves overall performance, price-performance, and availability.
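To make the idea behind schema inference and sampling concrete, here is a minimal Python sketch. It is not how MySQL Autopilot is implemented; it only illustrates the general technique of drawing a bounded sample of rows from a CSV file and picking the narrowest type that fits every sampled value. The function names, the type labels, and the sample data are all hypothetical, and the sampler is a plain reservoir sample rather than anything adaptive.

```python
import csv
import io
import random
from datetime import datetime


def sample_rows(rows, k, seed=0):
    """Reservoir-sample up to k rows from an iterable in a single pass."""
    rng = random.Random(seed)
    reservoir = []
    for i, row in enumerate(rows):
        if i < k:
            reservoir.append(row)
        else:
            j = rng.randint(0, i)
            if j < k:
                reservoir[j] = row
    return reservoir


def infer_type(values):
    """Return the narrowest type label that fits every non-empty value."""
    non_empty = [v for v in values if v != ""]
    if not non_empty:
        return "VARCHAR"

    def all_parse(parse):
        try:
            for v in non_empty:
                parse(v)
            return True
        except ValueError:
            return False

    if all_parse(int):
        return "BIGINT"
    if all_parse(float):
        return "DOUBLE"
    if all_parse(lambda v: datetime.strptime(v, "%Y-%m-%d")):
        return "DATE"
    return "VARCHAR"


def infer_schema(csv_text, sample_size=100):
    """Map each header name to a type inferred from a sample of rows."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    sample = sample_rows(reader, sample_size)
    columns = list(zip(*sample)) if sample else [()] * len(header)
    return {name: infer_type(col) for name, col in zip(header, columns)}


# Hypothetical sample file contents.
CSV_TEXT = """order_id,amount,order_date,note
1,19.99,2024-01-05,first order
2,5,2024-01-06,
3,42.50,2024-02-11,gift
"""

schema = infer_schema(CSV_TEXT)
print(schema)
```

Because the types come from a sample, a value outside the sample can still violate the inferred type. A production system would need to handle that case, for example by widening the type or rejecting the row.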

Additional Resources