Saturday 20 November 2021

HeatWave in MySQL

 

HeatWave

HeatWave provisions HeatWave clusters for accelerated processing of analytic queries. A HeatWave cluster consists of a MySQL DB System node and two or more HeatWave nodes. The MySQL DB System node includes a HeatWave plugin that is responsible for cluster management, query scheduling, and returning query results to the MySQL DB System. HeatWave nodes store data in memory and process queries.

When a HeatWave cluster is enabled, queries that meet certain prerequisites are automatically offloaded from the MySQL DB System to the HeatWave cluster for accelerated execution. Queries are issued from a MySQL client or application that interacts with the HeatWave cluster by connecting to the MySQL DB System node. Results are returned to the MySQL DB System node and to the MySQL client or application that issued the query.

This guide describes how to deploy and manage HeatWave clusters. It is intended for HeatWave administrators and assumes familiarity with MySQL and related tools. After deploying a HeatWave cluster, refer to the HeatWave User's Guide for information about how to load data and run queries.

Alternatively, refer to one of the HeatWave Quickstarts:

  • The HeatWave tpch Quickstart walks you through adding a HeatWave cluster to a MySQL DB System, importing the tpch sample database into the into the DB System using the MySQL Shell Parallel Table Import Utility, manually loading data into HeatWave, and running queries.
  • The HeatWave airportdb Quickstart walks you through adding a HeatWave cluster to a MySQL DB System, importing the airportdb sample database into the DB System using the MySQL Shell Dump Load utility, loading data into HeatWave using Auto Parallel Load, and running queries.

For information about MySQL, see the product documentation available at MySQL Documentation.

Adding a HeatWave Cluster to a DB System

Ensure the following:
  • The DB System was created using a MySQL.HeatWave.VM.Standard.E3 or MySQL.HeatWave.BM.Standard.E3 shape. The MySQL.HeatWave.BM.Standard.E3 shape is recommended for data sets of 10TB or more.
To add a HeatWave cluster to an existing MySQL DB System:
  1. Open the navigation menu and select Databases. Under MySQL, click DB Systems.
  2. Add your HeatWave cluster in one of the following ways:
    • Select the DB System and choose Add HeatWave Cluster from the Actions icon (three dots) on the same line as your DB System.
    • Open the DB System and select HeatWave from the Resources list. On the HeatWave Cluster Information frame, select Add HeatWave Cluster.
    The Add HeatWave Cluster dialog is displayed.
  3. On the Add HeatWave Cluster dialog, provide the following details:
    • Shape: The shape for the HeatWave nodes. Shapes are resource templates, defining the number of OCPUs, RAM, and so on. Click Change Shape to select a shape for your HeatWave nodes.
       Note

      Currently, only the MySQL.HeatWave.VM.Standard.E3 shape is supported for HeatWave nodes.
    • Node Count: The number of HeatWave nodes to create. A maximum of 64 nodes is supported. The minimum number of nodes is 2. Optionally, click Estimate Node Count to generate a node count estimate to determine the number of nodes required based on the shape you selected and the size of your data. For instructions, see Generating a Node Count Estimate.
  4. Click Add HeatWave Cluster to create the HeatWave cluster.

Managing HeatWave Clusters

Editing a HeatWave Cluster

This topic describes how to edit a HeatWave cluster.

  1. Open the navigation menu and select Databases. Under MySQL, click DB Systems.
    A list of DB Systems is displayed.
  2. In the HeatWave filter, select Attached to filter the DB Systems by those with HeatWave clusters attached.
  3. Edit your HeatWave cluster in one of the following ways:
    • Select the DB System and choose Edit HeatWave Cluster from the Actions icon (three dots) on the same line as your DB System.
    • Open the DB System and select HeatWave from the Resources list.
  4. Click Edit.
    The Edit HeatWave Cluster dialog is displayed.
  5. On the Edit HeatWave Cluster dialog, you can change the number of nodes in the HeatWave cluster and run node count estimations.
    To determine the number of nodes you should create, click Estimate Node Count. For instructions, see Generating a Node Count Estimate.
  6. When the edits are complete, click Save Changes to apply the changes.

HeatWave Cluster Information

This section describes the HeatWave Cluster Information and HeatWave Node sections of the MySQL DB System Details page, where you can view, manage, and edit your HeatWave cluster .

Table 14-1 HeatWave Cluster Information

FieldDescription
ActionsThe following actions are possible:
Cluster SizeThe number of nodes in the cluster.
ShapeThe shape used for HeatWave cluster nodes.
StateThe state of the HeatWave cluster:
  • CREATING: Yellow icon. Resources are being reserved for the HeatWave cluster, and the cluster is being created. Provisioning can take several minutes. The cluster is not ready to use yet.

  • ACTIVE: Green icon. The HeatWave cluster was successfully created.

  • UPDATING: Yellow icon. The HeatWave cluster is in the process of starting, stopping, restarting, or updating after an edit.

  • INACTIVE: Grey icon. The HeatWave cluster is powered off by the stop action in the console or API.

  • DELETING: Orange icon. The HeatWave cluster is being deleted by the terminate action in the console or API.

  • DELETED: Grey icon. The HeatWave cluster has been deleted and is no longer available.

  • FAILED: Red icon. An error condition prevented the creation or continued operation of the HeatWave cluster.

CreatedThe date and time the HeatWave cluster was created.
Last UpdatedThe date and time the HeatWave cluster was last updated.

Table 14-2 HeatWave Nodes

ColumnDescription
Node IdThe name of the HeatWave node.
StateCurrent state of the HeatWave node:
  • CREATING: Yellow icon. Resources are being reserved for the HeatWave node, and the node is being created. Provisioning can take several minutes. The node is not ready to use yet.

  • ACTIVE: Green icon. The HeatWave node was successfully created.

  • UPDATING: Yellow icon. The HeatWave node is in the process of starting, stopping, restarting, or updating.

  • INACTIVE: Grey icon. The HeatWave node is powered off by the stop or restart action in the Console or API.

  • DELETING: Orange icon. The HeatWave node is being deleted by the terminate action in the Console or API.

  • DELETED: Grey icon. The HeatWave node has been deleted and is no longer available.

  • FAILED: Red icon. An error condition prevented the creation or continued operation of the HeatWave node.

CreatedThe date and time the HeatWave node was created.

Checking HeatWave Cluster Status

This topic describes how to check the status of a HeatWave cluster.

  1. Open the navigation menu and select Databases. Under MySQL, click DB Systems.
    A list of DB Systems is displayed.
  2. In the HeatWave filter, select Attached to filter the DB Systems by those with HeatWave clusters attached.
  3. Choose your Compartment.
  4. In the list of DB Systems, find yours and check the icon in the HeatWave State column. The color of the icon and the associated text indicate the status of the HeatWave cluster.
    • CREATING: Yellow icon. Resources are being reserved for the HeatWave cluster, and the HeatWave cluster is being created. Provisioning can take several minutes. The HeatWave cluster is not ready to use yet.

    • ACTIVE: Green icon. The HeatWave cluster was successfully created.

    • UPDATING: Yellow icon. The HeatWave cluster is in the process of starting, stopping, restarting, or updating.

    • INACTIVE: Grey icon. The HeatWave cluster is powered off by the stop or restart action in the Console or API.

    • DELETING: Orange icon. The HeatWave cluster is being deleted by the terminate action in the Console or API.

    • DELETED: Grey icon. The HeatWave cluster has been deleted and is no longer available.

    • FAILED: Red icon. An error condition prevented the creation or continued operation of the HeatWave cluster.

  5. To check the status of HeatWave nodes, click the name of the DB System to open the DB System Details page, then select HeatWave from the Resources list.

HeatWave Cluster Failure and Recovery

When a HeatWave node failure is detected, HeatWave automatically attempts to bring the node back online and reform the cluster.

 Note

Status of the HeatWave cluster can be monitored on the MySQL DB Systems list page. See Checking HeatWave Cluster Status for more information. HeatWave monitors node status using a heartbeat mechanism. If there is no response from a node after 60 seconds, the node is considered down, which triggers the recovery process.

When the HeatWave cluster is reformed, HeatWave attempts to reload data that was previously loaded. By default, data is reloaded from an Object Storage persistence layer that is created when the HeatWave cluster is enabled for the first time. To facilitate recovery, data is persisted to Object Storage when it is loaded into HeatWave and when data changes are propagated from the DB System to HeatWave. If reloading data from Object Storage fails for any reason, HeatWave reloads the data from the DB System. Loading data from Object Storage is faster because the data does not need to be converted to the HeatWave storage format, as is required when loading data from the DB System. Data removed from HeatWave when a table is unloaded is removed from Object Storage in a background operation.

To monitor the recovery process, you can connect to the DB System using a MySQL client and query the rapid_service_status variable, which reports the status of the HeatWave cluster. For information about connecting to the DB System, see Connecting to a DB System.

mysql> SHOW STATUS LIKE 'rapid_service_status';
+----------------------+--------------+
| Variable_name        | Value        |
+----------------------+--------------+
| rapid_service_status | ONLINE       |
+----------------------+--------------+
Possible status values are:
  • OFFLINE: The HeatWave cluster has not formed yet. No operations can run.

  • CLUSTERREADY: The HeatWave cluster has formed but some tables are pending recovery. Operations are permitted on tables that are loaded. To determine which tables are loaded, you can query the HeatWave Performance Schema tables.

    mysql> USE performance_schema;
    mysql> SELECT NAME, LOAD_STATUS FROM rpd_tables,rpd_table_id WHERE rpd_tables.ID = rpd_table_id.ID;
  • ONLINE: The HeatWave cluster is established and previously loaded tables are reloaded. Operations are permitted.

  • RECOVERYFAILED: There was an unsuccessful attempt to recover the tables after a HeatWave cluster failure. The cause of the failure could be a network error, a software failure, a hardware failure, or some other issue that cannot be resolved by the automated recovery process. In most such scenarios, stopping and starting the HeatWave cluster can resolve the issue.

    One possible cause is that the HeatWave cluster has run out of memory. After the HeatWave cluster is back online, ensure that you have enough nodes for your data by performing a node count estimate. See Generating a Node Count Estimate.

Starting and Stopping a HeatWave Cluster

This topic describes how to start or stop a HeatWave cluster.

  1. Open the navigation menu and select Databases. Under MySQL, click DB Systems.
    A list of DB Systems is displayed.
  2. In the HeatWave filter, select Attached to filter the DB Systems by those with HeatWave clusters attached.
  3. Choose your DB System.
  4. In the Resources list, select HeatWave.
    The HeatWave Cluster Information section is displayed.
  5. Select one of the following actions:
    • Start: Starts a stopped HeatWave cluster. After the HeatWave cluster is started, the Stop action is enabled and the Start option is disabled.
    • Stop: Stops a running HeatWave cluster. After the HeatWave cluster is stopped, the Start action is enabled.
    • Restart: Shuts down a HeatWave cluster, and restarts it. You must reload data to the cluster.
    Stopping the HeatWave cluster stops billing for the cluster. Billing resumes if you restart the HeatWave cluster.
     Note

    These actions have no affect on the DB System to which the HeatWave cluster is attached. However, start, stop, or restart actions on the DB System also affect the attached HeatWave cluster.

Deleting a HeatWave Cluster

This topic describes how to delete a HeatWave cluster. Deleting a HeatWave cluster removes the HeatWave cluster permanently.

  1. Open the navigation menu and select Databases. Under MySQL, click DB Systems.
    A list of DB Systems is displayed.
  2. In the HeatWave filter, select Attached to filter the DB Systems by those with a HeatWave cluster attached.
  3. Open your DB System and select HeatWave from the Resources list.
  4. To delete the HeatWave cluster, click Delete and follow the instructions.
    Deleting the HeatWave cluster does not delete the DB System or any of the data.
     Note

    Deleting the HeatWave cluster has no affect on the DB System to which the HeatWave cluster is attached. However, deleting the DB System also deletes the attached HeatWave cluster.

HeatWave Cluster Size Estimates

This topic describes HeatWave Cluster Node Count Estimates.

HeatWave Cluster Node Count Estimates provide a recommendation for the number of HeatWave nodes needed to run a workload. Recommendations are based on machine learning. When the service is started, database tables on which HeatWave queries are run need to be loaded to HeatWave cluster memory. The size of the HeatWave cluster needed depends on the tables and columns to be loaded, and the compression achieved in memory for this data. Under-provisioning the HeatWave cluster results in a data load or query execution failure due to space limitations. Over-provisioning the HeatWave cluster results in additional costs for unneeded resources. Based on the database tables the user intends to load to memory, machine learning is used to intelligently predict the number of HeatWave nodes needed.

The following topics are described:

Generating a Node Count Estimate

This topic describes how to generate a HeatWave node count estimate to determine the number of nodes required for your data.

A node count estimate is generated using machine learning techniques based on the node shape that you select and the data present in the DB System associated with the HeatWave cluster. You can generate a node count estimate when adding a HeatWave cluster to a DB System, or at any time after to adjust the number of nodes as your data increases or decreases in size.

Prerequisites:
  • The data you intend to load into the HeatWave cluster must be present on the DB System.
  • Optionally, log into your DB System and run ANALYZE TABLE on tables you intend to load into the HeatWave cluster. Estimates should generally be valid without running ANALYZE TABLE, but running ANALYZE TABLE ensures that estimates are as accurate as possible.
To generate a node count estimate:
  1. Click Estimate Node Count on either the Add HeatWave Cluster or Edit HeatWave Cluster dialogs.
    The Estimate Node Count dialog is displayed.
  2. On the Estimate Node Count dialog, click Generate Estimate.
    If you recently generated a node count estimate, the previous estimate details are displayed. Click Regenerate Estimate to create a new estimate.
    The operation may take several minutes depending on the size and properties of your data. When the operation completes, you are presented with a table that displays information about the schemas that were evaluated. The information shown includes:
    • Name: The name of the schema.
    • Memory Size Estimate: The estimated amount of memory required for the schema.
    • Information: The number of tables in the schema and the number of tables with errors.
  3. Select the schemas that you want to include in the node count estimate.
    The estimate details in the Summary dialog box are adjusted automatically after modifying the schema selection.
  4. Optionally, expand the schema rows to view information about individual tables. Deselect tables that you do not want to include in the estimate.
     Note

    The Information column reports errors if there are problems with a table. For example, an error is reported for tables with unsupported column data types, tables without a primary key, or tables with too many columns. Tables with errors are not included in the node count estimate. You can regenerate the node count estimate after resolving the errors. For information about potential table errors, see Node Count Estimate Table Errors.
    The estimate details in the Summary are adjusted automatically after modifying the table selection.
  5. Review the estimate details in the Summary, which provides the following information:
    • Shape: The selected HeatWave node shape.
    • CPU Core Count: The CPU core count of the selected HeatWave node shape.
    • Memory Size: The memory size of the selected HeatWave node shape.
    • Node Count: The estimated number of HeatWave nodes required based on the data size and the selected HeatWave node shape.
    • Total Memory Required: The estimated amount of memory required for the HeatWave cluster based on the data size.
    • Total Memory Size: The total HeatWave cluster memory size, which is the memory size of the selected HeatWave node shape multiplied by the Node Count estimate.
     Note

    The load command that appears below the Summary is generated based on the schemas and tables selected for the node count estimate. The command can be used after the HeatWave cluster is provisioned to load the selected schemas and tables. You can run the command from any MySQL client that is connected to the DB System.
  6. To apply the node count estimate, click Apply Node Count Estimate.

Node Count Estimate Table Errors

This topic describes table errors that may appear in node count estimate results.

Node Count Estimate Table Errors

Table ErrorDescription
TOO MANY COLUMNS TO LOADThe table has too many columns. The column limit is 470.
ALL COLUMNS MARKED AS NOT SECONDARYThere are no columns to load. All table columns are defined as NOT SECONDARY.
CONTAINS VARLEN COLUMN WITH >8000 BYTESVARLEN column exceeds the 8000 byte limit. For more information on VARLEN, see VARLEN Encoding
ESTIMATION COULD NOT BE CALCULATEDThe estimate could not be calculated. For example, a table estimate may not be available if statistics for VARLEN columns are unavailable.
UNABLE TO LOAD TABLE WITHOUT PRIMARY KEYA table must be defined with a primary key before it can be loaded into HeatWave.

HeatWave tpch Quickstart

The HeatWave tpch Quickstart walks you through adding a HeatWave cluster to a MySQL DB System, importing data into the into the DB System using the MySQL Shell Parallel Table Import Utility, manually loading data into HeatWave, and running queries.

The following topics are described:

Prerequisites for the tpch Quickstart

This topic lists the prerequisites for the HeatWave tpch Quickstart.

Before you begin, ensure that you have the following:
  • An operational MySQL DB System created using the MySQL.HeatWave.VM.Standard.E3 or MySQL.HeatWave.BM.Standard.E3 shape. If you do not have a MySQL DB System, refer to Getting Started with MySQL Database Service for the steps required to create one. Make note of the IP address of the DB System's Endpoint, and the administration user and password.
  • A running Compute instance (Oracle Linux is used in this Quickstart) attached to a public subnet on the same VCN as the MySQL DB System. Make note of the public IP address of the compute instance. For information about setting up a Compute instance, refer to Creating a Compute Instance.
  • MySQL Shell 8.0.22 or higher installed on the Compute instance. For installation instructions, see Connecting to the MySQL DB System with SSH and MySQL Shell.
  • Access to Object Storage and an existing bucket.
  • A valid OCI CLI configuration file. See SDK and CLI Configuration File. If you have installed and configured the Command Line Interface in the default location, you have a valid configuration file. If you have not installed and configured the CLI, you must either install it, or create a configuration file manually. MySQL Shell references the CLI configuration file when accessing the Object storage bucket.
  • In addition to the mandatory policies for the MySQL DB System, you, or your group, have been granted the mysql-analytics policies described in Policy Details for MySQL Database Service.
  • The tpch sample database referenced in the Quickstart is an ad-hoc decision support database derived from the TPC Benchmark™ H (TPC-H) specification. The database consist of eight separate tables. For an overview of the database schema, refer to the Logical Database Design section of the specification document. Example DDL statements are provided for creating the schema and tables, but you must supply your own data files for populating the tpch sample database. You can generate data using the dbgen tool provided in the TPC-H Tools download from TPC Download Current. At least 1GB of data is recommended. For instructions, refer to the reference section of the HeatWave User Guide.

Uploading tpch Sample Data to Object Storage

This topic describes how to upload tpch sample data files to an Object Storage bucket.

The tpch sample data files referenced in the following instructions are in CSV format with fields terminated by the pipe "|" character. In total, the files should contain approximately 1GB of data. The tpch sample data files include:
  • nation.tbl
  • region.tbl
  • part.tbl
  • supplier.tbl
  • partsupp.tbl
  • customer.tbl
  • orders.tbl
  • lineitem.tbl
To upload the tpch sample data files to the Object Storage bucket:
  1. In the Oracle Cloud Infrastructure Console navigation menu, go to Object Storage, and then select Object Storage.
  2. From the list of Buckets, click Create Bucket or select your object storage bucket.
  3. On the Bucket Details page, under Objects, click Upload.
  4. In the Upload Objects panel, drag and drop the tpch sample database files to the dropzone, or click select files to locate them on your machine.
  5. Click Upload, and then click Close.
  6. Note the object URLs for each file that you uploaded. In the list of objects, click the Actions icon (three dots) for an object, and then click View Object Details. The object URLs are required later when loading data.

Creating the tpch Sample Database and Importing Data

This topic describes how to create the tpch sample database on the MySQL DB System and import the sample data. The sample data must be available on the MySQL DB System before it can be loaded into the HeatWave Cluster.

Sample database creation and import operations are performed using MySQL Shell. The MySQL Shell parallel table import utility provides fast data import for large data files. The utility analyzes an input data file, divides it into chunks, and uploads the chunks to the target MySQL DB System using parallel connections. The utility is capable of completing a large data import many times faster than a standard single-threaded upload using a LOAD DATA statement. For additional information, see Parallel Table Import Utility.

 Note

If you prefer to use your own data instead of the tpch sample database, refer to Importing and Exporting for information about loading data into the MySQL DB System. After loading data into the MySQL DB System, refer to the HeatWave User Guide for information about preparing data and loading it into the HeatWave Cluster.
To create the tpch sample database on the MySQL DB System and import data:
  1. SSH into the Compute instance using the public IP address of the Compute instance.
    ssh opc@computeInstancePublicIP
  2. Start MySQL Shell and connect to the MySQL DB System's endpoint:
    shell> mysqlsh --mysql Username@IPAddressOfMySQLDBSystemEndpoint 
    Please provide the password for 'Username@IPAddressOfMySQLDBSystemEndpoint':
    Save password for 'Username@IPAddressOfMySQLDBSystemEndpoint'? 
    [Y]es/[N]o/Ne[v]er (default No):

    The --mysql option opens a ClassicSession, which is required when using the MySQL Shell Parallel Table Import Utility.

    MySQL Shell opens in JavaScript execution mode by default.

    MySQL>JS>
  3. Change the MySQL Shell execution mode from JavaScript to SQL:
    MySQL>JS> \sql
  4. Create the tpch sample database and tables:
    CREATE DATABASE tpch character set utf8mb4;
    USE tpch;
    
    CREATE TABLE nation  ( N_NATIONKEY INTEGER primary key,
        N_NAME       CHAR(25) NOT NULL,
        N_REGIONKEY  INTEGER NOT NULL,
        N_COMMENT    VARCHAR(152));
    							
    CREATE TABLE region  ( R_REGIONKEY INTEGER primary key,
        R_NAME       CHAR(25) NOT NULL,
        R_COMMENT    VARCHAR(152));
    							
    CREATE TABLE part  ( P_PARTKEY INTEGER primary key,
        P_NAME        VARCHAR(55) NOT NULL,
        P_MFGR        CHAR(25) NOT NULL,
        P_BRAND       CHAR(10) NOT NULL,
        P_TYPE        VARCHAR(25) NOT NULL,
        P_SIZE        INTEGER NOT NULL,
        P_CONTAINER   CHAR(10) NOT NULL,
        P_RETAILPRICE DECIMAL(15,2) NOT NULL,
        P_COMMENT     VARCHAR(23) NOT NULL );
    						  
    CREATE TABLE supplier  ( S_SUPPKEY INTEGER primary key,
        S_NAME        CHAR(25) NOT NULL,
        S_ADDRESS     VARCHAR(40) NOT NULL,
        S_NATIONKEY   INTEGER NOT NULL,
        S_PHONE       CHAR(15) NOT NULL,
        S_ACCTBAL     DECIMAL(15,2) NOT NULL,
        S_COMMENT     VARCHAR(101) NOT NULL);
    							 
    CREATE TABLE partsupp  ( PS_PARTKEY INTEGER NOT NULL,
        PS_SUPPKEY     INTEGER NOT NULL,
        PS_AVAILQTY    INTEGER NOT NULL,
        PS_SUPPLYCOST  DECIMAL(15,2)  NOT NULL,
        PS_COMMENT     VARCHAR(199) NOT NULL, primary key (ps_partkey, ps_suppkey) );
    						
    CREATE TABLE customer  ( C_CUSTKEY INTEGER primary key,
        C_NAME        VARCHAR(25) NOT NULL,
        C_ADDRESS     VARCHAR(40) NOT NULL,
        C_NATIONKEY   INTEGER NOT NULL,
        C_PHONE       CHAR(15) NOT NULL,
        C_ACCTBAL     DECIMAL(15,2)   NOT NULL,
        C_MKTSEGMENT  CHAR(10) NOT NULL,
        C_COMMENT     VARCHAR(117) NOT NULL);
    							 
    CREATE TABLE orders  ( O_ORDERKEY INTEGER primary key,
        O_CUSTKEY        INTEGER NOT NULL,
        O_ORDERSTATUS    CHAR(1) NOT NULL,
        O_TOTALPRICE     DECIMAL(15,2) NOT NULL,
        O_ORDERDATE      DATE NOT NULL,
        O_ORDERPRIORITY  CHAR(15) NOT NULL,
        O_CLERK          CHAR(15) NOT NULL,
        O_SHIPPRIORITY   INTEGER NOT NULL,
        O_COMMENT        VARCHAR(79) NOT NULL);
    						   
    CREATE TABLE lineitem ( L_ORDERKEY INTEGER NOT NULL,
        L_PARTKEY     INTEGER NOT NULL,
        L_SUPPKEY     INTEGER NOT NULL,
        L_LINENUMBER  INTEGER NOT NULL,
        L_QUANTITY    DECIMAL(15,2) NOT NULL,
        L_EXTENDEDPRICE  DECIMAL(15,2) NOT NULL,
        L_DISCOUNT    DECIMAL(15,2) NOT NULL,
        L_TAX         DECIMAL(15,2) NOT NULL,
        L_RETURNFLAG  CHAR(1) NOT NULL,
        L_LINESTATUS  CHAR(1) NOT NULL,
        L_SHIPDATE    DATE NOT NULL,
        L_COMMITDATE  DATE NOT NULL,
        L_RECEIPTDATE DATE NOT NULL,
        L_SHIPINSTRUCT CHAR(25) NOT NULL,
        L_SHIPMODE     CHAR(10) NOT NULL,
        L_COMMENT      VARCHAR(44) NOT NULL,
        primary key(L_ORDERKEY,L_LINENUMBER));
  5. Verify that the tpch schema and tables were created:
    MySQL>SQL> SHOW TABLES;
    +----------------+
    | Tables_in_tpch |
    +----------------+
    | customer       |
    | lineitem       |
    | nation         |
    | orders         |
    | part           |
    | partsupp       |
    | region         |
    | supplier       |
    +----------------+
  6. Change back to JavaScript execution mode to use the parallel table import utility:
    MySQL>SQL> \js
  7. Execute the following operations to import the data into the tpch database on the MySQL DB System. The object URLs for your data files will appear similar to "https://objectstorage.us-ashburn-1.oraclecloud.com/n/axipyfug89rs/b/bucket-name/o/nation.tbl". Object URLs are found in the Object Storage bucket. In the list of objects, click the Actions icon (three dots) for an object, and then click View Object Details.
     Note

    For information about the util.importTable() options used in the following commands, see MySQL Shell Parallel Table Import Utility. The number of parallel threads specified using the threads option depends on the number of CPU cores of the shape. It is assumed that sample data fields are terminated by the pipe "|" character.
    MySQL>JS> util.importTable("https://objectstorage.us-ashburn-1.oraclecloud.com/n/axipyfug89sr/b/bucket-name/o/nation.tbl", {schema:"tpch", table:"nation", fieldsTerminatedBy:"|", bytesPerChunk:"100M", threads:16})/nation.tbl", {schema:"tpch", table:"nation", fieldsTerminatedBy:"|", bytesPerChunk:"100M", threads:16, skipRows:1})
    MySQL>JS> util.importTable("https://objectstorage.us-ashburn-1.oraclecloud.com/n/axipyfug89sr/b/bucket-name/o/region.tbl", {schema:"tpch", table:"region", fieldsTerminatedBy:"|", bytesPerChunk:"100M", threads:16})
    MySQL>JS> util.importTable("https://objectstorage.us-ashburn-1.oraclecloud.com/n/axipyfug89sr/b/bucket-name/o/part.tbl", {schema:"tpch", table:"part", fieldsTerminatedBy:"|", bytesPerChunk:"100M", threads:16})
    MySQL>JS> util.importTable("https://objectstorage.us-ashburn-1.oraclecloud.com/n/axipyfug89sr/b/bucket-name/o/supplier.tbl", {schema:"tpch", table:"supplier", fieldsTerminatedBy:"|", bytesPerChunk:"100M", threads:16})
    MySQL>JS> util.importTable("https://objectstorage.us-ashburn-1.oraclecloud.com/n/axipyfug89sr/b/bucket-name/o/partsupp.tbl", {schema:"tpch", table:"partsupp", fieldsTerminatedBy:"|", bytesPerChunk:"100M", threads:16})
    MySQL>JS> util.importTable("https://objectstorage.us-ashburn-1.oraclecloud.com/n/axipyfug89sr/b/bucket-name/o/customer.tbl", {schema:"tpch", table:"customer", fieldsTerminatedBy:"|", bytesPerChunk:"100M", threads:16})
    MySQL>JS> util.importTable("https://objectstorage.us-ashburn-1.oraclecloud.com/n/axipyfug89sr/b/bucket-name/o/orders.tbl", {schema:"tpch", table:"orders", fieldsTerminatedBy:"|", bytesPerChunk:"100M", threads:16})
    MySQL>JS> util.importTable("https://objectstorage.us-ashburn-1.oraclecloud.com/n/axipyfug89sr/b/bucket-name/o/lineitem.tbl", {schema:"tpch", table:"lineitem", fieldsTerminatedBy:"|", bytesPerChunk:"100M", threads:16})

Adding a HeatWave Cluster

This topic describes how to add a HeatWave cluster to your MySQL DB System.

To add a HeatWave cluster:

  1. Open the navigation menu and select Databases. Under MySQL, click DB System.
  2. On the MySQL DB Systems page, select your DB System.
  3. On the MySQL DB Systems Details page, select Add HeatWave Cluster from the More Actions drop-down menu.
  4. On the Add HeatWave Cluster dialog, click Select Shape to select a shape for your HeatWave nodes. The shape defines the number of CPU cores, the amount of RAM, and so on. Currently, only the MySQL.HeatWave.VM.Standard.E3 shape is supported for HeatWave nodes.
  5. Define the number of HeatWave nodes to create. The default number of nodes is 2, which is a sufficient number of nodes for the HeatWave tpch Quickstart, assuming roughly 1GB of data. Optionally, you can perform a node count estimate to determine an appropriate number of nodes to create. For details, see Generating a Node Count Estimate.
  6. Click Add HeatWave Cluster to create the HeatWave cluster.

Loading tpch Data Into the HeatWave Cluster

This topic describes how to load tpch data into the HeatWave cluster.

Loading data into the HeatWave cluster typically involves identifying the tables you want to load, excluding columns that are not required or supported, encoding string columns, defining RAPID as the secondary engine, and executing table load operations.
For detailed information about preparing and loading tables, refer to the HeatWave User Guide.
To load the tpch sample data into the HeatWave cluster:
  1. SSH into the Compute instance using the public IP address of the Compute instance.
    ssh opc@computeInstancePublicIP
  2. Start MySQL Shell and connect to the MySQL DB System's endpoint:
    shell> mysqlsh Username@IPAddressOfMySQLDBSystemEndpoint
    MySQL Shell opens in JavaScript execution mode by default.
    MySQL>JS>
  3. Change the MySQL Shell execution mode to SQL:
    MySQL>JS> \sql
  4. Execute the following operations to prepare the tpch sample database tables and load them into the HeatWave cluster. The operations performed include defining string column encodings, defining the secondary engine, and executing SECONDARY_LOAD operations.
    USE tpch;
    
    ALTER TABLE nation modify `N_NAME` CHAR(25) NOT NULL COMMENT 'RAPID_COLUMN=ENCODING=SORTED';
    ALTER TABLE nation modify `N_COMMENT` VARCHAR(152) COMMENT 'RAPID_COLUMN=ENCODING=SORTED';
    ALTER TABLE nation SECONDARY_ENGINE=RAPID;
    ALTER TABLE nation SECONDARY_LOAD;
    
    ALTER TABLE region modify `R_NAME` CHAR(25) NOT NULL COMMENT 'RAPID_COLUMN=ENCODING=SORTED';
    ALTER TABLE region modify `R_COMMENT` VARCHAR(152) COMMENT 'RAPID_COLUMN=ENCODING=SORTED';
    ALTER TABLE region SECONDARY_ENGINE=RAPID;
    ALTER TABLE region SECONDARY_LOAD;
    
    ALTER TABLE part modify `P_MFGR` CHAR(25) NOT NULL COMMENT 'RAPID_COLUMN=ENCODING=SORTED';
    ALTER TABLE part modify `P_BRAND` CHAR(10) NOT NULL COMMENT 'RAPID_COLUMN=ENCODING=SORTED';
    ALTER TABLE part modify `P_CONTAINER` CHAR(10) NOT NULL COMMENT 'RAPID_COLUMN=ENCODING=SORTED';
    ALTER TABLE part modify `P_COMMENT` VARCHAR(23) NOT NULL COMMENT 'RAPID_COLUMN=ENCODING=SORTED';
    ALTER TABLE part SECONDARY_ENGINE=RAPID;
    ALTER TABLE part SECONDARY_LOAD;
    
    ALTER TABLE supplier modify `S_NAME` CHAR(25) NOT NULL COMMENT 'RAPID_COLUMN=ENCODING=SORTED';
    ALTER TABLE supplier modify `S_ADDRESS` VARCHAR(40) NOT NULL COMMENT 'RAPID_COLUMN=ENCODING=SORTED';
    ALTER TABLE supplier modify `S_PHONE` CHAR(15) NOT NULL COMMENT 'RAPID_COLUMN=ENCODING=SORTED';
    ALTER TABLE supplier SECONDARY_ENGINE=RAPID;
    ALTER TABLE supplier SECONDARY_LOAD;
    
    ALTER TABLE partsupp modify `PS_COMMENT` VARCHAR(199) NOT NULL COMMENT 'RAPID_COLUMN=ENCODING=SORTED';
    ALTER TABLE partsupp SECONDARY_ENGINE=RAPID;
    ALTER TABLE partsupp SECONDARY_LOAD;
    
    ALTER TABLE customer modify `C_NAME` VARCHAR(25) NOT NULL COMMENT 'RAPID_COLUMN=ENCODING=SORTED';
    ALTER TABLE customer modify `C_ADDRESS` VARCHAR(40) NOT NULL COMMENT 'RAPID_COLUMN=ENCODING=SORTED';
    ALTER TABLE customer modify `C_MKTSEGMENT` CHAR(10) NOT NULL COMMENT 'RAPID_COLUMN=ENCODING=SORTED';
    ALTER TABLE customer modify `C_COMMENT` VARCHAR(117) NOT NULL COMMENT 'RAPID_COLUMN=ENCODING=SORTED';
    ALTER TABLE customer SECONDARY_ENGINE=RAPID;
    ALTER TABLE customer SECONDARY_LOAD;
    
    ALTER TABLE orders modify `O_ORDERSTATUS` CHAR(1) NOT NULL COMMENT 'RAPID_COLUMN=ENCODING=SORTED';
    ALTER TABLE orders modify `O_ORDERPRIORITY` CHAR(15) NOT NULL COMMENT 'RAPID_COLUMN=ENCODING=SORTED';
    ALTER TABLE orders modify `O_CLERK` CHAR(15) NOT NULL COMMENT 'RAPID_COLUMN=ENCODING=SORTED';
    ALTER TABLE orders SECONDARY_ENGINE=RAPID;
    ALTER TABLE orders SECONDARY_LOAD;
    
    ALTER TABLE lineitem modify `L_RETURNFLAG` CHAR(1) NOT NULL COMMENT 'RAPID_COLUMN=ENCODING=SORTED';
    ALTER TABLE lineitem modify `L_LINESTATUS` CHAR(1) NOT NULL COMMENT 'RAPID_COLUMN=ENCODING=SORTED';
    ALTER TABLE lineitem modify `L_SHIPINSTRUCT` CHAR(25) NOT NULL COMMENT 'RAPID_COLUMN=ENCODING=SORTED';
    ALTER TABLE lineitem modify `L_SHIPMODE` CHAR(10) NOT NULL COMMENT 'RAPID_COLUMN=ENCODING=SORTED';
    ALTER TABLE lineitem modify `L_COMMENT` VARCHAR(44) NOT NULL COMMENT 'RAPID_COLUMN=ENCODING=SORTED';
    ALTER TABLE lineitem SECONDARY_ENGINE=RAPID;
    ALTER TABLE lineitem SECONDARY_LOAD;
  5. Verify that the tpch sample database tables are loaded in the HeatWave cluster by querying LOAD_STATUS data from the HeatWave Performance Schema tables. Loaded tables have an AVAIL_RPDGSTABSTATE load status.
    MySQL>SQL> USE performance_schema;
    MySQL>SQL> SELECT NAME, LOAD_STATUS FROM rpd_tables,rpd_table_id
               WHERE rpd_tables.ID = rpd_table_id.ID;
    +------------------------------+---------------------+
    | NAME                         | LOAD_STATUS         |
    +------------------------------+---------------------+
    | tpch.supplier                | AVAIL_RPDGSTABSTATE |
    | tpch.partsupp                | AVAIL_RPDGSTABSTATE |
    | tpch.orders                  | AVAIL_RPDGSTABSTATE |
    | tpch.lineitem                | AVAIL_RPDGSTABSTATE |
    | tpch.customer                | AVAIL_RPDGSTABSTATE |
    | tpch.nation                  | AVAIL_RPDGSTABSTATE |
    | tpch.region                  | AVAIL_RPDGSTABSTATE |
    | tpch.part                    | AVAIL_RPDGSTABSTATE |
    +------------------------------+---------------------+

Running tpch Queries

This topic describes how to query tpch data in the HeatWave cluster.

After tables are loaded into the HeatWave cluster, queries that qualify are automatically offloaded to the HeatWave cluster for accelerated processing. To run queries:
  1. SSH into the Compute instance using the public IP address of the Compute instance.
    ssh opc@computeInstancePublicIP
  2. Start MySQL Shell and connect to the MySQL DB System's endpoint:
    shell> mysqlsh Username@IPAddressOfMySQLDBSystemEndpoint
    MySQL Shell opens in JavaScript execution mode by default.
    MySQL>JS>
  3. Change the MySQL Shell execution mode to SQL:
    MySQL>JS> \sql
  4. Change to the tpch database.
    MySQL>SQL> USE tpch;
    Default schema set to `tpch`.Fetching table and column names from `tpch` for 
    auto-completion... Press ^C to stop.
  5. Before running a query, use EXPLAIN to verify that the query can be offloaded to the HeatWave cluster. For example:
    MySQL>SQL> EXPLAIN SELECT SUM(l_extendedprice * l_discount) AS revenue 
               FROM lineitem WHERE l_shipdate >= date '1994-01-01'\G
    *************************** 1. row ***************************
               id: 1
      select_type: SIMPLE
            table: lineitem
       partitions: NULL
             type: ALL
    possible_keys: NULL
              key: NULL
          key_len: NULL
              ref: NULL
             rows: 56834662
         filtered: 33.33
            Extra: Using where; Using secondary engine RAPID
    If the query can be offloaded, the Extra column in the EXPLAIN output reports "Using secondary engine RAPID".
  6. After verifying that the query can be offloaded, run the query and note the execution time.
    MySQL>SQL> SELECT SUM(l_extendedprice * l_discount) AS revenue 
    FROM lineitem WHERE l_shipdate >= date '1994-01-01';
    +------------------+
    | revenue          |
    +------------------+
    | 82752894454.9036 |
    +------------------+
    1 row in set (0.04 sec)
  7. To compare the HeatWave execution time with MySQL DB System execution time, disable the use_secondary_engine variable to see how long it takes to run the same query on the MySQL DB System. For example:
    MySQL>SQL> SET SESSION use_secondary_engine=OFF;
    MySQL>SQL> SELECT SUM(l_extendedprice * l_discount) AS revenue FROM lineitem WHERE l_shipdate >= date '1994-01-01';
    +------------------+
    | revenue          |
    +------------------+
    | 82752894454.9036 |
    +------------------+
    1 row in set (24.20 sec)

For other tpch sample database queries that you can run, see Additional tpch Queries. For more information about running queries, refer to the HeatWave User Guide.

Additional tpch Queries

This topic provides additional tpch queries that you can run to test the HeatWave Cluster.

TPCH-Q1: Pricing Summary Report Query

As described in the TPC Benchmark™ H (TPC-H) specification: "The Pricing Summary Report Query provides a summary pricing report for all lineitems shipped as of a given date. The date is within 60 - 120 days of the greatest ship date contained in the database. The query lists totals for extended price, discounted extended price, discounted extended price plus tax, average quantity, average extended price, and average discount. These aggregates are grouped by RETURNFLAG and LINESTATUS, and listed in ascending order of RETURNFLAG and LINESTATUS. A count of the number of lineitems in each group is included."

SELECT
    l_returnflag,
    l_linestatus,
    SUM(l_quantity) AS sum_qty,
    SUM(l_extendedprice) AS sum_base_price,
    SUM(l_extendedprice * (1 - l_discount)) AS sum_disc_price,
    SUM(l_extendedprice * (1 - l_discount) * (1 + l_tax)) AS sum_charge,
    AVG(l_quantity) AS avg_qty,
    AVG(l_extendedprice) AS avg_price,
    AVG(l_discount) AS avg_disc,
    COUNT(*) AS count_order
FROM
    lineitem
WHERE
    l_shipdate <= DATE '1998-12-01' - INTERVAL '90' DAY
GROUP BY l_returnflag , l_linestatus
ORDER BY l_returnflag , l_linestatus;

TPCH-Q3: Shipping Priority Query

As described in the TPC Benchmark™ H (TPC-H) specification: "The Shipping Priority Query retrieves the shipping priority and potential revenue, defined as the sum of l_extendedprice * (1-l_discount), of the orders having the largest revenue among those that had not been shipped as of a given date. Orders are listed in decreasing order of revenue. If more than 10 unshipped orders exist, only the 10 orders with the largest revenue are listed."

SELECT
    l_orderkey,
    SUM(l_extendedprice * (1 - l_discount)) AS revenue,
    o_orderdate,
    o_shippriority
FROM
    customer,
    orders,
    lineitem
WHERE
    c_mktsegment = 'BUILDING'
        AND c_custkey = o_custkey
        AND l_orderkey = o_orderkey
        AND o_orderdate < DATE '1995-03-15'
        AND l_shipdate > DATE '1995-03-15'
GROUP BY l_orderkey , o_orderdate , o_shippriority
ORDER BY revenue DESC , o_orderdate
LIMIT 10;

TPCH-Q9: Product Type Profit Measure Query

As described in the TPC Benchmark™ H (TPC-H) specification: "The Product Type Profit Measure Query finds, for each nation and each year, the profit for all parts ordered in that year that contain a specified substring in their names and that were filled by a supplier in that nation. The profit is defined as the sum of [(l_extendedprice*(1-l_discount)) - (ps_supplycost * l_quantity)] for all lineitems describing parts in the specified line. The query lists the nations in ascending alphabetical order and, for each nation, the year and profit in descending order by year (most recent first). "

SELECT
    nation, o_year, SUM(amount) AS sum_profit
FROM
    (SELECT
        n_name AS nation,
            YEAR(o_ORDERdate) AS o_year,
            l_extendedprice * (1 - l_discount) - ps_supplycost * l_quantity AS amount
    FROM
        part
    STRAIGHT_JOIN partsupp
    STRAIGHT_JOIN lineitem
    STRAIGHT_JOIN supplier
    STRAIGHT_JOIN orders
    STRAIGHT_JOIN nation
    WHERE
        s_suppkey = l_suppkey
            AND ps_suppkey = l_suppkey
            AND ps_partkey = l_partkey
            AND p_partkey = l_partkey
            AND o_ORDERkey = l_ORDERkey
            AND s_nationkey = n_nationkey
            AND p_name LIKE '%green%') AS profit
GROUP BY nation , o_year
ORDER BY nation , o_year DESC;

HeatWave airportdb Quickstart

The HeatWave airportdb Quickstart walks you through adding a HeatWave cluster to a MySQL DB System, importing data into the DB System using the MySQL Shell Dump Load utility, loading data into HeatWave using Auto Parallel Load, and running queries.

The following topics are described:

 Note

The airportdb Quickstart involves uploading the airportdb data files to an Object Storage bucket and loading airportdb into a DB System from the Object Storage bucket. Alternatively, if you do not want to install from an Object Storage bucket, you can download airportdb to a Compute instance and load it into the DB System directly from the Compute instance. For instructions, see Installing airportdb from a Compute Instance.

Prerequisites for the airportdb Quickstart

This topic lists the prerequisites for the HeatWave airportdb Quickstart.

Before you begin, ensure that you have the following:
  • An operational MySQL DB System created using the MySQL.HeatWave.VM.Standard.E3 or MySQL.HeatWave.BM.Standard.E3 shape. If you do not have a MySQL DB System, refer to Getting Started with MySQL Database Service for the steps required to create one. Make note of the IP address of the DB System's Endpoint, and the administration user and password.
  • A running Compute instance (Oracle Linux is used in this Quickstart) attached to a public subnet on the same VCN as the MySQL DB System. Make note of the public IP address of the compute instance. For information about setting up a Compute instance, refer to Creating a Compute Instance.
  • MySQL Shell 8.0.22 or higher installed on the Compute instance. For installation instructions, see Connecting to the MySQL DB System with SSH and MySQL Shell.
  • Access to Object Storage and an existing bucket.
  • A valid OCI CLI configuration file. See SDK and CLI Configuration File. If you have installed and configured the Command Line Interface in the default location, you have a valid configuration file. If you have not installed and configured the CLI, you must either install it, or create a configuration file manually. MySQL Shell references the CLI configuration file when accessing the Object storage bucket.
  • In addition to the mandatory policies for the MySQL DB System, you, or your group, have been granted the mysql-analytics policies described in Policy Details for MySQL Database Service.

Uploading airportdb Sample Data to Object Storage

This topic describes how to upload the airportdb data files to an Object Storage bucket.

To upload the data files to the Object Storage bucket:
  1. Download the airportdb sample database to a temporary location such as /tmp/ or C:\temp\ and unpack it. The airportdb sample database is available for download from Other MySQL Documentation as a compressed tar or Zip archive, which is approximately 640 MBs in size. The database consists of 14 tables and is approximately 2GB in size when installed. For an overview of the airportdb schema, refer to Setting Up the airportdb Database.
  2. In the Oracle Cloud Infrastructure Console navigation menu, go to Object Storage, and then select Object Storage.
  3. From the list of buckets, click Create Bucket or select your object storage bucket.
  4. On the Bucket Details page, under Objects, click Upload.
  5. In the Upload Objects panel, drag and drop the sample database files to the dropzone, or click select files to locate them on your machine.
  6. Click Upload, and then click Close.

Loading airportdb on the DB System

This topic describes how to load the airportdb sample database on the MySQL DB System using the MySQL Shell dump loading utility. Data must be available on the MySQL DB System before it can be loaded into the HeatWave Cluster.

The MySQL Shell dump loading utility util.loadDump() supports loading schemas on a DB System. Using parallel threads, the utility is capable of completing a large data loads many times faster than a standard single-threaded upload. For additional information, see Dump Loading Utility.

 Note

If you prefer to use your own data instead of the airportdb sample database, refer to Importing and Exporting for information about loading data into the MySQL DB System. After loading your own data into the MySQL DB System, refer to the HeatWave User Guide for information about preparing and loading data into the HeatWave Cluster.
To load airportdb sample database on the MySQL DB System:
  1. SSH into the Compute instance using the public IP address of the Compute instance.
    ssh opc@computeInstancePublicIP
  2. Start MySQL Shell and connect to the MySQL DB System's endpoint:
    shell> mysqlsh Username@IPAddressOfMySQLDBSystemEndpoint
  3. Load the airportdb database into the MySQL DB System using the MySQL Shell dump loading utility. The bucket name and namespace are required.
    JS> util.loadDump("airportdb", {threads: 16, loadIndexes: "false", osBucketName: "bucket-name", 
        ignoreVersion: true})
     Note

    For information about the util.loadDump() options used in the preceding command, see Dump Loading Utility. The number of parallel threads specified using the threads option depends on the number of CPU cores of the shape. The loadIndexes: "false" option prevents the creation of secondary indexes, which are not used by the HeatWave Cluster.

Adding a HeatWave Cluster

This topic describes how to add a HeatWave cluster to your MySQL DB System.

To add a HeatWave cluster:

  1. Open the navigation menu and select Databases. Under MySQL, click DB System.
  2. On the MySQL DB Systems page, select your DB System.
  3. On the MySQL DB Systems Details page, select Add HeatWave Cluster from the More Actions drop-down menu.
  4. On the Add HeatWave Cluster dialog, click Select Shape to select a shape for your HeatWave nodes. The shape defines the number of CPU cores, the amount of RAM, and so on. Currently, only the MySQL.HeatWave.VM.Standard.E3 shape is supported for HeatWave nodes.
  5. Define the number of HeatWave nodes to create. The default number of nodes is 2, which is a sufficient number of nodes for the HeatWave airportdb Quickstart, assuming roughly 2GB of data. Optionally, you can perform a node count estimate to determine an appropriate number of nodes to create. For details, see Generating a Node Count Estimate.
  6. Click Add HeatWave Cluster to create the HeatWave cluster.

Loading airportdb Data Into the HeatWave Cluster

This topic describes how to load airportdb data into HeatWave using Auto Parallel Load.

Loading data into HeatWave typically involves several manual steps such as excluding columns that are not required or supported, defining RAPID as the secondary engine, and executing table load operations. Auto Parallel Load facilitates this process by automating those steps and optimizing the number of parallel load threads for each table. For more information, see Auto Parallel Load.
To load airportdb data into HeatWave:
  1. SSH into the Compute instance using the public IP address of the Compute instance.
    ssh opc@computeInstancePublicIP
  2. Start MySQL Shell and connect to the MySQL DB System's endpoint:
    shell> mysqlsh Username@IPAddressOfMySQLDBSystemEndpoint
    MySQL Shell opens in JavaScript execution mode by default.
    MySQL>JS>
  3. Change the MySQL Shell execution mode to SQL and run Auto Parallel Load to load airportdb data into HeatWave:
    MySQL>JS> \sql
    MySQL>SQL> CALL sys.heatwave_load(JSON_ARRAY('airportdb'), NULL);
    

Running airportdb Queries

This topic describes how to query airportdb data in the HeatWave cluster.

After airportdb sample database tables are loaded into the HeatWave cluster, queries that qualify are automatically offloaded to the HeatWave cluster for accelerated processing. To run queries:
  1. SSH into the Compute instance using the public IP address of the Compute instance.
    ssh opc@computeInstancePublicIP
  2. Start MySQL Shell and connect to the MySQL DB System's endpoint:
    shell> mysqlsh Username@IPAddressOfMySQLDBSystemEndpoint
    MySQL Shell opens in JavaScript execution mode by default.
    MySQL>JS>
  3. Change the MySQL Shell execution mode to SQL:
    MySQL>JS> \sql
  4. Change to the airportdb database.
    MySQL>SQL> USE airportdb;
    Default schema set to `airportdb`.Fetching table and column names from `airportdb` for 
    auto-completion... Press ^C to stop.
  5. Before running a query, use EXPLAIN to verify that the query can be offloaded to the HeatWave cluster. For example:
    MySQL>SQL> EXPLAIN SELECT booking.price, count(*) FROM booking WHERE booking.price > 500 GROUP BY booking.price ORDER BY booking.price LIMIT 10;
    *************************** 1. row ***************************
               id: 1
      select_type: SIMPLE
            table: booking
       partitions: NULL
             type: ALL
    possible_keys: NULL
              key: NULL
          key_len: NULL
              ref: NULL
             rows: 54081693
         filtered: 33.32999801635742
            Extra: Using where; Using temporary; Using filesort; Using secondary engine RAPID
    If the query can be offloaded, the Extra column in the EXPLAIN output reports "Using secondary engine RAPID".
  6. After verifying that the query can be offloaded, run the query and note the execution time.
    MySQL>SQL> SELECT booking.price, count(*) FROM booking WHERE booking.price > 500 GROUP BY booking.price ORDER BY booking.price LIMIT 10;
    +--------+----------+
    | price  | count(*) |
    +--------+----------+
    | 500.01 |      860 |
    | 500.02 |     1207 |
    | 500.03 |     1135 |
    | 500.04 |     1010 |
    | 500.05 |     1016 |
    | 500.06 |     1039 |
    | 500.07 |     1002 |
    | 500.08 |     1095 |
    | 500.09 |     1117 |
    | 500.10 |     1106 |
    +--------+----------+
    10 rows in set (0.0537 sec)
  7. To compare the HeatWave execution time with MySQL DB System execution time, disable the use_secondary_engine variable to see how long it takes to run the same query on the MySQL DB System. For example:
    MySQL>SQL> SET SESSION use_secondary_engine=OFF;
    MySQL>SQL> SELECT booking.price, count(*) FROM booking WHERE booking.price > 500 GROUP BY booking.price ORDER BY booking.price LIMIT 10;
    +--------+----------+
    | price  | count(*) |
    +--------+----------+
    | 500.01 |      860 |
    | 500.02 |     1207 |
    | 500.03 |     1135 |
    | 500.04 |     1010 |
    | 500.05 |     1016 |
    | 500.06 |     1039 |
    | 500.07 |     1002 |
    | 500.08 |     1095 |
    | 500.09 |     1117 |
    | 500.10 |     1106 |
    +--------+----------+
    10 rows in set (9.3859 sec)

For other airportdb sample database queries that you can run, see Additional airportdb Queries. For more information about running queries, refer to the HeatWave User Guide.

Additional airportdb Queries

This topic provides additional airportdb queries that you can run to test the HeatWave Cluster.

Query 1: Number of Tickets > $500.00, Grouped By Price

SELECT
    booking.price,
    count(*)
FROM
    booking
WHERE
    booking.price > 500
GROUP BY
    booking.price
ORDER BY
    booking.price
LIMIT
    10;

Query 2: Average Age of Passengers By Country, Per Airline

SELECT
    airline.airlinename,
    AVG(datediff(departure,birthdate)/365.25) as avg_age,
    count(*) as nb_people
FROM
    booking, flight, airline, passengerdetails
WHERE
    booking.flight_id=flight.flight_id AND
    airline.airline_id=flight.airline_id AND
    booking.passenger_id=passengerdetails.passenger_id AND
    country IN ("SWITZERLAND", "FRANCE", "ITALY")
GROUP BY
    airline.airlinename
ORDER BY
    airline.airlinename, avg_age
LIMIT 10;

Query 3: Most Tickets Sales by Airline for Departures from US Airports


SELECT
    airline.airlinename,
    SUM(booking.price) as price_tickets,
    count(*) as nb_tickets
FROM
    booking, flight, airline, airport_geo
WHERE
    booking.flight_id=flight.flight_id AND
    airline.airline_id=flight.airline_id AND
    flight.from=airport_geo.airport_id AND
    airport_geo.country = "UNITED STATES"
GROUP BY
    airline.airlinename
ORDER BY
    nb_tickets desc, airline.airlinename
LIMIT 10;

No comments:

Post a Comment