Forecast multiple time series with a TimesFM univariate model

This tutorial teaches you how to use the AI.FORECAST function with BigQuery ML's built-in TimesFM univariate model to forecast the future value for a given column, based on the historical value of that column.

This tutorial uses data from the public bigquery-public-data.san_francisco_bikeshare.bikeshare_trips table.

Forecast a single bike share trips time series

Forecast future time series values by using the AI.FORECAST function.

The following query forecasts the number of subscriber bike share trips per hour for the next month (approximately 720 hours), based on the previous four months of historical data. The confidence_level argument indicates that the query generates a prediction interval with a 95% confidence level.

Follow these steps to forecast data with the TimesFM model:

  1. In the Cloud de Confiance console, go to the BigQuery page.

    Go to BigQuery

  2. In the query editor, paste in the following query and click Run:

    SELECT *
    FROM
      AI.FORECAST(
        (
          SELECT TIMESTAMP_TRUNC(start_date, HOUR) as trip_hour, COUNT(*) as num_trips
    FROM `bigquery-public-data.san_francisco_bikeshare.bikeshare_trips`
    WHERE subscriber_type = 'Subscriber' AND start_date >= TIMESTAMP('2018-01-01')
    GROUP BY TIMESTAMP_TRUNC(start_date, HOUR)
        ),
        horizon => 720,
        confidence_level => 0.95,
        timestamp_col => 'trip_hour',
        data_col => 'num_trips');

    The results look similar to the following:

    +-------------------------+-------------------+------------------+---------------------------------+---------------------------------+--------------------+
    | forecast_timestamp      | forecast_value    | confidence_level | prediction_interval_lower_bound | prediction_interval_upper_bound | ai_forecast_status |
    +-------------------------+-------------------+------------------+---------------------------------+---------------------------------+--------------------+
    | 2018-05-01 00:00:00 UTC | 26.3045959...     |            0.95  | 21.7088378...                   | 30.9003540...                   |                    |
    +-------------------------+-------------------+------------------+---------------------------------+---------------------------------+--------------------+
    | 2018-05-01 01:00:00 UTC | 34.0890502...     |            0.95  | 2.47682913...                   | 65.7012714...                   |                    |
    +-------------------------+-------------------+------------------+---------------------------------+---------------------------------+--------------------+
    | 2018-05-01 02:00:00 UTC | 24.2154693...     |            0.95  | 2.87621605...                   | 45.5547226...                   |                    |
    +-------------------------+-------------------+------------------+---------------------------------+---------------------------------+--------------------+
    | ...                     | ...               |  ...             | ...                             |  ...                            |                    |
    +-------------------------+-------------------+------------------+---------------------------------+---------------------------------+--------------------+
    

Compare the forecasted data to the input data

Chart the AI.FORECAST function output alongside a subset of the function input data to see how they compare.

Follow these steps to chart the function output:

  1. In the Cloud de Confiance console, go to the BigQuery page.

    Go to BigQuery

  2. In the query editor, paste in the following query and click Run:

    WITH historical AS (
    SELECT TIMESTAMP_TRUNC(start_date, HOUR) as trip_hour, COUNT(*) as num_trips
    FROM `bigquery-public-data.san_francisco_bikeshare.bikeshare_trips`
    WHERE subscriber_type = 'Subscriber' AND start_date >= TIMESTAMP('2018-01-01')
    GROUP BY TIMESTAMP_TRUNC(start_date, HOUR)
    ORDER BY TIMESTAMP_TRUNC(start_date, HOUR)
    )
    SELECT * 
    FROM 
    (
    (SELECT
        trip_hour as date,
        num_trips AS historical_value,
        NULL as forecast_value,
        'historical' as type,
        NULL as prediction_interval_lower_bound,
        NULL as prediction_interval_upper_bound
    FROM
        historical
    ORDER BY historical.trip_hour DESC
    LIMIT 400)
    UNION ALL
    (SELECT forecast_timestamp AS date,
            NULL as historical_value,
            forecast_value as forecast_value,
            'forecast' as type,
            prediction_interval_lower_bound,
            prediction_interval_upper_bound
    FROM
        AI.FORECAST(
        (
        SELECT * FROM historical
        ),
        horizon => 720,
        confidence_level => 0.99,
        timestamp_col => 'trip_hour',
        data_col => 'num_trips')))
    ORDER BY date asc;
  3. When the query is finished running, click the Visualization tab in the Query results pane. The resulting chart looks similar to the following:

    Graph 100 time points of input data along with the AI.FORECAST function output data to evaluate their similarity.

    You can see that the input data and the forecasted data show similar bike share usage. You can also see that the prediction interval lower and upper bounds increase as the forecasted time points get further into the future.

Forecast multiple bike share trips time series

The following query forecasts the number of bike share trips per subscriber type and per hour for the next month (approximately 720 hours), based on the previous four months of historical data. The confidence_level argument indicates that the query generates a prediction interval with a 95% confidence level.

Follow these steps to forecast data with the TimesFM model:

  1. In the Cloud de Confiance console, go to the BigQuery page.

    Go to BigQuery

  2. In the query editor, paste in the following query and click Run:

    SELECT *
    FROM
      AI.FORECAST(
        (
          SELECT TIMESTAMP_TRUNC(start_date, HOUR) as trip_hour, subscriber_type, COUNT(*) as num_trips
          FROM `bigquery-public-data.san_francisco_bikeshare.bikeshare_trips`
          WHERE start_date >= TIMESTAMP('2018-01-01')
          GROUP BY TIMESTAMP_TRUNC(start_date, HOUR), subscriber_type
        ),
        horizon => 720,
        confidence_level => 0.95,
        timestamp_col => 'trip_hour',
        data_col => 'num_trips',
        id_cols => ['subscriber_type']);

    The results look similar to the following:

    +---------------------+--------------------------+------------------+------------------+---------------------------------+---------------------------------+--------------------+
    | subscriber_type     | forecast_timestamp       | forecast_value   | confidence_level | prediction_interval_lower_bound | prediction_interval_upper_bound | ai_forecast_status |
    +---------------------+--------------------------+------------------+------------------+---------------------------------+---------------------------------+--------------------+
    | Subscriber          | 2018-05-01 00:00:00 UTC  | 26.3045959...    |            0.95  | 21.7088378...                   | 30.9003540...                   |                    |
    +---------------------+--------------------------+------------------+------------------+---------------------------------+---------------------------------+--------------------+
    | Subscriber          |  2018-05-01 01:00:00 UTC | 34.0890502...    |            0.95  | 2.47682913...                   | 65.7012714...                   |                    |
    +---------------------+-------------------+------------------+-------------------------+---------------------------------+---------------------------------+--------------------+
    | Subscriber          |  2018-05-01 02:00:00 UTC | 24.2154693...    |            0.95  | 2.87621605...                   | 45.5547226...                   |                    |
    +---------------------+--------------------------+------------------+------------------+---------------------------------+---------------------------------+--------------------+
    | ...                 | ...                      |  ...             | ...              | ...                             |  ...                            |                    |
    +---------------------+--------------------------+------------------+------------------+---------------------------------+---------------------------------+--------------------+