Reference documentation and code samples for the BigQuery API class Google::Cloud::Bigquery::External::CsvSource.
CsvSource
CsvSource is a subclass of DataSource and represents a CSV external data source, such as a file in Google Cloud Storage or Google Drive, that can be queried directly even though the data is not stored in BigQuery. Instead of loading or streaming the data, this object references the external data source.
Example
require "google/cloud/bigquery" bigquery = Google::Cloud::Bigquery.new csv_url = "gs://bucket/path/to/data.csv" csv_table = bigquery.external csv_url do |csv| csv.autodetect = true csv.skip_leading_rows = 1 end data = bigquery.query "SELECT * FROM my_ext_table", external: { my_ext_table: csv_table } # Iterate over the first page of results data.each do |row| puts row[:name] end # Retrieve the next page of results data = data.next if data.next?
Methods
#delimiter
def delimiter() -> String
The separator for fields in a CSV file.
- (String)
require "google/cloud/bigquery" bigquery = Google::Cloud::Bigquery.new csv_url = "gs://bucket/path/to/data.csv" csv_table = bigquery.external csv_url do |csv| csv.delimiter = "|" end csv_table.delimiter #=> "|"
#delimiter=
def delimiter=(new_delimiter)
Set the separator for fields in a CSV file.
- new_delimiter (String) — New delimiter value
require "google/cloud/bigquery" bigquery = Google::Cloud::Bigquery.new csv_url = "gs://bucket/path/to/data.csv" csv_table = bigquery.external csv_url do |csv| csv.delimiter = "|" end csv_table.delimiter #=> "|"
#encoding
def encoding() -> String
The character encoding of the data.
- (String)
require "google/cloud/bigquery" bigquery = Google::Cloud::Bigquery.new csv_url = "gs://bucket/path/to/data.csv" csv_table = bigquery.external csv_url do |csv| csv.encoding = "UTF-8" end csv_table.encoding #=> "UTF-8"
#encoding=
def encoding=(new_encoding)
Set the character encoding of the data.
- new_encoding (String) — New encoding value
require "google/cloud/bigquery" bigquery = Google::Cloud::Bigquery.new csv_url = "gs://bucket/path/to/data.csv" csv_table = bigquery.external csv_url do |csv| csv.encoding = "UTF-8" end csv_table.encoding #=> "UTF-8"
#fields
def fields() -> Array<Schema::Field>
The fields of the schema.
- (Array<Schema::Field>) — An array of field objects.
#headers
def headers() -> Array<Symbol>
The names of the columns in the schema.
- (Array<Symbol>) — An array of column names.
#iso8859_1?
def iso8859_1?() -> Boolean
Checks if the character encoding of the data is "ISO-8859-1".
- (Boolean)
require "google/cloud/bigquery" bigquery = Google::Cloud::Bigquery.new csv_url = "gs://bucket/path/to/data.csv" csv_table = bigquery.external csv_url do |csv| csv.encoding = "ISO-8859-1" end csv_table.encoding #=> "ISO-8859-1" csv_table.iso8859_1? #=> true
#jagged_rows
def jagged_rows() -> Boolean
Indicates if BigQuery should accept rows that are missing trailing optional columns.
- (Boolean)
require "google/cloud/bigquery" bigquery = Google::Cloud::Bigquery.new csv_url = "gs://bucket/path/to/data.csv" csv_table = bigquery.external csv_url do |csv| csv.jagged_rows = true end csv_table.jagged_rows #=> true
#jagged_rows=
def jagged_rows=(new_jagged_rows)
Set whether BigQuery should accept rows that are missing trailing optional columns.
- new_jagged_rows (Boolean) — New jagged_rows value
require "google/cloud/bigquery" bigquery = Google::Cloud::Bigquery.new csv_url = "gs://bucket/path/to/data.csv" csv_table = bigquery.external csv_url do |csv| csv.jagged_rows = true end csv_table.jagged_rows #=> true
#null_marker
def null_marker() -> String, nil
Specifies a string that represents a null value in a CSV file. For example, if you specify \N, BigQuery interprets \N as a null value when querying a CSV file. The default value is the empty string. If you set this property to a custom value, BigQuery throws an error if an empty string is present for all data types except for STRING and BYTE. For STRING and BYTE columns, BigQuery interprets the empty string as an empty value.
- (String, nil) — The null marker string, or nil if not set.
require "google/cloud/bigquery" bigquery = Google::Cloud::Bigquery.new csv_url = "gs://bucket/path/to/data.csv" csv_table = bigquery.external csv_url do |csv| csv.null_marker = "\N" end csv_table.null_marker #=> "\N"
#null_marker=
def null_marker=(null_marker)
Sets a string that represents a null value in a CSV file. For example, if you specify \N, BigQuery interprets \N as a null value when querying a CSV file. The default value is the empty string. If you set this property to a custom value, BigQuery throws an error if an empty string is present for all data types except for STRING and BYTE. For STRING and BYTE columns, BigQuery interprets the empty string as an empty value.
- null_marker (String, nil) — The null marker string, or nil to unset.
require "google/cloud/bigquery" bigquery = Google::Cloud::Bigquery.new csv_url = "gs://bucket/path/to/data.csv" csv_table = bigquery.external csv_url do |csv| csv.null_marker = "\N" end csv_table.null_marker #=> "\N"
#null_markers
def null_markers() -> Array<String>
The list of strings that are interpreted as SQL NULL values in a CSV file. null_marker and null_markers cannot be set at the same time: if one is set, the other must be unset, and setting both raises a user error. Any string listed in null_markers, including the empty string, is interpreted as SQL NULL. This applies to all column types.
- (Array<String>) — The array of null marker strings.
require "google/cloud/bigquery" bigquery = Google::Cloud::Bigquery.new csv_url = "gs://bucket/path/to/data.csv" csv_table = bigquery.external csv_url do |csv| csv.null_markers = ["\N", "NULL"] end csv_table.null_markers #=> ["\N", "NULL"]
#null_markers=
def null_markers=(null_markers)
Sets the list of strings that are interpreted as SQL NULL values in a CSV file. null_marker and null_markers cannot be set at the same time: if one is set, the other must be unset, and setting both raises a user error. Any string listed in null_markers, including the empty string, is interpreted as SQL NULL. This applies to all column types.
- null_markers (Array<String>) — The array of null marker strings.
require "google/cloud/bigquery" bigquery = Google::Cloud::Bigquery.new csv_url = "gs://bucket/path/to/data.csv" csv_table = bigquery.external csv_url do |csv| csv.null_markers = ["\N", "NULL"] end csv_table.null_markers #=> ["\N", "NULL"]
#param_types
def param_types() -> Hash
The types of the fields in the data in the schema, using the same format as the optional query parameter types.
- (Hash) — A hash with field names as keys, and types as values.
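A minimal sketch with an illustrative inline schema; the exact type symbols in the returned hash follow the query parameter type format:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new

csv_url = "gs://bucket/path/to/data.csv"
csv_table = bigquery.external csv_url do |csv|
  csv.schema do |schema|
    schema.string "name", mode: :required
    schema.integer "age"
  end
end

# Keys are field names as symbols; values are type symbols,
# e.g. { name: :STRING, age: :INTEGER }.
csv_table.param_types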
#quote
def quote() -> String
The value that is used to quote data sections in a CSV file.
- (String)
require "google/cloud/bigquery" bigquery = Google::Cloud::Bigquery.new csv_url = "gs://bucket/path/to/data.csv" csv_table = bigquery.external csv_url do |csv| csv.quote = "'" end csv_table.quote #=> "'"
#quote=
def quote=(new_quote)
Set the value that is used to quote data sections in a CSV file.
- new_quote (String) — New quote value
require "google/cloud/bigquery" bigquery = Google::Cloud::Bigquery.new csv_url = "gs://bucket/path/to/data.csv" csv_table = bigquery.external csv_url do |csv| csv.quote = "'" end csv_table.quote #=> "'"
#quoted_newlines
def quoted_newlines() -> Boolean
Indicates if BigQuery should allow quoted data sections that contain newline characters in a CSV file.
- (Boolean)
require "google/cloud/bigquery" bigquery = Google::Cloud::Bigquery.new csv_url = "gs://bucket/path/to/data.csv" csv_table = bigquery.external csv_url do |csv| csv.quoted_newlines = true end csv_table.quoted_newlines #=> true
#quoted_newlines=
def quoted_newlines=(new_quoted_newlines)
Set whether BigQuery should allow quoted data sections that contain newline characters in a CSV file.
- new_quoted_newlines (Boolean) — New quoted_newlines value
require "google/cloud/bigquery" bigquery = Google::Cloud::Bigquery.new csv_url = "gs://bucket/path/to/data.csv" csv_table = bigquery.external csv_url do |csv| csv.quoted_newlines = true end csv_table.quoted_newlines #=> true
#schema
def schema(replace: false) { |schema| ... } -> Google::Cloud::Bigquery::Schema
The schema for the data.
- replace (Boolean) (defaults to: false) — Whether to replace the existing schema with the new schema. If true, the fields will replace the existing schema. If false, the fields will be added to the existing schema. The default value is false.
- (schema) — a block for setting the schema
- schema (Schema) — the object accepting the schema
require "google/cloud/bigquery" bigquery = Google::Cloud::Bigquery.new csv_url = "gs://bucket/path/to/data.csv" csv_table = bigquery.external csv_url do |csv| csv.schema do |schema| schema.string "name", mode: :required schema.string "email", mode: :required schema.integer "age", mode: :required schema.boolean "active", mode: :required end end
#schema=
def schema=(new_schema)
Set the schema for the data.
- new_schema (Schema) — The schema object.
require "google/cloud/bigquery" bigquery = Google::Cloud::Bigquery.new csv_shema = bigquery.schema do |schema| schema.string "name", mode: :required schema.string "email", mode: :required schema.integer "age", mode: :required schema.boolean "active", mode: :required end csv_url = "gs://bucket/path/to/data.csv" csv_table = bigquery.external csv_url csv_table.schema = csv_shema
#skip_leading_rows
def skip_leading_rows() -> Integer
The number of rows at the top of a CSV file that BigQuery will skip when reading the data.
- (Integer)
require "google/cloud/bigquery" bigquery = Google::Cloud::Bigquery.new csv_url = "gs://bucket/path/to/data.csv" csv_table = bigquery.external csv_url do |csv| csv.skip_leading_rows = 1 end csv_table.skip_leading_rows #=> 1
#skip_leading_rows=
def skip_leading_rows=(row_count)
Set the number of rows at the top of a CSV file that BigQuery will skip when reading the data.
- row_count (Integer) — New skip_leading_rows value
require "google/cloud/bigquery" bigquery = Google::Cloud::Bigquery.new csv_url = "gs://bucket/path/to/data.csv" csv_table = bigquery.external csv_url do |csv| csv.skip_leading_rows = 1 end csv_table.skip_leading_rows #=> 1
#source_column_match
def source_column_match() -> String, nil
Controls the strategy used to match loaded columns to the schema. If not set, a sensible default is chosen based on how the schema is provided. If autodetect is used, then columns are matched by name. Otherwise, columns are matched by position. This is done to keep the behavior backward-compatible.
Acceptable values are:
- POSITION: matches by position. Assumes that the columns are ordered the same way as the schema.
- NAME: matches by name. Reads the header row as column names and reorders columns to match the field names in the schema.
- (String, nil) — The source column match value, or nil if not set.
require "google/cloud/bigquery" bigquery = Google::Cloud::Bigquery.new csv_url = "gs://bucket/path/to/data.csv" csv_table = bigquery.external csv_url do |csv| csv.source_column_match = "NAME" end csv_table.source_column_match #=> "NAME"
#source_column_match=
def source_column_match=(source_column_match)
Sets the strategy used to match loaded columns to the schema. If not set, a sensible default is chosen based on how the schema is provided. If autodetect is used, then columns are matched by name. Otherwise, columns are matched by position. This is done to keep the behavior backward-compatible. Optional.
Acceptable values are:
- POSITION: matches by position. Assumes that the columns are ordered the same way as the schema.
- NAME: matches by name. Reads the header row as column names and reorders columns to match the field names in the schema.
- source_column_match (String, nil) — The new source column match value, or nil to unset.
require "google/cloud/bigquery" bigquery = Google::Cloud::Bigquery.new csv_url = "gs://bucket/path/to/data.csv" csv_table = bigquery.external csv_url do |csv| csv.source_column_match = "NAME" end csv_table.source_column_match #=> "NAME"
#utf8?
def utf8?() -> Boolean
Checks if the character encoding of the data is "UTF-8". This is the default.
- (Boolean)
require "google/cloud/bigquery" bigquery = Google::Cloud::Bigquery.new csv_url = "gs://bucket/path/to/data.csv" csv_table = bigquery.external csv_url do |csv| csv.encoding = "UTF-8" end csv_table.encoding #=> "UTF-8" csv_table.utf8? #=> true