google-cloud-dataproc overview (4.57.0)
Key Reference Links
Dataproc Description: A faster, easier, more cost-effective way to run Apache Spark and Apache Hadoop.
| Dataproc Product Reference | GitHub Repository (includes samples) | Maven artifact | 
Getting Started
To use this library, first complete the following steps:
- Install a JDK (Java Development Kit)
- Select or create a Cloud Platform project
- Enable billing for your project
- Enable the API
- Set up authentication
Use Dataproc for Java
To ensure that your project uses compatible versions of the libraries
and their component artifacts, import com.google.cloud:libraries-bom and use
the BOM to specify dependency versions.  Be sure to remove any versions that you
set previously. For more information about
BOMs, see Google Cloud Platform Libraries BOM.
Maven
Import the BOM in the dependencyManagement section of your pom.xml file.
Include specific artifacts you depend on in the dependencies section, but don't
specify the artifacts' versions in the dependencies section.
The example below demonstrates how you would import the BOM and include the google-cloud-dataproc artifact.
<dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>com.google.cloud</groupId>
      <artifactId>libraries-bom</artifactId>
      <version>26.59.0</version>
      <type>pom</type>
      <scope>import</scope>
    </dependency>
  </dependencies>
</dependencyManagement>

<dependencies>
  <dependency>
    <groupId>com.google.cloud</groupId>
    <artifactId>google-cloud-dataproc</artifactId>
  </dependency>
</dependencies>
Gradle
BOMs are supported by default in Gradle 5.x or later. Add a platform
dependency on com.google.cloud:libraries-bom and remove the version from the
dependency declarations in the artifact's build.gradle file.
The example below demonstrates how you would import the BOM and include the google-cloud-dataproc artifact.
implementation(platform("com.google.cloud:libraries-bom:26.59.0"))
implementation("com.google.cloud:google-cloud-dataproc")
The platform and enforcedPlatform keywords supply dependency versions
declared in a BOM. The enforcedPlatform keyword enforces the dependency
versions declared in the BOM and thus overrides what you specified.
For more details on the platform and enforcedPlatform keywords in Gradle 5.x or later, see
Gradle: Importing Maven BOMs.
If you're using Gradle 4.6 or a later 4.x release, add
enableFeaturePreview('IMPROVED_POM_SUPPORT') to your settings.gradle file. For details, see
Gradle 4.6 Release Notes: BOM import.
Versions of Gradle earlier than 4.6 don't support BOMs.
SBT
SBT doesn't support BOMs. You can find the recommended library versions for a particular BOM version on the dashboard and set the versions manually. To use version 4.57.0 of this library, add this to your dependencies:
libraryDependencies += "com.google.cloud" % "google-cloud-dataproc" % "4.57.0"
Which version ID should I get started with?
For this library, we recommend using com.google.cloud.dataproc.v1 for new applications.
Understanding Version ID and Library Versions
When using a Cloud client library, it's important to distinguish between two types of versions:
- Library Version: The version of the software package (the client library) that helps you interact with the Cloud service. These libraries are released and updated frequently with bug fixes, improvements, and support for new service features and versions. The version selector at the top of this page represents the client library version.
- Version ID: The version of the Cloud service itself (e.g., Dataproc). New Version IDs are introduced infrequently, and often involve changes to the core functionality and structure of the Cloud service itself. The packages in the left-hand navigation represent packages tied to a specific Version ID of the Cloud service.
Managing Library Versions
We recommend using the com.google.cloud:libraries-bom installation method detailed above to streamline dependency management
across multiple Cloud Java client libraries. This ensures compatibility and simplifies updates.
Choosing the Right Version ID
Each Cloud Java client library may contain packages tied to specific Version IDs (e.g., v1, v2alpha). For new production applications, use
the latest stable Version ID, which is the one with the highest version number and no suffix (such as "alpha" or "beta"). For more information, see the
Cloud API versioning strategy.
Important: Unstable Version ID releases (those with suffixes) are subject to breaking changes when upgrading. Use them only for testing or if you specifically need their experimental features.
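The selection rule above (highest version number, no suffix) can be sketched in plain Java. This is only an illustration, not part of the client library: the class VersionIdPicker and its methods are hypothetical names, and the sketch assumes stable Version IDs have the form "v" followed by digits, as in v1.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;
import java.util.regex.Pattern;

/** Hypothetical helper (not part of google-cloud-dataproc): picks the latest stable Version ID. */
final class VersionIdPicker {
  // Assumption: a stable ID is "v" plus digits with no suffix, e.g. "v1" or "v2".
  private static final Pattern STABLE = Pattern.compile("v(\\d+)");

  private VersionIdPicker() {}

  /** Returns true when the ID carries no suffix such as "alpha" or "beta". */
  static boolean isStable(String versionId) {
    return STABLE.matcher(versionId).matches();
  }

  /** Returns the stable ID with the highest number, or empty if no ID is stable. */
  static Optional<String> latestStable(List<String> versionIds) {
    return versionIds.stream()
        .filter(VersionIdPicker::isStable)
        .max(Comparator.comparingInt(VersionIdPicker::majorOf));
  }

  // Parses the number after the leading "v"; only called on IDs that passed isStable.
  private static int majorOf(String stableId) {
    return Integer.parseInt(stableId.substring(1));
  }
}
```

For example, calling VersionIdPicker.latestStable(List.of("v1", "v2alpha")) returns an Optional containing "v1", because v2alpha carries a suffix and is filtered out before the comparison.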