Announced May 2025: Dataproc Serverless is now Google Cloud Serverless for Apache Spark

Google Cloud Serverless for Apache Spark

Focus on your code, not your infrastructure

Run your Apache Spark jobs more easily on a customizable zero-ops platform, smarter with Gemini assistance, and faster with the performance of Lightning Engine.

Get $300 in credits
Contact sales

Apache Spark is a trademark of The Apache Software Foundation.


Features

Industry-leading performance

Supercharge your jobs with Lightning Engine, our next-generation vectorized engine. Get over 4.3x faster performance and lower TCO for your serverless Spark workloads, automatically.

Zero-Ops with intelligent autoscaling

Eliminate cluster management with intelligent autoscaling. Resources scale up and down automatically to match your job's needs, delivering performance and cost-efficiency without paying for idle time.

AI-powered development

Accelerate your entire workflow. Write and debug PySpark, Scala, and Java code with Gemini Code Assist in BigQuery Studio and launch GPU-accelerated environments with pre-configured ML Runtimes.

Unified Spark and SQL experience

Eliminate context switching. Develop and run your workloads in a single environment such as BigQuery Studio, blending SQL with the flexibility of PySpark in the same notebook.


Two tiers of performance

Tiers to match your specific needs, from standard batch processing to the most demanding, performance-critical jobs.

Standard

Ideal for cost-effective batch processing, data transformations, and general-purpose Spark jobs.

  • General-purpose Spark ETL
  • Scheduled data pipelines
  • Cost-sensitive batch jobs

Premium

For the most demanding workloads, offering maximum performance with Lightning Engine, AI/ML acceleration, and interactive capabilities.

  • Performance-critical jobs powered by Lightning Engine for a 4.3x boost
  • Interactive data science and analysis
  • GPU-accelerated AI and ML
  • Complex, large-scale data processing

How It Works

Develop your Apache Spark application in your favorite tools, including BigQuery Studio notebooks. Submit your serverless Spark job with a single command, and let Google handle the rest—no clusters to create, configure, or manage.
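
As a rough illustration of the single-command submission described above, the sketch below assembles a `gcloud dataproc batches submit pyspark` invocation from Python. The helper names `build_submit_command` and `submit` are hypothetical, as are the bucket, file, region, and batch ID used in the example. By default the script only prints the command, so it can be reviewed before anything is sent to Google Cloud.

```python
import shlex
import subprocess


def build_submit_command(main_file, region, batch_id, properties=None):
    """Assemble a gcloud command that submits a PySpark batch workload.

    All values are caller-supplied; nothing is validated against the service.
    """
    cmd = [
        "gcloud", "dataproc", "batches", "submit", "pyspark", main_file,
        f"--region={region}",
        f"--batch={batch_id}",
    ]
    if properties:
        # gcloud takes Spark properties as comma-separated key=value pairs.
        joined = ",".join(f"{k}={v}" for k, v in sorted(properties.items()))
        cmd.append(f"--properties={joined}")
    return cmd


def submit(cmd, dry_run=True):
    """Print the command; execute it only when dry_run is False."""
    print(shlex.join(cmd))
    if not dry_run:
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    command = build_submit_command(
        main_file="gs://example-bucket/jobs/etl.py",
        region="us-central1",
        batch_id="nightly-etl-001",
        properties={"spark.dynamicAllocation.maxExecutors": "20"},
    )
    submit(command)  # dry run: prints the command without calling gcloud
```

Passing `dry_run=False` would hand the same argument list to the installed `gcloud` CLI. Keeping the command as a list rather than a single string avoids shell-quoting problems with paths and property values.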


View documentation

Common Uses

Interactive Data Science

Empower data scientists to explore data and rapidly iterate on Spark ML models. Unify SQL and Spark in a single BigQuery Studio notebook, moving seamlessly from data exploration with SQL to model building with PySpark without ever managing infrastructure.

Learn how to run PySpark code in BigQuery Studio notebooks

Automated ETL Pipelines

Build robust, event-driven Spark ETL pipelines that automatically scale on demand. You pay only for what you use, which suits spiky or unpredictable workloads.

Learn how to apply data lineage

AI/ML at scale

Accelerate large-scale model training and batch inference with serverless Spark. Attach NVIDIA GPUs and pre-configured libraries with a single command.

View GPU documentation

Pricing

Transparent, value-driven pricing

Serverless for Apache Spark pricing is based on per-second usage of compute (DCUs), GPUs, and shuffle storage.

Services and usage        Subscription type   Price (USD)
Data Compute Unit (DCU)   Standard            Starting at $0.06 per hour
                          Premium             Starting at $0.089 per hour
Shuffle storage           Standard            Starting at $0.04 per GB/month
                          Premium             Starting at $0.1 per GB/month
Accelerator pricing       A100 40 GB          Starting at $3.52069 per hour
                          A100 80 GB          Starting at $4.713696 per hour
                          L4                  Starting at $0.672048 per hour

View pricing details for Google Cloud Serverless for Apache Spark.
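
To make the per-second model concrete, here is a minimal estimator built on the "Starting at" prices listed above. It rests on several assumptions that are not stated on this page: the DCU and accelerator prices are treated as per unit per hour, shuffle storage is prorated over a 730-hour month, compute and shuffle storage use the same tier, and regional price differences and any minimum charges are ignored. The function name and the dictionary keys are hypothetical.

```python
# "Starting at" prices from the pricing table (USD); actual rates vary by region.
DCU_PRICE_PER_HOUR = {"standard": 0.06, "premium": 0.089}
SHUFFLE_PRICE_PER_GB_MONTH = {"standard": 0.04, "premium": 0.1}
GPU_PRICE_PER_HOUR = {"a100-40": 3.52069, "a100-80": 4.713696, "l4": 0.672048}

# Assumption: a 730-hour month is used to prorate the GB/month storage price.
HOURS_PER_MONTH = 730


def estimate_job_cost(tier, dcus, runtime_seconds, shuffle_gb=0.0,
                      gpu_type=None, gpu_count=0):
    """Return a rough cost breakdown (USD) for one job run.

    Usage is metered per second, so runtime is converted to fractional hours.
    """
    hours = runtime_seconds / 3600
    compute = dcus * hours * DCU_PRICE_PER_HOUR[tier]
    shuffle = shuffle_gb * (hours / HOURS_PER_MONTH) * SHUFFLE_PRICE_PER_GB_MONTH[tier]
    gpu = gpu_count * hours * GPU_PRICE_PER_HOUR[gpu_type] if gpu_type else 0.0
    return {
        "compute": compute,
        "shuffle": shuffle,
        "gpu": gpu,
        "total": compute + shuffle + gpu,
    }


if __name__ == "__main__":
    # Hypothetical job: 8 DCUs on the Standard tier for 30 minutes, 100 GB shuffle.
    breakdown = estimate_job_cost("standard", dcus=8, runtime_seconds=1800,
                                  shuffle_gb=100)
    for item, usd in breakdown.items():
        print(f"{item}: ${usd:.4f}")
```

Under these assumptions, 8 DCUs on the Standard tier for 30 minutes comes to 8 × 0.5 h × $0.06 = $0.24 of compute, with shuffle storage adding a fraction of a cent. Treat the result as an order-of-magnitude check and use the pricing calculator for real estimates.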

Pricing calculator

Calculate your monthly costs by region.
Estimate your costs

Custom quote

Connect with our sales team to get a custom quote for your organization.
Request a quote

Get started today

Tutorial for getting started

Get $300 in credits

Have a large project?

Contact sales

Product documentation

Read here

Use BigQuery connector with Serverless for Apache Spark

Read guide

Use GPUs with Serverless for Apache Spark

Read guide

Business Case

Build your business case for Google Cloud Serverless for Apache Spark

The economic benefits of Google Cloud Dataproc and Serverless Spark versus alternative solutions

See how Serverless for Apache Spark delivers significant TCO savings and business value compared to on-prem and other cloud solutions.

Download the report

In the report:

  • Discover how Dataproc and Serverless for Apache Spark can deliver 18% to 60% cost savings compared to other cloud-based Spark alternatives.
  • Explore how Google Cloud Serverless for Apache Spark can provide 21% to 55% better price-performance than other serverless Spark offerings.
  • Learn how Dataproc and Google Cloud Serverless for Apache Spark simplify Spark deployments and help reduce operational complexity.

FAQ

When should I choose Serverless for Apache Spark versus Dataproc?

Choose Serverless for Apache Spark when you want to focus on your code and eliminate all infrastructure management. It's ideal for new Spark pipelines, interactive analysis, and jobs with unpredictable demand where speed and simplicity are the priority.

See our decision guide.

Do I need to install my own libraries like PyTorch or XGBoost?

The Premium tier is designed for AI/ML and comes with pre-configured ML Runtimes that have common libraries like PyTorch, XGBoost, and scikit-learn built-in. This eliminates complex setup and allows you to get started with your data science workloads in minutes.

Learn about GPU workloads and runtimes.

How do I get the best performance and how does pricing work?

For maximum performance, you can select the Premium tier, which is powered by Lightning Engine. Pricing is based on a "pay-for-what-you-use" model, where you are billed per second only for the duration of your job's execution. This is highly cost-effective as it eliminates the cost of idle clusters.

View detailed pricing.

Learn about running Spark on Google Cloud
Learn about Dataproc
Start free
Contact us