Hands-On Guide to Apache Spark 3: Build Scalable Computing Engines for Batch and Stream Data Processing

By (author): Alfonso Antolínez García

This book explains how to scale Apache Spark 3 to handle massive amounts of data through batch or stream processing. It covers how to use Spark's structured APIs to perform complex data transformations and analyses that you can use to implement end-to-end analytics workflows, along with Spark 3's new features, theoretical foundations, and application architecture.

The first section introduces the Apache Spark ecosystem as a unified engine for large-scale data analytics and shows you how to run and fine-tune your first Spark application. The second section centers on batch processing, which suits end-of-cycle processing, and on data ingestion from files and databases. It explains the Spark DataFrame API and how to work with structured and unstructured data in Apache Spark. The last section deals with scalable, high-throughput, fault-tolerant streaming workloads for processing real-time data. Here you'll learn about Spark Streaming's execution model and architecture, as well as monitoring, reporting, and recovering Spark Streaming applications. A full chapter is devoted to future directions for Spark Streaming.

With real-world use cases, code snippets, and notebooks hosted on GitHub, this book will give you an understanding of large-scale data analysis concepts and help you put them to use.
Upon completing this book, you will have the knowledge and skills to seamlessly implement large-scale batch and streaming workloads to analyze real-time data streams with Apache Spark.
What You Will Learn

Master the concepts of Spark clusters and batch data processing
Understand data ingestion, transformation, and data storage
Gain insight into essential stream processing concepts and different streaming architectures
Implement streaming jobs and applications with Spark Streaming

Who This Book Is For

Data engineers, data analysts, machine learning engineers, and Python and R programmers

List price (PRP, the manufacturer's recommended price): 448.72 Lei

Sale price: 403.85 Lei (10% off, free shipping)

Unavailable

Product details

Edition: Paperback

Pages: 403

Code: BRT9781484293799

From the same shelf

All titles below are listed at 10% off the list price (PRP), with free shipping.

Beginning Apache Spark 3: With DataFrame, Spark SQL, Structured Streaming, and Spark Machine Learning Library - Hien Luu
PRP: 448.72 Lei; sale price: 403.85 Lei

Learning Spark: Lightning-Fast Data Analytics - Jules S. Damji
PRP: 435.13 Lei; sale price: 391.62 Lei

Hadoop: The Definitive Guide: Storage and Analysis at Internet Scale - Tom White
PRP: 353.53 Lei; sale price: 318.18 Lei

Stream Processing with Apache Flink: Fundamentals, Implementation, and Operation of Streaming Applications - Fabian Hueske
PRP: 435.13 Lei; sale price: 391.62 Lei

Beginning Apache Spark 2
PRP: 269.20 Lei; sale price: 242.28 Lei

Data Science on AWS: Implementing End-To-End, Continuous AI and Machine Learning Pipelines - Chris Fregly
PRP: 435.13 Lei; sale price: 391.62 Lei

Azure Databricks Cookbook: Accelerate and scale real-time analytics solutions using the Apache Spark-based analytics service - Phani Raj
PRP: 454.58 Lei; sale price: 409.12 Lei

The Azure Data Lakehouse Toolkit: Building and Scaling Data Lakehouses on Azure with Delta Lake, Apache Spark, Databricks, Synapse Analytics, and Snow - Ron L'Esteve
PRP: 407.92 Lei; sale price: 367.13 Lei

Learning PySpark: Build data-intensive applications locally and deploy at scale using the combined powers of Python and Spark 2.0 - Tomasz Drabas
PRP: 404.98 Lei; sale price: 364.48 Lei

Spark - The Definitive Guide
PRP: 614.75 Lei; sale price: 553.27 Lei

Grokking Streaming Systems: Real-Time Event Processing - Josh Fischer
PRP: 495.92 Lei; sale price: 446.33 Lei

Learn PySpark: Build Python-Based Machine Learning and Deep Learning Models - Pramod Singh
PRP: 407.92 Lei; sale price: 367.13 Lei

The Definitive Guide to Azure Data Engineering: Modern ELT, DevOps, and Analytics on the Azure Cloud Platform - Ron C. L'Esteve
PRP: 367.12 Lei; sale price: 330.41 Lei

Data Engineering with Apache Spark, Delta Lake, and Lakehouse: Create scalable pipelines that ingest, curate, and aggregate complex data in a timely a - Manoj Kukreja
PRP: 404.98 Lei; sale price: 364.48 Lei

Event Streams in Action - Alexander Dean
PRP: 305.92 Lei; sale price: 275.33 Lei

Streaming Systems
PRP: 435.13 Lei; sale price: 391.62 Lei

Data Engineering with Google Cloud Platform: A practical guide to operationalizing scalable data analytics systems on GCP - Adi Wijaya
PRP: 454.58 Lei; sale price: 409.12 Lei

High Performance Spark: Best Practices for Scaling and Optimizing Apache Spark - Holden Karau
PRP: 309.94 Lei; sale price: 278.95 Lei

Advanced Analytics with PySpark: Patterns for Learning from Data at Scale Using Python and Spark - Akash Tandon
PRP: 371.93 Lei; sale price: 334.74 Lei

Data Pipelines Pocket Reference: Moving and Processing Data for Analytics - James Densmore
PRP: 163.06 Lei; sale price: 146.75 Lei
