10 Spark Streaming Read from Kafka | Real time streaming from Kafka
- Added on 27 Jul 2024
- Video covers: How to read streaming data from Kafka? How to read real-time data from Kafka? How to use Kafka as a source for real-time Spark Streaming?
Chapters:
00:00 - Introduction
00:34 - Example Device JSON Payload
01:09 - Import Kafka JAR Libraries
03:08 - Read from Kafka Source
06:27 - Extract JSON data from column using from_json
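The chapters above walk through reading a Kafka topic as a streaming source and parsing the JSON value column with `from_json`. A minimal sketch of that flow follows; the broker address, topic name, and device schema are assumptions for illustration, not taken from the video's repo:

```python
import json

# Assumed device payload shape, mirroring the "Example Device JSON Payload" chapter.
sample_value = '{"deviceId": "D001", "temperature": 21.5, "eventTime": "2024-07-27 10:00:00"}'

# DDL-style schema string that from_json would use to parse the Kafka value column.
device_schema = "deviceId STRING, temperature DOUBLE, eventTime STRING"

# The PySpark side (requires a SparkSession with the Kafka package attached):
#
# from pyspark.sql.functions import col, from_json
#
# kafka_df = (
#     spark.readStream
#     .format("kafka")
#     .option("kafka.bootstrap.servers", "localhost:9092")  # assumed broker address
#     .option("subscribe", "device-data")                   # assumed topic name
#     .option("startingOffsets", "earliest")
#     .load()
# )
#
# # Kafka delivers the value as binary; cast to string, then parse with from_json.
# parsed_df = (
#     kafka_df
#     .withColumn("value", col("value").cast("string"))
#     .withColumn("data", from_json(col("value"), device_schema))
#     .select("data.*")
# )

# Outside Spark, json.loads shows what from_json extracts from one such record:
record = json.loads(sample_value)
print(record["deviceId"], record["temperature"])
```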
URLs:
Github Code - github.com/subhamkharwal/spar...
Device data samples - github.com/subhamkharwal/spar...
To setup Kafka with Spark in Local environment - • 03 Spark Streaming Loc...
JSON data Flattening and reading from files - • 07 Spark Streaming Rea...
Keywords: Apache Spark, PySpark, Spark Streaming, Real-time Data Processing, Data Streaming, Big Data Analytics, PySpark Tutorial, Apache Spark Tutorial, Streaming Analytics, Spark Structured Streaming, PySpark Streaming, Big Data Processing.
New video every 3 days ❤️
Make sure to like and subscribe.
Superb playlist
Glad you like it
Thanks for sharing this fantastic list.
I just noticed something: you connect to the Kafka broker on port 9092, but the data can also be retrieved from port 29092. What did I miss? :)
Many thanks
Thanks. In this Docker setup, port 9092 is open for external applications, while containers on the internal Docker network can use 29092. Both are correct.
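The two ports typically come from the broker's advertised-listener configuration in docker-compose. A hypothetical fragment showing the pattern (service name, image tag, and listener names are assumptions, not from the video's setup):

```yaml
# Illustrative docker-compose fragment: one listener per audience.
services:
  kafka:
    image: confluentinc/cp-kafka:7.3.0
    ports:
      - "9092:9092"   # exposed to the host for external applications
    environment:
      # Containers on the Docker network reach the broker at kafka:29092;
      # applications on the host use localhost:9092.
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka:29092,PLAINTEXT_HOST://localhost:9092
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT
      KAFKA_INTER_BROKER_LISTENER_NAME: PLAINTEXT
```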
excellent tutorial
Here are the questions I faced in a Hewlett Packard interview on Spark Streaming applications; perhaps you can create a video on these too.
1. Suppose you read a message from Kafka and your application fails to process it. How do you ensure that the same message is eventually processed successfully? What he was getting at: out of thousands of messages read from Kafka, how do we reprocess the ones that failed, given that the application keeps reading the new messages coming in and the failed message was already read once?
2. How do you ensure parallelism between a Kafka producer and the Spark Streaming read API? There will be hundreds of messages incoming at any given point, and naturally your Spark application cannot process them one at a time; instead, Spark can process them in parallel by reading from multiple partitions. How do you configure your application to do that?
I think these questions give a much better idea of how a production-grade Spark Streaming application works. I'd appreciate it if you could create some content on them. Thanks.
Thank you for posting the questions. The answer to question 2 can be found in today's video.
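For readers who land on this thread, both interview questions map to standard Structured Streaming knobs. A hedged sketch; the path, topic, broker address, and option values below are illustrative assumptions:

```python
# Question 1: Structured Streaming tracks consumed Kafka offsets in a checkpoint
# directory. Offsets are only committed after a micro-batch completes, so if the
# application fails mid-batch and is restarted with the same checkpoint location,
# the failed batch is replayed from the last committed offsets.
checkpoint_option = {"checkpointLocation": "/tmp/checkpoints/device-stream"}  # illustrative path

# Question 2: Spark reads Kafka with one task per topic partition by default,
# so parallelism follows the topic's partition count. These reader options can
# tune it further (values are illustrative):
read_options = {
    "kafka.bootstrap.servers": "localhost:9092",  # assumed broker address
    "subscribe": "device-data",                   # assumed topic name
    "minPartitions": "6",             # split Kafka partitions into more Spark tasks
    "maxOffsetsPerTrigger": "10000",  # cap records per micro-batch for steady throughput
}

# The surrounding PySpark calls (require a running SparkSession and Kafka) would be:
#
# df = spark.readStream.format("kafka").options(**read_options).load()
# query = (
#     df.writeStream
#     .format("console")
#     .options(**checkpoint_option)
#     .start()
# )
```

Note that replay from the checkpoint gives at-least-once delivery; end-to-end exactly-once additionally needs an idempotent or transactional sink.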
Thank you @easewithdata
If you can cast it to a string, why can't you cast it to JSON, or map it out and use a JSON parser?
Yes, you can definitely cast it however you like.
I can't see the output on the Docker console, but I started the consumer console and can see the data there. Thanks for the video. Can you also make a video on how to read streaming job details in the Spark UI?
Please make sure to share with your network over LinkedIn
I'm getting the error below while reading from Kafka:
Failed to find data source: kafka. Please deploy the application as per the deployment section of "Structured Streaming + Kafka Integration Guide".
How do I fix this?
Did you import the JAR file for Kafka in the SparkSession?
@easewithdata Yes, I did; here is the sample:
from pyspark.sql import SparkSession
spark = (
SparkSession
.builder
.appName("Streaming from Kafka")
.config("spark.streaming.stopGracefullyOnShutdown", True)
.config('spark.jars.packages', 'org.apache.spark:spark-sql-kafka-0-10_2.12:3.3.0')
.config("spark.sql.shuffle.partitions", 3)
.master("local[*]")
.getOrCreate()
)
No comments are deleted. Once your Spark session is created, check the Environment section of the Spark UI to see whether the Kafka JAR was downloaded and attached to the cluster.
I'm getting the error below while reading from Kafka:
Failed to find data source: kafka. Please deploy the application as per the deployment section of "Structured Streaming + Kafka Integration Guide".
Hello Prathmesh,
Did you import the JAR to support Kafka?
By import, do you mean it needs to be imported separately?
You need to import the Kafka JAR file in the SparkSession.
It sometimes occurs because of a mismatch between the downloaded JAR file and the Spark version. You should find the version of the JAR that is compatible with your Spark version.
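One way to avoid that mismatch is to derive the package coordinate from your Spark and Scala versions. The helper below is hypothetical, but the coordinate format is the standard Maven coordinate for the Spark-Kafka connector:

```python
def kafka_package(spark_version: str, scala_version: str = "2.12") -> str:
    """Build the Maven coordinate for the Spark-Kafka connector.

    The artifact's Scala suffix must match the Scala build of your Spark
    distribution, and the version must match Spark itself.
    """
    return f"org.apache.spark:spark-sql-kafka-0-10_{scala_version}:{spark_version}"

# For the Spark 3.3.0 / Scala 2.12 build used in the SparkSession sample above:
coordinate = kafka_package("3.3.0")
print(coordinate)  # org.apache.spark:spark-sql-kafka-0-10_2.12:3.3.0
```

The resulting string is what goes into the `spark.jars.packages` config option when building the SparkSession.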