Kafka Tutorial - Fault Tolerance
- added Dec 1, 2016
- Spark Programming and Azure Databricks ILT Master Class by Prashant Kumar Pandey - Fill out the google form for Course inquiry.
forms.gle/Nxk8dQUPq4o4XsA47
-------------------------------------------------------------------
Data Engineering is one of the highest-paid jobs today.
It is likely to remain among the top IT skills for years to come.
Are you in database development, data warehousing, ETL tools, data analysis, SQL, or PL/SQL development?
I have a well-crafted success path for you.
I will help you get prepared for the data engineer and solution architect role depending on your profile and experience.
We created a course that takes you deep into core data engineering technology and helps you master it.
If you are a working professional:
1. aspiring to become a data engineer,
2. looking to change your career to data engineering,
3. wanting to grow your data engineering career,
4. preparing for the Databricks Spark Certification, or
5. preparing to crack Spark data engineering interviews,
ScholarNest is offering a one-stop integrated Learning Path.
The course is open for registration.
The course delivers an example-driven approach and project-based learning.
You will practice the skills using MCQs, coding exercises, and capstone projects.
The course comes with the following integrated services.
1. Technical support and Doubt Clarification
2. Live Project Discussion
3. Resume Building
4. Interview Preparation
5. Mock Interviews
Course Duration: 6 Months
Course Prerequisite: Programming and SQL Knowledge
Target Audience: Working Professionals
Batch start: Registration Started
Fill out the below form for more details and course inquiries.
forms.gle/Nxk8dQUPq4o4XsA47
--------------------------------------------------------------------------
Learn more at www.scholarnest.com/
Best place to learn Data Engineering, Big Data, Apache Spark, Databricks, Apache Kafka, Confluent Cloud, AWS Cloud Computing, Azure Cloud, Google Cloud - self-paced, instructor-led, certification courses, and practice tests.
========================================================
SPARK COURSES
-----------------------------
www.scholarnest.com/courses/s...
www.scholarnest.com/courses/s...
www.scholarnest.com/courses/s...
www.scholarnest.com/courses/s...
www.scholarnest.com/courses/d...
KAFKA COURSES
--------------------------------
www.scholarnest.com/courses/a...
www.scholarnest.com/courses/k...
www.scholarnest.com/courses/s...
AWS CLOUD
------------------------
www.scholarnest.com/courses/a...
www.scholarnest.com/courses/a...
PYTHON
------------------
www.scholarnest.com/courses/p...
========================================
We are also available on the Udemy Platform
Check out the below link for our Courses on Udemy
www.learningjournal.guru/cour...
=======================================
You can also find us on O'Reilly Learning
www.oreilly.com/library/view/...
www.oreilly.com/videos/apache...
www.oreilly.com/videos/kafka-...
www.oreilly.com/videos/spark-...
www.oreilly.com/videos/spark-...
www.oreilly.com/videos/apache...
www.oreilly.com/videos/real-t...
www.oreilly.com/videos/real-t...
=========================================
Follow us on Social Media
/ scholarnest
/ scholarnesttechnologies
/ scholarnest
/ scholarnest
github.com/ScholarNest
github.com/learningJournal/
========================================
Want to learn more Big Data technology courses? You can get lifetime access to our courses on the Udemy platform. Visit the link below for discounts and coupon codes.
www.learningjournal.guru/courses/
Your series of explanations on Kafka is by far the best I could find online as of today, thank you
Thank you very much.
Thank you for explaining replication factor! Loving your videos on Kafka! so well explained.
super well explained, good job !! thanks for your videos!!
One of the best Kafka training.. very clear and simple to understand all the series. A BIG thank you to you Sir for preparing this series !!!
absolutely marvellous explanation with proper hands-on. will never forget that replication factor is defined at topic level but leaders are created at partition level... thanks for these amazing videos
Crisp and clear. Even better than paid courses. Thank you @Learning Journal
Absolutely, the best session on Kafka so far. very nicely explained.
You have explained very well. U made it easy for all of us to understand Kafka. Thank you
You are a wonderful person and I really appreciate these sessions. Excellent detail and communication; you are giving exactly what is required to learn. I clicked the thumbs-up icon but wanted to give more, my friend. Appreciated.
Awesome tutorial :D Thank you so much for such wonderful explanation sir.
Fantastic explanation. You are making these tough topics so simple to understand.
Superbly clear, well presented and interesting.
Fantastic sir.................very clear.
excellent explanation - just as much as you need ( no more, no less) :)
Really Nicely explained, I don't think there could be more simplified explanation.
Hello sir, I went through all your Kafka videos. Very well explained. Thank you.
awesome good job, clearly explained
@learning journal , that's a very good video! Thanks!!
Simply explained and to the point✌️👍
Very nice tutorial sir, very clear explanation , keep posting ... sir
well explained . Crispy and clear
Good job in explaining !!
Nice explanation, I really appreciate it. Thanks for your videos!!
Great tutorial! Thanks
Excellent!
Fantastic Tutorial Sir
excellent explanation !!!!!
Could you please explain using GCP? I am not able to set up a multi-broker cluster there; I can't find the config.
Nice tutorial, very easy to understand the concept. Thanks for such an informative video. Please post videos on Spark also.
One of the best kafka demo. thank you.
Thanks Robin for sharing your feedback.
well explained ... thanxs for this...
Thanks sir. It's helpful.
very well explained.. can you add some video with kafka and spark streaming. with deployment strategy on cluster.
How can we see which partition is on which broker within the cluster for a topic? Is there any command for it, and where can I find all such commands? Also, is there any UI to manage Kafka brokers (to see all the topics, all the partitions, all the messages currently in the broker for a topic, etc.)?
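For reference, the partition-to-broker mapping can be inspected with the kafka-topics tool used in the video. A command sketch against a running cluster, assuming ZooKeeper on localhost:2181 and a topic named my-topic (both placeholders):

```shell
# List all topics known to the cluster
bin/kafka-topics.sh --list --zookeeper localhost:2181

# Show, for each partition of a topic: the leader broker,
# all replica brokers, and the in-sync replicas (ISR)
bin/kafka-topics.sh --describe --zookeeper localhost:2181 --topic my-topic
```

Each row of the --describe output corresponds to one partition, which answers the "which partition is on which broker" part of the question.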
Thank you
Fantastic.. your delivery style is flawless.. no errr... no emmm....
excellent explanation.... expecting more topics on Kafka.. thks
Thanks, Vijendran.... Stay subscribed and you will get more videos for sure.
Can we use consumer groups on top of two Kafka clusters that are replicated using MirrorMaker?
If yes, will Kafka guarantee exactly-once delivery?
Amazing! Please keep it up for everyone's sake. Thank you! Will you also cover other components such as MapReduce, Spark, Pig, Hive? :)
Already started Hadoop and Map Reduce. Please check my Hadoop playlist. Others will also follow.
These tutorials are really awesome. Thanks a lot for these.
Can you please resolve my question: how can we stop the broker instances one by one? When I try, it says "No instance(s) available."
Best course on kafka. Can you make a course on Cassandra?
If I have 1 topic with 3 partitions on 3 different machines, and it also has replication factor 3, then ultimately each broker will be the leader for one partition and will also maintain copies of the 2 other partitions. This will lead to high space usage on each machine. So shouldn't the copies of the data be maintained on idle machines? Is there a way to determine which machines should maintain the copies without being leaders?
Your explanation made it very simple to understand Kafka. Thank You.
A question - If we start Brokers on multiple systems, do we need to increment the broker id's or '0' is good enough in all brokers?
Broker ID should be unique for each broker in the cluster.
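As a sketch of what "unique per broker" looks like in practice — the property names are standard Kafka broker settings, while the values and file names below are just example conventions from the tutorial setup:

```shell
# config/server-1.properties  (second broker)
broker.id=1                  # must be unique for every broker in the cluster
log.dirs=/tmp/kafka-logs-1   # also unique when brokers share one machine
```

So '0' is fine for exactly one broker; every additional broker needs its own ID, whether it runs on the same machine or a different one.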
Hi, this is a wonderful tutorial. But I have a doubt regarding setting the replication factor: is there any option to set the replication factor for dynamically created topics programmatically? Please reply.
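For reference (this is not covered in the video, so treat it as an assumption to verify against your Kafka version): broker-side defaults control the shape of topics that Kafka auto-creates. A config sketch:

```shell
# config/server.properties -- defaults applied to auto-created topics
auto.create.topics.enable=true
num.partitions=1
default.replication.factor=1   # raise this so auto-created topics are replicated
```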
Very nice tutorial. A lot of concepts are now clear.
A quick question though: if I start a single broker, keep num.partitions=1 in the server.properties file, and push a message to a non-existent topic, it will create a topic with one partition. Now if I start another broker on another machine with the same configuration, will the topic get replicated on the other server? (No replication factor set.)
Very nice explanation sir, thank you. One question: if ZooKeeper is down for some reason and a consumer is in the middle of reading, is there a way to continue from that offset and partition once it's back up?
What are the similarities/differences between Kafka replication factor and HDFS replication (default 3)? If HDFS is used then do we even need replication at Kafka level?
Replication is a backup copy to be used in case of failure. It has the same purpose everywhere. Kafka doesn't run on YARN; it stores data on the local filesystem instead of HDFS. So each of them needs to replicate its own data.
What are the options used these days to copy from Kafka to HDFS? Like Flume, the Kafka HDFS Connector (using Confluent Kafka?), any others?
Could you please tell me how to create multiple brokers on a Kafka VM on Google cloud
In the case of a multi-node cluster, do we need to set up SSH, or does Kafka have its own mechanism for communication between nodes? Nice videos, helping me understand.
You need to ensure TCP/IP connectivity. No need to set up SSH.
Hi Sir, you did a great job. I need your help with a doubt.
Replicas: 1,0,2 means Broker 1 is the leader and maintains the first copy, and Broker 0 and Broker 2 each maintain one copy as per the replication factor (3).
--partitions 2 means the 1st partition is on Broker 1 and the 2nd partition is on Broker 2. How can Broker 0 store a copy without having a partition?
We have 2 partitions (Broker 1 + Broker 2) and replication factor 3 (Broker 1 + Broker 0 + Broker 2). How is Broker 0 storing a copy without having a partition? I hope you understand my query. Waiting for your response. Thanks.
Hi Sir, this is very good. But it looks like the leader node is a single point of failure. How do you handle the case where node 1 fails, given that the producers and consumers are connecting to node 1, which is the leader?
Is there a possibility of not being able to produce an event because the broker is down? How would we handle that?
Hi, after changing the port number in the server-1.properties file, I am getting an error: "kafka.common.KafkaException: Socket server failed to bind to 0.0.0.0:9092: Address already in use: bind." Please help!
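A note on the error above: it still mentions port 9092, which usually means either another broker already holds that port or the edited file is not the one actually passed to kafka-server-start.sh. A config sketch for the second broker (9093 is just an example value):

```shell
# config/server-1.properties -- give the second broker its own port
# (older Kafka versions used `port=9093`; newer ones use `listeners`)
listeners=PLAINTEXT://:9093
```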
Isn't the leader a single point of failure? How is fault tolerance handled if the leader fails?
Is there a way to build gui tool to use the tools under /bin folder? Does Kafka offer any api for us to do such a thing?
That's a nice thing to do, but most of it has already been done by the Confluent team. I suggest you check the Confluent documentation before you decide to create something.
Nice tutorial, really easy to understand. I have one doubt: in this video we created 3 brokers, but in the --describe command output, why is it showing 2 rows of data and not 3?
Do the replicas occupy the same size as the original data, or are they compressed? Also, will replication be applied to all topics in a production system?
Your question indicates that you are concerned about storage space. Having three replicas is widely accepted in almost every distributed system. Storage is getting cheaper and cheaper. Compression is not an issue; you can implement it. But for Kafka, storage space is not a big concern because we clean up old messages that we don't need. You can configure the cleanup frequency.
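The cleanup frequency mentioned above is controlled by broker-side retention settings; a config sketch (the property names are standard, the values shown are the common defaults as far as I know -- verify against your version's docs):

```shell
# config/server.properties -- log cleanup settings
log.retention.hours=168                  # delete segments older than 7 days
log.retention.bytes=-1                   # no size-based limit
log.retention.check.interval.ms=300000   # how often the cleaner checks (5 min)
log.cleanup.policy=delete                # or `compact` for log compaction
```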
What is the localhost port given while creating a topic?
I have one question.
Example: I have 15 GB of data daily from 4 different sources. Can you please tell me how many producers, topics, brokers, partitions, and consumers I should use?
Can you make a video on Kafka controllers
Hello Sir, All your lectures are beautifully explained.
I have a couple of queries; can you please clarify them for me? You mentioned at 2:40 in the video that when a client wants to send data, it connects to the leader. Does the producer keep track of the partition and the partition leader?
Also, does the producer keep track of the partition offset?
After connecting to a broker, producer internally queries for the metadata from the broker. The metadata contains all those details.
@@ScholarNest - Thanks for replying.
When you say "the producer internally queries the broker for metadata", does this mean the producer queries the Kafka broker for the metadata details, and the broker in return fetches those details from ZooKeeper?
Yes, but all the metadata may not come from ZooKeeper, because with every newer version, Kafka is trying to reduce its dependency on ZooKeeper.
Sir, if the 3 brokers are on 3 different machines, do we need to configure the IP address as well, apart from the port number?
The default is localhost, which only works while everything runs on one machine. If you are using three different machines, you need to configure the IP address (or hostname) as well.
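A sketch of the properties involved when brokers run on separate machines; the property names match recent Kafka versions (older releases used `advertised.host.name` and `port` instead), and the hostnames are placeholders:

```shell
# config/server.properties on each machine
broker.id=0                                   # unique per broker
listeners=PLAINTEXT://0.0.0.0:9092            # interface and port the broker binds
advertised.listeners=PLAINTEXT://broker1.example.com:9092  # address clients are told to use
zookeeper.connect=zk1.example.com:2181        # same value on every broker
```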
Good video. Couldn't we get a port clash if we leave it as is? 7:10
You have created 2 partitions and 3 replicas, so I understand each broker will have 1 replica (no. of replicas = no. of brokers). But I am confused about where the partitions will be created: we have 2 partitions and 3 brokers, so which partition will go to which broker? Please clear my doubt.
When there are 2 or more partitions for a topic handled by different brokers, where do we specify which requests (from producer/consumer) are handled by which broker? How does Kafka handle this?
Hello Suhas, I have just started understanding Kafka and I also have the same query. Did you get an answer to it?
So if we want 2 partitions for a topic, should we create 2 Kafka brokers? (Or multiple, 1 for each partition?)
I think this question is answered in one of the videos. Finish the tutorial in sequence.
Very good explanation. I have a simple question: why do we need to provide ZooKeeper information while creating a topic, and not Kafka server information?
Topic metadata is kept in ZooKeeper. We need a common place where all brokers have access to some essential information; Kafka uses ZooKeeper for that purpose.
For those who have neither Linux nor access to any cloud: you can download Git Bash, use Visual Studio Code as your editor, and start working on Windows. 😎
Very well explained. How does Kafka know that the first broker (server.properties), the second broker (server-1.properties), and the third broker (server-2.properties) are in the same cluster?
When you start a broker, you specify its properties file as a parameter. All three files point to the same zookeeper.connect address, and that shared ZooKeeper is what makes them one cluster.
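The multi-broker startup from the video can be sketched as follows (file names follow the tutorial's convention; the shared ZooKeeper entry is the piece that ties the brokers into one cluster):

```shell
# Start three brokers, each with its own properties file
bin/kafka-server-start.sh config/server.properties &
bin/kafka-server-start.sh config/server-1.properties &
bin/kafka-server-start.sh config/server-2.properties &

# All three files contain the same line, which makes them one cluster:
#   zookeeper.connect=localhost:2181
```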
Hi, it's a very good explanation. How do we specify the leader? Where is it specified?
I had the same question. Turns out that Kafka (actually the ZooKeeper underneath) uses the concept of ISR (In-Sync Replicas, explained in the video) to elect the leader. The ISR is persisted with ZooKeeper, so any change in the ISR is reflected by ZooKeeper. In layman's terms, if ZooKeeper knows that 3 replicas of a partition are to be maintained for a particular topic and all 3 ISRs are in perfect order, it just picks one of them (I think randomly), as all are legitimate leader candidates. In this manner it also handles fault tolerance, as all it needs to do is make sure the remaining ISRs are okay.
For a more detailed explanation:
community.hortonworks.com/questions/64905/kafka-leader-election.html
kafka.apache.org/documentation/#design_replicatedlog
So if the number of partitions is X and the replication factor is Y, is the total number of partition replicas at any moment X times Y?
Yes. In your case, assuming X=5 and Y=3, you will have 5 partitions, each with 3 copies, so 15 in total.
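The arithmetic above as a trivial check, using the numbers from the example:

```shell
partitions=5
replication_factor=3
# Total partition replicas stored across the cluster
echo $(( partitions * replication_factor ))   # prints 15
```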
Hello Sir, is it possible to read messages stored in Kafka from the beginning using Logstash without changing the group ID?
You can get data from the beginning as long as it has not been cleaned or compacted in Kafka. However, when you work in a group, you don't read everything. I suggest you rethink what you are trying to achieve.
@ScholarNest thank you sir for the reply. I achieved it using the Kafka command option --reset-offsets --to-earliest. It resets the messages' offsets to the beginning, and Logstash reads them again.
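The option the commenter mentions belongs to the kafka-consumer-groups tool; a command sketch (group and topic names are placeholders, and the group's consumers must be stopped while resetting):

```shell
bin/kafka-consumer-groups.sh --bootstrap-server localhost:9092 \
  --group logstash-group --topic my-topic \
  --reset-offsets --to-earliest --execute
```

Without --execute the command only prints the planned offsets (a dry run), which is a safe way to preview the reset first.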
A question about the committed partition message offset and fault tolerance: does the partition leader replicate the committed offset to the followers?
The partition leader stores messages, and the followers replicate the messages. Offsets are generally stored in a topic, so they are globally available.
A topic is a meaningful category for a bunch of messages. Could you explain what you mean by "offsets are generally stored in a topic"?
Kafka needs to store offsets somewhere. In earlier versions, they were stored in ZooKeeper. In newer versions, Kafka creates an internal topic and stores them there.
Adding two more brokers on the existing broker node makes sense because we modified the required information in the configuration files. However, we have not modified any configuration for ZooKeeper. So the question is: how does ZooKeeper get to know that the two brokers are newly added to the cluster?
Normally you copy a broker configuration file and edit some of the properties. Since ZooKeeper is the same for the new broker as well, we don't change that, but the ZooKeeper details are there in the configuration file, so each new broker registers itself with that same ZooKeeper when it starts.
Thank you... Understood and following other videos and if we get any doubts will post them... awesome videos
How can one integrate Kafka with a traditional data warehouse?
A data warehouse is just a database. Do you see any problem with that?
Sir, how do the 3 brokers come into the same cluster? Are we mentioning it anywhere?
That's a good question, but the answer is not that straightforward. Cluster members are managed in ZooKeeper. I have explained it in my Kafka Streams training. You might want to get it at Udemy :-)
What is a listener in Kafka, and how do I change the listener port number?
A listener is just the address and port on which the broker accepts connections. You connect to the cluster by supplying a broker address and port; start the broker on a different port if you want to change it.
Please tell me how leader copies all incoming messages to followers?
The leader doesn't copy it to followers. It is the other way around. Followers copy it from the leader.
Is there a way to handle failure of the producer?
Producers are independent applications, and we treat them like any other independent application. There is no Kafka-specific method to handle those failures. Whatever you would do to handle failures of any other application applies to a Kafka producer as well.
Learning Journal thank you for your reply..
Course is awesome
@Learning Journal
Question: is it possible that all partitions of a particular topic have the same leader?
Example: topic name PDF
Number of partitions: 3
Number of nodes: 3 (N1, N2, N3)
When we run describe topic PDF, is this possible:
Partition 0 Leader: N1
Partition 1 Leader: N1
Partition 2 Leader: N1
(Reference: video position 10:06)
Thank you, Sir. Very clean and straightforward explanations!
Normally this doesn't happen unless the other two brokers are down while you create the topic. But it is possible on a single-node cluster.
So does the producer know the public IP address of all 3 brokers? How does the producer know which one is the leader, and what makes it switch to the new leader if it fails? That's what I don't understand from the video. Ta.
The producer connects using the IP address provided and then queries metadata from the broker to get detailed information about the other brokers and the leader of the topic.
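A sketch of what the producer is given versus what it discovers; `bootstrap.servers` is Kafka's standard client property, while the hostnames are placeholders:

```shell
# Producer client configuration (e.g. producer.properties)
# Only a starting list is needed -- not every broker in the cluster:
bootstrap.servers=broker1.example.com:9092,broker2.example.com:9092

# From any one of these brokers, the producer fetches cluster metadata:
# the full broker list, each topic's partitions, and the current
# leader of every partition. On a leader change, refreshed metadata
# points the producer at the new leader.
```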
What if we have two brokers and have replication factor as 3?
+Venkatesh Kamthane good question. Try it and let me know the answer.
@ScholarNest Topic creation will fail with => Replication factor: 3 larger than available brokers
The tutorial was good. Experimenting with it, I came up with a question. Let's say I have a topic XYZ and three brokers, B1, B2, and B3. I set partitions to 2 and replicas to 1. After checking the details of the topic, B1 and B3 have the topic but not B2.
Now I pointed the producer at B2, where the topic is not present, and consumers connected to that topic through all three brokers.
I was successful in sending data as well as receiving it via all three brokers. How is that happening? I mean, the topic has no partition on B2, but I am able to send data through it, and even B2 is able to take the data.
Producer pulls metadata after connecting to the broker. The metadata contains the list of all brokers in the cluster. That's how the producer knows about the all brokers in the cluster.
Because of the sound at the beginning of the video, you may lose subscribers.
Thank you