From Zero to Hero with Kafka Connect

  • Added 7 Aug 2024
  • Integrating Apache Kafka with other systems in a reliable and scalable way is often a key part of a streaming platform. Fortunately, Apache Kafka includes the Connect API that enables streaming integration both in and out of Kafka. Like any technology, understanding its architecture and deployment patterns is key to successful use, as is knowing where to go looking when things aren't working.
    This talk covers:
    * Key design concepts within Kafka Connect
    * Deployment modes
    * Live demo
    * Diagnosing and resolving common issues encountered with Kafka Connect
    * Single Message Transforms
    * Deployment of Kafka Connect in containers
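    The key design concepts listed above (connector plugins, converters, and Single Message Transforms) all meet in a single connector configuration. The following is a hedged sketch rather than a config from the talk: the connection details, credentials, and topic prefix are placeholder assumptions, while the connector class, Avro converter, and InsertField transform are the standard classes shipped with the Confluent JDBC connector and Apache Kafka.

    ```json
    {
      "name": "jdbc-source-demo",
      "config": {
        "connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
        "tasks.max": "1",
        "connection.url": "jdbc:mysql://mysql:3306/demo",
        "connection.user": "connect_user",
        "connection.password": "secret",
        "mode": "incrementing",
        "incrementing.column.name": "id",
        "topic.prefix": "mysql-",
        "value.converter": "io.confluent.connect.avro.AvroConverter",
        "value.converter.schema.registry.url": "http://schema-registry:8081",
        "transforms": "addTs",
        "transforms.addTs.type": "org.apache.kafka.connect.transforms.InsertField$Value",
        "transforms.addTs.timestamp.field": "ingest_ts"
      }
    }
    ```

    Submitted to a worker's REST API (e.g. POSTed to http://localhost:8083/connectors), this would stream rows from the `demo` database into topics prefixed `mysql-`, serialised as Avro, with an ingest timestamp added to each record by the SMT.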
    📔 Slides: rmoff.dev/kafka-connect-zero-...
    👾 Code: github.com/confluentinc/demo-...
    ⏱ Time codes:
    00:00 What is Kafka Connect?
    03:38 Demo streaming data from MySQL into Elasticsearch
    11:43 Configuring Kafka Connect
    12:33 👉 Connector plugins
    13:33 👉 Converters
    13:53 👉 Serialisation and Schemas (Avro, Protobuf, JSON Schema)
    17:13 👉 Single Message Transforms
    19:43 👉 Confluent Hub
    19:51 Running Kafka Connect
    20:24 👉 Connectors and Tasks
    21:29 👉 Workers
    21:56 👉 Standalone Worker
    22:50 👉 Distributed Worker
    23:10 👉 Scaling Kafka Connect
    24:42 Kafka Connect on Docker
    26:17 Troubleshooting Kafka Connect
    27:56 👉 Dynamic Log levels in Kafka Connect
    28:48 👉 Error handling and Dead Letter Queues
    32:16 Monitoring Kafka Connect
    32:59 Recap & Resources
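    The distributed-worker section (22:50) boils down to a handful of worker properties. A minimal sketch, with hostnames and topic names that are assumptions rather than values from the talk; the property names themselves are standard Kafka Connect worker settings:

    ```properties
    # Distributed worker configuration (sketch)
    bootstrap.servers=broker:9092
    # Workers sharing the same group.id form one Connect cluster
    group.id=connect-cluster-1
    # Internal topics where the cluster stores its state; all three must be compacted
    config.storage.topic=_connect-configs
    offset.storage.topic=_connect-offsets
    status.storage.topic=_connect-status
    config.storage.replication.factor=3
    offset.storage.replication.factor=3
    status.storage.replication.factor=3
    # Default converters (overridable per connector)
    key.converter=org.apache.kafka.connect.storage.StringConverter
    value.converter=io.confluent.connect.avro.AvroConverter
    value.converter.schema.registry.url=http://schema-registry:8081
    # Where the worker looks for connector plugins
    plugin.path=/usr/share/java,/usr/share/confluent-hub-components
    ```

    Scaling out is then just a matter of starting another worker with the same group.id; the cluster rebalances connectors and tasks across the members.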
    --
    ☁️ Confluent Cloud ☁️
    Confluent Cloud is a managed Apache Kafka and Confluent Platform service. It scales to zero and lets you get started with Apache Kafka at the click of a mouse. You can sign up at confluent.cloud/signup?... and use code 60DEVADV for $60 towards your bill (small print: www.confluent.io/confluent-cl...)
    --
    💾Download Confluent Platform: www.confluent.io/download/?ut...
    📺 Kafka Connect connector deep-dives: • Kafka Connect
    ✍️Kafka Connect documentation: docs.confluent.io/current/con...
    🧩Confluent Hub: www.confluent.io/hub/?...
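    For the error-handling segment (28:48), dead letter queues are enabled per sink connector via the errors.* properties. A hedged sketch: the connector, topic, and DLQ names here are assumptions, while the errors.* keys are standard Kafka Connect sink settings.

    ```json
    {
      "name": "elasticsearch-sink-demo",
      "config": {
        "connector.class": "io.confluent.connect.elasticsearch.ElasticsearchSinkConnector",
        "topics": "mysql-orders",
        "connection.url": "http://elasticsearch:9200",
        "errors.tolerance": "all",
        "errors.log.enable": "true",
        "errors.log.include.messages": "true",
        "errors.deadletterqueue.topic.name": "dlq_mysql-orders",
        "errors.deadletterqueue.context.headers.enable": "true"
      }
    }
    ```

    With errors.tolerance set to all, records that fail conversion or transformation are routed to the DLQ topic (with failure context in the record headers) instead of killing the task.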
  • Science & Technology

Comments • 45

  • @thomaskaminski2187
    @thomaskaminski2187 3 years ago +8

    Hi Robin,
    I never write comments on YouTube videos, but I deeply want to thank you for all your work!

    • @rmoff
      @rmoff  3 years ago +1

      Thanks - glad it was useful!

  • @lossoth7141
    @lossoth7141 3 years ago +1

    Your examples are always very well chosen. Thanks.

    • @rmoff
      @rmoff  3 years ago

      Thanks - glad you've found it useful :)

  • @vivekshah1664
    @vivekshah1664 4 months ago

    Hi Robin,
    I am a software engineer at a startup. Last year we built a pipeline to sync our Postgres data to Elasticsearch and Cassandra. It was all custom Java code with a lot of operational handling. Thank you for this video; I am planning to use Connect for those pipelines.

  • @rajeshantony74
    @rajeshantony74 2 years ago

    Hi Robin, I am a new subscriber and fan here

  • @marcuspaget
    @marcuspaget 3 years ago +1

    Thanks Robin - from your newest fan and subscriber :) I'm really loving all the information coming from Confluent - top job. We are getting serious about implementing a solution centralised on Kafka (on a limited budget) - I guess there are just a lot of different ways and means. I'll post on the community site a bit later, but just wondering, off the top of your head: if you were combining web logs from multiple websites of a similar nature (the db schema is the same - although as per your suggestion I will look into Avro), would you combine all users into one topic (perhaps tagging where they originated) or set up a topic for each website? Ultimately queries are centralised on username, so origination is just FYI. Somewhere I heard/read about creating a topic per user, but this didn't seem right (for 10,000s of users).

    • @rmoff
      @rmoff  3 years ago +1

      Hi Mark, from what you describe I would definitely collate these into a single topic, since they sound like the same logical entity. One topic per user sounds…unusual.

  • @rum81
    @rum81 3 years ago

    Thank you Robin!

    • @rmoff
      @rmoff  3 years ago

      my pleasure, glad to help :)

    • @rum81
      @rum81 3 years ago

      @@rmoff Can you share links to examples of companies using Kafka Connect in production? I need these examples to propose Connect in my organisation.

    • @rmoff
      @rmoff  3 years ago +1

      @@rum81 If you look at past talks from Kafka Summit (www.kafka-summit.org/past-events) you'll find lots of examples of companies using Kafka Connect in production.

  • @lohitraja2400
    @lohitraja2400 2 years ago

    Hi Robin,
    Thanks for the amazing videos. We are implementing Kafka in our project, and whenever I get stuck your videos help a lot in clearing up concepts and issues.
    I have a small conceptual doubt: do Kafka and Kafka Connect support ENUM data types? We are facing a type-cast error when syncing data from the source table to the sink table.

    • @rmoff
      @rmoff  2 years ago

      I'm so glad my videos have helped you out :)
      I don't know the answer to your ENUM question - please ask at forum.confluent.io/ and someone should be able to help. Thanks.

  • @user-tm6lw7vi5d
    @user-tm6lw7vi5d 3 years ago +1

    Hi Robin, thanks for this video. I wonder if 'mariadb-jdbc-connect' is available in this project. Thanks :)

    • @rmoff
      @rmoff  3 years ago

      Hi, if it has a JDBC driver then it's worth trying with the JDBC Source connector, sure.

  • @ris9hi
    @ris9hi 2 years ago

    Thanks Robin. I have a question about the plugin.path you gave while installing the connector. Where did that path come from? Can I give any path? Where can I find the path to specify in the Dockerfile?

    • @rmoff
      @rmoff  2 years ago

      Hi, this path comes from wherever you put the JDBC connector when you installed it. This might help: rmoff.net/2020/06/19/how-to-install-connector-plugins-in-kafka-connect/
      If you're still stuck then please go to forum.confluent.io/ and ask for further help there. Thanks.
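      Robin's linked post covers this in depth; as a minimal sketch (the image tag and connector version below are assumptions), a Dockerfile that installs a plugin and puts its directory on plugin.path looks like:

      ```dockerfile
      FROM confluentinc/cp-kafka-connect:7.5.0
      # confluent-hub installs into /usr/share/confluent-hub-components by default;
      # that directory must appear on the worker's plugin.path, which the cp image
      # reads from the CONNECT_PLUGIN_PATH environment variable
      ENV CONNECT_PLUGIN_PATH="/usr/share/java,/usr/share/confluent-hub-components"
      RUN confluent-hub install --no-prompt confluentinc/kafka-connect-jdbc:10.7.4
      ```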

  • @esbee296
    @esbee296 2 years ago

    Hi Robin, is there a source connector for Adobe, or can we use a JSON connector as long as the streaming data is in JSON format?

    • @rmoff
      @rmoff  1 year ago

      The best place to ask is www.confluent.io/en-gb/community/ask-the-community/

  • @AnkitSingh-dk7qb
    @AnkitSingh-dk7qb 3 years ago

    Hi Robin, I'm facing an issue creating a topic in Kafka: the decimal data type is stored as bytes. Is there any way to solve that?

    • @rmoff
      @rmoff  3 years ago

      Hi Ankit, the best place to ask is confluent.io/community/ask-the-community/

  • @mitanshukr
    @mitanshukr 1 year ago

    In distributed mode, sometimes the Connect worker throws an error that the status.storage.topic cleanup.policy should be set to compact. I'm wondering why it throws that error only occasionally, and would setting log.cleanup.policy to compact on the Kafka broker fix the issue?

    • @rmoff
      @rmoff  1 year ago

      Yes, they should be set to compact - see docs.confluent.io/kafka-connectors/self-managed/userguide.html#kconnect-internal-topics
      Also head to confluent.io/community/ask-the-community if you have any more questions :)
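      Concretely, the required policy can be set when pre-creating the topic, or added to an existing one. A command sketch, assuming the broker address, partition/replication counts, and a topic name matching your worker's status.storage.topic setting:

      ```shell
      # Pre-create the status topic with the compact cleanup policy
      kafka-topics --bootstrap-server broker:9092 --create \
        --topic _connect-status --partitions 5 --replication-factor 3 \
        --config cleanup.policy=compact

      # Or fix an existing topic in place
      kafka-configs --bootstrap-server broker:9092 --alter \
        --entity-type topics --entity-name _connect-status \
        --add-config cleanup.policy=compact
      ```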

  • @abhinavkumar-se2fd
    @abhinavkumar-se2fd 3 years ago

    Hey Robin, thanks for this video. But could you please guide us first on how to start Apache Kafka Connect, and how to check whether it is already running?

    • @rmoff
      @rmoff  3 years ago +1

      You can find good info on running Kafka Connect here: docs.confluent.io/platform/current/connect/userguide.html#connect-userguide-standalone-config

    • @abhinavkumar-se2fd
      @abhinavkumar-se2fd 3 years ago

      @@rmoff I am trying to test FileStreamSourceConnector (file-source, a preconfigured connector in Apache Kafka). The connector starts successfully and fetches the data into the topic, but when I run the Kafka consumer it does not fetch any records. I am following this document: docs.confluent.io/platform/current/connect/quickstart.html
      Also, I am unable to find such a connector under plugin.path, so how come the connector starts?

  • @aparnas8958
    @aparnas8958 2 years ago

    Hello Robin, I connected Azure SQL with Kafka Connect by giving the table name, host name, and server name, but I'm not able to specify the DB schema name anywhere. Is there any way to specify the schema name? Without it, the connector creates a new table in the DB.

    • @rmoff
      @rmoff  2 years ago

      hi, please head over to forum.confluent.io/ and ask there :) thanks.

  • @armenchakhalyan
    @armenchakhalyan 2 years ago

    I get "The key format 'AVRO' is not currently supported" when using FORMAT='AVRO' in KSQL

    • @rmoff
      @rmoff  2 years ago

      You need to upgrade to a more recent version of ksqlDB.

  • @miristegal
    @miristegal 3 years ago

    I'm getting
    ERROR 1049 (42000): Unknown database 'demo'
    while trying to connect to MySQL...

    • @rmoff
      @rmoff  3 years ago

      Did you create the database first? If you're still stuck head to forum.confluent.io/ with full details of what you've run and where you're getting the error.

  • @bavisettijyothsna7464

    Can you share any documents for MSK as sink connectors?

    • @rmoff
      @rmoff  1 year ago

      hi, the best place to get help is at www.confluent.io/en-gb/community/ask-the-community/ :)

  • @radityoperwianto1339
    @radityoperwianto1339 2 years ago +1

    I hope it isn't too late to thank you, Robin

    • @rmoff
      @rmoff  2 years ago +1

      Glad it was useful :)

  • @rkravinderkumar05
    @rkravinderkumar05 2 years ago

    Hi Robin,
    How can we include the JSON schema in the message when a field is an array of objects? I don't have the option to use Avro.

    • @rmoff
      @rmoff  2 years ago

      Hi, can you post this at forum.confluent.io/ and hopefully someone will be able to help there :)