Mask RCNN - COCO - instance segmentation

  • added 5 Nov 2017
  • source: github.com/karolmajek/Mask_RCNN
    Input 4K video: [NEW LINK!!!]
    archive.org/details/000220170...
    If this video helped you somehow - you can buy me a coffee:
    bit.ly/Coffee4Karol
  • Science & Technology

Comments • 227

  • @Augmented_AI
    @Augmented_AI 5 years ago +21

    Would really love a video tutorial :)

  • @Musicoder-es3nt
    @Musicoder-es3nt 6 years ago +2

    it's really impressive.

  • @mohamedouziane
    @mohamedouziane 4 years ago

    Question: for semantic segmentation models such as BiSeNet and SegNet, is it necessary to prepare masks for training? If yes, how should the masks be prepared? Thank you for your answer.

  • @nguyenmanhcuong9657
    @nguyenmanhcuong9657 6 years ago +1

    Hi, I want to know which annotation tool can create a dataset like MS COCO. I used labelImg, but it can only create Pascal VOC datasets. How do I create the masks for images in the MS COCO case? Thanks for your help!

    • @KarolMajek
      @KarolMajek 6 years ago

      Do you want to create contours of objects? There is a tool available, I don't remember name/link, sorry

  • @fouziaehsan2859
    @fouziaehsan2859 6 years ago

    Hi,
    Can anybody help me use Mask RCNN to train on mainly 3 object classes from my own video dataset? What do we need to modify in train_shapes.ipynb to load images from our own dataset?
    Thanks!

  • @denanfamily5890
    @denanfamily5890 6 years ago +19

    T-900 point of view

  • @sublimoud3196
    @sublimoud3196 6 years ago

    Hi, I am trying to build my own dataset with create_pet_tf_record.py. I have masks, XML and PNG files, but when I launch train and eval the masks are not displayed... Do you have any idea? No one answers me.

  • @tahirhanyildizoglu3408
    @tahirhanyildizoglu3408 6 years ago +12

    At 0:48 it finds the reflection of a car in the mirror on the right side. Really good!

    • @smart_world7928
      @smart_world7928 6 years ago

      at 1:23 it also finds a bus right there ;)

    • @ojjoooooo
      @ojjoooooo 5 years ago +2

      Would be good if it understood it's a reflection.

    • @KarolMajek
      @KarolMajek 5 years ago +1

      Imagine advertisements, billboards with cars or people...

    • @liberator328
      @liberator328 4 years ago +1

      Wouldn't that be bad? Ideally it should detect that it's a mirror of a car, not another different car.

    • @KarolMajek
      @KarolMajek 4 years ago +2

      You need a lot of examples and a class for that. For autonomous machines it's a lot easier to use lidar.

  • @thisnameisnotavailable
    @thisnameisnotavailable 6 years ago +45

    15:19 "Clock 98% ... " - it's ventilation, man! VENTILATION!

    • @KarolMajek
      @KarolMajek 6 years ago +20

      There is no ventilation class in the COCO set (cocodataset.org), that's the reason

    • @mathiashaudgaard9360
      @mathiashaudgaard9360 5 years ago +10

      However this ventilation has a lot of the same features as a clock. If you look closer you can see how they relate :-)

    • @OscarPrice007
      @OscarPrice007 5 years ago +1

      What about the bus (van) and the three sinks (literally just road)?

    • @dx31900
      @dx31900 4 years ago

      That's why they never labeled anything 100%. Machine vision can "never" be as accurate as the human eye.

  • @user-uh1xh4dq3q
    @user-uh1xh4dq3q 6 years ago +2

    impressive

  • @mohamedouziane
    @mohamedouziane 4 years ago

    Great job

  • @MasterJ202
    @MasterJ202 4 years ago +2

    I’m interested in using this with Unity. I already have Tensorflow and I have been playing around with an object detection example project from the internet and it works pretty well but I like the idea of Masking the objects instead of just bounding boxes. Do you think it’s possible?

  • @MrSwan-tm5wj
    @MrSwan-tm5wj 6 years ago +2

    Would it be possible to track rats going around a track, and get their order ? Is yolo a better choice to do this ?

    • @KarolMajek
      @KarolMajek 6 years ago

      YOLO/SSD are faster and should do this, but you will also need tracking, I guess

    • @MrSwan-tm5wj
      @MrSwan-tm5wj 6 years ago

      Thanks.
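
The reply above says a detector alone is not enough and that tracking is needed as well. One simple way to add it, shown here only as a sketch and not as anything used in the video, is to match each new detection to the previous frame's box with the highest overlap (IoU). The function names, the `(x1, y1, x2, y2)` box layout and the `min_iou` value are all assumptions made for illustration.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def associate(tracks, detections, next_id, min_iou=0.3):
    """Greedily assign detections to existing track ids (hypothetical helper).

    tracks: dict of track id -> box from the previous frame.
    Returns the updated tracks and the next unused id.
    """
    new_tracks = {}
    unused = dict(tracks)  # tracks not yet claimed in this frame
    for det in detections:
        best_id, best_score = None, min_iou
        for tid, box in unused.items():
            score = iou(box, det)
            if score >= best_score:
                best_id, best_score = tid, score
        if best_id is None:
            # No sufficiently overlapping track: start a new one.
            best_id = next_id
            next_id += 1
        else:
            del unused[best_id]
        new_tracks[best_id] = det
    return new_tracks, next_id
```

With stable ids per animal, the order on the track could then be read off by sorting the tracked boxes along the direction of travel. Greedy matching like this tends to break down when objects overlap heavily or move far between frames.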

  • @walterrural
    @walterrural 5 years ago +3

    May I use this footage in a project? I'm looking for this kind of footage, but nothing I find licensed in Creative Commons compares to this.

    • @KarolMajek
      @KarolMajek 5 years ago +4

      Ok, but please share the result with me. Just wanted to know. Thanks for asking!

  • @TankNSSpank
    @TankNSSpank 6 years ago +1

    Holy shit. Good job.

  • @marr73
    @marr73 6 years ago

    this is more like semantic segmentation? how can i reproduce the result?

  • @piyush9555
    @piyush9555 1 year ago

    how precisely accurate computers have become, we're in a real trouble now

  • @richardburrows9202
    @richardburrows9202 5 years ago +1

    Very impressive work Karol. I'm working on a home surveillance tool (personal project) that utilizes IP cameras and an object detection neural network. Have you done much work in low-light environments? Any tips for working with images that have IR illumination (most security cameras use IR for night mode so the images are in greyscale).

    • @KarolMajek
      @KarolMajek 5 years ago +1

      It's hard if you have bad illumination. I was working with a thermal camera, but not with RGB + illumination. Have you already tried to run something on such videos? If you want, I can run some algorithms and publish the results here, on my channel

  • @KaeMCe1
    @KaeMCe1 6 years ago +2

    Hello,
    it must be said that the segmentation algorithm combined with YOLO(?) TensorFlow(?) works really well; I am impressed. I have a question about the hardware used and what the CPU and GPU usage looked like, and whether the algorithm used the image at the original 4K resolution or whether it was downscaled specifically for the computation. What FPS was achieved? I suspect that while object detection alone can run in real time, combining it with segmentation not necessarily; how did it look in practice?

    • @KarolMajek
      @KarolMajek 6 years ago +1

      Hello,
      Input is 1024x1024. Mask RCNN arxiv.org/abs/1703.06870 in TensorFlow. On GPU: GTX1080 ~1 FPS (I still need to check this, because the host also has a K40, and I am not sure about the FPS either)
      So it runs much slower than YOLO or SSD, but the output is more attractive

    • @KaeMCe1
      @KaeMCe1 6 years ago +1

      Thank you for the answer. I think that when I have some free time I will try to run YOLO or TensorFlow with the COCO dataset.
      Have you considered testing this kind of algorithm on footage from a flying platform (a drone)? Personally I am curious whether the COCO dataset allows detection of objects seen from above and from a large distance.

    • @KarolMajek
      @KarolMajek 6 years ago

      YOLO may have a problem: it works poorly for small objects. A very interesting problem; unfortunately I don't have such video material. As for running it on a drone, please check this video, on a phone: czcams.com/video/cQJa9AVEAII/video.html
      As for data from above, it is a matter of preparing a dataset and training the neural network.

  • @nicom.1500
    @nicom.1500 6 years ago +3

    Great video, Karol. Thanks for sharing. I think yours is the only implementation where you see only one color for each label (people, car, etc) which makes it much more understandable vs the original. How are you doing that? Thanks!

    • @furkancetin3195
      @furkancetin3195 6 years ago

      Bro did you find an answer? I am trying to do the same thing

    • @KarolMajek
      @KarolMajek 6 years ago +2

      It's not a secret, I plan to release code and create tutorial. Sorry that you are waiting so long

    • @furkancetin3195
      @furkancetin3195 6 years ago

      It will be great I am waiting for that tutorial thank you :)

  • @justcurious2906
    @justcurious2906 6 years ago +1

    Is this implementation better than YOLO? Also, how do you count uniquely the objects detected, without counting the same object twice?

    • @KarolMajek
      @KarolMajek 6 years ago

      This is single-frame detection without any tracking, so you wouldn't know. But in each frame you can count objects, since the output is a list of instances plus contours
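
Per-frame counting, as described in the reply above, needs nothing more than the class id of each detected instance. The sketch below assumes the detector returns one integer class id per instance and that a `class_names` list maps ids to labels; both names and the short label list are illustrative, not taken from the author's code.

```python
from collections import Counter


def count_per_class(class_ids, class_names):
    """Count detected instances per class label for a single frame."""
    return Counter(class_names[i] for i in class_ids)


# Hypothetical frame: two cars and one person were detected.
class_names = ["BG", "person", "bicycle", "car"]
frame_counts = count_per_class([3, 3, 1], class_names)
```

These counts are independent for every frame. Counting each physical object only once across the whole video would additionally require tracking, which this detector does not provide.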

  • @qizhang7648
    @qizhang7648 6 years ago +1

    Why can't it recognize the big, big van right in front during the 16-18 min time frame? It starts to recognize the van only when it is at a distance or looks smaller. This reminds me of the Tesla fatal accident of running into a trailer. Any explanation?

    • @gsupreeth
      @gsupreeth 6 years ago +1

      Good catch. I noticed that it recognises it as a bus before it enters the tunnel. Once inside, it seems to reacquire the shape every time the brightness level changes. Also I think it's because it's directly in front and does not have a strong enough contour to be recognised at once. It will be interesting to hear Mr. Karol's perspective on this.

    • @KarolMajek
      @KarolMajek 6 years ago

      I think the biggest issue is dataset bias (trucks are far away, more pickups or other types, ...). MS COCO is used for training; look here to see what a truck is: cocodataset.org/#explore. If we lower the detection threshold it should appear, but we will have far more bad results...
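
The trade-off in the reply above can be illustrated with a small score filter. The detections and their scores below are invented for the example; only the mechanism (keep an instance when its confidence reaches the threshold) reflects what the reply describes.

```python
def filter_by_score(detections, threshold):
    """Keep only detections whose confidence is at least `threshold`."""
    return [d for d in detections if d["score"] >= threshold]


# Hypothetical detections for one frame (scores are made up).
detections = [
    {"label": "car", "score": 0.98},
    {"label": "truck", "score": 0.55},  # the van right in front of the camera
    {"label": "clock", "score": 0.40},  # a tunnel fan mistaken for a clock
]

strict = filter_by_score(detections, 0.7)  # the truck is dropped
loose = filter_by_score(detections, 0.3)   # the truck appears, so does the false clock
```

Lowering the threshold recovers the weakly scored truck, but it also lets the spurious detection through, which is the "far more bad results" mentioned above.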

  • @aiai-nn5jd
    @aiai-nn5jd 6 years ago +1

    Thanks for sharing this video. How can I get this code to learn from?

  • @bigbook5307
    @bigbook5307 6 years ago +1

    Is there a cpp implementation with caffe?? I have not found on github.

    • @KarolMajek
      @KarolMajek 6 years ago

      +Xuling Chang I don't know any. I googled but haven't found any results

  • @shahriarshakhawat4605
    @shahriarshakhawat4605 5 years ago

    I want to learn this stuff. Just installed CVAT on my MacBook Air.
    Can you please suggest where I should start? I have pretty vague knowledge about these things.

    • @KarolMajek
      @KarolMajek 4 years ago

      I don't think cvat is a good start, but it depends what is your goal, what do you want to achieve

  • @YanuarTriAdityaNugraha

    Really good implementation, but for safety-critical use cases I'd limit the net to vehicle and pedestrian detection only, probably with an extension to traffic light detection (spot on). There are plenty of mislabels for traffic signs; almost all signs were detected as stop signs.

  • @jinwu4715
    @jinwu4715 6 years ago +2

    Karol, Super cool video! I wonder where you got those input videos. Can I use them in my own demo?
    Best,

    • @KarolMajek
      @KarolMajek 6 years ago

      My own, recorded with a phone. Yes, go ahead! Can you share results publicly? Would be great to see!

    • @jinwu4715
      @jinwu4715 6 years ago

      Will do if I have a proper video produced. Thanks!

  • @HumbertoManeto
    @HumbertoManeto 3 years ago

    Is it possible to measure movement speed?

  • @cemrekaplan1480
    @cemrekaplan1480 5 years ago +2

    you've been watched
    -person of interest

    • @battalll
      @battalll 4 years ago

      Bro, I think I've seen you before in the comments of another AI video

    • @vardaan3282
      @vardaan3282 4 years ago

      It was You are being watched! I think!

  • @yuxiangwu4573
    @yuxiangwu4573 6 years ago +1

    Nice result

  • @hawklee8582
    @hawklee8582 6 years ago +1

    Great job, but I wonder if it is possible to hide the square outline and the shade of each detected object. Instead, I want TensorFlow to tell us the name of an object when we touch it on the screen.

    • @KarolMajek
      @KarolMajek 6 years ago

      Nice idea for an app. It is possible

    • @hawklee8582
      @hawklee8582 6 years ago

      Could you show me how to do it?

  • @user-qr2fe9sv8p
    @user-qr2fe9sv8p 6 years ago +2

    Hi, what are the hardware requirements for it to run in real-time? Thanks.

    • @Chocolovers
      @Chocolovers 3 years ago

      It's not meant for real time, too much weight

    • @KarolMajek
      @KarolMajek 3 years ago

      True. And it's quite old. Check YOLACT and DETR

  • @lukas_543
    @lukas_543 4 years ago

    is the underlying data available (video + instance segmentations for each frame)?

  • @primodernious
    @primodernious 4 years ago

    Now we just need to make it detect more details of objects: translate speed signs into numbers for speed regulation, get enough detail to tell whether a car is a Ferrari or a Ford, detect the red, green and blue colors on traffic lights, and make it stop on a red light or if a person is in the way. It would also be cool if the program could detect whether the person was male or female, had long or short hair, wore sunglasses or not, and other details. I hope such detail will be included in future versions.

  • @Kevin-ze5gk
    @Kevin-ze5gk 6 years ago +1

    what Hardware did you use for the recognition and on how much FPS you run it?

    • @KarolMajek
      @KarolMajek 6 years ago

      desktop 1080. it was slow. I don't remember now. In next mask RCNN vid I will try to add FPS

  • @HomelessRafi
    @HomelessRafi 6 years ago +2

    I am trying to segment and detect 2 different images, however the results i get are identical so that every image has the masks and detection of only the first image. Does anyone else have this problem?

    • @KarolMajek
      @KarolMajek 6 years ago

      You mean to put 2 images as input (batch size=2), not to run prediction 2 times (run 2x with batch size=1)? Can you share your code?

    • @HomelessRafi
      @HomelessRafi 6 years ago

      The first part. Currently I have the prediction running 2 times (batch size=1). But I want to be able to run 2 images simultaneously with the mask updating for each image.I hope I explained that clearly, let me know.

    • @HomelessRafi
      @HomelessRafi 6 years ago

      Quick follow up question, how did you get the mask to stay a constant color for each frame?

  • @tomaszgiba
    @tomaszgiba 6 years ago +1

    Impressive! This looks like V.A.T.S.

  • @DanFrederiksen
    @DanFrederiksen 4 years ago

    Funny that it sees the fans in the tunnels as clocks.

  • @fabricmax7024
    @fabricmax7024 6 years ago +1

    Hey dude, I also made a video demo and i had a webcam test, but the speed is very slow, only about 1~2 fps. Do you know how to improve the test speed ? I want to make a real time application.

    • @KarolMajek
      @KarolMajek 6 years ago

      Same for me, I also have 1-2 fps. This video is not realtime. I increased speed to match 30fps after detection

    • @fabricmax7024
      @fabricmax7024 6 years ago

      Yeah, same for me. I tested on a GTX TITAN X, what about 1080 ti? If you have progress about speed, could you please update your github? Appreciate it a lot!

    • @KarolMajek
      @KarolMajek 5 years ago

      GTX 980m, measured only the time of inference. There are computational heavy steps required to blend masks with input video

  • @fackdog
    @fackdog 1 year ago

    *Top!*

  • @faruknane
    @faruknane 5 years ago

    how much space is required for the network model to be trained? in my case for 400x400 image (batchsize=14) I need 11gb memory (U net basic model).

    • @KarolMajek
      @KarolMajek 5 years ago +1

      For mask RCNN I don't know. If you reduce batch size you can use smaller GPU, but bigger batch is better. Faster RCNN with NASNet requires 8gb for 1 image in batch...

    • @faruknane
      @faruknane 5 years ago

      @@KarolMajek 8gb is really huge wow. Thank you for the information.

    • @KarolMajek
      @KarolMajek 5 years ago +1

      Go to tensorflow object detection model zoo and check prediction speed. Slower nets need more memory to train

  • @magdalenamackowiak1065
    @magdalenamackowiak1065 3 years ago +1

    Hello,
    Is it possible to carry out this kind of project in the Google Colab environment?

    • @KarolMajek
      @KarolMajek 3 years ago

      It can be run, but Colab is very thankless. The Pro version has more powerful machines, but it works out better to run it on a laptop: the session won't suddenly end and you don't have to adapt the code. You're welcome to join my mailing list, it's more convenient to talk there, or use some contact form. Is this specifically about running inference to get instance segmentation? Maybe boxes would be enough after all, since they are faster

  • @parkerlopez3197
    @parkerlopez3197 2 years ago +2

    The most impressive part of this is the detection of the car via a reflection on the glass

  • @abdullahzaqebah7919
    @abdullahzaqebah7919 4 years ago

    Hi,
    when I run the code, in the step "Create Model and Load Trained Weights"
    I get this error:
    AttributeError: 'TensorShape' object has no attribute 'rank'
    Any help?

    • @KarolMajek
      @KarolMajek 4 years ago +1

      I think you're using newer version of TensorFlow

    • @abdullahzaqebah7919
      @abdullahzaqebah7919 4 years ago

      @@KarolMajek am using tensorflow 1.9.0

    • @KarolMajek
      @KarolMajek 4 years ago

      @@abdullahzaqebah7919 Try Matterport original repo github.com/matterport/Mask_RCNN and their updated demo github.com/matterport/Mask_RCNN/blob/master/samples/demo.ipynb
      TF version seems ok (>1.3.0)

  • @my1347
    @my1347 4 years ago

    In order to get better results, do you train with 4K image pairs?

    • @KarolMajek
      @KarolMajek 4 years ago +1

      This is original result from a model trained on COCO dataset. 4k video is for test purpose only

    • @my1347
      @my1347 4 years ago +1

      @@KarolMajek got it, Thank you!

  • @gilmeregildo1093
    @gilmeregildo1093 5 years ago +1

    why its hard to breath while working on this??

    • @KarolMajek
      @KarolMajek 5 years ago

      Is it? Maybe it's the air pollution

  • @jaydipsinhparmar6010
    @jaydipsinhparmar6010 6 years ago +1

    Hello sir, I have downloaded the source from the link but I am not able to run the project. Please give me the documentation if you have any.

    • @KarolMajek
      @KarolMajek 6 years ago

      You will need python, tensorflow and jupyter notebook installed. Then try to run this: github.com/karolmajek/Mask_RCNN/blob/master/demo.ipynb

    • @jaydipsinhparmar6010
      @jaydipsinhparmar6010 6 years ago

      First, thanks for your reply. I have already installed Python, TensorFlow and Jupyter Notebook. I get some CUDA errors; can you give the specific versions of TensorFlow and the CUDA toolkit?

  • @dantwister5106
    @dantwister5106 3 years ago

    Transparent colors look great. The question is, is it possible to integrate this into Google Glass, so that you have a real-life color hack? I think it is, because it is easy to process screenshots in real time at 640x360 resolution, no?

    • @KarolMajek
      @KarolMajek 3 years ago

      Thanks, mask rcnn is too slow. Check YOLACT or DETR

  • @deacix
    @deacix 6 years ago +1

    Crazy shit!

  • @kanimozhisoundararajan9850

    Hai !! am new to this mask RCNN. but i was really impressed by ur work.am having some doubt in running the github work. Can i mail u ?

  • @twiekfkwniejwjrj7867
    @twiekfkwniejwjrj7867 1 year ago

    what is the mask?

  • @muhammadmonjurulkarim1614

    how did you set same color for all cars?

    • @KarolMajek
      @KarolMajek 5 years ago

      I am going to share this code finally, but have too many other things. Thank you for patience

  • @yuejohnson7661
    @yuejohnson7661 6 years ago +1

    Hi, It is very good job , and I can run matterport/Mask_RCNN but it is so slow(GTX 1060 , about 450ms per frame which frame size is 640*480 ), How fast you are?

    • @KarolMajek
      @KarolMajek 6 years ago

      Thanks! I confirm this is slow. It took day or two to process this video :-) I need to add fps display next time

    • @yuejohnson7661
      @yuejohnson7661 6 years ago +1

      I'm sorry, I didn't see the previous comments. Maybe most of us are interested in using Mask RCNN in real time.
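
The figures in this thread (about 450 ms per frame, and "a day or two" for the whole video) can be related with simple arithmetic. The helper names are made up, and the 20-minute, 30 fps, 1 second per frame example is a hypothetical scenario rather than a measurement of this video.

```python
def fps_from_ms(ms_per_frame):
    """Convert a per-frame latency in milliseconds to frames per second."""
    return 1000.0 / ms_per_frame


def processing_hours(duration_s, video_fps, seconds_per_frame):
    """Estimate offline processing time for a whole video, in hours."""
    total_frames = duration_s * video_fps
    return total_frames * seconds_per_frame / 3600.0


fps_640x480 = fps_from_ms(450)                  # roughly 2.2 frames per second
hours_needed = processing_hours(1200, 30, 1.0)  # 20 min clip at 30 fps, 1 s/frame
```

Under those assumed numbers the inference alone takes 10 hours, before any mask blending or video encoding, which is in line with a multi-hour to multi-day offline run.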

  • @neparkiraj
    @neparkiraj 5 years ago

    Can you make it recognize different types of trees from a long way?

    • @KarolMajek
      @KarolMajek 5 years ago

      You will need manually annotated dataset. But from the large distance the results will not be perfect

  • @thejanwijesinghe2393
    @thejanwijesinghe2393 4 years ago

    What is the average fps it could run on a GPU like titan x?

    • @KarolMajek
      @KarolMajek 4 years ago

      Hard to say
      Ignore Mask RCNN, check YOLACT - it's much faster!

  • @hlliu8638
    @hlliu8638 6 years ago

    Good! How to deal with the video? I can only use the source code to detect a single picture.

    • @KarolMajek
      @KarolMajek 6 years ago +1

      Maybe I forgot to push. Basically you need to predict for every frame and then make a video from the results. I am using ffmpeg and can recommend it
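
The workflow in the reply above (one prediction per frame, images written to disk, then ffmpeg to assemble the video) can be sketched as follows. The author's actual file naming and ffmpeg invocation are not given, so the zero-padded naming scheme, the helper names and the encoder settings here are assumptions; the flags themselves are standard ffmpeg options.

```python
def frame_path(out_dir, index):
    """Path for the rendered frame number `index` (zero-padded, hypothetical scheme)."""
    return f"{out_dir}/{index:06d}.png"


def ffmpeg_command(out_dir, fps, output):
    """Build an ffmpeg argument list that joins numbered PNG frames into a video."""
    return [
        "ffmpeg",
        "-framerate", str(fps),        # playback rate of the image sequence
        "-i", f"{out_dir}/%06d.png",   # must match the naming in frame_path
        "-c:v", "libx264",             # H.264 encoding
        "-pix_fmt", "yuv420p",         # widely compatible pixel format
        output,
    ]


cmd = ffmpeg_command("frames", 30, "out.mp4")
```

The list can be passed to `subprocess.run(cmd, check=True)` once every frame has been predicted and saved under `frame_path`. Writing frames at the detector's own pace and encoding at 30 fps afterwards is what makes a slow model look real time in the final video.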

  • @yesu399
    @yesu399 6 years ago

    Is it real-time? Or do you analyze each frame and write masks onto video?

    • @KarolMajek
      @KarolMajek 6 years ago

      Not realtime

    • @yesu399
      @yesu399 6 years ago +1

      Emmmm.. So you process image by image and combine into a video?

    • @KarolMajek
      @KarolMajek 6 years ago +1

      Correct. Mask RCNN is slow. I plan to run 4 other Mask RCNNs with different detectors and will put the FPS there

    • @yesu399
      @yesu399 6 years ago

      Yeah. It's indeed slow, I apply it on my own dataset and the best I got is 5 FPS. Do you have any recommendation for accelerating the prediction? Like any nice paper? Thank you!

    • @KarolMajek
      @KarolMajek 6 years ago

      Not yet, I think solution is waiting somewhere, but I haven't found it. I would give a try to any pytorch implementation. Should be faster because of channel-first image representation

  • @matiqulislam4026
    @matiqulislam4026 6 years ago

    I have successfully run the demo of Mask-RCNN but problem is that it does not display the output image

    • @KarolMajek
      @KarolMajek 6 years ago

      matiqul islam you can use matplotlib or OpenCV imshow. I can send you demo code via email if you want

    • @matiqulislam4026
      @matiqulislam4026 6 years ago

      Thanks brother please send me the code. My email address is : matiqul06@gmail.com

    • @matiqulislam4026
      @matiqulislam4026 6 years ago

      Dear Karol Majek I have another Question: Do you tell me Where I get or How I can generate "mask_rcnn_shapes.h5" Shapes trained weights. Also send me the demo code of my email: matiqul06@gmail.com

  • @Tetsujinfr
    @Tetsujinfr 4 years ago

    How is it possible that with such a high resolution the detection is so unstable? E.g. the utility vehicle at 10:00 right in front of the camera goes undetected for several seconds...

    • @KarolMajek
      @KarolMajek 4 years ago

      It's not a matter of resolution. This net was trained on COCO, and here you have images from different distribution.
      I think Mask RCNN trained on Open Images v5 would work much better
      check also YOLACT which is much faster

  • @shikharsaxena9989
    @shikharsaxena9989 3 years ago

    Is this semantic segmentation, because all the cars have the same colour?

    • @KarolMajek
      @KarolMajek 3 years ago

      No, it's instance segmentation. I set consistent colors for each class
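
Several commenters ask how every instance of a class ends up with the same color. The author's code is not shown in this thread, so the following is only one plausible way to do it: derive a fixed color from the class id instead of picking a random color per instance, then alpha-blend that color into the masked pixels. All function names, the hue-spacing scheme and the 0.5 alpha are assumptions.

```python
import colorsys


def class_colors(num_classes):
    """One fixed RGB color (0-255) per class id, evenly spaced in hue."""
    colors = []
    for i in range(num_classes):
        r, g, b = colorsys.hsv_to_rgb(i / num_classes, 1.0, 1.0)
        colors.append((int(r * 255), int(g * 255), int(b * 255)))
    return colors


def blend_pixel(pixel, color, alpha=0.5):
    """Mix a mask color into one RGB pixel."""
    return tuple(int((1 - alpha) * p + alpha * c) for p, c in zip(pixel, color))


def apply_mask(image, mask, color, alpha=0.5):
    """Blend `color` into `image` wherever `mask` is True.

    image: rows of RGB tuples; mask: rows of booleans with the same shape.
    """
    return [
        [blend_pixel(px, color, alpha) if m else px for px, m in zip(row, mask_row)]
        for row, mask_row in zip(image, mask)
    ]
```

Because the palette depends only on the class id, a car keeps the same color in every frame while each instance still has its own mask. A real pipeline would do the blending with array operations rather than Python lists; the lists here just keep the sketch dependency-free.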

  • @PrasannaRoutray97
    @PrasannaRoutray97 4 years ago

    Inference speed? Can we use it in real-time?

    • @KarolMajek
      @KarolMajek 4 years ago

      It's super slow.
      For realtime check: YOLACT

  • @abbkeihatsu
    @abbkeihatsu 6 years ago

    Can it calculate trajectory and velocity?

    • @bormisha
      @bormisha 5 years ago +1

      I guess this is still image processing applied to every frame of the video, so the results are not being correlated between frames. Compared to this, trajectory and velocity estimation once the objects have been detected and classified seems an easier task to me.
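
As the reply above suggests, once the same object has been identified in two consecutive frames, estimating its motion is comparatively simple. The sketch below measures image-space speed from box centroids; the function names and the `(x1, y1, x2, y2)` box layout are assumptions, and the frame-to-frame matching itself is not part of this video's detector.

```python
import math


def centroid(box):
    """Center point of a box given as (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)


def speed_px_per_s(box_prev, box_curr, fps):
    """Speed of one object between two consecutive frames, in pixels per second."""
    (x0, y0), (x1, y1) = centroid(box_prev), centroid(box_curr)
    return math.hypot(x1 - x0, y1 - y0) * fps


# Hypothetical object that moved 3 px right and 4 px down between frames at 30 fps.
speed = speed_px_per_s((0, 0, 10, 10), (3, 4, 13, 14), 30)
```

This is only a pixel-space speed. Turning it into metres per second would need camera calibration and depth information, and with a moving camera the camera's own motion would have to be removed first.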

  • @user-ep6sq9nh8p
    @user-ep6sq9nh8p 5 years ago

    Can you tell me about hardware requirement for this?

    • @KarolMajek
      @KarolMajek 5 years ago

      NVIDIA GPU. With 980M it works pretty slow. Now you can find faster methods. This was below 1fps as I remember

  • @jeffersondesouza7783
    @jeffersondesouza7783 3 years ago

    Good evening,
    If you could add English subtitles, I could set up automatic translation to Portuguese and follow along more easily.
    I have been improving my English, but I still have difficulty with some terms and I get the feeling I am missing something important.
    You have an extensive range of excellent material, so if you could add subtitles to it, that would be a huge help.
    Once again, congratulations on the excellent work.
    Thank you,

  • @emretatbak
    @emretatbak 5 years ago +1

    Hello what is your gpu and cpu?

  • @2mvX
    @2mvX 3 years ago

    Do you think there's a chance of implementing this on a Raspberry? What hardware does it require?

    • @KarolMajek
      @KarolMajek 3 years ago

      On the RPi even plain boxes run slowly. Take a look at Tencent NCNN. Take a look at NanoDet (there will be a video here tomorrow)

    • @KarolMajek
      @KarolMajek 3 years ago

      By the way, do you know DeepDrive.pl/30
      A few videos about networks + a few months of emails about Deep Learning

  • @ltsgroupcompany
    @ltsgroupcompany 4 years ago

    Hello, this video is so nice! Could I use this video for my youtube channel? If it is okay, then I will put the link of the video on description box.

    • @KarolMajek
      @KarolMajek 4 years ago

      Yes, ok. Can you put here the link to your video so I can share it in a post?

    • @ltsgroupcompany
      @ltsgroupcompany 4 years ago

      @@KarolMajek Thank you so much :)) ok, I will.

    • @ltsgroupcompany
      @ltsgroupcompany 4 years ago

      @@KarolMajek czcams.com/video/E-52wXmG-1Q/video.html
      3:17 - 3:25 I used the clip from this video for my youtube channel! Hope you enjoy this video!!

  • @masifakbar
    @masifakbar 4 years ago +1

    can i see the code to this project. your repository only have gifs and base code

    • @KarolMajek
      @KarolMajek 4 years ago

      Are you looking for code for inference on video?
      I was modifying this demo notebook github.com/karolmajek/Mask_RCNN/blob/master/demo.ipynb

    • @masifakbar
      @masifakbar 4 years ago

      @@KarolMajek but how did u read video, frame by frame or did u use any other method? And any idea on how to use imagenet weights and dataset instead of Coco??

  • @kaibinxu6090
    @kaibinxu6090 6 years ago +2

    Around 10:00 there are moments the mask rCNN doesn't recognize the truck right in front. What may cause it?

    • @jerelvelarde2829
      @jerelvelarde2829 5 years ago

      It's not trained to recognize it.

    • @kawo666
      @kawo666 5 years ago

      @@jerelvelarde2829 But if it's not trained to recognize it, why does it recognize it in other frames?

    • @aniketji1435
      @aniketji1435 5 years ago +1

      It happens when the accuracy of detection falls below a given threshold which is set as criteria to display. And in continuous videos only we find such glitches where the frame might recognise it as 2 objects with less confidence in each.

    • @shairozsohail1059
      @shairozsohail1059 5 years ago

      Could be because the tires are occluded and because of the lack of reflective surface/depth

  • @juliorivera7663
    @juliorivera7663 6 years ago +1

    Hi, there. Good job for this! I'm really surprised!
    By this day, I'm working with Mask RCNN, but I could not figure out how to implement a live video, would give me a tip or something? I would really appreciate it! Thanks anyway!

    • @KarolMajek
      @KarolMajek 6 years ago +2

      Julio Rivera basically I am doing prediction for every frame and writing result to file. Then I am using ffmpeg to create a video. Of course you can use camera instead of mp4 file thanks to opencv.

    • @juliorivera7663
      @juliorivera7663 6 years ago

      Thank you!

    • @KarolMajek
      @KarolMajek 6 years ago +1

      Julio Rivera if you will need some help, feel free to ask

    • @juliorivera7663
      @juliorivera7663 6 years ago

      Karol Majek Cool! Do you have an email? Mine is juliorivera.rivas2013@gmail.com
      You are the man, haha

    • @KarolMajek
      @KarolMajek 6 years ago +1

      Julio Rivera karolmajek@gmail.com

  • @andersonriciamorim4895

    Which FPS did you achieve?

    • @KarolMajek
      @KarolMajek 4 years ago

      Super slow, seconds per frame. For faster inference check detectron2 and yolact

  • @aaronag7876
    @aaronag7876 3 years ago +1

    interesting the system picked up cars in the reflection of windows

  • @saifu2write
    @saifu2write 4 years ago

    source: github.com/karolmajek/Mask_RCNN
    please tell me how to work and install to proper steps.
    because some error i am facing.
    please help me.

  • @s2maschmeyer
    @s2maschmeyer 5 years ago

    It can't seem to decide if motorcycles are cars or people? On a highway, I guess it doesn't matter in an accident....very little protection. Motorcycles are sometimes more unpredictable than cars and therefore, higher danger....It's a Bird... It's a Plane... It's a Boat.... It's Superman!

  • @t-lm
    @t-lm 5 years ago

    Very nice job, Could you please add code example for IP camera live RTSP streaming on github? For static counting cars usage.

    • @KarolMajek
      @KarolMajek 5 years ago

      You will not get live detections since mask RCNN is pretty slow. You can use opencv to receive such stream. Then just pass the image through net and you will see your results

    • @t-lm
      @t-lm 5 years ago

      @@KarolMajek I use ROCM with Sapphire NITRO+ RX 580 8 GB and no issue with live detection from webcam, i just need to figure out how to do it with IP cam.

    • @KarolMajek
      @KarolMajek 5 years ago

      Ok. Just use opencv camera capture with url as input

    • @t-lm
      @t-lm 5 years ago

      @@KarolMajek i did that but after a few sec it crashes with this: [rtsp @ 0x7f762b264700] RTP: PT=60: bad cseq 4e86 expected=30c3 i think is a FFmpeg bug.

    • @KarolMajek
      @KarolMajek 5 years ago

      Can be. I don't have solution for that problem

  • @bryangarcia4153
    @bryangarcia4153 5 years ago +1

    This is not live, right?

    • @KarolMajek
      @KarolMajek 5 years ago

      Yes, it's super slow. Check my newest video, there is something much more online

  • @Booruvcheek
    @Booruvcheek 4 years ago

    Source video not available

    • @KarolMajek
      @KarolMajek 4 years ago

      It's THE DAY - I finally update the link in desc:
      Input 4K video: [NEW LINK!!!]
      archive.org/details/0002201705192

  • @zieddatascientist546
    @zieddatascientist546 3 years ago

    I'm always asking my students what we can do with that. It's beautiful, nice, etc., but why?

    • @AashuPrasad
      @AashuPrasad 3 years ago

      Of many possibilities, it can be used as a vision system for self driving cars!

    • @stangerdanger7252
      @stangerdanger7252 3 years ago

      Terminators will use this to identify Humans and Machines.

  • @gemobnyi4474
    @gemobnyi4474 6 years ago

    inspired by bigpackets

  • @Ken-kt9br
    @Ken-kt9br 6 years ago

    Reported for walls, enjoy your vac

  • @arcadetv8537
    @arcadetv8537 6 years ago

    where is it? turkey?

  • @sekolahrakyat4434
    @sekolahrakyat4434 6 years ago +1

    Hi carol can u send me the code ?

    • @sekolahrakyat4434
      @sekolahrakyat4434 6 years ago

      Please send me @ sarwo.jowo@gmail.com

    • @KarolMajek
      @KarolMajek 6 years ago

      look in description - github.com/karolmajek/Mask_RCNN

  • @fetullahatas3927
    @fetullahatas3927 4 years ago

    what is the fps

    • @KarolMajek
      @KarolMajek 4 years ago

      Really low. Seconds per frame is a better metric
      Check Detectron2 or YOLACT

    • @fetullahatas3927
      @fetullahatas3927 4 years ago

      @@KarolMajek Thanks for reply, I did train YOLACT, accuracy is nothing close to Maskrcnn

    • @fetullahatas3927
      @fetullahatas3927 4 years ago

      @@KarolMajek I actually just need a real-time detector, in my scenario there is a lot of overlapping objects, do you have any recommendations for this task ?

    • @KarolMajek
      @KarolMajek 4 years ago

      @@fetullahatas3927 then check mask rcnn in detectron2

  • @uccao304
    @uccao304 6 years ago

    code

  • @chuckiepan6380
    @chuckiepan6380 6 years ago

    Awesome

  • @cezexcezex9888
    @cezexcezex9888 6 years ago

    skynet

  • @neptune4909
    @neptune4909 3 years ago

    esp irl

  • @sampark8422
    @sampark8422 6 years ago

    in real life wall hacks?

  • @HikariSakai
    @HikariSakai 5 years ago +1

    Nice aimbot

  • @AnshumanKumar007
    @AnshumanKumar007 4 years ago

    This video was sponsored by the rich people with awesome GPU gang.

    • @KarolMajek
      @KarolMajek 4 years ago +1

      3yo laptop with GTX980, nothing special

  • @PIDOtomasyon
    @PIDOtomasyon 4 years ago

    I can send a gift, but Paypal blocked in here(Turkey), I can send over Patreon.

    • @KarolMajek
      @KarolMajek 4 years ago

      Thank you, what about Revolut?
      I had created patreon, let me check.
      Thanks!

    • @KarolMajek
      @KarolMajek 4 years ago

      www.patreon.com/karolmajek

  • @cjhuang3835
    @cjhuang3835 2 years ago

    nice try