Elastic Shape-From-Template With Spatially Sparse Deforming Forces | Spotlight 1-1B

Abed Malti; Cédric Herzet
Current elastic Shape-from-Template (SfT) methods are based on l2-norm minimization. None can accurately recover the spatial location of the acting forces, since l2-norm-based minimization tends to find the best tradeoff among noisy data to fit an elastic model. In this work, we study shapes that are deformed by a spatially sparse set of forces. We propose two formulations for a new class of SfT problems, dubbed here SLE-SfT (Sparse Linear Elastic SfT). The first, ideal formulation uses an l0-norm to minimize the number of non-zero components of the deforming forces. The second, relaxed formulation uses an l1-norm to minimize the sum of absolute values of the force components. These new formulations do not use Solid Boundary Constraints (SBC), which are usually needed to rigidly position the shape in the frame of the deformed image. We introduce the Projective Elastic Space Property (PESP), which jointly encodes the reprojection constraint and the elastic model. We prove that satisfying this property is necessary and sufficient for the relaxed formulation to (i) retrieve the ground-truth 3D deformed shape and (ii) recover the correct spatial domain of non-zero deforming forces. (iii) It also follows that the deformed shape can be rigidly placed in the image frame without using SBC. Finally, we prove that when PESP is satisfied, solving the relaxed formulation yields the same ground-truth solution as the ideal formulation. Results with simulated and real data show substantial improvements in recovering the deformed shapes as well as the spatial location of the deforming forces.
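
The contrast between l2-based and l1-based recovery described in the abstract can be illustrated with a small, generic sparse-recovery sketch. This is not the authors' SLE-SfT method: the orthonormal matrix A, the force vector f_true, the noise values, and the weight lam are all hypothetical stand-ins chosen only for illustration, and the solver is plain iterative soft-thresholding (ISTA) for an l1-penalized least-squares problem.

```python
# Toy comparison of l2 (least-squares) and l1 (sparse) force recovery.
# Hypothetical setup: A, f_true, noise, and lam are illustrative choices,
# not quantities taken from the paper.

def matvec(M, v):
    """Multiply matrix M (list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def transpose(M):
    """Return the transpose of M as a list of rows."""
    return [list(col) for col in zip(*M)]

def soft_threshold(v, t):
    """Proximal operator of t*|.|: shrink toward zero, clip small values to 0."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0

def ista(A, b, lam, step=1.0, iters=50):
    """Iterative soft-thresholding for min_f 0.5*||A f - b||^2 + lam*||f||_1.

    step=1.0 is valid here only because the columns of A are orthonormal.
    """
    At = transpose(A)
    f = [0.0] * len(A[0])
    for _ in range(iters):
        residual = [p - q for p, q in zip(matvec(A, f), b)]
        grad = matvec(At, residual)
        f = [soft_threshold(fi - step * gi, step * lam) for fi, gi in zip(f, grad)]
    return f

# Orthonormal 4x4 matrix (scaled Hadamard) standing in for the linear model.
A = [[0.5, 0.5, 0.5, 0.5],
     [0.5, -0.5, 0.5, -0.5],
     [0.5, 0.5, -0.5, -0.5],
     [0.5, -0.5, -0.5, 0.5]]

f_true = [0.0, 3.0, 0.0, 0.0]           # a single active "force"
noise = [0.05, -0.02, 0.03, 0.04]       # small measurement noise
b = [y + n for y, n in zip(matvec(A, f_true), noise)]

# With orthonormal columns, the least-squares (l2) estimate is simply A^T b.
f_l2 = matvec(transpose(A), b)
f_l1 = ista(A, b, lam=0.1)

support_l2 = [i for i, v in enumerate(f_l2) if v != 0.0]
support_l1 = [i for i, v in enumerate(f_l1) if v != 0.0]
print("l2 support:", support_l2)
print("l1 support:", support_l1)
```

In this toy case the least-squares estimate spreads the noise over every component, so its support says nothing about where the force acts, while the l1 estimate keeps only the active index. The price is a slight shrinkage of the recovered magnitude, which is a known side effect of the l1 penalty.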

Views: 205

Video

Consistent-Aware Deep Learning for Person Re-Identification in a Camera Network | Spotlight 2-2B

3:48

169 views · 3 months ago

Authors: Ji Lin; Liangliang Ren; Jiwen Lu; Jianjiang Feng; Jie Zhou Description: In this paper, we propose a consistent-aware deep learning (CADL) framework for person re-identification in a camera network. Unlike most existing person re-identification methods, which identify whether two body images are from the same person, our approach aims to obtain the maximal correct matches for the whole camera network. Differe...

Deeply Aggregated Alternating Minimization for Image Restoration | Spotlight 1-1C

3:57

142 views · 3 months ago

Authors: Youngjung Kim; Hyungjoo Jung; Dongbo Min; Kwanghoon Sohn Description: Regularization-based image restoration has remained an active research topic in image processing and computer vision. It often leverages a guidance signal captured in different fields as an additional cue. In this work, we present a general framework for image restoration, called deeply aggregated alternating minimization (DeepAM). We pro...

Probabilistic Temporal Subspace Clustering | Spotlight 3-1A

3:59

66 views · 3 months ago

Authors: Behnam Gholami; Vladimir Pavlovic Description: Subspace clustering is a common modeling paradigm used to identify constituent modes of variation in data with locally linear structure. These structures are common to many problems in computer vision, including modeling time series of complex human motion. However, classical subspace clustering algorithms learn the relationships within a set of data without con...

Panelformer: Sewing Pattern Reconstruction From 2D Garment Images

8:20

161 views · 3 months ago

Authors: Cheng-Hsiu Chen; Jheng-Wei Su; Min-Chun Hu; Chih-Yuan Yao; Hung-Kuo Chu Description: In this paper, we present a novel approach for reconstructing garment sewing patterns from 2D garment images. Our method addresses the challenge of handling occlusion in 2D images by leveraging the symmetric and correlated nature of garment panels. We introduce a transformer-based deep neural network c...

Guided Distillation for Semi-Supervised Instance Segmentation

9:47

365 views · 3 months ago

Authors: Tariq Berrada; Camille Couprie; Karteek Alahari; Jakob Verbeek Description: Although instance segmentation methods have improved considerably, the dominant paradigm is to rely on fully annotated training images, which are tedious to obtain. To alleviate this reliance, and boost results, semi-supervised approaches leverage unlabeled data as an additional training signal that limits over...

MetaSeg: MetaFormer-Based Global Contexts-Aware Network for Efficient Semantic Segmentation

8:30

155 views · 3 months ago

Authors: Beoungwoo Kang; Seunghun Moon; Yubin Cho; Hyunwoo Yu; Suk-Ju Kang Description: Beyond the Transformer, it is important to explore how to exploit the capacity of the MetaFormer, an architecture that is fundamental to the performance improvements of the Transformer. Previous studies have exploited it only for the backbone network. Unlike previous studies, we explore the capacity of the M...

Gradient-Guided Knowledge Distillation for Object Detectors

9:37

233 views · 3 months ago

Authors: Qizhen Lan; Qing Tian Description: Deep learning models have demonstrated remarkable success in object detection, yet their complexity and computational intensity pose a barrier to deploying them in real-world applications (e.g., self-driving perception). Knowledge Distillation (KD) is an effective way to derive efficient models. However, only a small number of KD methods tackle object...

Small Objects Matters in Weakly-Supervised Semantic Segmentation

9:57

167 views · 3 months ago

Authors: Cheolhyun Mun; Sanghuk Lee; Youngjung Uh; Junsuk Choe; Hyeran Byun Description: Weakly-supervised semantic segmentation (WSSS) performs pixel-wise classification given only image-level labels for training. Despite the difficulty of this task, the research community has achieved promising results over the last five years. Still, current WSSS literature misses the detailed sense of how w...

BPKD: Boundary Privileged Knowledge Distillation for Semantic Segmentation

9:34

103 views · 3 months ago

Authors: Liyang Liu; Zihan Wang; Minh Hieu Phan; Bowen Zhang; Jinchao Ge; Yifan Liu Description: Current knowledge distillation approaches in semantic segmentation tend to adopt a holistic approach that treats all spatial locations equally. However, for dense prediction, students' predictions on edge regions are highly uncertain due to contextual information leakage, requiring higher spatial se...

Query-Guided Attention in Vision Transformers for Localizing Objects Using a Single Sketch

6:01

77 views · 3 months ago

TSP-Transformer: Task-Specific Prompts Boosted Transformer for Holistic Scene Understanding

3:03

49 views · 3 months ago

High-Fidelity Pseudo-Labels for Boosting Weakly-Supervised Segmentation

4:18

141 views · 3 months ago

Graph Neural Networks for End-to-End Information Extraction From Handwritten Documents

9:59

108 views · 3 months ago

Real-Time User-Guided Adaptive Colorization With Vision Transformer

8:21

192 views · 3 months ago

Authors: Gwanghan Lee; Saebyeol Shin; Taeyoung Na; Simon S. Woo Description: Recently, the vision transformer (ViT) has achieved remarkable performance in computer vision tasks and has been actively utilized in colorization. Vision transformer uses multi-head self attention to effectively propagate user hints to distant relevant areas in the image. However, despite the success of vision transfo...

Hierarchical Text Spotter for Joint Text Spotting and Layout Analysis

7:48

36 views · 3 months ago

Authors: Shangbang Long; Siyang Qin; Yasuhisa Fujii; Alessandro Bissacco; Michalis Raptis Description: We propose Hierarchical Text Spotter (HTS), a novel method for the joint task of word-level text spotting and geometric layout analysis. HTS can recognize text in an image and identify its 4-level hierarchical structure: characters, words, lines, and paragraphs. The proposed HTS is characteriz...

Towards Domain-Aware Knowledge Distillation for Continual Model Generalization

9:28

60 views · 3 months ago

Authors: Nikhil Reddy; Mahsa Baktashmotlagh; Chetan Arora Description: Generalization on unseen domains is critical for Deep Neural Networks (DNNs) to perform well in real-world applications such as autonomous navigation. However, catastrophic forgetting limits the ability of domain generalization and unsupervised domain adaptation approaches to adapt to constantly changing target domains. To ove...

Let’s Observe Them Over Time: An Improved Pedestrian Attribute Recognition Approach

6:26

139 views · 3 months ago

Authors: Kamalakar Vijay Thakare; Debi Prosad Dogra; Heeseung Choi; Haksub Kim; Ig-Jae Kim Description: Despite poor image quality, occlusions, and small training datasets, recent pedestrian attribute recognition (PAR) methods have achieved considerable performance. However, leveraging only spatial information of different attributes limits their reliability and generalizability. This paper int...

Improving Vision-and-Language Reasoning via Spatial Relations Modeling

7:44

44 views · 3 months ago

Authors: Cheng Yang; Rui Xu; Ye Guo; Peixiang Huang; Yiru Chen; Wenkui Ding; Zhongyuan Wang; Hong Zhou Description: Visual commonsense reasoning (VCR) is a challenging multi-modal task, which requires high-level cognition and commonsense reasoning ability about the real world. In recent years, large-scale pre-training approaches have been developed and promoted the state-of-the-art performance ...

Beyond Fusion: Modality Hallucination-Based Multispectral Fusion for Pedestrian Detection

7:42

39 views · 3 months ago

Improved Techniques for Quantizing Deep Networks With Adaptive Bit-Widths

8:17

32 views · 3 months ago

Missing Modality Robustness in Semi-Supervised Multi-Modal Semantic Segmentation

7:29

78 views · 3 months ago

Understanding Dark Scenes by Contrasting Multi-Modal Observations

8:08

26 views · 3 months ago

Can Vision-Language Models Be a Good Guesser? Exploring VLMs for Times and Location Reasoning

9:56

37 views · 3 months ago

Patch-Based Selection and Refinement for Early Object Detection

9:52

23 views · 3 months ago

Implicit Neural Representation for Change Detection

9:13

57 views · 3 months ago

Foundation Model Assisted Weakly Supervised Semantic Segmentation

7:43

84 views · 3 months ago

Unsupervised Graphic Layout Grouping With Transformers

4:41

20 views · 3 months ago

Beyond Classification: Definition and Density-Based Estimation of Calibration in Object Detection

7:51

25 views · 3 months ago

Universal Semi-Supervised Model Adaptation via Collaborative Consistency Training

3:19

27 views · 3 months ago

Comments

@safaridah-rn1yi 6 days ago
I want to contact you, email address please
@GirishKumar-de5op 7 days ago
we need any dataset or csv file to get height
@usernotfound778 12 days ago
That's awesome I have to implement this for my university graduation project can i have any way to contact with you i have some questions
@arvindaa 20 days ago
hello sir can you please made how to implement the project
@pinkomega4106 23 days ago
No calibration for the model in the experiment? The MLLS only performs well with proper calibration
@arvindaa 23 days ago
good job bro nice😃
@chenliangwang7882 25 days ago
I also particularly need a version with sound. It's a pity that such a great tutorial doesn't have any sound.
@abrahamvj3672 27 days ago
Can i get the codebase for the same if possible. Thank You
@shrikantAmbre-oq1kh 1 month ago
is it any code have you
@titirbiswas956 1 month ago
Where can I find the code ? If I could use it , it would be really helpful to me
@enestemel9490 1 month ago
A word of thanks to the speakers would be appreciated before continuing with the next speaker right away. Thank you to all the speakers for your contributions.
@bugdary 1 month ago
any github ?
@nobdan123 1 month ago
It’s either mute or black screen.
@rahulsinghgulia6666 1 month ago
Is there any reference Github project on this guided VAE model ?
@kesavasriramkothamasu3797 1 month ago
Nice presentation
@kaelthunderhoof5619 2 months ago
Can this be used for a web application for a salon virtual try on?
@omarh.l.l8127 2 months ago
Thanks for the presentation. Is the code open sourced? Can you share the repository if so? Thank you
@daimsharif1615 2 months ago
Daym
@iMaTzzz 2 months ago
Really great video and paper ! I'm quite surprised it hasn't been looked much before as I haven't seen a lot of papers in this topic. But I believe it may be quite useful in generic cases and could easily be used to train on a particular dataset so thank you, I really appreciate your work !
@JeracVideo 2 months ago
Hi, I'm searching for software that can detect eyes, nose, and ears on a human face in a point cloud and return the coordinates of those features. Then, I'd like to merge a second point cloud according to these coordinates. I'm not sure where to start. Can someone give me some hints? Thanks!
@abdulmuqtadir6470 2 months ago
since test dataset have no public gt, how you have measured ANLS and accuracy? can you please! explain.
@TheNitroPython 2 months ago
could you plz post any links to a github/paper to this?
@user-nx8nq7gj5g 2 months ago
code implementation ??
@danieleneh3193 2 months ago
Please can you help with the githu repo
@paone9851 2 months ago
What is this crap
@DH0809 2 months ago
🎉🎉🎉 great job guy! Are you have any plan to open source layerDoc pre-train model?
@anish2099 2 months ago
HII Do you have the implementation of this paper I need
@user-kb9ft3es8u 2 months ago
please give me video data set link
@danibot3000 2 months ago
It would be great to have a program to execute which then reads chart images and spits out a chart in excel etc.
@ZEENATFATIMA-jk1zl 2 months ago
sir your code is not fully available could you kindly guide about training and evaluation . and how raw* images obtained. it Would be helpful
@itgoons 2 months ago
can you pass on the source code link or any related material
@tiancaixinxin 2 months ago
Thanks for watching the video about our paper!
@chrisrosch4731 3 months ago
How well would you say does this work for example to determine whether a person looks creepy in an image. E.g. creepy stare
@dialectricStudios 3 months ago
Sick 😤
@waniubaid7718 3 months ago
Nice one, boss❤
@yyongfan 3 months ago
Is there a GEThib project?
@ashoknp 3 months ago
Thank you for sharing this algorithm and making the video. This is a very useful algorithm.
@pengfeichu7009 3 months ago
can you show your code
@user-ud8lp9gl6l 3 months ago
Where can I get the dataset?
@BatashIs 3 months ago
Can you share the code and make a video on image augmention without changing the value
@improvement_developer8995 3 months ago
Graft
@dreamzdziner8484 3 months ago
This looks super cool. Wish this gets a local install for windows😊
@rajpulapakura001 3 months ago
Hi Harsh, thanks for the awesome explanation and paper. I have a question. I am building a model for brain tumor segmentation. Each brain MRI scan contains 4 modalities: FLAIR, T1, T1CE, and T2, each providing complementary information about the subject's brain. Obviously, the reason I watched this video is because I wanted to create a segmentation model which could perform well even if some modalities were not available during inference. My question relates to the learnable token/embedding you use when you randomly mask the modalities during training. In the video you are using a dataset with 2 modalities, RGB and Depth. In my case, I have 4 modalities, so should I learn a "mask embedding" for each modality? Also, how do you learn this "mask embedding"? I would love to have a chat and discuss the details of your approach. Again, awesome paper and contributions. Hope to talk to you soon. Cheers, Raj, 16 y.o. ML Engineer from Australia
@user-ud1mv8zh2w 3 months ago
please how can i implement this with efficientnetV2 in keras
@dhirendrakumar5158 3 months ago
The work is very interesting. Please upload the code I could not access the code for the work.
@fadwa2325 3 months ago
Can I get the code please?
@user-wc2qm9sw9n 3 months ago
I actually like this topic
@mukeshkhanna305 3 months ago
Can I get the code for it?
@Harshit-cv4ie 3 months ago
why is this not presented by authors??

ComputerVisionFoundation Videos
