#032

  • Added 8 Sep 2024

Comments • 44

  • @oncedidactic
    @oncedidactic 3 years ago +12

    Strikes me that someone sharing very engineering and problem-solving oriented statements can be both extremely useful from the standpoint of a practitioner and also touching on very profound, edge-of-understanding areas. You don't always get this dual experience, so I really appreciated hearing from a guest like that. :D

  • @snarkyboojum
    @snarkyboojum 19 days ago

    I can't believe how much great content you've put out there on the internet. From one netizen to another - thank you!

  • @TheVigi99
    @TheVigi99 3 years ago +3

    As a speech researcher, it was insightful to hear Dr. Simon's thoughts on SimCLR. Following the same recipe on speech via SimCLR and similar frameworks seems to be a sample-efficient way to learn across various modalities.
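
For readers unfamiliar with the recipe being discussed, here is a minimal sketch of the NT-Xent contrastive loss at the core of SimCLR; the batch size, embedding dimension, and temperature below are illustrative, and the same loss applies whether the two "views" come from image or audio augmentations.

```python
import torch
import torch.nn.functional as F

def nt_xent_loss(z1, z2, temperature=0.5):
    """NT-Xent (normalized temperature-scaled cross-entropy) loss used in SimCLR.

    z1, z2: (N, D) embeddings of two augmented views of the same N samples.
    Each sample's positive is its other view; the remaining 2N - 2 embeddings
    in the batch act as negatives.
    """
    n = z1.size(0)
    z = F.normalize(torch.cat([z1, z2], dim=0), dim=1)         # (2N, D), unit norm
    sim = z @ z.t() / temperature                               # (2N, 2N) cosine similarities
    sim = sim.masked_fill(torch.eye(2 * n, dtype=torch.bool), float("-inf"))  # drop self-similarity
    targets = torch.cat([torch.arange(n) + n, torch.arange(n)])  # positive of i is i + n, and vice versa
    return F.cross_entropy(sim, targets)

# Two augmented "views" of a batch of 8 samples with 128-d embeddings (random here).
z1, z2 = torch.randn(8, 128), torch.randn(8, 128)
print(nt_xent_loss(z1, z2))
```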

  • @drhilm
    @drhilm 3 years ago +7

    So many practical insights. So fun to look under the hood of these popular papers.

  • @SawomirMucha
    @SawomirMucha 3 years ago +1

    Brilliant content! On the topic of a systematic way of finding data augmentations, it feels like DA helps in overcoming the "staticness" of the image domain (for natural images). As Dr. Simon noted, videos provide a lot of semantic information that's missing in the static image world. Looking at objects, we naturally find them "cropped" (obscured, behind other objects) or in different colour schemes throughout the day. Knowing that the main bottleneck is compute, it makes the whole thing even more exciting (assuming the exponential compute growth keeps up). Big breakthroughs ahead of us!
    Thank you once again for this show, you're doing a great job!

  • @snippletrap
    @snippletrap 3 years ago +3

    I'm with Yann LeCun on this one -- I think regularized latent variable models are ultimately the way to go. They reduce the Kolmogorov complexity of the learned model, constraining the size of the solution space, whereas contrastive learning pushes on individual points. To shape the space you're going to need a lot of points...

  • @flamboyanta4993
    @flamboyanta4993 a year ago

    Such a good discussion! Thanks!

  • @videowatching9576
    @videowatching9576 2 years ago

    Fascinating, I’m interested in hearing more about text to video generation

  • @smealzzon
    @smealzzon 3 years ago +1

    Really enjoyed the short/condensed version. Looking forward to watching the long version to improve understanding. I really like this podcast format! Also great material!

  • @MrJorgeceja123
    @MrJorgeceja123 3 years ago +5

    Tim, your Whimsical notes are great! Can you release them publicly?

  • @abby5493
    @abby5493 3 years ago +2

    Another awesome and informative video. 😍

  • @Johan-qf8iv
    @Johan-qf8iv 3 years ago

    Great insights. I really appreciate the relatable and grounded discussions during the whole session. Thank you!

  • @DavenH
    @DavenH 3 years ago +2

    Thank you for this great segment. So many insights in those papers. I guess contrastive loss works best when we can use our own intuitions of the right inductive bias to augment the dataset, like Kilcher notes. I struggled with this part in an NLP project, which was concerned with learning sentence embeddings; how do you augment sentences? There aren't many simple surface-level tweaks you can do, analogous to changing the hue and brightness and cropping. You could swap synonyms and delete the odd adjective, but more than that what can you do without changing the meaning?
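
For what it's worth, the surface-level text augmentations that do get tried in practice are roughly the ones the comment lists. A purely illustrative toy sketch (the synonym table and the drop/swap probabilities are made up):

```python
import random

# Toy synonym table -- in practice this might come from WordNet or a thesaurus.
SYNONYMS = {"quick": ["fast", "speedy"], "happy": ["glad", "cheerful"]}

def augment_sentence(sentence, p_drop=0.1, p_swap=0.3, seed=None):
    """Crude surface-level text augmentation: synonym swaps plus dropping the odd word.

    The goal is a semantically similar "view" of the sentence for contrastive
    training, loosely analogous to cropping or colour jitter on images.
    """
    rng = random.Random(seed)
    out = []
    for word in sentence.split():
        if rng.random() < p_drop:
            continue                                    # delete the odd word
        if word.lower() in SYNONYMS and rng.random() < p_swap:
            word = rng.choice(SYNONYMS[word.lower()])   # swap in a synonym
        out.append(word)
    return " ".join(out) or sentence                    # never return an empty string

print(augment_sentence("the quick brown fox is happy", seed=0))
```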

  • @quebono100
    @quebono100 3 years ago +4

    Wow, with every episode even better content :-O mind-blowing

    • @TimScarfe
      @TimScarfe 3 years ago +1

      🔥🔥✌️✌️😊😊

  • @philipprenz9741
    @philipprenz9741 3 years ago

    1:23:04 Random sampling is not trickery. If you try to continue the sequence "A cat is a" then the suggested continuations with their probabilities might be ("is a", 0.4), ("cute animal", 0.3), ("awesome climber", 0.3).
    The first is a useless repetition and the latter two make sense. That means most of the probability mass is concentrated on stuff that makes sense, but the single most probable continuation doesn't. Greedy decoding, i.e. always taking the most probable continuation, ignores that most of the probability rests on stuff that makes sense. With more possible continuations this effect can get even more pronounced.
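
A tiny numerical sketch of that argument, reusing the probabilities from the comment above: greedy decoding always returns the degenerate repetition, while sampling returns a sensible continuation 60% of the time.

```python
import random

# Continuations of "A cat is a" with their probabilities (from the comment above).
continuations = {"is a": 0.4, "cute animal": 0.3, "awesome climber": 0.3}

# Greedy decoding: always the single most probable continuation -> the repetition.
greedy = max(continuations, key=continuations.get)
print("greedy:", greedy)                                    # "is a" every time

# Sampling: the sensible continuations are picked with probability 0.3 + 0.3 = 0.6.
rng = random.Random(0)
sampled = rng.choices(list(continuations), weights=list(continuations.values()))[0]
print("sampled:", sampled)
```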

  • @ronjagring9691
    @ronjagring9691 3 years ago +1

    Thanks for this really nice podcast. Please continue :)

  • @Chr0nalis
    @Chr0nalis 3 years ago

    The way I see it, data augmentation is a way of making up for the lack of a suitable prior in our model (CNN). Data augmentation doesn't generate new data, but it does implicitly strengthen the weak prior of the CNN, which can lead to better generalization, faster training, etc.

  • @alandolhasz7863
    @alandolhasz7863 3 years ago

    Great episode! Plenty of practical ideas to play with.

  • @crimythebold
    @crimythebold 3 years ago

    Another fantastic one. I've got more papers to read about contrastive loss now

  • @georgepearse2680
    @georgepearse2680 3 years ago

    Could you create a series / episode on active learning? Hugely relevant to work in industry, particularly segmentation tasks that require expert labellers, but there's weirdly little research on it. Recently discovered the channel and love the work you do.

  • @kimchi_taco
    @kimchi_taco 3 years ago +1

    Today is my lucky day. I found it!!

  • @fredguth1315
    @fredguth1315 a year ago

    Please tell Simon that he can find type dispatching (function overloading) in fastcore from fastai. He is absolutely right that Python is not the best language for the DL job. Anyway, I believe Python will be the default language in the long run, in the same way JavaScript is for web dev.

  • @shipper611
    @shipper611 3 years ago +1

    Stupid question maybe - if self-similarity seems to be a pretty straightforward way to detect unnecessary layers and their exclusion would help, why isn't a "self-similarity check" and corresponding auto-deletion of layers built into high-level frameworks like Keras?

  • @alachance2010
    @alachance2010 3 years ago

    Loved it

  • @lethanh1122
    @lethanh1122 3 years ago +3

    Which tool was used to draw the graph at around 2:40 to 5:00?

  • @quebono100
    @quebono100 3 years ago +4

    I had a similar thought, that maybe the self-similarity that Mandelbrot discovered has some implications for NNs.

  • @obsiyoutube4828
    @obsiyoutube4828 2 years ago

    amazing

  • @dr.mikeybee
    @dr.mikeybee 3 years ago

    When there is no direct correlation in a layer, adding other layers may provide a combined feature correlation. Once the necessary combined correlations are created, there is nothing left to find. Additional layers just create unnecessary combinatoric burden.

    • @crimythebold
      @crimythebold 3 years ago

      Agreed. Any volunteer to code that for next week?
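
In the spirit of that request, here is a rough sketch of what such a check could look like: compute linear CKA similarity between the activations of adjacent layers and flag layers whose representation barely differs from the previous one. The threshold and the toy activations are placeholders, not a recommendation.

```python
import torch

def linear_cka(x, y):
    """Linear CKA similarity between activation matrices of shape (N, D1) and (N, D2)."""
    x = x - x.mean(dim=0, keepdim=True)
    y = y - y.mean(dim=0, keepdim=True)
    return ((x.t() @ y).norm() ** 2 / ((x.t() @ x).norm() * (y.t() @ y).norm())).item()

def redundant_layers(activations, threshold=0.99):
    """Flag layers whose representation is nearly identical to the previous layer's.

    activations: list of (N, D_i) tensors, one per layer, computed on the same batch.
    """
    return [i for i in range(1, len(activations))
            if linear_cka(activations[i - 1], activations[i]) > threshold]

# Toy example: the third "layer" is an almost exact copy of the second.
a0 = torch.randn(64, 32)
a1 = torch.randn(64, 32)
a2 = a1 + 1e-3 * torch.randn(64, 32)
print(redundant_layers([a0, a1, a2]))   # -> [2]
```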

  • @florianhonicke5448
    @florianhonicke5448 3 years ago

    Thanks, you made my day! One use case of the embeddings learned by contrastive learning is neural search (finding nearest neighbours).
    However, you can use many different attributes for contrastive learning. Would you just train a head on a pretrained network for each attribute? Or is the gain from fine-tuning the whole model worth it?
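
On the neural-search use case, a minimal sketch of nearest-neighbour retrieval over such embeddings, assuming brute-force cosine similarity and an already-trained encoder producing the vectors:

```python
import numpy as np

def nearest_neighbours(query_emb, corpus_embs, k=5):
    """Indices of the k corpus embeddings most similar to the query (cosine similarity).

    query_emb: (D,) vector; corpus_embs: (N, D) matrix -- both produced by the
    same (contrastively trained) encoder.
    """
    q = query_emb / np.linalg.norm(query_emb)
    c = corpus_embs / np.linalg.norm(corpus_embs, axis=1, keepdims=True)
    scores = c @ q                          # cosine similarity to every corpus item
    return np.argsort(-scores)[:k]

# Toy example with random "embeddings"; item 42 should come out on top.
corpus = np.random.randn(1000, 128)
query = corpus[42] + 0.01 * np.random.randn(128)
print(nearest_neighbours(query, corpus, k=3))
```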

  • @SimonJackson13
    @SimonJackson13 3 years ago

    Not sure why you wouldn't use a mean/variance brightness-contrast normalization on the colour channel histograms.
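
For concreteness, the kind of per-channel mean/variance normalisation being suggested amounts to something like this (assuming a float image in HWC layout):

```python
import numpy as np

def normalize_channels(img):
    """Standardise each colour channel to zero mean and unit variance.

    img: float array of shape (H, W, C). This removes per-channel brightness
    and contrast differences, which is roughly what colour-jitter
    augmentations would otherwise have to cover.
    """
    mean = img.mean(axis=(0, 1), keepdims=True)
    std = img.std(axis=(0, 1), keepdims=True) + 1e-8   # avoid division by zero
    return (img - mean) / std

img = np.random.rand(224, 224, 3).astype(np.float32)
out = normalize_channels(img)
print(out.mean(axis=(0, 1)), out.std(axis=(0, 1)))     # roughly 0 and 1 per channel
```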

  • @marekglowacki2607
    @marekglowacki2607 3 years ago

    After representation learning there will be data augmentation learning ;)

  • @machinelearningdojowithtim2898

    First!! 🙌🙌

  • @quebono100
    @quebono100 3 years ago +1

    Second

  • @vladomie
    @vladomie 3 years ago

    Please drop the background noise
    ... it's distracting and annoying.

  • @MarkoTintor
    @MarkoTintor 3 years ago +2

    Frequent cut-in video snippets are distracting from the talk.

    • @machinelearningdojowithtim2898
      @machinelearningdojowithtim2898 3 years ago +1

      Thanks for the feedback! We are trying to achieve a certain format and are always experimenting - might have gone too far on this one. It's just a show teaser though, the full interview is always shown afterwards with no music or distractions, so just skip ahead.

    • @crimythebold
      @crimythebold 3 years ago +4

      If it is the static picture inserts, I disagree. I don't always know what they are talking about, and the inserts help me understand.

    • @Chr0nalis
      @Chr0nalis 3 years ago +1

      I disagree as well. I think they stimulate the imagination / provide a canvas while you are thinking about the content that is being talked about.