Valence Labs
Weisfeiler Leman for Euclidean Equivariant Machine Learning | Snir Hordan
Portal is the home of the AI for drug discovery community. Join for more details on this talk and to connect with the speakers: portal.valencelabs.com/logg
The k-Weisfeiler-Leman (k-WL) graph isomorphism test hierarchy is a common method for assessing the expressive power of graph neural networks (GNNs). Recently, the 2-WL test was proven to be complete on weighted graphs which encode 3D point cloud data. Consequently, GNNs whose expressive power is equivalent to the 2-WL test are provably universal on point clouds. Yet, this result is limited to invariant continuous functions on point clouds.
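As a rough illustration of the object this completeness result refers to (a sketch under our own assumptions, not code from the paper): a point cloud can be encoded as a complete graph whose edge weights are pairwise distances, and 2-WL then iteratively refines colors assigned to ordered pairs of points.

    # Illustrative Python sketch (our own assumptions, not the paper's code):
    # encode a 3D point cloud as a distance-weighted complete graph and apply
    # a (folklore) 2-WL color-refinement step to ordered pairs of points.
    import numpy as np

    def distance_graph(points):
        """Complete weighted graph on n points: W[i, j] = ||x_i - x_j||."""
        diff = points[:, None, :] - points[None, :, :]   # shape (n, n, 3)
        return np.linalg.norm(diff, axis=-1)             # shape (n, n)

    def two_wl_step(colors):
        """One refinement: the new color of pair (i, j) hashes its old color
        together with the multiset {(color(i, k), color(k, j)) : k}."""
        n = colors.shape[0]
        hashed = [[hash((colors[i, j],
                         tuple(sorted((colors[i, k], colors[k, j]) for k in range(n)))))
                   for j in range(n)] for i in range(n)]
        # relabel the hashes as small consecutive integers so colors stay comparable
        _, relabeled = np.unique(np.asarray(hashed), return_inverse=True)
        return relabeled.reshape(n, n)

    points = np.random.randn(5, 3)                 # toy 3D point cloud
    colors = np.round(distance_graph(points), 6)   # initial pair colors = distances
    for _ in range(2):                             # a couple of refinement rounds
        colors = two_wl_step(colors)

Two point clouds related by a rigid motion or a permutation yield the same distance graph up to relabeling, and hence the same multiset of final colors, which is why this test is a natural yardstick for invariant point-cloud networks.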
In this paper we extend this result in three ways: Firstly, we show that the 2-WL test can be extended to point clouds which include both positions and velocities, a scenario often encountered in applications. Secondly, we show that PPGN (Maron et al., 2019) can simulate 2-WL uniformly on all point clouds with low complexity. Finally, we show that a simple modification of this PPGN architecture can be used to obtain a universal equivariant architecture that can approximate all continuous equivariant functions uniformly.
Building on our results, we develop our WeLNet architecture, which can process position-velocity pairs, compute functions fully equivariant to permutations and rigid motions, and is provably complete and universal. Remarkably, WeLNet is provably complete precisely in the setting in which it is implemented in practice. Our theoretical results are complemented by experiments showing WeLNet sets new state-of-the-art results on the N-Body dynamics task and the GEOM-QM9 molecular conformation generation task.
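To make the equivariance claim concrete, here is a hedged toy sketch of the "weighted summation" principle behind equivariant outputs (our own illustration, not WeLNet's actual layers): if the summation coefficients are computed only from rotation- and translation-invariant quantities, then a weighted sum of relative positions and velocities is automatically equivariant to rigid motions and permutations.

    # Toy equivariant readout by weighted summation (illustrative only; the
    # coefficient choices below are assumptions, not WeLNet's implementation).
    import numpy as np

    def invariant_coeffs(x, v):
        """Scalar pair coefficients a[i, j] built only from invariants
        (pairwise distances and speeds), hence unchanged by rigid motions."""
        d = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)  # distances
        s = np.linalg.norm(v, axis=-1)                              # speeds
        return np.exp(-d) * (1.0 + s[None, :])

    def equivariant_update(x, v):
        """x_i' = x_i + sum_j a_ij (x_j - x_i) + b_i v_i: rotating or
        translating the inputs rotates or translates the outputs."""
        a = invariant_coeffs(x, v)
        b = np.linalg.norm(v, axis=-1, keepdims=True)               # invariant per-node scale
        rel = x[None, :, :] - x[:, None, :]                         # rel[i, j] = x_j - x_i
        return x + np.einsum("ij,ijk->ik", a, rel) + b * v

    # sanity check: rotating before or after the update gives the same result
    x, v = np.random.randn(6, 3), np.random.randn(6, 3)
    Q, _ = np.linalg.qr(np.random.randn(3, 3))                      # random orthogonal matrix
    assert np.allclose(equivariant_update(x, v) @ Q.T,
                       equivariant_update(x @ Q.T, v @ Q.T))

The point of the sketch is only the structure: invariant scalars multiply geometric vectors, so the output transforms exactly as the inputs do, which is the property the abstract calls full equivariance to permutations and rigid motions.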
Paper link: arxiv.org/abs/2402.02484
Speaker: Snir Hordan
Twitter Hannes: HannesStaerk
Twitter Dominique: dom_beaini
~
Chapters
00:00 - Intro + Background
11:57 - WL for Euclidean Equivariant ML
14:34 - Simulation of 2-WL via PPGN
18:06 - PPGN
21:29 - Weighted Summation Is All You Need
30:06 - WeLNet
36:23 - Experiments
43:06 - Conclusions
47:09 - Q+A
Views: 78

Video

A Causal Inference Framework for Combinatorial Interventions | Anish Agarwal
544 views · 1 day ago
Portal is the home of the TechBio community. Join for more details on this talk and to connect with the speakers: portal.valencelabs.com/care Summary: Consider a setting where there are N heterogeneous units and p interventions. Our goal is to learn unit-specific potential outcomes for any combination of these p interventions, i.e., N×(2^p) causal parameters. Choosing a combination of intervent...
KAN: Kolmogorov-Arnold Networks | Ziming Liu
22K views · 14 days ago
Portal is the home of the AI for drug discovery community. Join for more details on this talk and to connect with the speakers: portal.valencelabs.com/logg Abstract: Inspired by the Kolmogorov-Arnold representation theorem, we propose Kolmogorov-Arnold Networks (KANs) as promising alternatives to Multi-Layer Perceptrons (MLPs). While MLPs have fixed activation functions on nodes ("neurons"), KA...
D-Flow: Differentiating through Flows for Controlled Generation | Heli Ben-Hamu
872 views · 21 days ago
Portal is the home of the AI for drug discovery community. Join for more details on this talk and to connect with the speakers: portal.valencelabs.com/logg Abstract: Taming the generation outcome of state of the art Diffusion and Flow-Matching (FM) models without having to re-train a task-specific model unlocks a powerful tool for solving inverse problems, conditional generation, and controlled...
EquiReact: An Equivariant Neural Network for Chemical Reactions | Puck van Gerwen
532 views · 28 days ago
Valence Portal is the home of the AI for drug discovery community. Join here for more details on this talk and to connect with the speakers: portal.valencelabs.com/ Summary: While molecular property prediction is well-established, reaction property prediction is in its infancy. To date, it is unclear what kind of information including chemical connectivity, reaction rules or three-dimensionalit...
Multimodal language models for mapping the genotype-phenotype relationship | Farhan Khodaee
410 views · 28 days ago
Portal is the home of the AI for drug discovery community. Join for more details on this talk and to connect with the speakers: portal.valencelabs.com/logg Abstract: How complex phenotypes emerge from intricate gene expression patterns is a fundamental question in biology. Quantitative characterization of this relationship, however, is challenging due to the vast combinatorial possibilities and...
Causal Abstractions using Generalized Functions | Sander Beckers
211 views · 1 month ago
Portal is the home of the TechBio community. Join for more details on this talk and to connect with the speakers: portal.valencelabs.com/care Summary: I will introduce generalized functions and show how they can be used to reinterpret and generalize causal models and the causal relations that they express in a variety of different ways. As a first step, I define generalized functions and their ...
Learning to Group Auxiliary Datasets for Molecule | Tinglin Huang
199 views · 1 month ago
Valence Portal is the home of the AI for drug discovery community. Join here for more details on this talk and to connect with the speakers: portal.valencelabs.com/ Summary: The limited availability of annotations in small molecule datasets presents a challenge to machine learning models. To address this, one common strategy is to collaborate with additional auxiliary datasets. However, negativ...
Smooth, exact rotational symmetrization for deep learning on point clouds | Sergey Pozdnyakov
529 views · 1 month ago
Portal is the home of the AI for drug discovery community. Join for more details on this talk and to connect with the speakers: portal.valencelabs.com/logg Abstract: Point clouds are versatile representations of 3D objects and have found widespread application in science and engineering. Many successful deep-learning models have been proposed that use them as input. The domain of chemical and m...
Uncovering and Inducing Interpretable Causal Structure in Deep Learning Models | Atticus Geiger
475 views · 1 month ago
Portal is the home of the TechBio community. Join for more details on this talk and to connect with the speakers: portal.valencelabs.com/care Summary: A faithful and interpretable explanation of an AI model’s behavior and internal structure is a high-level explanation that is human-intelligible but also consistent with the known, but often opaque low-level causal details of the model. We argue ...
Local Search GFlowNets | Minsu Kim
296 views · 1 month ago
Valence Portal is the home of the AI for drug discovery community. Join here for more details on this talk and to connect with the speakers: portal.valencelabs.com/ Summary: The Local Search GFlowNets is a new training algorithm designed to enhance the sampling quality of GFlowNets by utilizing local search methods. GFlowNets, a training approach for generating structured sequences in a constru...
Mosaic-SDF for 3D Generative Models | Lior Yariv
438 views · 1 month ago
Portal is the home of the AI for drug discovery community. Join for more details on this talk and to connect with the speakers: portal.valencelabs.com/logg Abstract: Current diffusion- or flow-based generative models for 3D shapes divide into two groups: distilling pre-trained 2D image diffusion models, and training directly on 3D shapes. When training diffusion or flow models on 3D shapes, a crucial de...
Stability-Aware Boltzmann Estimator (StABlE) Training of NN Interatomic Potentials | Sanjeev Raja
430 views · 1 month ago
Portal is the home of the AI for drug discovery community. Join for more details on this talk and to connect with the speakers: portal.valencelabs.com/logg Abstract: Neural network interatomic potentials (NNIPs) are an attractive alternative to ab-initio methods for molecular dynamics (MD) simulations. However, they can produce unstable simulations which sample unphysical states, limiting their...
Combinatorial perturbation prediction using causally-inspired neural networks | Guadalupe Gonzalez
389 views · 1 month ago
Valence Portal is the home of the AI for drug discovery community. Join here for more details on this talk and to connect with the speakers: portal.valencelabs.com/ Summary: As an alternative to target-driven drug discovery, phenotype-driven approaches identify compounds that counteract the overall disease effects by analyzing phenotypic signatures. Our study introduces a novel approach to this...
A Hitchhiker's Guide to Geometric GNNs for 3D Atomic Systems | Mathis, Joshi, and Duval
1.5K views · 1 month ago
Portal is the home of the AI for drug discovery community. Join for more details on this talk and to connect with the speakers: portal.valencelabs.com/logg Abstract: Recent advances in computational modelling of atomic systems, spanning molecules, proteins, and materials, represent them as geometric graphs with atoms embedded as nodes in 3D Euclidean space. In these graphs, the geometric attrib...
HyperPCM: Robust Task-Conditioned Modeling of Drug-Target Interactions | Emma Svensson
247 views · 1 month ago
Diffusion Models on Sampling Rare Events | Chenru Duan
745 views · 2 months ago
MFBind: a Multi-Fidelity Approach for Evaluating Drugs in Generative Modeling | Peter Eckmann
332 views · 2 months ago
Meaningful Causal Aggregation and Paradoxical Confounding | Yuchen Zhu
104 views · 2 months ago
Evaluating retrosynthesis with syntheseus and retro-fallback | Austin Tripp
189 views · 2 months ago
MultiDiffusion: Fusing Diffusion Paths for Controlled Image Generation | Omer Bar-Tal
401 views · 2 months ago
Additive Decoders for Latent Variables Identification | Sébastien Lachapelle
191 views · 2 months ago
Removing Biases from Molecular Representations via Information Maximization | Chenyu Wang
417 views · 2 months ago
Equivariant Scalar Fields for Molecular Docking with Fast Fourier Transforms | Bowen Jing
384 views · 2 months ago
Generalization in diffusion models from geometry-adaptive harmonic representation | Zahra Kadkhodaie
927 views · 2 months ago
Manifold Diffusion Fields | Ahmed Elhag
627 views · 2 months ago
Linear Structure of High-Level Concepts in Text-Controlled Generative Models | Victor Veitch
339 views · 2 months ago
AlphaFold Meets Flow Matching for Generating Protein Ensembles | Bowen Jing
3.2K views · 2 months ago
Multiflow: protein structure and sequence co-generation | Jason Yim & Andrew Campbell
842 views · 3 months ago
BISCUIT: Causal Representation Learning from Binary Interactions | Phillip Lippe
302 views · 3 months ago

Comments

  • @caiodaumann6728
    @caiodaumann6728 · 9 hours ago

    One question I have is, are these flows monotonically increasing? The usual "block" flows have this nice property, but do these continuous flows trained with flow matching also have this property in the transformations from base to data?

  • @araldjean-charles3924

    Are we talking here about a general representation theory? Are b-splines the only basis set that can be used? What about wavelets, Fourier series, etc.?

  • @spencerfunk6697
    @spencerfunk6697 · 4 days ago

    exactly 10% of your subs liked

  • @deliyomgam7382
    @deliyomgam7382 · 4 days ago

    Since you haven't given up on KAN, you can apply a normalization function to the whole data set, e.g. x=y^2 may be out of bounds for large values of x; you can simply represent the section of the B-spline where the curve of the differential would explode with a representation, while keeping the curve x=y^2, but on the side would be its multipliers. E.g. you can represent a billion with "b" as on a calculator; it also saves space. Its multipliers would show the difference between x=y^2 and nx=y^2... I don't know if I understood it right; if so, best of luck with your PhD.

  • @deliyomgam7382
    @deliyomgam7382 · 4 days ago

    E.g. π is present in a circle, so KAN is good for producing formulas

  • @deliyomgam7382
    @deliyomgam7382 · 4 days ago

    So n can be represented as a function itself, instead of going to infinity.

  • @deliyomgam7382
    @deliyomgam7382 · 4 days ago

    Can KAN be extended to a math transformer?

  • @deliyomgam7382
    @deliyomgam7382 · 4 days ago

    An LLM to design a physical language representation... a sphere representing nothing, then twist and stretch it to represent some memories... a cluster of neurons might represent memory but it is still capable of processing... since audio and video have the same zeros and ones

  • @deliyomgam7382
    @deliyomgam7382 · 4 days ago

    So circle × circle = donut, but to define direction you need trigonometry... e.g. circle × sin 2 or something, or sin circle, or circle sin(x) = donut; invite Homer please... 1 hole, then train to find the holes of knots...

  • @HD-qq3bn
    @HD-qq3bn · 7 days ago

    I suggest using a piecewise function instead of a spline; it shows some similarities with FEM and may be easier to train

  • @TeeTeeNet
    @TeeTeeNet · 8 days ago

    Hannes, if you say thank you after a speaker has answered your question, you let them know that you're done. Just saying "yup" is kinda rude.

  • @zacharyjohnson3771
    @zacharyjohnson3771 · 8 days ago

    Hello host. You had some questions around minute 16 about vector multiplication and why squaring a vector gives a scalar but multiplying two vectors does not. This person does a great job of explaining that: czcams.com/video/hUlvxaQBW78/video.htmlsi=FJeVdOa4r3sPKZW- . Thanks for all the great interviews.

  • @brian5735
    @brian5735 · 9 days ago

    I like the 1d showing the integration. Great for PDEs

  • @georgekarniadakis5089

    MLPs use adaptive activation functions; see the work by Jagtap et al. A survey paper is here: Ameya D. Jagtap, G.E. Karniadakis, How important are activation functions for regression and classification? A survey, performance comparison, and future directions, Journal of Machine Learning for Modeling and Computing, Volume 4, Issue 1, 2023, pp. 21-75

  • @sunghjung45
    @sunghjung45 · 10 days ago

    The question at 1:18:43 killed me 🤣

    • @sunghjung45
      @sunghjung45 · 10 days ago

      czcams.com/video/5p4JEXweboE/video.html

  • @gemini_537
    @gemini_537 · 11 days ago

    Gemini 1.5 Pro: This video is about Kolmogorov-Arnold Networks (KANs), presented by Ziming Liu, a PhD student at MIT. KANs are a new type of neural network architecture inspired by the Kolmogorov-Arnold representation theorem. This theorem states that any continuous function can be represented as a finite sum of compositions of single-variable functions. The video covers the following aspects of KANs:
    * Motivation: why KANs were developed and what problems they address (0:00-2:22)
    * Mathematical foundations: explanation of the Kolmogorov-Arnold representation theorem (2:22-7:44)
    * Visualization of KANs: how KANs are visualized as networks (7:44-12:12)
    * Training KANs: how to train a KAN to approximate a function (12:12-15:37)
    * Comparison with MLPs: how KANs compare to traditional Multi-Layer Perceptrons (MLPs) (15:37-20:22)
    * Applications of KANs: examples of using KANs for symbolic and special function approximation (20:22-29:31)
    * Interpretability of KANs: how KANs can be interpreted to reveal the underlying structure of the function they approximate (29:31-41:26)
    * Discovery with KANs: how KANs can be used to discover new relationships between variables (41:26-47:22)
    * Case study: recovering scientific results with KANs (47:22-58:12)
    * Open questions and future directions: discussion on limitations and future research areas for KANs (58:12-1:00:00)
    In conclusion, KANs are a promising new direction in neural network research that leverages the Kolmogorov-Arnold representation theorem to achieve interpretable function approximation. They have the potential to be particularly useful in scientific applications where understanding the relationships between variables is important.

  • @radosawjasiewicz2494
    @radosawjasiewicz2494 · 11 days ago

    What about vector functions?

  • @automatescellulaires8543

    Yes we Kan? I swear I've already heard this somewhere.

  • @darkhydrastar
    @darkhydrastar · 12 days ago

    👏😎

  • @Kram1032
    @Kram1032 · 12 days ago

    So in principle, clearly you could simply take the functions KAN is built upon to be NNs. Furthermore, you could take a KAN of KANs, which strikes me as a second way to "go deep" on KANs. It also feels a little bit to me like the connections between objects, functions, functionals, natural transformations... - i.e. you'd essentially be able to encode category-theoretical notions in KANs. - Is that a reasonable comparison to make? If so, I wonder if you could simply take your base objects to be, say, the primitives of your favourite proof assistant plus arbitrarily deep, arbitrarily nested KANs, to effectively and efficiently find arbitrary functions that represent well whatever relationships you'd throw at them. It's probably not at all easy to do, but that'd seem to me to be the most powerful version.

  • @elirane85
    @elirane85 · 12 days ago

    God, I wish this entire "AI boom" had happened when I was in college almost 20 years ago. I would have been able to publish so many papers. Now it's a sigmoid, boom, paper; now it's an exponent, paper; now a spline, paper; what's next, a directed graph, paper; a fully connected graph, paper. When exactly did the level of research papers start to be like my freshman-year homework?

  • @taraaryal9609
    @taraaryal9609 · 12 days ago

    Do you also have an example of solving an ODE using KAN?

  • @shinkurt
    @shinkurt · 13 days ago

    Thanks guys

  • @tankieslayer6927
    @tankieslayer6927 · 13 days ago

    Tegmark attention-whoring again and giving a bad name to physicists. This is a completely worthless paper. Learning activation functions isn't a new idea; it's just unnecessary.

    • @choi77770
      @choi77770 · 9 days ago

      You should give reasons for this comment

    • @tudoropran1967
      @tudoropran1967 · 6 days ago

      A statement with no arguments is unscientific.

  • @TomHutchinson5
    @TomHutchinson5 · 13 days ago

    Wow, this is blowing up. Most of the journal club videos get hundreds of views. This already has thousands! I look forward to watching the talk and reading the paper.

  • @Pingu_astrocat21
    @Pingu_astrocat21 · 13 days ago

    Thank you for uploading :)

  • @space-time-somdeep
    @space-time-somdeep · 13 days ago

    Thanks

  • @ferencszalma7094
    @ferencszalma7094 · 14 days ago

    0:02:35 Kolmogorov-Arnold Representation Theorem (KART): ~the only true multivariate function is the sum
    0:03:45 Details of (two-layer) KART: 1D edge functions and node sums
    0:05:05 KAN: Kolmogorov-Arnold Network (original 2-layer)
    0:05:55 Multi-layer KAN
    0:07:55 MLP and KAN comparison
    0:09:45 B-splines basics
    0:14:30 B-spline Cox-de Boor recursion formula (inefficient)
    0:14:45 Implementation tricks: residual activations, initialization, grid update
    0:38:05 Q: Expressivity vs generalization, bias-variance tradeoff, U-shaped loss as a function of p (number of features)
    0:39:15 Q: What if an activation is out of range of the finite spline domain? -> Use the residual activation function!
    0:40:40 KANs to solve physics problems from raw data or already partially processed data?
    0:43:15 KANs to solve PDEs?
    0:44:35 Grid resolution finetuning is done manually
    0:47:20 Can you replicate KANs by MLPs with the right breadth and depth? Yes. Would be nice to see a unified theory.
    0:51:18 What's the novelty of KANs? At the technical level, what makes a KAN a KAN?
    0:58:16 Inductive bias: whether KANs' or DNNs' inductive biases better fit a task: vision, language, science
    0:59:25 History of connectionism vs symbolism:
    1957 - Frank Rosenblatt, invention of the perceptron
    1969 - Marvin Minsky & Seymour Papert, Perceptrons: An Introduction to Computational Geometry: "Perceptrons cannot do XOR"
    1974 - Paul Werbos, "A multi-layer perceptron can do XOR"
    1975 - Robert Hecht-Nielsen, Kolmogorov networks (2 layers, width 2n+1)
    1988 - George Cybenko, "2-layer Kolmogorov networks can do XOR"
    1989 - Tomaso Poggio, "KA is irrelevant for neural networks"
    2012 - year of modern deep learning
    Expert systems/symbolic regression vs KANs vs MLPs/Kolmogorov networks
    1:04:20 KAN vs MLP philosophy: high internal degrees of freedom, reductionism, parts are important vs low internal degrees of freedom, holism, interaction of parts is important
    1:08:45 Intricacies of developing something new: KANs beyond 2 layers
    1:11:45 GitHub repos

  • @mohammedbenaissa1278
    @mohammedbenaissa1278 · 14 days ago

    Can we make a CNN with a KAN layer?

  • @PabloHorneman-rd4cq
    @PabloHorneman-rd4cq · 14 days ago

    Legend!

  • @hanyanglee9018
    @hanyanglee9018 · 14 days ago

    You simply make activation functions for each instruction and protect the activation output between layers; it would probably work. Except I don't know how to protect the activation between layers in a graceful way. Softmax helps self-attention protect *that*. BN seems not to be used anymore, but it actually protects *that*. None of them is graceful, since they all distort the forward path in some way. Or, if we don't use the B-spline, we can still use a sigmoid (with MIG) to do a similar job. Edit: the sigmoid way doesn't provide any known interpretability. It's only about the black-box way.

  • @Pingu_astrocat21
    @Pingu_astrocat21 · 14 days ago

    thank you for uploading this :)

  • @namjoonsuh8095
    @namjoonsuh8095 · 16 days ago

    Seminal work

  • 26 days ago

    Thank you Valence Lab

  • @jakublala
    @jakublala · 26 days ago

    42:03: when Hannes says he has "no intuition on what is going on" -> I couldn't relate more haha. Thanks Hannes for asking for elaboration on that equation!

  • @Pingu_astrocat21
    @Pingu_astrocat21 · 28 days ago

    Amazing talk, thanks a lot for uploading :)

  • @Apalion41
    @Apalion41 · 1 month ago

    This Hannes kid is insufferable, with boring, useless, ignorant questions and interruptions. Hats off to Yilun Xu for a great presentation despite the heckler. This is very interesting work!

  • @adrienbufort795
    @adrienbufort795 · 1 month ago

    Amazing work!!!! I wonder how to extend this kind of generative model to categorical variables.

  • @dwi4773
    @dwi4773 · 1 month ago

    I am so confused by the stairs in the top right corner of Hannes' background... they look tiny

  • @AndreaRoncoli
    @AndreaRoncoli · 1 month ago

    What a great talk, thanks guys!

  • @beil2944
    @beil2944 · 1 month ago

    I don't quite understand, at around 24:00, why invariants are just numbers. I thought an invariant network also operates on tensors?

    • @omidshay
      @omidshay · 1 month ago

      Here, by tensors he means geometric tensors, not "tensors" as sets of numbers.

  • @lukebeaton9962
    @lukebeaton9962 · 2 months ago

    This professor provides really instructive lectures! Hope to see more!

  • @arawndyn
    @arawndyn · 2 months ago

    This guy seems so smart, reminds me of my professor, hope to see him more!

  • @BodhiMC
    @BodhiMC · 2 months ago

    big if true!

  • @live3546
    @live3546 · 2 months ago

    So good~

  • @MrMIB983
    @MrMIB983 · 2 months ago

    He's back

  • @mehrdadmirpourian64
    @mehrdadmirpourian64 · 2 months ago

    Wonderful paper!

  • @user-rm9mn2up4d
    @user-rm9mn2up4d · 2 months ago

    I really don't see the point of giving a talk this way. Using extremely abstract explanations for a supposedly applied topic and missing all the implementation details when asked is just bad practice. IMO, without Hannes's questions this talk would be absolutely useless.

  • @AkashRajput-nv8xs
    @AkashRajput-nv8xs · 2 months ago

    Really good talk, ideas clearly explained.

  • @osmanmamun6868
    @osmanmamun6868 · 2 months ago

    The confusion about the message passing (ref. slide at the 56:30 timestamp) happened because the presenters failed to communicate that Y_{i, k} is constant, irrespective of the number of layers. Only different tunable weights (W_{i, k}) are introduced in each layer. So in each pass the GPU doesn't need to communicate with other GPUs to get the updated values of the Y's, and h_{i, j} is already available on the local GPU.