Harvard Data Science Initiative
Causal Seminar: Paul Rosenbaum, University of Pennsylvania
Being Realistic About Unmeasured Biases in Observational Studies
The talk is intended as an introduction to some recent technical work, but I don’t recommend reading the technical work prior to the talk; so, there is no pre-reading. Someone who is entirely new to the subject might optionally take a look at Chapter 9 (Sensitivity Analysis) and Chapter 10 (Design Sensitivity) of my book “Observation and Experiment: An Introduction to Causal Inference” (Harvard University Press, 2017); however, the talk will be a gentle introduction to new and somewhat more technical material. It looks like Harvard’s library provides online access to “Observation and Experiment”.
Observational studies of the effects caused by treatments are always subject to the concern that an ostensible treatment effect may reflect a bias in treatment assignment, rather than an effect actually caused by the treatment. The degree of legitimate concern is strongly affected by simple decisions that an investigator makes during the design and analysis of an observational study. Poor choices lead to heightened concern; that is, poor choices make a study sensitive to small unmeasured biases where better choices would correctly report insensitivity to larger biases. Indeed, perhaps surprisingly, unambiguous evidence of the presence of unmeasured bias may increase insensitivity to unmeasured bias. These issues are discussed with the aid of some theory and a simple example of an observational study.
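For orientation, the sensitivity model behind the talk’s vocabulary can be stated in one line (this is the standard formulation from Chapter 9 of “Observation and Experiment,” paraphrased here for readers rather than quoted from the talk). For two units i and j matched on observed covariates, an unmeasured bias of magnitude Γ ≥ 1 allows their odds of receiving treatment to differ by at most a factor of Γ:

    \[ \frac{1}{\Gamma} \;\le\; \frac{\pi_i\,(1-\pi_j)}{\pi_j\,(1-\pi_i)} \;\le\; \Gamma, \qquad \Gamma \ge 1, \]

where \pi_i is the probability that unit i is treated. Γ = 1 recovers a randomized experiment; a study is “insensitive to larger biases” when its qualitative conclusions survive larger values of Γ.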
Speaker:
Paul Rosenbaum, Robert G. Putzel Professor Emeritus of Statistics and Data Science, Wharton School of the University of Pennsylvania
199 views

Videos

Causal Seminar: Elizabeth Stuart, Johns Hopkins University
272 views · 14 days ago
Combining experimental and non-experimental data to examine treatment effect heterogeneity
Determining “what works for whom” is a key goal in prevention and treatment across a variety of areas, including mental health. Identifying effect moderators (factors that relate to the size of treatment effects) is crucial for delivery of treatment and prevention interventions, but doing so is incredibly d...
HDSI Industry Seminar: Foundation Medicine, Leah Comment
142 views · 14 days ago
Causal data science: personalizing treatment decisions in cancer
Speaker: Leah Comment, Director, Decision Sciences, Foundation Medicine
HDSI Industry Seminar: Ron Papka, Voya Investment Management
195 views · a month ago
Monetizing Data Science on Wall Street
In this seminar, Ron Papka will discuss:
- Use of Machine Learning, NLP and Analytics in the financial sector
- Skills needed for change management projects leveraging Data Science and AI innovations
- The business value of data and analytics on Wall Street
Speaker: Ron Papka, SVP, Head of Data Engineering and Governance, Voya Investment Management
Using Machine Learning to Unveil the Invisible Universe
478 views · a month ago
Workshop | Data Science in the Physical Sciences
Speaker: Carlos Arguelles Delgado, Assistant Professor of Physics, Harvard University; IAIFI | Using Machine Learning to Unveil the Invisible Universe
Machine Learning the Genealogy of the Milky Way
477 views · a month ago
Workshop | Data Science in the Physical Sciences
Speaker: Lina Necib, Assistant Professor of Theoretical Astrophysics, MIT
Workshop | Data Science in the Physical Sciences (Standard Format)
857 views · a month ago
Instructor: Matthew Schwartz, Professor of Physics, Department of Physics, Harvard University
Presenters:
Carlos Arguelles Delgado, Assistant Professor of Physics, Harvard University; IAIFI | Using Machine Learning to Unveil the Invisible Universe
Lina Necib, Assistant Professor of Theoretical Astrophysics, MIT | (Machine) Learning the Genealogy of the Milky Way
Tutorial | Bayesian causal inference: A critical review and tutorial (Standard Format)
10K views · a month ago
This tutorial aims to provide a survey of the Bayesian perspective of causal inference under the potential outcomes framework. We review the causal estimands, assignment mechanism, the general structure of Bayesian inference of causal effects, and sensitivity analysis. We highlight issues that are unique to Bayesian causal inference, including the role of the propensity score, the definition of...
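For readers new to the framework, the two central objects mentioned above can be written compactly in standard potential-outcomes notation (textbook notation shown for orientation, not material from the tutorial’s slides): the average treatment effect contrasts each unit’s two potential outcomes, and the assignment mechanism is commonly summarized by the propensity score,

    \[ \tau \;=\; \mathbb{E}\big[\,Y_i(1) - Y_i(0)\,\big], \qquad e(x) \;=\; \Pr\big(Z_i = 1 \mid X_i = x\big), \]

where Y_i(1) and Y_i(0) are unit i’s outcomes under treatment and control, and Z_i is the treatment indicator.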
Tutorial | LLMs in 5 Formulas (Standard Format)
13K views · 2 months ago
Slide deck: drive.google.com/file/d/1DGXbMU4cCK15nbLiI3zcuwmvClwzoEsY/view?usp=sharing
One year after the release of GPT-4, large language models (LLMs) remain the most exciting topic in AI. While much about their qualitative capabilities remains poorly understood, there are some areas where we can quantitatively measure, bound, and forecast their behavior. This tutorial will introduce the topic...
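The first of the five formulas, perplexity, has a compact standard definition worth keeping in mind while watching (shown here for orientation; the slide deck above gives the tutorial’s own notation). For a held-out sequence w_1, ..., w_N it is the exponentiated average negative log-likelihood,

    \[ \mathrm{PPL} \;=\; \exp\!\Big( -\tfrac{1}{N} \textstyle\sum_{t=1}^{N} \log p_\theta(w_t \mid w_{<t}) \Big), \]

so lower perplexity means the model assigns higher probability to text it has never seen.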
Workshop | AI: A Serious Look at Big Questions (Standard Format)
1.4K views · 2 months ago
This interdisciplinary workshop brings together leading figures in neuroscience, philosophy, and AI to address the most pressing questions on artificial intelligence. Entitled “A Serious Look at the Big Questions,” the event is structured as a series of discussions and panels that seeks to take seriously the “big questions” posed by AI, including the similarities between AI and human intelligenc...
Industry Seminar: Praveen Pankajakshan, Cropin
263 views · 2 months ago
Geospatial AI for Monitoring Food Security and Climate Resilient Agriculture
Amid rising food insecurity and climate challenges, there is a need for urgent, coordinated action. Recent reports from the United Nations have highlighted that in 2023, about 25 million people in the western Sub-Saharan regions were at high risk of food insecurity, indicating a sharp rise in h...
Workshop | AI: A Serious Look at Big Questions (360°)
715 views · 2 months ago
WATCH IN STANDARD FORMAT: czcams.com/video/AR67AvEkVGU/video.html
This interdisciplinary workshop brings together leading figures in neuroscience, philosophy, and AI to address the most pressing questions on artificial intelligence. Entitled “A Serious Look at the Big Questions,” the event is structured as a series of discussions and panels that seeks to take seriously the “big questions” posed...
Tutorial | LLMs in 5 Formulas (360°)
32K views · 2 months ago
WATCH IN STANDARD FORMAT: czcams.com/video/k9DnQPrfJQs/video.html
One year after the release of GPT-4, large language models (LLMs) remain the most exciting topic in AI. While much about their qualitative capabilities remains poorly understood, there are some areas where we can quantitatively measure, bound, and forecast their behavior. This tutorial will introduce the topic of LLMs through 5 ke...
Tutorial | Bayesian causal inference: A critical review and tutorial (360°)
874 views · 2 months ago
WATCH IN STANDARD FORMAT: czcams.com/video/7Cwl6DgL64o/video.html
This tutorial aims to provide a survey of the Bayesian perspective of causal inference under the potential outcomes framework. We review the causal estimands, assignment mechanism, the general structure of Bayesian inference of causal effects, and sensitivity analysis. We highlight issues that are unique to Bayesian causal infere...
Scenes from the 2024 HDSI Annual Conference
64K views · 2 months ago
Industry Seminar: Francesca Lazzeri, Microsoft
242 views · 4 months ago
Industry Seminar: Isabel Fulcher, Delfina
257 views · 5 months ago
HDSI Postdoc Experience: Minsuk Shin
72 views · 6 months ago
HDSI Causal Seminar: Alberto Abadie, MIT
142 views · 6 months ago
HDSI Causal Seminar: Postdoctoral Fellow Showcase
131 views · 6 months ago
HDSI Causal Seminar: Georgia Papadogeorgou, University of Florida
126 views · 6 months ago
HDSI Postdoc Experience: Max Kleiman-Weiner
70 views · 6 months ago
Industry Seminar: Slawek Kierner, Intuitive
116 views · 6 months ago
HDSI Postdoc Experience: Isabel Fulcher
155 views · 6 months ago
HDSI Annual Conference Recap
223 views · 6 months ago
HDSI Causal Seminar: Fan Li, Duke University
337 views · 7 months ago
Adopting Digital Trust: Culture, Talent, and Capabilities
242 views · a year ago
Data Ethics - Leaders and Frameworks
265 views · a year ago
Digital Trust as a Global Concept
570 views · a year ago
The Chief Trust Officer - What it means to be a technical leader
346 views · a year ago

Comments

  • @olorunfemijoshua9586

    How can I be part of this program?

  • @dskbiswas
    @dskbiswas · 7 days ago

    This video can be a terrific example of overfitting 😁😆

  • @dj...channel2549
    @dj...channel2549 · 14 days ago

    So amazing 😊

  • @shrirangmoghe3784
    @shrirangmoghe3784 · 22 days ago

    Please upload slides and perhaps sync them to the audio if you can. Can’t believe we are in the age of AI and humans are already losing it

    • @shrirangmoghe3784
      @shrirangmoghe3784 · 22 days ago

      What are we even looking at here? Are we at the edge of some black hole? Totally nauseating

  • @davidbadmus-yh9bo
    @davidbadmus-yh9bo · 27 days ago

    like this

  • @swenic
    @swenic · a month ago

    Not liking the current layout. It is not possible to make out much detail in the small image (you can see a large room with people in it, but nothing of value is discernible), and placing it side by side with the presented material makes that harder to see as well, resulting in a need for more screen real estate, or getting used to watching things at half the customary size. How about instructing presenters to leave a small box area in their presentations and pasting the room view into that?

  • @ericgibson2079
    @ericgibson2079 · a month ago

    Please support The End and Prevention of Homelessness USA 2030!

  • @JOHNSMITH-ve3rq
    @JOHNSMITH-ve3rq · a month ago

    If y’all could make sure you actually have clean audio in future that’d be great. The banging is super super distracting and brings down the quality of the whole thing.

  • @santaespinal1540
    @santaespinal1540 · a month ago

    🎯 Key Takeaways for quick navigation:
    00:12 🎓 Introduction of Sasha Rush by David Parks
    - Introduction of Sasha Rush, a professor at Cornell University and a contributor to Hugging Face, now part of Google, known for his work in AI and large language models.
    - Sasha's background in computer science, his contributions to AI research, and his commitment to open-source initiatives.
    04:28 🗣️ Overview of Large Language Models (LLMs)
    - Large language models (LLMs) are discussed in terms of their significance, complexity, and implications.
    - LLMs are described as extremely useful, expensive, important, and sometimes perceived as intimidating, but also capable of providing remarkable outputs with consistency and creativity.
    - The talk aims to provide insights into LLMs and address questions regarding their functioning, reasoning, and impact on various domains.
    07:38 📊 Dividing LLM Understanding into Five Formulas
    - Sasha introduces the concept of understanding LLMs through five formulas: perplexity, attention, GEMM, Chinchilla, and RASP.
    - Each formula represents a different aspect of LLMs, such as generation, memory, efficiency, scaling, and reasoning, providing a comprehensive view of their functionalities.
    - These formulas aim to simplify complex concepts and enable a deeper understanding of LLMs for both technical and non-technical audiences.
    11:33 🔍 Understanding Perplexity in Language Models
    - Perplexity in language models is explained, focusing on the probabilistic model of documents and the probability distribution of word sequences.
    - The Markov assumption and the representation of Theta as categorical distributions are discussed as historical aspects of language models.
    - Modern advancements in LLMs are highlighted, indicating departures from past assumptions and embracing more complex neural network architectures.
    - Language models' historical evolution is discussed, from early Markov models to modern neural network-based LLMs.
    - Changes in assumptions and approaches, such as fixed-length history and categorical representations, are contrasted with contemporary methods emphasizing neural network architectures.
    - The session concludes with reflections on the enduring relevance and evolution of language models in the field of natural language processing.
    21:04 🧮 Language Model Development Pre-2010
    - Explanation of early methods to develop language models.
    - Discussion of Shannon's paper from 1948 outlining language model development.
    - Overview of the challenges and limitations of early language models.
    22:32 📊 Quantifying Language Model Performance
    - Introduction to evaluating language model performance.
    - Explanation of the challenges in quantifying language model effectiveness due to the lack of a definitive correct answer.
    - Illustration of the evaluation process using an example sentence and the concept of word prediction difficulty.
    29:17 📏 Metric Relating to Compression and Shannon's Work
    - Introduction to a metric related to language compression inspired by Shannon's work.
    - Explanation of encoding words with binary strings based on their probabilities.
    - Discussion on how the metric, perplexity, measures the efficiency of language compression.
    38:21 📈 Evaluation and Practical Applications of Perplexity
    - Discussion on the practical evaluation of perplexity using a test set.
    - Exploration of the relationship between perplexity and model quality.
    - Overview of historical perplexity scores achieved by various language models and their implications.
    44:09 📊 Correlation between Perplexity and Task Accuracy
    - Perplexity correlates almost perfectly with translation accuracy.
    - General language perplexity correlates extremely well with task accuracy on different tasks.
    45:59 📈 Impact of Perplexity Reduction on Model Performance
    - Lower perplexity corresponds to better model performance.
    - Graphs illustrate the relationship between perplexity reduction and resource investment.
    53:17 🧠 Introduction to Memory and Attention Models
    - Explains the concept of memory and attention in neural network models.
    - Introduces the limitations of Markov models and the need for attention mechanisms.
    01:05:03 🧠 Understanding Query, Key, and Value Operations in LLMs
    - Query, key, and value operations in LLMs enable prediction of the next word based on contextual information.
    - The process involves querying a lookup table using the query, retrieving the key, and using it to predict the next word.
    01:07:03 🔄 Transitioning from Argmax to Softmax
    - Argmax selection poses challenges in neural networks due to its discontinuous nature.
    - Softmax provides a smooth alternative for selecting the best choice, enabling meaningful training.
    01:15:08 📊 Evolution of Attention Mechanisms in Language Modeling
    - The attention mechanism, popularized by the "Attention is All You Need" paper, revolutionized language modeling.
    - Transformers utilize attention mechanisms to process input sequences efficiently and capture long-term dependencies.
    01:18:05 🖥️ Efficient Computation with Attention Mechanisms
    - Attention mechanisms efficiently compute long-term context by utilizing matrix operations.
    - Matrix multiplication and softmax operations allow for effective computation of attention scores.
    01:27:10 🖥️ Transition from CPUs to GPUs for efficient computation
    - The shift from CPUs to GPUs revolutionized the efficiency of running applications like language models.
    - GPUs allowed for faster computation of complex operations like softmax, significantly altering the research landscape.
    01:29:19 🧠 Understanding GPU architecture and parallel computing
    - GPUs operate as parallel computers with multiple threads running simultaneously.
    - Threads within blocks share block memory, enabling fast data exchange and computation.
    01:34:24 🧮 Efficient matrix multiplication on GPUs
    - Matrix multiplication on GPUs involves loading data into block memory, performing computations within blocks, and minimizing reads from global memory.
    - Leveraging shared memory and parallel processing allows for efficient computation of matrix multiplication.
    01:47:04 💡 Maximizing GPU performance for ML applications
    - ML applications benefit from efficient GPU performance, measured in operations per second.
    - Optimizing data formats, such as using smaller floating-point values, enhances GPU efficiency.
    01:53:20 🚀 GPU Optimization: Why Speed Matters
    - GPU programming optimization is crucial for efficient computation.
    - Efficient GPU programming enables faster computation, which is essential for large-scale models like LLMs.
    01:53:49 🔍 Scaling in LLMs: Model Size vs. Training Data
    - The performance of LLMs depends on the interplay between model size and training data.
    - Increasing model size and training data improves the model's ability to generalize and understand complex patterns.
    01:54:42 📊 Compute Optimization Formula: The Chinchilla Formula
    - The Chinchilla formula extrapolates the expected perplexity of LLMs based on model size and training data.
    - It suggests a proportional relationship between model size and training data for optimal performance.
    02:11:35 📊 Cost Comparison and Scaling Laws
    - Understanding the costs associated with large language models (LLMs).
    - Comparing and incorporating costs related to data acquisition and model scaling.
    02:13:09 💰 Cost Considerations in Model Development
    - Addressing cost considerations in LLM development, especially focusing on compute and data.
    - Exploring scenarios where reducing specific costs benefits certain stakeholders.
    02:14:46 🔄 Data Reusability and Synthesis
    - Discussing the reusability of training data and its diminishing returns.
    - Exploring the potential of synthetic text generation and its current applications.
    02:17:29 📰 Importance of Training Data Quality
    - Highlighting the significance of high-quality training data, such as from sources like the New York Times.
    - Discussing challenges in quantifying and assessing the quality of training data.
    02:18:19 🛑 Addressing Token Exhaustion Concerns
    - Discussing concerns related to token exhaustion and its potential impact on model training.
    - Exploring strategies to address token scarcity, including alternative data sources.
    02:32:52 🧠 Exploring Formal Logic in Language Models
    - Introducing a formal logic approach to understanding language model behavior.
    - Discussing the use of logical operations to manipulate and analyze model behavior.
    02:34:57 🔄 RASP Language and Multi-Layered Models
    - Explaining the RASP language and its deterministic, logical approach to modeling.
    - Highlighting the benefits of using multiple layers in language model architectures.
    02:36:30 🔍 Potential and Limitations of Formal Logic in LLMs
    - Discussing the implications of formal logic approaches for understanding model capabilities.
    - Exploring the feasibility of implementing complex tasks using logical formulations in language models.
    Made with HARPA AI
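
    The query/key/value and softmax takeaways above (01:05:03-01:18:05) compress into a single formula, Attention(Q, K, V) = softmax(QKᵀ/√d)V. The following NumPy sketch is included for orientation only; it is not code from the tutorial, and the toy shapes are arbitrary.

        import numpy as np

        def softmax(x, axis=-1):
            # Subtract the row max before exponentiating, for numerical stability.
            x = x - x.max(axis=axis, keepdims=True)
            e = np.exp(x)
            return e / e.sum(axis=axis, keepdims=True)

        def attention(Q, K, V):
            # Q: (n, d) queries; K: (n, d) keys; V: (n, d_v) values.
            # Each query is scored against every key; softmax turns the scores
            # into weights (the smooth stand-in for argmax described above),
            # and each output row is a weighted average of the value rows.
            scores = Q @ K.T / np.sqrt(K.shape[-1])
            return softmax(scores, axis=-1) @ V

        rng = np.random.default_rng(0)
        Q, K, V = rng.normal(size=(3, 4, 8))  # toy data: 4 tokens, dimension 8
        out = attention(Q, K, V)              # shape (4, 8): one context vector per token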

  • @noninvasive_rectal_probe8990

    Lmao this talk is trash

    • @420_gunna
      @420_gunna · a month ago

      What do you think is bad about it? Haven't listened yet, but Sasha has always put out great content in the past.

  • @imaspacecreature
    @imaspacecreature · a month ago

    Wanted to hear more!

  • @ShivaprakashYaragal
    @ShivaprakashYaragal · a month ago

    This is awesome. I love these tools and the taxi data

  • @user-wr4yl7tx3w
    @user-wr4yl7tx3w · a month ago

    If Harvard wasn’t so woke. Only woke DEI types look down upon Extension students.

  • @AlgoNudger
    @AlgoNudger · a month ago

    Thanks.

  • @r2internet
    @r2internet · a month ago

    Thanks for the informative talk. Can you please share the slides?

  • @mikaackermann4072
    @mikaackermann4072 · 2 months ago

    Why 360°? How about a normal video?

  • @kalmyk
    @kalmyk · 2 months ago

    you can just imagine formulas

    • @noomade
      @noomade · a month ago

      what do you mean?

  • @jaredtweed7826
    @jaredtweed7826 · 2 months ago

    Can you upload this with the slides not in VR?

  • @RocketmanUT
    @RocketmanUT · 2 months ago

    Going to need to reupload this, the slides are distorted.

  • @pankajsinghrawat1056
    @pankajsinghrawat1056 · 2 months ago

    make normal videos please

  • @travelcatchannel8657
    @travelcatchannel8657 · 2 months ago

    Thanks very much for this presentation. It helps a lot. Could you kindly tell me which tool you used for the demo?

  • @user-wr4yl7tx3w
    @user-wr4yl7tx3w · 3 months ago

    why does every harvard presentation have such a long preamble. just skip the first 10 minutes if you don't want to sit through it.

  • @KyleXLS
    @KyleXLS · 3 months ago

    I'd love to have access to the data to play around with in ArcGIS Pro.

  • @TheNighter
    @TheNighter · 4 months ago

    Harvard should not exist.

  • @mgophern
    @mgophern · 6 months ago

    could you share slides?

  • @toddbrous_untwist
    @toddbrous_untwist · a year ago

    This was such an awesome video! Thank you HDSI for posting this.

  • @optimism90
    @optimism90 · a year ago

    Thanks for sharing the tutorial video. Could you share the R code and slides if possible, thanks!

  • @chinwevivianaliyu
    @chinwevivianaliyu · a year ago

    Would it be possible to have a recorded version for those who registered?

  • @muhammadsyukri746
    @muhammadsyukri746 · a year ago

    Thanks so much for this video. It helps me a lot.

  • @haow85
    @haow85 · 2 years ago

    Are there any open datasets for fair AI?

  • @somewheresomeone3959
    @somewheresomeone3959 · 2 years ago

    Great work and thanks for sharing! Is it just me, or are the voice and the pictures slightly out of sync?

  • @ThinkQbD
    @ThinkQbD · 2 years ago

    Thank you. Great panel discussion!