- 2,962
- 16,255,619
Databricks
United States
Joined 1 Jul 2014
Databricks is the Data and AI company. More than 10,000 organizations worldwide - including Block, Comcast, Condé Nast, Rivian, and Shell, and over 60% of the Fortune 500 - rely on the Databricks Data Intelligence Platform to take control of their data and put it to work with AI. Databricks is headquartered in San Francisco, with offices around the globe, and was founded by the original creators of the lakehouse, Apache Spark™, Delta Lake, and MLflow.
Make your records Unique with Generated Identity Columns
Check out the docs here: docs.databricks.com/en/delta/generated-columns.html#use-identity-columns-in-delta-lake
Find more examples in this blog: www.databricks.com/blog/2022/08/08/identity-columns-to-generate-surrogate-keys-are-now-available-in-a-lakehouse-near-you.html
1,002 views
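The linked docs cover Delta Lake identity columns. A minimal sketch of the DDL (the table and column names here are illustrative, not from the video):

```sql
-- Hypothetical table: GENERATED ALWAYS AS IDENTITY assigns a unique
-- surrogate key to each inserted row.
CREATE TABLE customers (
  customer_id BIGINT GENERATED ALWAYS AS IDENTITY (START WITH 1 INCREMENT BY 1),
  name STRING,
  email STRING
) USING DELTA;

-- Inserts omit the identity column; Delta generates the values.
INSERT INTO customers (name, email) VALUES ('Ada', 'ada@example.com');
```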
How to enforce data quality across columns within a table in Databricks
1.4K views · 14 hours ago
Check out the docs here: docs.databricks.com/en/delta/generated-columns.html
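One way to enforce quality rules that span multiple columns is a CHECK constraint; a hedged sketch (table and columns are made-up examples):

```sql
-- Hypothetical orders table: the constraint rejects any row where the
-- end date precedes the start date, enforcing a cross-column rule.
ALTER TABLE orders
  ADD CONSTRAINT valid_dates CHECK (end_date >= start_date);
```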
Databricks Clean Rooms
1.9K views · 19 hours ago
Data clean rooms allow businesses to easily collaborate on data in a secure environment, where multiple parties can safely combine sensitive data without compromising privacy or security. By implementing stringent protocols and advanced technologies, data clean rooms enable organizations to share data securely while ensuring compliance with privacy and regulatory requirements. In an era where d...
AI/BI: Intelligent Analytics for Real-World Data
2.7K views · 1 day ago
In this video you will learn about AI/BI which features two complementary capabilities: Dashboards and Genie. Dashboards provide a low-code experience to help analysts quickly build highly interactive data visualizations for their business teams using natural language, and Genie allows business users to converse with their data to ask questions and self-serve their own analytics. Databricks AI/...
AI-Powered Data Warehousing on Databricks SQL
1.3K views · 1 day ago
In this video, you will learn how you can leverage Databricks SQL's data warehousing capabilities to call AI functions, query models, and utilize the context-aware Databricks Assistant for seamless and efficient data analysis. This powerful combination makes it easier for analysts to unlock valuable insights and drive impactful decisions.
LakeFlow Demo
4.1K views · 14 days ago
Databricks LakeFlow is a new solution that contains everything you need to build and operate production data pipelines. It includes new native, highly scalable connectors for databases including MySQL, Postgres, SQL Server and Oracle and enterprise applications like Salesforce, Microsoft Dynamics, NetSuite, Workday, ServiceNow and Google Analytics. Users can transform data in batch and streamin...
Say goodbye to messy JSON headaches with VARIANT
3.3K views · 14 days ago
Try it out today on Databricks: docs.databricks.com/en/semi-structured/variant.html Read more about it on our blog: www.databricks.com/blog/introducing-open-variant-data-type-delta-lake-and-apache-spark If you're curious about the implementation check out the talk: czcams.com/video/jtjOfggD4YY/video.html Or read about it on GitHub: github.com/apache/spark/blob/master/common/variant/README.md
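Per the linked VARIANT docs, JSON is ingested with parse_json() and fields are extracted with the `:` path syntax; a small sketch (the table and JSON payload are illustrative):

```sql
-- A VARIANT column accepts arbitrary JSON without a declared schema.
CREATE TABLE events (raw VARIANT) USING DELTA;

INSERT INTO events
  SELECT parse_json('{"user": {"id": 42}, "tags": ["a", "b"]}');

-- Path extraction works on nested objects and arrays alike.
SELECT raw:user.id::INT     AS user_id,
       raw:tags[0]::STRING  AS first_tag
FROM events;
```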
Data Intelligence Day Seoul 2024
567 views · 14 days ago
Data Intelligence Day Seoul, Korea took place on 23 April 2024 and gathered over 1,200 industry leaders and data and AI experts. Watch Data Intelligence Day Seoul On Demand: events.databricks.com/KoreaDIDays2024
An Introduction to DBRX
4.3K views · 21 days ago
Learn from Naveen Rao, VP of Generative AI at Databricks, as he explains DBRX, a new, open source foundation model that sets the standard for production quality and price/performance. With up to 3x faster inference, DBRX outperforms all other open models on quality benchmarks, allowing enterprises to quickly build their own custom LLMs efficiently and with full control. Read more about ...
Demo: How Do I Use DBRX?
1.9K views · 21 days ago
Watch how to use DBRX on Databricks to build and customize GenAI applications using your own enterprise data. Read more about DBRX here: www.databricks.com/blog/announcing-dbrx-new-standard-efficient-open-source-customizable-llms?
What's Next for Apache Spark™ Including the Upcoming Release of Apache Spark 4.0
7K views · 21 days ago
Reynold Xin, Co-founder and Chief Architect, Databricks, shares the latest innovation coming out of the Apache Spark™ open source project, including a preview of the anticipated release of Spark 4.0. Speakers: Reynold Xin, Co-founder and Chief Architect, Databricks; Tareef Kawaf, President, Posit Software, PBC
The Evolution of Delta Lake from Data + AI Summit 2024
2.4K views · 21 days ago
Shant Hovsepian, Chief Technology Officer of Data Warehousing at Databricks, explains why Delta Lake is the most adopted open lakehouse format. Includes:
- Delta Lake UniForm GA (support for and compatibility with Hudi, Apache Iceberg, Delta)
- Delta Lake Liquid Clustering
- Delta Lake production-ready catalog (Iceberg REST API)
- The growth and strength of the Delta ecosystem
- Delta Kernel
- D...
Setting up PAT and Secret Scope
427 views · 21 days ago
A quick video on how to set up a Personal Access Token, a Secret Scope, and a Secret with Azure Key Vault.
Increase your column sizes without rewriting the entire table
793 views · 21 days ago
Docs: docs.databricks.com/en/delta/type-widening.html
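Per the linked type-widening docs, widening must first be enabled on the table; then a column's type can be widened in place. A hedged sketch (table and column names are illustrative):

```sql
-- Enable type widening on an existing Delta table.
ALTER TABLE sales SET TBLPROPERTIES ('delta.enableTypeWidening' = 'true');

-- Widen INT -> BIGINT as a metadata change, without rewriting
-- the existing data files.
ALTER TABLE sales ALTER COLUMN quantity TYPE BIGINT;
```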
Announcing Delta Lake 4.0 with Liquid Clustering. Presented by Shant Hovsepian at Data + AI Summit
5K views · 21 days ago
Open Sourcing Unity Catalog Live Onstage with Matei Zaharia at Data + AI Summit 2024
1.2K views · 21 days ago
Databricks LakeFlow: A Unified, Intelligent Solution for Data Engineering. Presented by Bilal Aslam
9K views · 21 days ago
Recap of Announcements at Data + AI Summit 2024 with Ali Ghodsi, Co-Founder and CEO, Databricks
982 views · 21 days ago
Announcing Databricks Clean Rooms with Live Demo. Presented by Matei Zaharia and Darshana Sivakumar
1.3K views · 21 days ago
Data Sharing and Cross-Organization Collaboration. Presented by Matei Zaharia at Data + AI Summit
460 views · 21 days ago
Announcing Unity Catalog Metrics with Live Demo. Matei Zaharia and Zeashan Pappa at Data + AI Summit
1.2K views · 21 days ago
Evolving Data Governance With Unity Catalog Presented by Matei Zaharia at Data + AI Summit 2024
3.3K views · 21 days ago
Unity Catalog Demo of New Features with Zeashan Pappa at Data + AI Summit 2024
1.4K views · 21 days ago
How Data Intelligence is Delivering Big Wins at Texas Rangers. Alexander Booth at Data + AI Summit
494 views · 21 days ago
The Future of Lakehouse Format Interoperability with Ali Ghodsi and Ryan Blue at Data + AI Summit
682 views · 21 days ago
How to Make Small Language Models Work. Yejin Choi Presents at Data + AI Summit 2024
4K views · 21 days ago
Data + AI Summit Keynote 2024 - Day 2 Opening Remarks with Ali Ghodsi
288 views · 21 days ago
Building an Enterprise Data & AI Catalog with Databricks Unity Catalog
1.3K views · 21 days ago
Patrick Wendell, Co-founder and VP of Engineering on Building Production-Quality AI Systems
2K views · 21 days ago
The Best Data Warehouse is a Lakehouse
4.8K views · 21 days ago
Nice video. You showcased an example of DuckDB integration with Unity Catalog. Can you please help me validate whether my understanding of the behaviour of the open-sourced Unity Catalog (UC), captured in the points below, is correct? 1. Within UC, we can apply data security changes like data masking. When this UC is accessed from DuckDB, the columns will appear masked there too. 2. Similarly, other security config such as row-level security and column-level security will also be visible in DuckDB. 3. Similar to the "attach accounts_prod" command in DuckDB, we can integrate UC with other lakehouse implementations such as Microsoft Fabric and even on-prem Delta Lake (or at least such integration is on the roadmap). 4. Such tables are hosted/managed within Databricks but are accessed from DuckDB too, which is the reverse of what is done in the case of an "external table".
Thank you for your great presentation. Could you please provide the source code or share the link to the Git repository?
Epic Brit accent
This sounds backwards
system.billing.usage does not contain a cluster_id column
Would it be possible to generate this on a DLT streaming table?
Lovely! .... more detailed demo please 🤓 many thanks
No module named 'databricks.vector_search'
I can't find the "launch Genie" button. Do I need to enable anything else?
The video is playing back strangely interlaced, like a column was out in the stream or what not.
How do I install a private package on serverless compute? I used an init_script with normal compute to add private package access details in /etc/pip.conf, BUT with serverless, init scripts can't be used.
I got stuck because I'm only receiving 3 rows of responses, but my table has 800.
Thanks for sharing - great content Naveen
are you hiring?
Epic
Databricks AI is great, but a showstopper for adoption in many big organisations is data privacy and residency. My understanding is that the data leaves the organisation's tenant.
Brilliant 👌🏽
Awesome!
When I choose Timeseries as the profile type and set all the required fields it seems to be working fine, but when I open the dashboard it throws an error like "Table or View {catalog}.{schema}.my_table_profile_metrics cannot be found". Should I create my_table_profile_metrics myself, or is it part of the process?
Awesome. Can I use German to ask these questions?
curious to know, too
Did I miss something, or did you not mention how to activate the AI features? When I try to use ai_analyze_sentiment() within my serverless Europe West env, I get an error saying AI_FUNCTION_HTTP_REQUEST_ERROR.
Wow, this is amazing. I wanted to understand how the VARIANT data type is different from the Struct type. Also, a second question: how does it work with an array of JSON?
Variant can be a mix of structs and arrays. The difference is the flexibility that you can have compared to the other two.
People who don't understand the video at all (possibly didn't even finish watching) comment that Feifei Li is "creepy". That's when you know most of the world has gone crazy.
Love these shorts
I ended up writing a custom function to handle data in batches, recursively exploding lists and normalizing dictionaries. Not having a schema, and frontend developers saving elements as lists, then dictionaries, and then as bananas, was tricky. I will give this one a try 😅
Hope this simplifies things! Would love to hear if you notice performance gains too. Holly
Hello, I couldn't follow along because of the Jupyter notebook. What do you recommend I follow in order to replicate what you did in this video? Thank you.
Well to be honest, it is not really free. You still need to pay for the AWS resources set up through CloudFormation. That stack cost me ~7 USD per day.
Awesome!
oh great.
Great video Jason, thanks for putting it together. Would you be able to share the notebooks as well?
Unfortunately, can't understand shit of what he's saying!
Getting an issue when trying to create the serving endpoint. It's saying Served entity creation aborted for served entity `audio_transcription_chatbot_model-1`, config version 1, since the update timed out. Have not been able to figure out why.
Got it, cyber pulse-taking!
Very inspiring! My mind is going at 1,000 miles an hour with ideas for our startup and clients from this!
Where can I access your code or workbook? It would be nice to run your code.
Holden and team are incredibly engaging and very easy to understand!
Great feature, please also include low code features in order to be more beneficial as Data factory also has for ETL
awesome
Excellent presentation, beginning 3.5-4.0 Billion years ago and explaining all the way to now (AI, non-physical-spatial). Excellent. Thank you. 👏
Who's the speaker?
Holly Smith - FYI it's also me in the comments for my videos so fire away with any technical follow on questions - Holly
@@Databricks Awesome thanks
AI can do everything you need to do in times of studying and understanding AI.
Awesome 👏🏾
🎉
How does parse_json handle schema evolution? From my knowledge, parsing the schema on the fly is not recommended for prod tables; it's safer to define the schema first.
I agree, but with a lot of JSON data you don't know the schema upfront and so can't define it. It's worth noting this is different from inferring the schema which looks at the first 1000 rows and is brittle to upstream changes - Holly
@@Databricks We used parse_json for dev and exploration purposes as well, thanks for the clarification
@@gravenguan No worries! Hope this clarifies for other users too
this is clearly copied from snowflake
Variants in their various forms have been around for many decades. We're big fans of open source so anyone can use the implementation in other projects or products.