Databricks
  • 2,962
  • 16,255,619
Make your records Unique with Generated Identity Columns
Check out the docs here: docs.databricks.com/en/delta/generated-columns.html#use-identity-columns-in-delta-lake
Find more examples in this blog: www.databricks.com/blog/2022/08/08/identity-columns-to-generate-surrogate-keys-are-now-available-in-a-lakehouse-near-you.html
Views: 1,002

Videos

How to enforce data quality across columns within a table in Databricks
1.4K views · 14 hours ago
Check out the docs here: docs.databricks.com/en/delta/generated-columns.html
Databricks Clean Rooms
1.9K views · 19 hours ago
Data clean rooms allow businesses to easily collaborate on data in a secure environment, where multiple parties can safely combine sensitive data without compromising privacy or security. By implementing stringent protocols and advanced technologies, data clean rooms enable organizations to share data securely while ensuring compliance with privacy and regulatory requirements. In an era where d...
AI/BI: Intelligent Analytics for Real-World Data
2.7K views · 1 day ago
In this video you will learn about AI/BI which features two complementary capabilities: Dashboards and Genie. Dashboards provide a low-code experience to help analysts quickly build highly interactive data visualizations for their business teams using natural language, and Genie allows business users to converse with their data to ask questions and self-serve their own analytics. Databricks AI/...
AI-Powered Data Warehousing on Databricks SQL
1.3K views · 1 day ago
In this video, you will learn how you can leverage Databricks SQL's data warehousing capabilities to call AI functions, query models, and utilize the context-aware Databricks Assistant for seamless and efficient data analysis. This powerful combination makes it easier for analysts to unlock valuable insights and drive impactful decisions.
LakeFlow Demo
4.1K views · 14 days ago
Databricks LakeFlow is a new solution that contains everything you need to build and operate production data pipelines. It includes new native, highly scalable connectors for databases including MySQL, Postgres, SQL Server and Oracle and enterprise applications like Salesforce, Microsoft Dynamics, NetSuite, Workday, ServiceNow and Google Analytics. Users can transform data in batch and streamin...
Say goodbye to messy JSON headaches with VARIANT
3.3K views · 14 days ago
Try it out today on Databricks: docs.databricks.com/en/semi-structured/variant.html
Read more about it on our blog: www.databricks.com/blog/introducing-open-variant-data-type-delta-lake-and-apache-spark
If you're curious about the implementation, check out the talk: czcams.com/video/jtjOfggD4YY/video.html
Or read about it on GitHub: github.com/apache/spark/blob/master/common/variant/README.md
Data Intelligence Day Seoul 2024
567 views · 14 days ago
Data Intelligence Day Seoul, Korea took place on 23 April 2024 and gathered over 1,200 industry leaders and data and AI experts. Watch Data Intelligence Day Seoul On Demand: events.databricks.com/KoreaDIDays2024
An Introduction to DBRX
4.3K views · 21 days ago
Learn from Naveen Rao, VP of Generative AI at Databricks, as he explains DBRX, a new, open source foundation model that sets the standard for production quality and price/performance. With up to 3x faster inference, DBRX outperforms all other open models in quality benchmarks, which allows enterprises to quickly build their own custom LLMs efficiently and with full control. Read more about ...
Demo: How Do I Use DBRX?
1.9K views · 21 days ago
Watch how DBRX uses Databricks to build and customize GenAI applications using your own enterprise data. Read more about DBRX here: www.databricks.com/blog/announcing-dbrx-new-standard-efficient-open-source-customizable-llms?
What's Next for Apache Spark™ Including the Upcoming Release of Apache Spark 4.0
7K views · 21 days ago
Reynold Xin, Co-founder and Chief Architect, Databricks, shares the latest innovation coming out of the Apache Spark™ open source project, including a preview of the anticipated release of Spark 4.0.
Speakers: Reynold Xin, Co-founder and Chief Architect, Databricks; Tareef Kawaf, President, Posit Software, PBC
The Evolution of Delta Lake from Data + AI Summit 2024
2.4K views · 21 days ago
Shant Hovsepian, Chief Technology Officer of Data Warehousing at Databricks, explains why Delta Lake is the most adopted open lakehouse format. Includes:
- Delta Lake UniForm GA (support for and compatibility with Hudi, Apache Iceberg, Delta)
- Delta Lake Liquid Clustering
- Delta Lake production-ready catalog (Iceberg REST API)
- The growth and strength of the Delta ecosystem
- Delta Kernel
- D...
Setting up PAT and Secret Scope
427 views · 21 days ago
Quick video on how to set up a Personal Access Token, a Secret Scope, and a Secret with Azure Key Vault.
Increase your column sizes without rewriting the entire table
793 views · 21 days ago
Docs: docs.databricks.com/en/delta/type-widening.html
Announcing Delta Lake 4.0 with Liquid Clustering. Presented by Shant Hovsepian at Data + AI Summit
5K views · 21 days ago
Open Sourcing Unity Catalog Live Onstage with Matei Zaharia at Data + AI Summit 2024
1.2K views · 21 days ago
Databricks LakeFlow: A Unified, Intelligent Solution for Data Engineering. Presented by Bilal Aslam
9K views · 21 days ago
Recap of Announcements at Data + AI Summit 2024 with Ali Ghodsi, Co-Founder and CEO, Databricks
982 views · 21 days ago
Announcing Databricks Clean Rooms with Live Demo. Presented by Matei Zaharia and Darshana Sivakumar
1.3K views · 21 days ago
Data Sharing and Cross-Organization Collaboration. Presented by Matei Zaharia at Data + AI Summit
460 views · 21 days ago
Announcing Unity Catalog Metrics with Live Demo. Matei Zaharia and Zeashan Pappa at Data + AI Summit
1.2K views · 21 days ago
Evolving Data Governance With Unity Catalog Presented by Matei Zaharia at Data + AI Summit 2024
3.3K views · 21 days ago
Unity Catalog Demo of New Features with Zeashan Pappa at Data + AI Summit 2024
1.4K views · 21 days ago
How Data Intelligence is Delivering Big Wins at Texas Rangers. Alexander Booth at Data + AI Summit
494 views · 21 days ago
The Future of Lakehouse Format Interoperability with Ali Ghodsi and Ryan Blue at Data + AI Summit
682 views · 21 days ago
How to Make Small Language Models Work. Yejin Choi Presents at Data + AI Summit 2024
4K views · 21 days ago
Data + AI Summit Keynote 2024 - Day 2 Opening Remarks with Ali Ghodsi
288 views · 21 days ago
Building an Enterprise Data & AI Catalog with Databricks Unity Catalog
1.3K views · 21 days ago
Patrick Wendell, Co-founder and VP of Engineering on Building Production-Quality AI Systems
2K views · 21 days ago
The Best Data Warehouse is a Lakehouse
4.8K views · 21 days ago

Comments

  • @chinmaykajalwa
    @chinmaykajalwa 22 hours ago

    Nice video. You showcased an example of DuckDB integration with Unity Catalog. Could you please help me validate whether my understanding of the behaviour of the "open sourced Unity Catalog" (UC), captured in the points below, is correct?
    1. Within UC, we can apply data security changes such as data masking. When this UC is accessed from DuckDB, the columns will appear masked there too.
    2. Similarly, other security configuration, such as row-level security and column-level security, will also be visible in DuckDB.
    3. Similar to the "attach accounts_prod" command in DuckDB, we can integrate UC with other lakehouse implementations such as Microsoft Fabric and even on-prem Delta Lake (or at least such integration is on the roadmap).
    4. Such tables are hosted/managed within Databricks but are accessed from DuckDB too, which is the reverse of what is done in the case of an "external table".

  • @baijusingh8486
    @baijusingh8486 1 day ago

    Thank you for your great presentation. Could you please provide the source code or share the link to the Git repository?

  • @GeorgeSut
    @GeorgeSut 1 day ago

    Epic Brit accent

  • @OfferoC
    @OfferoC 2 days ago

    This sounds backwards

  • @pabloe1802
    @pabloe1802 4 days ago

    system.billing.usage does not contain a cluster_id column

  • @ArifMarias
    @ArifMarias 4 days ago

    Would it be possible to generate this on a DLT streaming table?

  • @jeffrey6124
    @jeffrey6124 5 days ago

    Lovely! .... more detailed demo please 🤓 many thanks

  • @priteshkhilari1918
    @priteshkhilari1918 5 days ago

    No module named 'databricks.vector_search'

  • @sandeepkashyap6917
    @sandeepkashyap6917 6 days ago

    I can't find the "launch genie" button. Do I need to enable anything else?

  • @Cal-e-man
    @Cal-e-man 6 days ago

    The video is playing back strangely interlaced, like a column was out in the stream or what not.

  • @ngneerin
    @ngneerin 6 days ago

    How do I install a private package in serverless compute? I used init_script with normal compute to add private package access details in /etc/pip.conf, but with serverless, init scripts can't be used.

  • @pedrocavalcanti7488

    Hi, I got stuck because I'm only receiving 3 rows of responses, but my table has 800.

  • @indiandelicacies1745

    Thanks for sharing - great content Naveen

  • @joyo2122
    @joyo2122 6 days ago

    are you hiring?

  • @Seanrck
    @Seanrck 7 days ago

    Epic

  • @lorgerdat
    @lorgerdat 7 days ago

    Databricks AI is great, but a showstopper for adoption in many big organisations is data privacy and residency. My understanding is that the data leaves the organisation's tenant.

  • @b4dabng272
    @b4dabng272 7 days ago

    Brilliant 👌🏽

  • @LairdErnst
    @LairdErnst 8 days ago

    Awesome!

  • @majidafra
    @majidafra 8 days ago

    When I choose Timeseries as the profile type and set all the required fields, it seems to work fine, but when I open the dashboard it throws an error like "Table or View {catalog}.{schema}.my_table_profile_metrics Can not be found". Should I create my_table_profile_metrics myself, or is it part of the process?

  • @kuto1
    @kuto1 8 days ago

    Awesome. Can I use German to ask these questions?

  • @matthiasmueller9340

    Did I miss something, or did you not mention how to activate the AI features? When I try to use ai_analyze_sentiment() within my serverless Europe West env, I get an error saying AI_FUNCTION_HTTP_REQUEST_ERROR.

  • @the_class_apart
    @the_class_apart 9 days ago

    Wow, this is amazing. I wanted to understand how the variant data type differs from the struct type. Second question: how does it work with an array of JSON?

    • @Databricks
      @Databricks 9 days ago

      Variant can be a mix of structs and arrays. The difference is the flexibility that you can have compared to the other two.

  • @liqunxie7830
    @liqunxie7830 9 days ago

    People who don't understand the video at all (and possibly didn't even finish watching it) call Feifei Li "creepy" in the comments. That's when you know most of the world has gone crazy.

  • @alex_316
    @alex_316 9 days ago

    Love this shorts

  • @fernalication
    @fernalication 10 days ago

    I ended up writing a custom function to handle data in batches, recursively exploding lists and normalizing dictionaries. Not having a schema, with frontend developers saving elements as lists, then dictionaries, and then as bananas, was tricky. I will give this one a try 😅

    • @Databricks
      @Databricks 9 days ago

      Hope this simplifies things! Would love to hear if you notice performance gains too. Holly

  • @esteban-alvino
    @esteban-alvino 10 days ago

    Hello, thanks for the video. I couldn't follow along because of the Jupyter notebook. What do you recommend I follow in order to replicate what you did in this video? Thank you.

  • @AmnBrt
    @AmnBrt 10 days ago

    Well to be honest, it is not really free. You still need to pay for the AWS resources set up through CloudFormation. That stack cost me ~7 USD per day.

  • @ledinhanhtan
    @ledinhanhtan 11 days ago

    Awesome!

  • @SmartAI247
    @SmartAI247 11 days ago

    oh great.

  • @trevorwills4234
    @trevorwills4234 12 days ago

    Great video, Jason. Thanks for putting it together. Would you be able to share the notebooks as well?

  • @mhalton
    @mhalton 13 days ago

    Unfortunately, can't understand shit of what he's saying!

  • @rebeccaamador6926
    @rebeccaamador6926 13 days ago

    Getting an issue when trying to create the serving endpoint. It's saying Served entity creation aborted for served entity `audio_transcription_chatbot_model-1`, config version 1, since the update timed out. Have not been able to figure out why.

  • @yao5261
    @yao5261 13 days ago

    Got it: cyber pulse-taking!

  • @FullEvent5678
    @FullEvent5678 14 days ago

    Very inspiring! My mind is going at 1000 miles an hour with ideas for our startup and clients from this!

  • @subedi04
    @subedi04 14 days ago

    Where can I access your code or workbook? It would be nice to run your code.

  • @AadidevSooknananNXS
    @AadidevSooknananNXS 14 days ago

    Holden and team are incredibly engaging and very easy to understand!

  • @ia6906
    @ia6906 14 days ago

    Great feature. Please also include low-code features, as Data Factory has for ETL, to make it more beneficial.

  • @Naraharisettiraviteja

    awesome

  • @brento2890
    @brento2890 15 days ago

    Excellent presentation, beginning 3.5-4.0 Billion years ago and explaining all the way to now (AI, non-physical-spatial). Excellent. Thank you. 👏

  • @TheDataArchitect
    @TheDataArchitect 15 days ago

    Who's the speaker?

    • @Databricks
      @Databricks 14 days ago

      Holly Smith - FYI it's also me in the comments for my videos so fire away with any technical follow on questions - Holly

    • @TheDataArchitect
      @TheDataArchitect 14 days ago

      @Databricks Awesome, thanks

  • @muhammadibrahimabdullahi3840

    AI can do everything you need to do in times of studying and understanding AI.

  • @benim1917
    @benim1917 15 days ago

    Awesome 👏🏾

  • @Thegameplay2
    @Thegameplay2 15 days ago

    🎉

  • @gravenguan
    @gravenguan 15 days ago

    How does parse_json handle schema evolution? To my knowledge, parsing the schema on the fly is not recommended for prod tables; it's safer to define the schema first.

    • @Databricks
      @Databricks 15 days ago

      I agree, but with a lot of JSON data you don't know the schema upfront and so can't define it. It's worth noting this is different from inferring the schema which looks at the first 1000 rows and is brittle to upstream changes - Holly

    • @gravenguan
      @gravenguan 15 days ago

      @Databricks We used parse_json for dev and exploration purposes as well. Thanks for the clarification.

    • @Databricks
      @Databricks 15 days ago

      @gravenguan No worries! Hope this clarifies for other users too

  • @nagendrasrinivas-cj7sr

    this is clearly copied from snowflake

    • @Databricks
      @Databricks 15 days ago

      Variants in their various forms have been around for many decades. We're big fans of open source so anyone can use the implementation in other projects or products.
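
The point made in the parse_json thread above, that a schema inferred from only the first rows is brittle to upstream changes while parsing each record on its own terms is not, can be illustrated with a rough sketch. This is plain Python using only the standard json module; it is not how Spark or Databricks implements schema inference or VARIANT. The helper names infer_schema and conforms are hypothetical, and the sample size is shrunk from 1000 to 3 to keep the data small.

```python
import json


def infer_schema(rows, sample_size=1000):
    """Hypothetical helper: map field name -> type name using only the first rows."""
    schema = {}
    for raw in rows[:sample_size]:
        for key, value in json.loads(raw).items():
            # The first type seen for a field wins, as in a naive sampler.
            schema.setdefault(key, type(value).__name__)
    return schema


def conforms(raw, schema):
    """Hypothetical helper: True if every field of the record matches the schema."""
    record = json.loads(raw)
    return all(
        key in schema and type(value).__name__ == schema[key]
        for key, value in record.items()
    )


# Three "early" rows share one shape; a later row changes upstream.
rows = ['{"id": 1, "tags": ["a"]}'] * 3 + ['{"id": "x-9", "extra": {"k": 1}}']

# A schema inferred from the sample only knows about the early shape.
schema = infer_schema(rows, sample_size=3)

# The late row has a new field and a changed type, so it no longer fits.
late_row_ok = conforms(rows[-1], schema)

# Parsing each record independently keeps whatever mix of objects and arrays it holds.
parsed = [json.loads(r) for r in rows]
```

Here schema ends up describing only id as an integer and tags as a list, so the final row fails the check, while parsed still holds all four records with their nested values intact.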