SQL Server in Microsoft Fabric? Use CETaS to move data!

  • Published Aug 5, 2024
  • People have been asking "What about SQL Server?" with regard to Microsoft Fabric. Bob Ward joins us to show a way you can get your SQL data into Fabric with CETaS and the power of shortcuts in OneLake. And then use that data in Power BI!
    CREATE EXTERNAL TABLE AS SELECT (CETAS) (Transact-SQL)
    learn.microsoft.com/sql/t-sql...
    Microsoft SQL, Data Virtualization and Microsoft Fabric (Slides)
    aka.ms/sqlfabricdemo
    Bob Ward:
    / bobwardms
    📢 Become a member: guyinacu.be/membership
    *******************
    Want to take your Power BI skills to the next level? We have training courses available to help you with your journey.
    🎓 Guy in a Cube courses: guyinacu.be/courses
    *******************
    LET'S CONNECT!
    *******************
    -- / guyinacube
    -- / awsaxton
    -- / patrickdba
    -- guyinacube.com
    **Gear**
    🛠 Check out my Tools page - guyinacube.com/tools/
    #MicrosoftFabric #SQLServer #GuyInACube
  • Science & Technology
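
For reference, the CETaS pattern demonstrated in the video has roughly this shape (a sketch in SQL Server 2022 syntax; the data source, credential, table, and folder names below are illustrative assumptions, not the video's exact script):

```sql
-- Sketch of the CETaS archive pattern (SQL Server 2022 syntax).
-- Data source, credential, and table names are illustrative assumptions.
CREATE EXTERNAL DATA SOURCE FabricLanding
WITH (
    LOCATION   = 'abs://archive@mystorageacct.blob.core.windows.net',
    CREDENTIAL = StorageCredential   -- database scoped credential, assumed created
);

CREATE EXTERNAL FILE FORMAT ParquetFormat
WITH (FORMAT_TYPE = PARQUET);

-- Export cold rows as Parquet files; a OneLake shortcut can then surface the folder.
CREATE EXTERNAL TABLE dbo.SalesArchive
WITH (
    LOCATION    = '/sales/2019/',
    DATA_SOURCE = FabricLanding,
    FILE_FORMAT = ParquetFormat
)
AS
SELECT *
FROM dbo.Sales
WHERE OrderDate < '2020-01-01';
```

From there, a shortcut in a Fabric lakehouse points at the storage folder, and the Parquet output can be loaded or converted for use in Power BI.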

Comments • 50

  • @chainpeu
    @chainpeu 1 year ago +16

    Using on-prem SQL Server as a Power BI data source must be one of the most widespread use cases out there. Before Fabric, exposing SQL Server tables to Power BI was very straightforward using Power BI gateways.
    It seems that with Fabric we now need to create SQL Server partitions, then provision blob storage in ADLS, then run CETaS to write that partition as a Parquet file on ADLS, then expose it to OneLake using shortcuts, then convert the virtualized Parquet files in OneLake into a Delta table, then create the data model in Power BI, and only then can you create a Power BI report out of a SQL Server table. Am I the only one who sees this as a convoluted, crazy over-engineering process?
    There are probably other ways using Azure Pipelines to copy data from on-prem SQL Server to OneLake (still in preview, I believe). But even when that is GA, the concept of having to copy your on-prem SQL Server structured data to a blob storage persistence layer to do simple reporting looks twisted. Companies that want to embrace Fabric are going to have to re-engineer their ETL processes to write to OneLake and deal with a heck of a one-time data load for the existing historical data. Just to reach, probably, a worse performance scenario than the one they already have.
    Direct Lake connectivity might be the one saving grace in this whole thing... we will have to see. But as of now I'm very sceptical and a bit pissed at MS with their strategy to push everyone (eventually) to this OneLake architecture that in many use cases is absolutely not required.

    • @noahhadro8213
      @noahhadro8213 1 year ago +5

      Well said

    • @GuyInACube
      @GuyInACube  1 year ago +5

      Appreciate the long comment and thoughts you put into that. To be clear, the approaches you've used in the past with Power BI and leveraging data from on-prem still work! That hasn't changed. There are new approaches and things are evolving. If you still want to import your data or use DirectQuery against something like SQL Server via a gateway, go for it! If you want to leverage OneLake for reuse of data, you can do that now as well. They are different architectures though. It's a question of whether you would benefit from more of a lakehouse approach, or something different. Fabric gives you a lot of options and you have different tools in your toolbelt.
      A lot of companies already have a data lake type approach where data is already in storage of some kind. This is an easy fit for Fabric. As you mentioned, if you want to take your existing on-prem assets to a lake centric approach, there is some thought and architecture that needs to go into doing that. But again, if what you have is working, that gives you time to think about that approach and if it makes sense from a business perspective. What problem are you trying to solve? Does a lake centric approach with Microsoft Fabric's new capabilities help solve that problem?
      The ability to use the gateway with pipelines, in Fabric, isn't an option yet. Get your feedback in at aka.ms/fabricideas! Lots of folks commenting the desire to have it - and I feel it makes sense.

    • @jeffgray8076
      @jeffgray8076 11 months ago

      Thanks for making this comment - I thought I was taking crazy pills watching that video. Taking data from an existing table, running it through several very brittle engineering steps, only to convert to a new table for use had me wondering what I was missing. Very skeptical of this use case.

    • @dagg497
      @dagg497 4 months ago +1

      @@jeffgray8076 I agree with you. I would have liked to see Fabric as just Azure Storage Gen2 + Data Factory + Synapse repackaged into a relational database like Azure SQL Database or Serverless.
      This move to Parquet is ONLY beneficial to Databricks imo.
      AND you get stuck basically having to write a Scala/Python ELT framework just to handle the Delta Lake ingestion/conversion/time logic..
      Oh, and it's real fun querying the data in SSMS and having to use an exact Azure URL 🤣🤣
      Side note: I hate Parquet and Databricks already after 8 months of tackling them.
      Not to mention that for years you never knew whether Excel/Teams stored data in OneDrive, SharePoint, or MS Stream.
      AND now there's also Dataverse, which actually looks very promising for the Power Platform, with the new Power Pages maybe replacing SharePoint altogether!

  • @Storm_Troooper
    @Storm_Troooper 1 year ago +2

    Yoooo!!! Great content guys. You guys are making my fabric journey easy to navigate with your amazing content. I love the fabric community

  • @kayespedal10
    @kayespedal10 11 months ago +1

    This was my second video I've watched. Wowsers, blown away, all that movement of data and zero code.
    A+

    • @GuyInACube
      @GuyInACube  11 months ago

      BAM! Welcome to the channel. Appreciate you watching!

  • @yanny2000
    @yanny2000 1 year ago +9

    Would love to see a video from you guys showing some examples or best practices for getting data from different sources and databases into Power BI.
    For example: we have a CRM running on MySQL, an ERP on Firebird, and a financial ERP on MSSQL. What is the best strategy to bring the data together in order to analyse it in Power BI? And where: in Fabric OneLake, in an Azure SQL data warehouse, ...?

    • @GuyInACube
      @GuyInACube  1 year ago

      We are already thinking about this. When we think of on-prem data, for lakehouse and warehouse, the only real option right now is Dataflows Gen2. The existing approaches with Power BI and the on-premises data gateway are still valid approaches for reporting on data. That doesn't take advantage of the new workloads though.
      I'll go back to what business problem are you trying to solve though and which tool will help you solve that problem. Going with a lake centric approach with Lakehouse/Warehouse/OneLake should be approached with some architecture in mind and the understanding that orchestration will need to take place regardless of the sources.

  • @AdrianMcGavock
    @AdrianMcGavock 1 year ago +1

    Great video, cool demo from Bob 🥳

    • @GuyInACube
      @GuyInACube  1 year ago

      Much appreciated! Thanks for watching 👊

  • @scottwilliams6741
    @scottwilliams6741 1 year ago +1

    Amazing, info! Thanks, Bob!

  • @denyschamberland8897
    @denyschamberland8897 1 year ago +3

    Question: I can easily extract data from an .html financial stocks table using a PySpark notebook with a BeautifulSoup crawler, define the schema and column headers, create a dataframe, and save the result in Delta (Stocks.part01, Stocks.part02, ..., Stocks.part0n) in a Lakehouse subfolder under Files. (This could also be achieved using a Python .py file as a Spark Job Definition with a schedule.) Still, Delta doesn't bring many options other than read (Preview) at this stage, so I guess we have little choice but to create a SQL table in the Warehouse using CETAS. The sample shows a single file, but what if we need to... sort of "concatenate" all the Delta parts (Stocks.part01, Stocks.part02, ..., Stocks.part0n) into one table from the same Delta location?
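
One possibility, assuming all the part files sit under a single folder: point OPENROWSET (or an external table's LOCATION) at the folder with a wildcard, and the engine reads every matching Parquet file as one rowset. A sketch with assumed data source and folder names:

```sql
-- Sketch: read all Parquet part files under one Delta folder as a single rowset.
-- The data source and folder names are illustrative assumptions.
SELECT *
FROM OPENROWSET(
    BULK 'stocks/*.parquet',
    DATA_SOURCE = 'LakehouseFiles',   -- external data source, assumed created
    FORMAT = 'PARQUET'
) AS stock_rows;
```

One caveat: reading the folder directly bypasses the Delta transaction log, so the result only matches the Delta table if no part files have been deleted or rewritten by compaction.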

  • @clinthofeldt8299
    @clinthofeldt8299 1 year ago +2

    Any word on whether we will have the ability to see usage metrics when moving items over from SQL into a Fabric Lakehouse or Warehouse, similar to how we can currently use the Monitoring features in Azure to see Azure SQL Server space and DTU usage when performing read/write operations? If I point a Power BI report/dataset to the Lakehouse SQL endpoint and set an hourly refresh rate on it, it would be helpful to compare that read/write usage to the comparable Azure SQL metrics to see what makes sense cost-wise for different scenarios.

  • @ravindersunkara5295
    @ravindersunkara5295 1 year ago +1

    The Fabric & OneLake concept is cool

    • @GuyInACube
      @GuyInACube  1 year ago

      We agree! Can take things to a whole new level 👊

  • @kanishk7267
    @kanishk7267 1 year ago +1

    Amazing content.

  • @Bizza12345
    @Bizza12345 11 months ago

    great video

  • @gpltaylor
    @gpltaylor 27 days ago

    Things are getting very exciting! Is there a way to stream live SQL Server production data into Fabric?
    We need to report on live data, but we don't want to drop and recreate the data sources.. what's the best option for streaming live data into Fabric?

  • @synthmania7275
    @synthmania7275 11 months ago

    Yo ! To both of y’all!!

  • @Khan-mc3gi
    @Khan-mc3gi 1 year ago

    I want to ask: just as Fabric can show a Power BI dashboard report in one click, can Fabric show a SQL report the same way? That is, total sales, total orders, max sales, etc. - if these and other SQL queries could be found in one click in Fabric, it would be great, because then we wouldn't need to learn SQL; we could get all the SQL queries in one click in Fabric. Please answer my question.

  • @zorgath
    @zorgath 11 months ago

    Is there any way to do this prior to SQL Server 2022?

  • @akthar3
    @akthar3 7 months ago

    Am I right in understanding that you cannot create shortcuts in a MS Fabric lakehouse to directly access PaaS SQL Server tables? So you need a complex workaround to make it happen.
    What is the story behind this limitation? Is there a more straightforward solution on its way in the future?

  • @sonicdoomofficial9200
    @sonicdoomofficial9200 11 days ago

    Got a specific question and don't know how to navigate it. I have an Azure SQL Managed Instance in the same resource group as my Fabric capacity, but I cannot for the life of me get Fabric Data Factory to connect to it. I can connect to the instance remotely via SSMS. Any idea what to troubleshoot? It looks networking-related somehow.

  • @ItsNotAboutTheCell
    @ItsNotAboutTheCell 1 year ago +1

    #QuickCreateEverything!!!

  • @mnhworkdev1652
    @mnhworkdev1652 1 year ago

    I wonder what the SQL DB compatibility level needs to be to achieve this.

    • @_HugoQueiroz
      @_HugoQueiroz 1 year ago +1

      There's no compatibility level requirement for CETAS.
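
For context, what gates CETAS in SQL Server 2022 is instance-level PolyBase configuration rather than a database compatibility level. A sketch of the usual prerequisites (verify option names against the current docs before relying on this):

```sql
-- Sketch: CETAS in SQL Server 2022 depends on the PolyBase feature being
-- installed and these instance-level options, not on compatibility level.
EXEC sp_configure 'polybase enabled', 1;
RECONFIGURE;
EXEC sp_configure 'allow polybase export', 1;
RECONFIGURE;
```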

  • @robbief1
    @robbief1 11 months ago

    How long until an Azure SQL table can exist as a Fabric table? (The lines are becoming blurred.)

    • @GuyInACube
      @GuyInACube  11 months ago +1

      Great question! Not sure if that's even in the works. Get your feedback in though at aka.ms/fabricideas if that's something you'd like to see.

  • @joshash5944
    @joshash5944 1 year ago

    Just waiting for the perfect world where Delta performs as well as, or better than, an indexed relational data warehouse.
    I'm skeptical of Microsoft's use of Delta in Fabric Warehouse - and I hope it improves; the performance has been poor. It seems like Synapse Serverless.

  • @dagg497
    @dagg497 4 months ago +1

    If it ain't broke, don't fix it. This is just ETL with convoluted extra steps 😅
    I am really starting to miss Blob Storage and Gen2 storage and a foolproof SQL database..
    If the data lake filesystem had a GUI, fine. But in the cases I've seen, you need a Scala/Python framework to handle Parquet ingestion/conversion. Oh, and querying the data needing an Azure URL string is no fun 😭

  • @kc4848_cal
    @kc4848_cal 2 months ago

    Now how do I convince my IT team, who controls all this but doesn't use any of these tools, that we need this by the end of the week?

  • @szklydm
    @szklydm 1 year ago +3

    #IWatchPatrick #TheFabric

    • @GuyInACube
      @GuyInACube  1 year ago

      haha Patrick appreciates that! 🙏👊

  • @Karenshow
    @Karenshow 9 months ago

    Cold data vs Hot Data ?

    • @dagg497
      @dagg497 4 months ago +1

      Hot data is what you typically use. For instance, for measures and KPIs in Power BI you usually compare to "Same Period Last Year", so hot data is usually 2 years.
      Cold data is historic data > 2 years. Let's say year 3 through year 10.

  • @JasonHurley
    @JasonHurley 1 year ago +3

    Important question... are people seriously calling it "The Fabric"?

  • @TedSeeber
    @TedSeeber 1 year ago +5

    "I didn't write a single line of code"- as he shows approximately 60 lines of SQL Scripting.
    I think this no-code thing is not quite as ready for primetime as Microsoft wants us to believe.

    • @reams
      @reams 1 year ago

      Right....I was like how did you create the archive tables 😂

    • @bobward436
      @bobward436 1 year ago +2

      @@reams It is a fair comment. What we meant is that no code is needed once the cold data is archived in Azure Storage

    • @dagg497
      @dagg497 4 months ago +1

      Agreed. Setting up external table management probably isn't static and has to run in a stored procedure. On top of that you have that sweet, sweet SQL that needs to query an exact Azure storage URL 😂

  • @tomek8464
    @tomek8464 1 year ago +1

    I just did a yo😂

  • @vincenzobuono3841
    @vincenzobuono3841 1 year ago

    The same thing that we already have in Synapse Analytics, but now we can use a new word... shortcut....

    • @GuyInACube
      @GuyInACube  1 year ago

      Shortcuts are amazing! This helps you to leverage the next generation of Synapse!

  • @thomasbrothaler4963
    @thomasbrothaler4963 1 year ago

    Great video, but is anyone else feeling uncomfortable due to the lack of naming conventions used for the SQL tables and columns? 😀