I never knew what DS2 was actually *for*. This was a breakthrough for me.
Also, please keep making tutorials like these. They are incredibly helpful :)
We sure will! Thanks for watching, Ahmed!
Fantastic, learned a lot! Thank you very much
Glad it was helpful!
Video playback speed = 2 . That's what I am talking about.
Oh, yeah - the ONLY way to listen to tech videos :-D
Thanks for the info
Absolutely! We're glad you found it valuable.
Good one learned many new concepts... thanks much 👍😊🙏
Thanks, Pavan! So glad you found this useful :-D
Thank you so much ! This video was enlightening ! 😊
We're thrilled to hear that, Sarah! Thanks for letting us know, and we'll be sure to share your comment with Mark!
Thanks for the feedback, Sarah. You just made my day! Nice way to start my weekend :-) May the SAS be with you!
Wow. 💝
Thanks, Argentina 🙂
Hello, hello!!!!👏👏👍
Hello! 🙂
I love the background of your SAS editor. How can I achieve that? I tried to look for an option, but in vain. Plus, thanks for the presentation.
Hi:
The Dark Theme is available in SAS Studio version 5.2 or later (5.2 ships with Viya 3.5). The Dark Theme is not available in the SAS Studio 3.x series (SAS 9.4). If you are using SAS Studio on the free SAS OnDemand for Academics server account, you will not be able to use Dark Theme.
@@SASUsers Original site validation data
Current version: 9.04.01M6P110718
Site name: 'SAS ONDEMAND FOR ACADEMICS'.
Site number: 70094220.
CPU A: Model name='' model number='' serial=''.
Expiration: 31DEC2022.
Grace Period: 45 days (ending 14FEB2023).
Warning Period: 49 days (ending 04APR2023).
System birthday: 05JAN2021.
Operating System: LIN X64. Oops, does SAS OnDemand expire?
Access to SAS OnDemand for Academics can expire after an extended period of inactivity. To reactivate your access, visit 2.sas.com/6054JEdZM, enter your SAS Profile credentials, and accept the terms of the license and the terms of use and conditions. Learn more about SAS OnDemand for Academics under our Q&A section here: 2.sas.com/6055JEdZ3
Could you please explain the PROC DS2 concept?
Thank you for your feedback! We are looking into this for you!
@@SASUsers thank you, looking forward to your video on this 👍
Hey, Pavan - for a quick intro, check out my Jedi SAS Trick blog posts "Finding Tattoine with DS2" (bit.ly/JSTTattoine) and "Warp Speed DATA Steps with DS2" (bit.ly/JSTWarpSpeed). For more depth, have a peek at the Introduction to the DS2 Language in the online docs (bit.ly/DS2Intro).
DS2 is a Base SAS programming language for advanced data manipulation. Here are some great papers and documentation on PROC DS2.
•2.sas.com/6057J8XpP
•2.sas.com/6058J8Xpu
•2.sas.com/6059J8XpR
•2.sas.com/6050J8Xpr
@@pavanchandolu08 Here is a video that introduces DS2: czcams.com/video/HP7tzXHxkt0/video.html
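For readers who want to see the shape of a DS2 program before following the links, here is a minimal sketch of the threaded pattern the video discusses. It is an illustration only, not code from the video: the table names (work.have, work.scored), the thread name (calc_t), the column names, and the calculation are all made up, and the choice of four threads is arbitrary.

```sas
/* Hypothetical sample input: 1,000 rows with one numeric column */
data work.have;
  do id = 1 to 1000;
    x = id * 1.5;
    output;
  end;
run;

proc ds2;
  /* Thread program: each thread instance reads a share of the rows */
  thread calc_t / overwrite=yes;
    dcl double score;            /* new column computed in the thread */
    method run();
      set work.have;
      score = sqrt(x) * 2;       /* placeholder for CPU-heavy work */
    end;
  endthread;
  run;

  /* Data program: collects rows from four instances of the thread */
  data work.scored (overwrite=yes);
    dcl thread calc_t t;
    method run();
      set from t threads=4;
    end;
  enddata;
  run;
quit;
```

Because the rows come back from several threads, their order in work.scored is not guaranteed to match the input order.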
If your original SAS data has only one column, what code do you use to split the one column and create two more columns?
Sachin, unfortunately, we can't post code or screenshots in this YouTube feedback area, and too much is unknown about your question. The short answer is that, using a DATA step or PROC SQL, you could create a new SAS table with new columns from your existing SAS data table. The challenge is that it's impossible to show code without seeing a sample of the data and understanding what is in your one column and what criteria should be used for splitting. This is the kind of question that is best asked in the SAS Community Forum, where experienced SAS users help one another with coding questions. We would recommend posting your question in this forum: 2.sas.com/6050JD4A0 .
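As a rough illustration of the DATA step route mentioned above, here is a sketch under strong assumptions: that the single column is character, that it holds comma-delimited text, and that the second and third pieces are the ones wanted. The table and column names (work.have, raw, first_name, city) and the sample rows are invented; real data would need its own delimiter and rules.

```sas
/* Hypothetical sample data: one character column named raw */
data work.have;
  infile datalines truncover;
  input raw $char40.;
  datalines;
Smith,John,Boston
Lee,Ana,Austin
;
run;

/* Keep the original column and add two new ones taken from it */
data work.split;
  set work.have;
  length first_name city $ 20;
  first_name = scan(raw, 2, ',');   /* second comma-delimited piece */
  city       = scan(raw, 3, ',');   /* third comma-delimited piece  */
run;
```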
Thank you so much for this insightful tutorial. I learned new things today.
I have a question regarding DS2, since I'm new to SAS.
I tried your code with the database I have on Oracle, and it took about 40 minutes to complete, with the log below:
Thread 4 processed 1370861 observations.
Thread 1 processed 1370496 observations.
Thread 2 processed 1369472 observations.
Thread 3 processed 1371264 observations.
NOTE: BASE driver, fixed byte UNICODE encodings not supported, using UTF-8 instead
WARNING: BASE driver, table encoding reset to utf-8
NOTE: BASE driver, creation of a VARCHAR column has been requested, but is not supported by the BASE driver. A CHAR column has been created instead.
NOTE: BASE driver, creation of a NUMERIC column has been requested, but is not supported by the BASE driver. A DOUBLE PRECISION column has been created instead.
NOTE: BASE driver, creation of a TIMESTAMP column has been requested, but is not supported by the BASE driver. A DOUBLE PRECISION column has been created instead. A format has been associated with each column.
NOTE: Execution succeeded. 5482093 rows affected.
55 quit;
NOTE: PROCEDURE DS2 used (Total process time):
real time 40:57.16
cpu time 1:36.98
If you can give me a quick tip on what I should modify or look into, that would be great.
Thank you
From the log, I note two things:
1. Your code extracted about 5.5M rows from the Oracle table. DS2 provides multi-threaded processing, but on the SAS Compute Server. So all of that data had to be transported over the network to the SAS Compute Server before DS2 could process it.
2. Your elapsed time is about 41 minutes compared to a CPU time of about 1.5 minutes. Because elapsed time is WAY larger than CPU time, we can conclude that this process is not CPU bound - it's I/O bound instead.
Multiple compute threads don't help when a process is I/O bound. Think of it this way: It takes a crew 40 minutes to load and unload a truck (I/O), and it takes 2 minutes to drive the truck to the destination (CPU). If you only have one loading crew, there's no benefit in adding more trucks. As a matter of fact, because using multiple trucks requires extra coordination, it will probably take longer if you use more than one truck.
So back to the problem at hand - because this process is not CPU bound, processing speed will not be improved by multi-threading in DS2, and it may actually be slower.
Now, DS2 does not support a WHERE statement, but it WILL accept the results of a FedSQL query on a SET statement - something like this:
SET {select Customer_ID, Product_Name, Quantity from mydb.Orders as o inner join mydb.products as p on o.Product_ID=p.Product_ID where quantity >6};
This would push the join and WHERE subsetting into the database, and could significantly reduce the amount of data retrieved to the Compute Server for further processing by DS2.
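To show where that SET statement sits in a complete program, here is a sketch. It is an assumption-laden stand-in: the two tables are built in WORK with made-up rows so the example runs without a database, and the output table name (work.big_orders) is invented. With a real database libref such as mydb, the query in braces is the part that would be handed to the database.

```sas
/* Hypothetical stand-ins for the database tables, built in WORK */
data work.orders;
  input Customer_ID Product_ID Quantity;
  datalines;
101 1 10
102 2 3
103 1 7
;

data work.products;
  input Product_ID Product_Name $;
  datalines;
1 Widget
2 Gadget
;

proc ds2;
  data work.big_orders (overwrite=yes);
    method run();
      /* The FedSQL query in braces does the join and the filtering;
         DS2 only sees the rows and columns the query returns */
      set {select o.Customer_ID, p.Product_Name, o.Quantity
             from work.orders as o
             inner join work.products as p
               on o.Product_ID = p.Product_ID
            where o.Quantity > 6};
    end;
  enddata;
  run;
quit;
```

Qualifying each column with its table alias (o. or p.) is a choice made here to keep the join unambiguous; any further row-by-row logic would go after the SET statement inside method run().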
What is CPU bound? I have a process that takes a few minutes in CPU time but almost an hour in real time. Would DS2 help with that?
George, thank you for your inquiry! We are checking on this for you!
@@SASUsers
My log from the proc sql:
real time 34:41.91
user cpu time 1:14.04
system cpu time 3:00.52
Memory 791938.68k
OS Memory 805132.00k
George, the definition of CPU-bound is addressed in the video around 12:50 (in Tip #5: What if you have a CPU-bound process) and demoed by looking at the log. A program is CPU-bound when its real time is approximately equal to its CPU time. In the log you posted, real time (34:41.91) is far larger than user plus system CPU time (about 4:15), so this program does not look CPU-bound, and DS2 threading is unlikely to speed it up on its own. To know for sure, you'll have to rewrite the code using DS2, benchmark it, and read your log to make the determination. You can also learn more about DS2 in our High-Performance Data Manipulation with SAS® DS2 course.
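The real-time versus CPU-time comparison can be done by hand, or with a few lines of arithmetic. The sketch below plugs in the timings from the PROC SQL log posted above; the variable names are made up for the example, and adding user and system CPU time together is an assumption about how to total the CPU work.

```sas
data _null_;
  /* Timings copied from the PROC SQL log posted above */
  real_sec  = 34*60 + 41.91;                   /* real time 34:41.91     */
  cpu_sec   = (1*60 + 14.04) + (3*60 + 0.52);  /* user + system CPU time */
  cpu_share = cpu_sec / real_sec;              /* fraction of the elapsed time spent on CPU */
  put real_sec= cpu_sec= cpu_share= percent8.1;
run;
```

Here CPU time comes to roughly 12% of real time. A share close to 100% is what a CPU-bound step looks like; a small share means most of the elapsed time went to waiting, typically on I/O.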
Hi:
Unfortunately, we can't provide benchmarking help or Tech Support in this YouTube feedback area. Your best resource for this question is SAS Technical Support. To open a track with Tech Support, fill out the form at this link: 2.sas.com/6057JHyL3 .
Would "where" inside the SET statement change the results?
Hi there! We're looking into this for you and will follow back up with more info!
I THINK you are asking about the relative processing speed of the WHERE= dataset option vs. the WHERE statement - something like this:
set mydb.big_table (where=(order_type=1));
vs.
set mydb.big_table;
where order_type=1;
If so, these two techniques are completely equivalent as far as efficiency is concerned. So you can pick the one that works best for you without having to worry about speed ;-)
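On the original question of whether the results change, one way to check on your own data is to run both forms and compare the outputs. The sketch below uses a small made-up table (work.big_table) in place of mydb.big_table, and the output table names are hypothetical.

```sas
/* Hypothetical sample table standing in for mydb.big_table */
data work.big_table;
  do order_id = 1 to 100;
    order_type = mod(order_id, 3);
    output;
  end;
run;

/* 1. WHERE= data set option */
data work.by_option;
  set work.big_table (where=(order_type=1));
run;

/* 2. WHERE statement */
data work.by_statement;
  set work.big_table;
  where order_type=1;
run;

/* Report any differences between the two output tables */
proc compare base=work.by_option compare=work.by_statement;
run;
```

PROC COMPARE writes a summary of any differences to the output, so the two approaches can be confirmed to select the same rows before timing them against each other.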
Hope Mark's info in his reply helped! Let us know if you have any other questions! 🙂
@@mark.jordan7446 Thanks, I see the similarity in run times. In my test over a 90 million record table, the "IF" statement performed worst at 7:46:30, while the "where with keep" dataset options came in at only ~7.27. But PROC SQL beat them all at 7.19! Even PROC SQL with a keep= option on the "FROM" clause performed slightly worse, at 7.34.
@@dgitson Yep - hard to beat SQL for pulling a single result set from a big ol' database table :-)
Does anyone use IDE VSCode or IDE Eclipse to edit and run embedded SAS code?
Hi Carlos! We're looking into this for you and will follow back up with more info shortly! 👍
Thanks for your patience while we looked into this! The best place to request user feedback on this kind of question is the SAS User Communities forum. Here's an example of a posting about using Eclipse: 2.sas.com/6052JBC4r . We would recommend posting in either the Administration Community (2.sas.com/6053JBC4T) or the Programming Community (2.sas.com/6054JBC4p).
Hope that helps!