Cache, from History to the Future of CPU Memory

  • Added 8. 07. 2024
  • A look back to the early days of cache-less computing, to what's coming next with Zen 2.
    -- Video Links Below --
    CC BY-SA 2.5, commons.wikimedia.org/w/index...
  • Science & Technology

Comments • 696

  • @Peds013
    @Peds013 5 years ago +273

    Your dad started coding at 60...
    My boss still can't use a mouse :-/

    • @MarikHavair
      @MarikHavair 5 years ago +35

      @calistorich Reminds me of one of my favorite quotes.
      "It's is of the nature of man to err, and to blame it on someone else shows management potential."

    • @shznn
      @shznn 5 years ago +10

      Someone very smart would say here, "Hmm, your comment, sir, explains everything that's wrong with society, hmm hmm." :)

    • @rpmTweeK
      @rpmTweeK 5 years ago +2

      I'm more in awe of the fact that his dad had him at 60 or so, then started coding. What a legend!

    • @oldtimergaming9514
      @oldtimergaming9514 5 years ago +1

      So the hashtag #LearnToCode does apply to ex-coal miners? Who would have thought it possible.
      My dad loved coding, building circuit boards and anything electronic, but that was his job - he wasn't a coal miner. I am impressed. I miss him. Etching circuit boards with him is among my fondest memories.
      I cut my teeth on a Honeywell 6000 mainframe and learned COBOL, FORTRAN and BASIC programming. A staggering 256k of core memory!

  • @issaciams
    @issaciams 5 years ago +267

    Alright got my food. I'm ready. Go.

    • @V4zz33
      @V4zz33 5 years ago +3

      Haha, I just had my breakfast;))))

    • @CaveyMoth
      @CaveyMoth 5 years ago +6

      Did you bring back any tomweapondamage while you were out?

    • @gustavb3673
      @gustavb3673 5 years ago

      You mean "goto 10", right? ;)
      Seriously, I needed to take a food break in the middle of the video.
      I found this video hard to watch, especially the first part, since I kept remembering things and dreaming away, and had to rewind again and again and again...
      \o/

  • @jerrywatson1958
    @jerrywatson1958 5 years ago +420

    Your long-format videos are the best! I know it's a lot of work, but your content is better than a commercial TV show. I would go as far as to say it's documentary-level writing with very high production values. Thank you Jim, do what you need to do. We will wait.

    • @jordanwharton5286
      @jordanwharton5286 5 years ago +6

      I also agree. I've learned so much from your analyses and I always get excited to see what you'll uncover next! Keep up the great work!

    • @CoccoUri
      @CoccoUri 5 years ago +5

      agree :)

    • @Velkanis
      @Velkanis 5 years ago +6

      You, my dear internet stranger, nailed my thoughts dead on.

    • @_BangDroid_
      @_BangDroid_ 5 years ago +3

      I know this is totally random but Jerry Watson sounds like the coolest name I've ever heard. Sounds like a cool jazz cat from back in the day.
      Totally agree, thoroughly enjoyed the video also.

  • @mitchellwheeler7107
    @mitchellwheeler7107 5 years ago +38

    I'm an embedded software engineer (I live and breathe microarchitecture & memory optimisation, so I deal an awful lot with optimising software around cache usage).
    Note: I did my best to make this as concise as possible, but it's unavoidably a complex topic, so it's a wall of text regardless.
    The 'weird'/flawed runs you're seeing are quite common, and while the causes can indeed be many things (all of which are difficult to diagnose), the most common cause in my experience is poor page table colouring. Sometimes things 'go wrong' with this optimisation depending on the OS, and it results in this kind of behaviour for the entire run of the process (or, if you're lucky / the OS doesn't cache page table allocations, it lasts until you re-allocate the memory, without needing to re-create the process).
    You'd have to understand how virtual memory / paging works to get a strong grasp on what's going on, but the short version is that when working within an operating system (or indeed any software system dictated by a kernel with a concept of virtual memory), memory is allocated by the kernel in 'pages' (due to processors having limits to the 'pages' / virtual memory support in their MMU, and/or due to the kernel optimising around the size of the TLB).
    Some kernels & C runtime library implementations are pretty simple, and when you malloc some memory, you're basically given an entire page (not always true) - and even if you're not, in benchmarks especially, you're often working with chunks of memory that are multiples of the page size. So in an awful lot of cases, you're literally working with memory aligned to a virtual memory page.
    Something I don't think you covered in your video, though, is that most modern CPU caches are associative (see: en.wikipedia.org/wiki/CPU_cache#Associativity) - which means there's a limited number of entries in the cache, but it still has to be capable of caching 'any' memory address despite its limited entries... This results in a compromise where N cache entries are responsible for caching up to potentially M memory addresses (where M is far greater than N). Also see en.wikipedia.org/wiki/CPU_cache#Cache_entry_structure on how this works (tl;dr - all memory addresses sharing a common MSB share the same cache entries; how much of the MSB depends on the cache).
    At the start of my comment I referred to 'page table colouring'; this is an optimisation made by kernels to 'avoid' this problem - by attempting to ensure contiguous virtual memory pages get put into 'different' cache entry sets, to make the most use of the processor cache.
    HOWEVER (this is where it all comes together), these two concepts can collide in unfortunate ways. It's very rare, but very possible, that subsequent memory allocations made by a process happen to share those cache entries, either due to a lack of, or a failure of, page table colouring (the how/why it can fail is another wall of text, but long story short - non-hard-realtime kernels (which includes Windows, non-RT Linux, and macOS) can't easily/efficiently enforce this, due to the non-determinism of scheduling in non-realtime scenarios, lest they serialize everything / bring the performance of multi-threaded memory allocation to a crawl).
    In the scenario where this problem occurs, you can/will often see things similar to what you're seeing in your weird/flawed results. It's likely that subsequent chunks of the 8-16MiB of memory allocated by the benchmark have a poor distribution of memory addresses across the CPU cache, resulting in poor cache utilization. Due to the non-deterministic nature of consumer operating systems (as they don't have hard-realtime/deterministic schedulers/memory-allocators/etc), this is why it happens only sometimes, and restarting the process (which ensures the memory is completely re-allocated) makes the problem go away.
    Well-written microbenchmarks can avoid this by ensuring the memory they're given has a statistically even distribution across the cache, but most applications don't bother to check this / they just blindly use the memory they allocate and hope for the best.
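
    A minimal sketch of the set-conflict effect described above: a toy direct-mapped cache model replayed over two address patterns. The geometry (64-byte lines, 512 sets) and the page layouts are illustrative assumptions, not any specific CPU's.

    ```python
    LINE = 64   # cache line size in bytes (assumed)
    SETS = 512  # number of sets, i.e. a 32 KiB direct-mapped cache (assumed)

    def hit_rate(addresses):
        """Replay an address trace against a direct-mapped cache model."""
        tags = {}  # set index -> tag currently resident in that set
        hits = 0
        for addr in addresses:
            index = (addr // LINE) % SETS
            tag = addr // (LINE * SETS)
            if tags.get(index) == tag:
                hits += 1
            else:
                tags[index] = tag  # miss: evict whatever was in this set
        return hits / len(addresses)

    # Two contiguous pages: their lines land in different cache sets.
    good = [p + o for o in range(0, 4096, LINE) for p in (0, 4096)] * 100
    # Two pages exactly one cache-size apart: every line aliases the same set.
    bad = [p + o for o in range(0, 4096, LINE) for p in (0, LINE * SETS)] * 100

    print(f"contiguous pages: {hit_rate(good):.0%} hits")  # ~99%
    print(f"aliasing pages:   {hit_rate(bad):.0%} hits")   # 0%
    ```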

    • @PanduPoluan
      @PanduPoluan 5 years ago +1

      Awesome explanation! Fortunately I'm well-versed enough in CPU intricacies to understand it (many thanks to BYTE Magazine - I still grieve the loss of that great publication).
      On the flip side, not having much experience in writing software that takes all the vagaries of cache management into consideration: do you think slightly reducing the size of the dataset being tested on would help? For example, rather than testing with a 16 MB dataset, we use just a 15 MB dataset, giving a leeway of 1 MB for a 16 MB cache?

    • @RobBCactive
      @RobBCactive 3 years ago

      @@PanduPoluan Watching the video, it struck me that the simple natural doubling of the data set size is a bit too coarse. It would be interesting to test double and double +/- ½ & +/- ¼ on the runs with jumps in latency, to investigate the behaviour crossing these boundaries where Ln-1 affects the behaviour of Ln-sized sets.
      I am not sure what you mean by "help", but basically with a victim cache the data size adds up, and a benchmark is so dominant in CPU usage on a quiescent system that your results are not affected; when they are, the whole run needs to be discarded because some CPU-intensive operation interrupted the benchmark. This can be mostly avoided by increasing the priority of the benchmark.
      Note that in practice, fast programs tend to operate over memory with sequential accesses, which allows anticipatory speculative loads that hide the main memory latency almost completely. I have used that to process data sets the full size of system memory, which behave close to L3 cache bandwidth even though the virtual memory system is loading pages from disk.
      I can recommend a series of articles on lwn.net about cache effects on modern programs, if you are still interested in the subject.
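
      A tiny sketch of that finer test ladder (one reading of "double and double +/- ½ & +/- ¼"; the 1 MiB starting point is an arbitrary assumption):

      ```python
      BASE = 1 << 20  # 1 MiB starting point (assumed)

      sizes, s = [], BASE
      while s <= 64 * BASE:
          # each size plus intermediate points on the way to its double
          sizes += [s, s + s // 4, s + s // 2, 2 * s - s // 4]
          s *= 2

      print([f"{x / BASE:g} MiB" for x in sizes])
      ```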

  • @SporkOfDestruction
    @SporkOfDestruction 5 years ago +196

    Fantastic content. I am an IT pro, and one of the concepts even colleagues struggle with is cache memory and how it's used. I've never heard it explained in such an understandable way - thank you! Now I have somewhere to point them!

    • @erikboesephoto
      @erikboesephoto 5 years ago +4

      IT pro here as well. Definitely a fantastic explanation!

    • @milkman9055
      @milkman9055 5 years ago +3

      Yeah, this was a good one!

    • @EditioCastigata
      @EditioCastigata 5 years ago

      You'd have come across the term 'memory wall' at least once in college.

  • @nikolaangelov3583
    @nikolaangelov3583 5 years ago +112

    Man, your job is very hard, but it must be very fulfilling too. And you keep learning new stuff every day. That's truly beautiful. You keep getting smarter every day. That's fun.

    • @adoredtv
      @adoredtv  5 years ago +13

      This is true!

    • @alexmarin7897
      @alexmarin7897 5 years ago

      Shame though that the video is filled with poorly made future projections. Suggesting you would get 64MB of L3 cache in Ryzen 3000 series (32MB per octacore die, 16MB per CCX or 4MB per core) just screams lack of understanding about basic aspects of computer science.

    • @adoredtv
      @adoredtv  5 years ago +3

      @@alexmarin7897 I guarantee you that Ryzen 3000 has that cache layout (except not "4MB per core" as you erroneously put it) and I guarantee you that you're the one who lacks basic understanding.

  • @Anaximanderification
    @Anaximanderification 5 years ago +47

    Aside from the IAC methods, you just compressed about 2 semesters of CompSci on architecture into a neat package.
    Very good job sir, hats off.

    • @Chuckiele
      @Chuckiele 5 years ago +2

      Yep. My head's smoking, but it was well worth it.

  • @Trinitos
    @Trinitos 5 years ago +42

    How many programmers does it take to change a light bulb? None - it's a hardware problem ¯\_(ツ)_/¯

  • @redhaze8080
    @redhaze8080 5 years ago +27

    My dad was a rigger, but a few of his mates were coal miners here in Wollongong. One of them was mad into his Macintosh 128K right till he dropped from coal dust and asbestos. He was a tough old bugger and had never done anything like that before, but he was bloody inspiring. I was 10 and he was better at coding than me.

  • @dastardly740
    @dastardly740 5 years ago +38

    I skimmed the replies and didn't see this mentioned (could have missed it): 1ns is the period for 1GHz. So your 2700X, running at around 4GHz, has an L1 cache that takes about 4 clocks to return data. The engineering sample that you called a regression is running around 3.5GHz, and 4 clocks there is about 1.14ns, so it is not a regression but exactly as expected. My R5 1600 was 1.25ns, which at 4 clocks would be 0.3125ns per clock, or 3.2GHz. So we can be pretty sure that L1 on 1XXX and 2XXX is a 4-clock cache.
    Presumably the engineers at AMD can fiddle with that L1 multiplier, so maybe they decided to try 3 clocks on the slower chip. 3 clocks at 3.2GHz (0.3125ns per clock) would be 0.9375ns - not quite as fast as the benchmark, but not that far off. Maybe the actual clocks during the cache test were a bit higher. But if I were an engineer testing chips, this is probably a very important test. 5GHz at 4 clocks is 0.8ns. Maybe the engineering sample won't reach 5GHz, but they need to know whether the cache could reach 4.8-5GHz at 4 clocks. So they downclock the chip and set the L1 multiplier to 3 clocks to see what the L1 is capable of, and 0.8-0.9ns means the L1 should allow for those high 4-5GHz clock speeds from your leaks.
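
    The arithmetic above is easy to check; a quick sketch (the clock speeds and cycle counts are the post's guesses, not confirmed figures):

    ```python
    def latency_ns(freq_ghz, cycles):
        # 1 GHz means 1 ns per cycle, so latency in ns is cycles / GHz
        return cycles / freq_ghz

    print(latency_ns(4.0, 4))  # 2700X-ish: 4 clocks @ 4.0 GHz -> 1.00 ns
    print(latency_ns(3.5, 4))  # engineering sample: 4 clocks @ 3.5 GHz -> ~1.14 ns
    print(latency_ns(3.2, 3))  # hypothetical 3-clock L1 @ 3.2 GHz -> ~0.94 ns
    print(latency_ns(5.0, 4))  # 4 clocks @ 5.0 GHz -> 0.80 ns
    ```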

    • @PanduPoluan
      @PanduPoluan 5 years ago +3

      Nice analysis! You must be one of the "helpful folks" Jim mentioned in the video :-)
      Hmmm... it seems that AMD's Zen 2 has quite a bit of headroom there... so when AMD finally launches Ryzen 3k, and Intel as expected tries to counter (with great difficulty), AMD can wait until the right moment and totally take the wind out of Intel's sails with another push into the Zen 2 headroom. Intel will then do a Hail Mary move, and AMD delivers the killing stab.
      I can totally see AMD owning the market for the next 2, maybe 3 years. In 2022 maybe Intel will start to become competitive again, but at that point we will have 2 gorillas duking it out, neither with the clear dominance Intel had over the past decade, and consumers will profit greatly.

    • @RobBCactive
      @RobBCactive 3 years ago +1

      @@PanduPoluan They have indeed announced some desperate-looking moves, including a big/little design in a rectangular package to mitigate excessive power consumption. Their 10nm laptops have lower battery life and performance than AMD Renoir, but the problem is finding designs with AMD in them. Most of the market doesn't seem to care and just accepts the inertia of the OEMs and ODMs who are the real laptop manufacturers.

    • @RobBCactive
      @RobBCactive 3 years ago

      Hmmmm, IIRC these low-level caches are synchronous with the core, so I think the silicon design determines the cache cycle time, not a configurable multiplier. A key role of the Infinity Fabric is bridging the mismatch between cache speeds and main memory.
      Without memory in the system, how could asynchronous operations function? You would have to stall CPU registers for variable times to permit variable-cycle L1 accesses; if it's fixed synchronously, the values can be held in the circuit transistors after the store micro-op is initiated, and the transfer flows through L1 and into L2.

    • @PanduPoluan
      @PanduPoluan 3 years ago

      @@RobBCactive The AMD models are starting to trickle in. It's quite understandable from the POV of laptop makers to not immediately jump in with both feet; they need to see someone jump in first (Asus did it), and then after they saw how brilliantly AMD Renoir performed, they started to get on the bandwagon and design their systems.
      I think Papermaster did allude to this; he did expect Renoir pickup to be slow but steady. And as we started to see "enterprise" laptops with AMD coming from the likes of Lenovo and HP, I think what he had surmised back then is now proven.

  • @Velkanis
    @Velkanis 5 years ago +20

    Everyone can make a video, and everyone can try making something entertaining, but rarely do I see someone make something longer than 20 minutes that will make me sit and listen no matter what the final content is - that's Jim's quality level. For someone like me, who really enjoys knowing how things work and daydreaming about what the future holds for us, having someone take a down-to-earth, reasonable look at the future is a feast for my eyes and ears (that was also the case with the path tracing video, for example).
    I can only be in awe of these videos, given how meticulously crafted and wonderful they are; they're a joy for the mind.
    Thanks Jim for how long you have stuck it out here against popular opinion, and for the immeasurable effort put into these videos! Glad to be a supporter! Cheers and have a magnificent day!

  • @Elusivehawk
    @Elusivehawk 5 years ago +140

    Jim, you really hate my sleep schedule, don't you?

  • @EldaLuna
    @EldaLuna 5 years ago +5

    All these years I've seen these cache sizes and never really knew how they functioned... For the first time ever, I now understand exactly how they work and why. Very impressive, I must say.

  • @michaelkregnes9119
    @michaelkregnes9119 5 years ago +79

    I noticed this channel because of the Ryzen 3000 series leaks. I got here from UFD Tech, and from that time to now I have watched at least 40 of your videos. Keep up the great content, and to top it off, I love your accent... can't go a month without "Aritte guyz howsit goin" :)

  • @Starchface
    @Starchface 5 years ago +54

    Cracking video Jim! Brilliant. Enjoy your rest. You've earned it.

  • @bigogle
    @bigogle 5 years ago +39

    Brilliant. I was enthralled the whole way through.

  • @ADR69
    @ADR69 5 years ago +50

    I know this took forever to make but it was worth it. Thanks for sharing, this was really interesting.

  • @mike-barber
    @mike-barber 5 years ago +11

    Really good video Jim, again. Being a coder involved in some fairly fast stuff, I do know how caches work in moderate detail, but I found this to be a really good explanation for everyone. I think you did a great job of keeping the detail at just the right level (without going into all the extra stuff like cache lines, associativity, prefetching, etc.).
    Also really enjoyed seeing what is going on with Zen. I hadn't clicked that it was a victim cache, and it's definitely interesting to consider how this affects different CCXs. 16MB x 4 CCXs is still just 16MB if you're doing stuff on one thread. Interesting stuff for both application and kernel devs.
    Thanks again. Your videos rock. Keep up the good work.

    • @adoredtv
      @adoredtv  5 years ago +2

      Cheers - yeah, I've been surprised by just how many people said they didn't know Zen's L3 was a victim cache!

  • @kevinglennon7864
    @kevinglennon7864 5 years ago +5

    As a scientist, proper interpretation of errors is incredibly important to me. It is not correct to say "This value was lower than the other, but they were within margin of error." If the values are within 1 standard deviation, the only thing we can say is "the numbers could not be measured to be different." We honestly just have no idea which one is actually higher than the other. The number which is perceived higher may actually be the lower number, and was perceived higher only by random chance.
    Although people often publish at just 1 SD, you should really be comparing numbers at 2 SDs (95% of the area under the Gaussian) if you're trying to determine whether they are the same.
    Great video, you make it easy to learn about something entirely new.
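
    A minimal sketch of that comparison rule (the numbers are invented for illustration, and a real analysis would use a proper t-test rather than this crude 2-SD cut):

    ```python
    from statistics import mean, stdev

    run_a = [1.02, 0.98, 1.01, 0.99, 1.00]  # hypothetical latencies in ns
    run_b = [1.05, 1.08, 1.03, 1.06, 1.04]

    diff = abs(mean(run_a) - mean(run_b))
    threshold = 2 * max(stdev(run_a), stdev(run_b))  # crude 2-SD criterion

    if diff > threshold:
        print(f"distinguishable: diff {diff:.3f} ns > 2 SD {threshold:.3f} ns")
    else:
        print("the numbers could not be measured to be different")
    ```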

  • @eubikedude
    @eubikedude 5 years ago +120

    23:18 16MB RAM eh? ;) An easy slip when you are discussing all the older stuff and cache sizes. :)

    • @The0Gizmo
      @The0Gizmo 5 years ago +5

      Caught that also, lol.

    • @wewillrockyou1986
      @wewillrockyou1986 5 years ago +23

      He made a mistake - the whole video must be complete bullshit ;)

    • @cybercat1531
      @cybercat1531 5 years ago +6

      Well... Cache is just fast SRAM.

    • @ec1021501
      @ec1021501 5 years ago +8

      This is what will happen if you run out of cache and think you could reduce the latency by not looking at your script.

    • @AscendingApsolut
      @AscendingApsolut 5 years ago +2

      Wrong timestamp - it's 23:02 instead.

  • @alb9229
    @alb9229 5 years ago +73

    Jim learned to code in Pascal... NVIDIA BIAS CONFIRMED 🤣🤣🤣. Great video as always, Jim!

  • @user-wx1zo9ef3f
    @user-wx1zo9ef3f 5 years ago +71

    If Adored Studios launches a new game, it will for sure be optimised for the CCXs xD

    • @Numenor76
      @Numenor76 5 years ago +3

      Looking forward to the game then ;)

  • @phillipcrowley7541
    @phillipcrowley7541 5 years ago +4

    I caught that subtle hint at the end. The next video hopefully won't take "7" days to release. Radeon VII video incoming.

  • @dionamuh
    @dionamuh 5 years ago +46

    Did you know UserBenchmark now has a link to this video at every System Memory Latency Ladder graph? Pretty cool. 😎
    Very interesting stuff btw!

    • @PanduPoluan
      @PanduPoluan 5 years ago +4

      They did? Wow... Jim's truly well on his way to success.
      All the best wishes. His analyses are always the greatest.

    • @RobBCactive
      @RobBCactive 1 year ago

      Ironic!! I wonder if UserBenchmark users stumble onto Jim's exposé of the unreliable, unprincipled world of slanted benchmarking.
      Last time I tried UserBenchmark with a Ryzen, it ludicrously recommended a dual-core i3.

  • @blackheart004
    @blackheart004 5 years ago +9

    At the 3-minute mark I got SUCH A HUGE NOSTALGIA PANG :O
    Back in 1991, when I was like 7 years old, my mom bought me a CIP-03, which was a Romanian-produced Sinclair Spectrum clone (I live in Romania, btw) with 48 KB of memory. AH, THE DAYS of learning to code in BASIC!

  • @Healtsome
    @Healtsome 5 years ago +5

    Your channel is the only one that can help me battle my attention span loss. Thank you.

  • @dbzssj4678
    @dbzssj4678 5 years ago +4

    Above the charts on UserBenchmark they've added a tidbit at the end as a link :D "L1/L2/L3 CPU cache and main memory (DIMM) access latencies in nano seconds (explanation by AdoredTV)."

  • @rick-potts
    @rick-potts 5 years ago +3

    A couple of years older than you, Jim, and some of my fondest memories of "me and my dad" were the hours we used to spend together programming and "gaming" on the Spectrum.

  • @Loundsify
    @Loundsify 5 years ago +2

    A lot of schools would find content like this really useful for teaching computing.

  • @jonavin
    @jonavin 5 years ago +8

    All you people nitpicking at his coding sample: he's just dumbing it down so that average people can understand it. It's not really important that any strings in the sample would also need a trip to memory. If you want to be technical, the integers would be loaded into registers before the operation. You'll just confuse most people if there's too much more detail. I think it was the right level of detail for an understanding of multilevel caches.

    • @yottaXT
      @yottaXT 5 years ago +2

      Haven't read any comment in that regard, at least not yet, but yeah, as you said, he did a very simple example so everybody could follow the explanation - a very good one, to be fair. I'm a software developer myself and found it very on point; I wish I'd had a teacher with that kind of devotion and tact, to explain things that easily, back in my uni days.

  • @TrueThanny
    @TrueThanny 5 years ago +3

    The 486 also supported an L2 cache, but it was not on the CPU package. It was on the motherboard. This was also the case with the Pentium line, as well as the Cyrix and AMD clones with both 486- and 586-class chips. The cheapest motherboards had no L2 cache sockets. The cheaper ones had the sockets but no chips. And the mildly cheap ones had either half the sockets occupied, or all occupied with small chips (making a cache upgrade much more expensive). The impact on performance between having an L2 and not was immense. The impact of its size was less dramatic, but still significant. Having it on the motherboard also meant more difficult troubleshooting. Rather than just having to check for bad memory SIMMs as causes of frequent crashing, you had to check the SRAM chips as well.
    Intel moved the cache onto the CPU package with the Pentium Pro, Pentium II, and first version of the Pentium III. The second P3 revision finally had on-die L2 cache. It was beaten to the punch by AMD with its K6-III chip, which was also the last AMD x86 chip to use the same socket and motherboard as Intel chips. Next came the Athlon, and the beginning of AMD's rise to nearly half the market, before Core 2 cut them off at the knees.

  • @TheOblacek
    @TheOblacek 5 years ago +6

    Damn, Jim - I noticed that on UserBenchmark, under "System Memory Latency Ladder", they have posted a link to this video as the explanation. Congrats!
    It's a very informative video; I enjoyed it a lot!

    • @PanduPoluan
      @PanduPoluan 5 years ago

      No kidding! Another commenter mentioned this, and so I just _had_ to check it out... and it's gloriously awesome!
      Jim, you're really making your way up to become one of the Internet's greatest sources. Congrats!

  • @Sarcazmotron5000
    @Sarcazmotron5000 5 years ago +4

    Everything You Ever Wanted to Know About Cache But Were Afraid to Ask 👍

  • @The_Nihl
    @The_Nihl 5 years ago +43

    Sup Jim!
    Any mention of processor uArchs and I'm wet.
    Highly educational content, and explained in such a great way! I really love how you break down the cache's purpose and functionality/principle of operation in such an easy and understandable way, even for people not exactly PC-hardware literate.
    The cache size in the Zen 2 processors is a really interesting beast. A bigger cache is always useful, as the difference in latency and bandwidth between the L-caches and the DDR main memory banks is a monstrous bottleneck. Many modern processors have long struggled with waiting or idle cycles while memory is addressed and instructions are extracted... just look at these CAS latencies today! 14 to 21 cycles? Yugh... more cache on silicon would allow more binary data and instructions to be stored, creating a significant reduction in cycles wasted waiting on memory - especially with doubled cache... 32MB per chiplet? Oh my... I still have my old Cyrix with kilobytes of cache, haha.
    I really liked this video. The level of free education here is absolute champ!
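
    Those CAS figures convert to nanoseconds easily; a quick sketch (the DDR4 speed grades below are examples picked for illustration):

    ```python
    def cas_ns(transfer_rate_mt_s, cas_cycles):
        clock_mhz = transfer_rate_mt_s / 2    # DDR: two transfers per clock
        return cas_cycles / clock_mhz * 1000  # cycles / MHz = us, so x1000 -> ns

    print(f"DDR4-2133 CL14: {cas_ns(2133, 14):.1f} ns")  # ~13.1 ns
    print(f"DDR4-3200 CL16: {cas_ns(3200, 16):.1f} ns")  # ~10.0 ns
    ```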

    • @glenwaldrop8166
      @glenwaldrop8166 5 years ago +3

      As the size increases, so does the latency, though.
      I imagine AMD has done the math.
      I am, however, certainly looking forward to the day we have 16MB of L1, though I imagine code will be so inefficient we'll need it.
      Ever notice that no matter how fast computers get, Windows is always slow? Yeah, that's the only part about massive CPUs that worries me. MS doesn't seem to understand we got the massive computer to run something other than the OS.

    • @brunogm
      @brunogm 5 years ago

      @@glenwaldrop8166 There are some papers on this, e.g. "Hybrid Memory Cube in Embedded Systems" - basically, HMC as main memory is better than an LPDDR3 + L2 cache configuration.

    • @dreadlock17
      @dreadlock17 5 years ago

      Lmao good to see you here lukasz

  • @mattsmechanicalssi5833
    @mattsmechanicalssi5833 5 years ago +70

    Back in the day, AMD and Intel CPUs used to fit in the same socket. Cyrix too! What if an engineer is using an Intel board (though heavily modified) in order to match their performance levels? Just a thought.
    Great work Jim. And I love the story of your childhood. NostraScotsman is human after all!

    • @bgk8890
      @bgk8890 5 years ago +11

      This sounds like way too much work

    • @glenwaldrop8166
      @glenwaldrop8166 5 years ago +11

      @@bgk8890 It would be a good way to run proper apples-to-apples testing.
      Gotta wonder.
      With the I/O chip, they could drop the CPU back to AM3+ if they wanted to.
      There would be a performance hit, obviously, but I would jump on a Ryzen upgrade chip in a second.

    • @pleasedontwatchthese9593
      @pleasedontwatchthese9593 5 years ago

      I could see this being a thing

    • @darven
      @darven 5 years ago +4

      That would be a nice thing. But... I doubt Intel would like it. They would probably completely rearrange the pins and whatnot with every release, just to make AMD waste time.
      But I am all for flashing the BIOS to be able to run Ryzen, and vice versa.

    • @soylentgreenb
      @soylentgreenb 5 years ago +20

      Matt Christie: The reason why AMD used the same socket as Intel was that AMD started making x86 CPUs as a second source for Intel. Now why would anyone want another company to act as a second source for their product, reducing profits? Because many of the early customers were military and mission-critical, and companies like IBM demanded that there be someone else able to supply a compatible product to replace Intel if they failed to deliver or went out of business or something. AMD produced exact, identical 8088, 8086, 286 and so on chips under agreement with Intel. With the 386, Intel thought it was so revolutionary they basically said screw it and refused to let AMD be a second source; this delayed IBM's use of the 386, but other companies like Packard Bell and whatnot made "PC compatibles" with the 386. AMD didn't truly make their own separate product until the K5, and they didn't really succeed in making a competitive product until the K6, which was based on the NexGen Nx686 (AMD bought NexGen).
      AMD was remarkably successful in the late 90's and early 00's - if not in sales, then in competitive performance. The Pentium Pro was a huge leap for Intel; the K7 Slot A Athlon let AMD catch back up and slightly surpass Intel in floating point (which was the new thing since Quake that every 3D game needed). The Athlon 64 beat Intel's Pentium 4 badly; this was because Intel expected their process engineers to pull another rabbit out of the hat and make the Pentium 4 run cool enough at 8 GHz (double-pumped ALU running at 16 GHz) by 2003, and that was just not possible.
      In the late 90's, AMD managed to equal Intel's performance while on average being a process node behind. That was really very impressive.

  • @shznn
    @shznn 5 years ago +23

    Used to be an Intel fanboy. Thought they had it all figured out. Heck, I was thinking of spending 500 euros on a 9900K on Black Friday; I didn't, and am now waiting on Ryzen 3. I thought Zen was a "value proposition" when in fact it is superior to Intel's obsolete engineering in every way, from the architecture to a viable cooler included. I'll venture to say that I used to not pay attention to power consumption, until after watching Adored and Coreteks. Now I understand that efficiency = power. I also used to think that nVidia was good, no matter what. Thanks guys. Now I'd only take a 9-series Intel if it were for free :)

    • @grizzly6699
      @grizzly6699 5 years ago +3

      Jim has great analyses in his vids. I found Coreteks several months ago, and he does similar content to Adored. Maybe they should collaborate - sounds like a plan :)
      I thought AMD was the greatest, until they released Bulldozer in 2011 and I turned to Intel and never looked back... until Ryzen arrived in 2017. Now I'm planning a Ryzen 3000 system later this year or the next. I can't wait to see what unfolds this year in the tech space.

    • @bartbroekhuizen5617
      @bartbroekhuizen5617 5 years ago +2

      @@grizzly6699 Yeah, Coreteks also explained in his video the energy it requires to move something from one point to another. Jim's analysis perfectly fits Coreteks' explanation. Here is his video: czcams.com/video/oU-NNV2pYTQ/video.html

    • @PanduPoluan
      @PanduPoluan 5 years ago

      If you can get a 9-series Intel for free, please inform me as well.
      Despite my grand dream of owning a 12- or 16-core Ryzen 3000, I definitely won't turn down the opportunity of owning a 9-series Intel... for free xD

    • @RobBCactive
      @RobBCactive 3 years ago

      But a free i9 is worthless without an expensive Z-series mobo, and if you use it for long periods (perhaps gaming), those extra watts add up, especially in a warm summer with A/C.

    • @RobBCactive
      @RobBCactive 3 years ago

      @@grizzly6699 But the AMD 64 X2 was already outperformed by the Core Duo years before 2011. Phenom / Bulldozer was disappointing because it meant AMD's new arch hadn't caught up with Intel's, condemning them to discounting until their next CPU generation.
      But they really weren't so terrible; people were able to buy 2- or 4- (even 3-) core chips cheaper than Intel's... It was in the reviewers' main apps that they looked bad - benchmarks, mainly due to floating point, which wasn't very relevant if you had a GPU for FP offload.
      That arch spawned Jaguar, used in consoles and some integrated APU designs that performed well in daily use, allowing 3D games to run faster and smoother than on the competition.
      The tech press often totally diss designs aimed at markets they have little experience of... at one time it was "But can it run Lotus?" whenever a higher-performance CPU was introduced, due to a power-user obsession with how they used their PC. I remember a 32-bit workstation I worked on having a 16-bit so-called accelerator added so it could run MS-DOS apps, when I would actually knock up scripts or write C faster than I could use the productivity suite of that era.

  • @introvertplays6162
    @introvertplays6162 5 years ago +4

    Every time I see an AdoredTV video in my subscription box I shout out loud: "YES!!!", and some family member comes over to my room to ask what happened. XD

  • @stale2665
    @stale2665 5 years ago +6

    When you start the video and your speakers are off, and you rewind just to make sure you catch the "alright guys, how's it going".

  • @jkd7799Yann
    @jkd7799Yann 5 years ago +3

    I have never seen any other YouTuber out there go so thoroughly into such detail; that's why I remain a strong subscriber.

  • @ADR69
    @ADR69 5 years ago +11

    Ah, the history of cache at 04:30 in the morning. Yes please.

  • @kopasz777
    @kopasz777 5 years ago +5

    As a CS graduate, this was a nice recap. Your explaining it in such detail made me realize how the "curse of knowledge" affected me - just assuming that what I know, all others know too.
    But I believe most of your subscribers are more tech-savvy than the average guy.
    Edit: sorry, this came out a bit pretentious.

    • @adoredtv
      @adoredtv  5 years ago +3

      No, you're right in that you are in that higher expertise range on this topic. ;)
      But most of my subscribers have no idea about this stuff. Even at this "entry" level, it's far beyond anything most have been taught. This is actually one of my major strengths - understanding when it's gone too far for most to comprehend, and toning it back.
      I could have gone a lot deeper (I'm no expert on cache and never will be), but had I done so it would have been alienating to the average viewer.

  • @____5837
    @____5837 5 years ago +2

    The only thing I would add to your explanation of how caches work at 15:22 is that what gets deleted doesn't just depend on how long it has been since the data was last read; it is also affected by how frequently that data was previously read. So even if bobhealth hasn't been read for a while, it might not be deleted if it was previously read more frequently than everything else.
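
    A toy sketch of that idea - an eviction score mixing recency (LRU) with read frequency (LFU-ish). Real CPU replacement policies are largely undocumented pseudo-LRU variants, so the weighting below is purely illustrative:

    ```python
    class ToyCache:
        def __init__(self, capacity=4):
            self.capacity = capacity
            self.lines = {}  # key -> [last_used_tick, use_count]
            self.tick = 0

        def read(self, key):
            self.tick += 1
            if key in self.lines:
                self.lines[key] = [self.tick, self.lines[key][1] + 1]
                return "hit"
            if len(self.lines) >= self.capacity:
                # evict the entry with the worst blend of staleness and rarity
                victim = min(self.lines,
                             key=lambda k: self.lines[k][0] + 10 * self.lines[k][1])
                del self.lines[victim]
            self.lines[key] = [self.tick, 1]
            return "miss"

    cache = ToyCache()
    for key in ["bobhealth"] * 5 + ["a", "b", "c", "d", "bobhealth"]:
        print(key, cache.read(key))
    # bobhealth survives eviction despite being the least recently read,
    # because its high use count keeps its score up ("a" is evicted instead).
    ```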

  • @personaldronerepair6141

    Fantastic explanation!!
    That was time well spent watching.
    Thank you for the time you put in.

  • @RoyTelling
    @RoyTelling 5 years ago +6

    KIITOS - THANK YOU...
    You have managed to get me to understand cache a lot better, at a level I could follow (my 58-year-old brain is not as quick as it used to be, LoL).
    Shared this on my FB page because I think many people may like this.

  • @tonn333
    @tonn333 5 years ago +15

    Trip down memory lane...

  • @AlmightyGTR
    @AlmightyGTR 5 years ago +2

    Next week Prof. Jim will help us understand LRU, LFU and FIFO. Jim, you are tremendous at ELI5 - a born guru.

  • @Jimster481
    @Jimster481 5 years ago +2

    Another great video!
    I am a low-level software engineer, and I have written many small algorithms that directly benefit from caching.
    In fact, when designing algorithms that will run almost constantly, it's important to be mindful of the average cache size of CPUs to be able to achieve as much performance as possible.
    As is the case with most AMD hardware... these large caches won't be utilized immediately, and it will take some years of optimizations / software progress to truly speed up the most common algorithms.
    Although I think that the Ryzen "AI/intelligent" cache is actually scanning through applications in real time and trying to figure out what data is best to cache, I have noticed some weird behavior in programs that I develop for my company, where specific algorithms (especially heavily multi-threaded ones) are much slower on my Ryzen than on my older Intel parts.
    So much so that my old Skylake i7-U XPS 13 can beat my 1700X by a full second (or sometimes more) in my data randomization software when targeted at one of my production products.
    The total "processing time" comes out to around 6 seconds on my Ryzen, and using the same 8 threads (or even 16) results in the performance being very much the same, while the same task on the i7-U can take only 4 seconds, even using 16 threads.
    The design of my application has a single controller/dispatching thread, and then it fills up the rest of the threads with work while it waits for them to complete (not the best design since it has to wait, but I can't be bothered to redesign it since it's already more than fast enough)...
    Something about the AMD Ryzen cache + IF penalties makes this task very much slower than on older monolithic Intel designs.
    I hope that with Zen 2 the IF performance is increased again, or that the caching is improved, to reflect an increase in performance in my specific application (not that the performance needs to be better, but I also use it as a sort of benchmark).
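
    One cheap way to probe the CCX theory on Linux is to pin the process to cores that share a CCX and re-run the workload; a hedged sketch (which logical cores sit on one CCX is an assumption - check with lscpu or lstopo first):

    ```python
    import os

    # Pin the current process (pid 0); threads spawned afterwards inherit this.
    # Cores 0-3 are assumed here to belong to a single CCX.
    os.sched_setaffinity(0, {0, 1, 2, 3})
    print("pinned to cores:", sorted(os.sched_getaffinity(0)))

    # ...now start the dispatcher + worker threads and time the run as before.
    # If the slow case tightens up, cross-CCX cache traffic was the culprit.
    ```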

  • @KuraIthys
    @KuraIthys 5 years ago +19

    Since I still mess around with these early 8- and 16-bit systems: yeah, memory speed really became an issue.
    The 6502 family especially had issues.
    See, the 6502 is a memory + register design, whereas, say, the 8086 and Z80 are register + register designs.
    You might ask, so what?
    Well, it means that the majority of operations on a register+register design revolve around combining values from two registers and storing the result in one of those registers.
    That means with careful coding, you can keep a lot of stuff in the registers, and keep memory accesses down.
    The Memory + Register design means almost all instructions that exist are ones where one of the values operated on is in memory, and one is in the CPU.
    That means everything accesses memory all the time.
    As long as the memory can keep up, that's fine.
    But as the CPU speeds outpaced memory speeds, it became more and more problematic.
    And the full extent of how bad this could get could be seen in the 16 bit console wars, where you had the Motorola 68000, which is a Register+Register design, vs the 65816.
    Now, the 68000 has a lot of registers to work with, so you really can cut down on memory accesses. In fact, the processor only does one memory access every 4 cycles, so it's even more critical to avoid frequent memory access.
    But it has another consequence too - since memory access is only on 1 in 4 cycles, the CPU can run 4 times faster than the memory without causing problems.
    Contrast this with the 65816, which performs single cycle memory access. Great. Amazing even. IF you have memory fast enough.
    And in fact, the 65816 makes things even worse, because a design quirk means the memory has to respond in half a cycle to prevent CPU instability...
    So guess what: that 3.58 MHz 65816 requires memory rated for something like 120 nanoseconds (i.e. fast enough for roughly 7.16 MHz single-cycle access).
    Meanwhile, that 7.16 MHz 68000 requires 480-nanosecond access (i.e. fast enough for 1.79 MHz).
    See the issue yet? Keep in mind that faster memory is more expensive than slower memory, and that was even more true in the 80's and early 90's...
    So, the 65816 system needs 4 times faster memory than the 68000 running at twice the clock speed!
    Does that mean it has 4 times the memory bandwidth? Well, no. The 68000 uses 16 bit memory, while the 65816 uses 8 bit memory.
    But the cost of memory is more closely related to its speed than to its bit width.
    Also, due to that unfortunate half-cycle requirement, the actual rate at which a 65816 CPU accesses memory is half the speed its memory would suggest. Because of that, in spite of having 4 times the memory speed, the 3.58 MHz 65816 has the same memory bandwidth as the 7.16 MHz 68000.
    By now I'm sure you know which systems I'm referring to. And you can see how awful the SNES's memory performance requirements are, which would have driven up prices of RAM and ROM.
    If you know your 16-bit console hardware, you might also know that the SNES CPU drops to 2.68 MHz fairly often.
    And why is that? Because the system's main RAM, and most of the ROM chips seen early in its life, weren't fast enough!
    So owing to the high cost/low availability of sufficiently fast RAM, the SNES is actually operating at 2.68 MHz much of the time, not its hypothetical 3.58.
    And that's a 25% speed reduction.
    So is there an upside to this apparent weakness? Well, yes. The 6502 family can be said to have a very high IPC. That is, for a processor from that era, it performs its calculations in a rather low number of cycles.
    That's all well and good if memory speeds keep pace with CPU speeds, but of course, they didn't. And that became a problem.
    Because the CPU quickly started to outrun memory speeds, and that's very bad news for a Memory+Register design.
    Plus, where a Register+Register design lends itself well to cache schemes, a Memory + Register design is much less viable if you need a cache.
    Certainly, it's by no means impossible to use a cache with a memory + register design. And the 6502 family in particular could benefit enormously from the first 64k of memory being quite a bit faster than the rest, owing to the way it uses its stack, and the zero page/direct page logic that treats the first 256 bytes of memory (or a specified 256-byte range in the first 64k, for the direct page version) as something akin to an extended register file.
    However, on the whole Memory + Register designs are still relatively poorly suited to cache memory schemes.
    And thus, they largely fell out of favour.
    Although, ironically perhaps, the 6502 family is one of two main 80's designs that is still widely available as newly produced chips, and has even increased in speed. The standard modern 65816 chip runs at 14 MHz, and can easily be overclocked to 20 MHz without much problem. FPGA implementations have even managed to hit 200 MHz.
    The other design that's still widely available is the Z80, though that's not particularly faster than the early 80's versions.
    These chips are largely relegated to use in embedded devices, but the fact that they're still easily available when stuff like the 68000 hasn't been manufactured in something like 20 years now, says a lot about the enduring (if niche) benefits of these two designs...
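
    The raw period math behind those figures, as a sketch; it lands a little above the quoted 120ns/480ns, the gap presumably being bus setup/hold margin:

    ```python
    def required_access_ns(clock_mhz, cycles_per_access, window=1.0):
        # time the memory has to respond: `cycles_per_access` clock periods,
        # scaled by `window` (e.g. 0.5 = must respond within half a cycle)
        return 1000.0 / clock_mhz * cycles_per_access * window

    # 68000 @ 7.16 MHz: one memory access every 4 cycles
    print(f"68000: ~{required_access_ns(7.16, 4):.0f} ns")       # ~559 ns
    # 65816 @ 3.58 MHz: access every cycle, response within half a cycle
    print(f"65816: ~{required_access_ns(3.58, 1, 0.5):.0f} ns")  # ~140 ns
    ```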

    • @Leyvin
      @Leyvin 5 years ago +5

      www.digikey.com/products/en?keywords=MCF54455VR266 (266MHz 90nm 68060 w/MMU/DSP/FPU/Cache/Superscalar)
      They do still produce the legacy 68K processors as well (MC68SE as the search key)... and while they're not, strictly speaking, "end-of-life" as of yet, they are only produced to order now, and they've been threatening to discontinue them for the past 3 years.
      There haven't been any "new" developments on the architecture since 2004, and no new revisions since 2010-2012... something like that.
      Don't count out the 68K just yet... it's a trooper, especially in the Automotive Industry.

    • @RobBCactive
      @RobBCactive 1 year ago

      That is overly simplistic; it ignores the practical, cost-effective performance of 6502 micros. The 8086 & 8088 were 16-bit processors, and the PCs massively more expensive. Registers were not sufficient as CPU speed improved; RAM was a bottleneck, requiring caching. The Z80 had a higher frequency but multi-cycle instructions, and registers need transistors, so they add cost.
      The 6502 has an accumulator register but also fast zero-page memory access, without spending 2 cycles on a 16-bit address. That is elegant and economical on transistors, with the instructions operating in 1 or 2 cycles. The memory at the time operated at CPU frequency, so the solution matched an ecosystem using cheap 8 bits for 16-bit-addressable memory. Furthermore, memory-mapped I/O avoided special but inflexible I/O instructions. The 6502 has a partial pipeline, interleaving RAM access with computation.
      Moving on: engineers at Acorn, dissatisfied with the available 16/32-bit designs, were able to make the ARM1, taking advantage of 32-bit bandwidth, with the CPU coupled to RAM speed by a load-process-store pipelined architecture. It was very fast, cheap and power-efficient, and outperformed the efforts of far better resourced teams.
      The downsides were: firstly, the coupling had to be broken by introducing caches as CPU frequency increases outstripped memory speeds, but that broke some software which had relied on the unbuffered RAM characteristic.
      Secondly, requiring 32-bit RAM was too expensive for embedded applications, and the code density was insufficient to meet the cost targets of designs that wanted the power efficiency on offer. So a 16-bit instruction mode was added, and it was adapted to minimise RAM chips.
      The point is that designs meet a market and cost constraints; the elegant, highly optimised design becomes unsuitable when conditions change.

  • @CRBarchager
    @CRBarchager 5 years ago +1

    An instant like for your video, as always. These in-depth videos are really the best thing about your channel, and I love the knowledge and the detail that goes into them. Thank you for this, and looking forward to the next one!

  • @acidstorm001
    @acidstorm001 5 years ago

    Jim, no one else covers anything like this on YouTube to this extent. It's one of the reasons why I love this channel. A 42-minute video, and I didn't even flinch. Most 20-minute videos I watch, I'll jump ahead on. Your videos just flow so well that I do not feel a need to jump ahead. In reality, you can't anyway - you would miss something important leading into your conclusions. Great stuff as always, keep it up!

  • @Half_Finis
    @Half_Finis 5 years ago +8

    How dare you release this while I'm at work? I need to comment first!!!!

  • @Orochimarufan1900
    @Orochimarufan1900 5 years ago +44

    You know, the BASIC line numbers aren't as useless as they seem. They're basically the predecessor of jump/goto labels. They allow you to later add things between lines (hence you'd usually count 10, 20, 30, etc. instead of 1, 2, 3) without messing up all your jumps (trust me, having to go through your whole program and fix up every jump just because you added a line somewhere is not fun, especially since one's likely to repeat it a lot when debugging). At the time there were also few sophisticated text editors available, so replacing a line just by entering the same number again would have been easier than trying to edit the old one. This is especially true on (semi-) write-once media, though I'm not sure how much use this last point was in practice.
    Overall, great video though.
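
    A toy illustration of the mechanism - a dozen lines of a line-number-keyed "interpreter" in the spirit of those old BASICs (entirely hypothetical, just to show why numbered lines work as jump targets; a line 15 could be inserted later with no renumbering):

    ```python
    program = {
        10: ("print", "hello"),
        20: ("goto", 40),
        30: ("print", "never reached"),
        40: ("end", None),
    }

    pc = min(program)  # start at the lowest line number
    while True:
        op, arg = program[pc]
        if op == "print":
            print(arg)
        elif op == "goto":
            pc = arg  # jump straight to that line number
            continue
        elif op == "end":
            break
        pc = min(n for n in program if n > pc)  # fall through to the next line
    ```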

    • @dralord1307
      @dralord1307 5 years ago +2

      On my Commodore 64 it also made debugging a hell of a lot easier :D

    • @RolandSxxx1
      @RolandSxxx1 5 years ago +2

      It's also how you were taught the 10× table...

    • @pleasedontwatchthese9593
      @pleasedontwatchthese9593 5 years ago +1

      I think it was laziness. Labels could have been used for jumps, and inserting a new line of code could have been based on a relative line number generated when printing to the screen. I think they were happy that it was working and did not care how well it worked.

    • @adriankelly_edinburgh
      @adriankelly_edinburgh 5 years ago +3

      Don't forget that back then, BASIC code on these machines was interpreted rather than compiled, so getting an error such as 'Syntax error at line 40' made it much easier to pinpoint your inevitable typing mistakes. I had an Acorn Electron back then, which ran BBC BASIC; it supported auto line numbering as you typed and also had a renumber command which could redo all your line numbers if inserted code meant that you were in danger of running out of space between existing lines.

    • @joker927
      @joker927 5 years ago +1

      There were no alpha line labels? Couldn't say jump to "loop1"? The line label was the line number?

  • @heckyes
    @heckyes 5 years ago +7

    You're a god damn nerd treasure!

  • @TheythinkimNinja
    @TheythinkimNinja 5 years ago

    Thank you for making these long videos. I listen to these while at work, and they are really entertaining to listen to and very informative. Keep up the good work.

  • @Mattski_83
    @Mattski_83 5 years ago

    I just watched this and I would have to say that this is one of your best videos yet. I learnt so much and I enjoyed every second of it. You seem to be getting better every video you make and I've only been watching for a year and a bit. Keep up the good work, I eagerly await your next video.

  • @dycedargselderbrother5353

    Line numbers in programs made a lot of sense in the "goto" paradigm, where a goto statement would just jump to a line. This was eventually replaced by programming with functions, where code was separated into modules. That was a bit heavy for these old computers, though, because you needed a call stack to keep track of the functions you were entering and exiting. goto was faster, though it tended toward incomprehensible spaghetti code that no one but the original designer could understand or contribute to.

  • @pvalpha
    @pvalpha 5 years ago

    Once again, a very excellent video. Take your time, we understand. :) As someone who started on a TI-99/4A when I was 8 years old, I certainly understand that early-computer intro and why it's so important to understanding the evolution of computer systems to the modern day. This is one of the clearest explanations I've ever watched and listened to. Thank you.

  • @MKeehlify
    @MKeehlify 5 years ago

    My grandfather was a coal miner, my dad a software developer, me I'm a crypto miner 8-) ... j/k I'm a software developer too! I wrote my first useful program in 2005. Most programmers from my and later generations have completely different ideas of high and low level programming. It's awesome to get glimpses of the past. The simple and personal introduction was a great way to lead us into the main content. Thanks for the video Jim!

  • @ImTheSlyDevil
    @ImTheSlyDevil 5 years ago +1

    I really appreciate this extensive explanation, Jim. Also, I'm glad to see that Userbenchmark has added a link to this very video to explain their cache/memory latency section of the benchmark.

  • @RepsUp100
    @RepsUp100 5 years ago +1

    Informative as always, thank you!

  • @darkmhk
    @darkmhk 5 years ago

    Great video, Jim. Always enjoy your technical videos - extremely informative and well researched.

  • @awesomelegend3411
    @awesomelegend3411 5 years ago

    Love your work Jim

  • @deivytrajan
    @deivytrajan 5 years ago

    The mix of your background is amazing! Well done, interesting video! Keep it up

  • @chefov
    @chefov 5 years ago +2

    The mad lad strikes again. What an awesome surprise to get a video while at work. Now I'll have something decent to do! Cheers!

  • @ryankraidich4533
    @ryankraidich4533 5 years ago +1

    @AdoredTV/Jim Userbenchmark now has a link back to this video for the System Memory Latency Ladder section!

  • @jowarnis
    @jowarnis 5 years ago

    Awesome video as always!

  • @andyp123456
    @andyp123456 5 years ago

    Thanks for the lesson (and little bit of family history), Jim. Love all the analysis, but also looking forward to seeing what you come up with next if it's less analytical than the last few videos.

  • @mik310s
    @mik310s 5 years ago +2

    Great video as always, Jim. Please don't stop the analysis videos; they are the most interesting :)

  • @btw8798
    @btw8798 5 years ago

    This video was very informative. Thank you for taking your time to explain all of this important information in a simple manner.

  • @nashfactor21
    @nashfactor21 5 years ago

    Really interesting video, Jim. Always excited to see one of your videos pop up, as it usually means I may very well learn something new or informative.

  • @forrestglenn2520
    @forrestglenn2520 5 years ago

    Thank you for all your hard work!

  • @darven
    @darven 5 years ago

    Brilliant video. Very well explained. Thank you for spending your time making content.

  • @jamespieske5246
    @jamespieske5246 5 years ago +5

    Yes! A 42 minute masterpiece that dropped between dinner & bedtime instead of 3am!
    Earbuds definitely in whilst I do the dishes tonight!

  • @catsspat
    @catsspat 5 years ago +2

    You made me look up info about the first computer I ever had: a GoldStar (LG's ancient ancestor, of sorts) FC-150 (FC supposedly meant Family Computer). I couldn't even remember the model number and only got to it after some searching. It was apparently a weird clone of the Japanese Sord M5, which itself was likely a clone of something else. Kind of like MSX, but not really. I even had this weird printer (plotter?) attachment that printed using a special ball-point-pen-like insert. It would control whether the pen was pushed against the paper or not, and then move the paper up-down or the pen left-right to draw. Insane! Of course, I also had a dedicated cassette recorder attachment to save BASIC programs.
    Nostalgia explosion!

    • @adriankelly_edinburgh
      @adriankelly_edinburgh 5 years ago +2

      Didn't LG actually stand for Lucky Goldstar?

    • @catsspat
      @catsspat 5 years ago +1

      @@adriankelly_edinburgh
      Yes, they did for a while (Lucky was another big Korean company that dealt with chemicals and stuff). Come to think of it, the merger was between a chemicals company and an electronics company? Weird. I don't know when LG switched to "Life is Good."
      GoldStar sort of made more sense since they competed directly against Samsung (literally ThreeStar). I miss GoldStar's old ad always ending with the statement, "a moment's decision determines 10 years of outcome" or something like that, meaning you choose an appliance for the long haul.

    • @snetmotnosrorb3946
      @snetmotnosrorb3946 5 years ago +1

      I still have a GoldStar microwave oven. I believe it's almost 30 years old. It's still working.

  • @Law0fRevenge
    @Law0fRevenge 5 years ago

    Another great video! It's seriously phenomenal how much I learnt by watching your content.

  • @MrGunnarPower
    @MrGunnarPower 5 years ago

    Nice one, I spend all week just waiting for your next video. I get excited when I see the notification pop up that you uploaded again.

  • @SleepyRulu
    @SleepyRulu 5 years ago +2

    Today's my birthday - more AdoredTV feeding us knowledge, making us better-informed consumers.

  • @Cannabis_Connoisseur
    @Cannabis_Connoisseur 5 years ago

    Awesome video. Best you've made in a while. Best I've seen from anyone in a while for that matter.

  • @retractingblinds
    @retractingblinds 5 years ago

    Man that was great. I love these long form historical videos. Fun example there with the Speccy.

  • @Mrfiufaufou
    @Mrfiufaufou 5 years ago +1

    Finally a new video, always a delight!

  • @senri-
    @senri- 5 years ago

    Love this kind of video, hope to see some more interesting stuff like this! Thanks for the great content.

  • @junkerzn7312
    @junkerzn7312 5 years ago +2

    Yah, but you could count cycles! Ah yes, I remember those days. I actually built a 2400 baud modem with an ADC, a DAC, trig tables [256], and very, very carefully cycle-counted code. I could produce perfect waveforms and I could perfectly decode the receive waveform. The sucker actually worked!
    -Matt
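
    For the curious, the trig-table trick looks roughly like this (a sketch with guessed parameters, not Matt's actual modem code): a 256-entry sine table stepped by a phase accumulator, so each output sample costs one add and one lookup:

    ```python
    import math

    SINE = [int(127 * math.sin(2 * math.pi * i / 256)) for i in range(256)]

    def tone(freq_hz, sample_rate_hz, n_samples):
        step = round(freq_hz * 256 / sample_rate_hz)  # table steps per sample
        phase, samples = 0, []
        for _ in range(n_samples):
            samples.append(SINE[phase & 0xFF])  # wrap the 8-bit phase cheaply
            phase += step
        return samples

    # e.g. 16 samples of a 1200 Hz tone at a 9600 Hz sample rate (assumed numbers)
    print(tone(1200, 9600, 16))
    ```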

  • @MrBoombast64
    @MrBoombast64 5 years ago

    Thanks for a very easy-to-understand video. Great work, Jim.

  • @M00_be-r
    @M00_be-r 5 years ago +1

    Great video, Jim - 43 minutes felt like the blink of an eye. Love this in-depth approach.

  • @elitetripod4188
    @elitetripod4188 5 years ago

    Excellent video! Thoroughly enjoyed it :)

  • @PanduPoluan
    @PanduPoluan 5 years ago

    VERY educational, good sir!
    I kept forgetting to watch this vid, but I'm totally not regretting spending 43 minutes of my weekend on it.
    Keep up the good work!

  • @Najvalsa
    @Najvalsa 5 years ago +3

    Perfect thing to be presented with after work.
    Thanks for the continued work, Jim. :)

  • @felixonyango5409
    @felixonyango5409 5 years ago +3

    Wow. Wow. Wow... Thank you for a brilliant explanation of how cache memory works. I will be using this to educate my programmers.

  • @warp00009
    @warp00009 5 years ago

    Brings back memories! My first computer was the original Altair 8800 with an Intel 8080 processor running at 2MHz, built from a kit (in 1975). It started off with 4k of memory, which was actually enough to run Bill Gates's and Microsoft's first product, the Altair 4k BASIC interpreter - which still gave you 750 bytes for your running BASIC program. The 8k interpreter was a big improvement and could run the TTY-graphics "Star Trek" and text-only "Adventure" games! My Altair eventually was upgraded and topped out with a 4MHz Z-90 CPU and 56k of memory (out of the 64k that the architecture was limited to). The Altair memory had a 500 nanosecond cycle time, whereas the college's IBM 370/135 mainframe memory had a 980 nanosecond memory cycle time by comparison...
    Two of IBM's biggest mainframes back in the 70's were the System 370 Model 168 (used by big corporations and universities) and the System 370 Model 195 (of which only a few were sold, to do things like advanced scientific research - e.g. weather prediction models). They were basically the same in clock speed, except the significantly better-performing Model 195 had about 4 times more cache memory: 64k instead of 16k. Back then, cache memory was very expensive - obviously...
    One of my personal benchmarks for assessing pure computing power is to compute 10,000 factorial (10,000!) out to every significant digit. In 1985, the largest mainframe that IBM made (with a single user on the system) took 22 minutes to solve that problem. In 1990, my 90MHz Pentium could do it in about 16 minutes. In 1999, my 1GHz Pentium 3 could do that calculation in about 6 minutes. In 2007, my 2GHz Pentium 4 (I think, if I'm remembering it correctly) system could do it in 6 seconds. In 2013, my i7-3930K clocked at about 3GHz could do it in 80 milliseconds! Now my current i7-5820K clocked at about 3.3GHz can do it in 60 milliseconds - quite a jump from the original 22 minutes, on a mainframe that cost many millions at the time.
    Thanks for the video, and the memories! Keep up the good work!
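
    That benchmark is easy to reproduce today; a sketch, with Python's built-in bignums standing in for whatever arbitrary-precision code ran on the 1985 mainframe:

    ```python
    import math
    import time

    start = time.perf_counter()
    result = math.factorial(10_000)
    elapsed_ms = (time.perf_counter() - start) * 1000

    print(f"10,000! has {len(str(result))} digits, "
          f"computed in {elapsed_ms:.1f} ms")
    ```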

  • @JobeStroud
    @JobeStroud 5 years ago +6

    All I know is that Sinclair coding looks a lot like Bethesda coding.

  • @1336Studios
    @1336Studios 5 years ago

    Fantastic video!! Thank you, thank you, thank you!

  • @alexparkish
    @alexparkish 5 years ago

    Great video... phenomenal, even... This is what I will show my 3 girls when they are older, to explain how it all works. Hundreds of hours of editing and research, but my God, you put it together and explain it so well!

  • @adityac1991
    @adityac1991 5 years ago

    Wow. Another great, deep and informative analysis video, Jim. I learned more today from just one of your videos than I could have learned from a book read over 3 days. As always, love your work Jim - have a good one.

  • @j.b.6855
    @j.b.6855 5 years ago

    Nice informational video. It gives a basic understanding of what's going on with memory - something I really didn't have a clue about. Very interesting and well presented.

  • @unclerubo
    @unclerubo 5 years ago

    What a wonderfully enjoyable video. A good start for my day off. Thanks for the work you put in there, Jim!

    • @rush21hit
      @rush21hit 5 years ago

      Probably semantics for the sake of explanation :)

  • @razvanmihai3553
    @razvanmihai3553 5 years ago +6

    Seems interesting - you should do more of these technical videos.

  • @rubenschaer960
    @rubenschaer960 5 years ago +6

    Jim, the cache results don't quite match the q740, but they do very closely match the i7-870, another popular H55 era CPU, including the peculiar latency spike when transitioning from L3 to system memory. Also, congratulations: Userbenchmark now links directly to this video in their benchmark results, under the "System Memory Latency Ladder" header :D

  • @kwisclubta7175
    @kwisclubta7175 5 years ago

    I really appreciate the educational value of this one! I mean, same with all your videos, but this one in particular. Thanks, dude!

  • @vladdragos5881
    @vladdragos5881 5 years ago +7

    Got my coffee, I'm ready for 46 min of pure IT stuff :))