Parallella: The Most Energy Efficient Supercomputer - Ray Hightower of Bridgetown Partners, LLC

  • Added 25 Aug 2024
  • Slides: rayhightower.co...
    Parallella is a single-board computer roughly the size of a credit card or Raspberry Pi. Parallella runs Linux. It has 18 cores (2 ARM, 16 RISC) and you can buy it online for about $150. This presentation tells why we care about parallelism and briefly shows how parallel execution differs from serial.
    Presented at Madison+ Ruby on August 22, 2015.
    Presented by Ray Hightower of Bridgetown Partners, LLC (bridgetownpart...)

Comments • 450

  • @gene4390
    @gene4390 7 years ago +35

    The most efficient computer I ever saw (I own 2 of them) was made in the 1980s: the Casio FX-790P. It had a built-in BASIC programming language, scientific functions, 16 KB of RAM, and ran at 1 MHz (very good for the early 80s), and it could run for 2 years off two tiny little watch batteries! I used mine mainly in college and wrote my own programs. I even programmed several games for it. lol More than 35 years later I still use the FX-790P (or the renamed Tandy PC-6) durable microcomputer to this day.

  • @kevincozens6837
    @kevincozens6837 6 years ago +2

    The parallel "hello, world" program failed. At 12:26 there are 20 responses from 16 cores. Three cores (0,1; 0,3; and 2,1) never responded. Four cores responded multiple times.
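
    For readers wondering what "one hello per core" should look like, here is a minimal host-side pthreads sketch. This is an illustration only; the demo in the talk uses the Epiphany SDK, whose API is different. Tagging each message with a worker id makes missing or duplicated responses easy to count, even though the print order is nondeterministic.

      /* Illustration only: plain pthreads, not the Epiphany SDK code from the talk. */
      #include <pthread.h>
      #include <stdio.h>

      #define NUM_WORKERS 16

      static void *say_hello(void *arg)
      {
          long id = (long)arg;
          /* One tagged line per worker: duplicates or gaps are easy to spot. */
          printf("Hello from worker %ld\n", id);
          return NULL;
      }

      int main(void)
      {
          pthread_t workers[NUM_WORKERS];
          for (long i = 0; i < NUM_WORKERS; i++)
              pthread_create(&workers[i], NULL, say_hello, (void *)i);
          for (int i = 0; i < NUM_WORKERS; i++)
              pthread_join(workers[i], NULL);
          return 0;
      }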

  • @CryptoJones
    @CryptoJones 6 years ago +1

    Mr. Hightower, this motivates me to study parallelism more in-depth. Thank you for this.

  • @oldchannel6511
    @oldchannel6511 8 years ago +110

    18 cores and 1GB RAM.. Absolute savage.

    • @hydrochloricacid2146
      @hydrochloricacid2146 8 years ago +8

      Bottleneck FTW

    • @KianGurney
      @KianGurney 8 years ago +16

      +CasualMods 7 gamers, one CPU.

    • @0xf7c8
      @0xf7c8 8 years ago +2

      +Nerd You have no idea what you are talking about.

    • @oldchannel6511
      @oldchannel6511 8 years ago +2

      Yeah I do, lmao.

    • @0xf7c8
      @0xf7c8 8 years ago +8

      I'll put it simply for you. You have in your head the idea that these cores are even close to a modern x86 core, when that is not the case. These cores are not even as powerful as a single CUDA core in a GPU. A mid-range GPU has, let's say, 650 CUDA cores and 2 GB of RAM, and with that amount of RAM they have all the memory they can handle without overshooting. And GPUs can easily be used as clusters, and in fact they are.
      I'm not saying this design is perfectly thought out and has nothing to improve, but 1 GB of RAM in this kind of device is not as crazy as you think.

  • @JohnVegas
    @JohnVegas 8 years ago +1

    I always enjoy your presentations. God bless!

  • @antonnym214
    @antonnym214 8 years ago +1

    Nice talk. Outstanding machine, and you present it very well.

  • @rainbowbunchie8237
    @rainbowbunchie8237 8 years ago +42

    When your electronics become obsolete, put them in a drawer and keep them forever.
    Electronic things are WAY too cool to throw away, no matter how old they are. =P

  • @AndrewHelgeCox
    @AndrewHelgeCox 8 years ago

    This is quite interesting in that it is a talk given by someone who is clearly not an expert in parallel programming, or really in anything else he touches on, yet it still manages to be a little entertaining.

  • @mehmetedex
    @mehmetedex 7 years ago

    I could listen to this guy forever. Great speech.

  • @antonnym214
    @antonnym214 8 years ago +1

    45 seconds to fully boot is pretty impressive compared to my Win7 box.

  • @antonnym214
    @antonnym214 8 years ago

    Dr. Hightower, this is a very nice presentation. I like your style and how well you explain it for the layperson. It's pretty exciting to run a single module with that little solar generator. Makes me think it would be quite feasible to power a huge array of those with just a few solar panels on the roof. It could be virtually free to power the system. Lots of possibilities there, because for most installations, the challenge is covering the operating costs, as opposed to the initial expense.

  • @pieterrossouw8596
    @pieterrossouw8596 8 years ago +34

    1 GB of RAM with 18 cores: for a lot of HPC applications that is going to be a catastrophic bottleneck. In x86 compute clusters, a "golden rule" is 2 GB per processing core, depending on the application obviously. Sure, these cores are comparatively weak, but since RAM chips are pretty inexpensive, it's a shame that for that price they didn't include at least 4 GB of RAM.

    • @MichaelPohoreski
      @MichaelPohoreski 8 years ago +3

      +Pieter Rossouw Yup, still waiting for an extremely low-cost 16 GB + 8 core, or hell, even 4 GB + 4 core device. While the Raspberry Pi 2B, Banana Pi and Parallella are all "nice" SoC embedded devices, the lack of 4 GB+ RAM keeps these devices from more "serious" work where our data sets are larger. :-/

    • @walter0bz
      @walter0bz 8 years ago +3

      +Pieter Rossouw These are 'little cores', more comparable to GPU warps or SIMD lanes.
      One x86 core is equivalent to several Parallella cores (it might be as many as 16; it depends on pipeline depth, SIMD, execution units, I don't know offhand), so it's still about right.
      The Parallella concept is still worthwhile; GPUs prove that more, simpler cores give higher throughput, while a big core spends huge resources figuring out parallelism from a single thread on the fly.
      Nonetheless the board has other problems, but they have to start somewhere with this new architecture needing new software. It would be perfect for AI work IMO (dataflow).

    • @0xf7c8
      @0xf7c8 8 years ago

      +Pieter Rossouw If you look closely, this is a Xilinx chip, probably an FPGA, so I would call it a mounted prototype. It's hard to put 4 GB of RAM in an FPGA.

    • @llothar68
      @llothar68 8 years ago +2

      +Pieter Rossouw
      The problem is not only the amount of RAM (yes, 1 GB per core is important, and at least 64 KB of cache per core) but the RAM throughput. With just one memory channel you will get almost zero parallelism in many real-world tasks.
      I don't even think it's a good teaching device because of these restrictions, which don't let you draw conclusions about bottlenecks when everything is a bottleneck.

    • @walter0bz
      @walter0bz 8 years ago

      It's really a forward-looking experimental device. A PGAS architecture would scale far better than anything else, but they haven't had the budget to build a large chip on a newer process yet (the concept only makes sense when scaled up to thousands of cores). Chicken-and-egg situation with the software.

  • @reezlaw
    @reezlaw 8 years ago

    This video being 360p in 2015 shows that we must actually be going backwards.

  • @ragsdale9
    @ragsdale9 8 years ago +3

    I'm curious whether the Parallella's wattage would increase under high utilization.

  • @larrycastro7937
    @larrycastro7937 8 years ago

    I stumbled onto this website and thought it was fascinating. All I know about is Moore's law, the doubling of transistors on a microchip every eighteen months.

  • @davecc0000
    @davecc0000 7 years ago

    Excellent presentation, understandable, great examples.

  • @dogeeconomist4825
    @dogeeconomist4825 6 years ago

    I'm gonna have to start buying one of these every now and then and setting them up as an ever-growing cluster for BOINC. Much interest in future offers and capabilities as well as competing products as they emerge.

  • @ForbiddenUser403
    @ForbiddenUser403 8 years ago +5

    What we really need is a parallel platform where the individual nodes are hot-swappable modules that plug into a centralized, expandable chassis, with a virtualization software layer that recognizes the resources of all those "blades" and presents them as traditional PC hardware, allowing the use of traditional software and OSes without rewriting every application for parallel processing.

    • @jgbreezer
      @jgbreezer 7 years ago

      Computers (software) can't yet parallelise problems for us automatically well enough; we still need to write things in a way that's ready for it. Scaling horizontally rather than vertically is increasingly the default in the commercial world, but we're still not ready for low-level parallelism in a big way. A cultural change is required.

    • @stevebez2767
      @stevebez2767 6 years ago

      So buy the board and write the program to do just that; parallel programming, next stop quantum?

    • @neilruedlinger4851
      @neilruedlinger4851 6 years ago +1

      Sounds like a worthwhile project for a savvy start-up company?

  • @TrueRebel
    @TrueRebel 6 years ago

    Excellent info, Ray... Parallella is the future of supercomputing, and that audience couldn't do the math, ha ha ha. Congratulations, Ray.

  • @05Rudey
    @05Rudey 8 years ago +4

    I want one just to tell my mates that I've got a super computer.

  • @tigerbody69
    @tigerbody69 6 years ago +4

    "Will it float?"

  • @stevebez2767
    @stevebez2767 8 years ago

    Well done with that; an actual, methodically hardened course from Kickstarter to project as well!

  • @12kenbutsuri
    @12kenbutsuri 3 years ago +1

    I ordered one once, it was completely broken by the time it arrived.

  • @SudoPi
    @SudoPi 8 years ago +28

    It would be way cooler if this were maybe about $40 or so. $150 is a big price to ask consumers to pay for an SBC.

    • @assaulth3ro911
      @assaulth3ro911 8 years ago +1

      +The Random Stuff Yeah. It is, however, different from a Pi; I think $75-$100 would be more fair.

    • @mysticvirgo9318
      @mysticvirgo9318 8 years ago +2

      +The Random Stuff will most likely get less expensive per unit as they sell more and more :)

    • @supercompy
      @supercompy 8 years ago +2

      +The Random Stuff They are $75 for the micro-server version and $99 for the desktop version right now on Amazon.
      I think that is a fair price considering the number of cores.

    • @voyager1bg
      @voyager1bg 8 years ago

      +The Random Stuff not that expensive, we're talking supercomputing here... I believe such advancements are the future

    • @SudoPi
      @SudoPi 8 years ago +1

      Yeah, but if the price were $35 like the Raspberry Pi, it would probably be more interesting to customers, since not everyone is willing to pay $150 just to tinker around. As you said, it's not that expensive, but it really depends on who is looking at the price point, and for me the $35 price tag on the Pi 2 is cooler.

  • @BoggyBogdan
    @BoggyBogdan 8 years ago +1

    That's awesome
    Thanks for sharing

  • @GrennKren
    @GrennKren 8 years ago +1

    I saw the future!

  • @justy1337
    @justy1337 6 years ago

    Just wish the video were in full HD.

  • @w.rustylane5650
    @w.rustylane5650 7 years ago

    Nice video on parallelism, for what it's worth.

  • @eggraf
    @eggraf 8 years ago +19

    Run it in parallel on the Mac. He only ran it serially...

    • @SarahC2
      @SarahC2 7 years ago +2

      3.6 seconds...

    • @minecraftermad
      @minecraftermad 6 years ago +1

      Kek, now do it on a 5W Vega or Ryzen-based thingy (the most energy efficient from what I've heard, but I might be wrong about Ryzen).

    • @jvebarnes
      @jvebarnes 6 years ago

      2015 vs 2018 we cannot know the future

  • @MrGencyExit64
    @MrGencyExit64 7 years ago

    GPU cores are general-purpose too, they just work at peak efficiency when coupled with specialized hardware that handles scheduling, memory fetches, etc. for the sorts of patterns (i.e. 4 pixels at a time) used in rendering. You'd need A LOT more of them to achieve their kind of performance without that extra support hardware :)

  • @idhan
    @idhan 7 years ago +5

    The prime calculation program can easily be run in parallel on the Mac. Assuming it has 4 logical processors, it could run in about 3.5 seconds; that should be the real comparison. That said, the Parallella is still an amazing piece of hardware :-) (see the sketch after this thread)

    • @altEFG
      @altEFG 6 years ago

      4 times vs 13 times faster in parallel

    • @tigerbody69
      @tigerbody69 6 years ago

      Please make a video and show us.
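
      A minimal sketch of the parallel prime count described above, assuming OpenMP and a plain trial-division test. This is not the code from the talk; the limit and schedule are placeholder choices.

      /* Sketch only: split the trial-division prime count across the Mac's cores.
       * Build with: cc -O2 -fopenmp primes_omp.c -o primes_omp */
      #include <stdio.h>

      static int is_prime(long n)
      {
          if (n < 2) return 0;
          if (n % 2 == 0) return n == 2;
          for (long i = 3; i * i <= n; i += 2)
              if (n % i == 0) return 0;
          return 1;
      }

      int main(void)
      {
          const long limit = 10000000;   /* placeholder upper bound */
          long count = 0;

          /* Each thread tests a share of the candidates; the reduction
           * combines the per-thread counts at the end. */
          #pragma omp parallel for schedule(dynamic, 1000) reduction(+:count)
          for (long n = 2; n <= limit; n++)
              count += is_prime(n);

          printf("%ld primes up to %ld\n", count, limit);
          return 0;
      }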

  • @JasonAndJodiesAdventures

    Interesting video. On geek.com they listed the Parallella as achieving around 90 GFLOPS. So if I did the math correctly there, a supercomputer with the processing power of Tianhe-2 (listed on Wikipedia at 17.6 MW) would require a cluster of around 400k Parallellas and run at around 2 MW?
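
    A quick back-of-envelope check of that estimate, assuming Tianhe-2's LINPACK figure of roughly 33.9 PFLOPS and about 5 W per Parallella board (those two numbers are outside figures, not from this comment):

      /* Rough check of the cluster-size estimate above; all inputs are approximate. */
      #include <stdio.h>

      int main(void)
      {
          double tianhe2_flops = 33.9e15;  /* LINPACK Rmax, FLOPS (assumed) */
          double board_flops   = 90e9;     /* per Parallella board, from the comment */
          double board_watts   = 5.0;      /* per board, approximate */

          double boards    = tianhe2_flops / board_flops;   /* ~376,000, i.e. ~400k */
          double megawatts = boards * board_watts / 1e6;    /* ~1.9 MW, i.e. ~2 MW  */

          printf("boards: %.0f, power: %.1f MW\n", boards, megawatts);
          return 0;
      }

    This ignores interconnect, hosting and cooling overhead, so the real figure would be higher.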

  • @FlumenSanctiViti
    @FlumenSanctiViti 6 years ago

    I'm not a programmer, but... wouldn't his code at 12:56 return TRUE for the input number 4?

  • @draken68
    @draken68 6 years ago

    Very interesting video. What I got out of it is that in metro Australia we pay $2-$2.50 and in rural Australia $3-$3.50 (Australian dollars per watt).

  • @atranimecs
    @atranimecs 8 years ago

    It's not about the raw power of the Parallella, it's about the performance-per-watt ratio and also the heat output.

  • @duderobi
    @duderobi 7 years ago +1

    3:25 Did I hear right, 2 ARM (Acorn RISC Machine) and 14 RISC cores?

    • @56335130
      @56335130 4 years ago

      The Xilinx Zynq is an FPGA SoC.

  • @jessstuart7495
    @jessstuart7495 7 years ago

    Any programming language or compiler developments for developing parallel software? I would think the compiler would have to know a lot about the underlying architecture to be able to produce software to efficiently allocate and manage the cores and memory.

  • @frostgreen5527
    @frostgreen5527 7 years ago +1

    Nice presentation; open source and low power consumption, not bad...

    • @stevebez2767
      @stevebez2767 6 years ago

      yunno the actual grounding back ground of living in a gent with some batterries you have too recharge,some windmills,some solar lights,etc to proove you can be coz of this wailing 'universal'failing that ego yelling utter liar of any exsist invites,pays,an builds club war den non men for meter maids count sell no wellys sheep shag act of 'had you all'you know you think I was???Yellow Lines,no on,no approach too know a Meter,answer door,in gets 'bill'big blue 'company guy'your all comparing exsists too pay build three arse non element teee red robe 'kill the giy'gsus vet war law own yer run into ground zero sport o non lord manger e state yells of sit yer on a stamp,get tirkey work or starve,full slave driven yer man,carzee lie sence have it learns???watt o?

  • @chrisking7603
    @chrisking7603 6 years ago

    This video and its presenter are quite entertaining, but I wanted: #1, a clear comparison of megaflops per megawatt-hour against currently optimised supercomputers; #2, an explanation of how linearly adding parallel cores can really compensate for a limit in the polynomial growth of chip density.
    Apart from being properly RISC, this seemed very Transputer-ish.

  • @DaHaiZhu
    @DaHaiZhu 8 years ago

    He never did say how much more energy efficient the Parallella is per core than the Chinese supercomputer. In other words, how many FLOPS per watt does the Parallella deliver compared to the Chinese supercomputer???

  • @Rarius
    @Rarius 8 years ago +78

    1) Note that he compares his 18-core system with just a single core of the Mac, not with running on all four cores!
    2) I coded up this algorithm in C# on my two-year-old PC (Intel i5-3570K!)... and even running single-threaded it managed it in 6.65s, three times faster than this Parallella and twice as fast as the Mac!
    3) This is a pretty poor algorithm for finding primes... There are FAR better ones. For instance, on my PC, the sieve of Eratosthenes algorithm gets the same result in 0.38s (see the sketch after this thread)! Better algorithms often (usually?) yield better results than throwing more hardware at a problem.
    While I applaud the effort going into the Parallella, it needs to be significantly faster before it is worth investing in.
    It might be interesting to see how a stack of Raspberry Pi 3s (you can get 4 Pis with change from $150) would do with their 16 cores.

    • @fatkidd7782
      @fatkidd7782 8 years ago +7

      everybody needs to read this

    • @dialupdavid
      @dialupdavid 8 years ago +2

      This was my first thought too; no idea why in the hell anyone thought it would be a good idea to compare a single thread of a quad-core/eight-thread system to a dual-core ARM chip with 16 co-processing cores. Makes no logical sense to me; were they that offended by how low the performance was? To me this was nothing technical; this guy was no engineer/enthusiast, solely a salesman with a sales pitch.

    • @owatson67
      @owatson67 8 years ago +9

      Yeah, but does your PC use 5 watts, and did it cost $150? I haven't run this algorithm on my PC yet, but I know it would post a good time. It has an i7-6700HQ, which is a quad-core CPU with 8 threads, and I know it would beat the Parallella, but that's not the point.

    • @Rarius
      @Rarius 8 years ago +2

      No, my PC doesn't consume 5 watts or cost $150... but neither does the Apple he compares the Parallella with.
      Actually, you could build a PC for less than $150 using second-hand parts that would outperform the Parallella AND be much easier to program.
      I suspect that a $150 cluster of Raspberry Pis would give it a good run for its money too.

    • @dialupdavid
      @dialupdavid 8 years ago +1

      Thunder o Well, the Tegra X1 has about the same power requirements and a 256-core Maxwell GPU. Anyone who's going to do parallel processing is going to be 10x better off using CUDA or OpenCL. Not to mention the actual A57s in that SoC are probably faster than the entire Parallella board by a factor of 3-4.
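
      For reference, a sketch of the sieve of Eratosthenes mentioned above (in C here rather than the commenter's C#; the upper bound is a placeholder). It crosses off composites once per prime instead of re-testing every candidate by trial division.

      /* Sieve of Eratosthenes sketch: mark composites, then count what is left. */
      #include <stdio.h>
      #include <stdlib.h>

      int main(void)
      {
          const long limit = 10000000;            /* placeholder upper bound */
          char *composite = calloc(limit + 1, 1); /* 0 = possibly prime */
          if (!composite) return 1;

          for (long p = 2; p * p <= limit; p++)
              if (!composite[p])
                  for (long m = p * p; m <= limit; m += p)
                      composite[m] = 1;           /* cross off multiples of p */

          long count = 0;
          for (long n = 2; n <= limit; n++)
              count += !composite[n];

          printf("%ld primes up to %ld\n", count, limit);
          free(composite);
          return 0;
      }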

  • @GeekBoy03
    @GeekBoy03 8 years ago +21

    Seems Parallella is fizzling out. Three years, and no new models.

    • @teknostatik1055
      @teknostatik1055 8 years ago

      Could just be gaining momentum since it's a fairly new product.

    • @GeekBoy03
      @GeekBoy03 8 years ago +10

      Tekno Statik Three years in technology is a very long time. It's had more than enough time to get grounded and for new models to appear.

    • @teknostatik1055
      @teknostatik1055 8 years ago

      Yeah, no... What the Parallella is doing is adding one or more dimensions for instructions to run in parallel with each other (hence the name). Where it gets complicated is HOW the work is split, because not all tasks require the same level of splitting and not all tasks can be split the same way; this example took trial and error to split across the correct number of cores, and the program had to be rewritten from serial to work in parallel.

    • @GeekBoy03
      @GeekBoy03 8 years ago +2

      Tekno Statik I take it you have zero understanding of product life cycles. I was referring to nothing new coming out in three years, not to learning how to program the thing.

    • @teknostatik1055
      @teknostatik1055 8 years ago

      And I take it you know nothing of programming.
      How are we supposed to translate every program from serial into parallel? Have you any concept of the implications that go BEYOND your so-called "technology" and "product"? Without parallel programming there will be no product.

  • @daveb5041
    @daveb5041 7 years ago +3

    Why not put it in series with a 5W light bulb? The brightness of the bulb will show power consumption. The best way to save electricity is to make a computer that runs on vacuum tubes instead of transistors. A tube can take the place of three to five transistors, so you can shrink a billion-core processor down to 300 million tubes. To power it, don't run it on coal; have monkeys pedaling bicycles hooked to generators. Feed them GMO bananas made by Monsanto to cut down on food costs. You can also have them make copies of Shakespeare by putting a typewriter in front of each one. Statistics proves that with enough time and typewriters, one will publish a complete work.

  • @djprodigalsun
    @djprodigalsun 7 years ago

    He is using the battery in that solar pack; why not tell us what the solar efficiency of that panel is?

  • @hinasamal8406
    @hinasamal8406 6 years ago

    Parallella supercomputing is a fantastic idea.

  • @antonnym214
    @antonnym214 8 years ago

    18 cores for $150 is pretty spectacular, especially compared to a standard AMD or Intel CPU.

  • @KittyKittaw
    @KittyKittaw 6 years ago

    Motorola: 16 cores, 1985, parallel processors, around the same time. Of course it burned more power, or did it?

  • @erickleefeld4883
    @erickleefeld4883 7 years ago

    Could I use something like this to run Handbrake video compressions, and use an app on my Mac to administer it?

  • @ultraviolet.catastrophe

    Why would I buy this when I can buy 6 Raspberry Pi 3 boards that would give me a total of 24 cores?

  • @rudde7251
    @rudde7251 8 years ago +3

    When you find primes, do you check up to the square root rounded up or rounded down?

    • @MrBrew4321
      @MrBrew4321 8 years ago +2

      +Rudde down

    • @rudde7251
      @rudde7251 8 years ago

      +Brew Dain Thanks man :)

    • @fliptmartley
      @fliptmartley 8 years ago

      +Rudde, I square the prime I'm testing against and check to see if it's larger than the candidate.

    • @mullermanden
      @mullermanden 8 years ago +2

      +Rudde
      Instead of using:
      for(int i=3; i

    • @MrBrew4321
      @MrBrew4321 8 years ago +3

      With the sqrt form you can calculate sqrt(p) once above the loop and store the result in a variable to use as the upper bound; with the i*i form that isn't possible, since i*i changes each iteration (see the sketch below).
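
      To make the two loop bounds discussed in this thread concrete, here is an illustrative trial-division sketch (not code from the talk). Both variants stop at floor(sqrt(n)), i.e. the square root is effectively rounded down.

      /* Build with: cc isprime.c -lm */
      #include <math.h>
      #include <stdio.h>

      /* Variant 1: hoist the bound out of the loop; the (long) cast rounds down. */
      static int is_prime_sqrt(long n)
      {
          if (n < 2) return 0;
          if (n % 2 == 0) return n == 2;
          long bound = (long)sqrt((double)n);
          for (long i = 3; i <= bound; i += 2)
              if (n % i == 0) return 0;
          return 1;
      }

      /* Variant 2: avoid sqrt entirely; i*i is recomputed every iteration. */
      static int is_prime_square(long n)
      {
          if (n < 2) return 0;
          if (n % 2 == 0) return n == 2;
          for (long i = 3; i * i <= n; i += 2)
              if (n % i == 0) return 0;
          return 1;
      }

      int main(void)
      {
          /* Sanity check: the two variants agree on small inputs. */
          for (long n = 2; n <= 1000; n++)
              if (is_prime_sqrt(n) != is_prime_square(n))
                  printf("mismatch at %ld\n", n);
          printf("done\n");
          return 0;
      }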

  • @ammonlu8566
    @ammonlu8566 6 years ago

    Superb talk, thank you very much.

  • @mike_98058
    @mike_98058 7 years ago +13

    Mr. Hightower failed to demonstrate that the Parallella was, circa 2015, the most energy-efficient supercomputer on the planet. He failed to compute the efficiency in terms of FLOPS/watt, which was his initial basis of comparison.

    • @GeekBoy03
      @GeekBoy03 7 years ago +4

      The item is actually from 2012, but released in 2013. Four years, and nothing new from them

    • @neilruedlinger4851
      @neilruedlinger4851 6 years ago +1

      I did a computation based on watts per core.
      The Parallella is 18.26 times more energy efficient than the Tianhe-2.

    • @pwnmeisterage
      @pwnmeisterage 6 years ago +1

      Now there's the Epiphany-V, a 1024-core RISC SoC, and the Epiphany-VI is already underway.
      Tianhe-2 undergoes constant rotating upgrades as the Xeon/Phi cores are rescaled up and out.
      The presenter did explain that raw core count is not entirely meaningful in "real" (complex) problems; he was only able to demonstrate their overwhelming advantage in "optimum" (simplex) problems.
      The Tianhe-2 was designed to be a unified distributed supercomputing platform, not an ad-hoc modular system with "limitless" expandability. I suspect that in the real world it uses far less electrical power than the huge number of Parallella SBCs that would be needed to solve the same problems in the same time. Even energy efficiencies aren't linear; you can't just keep stacking LEGO computer modules together indefinitely, there are diminishing returns.

    • @monad_tcp
      @monad_tcp 6 years ago

      Without a new programming model you can't extract all that performance. Of course the culprit is the C language, but not everyone can program in Haskell yet.

    • @andrewyork3869
      @andrewyork3869 5 years ago

      @@monad_tcp what about ASM?

  • @sigmareaver680
    @sigmareaver680 6 years ago

    The only thing attractive here is the energy efficiency. Would it be worth crypto mining with?

  • @MrManerd
    @MrManerd 6 years ago

    Does the Parallella use ECC memory?
    That's all I want.

  • @jarisipilainen3875
    @jarisipilainen3875 7 years ago

    Is it 18 cores plus 18 extra cores in case a core breaks? Some Intel processors have 9 cores plus 9 extras to replace a broken core on the fly. OR you can activate them all, lol, 18 cores. They probably work in parallel, at least not serially, lol.

  • @jarisipilainen3875
    @jarisipilainen3875 7 years ago +1

    Are you scared to show how fast the Mac is on multiple threads? It was faster anyway on one, lol. But then it costs more and has an almost 3 times faster core. The Mac could do it in 7 seconds or way under. Your program was the only thing that benefits your board, lol.

  • @Tommo_
    @Tommo_ 8 years ago

    Macs are only expensive because of how compact they are. If you look at the inside of a 12-inch MacBook, the whole 8 GB of RAM and 500 GB of storage fits into about 10 by 5 cm of space. The rest is the battery. And it runs without a fan. Amazing.

  • @jerryschull2122
    @jerryschull2122 8 years ago +6

    Seems way too pricey. The Pi 2 and Pine64 are really cheap and have significant processing power, which fits most project requirements.

    • @0xf7c8
      @0xf7c8 8 years ago +1

      +Jerry Schull Google cluster, that is what this is designed for.

    • @stevebez2767
      @stevebez2767 6 years ago +1

      yeah like all back too The Simpsons as some crazy kid finds 'dirty riffs in basic 'coo yells for skitso anarchy run hells exterimination repeat of give it too the keyIDzz sig moon frieds?

  • @RinksRides
    @RinksRides 7 years ago

    I think Moore's law is still relevant; it's just taking a different direction. We're getting more and more powerful computing ability while cost and power consumption can be lowered at the same time. So if you view it at a performance-per-watt level, then Moore's law is still relevant in that context.

    • @smorrow
      @smorrow 7 years ago

      Moore's law proper is about the number of transistors.

  • @madgamer3974
    @madgamer3974 8 years ago

    cloud phone connected by internet to supercomputer = best phone ever :D

  • @roschereric
    @roschereric 7 years ago

    Just think that at the same time, Nvidia already had the 970 at less power per GFLOP. Pair that with a dual-core ARM and you have a better performer.

  • @einsteinwallah2
    @einsteinwallah2 4 years ago

    make this in 480p or higher

  • @artlab_one
    @artlab_one 7 years ago

    Would love to see a Blender 3D test on this device :)

  • @afronprime51
    @afronprime51 6 years ago

    Can you use them as a render farm?

  • @38KSW
    @38KSW 7 years ago

    Too bad you can't find this thing anyplace.

  • @rospotrebpozor3873
    @rospotrebpozor3873 8 years ago +2

    The problem is that a program has to compute one result before it can make a decision about another;
    parallel processing does not solve that problem (see the sketch after this thread).

    • @Thyhorrorchannel
      @Thyhorrorchannel 8 years ago +1

      +rospotreb pozor RISC .

    • @walter0bz
      @walter0bz 8 years ago +1

      +rospotreb pozor Many algorithms parallelise fine. It changes the way you program and the types of work you can do. See deep learning (which became viable thanks to GPUs, and is a poor use of a CPU): it can use huge parallelism across layers and deeper nets, but at the moment it suffers from communication bottlenecks in clusters. The point here is that parallelism with local memories and an on-chip network overcomes that.
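
      An illustrative sketch of the distinction this thread is about (not from the talk): the first loop's iterations are independent and can be split across cores, while the second has a loop-carried dependency, so each step must wait for the previous result no matter how many cores are available.

      #include <stdio.h>

      #define N 1000000

      static double a[N], b[N];

      int main(void)
      {
          for (int i = 0; i < N; i++) { a[i] = i; b[i] = 0.0; }

          /* Independent: b[i] depends only on a[i]; cores can split the range. */
          #pragma omp parallel for
          for (int i = 0; i < N; i++)
              b[i] = 2.0 * a[i] + 1.0;

          /* Dependent: x at step i needs x from step i-1; extra cores do not help. */
          double x = 1.0;
          for (int i = 1; i < N; i++)
              x = 0.5 * x + 1e-9 * a[i];

          printf("b[10] = %f, x = %f\n", b[10], x);
          return 0;
      }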

  • @tenshi7angel
    @tenshi7angel 6 years ago

    The problem with the Parallella is that there are programs that cannot be split across multi-core or multi-system setups.

  • @adavistheravyn573
    @adavistheravyn573 7 years ago

    I happen to work in the field of high-performance computing and had something similar in mind for my own numerical simulations or BOINC stuff.
    Before spending hundreds of euros on RPi3 boards, I did some tests with a special version of my n-body code, which is written in C, highly optimized, and uses the OpenMP library for parallel computing (a sketch of that kind of loop follows this thread). My benchmark focused on floating-point performance with negligible RAM consumption. What are my results? Well, it's disillusioning. My benchmark task took 65 minutes on a single core of my RPi3, while a single Core i5-6500 solved the problem in 65 seconds! Using four threads, the RPi3 still took more than 18 minutes, while my Intel Core i5-6500 got the job done in 17 seconds.
    Conclusion: Neglecting communication overhead, I would have to come up with more than 60 RPi3 boards to get on par with a decent Core i5-6500 ... ARM might give you more FLOPS per watt, but when it comes to pure floating-point performance, the architecture is still far behind.

    • @GeekBoy03
      @GeekBoy03 7 years ago

      ARM processors come in a very large variety, with up to 8 cores. The Raspberry Pi 3 uses a lower-end ARM Cortex-A53; the upper end is the Cortex-A75. But remember, ARM has low power usage as a priority. ARM is certainly getting more powerful, and some companies have started making laptops with ARM processors.
      Remember, the Raspberry Pi is just for projects and prototyping.
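
      For readers curious what that kind of OpenMP n-body benchmark looks like, here is a minimal sketch of a parallelised O(N^2) force loop. It is an illustration only, not the commenter's actual code; N and the softening constant are arbitrary. Build with something like: cc -O3 -fopenmp nbody_sketch.c -lm

      #include <math.h>
      #include <stdio.h>
      #include <stdlib.h>

      #define N 2048

      static double px[N], py[N], pz[N], m[N];
      static double ax[N], ay[N], az[N];

      int main(void)
      {
          for (int i = 0; i < N; i++) {
              px[i] = rand() / (double)RAND_MAX;
              py[i] = rand() / (double)RAND_MAX;
              pz[i] = rand() / (double)RAND_MAX;
              m[i]  = 1.0;
          }

          /* Each thread accumulates accelerations for its own slice of bodies;
           * the iterations over i are independent, so the loop parallelises well. */
          #pragma omp parallel for schedule(static)
          for (int i = 0; i < N; i++) {
              double fx = 0, fy = 0, fz = 0;
              for (int j = 0; j < N; j++) {
                  if (j == i) continue;
                  double dx = px[j] - px[i], dy = py[j] - py[i], dz = pz[j] - pz[i];
                  double r2 = dx*dx + dy*dy + dz*dz + 1e-9;   /* softening term */
                  double inv_r3 = 1.0 / (r2 * sqrt(r2));
                  fx += m[j] * dx * inv_r3;
                  fy += m[j] * dy * inv_r3;
                  fz += m[j] * dz * inv_r3;
              }
              ax[i] = fx; ay[i] = fy; az[i] = fz;
          }

          printf("a[0] = (%g, %g, %g)\n", ax[0], ay[0], az[0]);
          return 0;
      }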

  • @williamhart4896
    @williamhart4896 8 years ago

    Hmm, this board plus another company's board, both of them running in one device: the Parallella as a co-processor and a Pine A64 as the main board. Hmm, a supercomputer in a tablet case?

  • @Btw_visit_____todacarne-com

    Have a look at OpenPiton before buying this board.
    And bear in mind that Andreas Olofsson has abandoned ship (left Adapteva).

  • @terrance_huang
    @terrance_huang 6 years ago

    Ditch the soft cores and do it in bare-metal Verilog; you can get another 10x performance.

  • @friggindoc
    @friggindoc 6 years ago

    I need this!

  • @garryclelland4481
    @garryclelland4481 7 years ago

    very impressive

  • @Raven-fu1zz
    @Raven-fu1zz 3 years ago

    Why can't you just use a GPU to do the calculations? They have thousands of cores, and per watt you would get more performance.

  • @Donatellangelo
    @Donatellangelo 8 years ago

    By my calculations, building an actual supercomputer with these would cost almost $100,000.00!!!!! D: Holy shit!

  • @DAVIDGREGORYKERR
    @DAVIDGREGORYKERR 8 years ago

    I wonder if anyone has built a supercomputer around 16 boards containing 64 INMOS T800 Transputers each, which equals 1024 Transputer cores, that will run Linux.

  • @ChrisD__
    @ChrisD__ 8 years ago +17

    If I could run Blender Cycles on this, I'd take fifty.

    • @mutantgenepool
      @mutantgenepool 6 years ago +2

      Was thinking the same thing. xDD

    • @Art7220
      @Art7220 6 years ago +2

      Can it run XP or Crysis, or Bitcoin Mining? Someone always asks about Crysis.

    • @afronprime51
      @afronprime51 6 years ago +1

      Reading my mind

    • @Phoen1x883
      @Phoen1x883 6 years ago +1

      With only 1 GB of RAM, you'd be fairly limited in your scene size.
      In addition, rendering requires lots of high-speed access to _all_ the memory, as rays need to bounce around the scene (and therefore around memory). Just looking at the block diagram, you can see that none of the cores have direct access to a large block of memory. Unless there is some extremely fast communication bus between cores, that means long pauses in execution while data is fetched from memory.
      It would be nice if we could get someone familiar with Cycles internals to take a look and evaluate whether the Parallella architecture is usable for rendering. I did some quick searches and didn't find anything solid.

  • @hanniffydinn6019
    @hanniffydinn6019 6 years ago

    Anyone remember transputers?

  • @jarisipilainen3875
    @jarisipilainen3875 7 years ago

    If anyone is interested, how fast is a cluster of 5 Raspberry Pi 3s? And it doesn't cost $180 :) But I didn't say this board isn't good; it needs more cores, lol.

  • @Gamepak
    @Gamepak 6 years ago

    Cool, but does it run Crysis?

  • @TheTurnipKing
    @TheTurnipKing 6 years ago

    16:21 That says far more to me about the overpricing of the Mac than anything else.

  • @Donatellangelo
    @Donatellangelo 8 years ago

    I hope this doesn't have any of the NSA's poison on it.

  • @diskgrind3410
    @diskgrind3410 8 years ago +1

    Other than the Jabba the Hutt in the audience, I thought it was a good speech.

  • @Masoudy91
    @Masoudy91 8 years ago

    A Mac at 2.4 GHz took 14 sec.
    Shouldn't 18 (or 20?) cores at 1 GHz add up to 18 GHz or 20 GHz?
    Yet it took 18 sec?
    Not really familiar with computation stuff... :(

    • @ToriRocksAmos
      @ToriRocksAmos 8 years ago +1

      +Yousif Tareq You can't just add the numbers up. Those are entirely different machines running different architectures.

    • @Masoudy91
      @Masoudy91 8 years ago

      +Marcel Krebs yep, so I heard.

  • @ilivill
    @ilivill 7 years ago +1

    Parallella running Parabola :D

  • @RobbieFPV
    @RobbieFPV 8 years ago +35

    Haha, I saw "Hightower" and immediately expected a huge black cop.

    • @StefanBlurr
      @StefanBlurr 8 years ago

      he died a long time ago :'(

    • @RobbieFPV
      @RobbieFPV 7 years ago

      Oh haha, yeah of course! I hardly play that map though. I'm more of a Goldrush or Dustbowl player :v

  • @maxlol0
    @maxlol0 7 years ago +1

    Could be good as a Linux media server or NAS; a bit weak for main computing tasks.

    • @iluan_
      @iluan_ 6 years ago

      It has a Zynq FPGA chip from Xilinx. For many applications, that's more than enough for high-performance computing.

  • @jarisipilainen3875
    @jarisipilainen3875 7 years ago +1

    You only used 1 thread on the Mac, lol.

  • @JohnSmith-ut5th
    @JohnSmith-ut5th 6 years ago

    A GPU is *far* more energy efficient in comparison to its processing power. A *low-end* GPU would absolutely smoke this device. The main advantage of this device is physical *portability,* not energy efficiency.

  • @mann2.088
    @mann2.088 7 years ago

    You know, there is also the proposal of a quantum computer.

  • @fy7589
    @fy7589 6 years ago +1

    This is not a new idea. We already build cluster computers using super-high-end hardware, and one chip in them is capable of the same speed as thousands of Parallellas or Raspberry Pis. Just one chip. And it is much more power-efficient and space-friendly than building huge Pi clusters. Instead, FPGA chips will become more popular in the future.

  • @IraQNid
    @IraQNid 8 years ago +3

    A fractal Parallella cluster is the real answer. But how well does it run beneficial programs such as SETI@HOME and BOINC? These are programs that seek to solve our most pressing issues of the day using idle distributed CPU and GPU cycles. That idle CPU and GPU processing power is used on tiny segments of data sent to users all around the globe, and the results are sent back to the researchers' computer centers. I used to participate in SETI@HOME, Einstein@Home, and BOINC to help solve the mysteries of our Universe, to find a cure for cancers, and to produce better rice yields to feed more people with an improvement in how the rice is grown.
    You might want to research the computational power of a Titan-series GPU and something called "CUDA" :)

    • @marcusdudley7235
      @marcusdudley7235 8 years ago

      I used to run BOINC too, but, although it claimed to only use idle cycles, my CPU and GPU showed much more active cooling with BOINC running and my power usage almost tripled.

  • @SamuelBSR
    @SamuelBSR 6 years ago +1

    2015, hahaha :))))
    It's 2018 and where is Parallella now?

  • @vinny142
    @vinny142 8 years ago +3

    @16:19 "a 150 dollar device was comparable to a $2000 Mac"
    Well, to one core of the Mac; he's only using one core on the Mac, not all four, which is what the $2000 buys. So really he is comparing a $500 Mac to a $150 device. Lose the screen and the rest of the hardware, and the price is about the same.
    And even then it's only true for this particular application. Do you do much prime-number checking? I've never done it either.
    Parallel computing is of course nothing new; back in the early 2000s, companies like Industrial Light and Magic and Pixar learned very quickly that you get much more bang for your buck by adding many, many small cheap nodes rather than fewer, faster, more expensive nodes. Adding one 2 GHz core to a system adds 2 billion instructions a second, which is the same as upgrading four cores from 2 GHz to 2.5 GHz, and that upgrade is a lot more expensive than one 2 GHz core.

  • @monetize_this8330
    @monetize_this8330 5 years ago

    RISC is Reduced Instruction Set *Computer*

  • @JeremySiedzik
    @JeremySiedzik 6 years ago

    pure sales pitch

  • @Nomoreidsleft
    @Nomoreidsleft 6 years ago

    I don't know why he's even calling it a supercomputer. Only 16 cores, and probably doesn't even do floating point.

  • @-ColorMehJewish-
    @-ColorMehJewish- 7 years ago

    I'll just connect and link my RPis, thanks; at least till something truly better comes out, not just something scaled into one board.

  • @Petr75661
    @Petr75661 8 years ago

    The Mobileye EyeQ4 pulls 2.5 teraflops @ 3 W. The Parallella gives only 0.09 teraflops @ 5 W.

    • @llothar68
      @llothar68 8 years ago +1

      +jednoucelovy
      Yes, it's all fake. In the real world nothing in the ARM world beats Intel on performance/watt (except GPUs, if you use matrix algorithms in single precision).

  • @j.macjordan9779
    @j.macjordan9779 6 years ago +4

    When did # of CPU cores become the sole measure of one's genitals, with zero regard for RAM? I'd think this has it backwards if anything... If my analogy fits - & I think it does - It's like a genetic freak show: a dude with 18 testicles and a centimeter long wrecker. That ain't cool...it's not getting the job done. I certainly don't want that...