You have this incredible talent for speaking really clearly, providing all the information with no chatter or wasted words, explaining it all in a concise manner. Attributes pretty much unheard of on YouTube. I doff my cap, sir.
As a non-native English speaker, I must agree that Mike's fast speech and somewhat lispy pronunciation make it difficult to comprehend.
Mike is a very intelligent guy. He speaks too fast for me to follow a lot of the time but that is my problem, not his. I just have to frequently replay certain bits.
Evidently the haters don't speak English. Feel free to watch at 0.5x speed.
You are so amazingly fortunate to have found a niche. I would give my talent to learn how to find a niche and grow in it…
Mike, you are a very clever chap. I always learn loads from your videos, when it sinks in!
Pure gold! That's a very nice idea to use DMA to stop the timer. Thanks for the video.
nice explanation and very clever to also stop the clock via dma!
Nice idea and I'm going to try out something similar, now I know it stands a good chance of working without needing 6 × SPI channels and a 144 pin micro. Many thanks.
Very interesting stuff. Glad I discovered your channel.
I love your teardown videos Mike but this has melted my brain!
Nevertheless, keep up the great work.
Great video. I love playing down with the bits and DMA. Reminds me of setting up a Texas C64 with multiple chained DMAs to split apart incoming I*Q data streams from a complex QUAD down converter chip.
I really liked your idea of using a second DMA to write to the T3CON register. I thought of the external gate too while you were explaining the problem. But I would have stopped there. Nice thinking!
Very, very nice. Cool trick using a DMA to transfer the 0 constant.
well done sir. Much respect.
cool. I do love these types of videos
You are a genius. Will have to try if I can do the same thing on low end STM32. STM32F030 is very cheap in price.
I'm starting to understand your videos more and more so I guess college is worth it.
I do a similar thing with the Spixels project (that we were using in FlaschenTaschen) on the Raspberry Pi (well, writing out per DMA that is, but not using the DMA controller to trigger on an interrupt).
Great stuff! Thanks :)
Question:
On ARM, you can set and clear individual pin bits with a single word write, and it will only affect the pins whose bits are 1 in that word. So in theory you can use a 16-bit port, toggle the odd pins, and use the even ones as inputs or outputs with no head smashing.
Maybe there is the same thing in PIC?
Great video, Mike. Thanks! Also nice idea to use DMA to stop the clock. Was wondering if it would be possible - albeit a bit brain-frying - to shuffle the input data using DMA gather 'n' scatter. Dunno if this part has it... Cheers
Some Freescale (now NXP) ARM microcontrollers have a "FlexIO" peripheral which can be configured for many different types of serial output. Unfortunately only the high-end parts (I think) have enough channels (more than 4) for this kind of task. But you may be able to combine them with the existing SPI peripheral to get enough SPIs (4 FlexIO outputs + 2 SPI... maybe).
Brilliant !
Thanks for sharing
I see the NSL stamp. What day were you there?
I just realized something -- why isn't the foreground task (green trace - pin switching on and off) active after the bit-fiddling is done? Shouldn't the foreground task be active during the DMA transfers as they are asynchronous?
@Mike - Nice work. If you enjoy exploiting the peripheral set to emulate other functions, you would love playing with the XMOS chips. You essentially write your own peripherals using a bunch of resources such as timed ports (with shift registers) and user clocks under software control, all without worrying about interrupt latency because you dedicate a core to doing just that task (so it's always ready to react without a context switch). No need for significant buffering. Great for doing high channel count bridging (ethernet/serial to serial/PWM etc.).
Not looked in too much detail at XMOS - too much new stuff to learn for occasional use.
The 72 channel board is actually to replace an XMOS board that a customer uses, which went obsolete.
Interesting! Keep up the good work - really enjoy the videos. One day I'll own a scope that good...
Did you try to use the PMP? Seems like the first obvious choice. You mention the clock pulse of the PMP would be very short. Too short? I want to try this with those 32by16 display panels. Thnx for sharing your quite brilliant and out of the box insights ;)
How long did it take you to come up with this implementation? I can only imagine how long it would have taken me to develop something like this and your design is VERY clever. I love watching your videos Mike.
Took a few attempts in different directions after I realised I couldn't use interrupts - couple of days maybe.
Can you not enter the interrupts faster by managing what you touch, and using a naked interrupt call?
That makes the compiler not automatically save the stack context (so you have to be careful to not clobber the stack, or fix what you clobbered), but it can considerably speed up an interrupt.
Quite possibly, but this way will always be faster.
Also I'd have to learn more about the MIPS architecture and assembler than I can be bothered to.
You beat me to it, I was going to ask if bit banging in assembler was an option.
The point is you need to save the CPU registers. If you want to avoid that, you could allocate a number of them for ISR use, and make sure you never touch them outside of the interrupt, but that would mean writing everything in assembler yourself.
I'm pretty sure you can tell most C compilers to not touch specific registers, so it shouldn't have to *all* be in assembly.
Even if you can, it means you have to be able to compile all code involved in the project from source (so no prebuilt part-support libraries etc.).
Is this with the internal PIC32 oscillator?
Don't recall you mentioning it, but if you can set up the DMA in a circular mode you should use that.
It may not be a bad idea to disable the serial rx interrupt once you get your packet (based on address) and re-enable once the DMA memory is loaded.
That won't work as you need to maintain framing of the packets - if you suddenly re-enable after doing whatever you're doing, you could be in the middle of a packet destined for another node, and not know when your one starts. (There is a way to avoid that issue, by using 0xff as a start-of-packet marker that's never used within packets, but it's then hard to implement variable-length packets, and you need to not use 0xff within data.)
Ah, good point. Coming up with out-of-band markers is a PITA.
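The 0xff start-of-packet idea mentioned above could look roughly like the receiver below. This is only a sketch under assumptions that are not from the video: a packet is taken to be 0xff, one address byte, then a fixed 4-byte payload, and 0xff never appears in the address or the data. `framer_t`, `framer_init` and `framer_feed` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SOP         0xFFu  /* start-of-packet marker, never used inside a packet */
#define PAYLOAD_LEN 4u     /* assumed fixed payload length */

typedef enum { WAIT_SOP, WAIT_ADDR, IN_PAYLOAD } rx_state_t;

typedef struct {
    rx_state_t state;
    uint8_t    my_addr;
    uint8_t    payload[PAYLOAD_LEN];
    size_t     count;
} framer_t;

void framer_init(framer_t *f, uint8_t my_addr)
{
    assert(f != NULL && my_addr != SOP);
    f->state = WAIT_SOP;
    f->my_addr = my_addr;
    f->count = 0;
}

/* Feed one received byte. Returns 1 when a complete packet addressed
   to this node has just been stored in f->payload, otherwise 0. */
int framer_feed(framer_t *f, uint8_t byte)
{
    if (byte == SOP) {            /* the marker always restarts framing */
        f->state = WAIT_ADDR;
        f->count = 0;
        return 0;
    }
    switch (f->state) {
    case WAIT_ADDR:               /* packets for other nodes are skipped */
        f->state = (byte == f->my_addr) ? IN_PAYLOAD : WAIT_SOP;
        return 0;
    case IN_PAYLOAD:
        f->payload[f->count++] = byte;
        if (f->count == PAYLOAD_LEN) {
            f->state = WAIT_SOP;
            return 1;
        }
        return 0;
    default:                      /* WAIT_SOP: ignore bytes until a marker */
        return 0;
    }
}
```

Because every 0xff forces a resync, a receiver that starts listening mid-packet just discards bytes until the next marker. The cost is exactly what the comment says: 0xff is off limits in the data, and variable-length packets need extra work.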
Do you have some more upcoming teardown vids?
Btw, why did you turn off the ads from your channel? It could at least re-scoop you some money which you could invest to buy cool stuff for teardowns.
Yes - some really interesting stuff coming up but needs a lot of time to do justice & rather busy atm.
I enable minimal ads on new vids & turn more on for older ones, so subscribers don't see too many.
You should turn on the ads on all vids. Those people who want to see them will wait 5 seconds anyway. It is annoying that informative channels barely make money, while channels which present useless nonsense have millions of subscribers and SHAMELESSLY ask for more money and support...
Note, for an advert to count towards monetization, the viewer must let it play for at least 30 seconds, or, obviously, entirely if shorter than 30 seconds.
I don't know if you have the memory, but why not double buffer? You can do your bit twiddling, then swap array pointers to output it with DMA. Then you can do both parts simultaneously. There could be memory bandwidth contention, but I would hope the DMA would have priority.
That would improve total throughput, but that isn't really a problem here as there are multiple devices on the bus, so there is plenty of time to send.
The running start idea is neat, and you would see this all over the place in video game console programming. Like on PS2 you could start DMA to start consuming a buffer before you have finished filling that buffer. Or the C64 scrolling routines that were still moving memory at the bottom of the screen while the raster was drawing the top, and the raster caught up just as you were finished. But as you say, it's working fine without these optimizations.
The problem with starting the DMA while filling the buffer is you have to account for the load that an incoming serial packet may create. I did look at it but by the time you assume the fill will be slowed by the serial handler the benefit was negligible. Double-buffering would give more overall throughput if needed, but would add latency.
Of course you could have the fill process periodically check where the DMA had got to before proceeding, but that would only potentially improve average time, not peak, which is what matters.
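For anyone wondering what the double-buffering suggestion earlier in this thread boils down to, it is essentially a pointer swap between a buffer the CPU fills and one the DMA sends from. A minimal host-side sketch, with the buffer size and the names `dbuf_t`, `dbuf_init` and `dbuf_swap` all invented for illustration (no real DMA is involved here):

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define BUF_LEN 8u  /* assumed size, for illustration only */

typedef struct {
    uint8_t  a[BUF_LEN];
    uint8_t  b[BUF_LEN];
    uint8_t *fill;  /* the CPU does its bit twiddling into this buffer */
    uint8_t *send;  /* a DMA channel would read from this buffer */
} dbuf_t;

void dbuf_init(dbuf_t *d)
{
    assert(d != NULL);
    memset(d->a, 0, sizeof d->a);
    memset(d->b, 0, sizeof d->b);
    d->fill = d->a;
    d->send = d->b;
}

/* Swap roles. On real hardware this should only happen once the DMA
   has finished with the current send buffer. */
void dbuf_swap(dbuf_t *d)
{
    assert(d != NULL);
    uint8_t *tmp = d->fill;
    d->fill = d->send;
    d->send = tmp;
}
```

As noted in the replies above, this buys throughput but adds a buffer's worth of latency, since data is only sent after the next swap.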
If you have the RAM, you could instead embed the clock signal with the data, writing the same data twice to the I/O port, but toggling the clock pin. It costs twice the RAM, and the max transfer speed is halved, but you don't need the additional timers and there's no overruns.
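A rough sketch of the clock-in-the-data idea above, assuming (purely for illustration) an 8-bit port with the clock on bit 7, data on bits 0 to 6, and receivers that latch on the rising clock edge. `CLK_BIT` and `embed_clock` are hypothetical names, and the output array stands in for the buffer a DMA channel would copy to the port.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CLK_BIT 0x80u  /* assumed: clock on bit 7, data on bits 0..6 */

/* Expand n data words into 2*n port words: each word is written once
   with the clock bit low, then again with the clock bit high.
   out must have room for 2*n bytes. Returns the number of port words. */
size_t embed_clock(const uint8_t *data, size_t n, uint8_t *out)
{
    assert(data != NULL && out != NULL);
    for (size_t i = 0; i < n; i++) {
        uint8_t d = (uint8_t)(data[i] & (uint8_t)~CLK_BIT);
        out[2 * i]     = d;                       /* data set up, clock low */
        out[2 * i + 1] = (uint8_t)(d | CLK_BIT);  /* same data, clock high  */
    }
    return 2 * n;
}
```

This shows the trade-off described in the comment: the output buffer is twice the size and each data word takes two port writes, but no separate timer is needed to generate the clock.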
Got a link to the source for the strips? They look interesting.
custom made for this installation.
Aw.
At 15:01 is that a Not So Loud disco stamp on your hand?
Nowadays the young generation no longer knows the good old Disco...
Bloody PIC32s... blyat! What type of PIC have you used there? (Want to compare and find a match in the ARM family). Thanks
PIC32MX150F128B
I like that you use PICs! What is your opinion on the STM32s?
Never used them. Used to use NXP ARMs but started hitting code size limit of IAR free version and full version was crazy expensive.
One reason for using PICs is lots of RAM in small pin-count packages, also flexible pin mapping on PIC24F. For bigger jobs, ability to buy from MicrochipDirect ready-programmed is a big time saver. Architecture doesn't really matter much.
PICs have become my go-to MCU for projects too! Specifically the dsPIC33Es; I quite like the PPS capability too. Do you know any good dev boards for the PIC32s? I've noticed Microchip have no sane dev boards for sale for any chip! Had to roll my own for the dsPICs...
Can you not target the ARM with GCC?
Yes but from what I hear it requires some fiddling about to get debugger etc. working.
Nice thing about IAR and MPLAB is that they're well integrated, and just work, most of the time at least.
I regularly also use PIC10, 12, 16 and 24, so having the same IDE, programmer and very similar peripherals is a big advantage.
Point taken.
If you do want a zero-fiddling ARM env, Atmel Studio is quite nice. It's based on Visual Studio, which is either nice or horrible depending on your opinions on Visual Studio.
I like it, but then I like nice GUIs.
Hopefully now that Microchip bought Atmel, the Atmel toolchain people will go and kick the Microchip toolchain people in the ass a bit, and things will get better.
Won't the DMA overwrite the 7 other bits?
Yes, but if you make sure that the other 7 bits are the ones you want, then your settings won't be changed.
I have no idea what you do and why in that video...
Maybe because I have no idea about PIC controllers.
Neat, but all I was thinking the whole video was how much easier this would be with a few lines of VHDL in a FPGA.
Yes you could, but you'd end up with a bigger package, a more expensive part, and doing stuff like firmware upload over the bus gets somewhat more complex.
SPI (and similar) without DMA sucks in any case.
You can either busy-wait between each Tx/Rx buffer load, or you can use an interrupt-driven approach (or explicitly task switch), but by the time you have the task switching overhead out of the way you will be due for another buffer load. (Depending on your timing particulars, of course, but I find this is virtually always the case.)
... not that this is necessarily relevant to your video, which I am only part way through watching ...
Use FPGA and you can have any number of any type of ports :)
what language is he speaking?
2x sped up future speak
It's a 40-minute English video with half the syllables skipped to make a shorter video 😆
ahh, that could be it
Now all of this works for TX only, of course.
I think Mike's a girl. How else could anyone speak that quickly while barely breathing.
Listening to your videos makes me become aggressive. You talk/murmur nonstop and I instantly lose focus, 5 seconds later BOOM I have forgotten everything you have said. I trust you that it's interesting stuff you talk about, but still watching feels pointless for me. Pls slow down a little and talk more clearly. Thanks :)