What NOT to do: Self Modifying Code - Computerphile

  • Published 19 Aug 2020
  • How 'not to code' with our "real" programmer - who, as Julian explains, is demoing what NOT to do. Dr Julian Onions tells us more about Mel.
    This video was filmed by Julian Onions and edited by Sean Riley.
    Computer Science at the University of Nottingham: bit.ly/nottscomputer
    Computerphile is a sister project to Brady Haran's Numberphile. More at www.bradyharan.com

Comments • 530

  • @hellterminator
    @hellterminator 3 years ago +1606

    I used self-modifying code to win a code optimization challenge back in university.
    We were writing a matrix multiplication program and competing in whose program could multiply two 10 by 10 matrices in the fewest clock cycles. However, you were not allowed to optimize your program specifically for 10 by 10 matrices. I got around that rule by writing a program which started off by optimizing itself for whatever matrix size it was fed.
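
    The commenter's code is not shown, so the following is only a minimal sketch of the general idea (specializing a routine at run time for the size it is given), not the original program. It generates Python source for a fully unrolled n-by-n multiply and compiles it with `exec`; the name `make_matmul` is hypothetical, and this is closer to run-time code generation than to patching machine code in place.

```python
def make_matmul(n):
    """Generate and compile a fully unrolled n-by-n matrix multiply."""
    lines = ["def matmul(a, b):", "    return ["]
    for i in range(n):
        row = []
        for j in range(n):
            # One unrolled dot product per output cell, with no loops left.
            terms = " + ".join(f"a[{i}][{k}]*b[{k}][{j}]" for k in range(n))
            row.append(terms)
        lines.append("        [" + ", ".join(row) + "],")
    lines.append("    ]")
    namespace = {}
    exec("\n".join(lines), namespace)  # compile the generated source
    return namespace["matmul"]


matmul2 = make_matmul(2)
result = matmul2([[1, 2], [3, 4]], [[5, 6], [7, 8]])
print(result)  # [[19, 22], [43, 50]]
```

    The generator itself is size-agnostic; only its output is specialized, which is the loophole the comment describes.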

    • @DanielBoerlage
      @DanielBoerlage 3 years ago +240

      Would you happen to still have that code? I would be very interested to see it!

    • @mr.rabbit5642
      @mr.rabbit5642 3 years ago +107

      Omg same, please! This sounds really interesting
      Just mention me here

    • @6infinity8
      @6infinity8 3 years ago +125

      That sounds like you wrote a JIT compiler. Interesting challenge nevertheless

    • @arslan113
      @arslan113 3 years ago +2

      Yep

    • @stardreamse
      @stardreamse 3 years ago +174

      Git repo link or it never happened.

  • @the3nder1
    @the3nder1 3 years ago +575

    Great video but the cake recipe was missing the 12 page backstory of the writer telling us about how their father made this when they were home sick from school.

    • @JobvanderZwan
      @JobvanderZwan 3 years ago +22

      I really, really want to see a satirical doc comment written in this style now!

    • @TheHandOfFear
      @TheHandOfFear 3 years ago +3

      @JobvanderZwan I may just steal that idea. lol

  • @Supertimegamingify
    @Supertimegamingify 3 years ago +684

    Ah, so _this_ is what I *should* be doing, I see.

    • @RupertBruce
      @RupertBruce 3 years ago

      Take a look at ConfuserEx on github...

    • @lubricustheslippery5028
      @lubricustheslippery5028 3 years ago +3

      It's easy to play around with self-modifying code in web pages with JavaScript

  • @dibblethwaite
    @dibblethwaite 3 years ago +268

    The recipe isn't really self modifying code. It just modifies data depending on the time. It's a fine example of bad code though.

    • @sheepphic
      @sheepphic 3 years ago +38

      Yeah; a self-modifying recipe would be something like "If it is a Saturday, replace step five with 'whisk in 3 eggs'; otherwise, replace the step which corresponds with the number of eggs remaining with 'add 5 mL of vanilla'", which is even _more_ confusing

    • @dmitriybabaylov6606
      @dmitriybabaylov6606 3 years ago

      @sheepphic it sounds like a typical game of dragon poker to me.

    • @maciejmanna9246
      @maciejmanna9246 3 years ago +10

      I was going to make the same comment... TBH, he was talking a lot, and - in the end - said very little... :/

    • @iabervon
      @iabervon 3 years ago +10

      Step 3: set the number in step 4 to the number of eggs in your bowl
      Step 4: bake for 0 minutes
      Step 5: change "eggs" in step 3 to "cups of sugar", cross out step 5, and start over.

    • @0LoneTech
      @0LoneTech 3 years ago

      @Stay EZ My Friends The main thing his example is doing is mixing concepts, conflating e.g. minutes and egg counts, or temperature and amount of flour (note that it does not say e.g. grams of flour). Those operations are outright invalid (my oven sure doesn't have a flour scale), and require creative reading to figure out (just like his tablespoon of baking powder). Computers are particularly known for uncreative reading.
      The true effect is demonstrating that he doesn't have a proper concept of self modifying code, as he couldn't explain it. iabervon's example in this thread is better, and particularly James Junghan's "algorithm that doesn't fly".

  • @qubex
    @qubex 3 years ago +305

    The algorithm that doesn’t fly:
    1) Switch steps 1 & 2.
    2) Flap wings.
    3) Goto step 1.

    • @cezarcatalin1406
      @cezarcatalin1406 3 years ago +10

      Oh boi

    • @TVSuchty
      @TVSuchty 3 years ago +3

      this is so bad ... hahah

    • @user-lt9oc8vf9y
      @user-lt9oc8vf9y 3 years ago +12

      Error: Recursion limit reached

    • @TVSuchty
      @TVSuchty 3 years ago +23

      @user-lt9oc8vf9y
      this won't happen

    • @RussellTeapot
      @RussellTeapot 3 years ago +30

      ahahahah at first I didn't get it, then I mocked up its execution on paper, step by step and realized that it can never flap because "switch steps 1 & 2" gets executed two times, so when "goto 1" is executed the order of instructions is the same as before and "flap wings" is never executed
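
      The paper trace described above can be reproduced with a small sketch. It assumes a toy interpreter whose program is a mutable list of three instructions; the opcode names `swap`, `flap` and `goto`, and the step limit, are made up for this illustration.

```python
def run(max_steps=30):
    # The program is a mutable list, so an instruction can rewrite it.
    program = ["swap", "flap", "goto"]
    pc = 0      # index of the instruction about to execute
    flaps = 0
    trace = []
    for _ in range(max_steps):
        op = program[pc]
        trace.append(op)
        if op == "swap":
            # "Switch steps 1 & 2": this modifies the program itself.
            program[0], program[1] = program[1], program[0]
            pc += 1
        elif op == "flap":
            flaps += 1
            pc += 1
        elif op == "goto":
            pc = 0
    return flaps, trace


flaps, trace = run()
print(flaps)      # 0
print(trace[:6])  # ['swap', 'swap', 'goto', 'swap', 'swap', 'goto']
```

      The first swap moves itself into slot 2, so it is fetched again immediately and undoes its own change before the goto runs.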

  • @ywanhk9895
    @ywanhk9895 3 years ago +302

    Program: *runs*
    Also program after 2 minutes: I have no idea who I am and where I came from

  • @cosmicrider5898
    @cosmicrider5898 3 years ago +125

    Make it self commenting as well.

  • @MrGeekGamer
    @MrGeekGamer 3 years ago +157

    I think this academic might be surprised by what you can get away with writing in a corporate environment.

    • @insertoyouroemail
      @insertoyouroemail 3 years ago +2

      Well said

    • @jeremysmitherman
      @jeremysmitherman 3 years ago +4

      ...he's worked in the corporate world for over 20 years but ok.

    • @MrGeekGamer
      @MrGeekGamer 3 years ago +27

      @jeremysmitherman all the more surprising that he thinks you can get fired for writing wacky code.

    • @evannibbe9375
      @evannibbe9375 3 years ago +4

      @MrGeekGamer Maybe the people he works with hire CS people as managers.

  • @zenithparsec
    @zenithparsec 3 years ago +162

    A better example of self modifying code using the recipe would be to have some instructions like:
    1. Copy this recipe's ingredients and instructions to a blank sheet of paper. Once complete, start following the copied instructions from step 2.
    2. Find all numbers in the ingredients on this page which are 5 or bigger. For each of those numbers, do the following: add 100 to it, cross out the original number, and write the new number in its place.
    3. Find all instructions in this recipe which contain the words "add 100". For each instruction, cross it out.
    4. Set oven to 150 degrees C.
    5. ... (and the rest)
    Notice that this would change the actual instructions you should run.
    1. Copy the recipe to a blank sheet of paper == load the program into memory.
    2. This instruction modifies the code (if it said to use 5 eggs, now it will say you need 105 eggs). Notice it only modifies the "data" here.
    3. This line finds the previous instruction, and removes it. It also finds the current instruction and removes that! This is actual self-modifying code. The purpose of this is in case you want to use this program/recipe again: if you don't remove this code, the next time you go to use it, you will have to use 205 eggs, and then 305 eggs, and eventually you'll notice something weird is happening. ;]
    4. And now we start the actual recipe.
    5. .. and we run out of eggs.

    • @osten222312
      @osten222312 3 years ago +17

      I agree, the example provided describes nothing more than a bad program really

    • @JNCressey
      @JNCressey 3 years ago +14

      1. "Read all the instructions first"
      ...
      last. "Put your pen down and relax, watch the idiots rushing to complete all the tasks."

    • @zapazap
      @zapazap 3 years ago +3

      @JNCressey: Instruction 1 says nothing about decoding the list of instructions. After reading the list, one correctly proceeds to read-decode-exec instruction 2.
      Imputation of idiocy is dangerous.

    • @ConstantlyDamaged
      @ConstantlyDamaged 3 years ago +3

      As I mentioned elsewhere, a better example could have been done with a Python program that modifies its own file and, rather than looping, instead calls itself to run via a system command, basically using tail recursion to ensure the modified code is then loaded and run.
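
      A rough sketch of that idea, under stated assumptions: the script below is written to a temporary file (the file name `selfmod.py` and the `GENERATION` counter are invented for this example), rewrites one constant in its own source on disk, and then launches itself again in a fresh interpreter until the counter reaches 3. It uses `subprocess.run`, so the processes nest rather than truly replacing one another.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# Source of the hypothetical self-rewriting script.
SCRIPT = '''\
import subprocess
import sys

GENERATION = 0
print("generation", GENERATION)
if GENERATION < 3:
    with open(__file__) as f:
        source = f.read()
    # Rewrite our own source on disk: bump the counter by one.
    source = source.replace(f"GENERATION = {GENERATION}",
                            f"GENERATION = {GENERATION + 1}", 1)
    with open(__file__, "w") as f:
        f.write(source)
    sys.stdout.flush()
    # Run the modified file in a fresh interpreter so the new code is loaded.
    subprocess.run([sys.executable, __file__], check=True)
'''


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        script = Path(tmp) / "selfmod.py"
        script.write_text(SCRIPT)
        proc = subprocess.run([sys.executable, str(script)],
                              capture_output=True, text=True, check=True)
        return proc.stdout.splitlines(), script.read_text()


output, final_source = demo()
print(output)
```

      Each run sees a different source file than the one before it, which is what makes this self-modifying rather than just data-driven.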

  • @unvergebeneid
    @unvergebeneid 3 years ago +136

    I learned more from the comments than from the actual video.

    • @Sglman
      @Sglman 3 years ago +10

      Yeah, this was not a great video.

    • @christopherlawley1842
      @christopherlawley1842 3 years ago +21

      The comments are always most important when programming

    • @0LoneTech
      @0LoneTech 3 years ago +4

      I apologize for not upvoting, but you had 42 thumbs up. (Edit: Since it's now showing 62, have an upvote.)

    • @unvergebeneid
      @unvergebeneid 3 years ago

      @0LoneTech haha, that's a fair reason ;D

    • @sunburst8810
      @sunburst8810 3 years ago +1

      Yeah this video is awful lol. Tells you nothing other than "self modifying code bad!"

  • @RoskGamer
    @RoskGamer 3 years ago +84

    How did you manage not to mention metamorphic malware?
    Decompile -> rewrite itself -> compile -> infect new host
    It's the coolest thing ever

    • @mr.rabbit5642
      @mr.rabbit5642 3 years ago +10

      Yeah, when you put "self modifying code" this way it really doesn't make much sense, yet neural networks could be somewhat like what he talked about, and we see how powerful they can be.
      I was rather thinking of something like a Java program that writes additional classes by itself, includes them within its source code, recompiles the whole thing and reboots. I'm pretty sure with some clever math we could make it into something really powerful

    • @RoskGamer
      @RoskGamer 3 years ago +4

      @Stay EZ My Friends what are you on? Wtf does binary exploitation have to do with self expanding code??? It seems as if you like to use fancy words to fulfill your ego, but this won't fly here homie

    • @djm2A
      @djm2A 3 years ago +3

      Mr. Rabbit exactly I was thinking about the necessity of self modifying code in adaptable programming.

    • @TheStevenWhiting
      @TheStevenWhiting 3 years ago +3

      Is this not mentioned? That's annoying, as I came to watch because I thought they'd be talking about that. I was fascinated by those types of virus back in the 90s, but having no Internet, I never knew where to get info on it.

    • @RBLevin
      @RBLevin 3 years ago +1

      Not to mention polymorphic ...

  • @dipi71
    @dipi71 3 years ago +6

    Reminds me of »Core Wars« in its various incarnations, where you have a virtual machine and have specially written programs battle it out with each other. In a block of memory, one program can copy parts of itself, modify any location in memory, overwrite parts of itself and of other programs etc. Whatever program ends up running (or, by any definition, surviving) is declared the winner. Very entertaining if visualized appropriately.

    • @0LoneTech
      @0LoneTech 3 years ago +1

      I had a moderately successful program in Core Wars. It ran in multiple phases and was designed around how forking would slow you down. It would capture other programs and have them run a loop to repair its own code, until it had swept all the memory and killed them off. The trivial Imp could beat it, since it never slowed down and always overwrote its next instruction.

  • @phasm42
    @phasm42 3 years ago +17

    When I was a teenager in the 90s I thought this was the best thing. One thing I quickly learned was that you couldn't modify the next instruction to be executed because at that point it would already be moving through the CPU's pipeline (80486).
    After a couple decades of professional software development, yeah, never do this 😅

  • @ArabGamesGeeks
    @ArabGamesGeeks 3 years ago +126

    I think the ingredients resemble the data, and the process resembles the instructions. The example given modifies the data rather than the instructions, which is what all programs do. I don't think this example actually works for self modifying code.

    • @abuzarov
      @abuzarov 3 years ago +10

      Yes, I was actually hoping to see code that modifies its instructions, not data

    • @vfnikster
      @vfnikster 3 years ago +3

      Real self-modifying code changes instructions in the cookbook, adding new ones, erasing, or changing quantities, unlike the example here which simply complicates the recipe.

    • @seraphina985
      @seraphina985 3 years ago +1

      @vfnikster Exactly. I was thinking a closer approximation would be to have an instruction like "2: if before 3pm, substitute the word grill for the word bake in instruction 5". That would seem more analogous, as grill is a different operation to bake: the data the operation is applied to is not changed here, but you are changing the result by subjecting the data to an alternate operation. Just imagine that steps 3 and 4 both involve placing some items into a metal tray (for purposes of the analogy, this tray could be a register); later, step 5 says either to bake or grill whatever is in that tray. The outcome will be different even though the contents (data) are the same, as the operation has changed; grill would work better if the tray contains a burger than if it contains a cake.

  • @thehint1954
    @thehint1954 3 years ago +80

    I was with him on self modifying code until he started talking about eggs and flour.

    • @RobertShippey
      @RobertShippey 3 years ago +6

      I think the recipe should use self-modifying flour rather than self-raising flour

    • @martinsnow6641
      @martinsnow6641 3 years ago +5

      This is one of my pet peeves with a lot of people teaching programming.
      I started out in 2015, and back then I had a hard time wrapping my head around concepts like objects, polymorphism and reflection.
      It came down to these horrible analogies that just make the subject much harder to understand.
      Call the things what they are. Honestly.

    • @colto2312
      @colto2312 3 years ago +2

      @martinsnow6641 It's all interconnected shoeboxes with strings; in the boxes are Excel spreadsheets.

  • @DaveWhoa
    @DaveWhoa 3 years ago +4

    self modifying code is usually not the way to go about things, but THE _ABILITY_ to create self-modifying code is POWERFULLY FLEXIBLE, providing something else that not many languages can do. It should not just be discounted stereotypically as something "bad" to do (even if it often is)

    • @sergey1519
      @sergey1519 3 years ago

      You can do literally everything without it; some fraction of things would just be a bit slower or longer, in exchange for not having code that changes itself (realistically, if you need to change write-only code like this, it would be best to just write new code)

  • @congenio
    @congenio 3 years ago +14

    There actually WERE valid reasons to use self-modifying code in the past. I once used a piece of 8086 code that modified code bytes that were about to be executed in the next instructions. Those instructions could not be disassembled, namely the byte sequence 0xd4, 0x09. Actually, 0xd4, 0x0a is the byte sequence that disassembles to "AAM", i.e. "ASCII Adjust AX After Multiply". This instruction effectively adjusts a multiplication result to the base of 10 (0x0a). As one can imagine, the sequence 0xd4, 0x09 also works, but to the base of 9. Once you figure that out and set appropriate values, you can execute that code in order to get a specific result which you can test. By correcting the second code byte to another value, say, 0x0b, you can change the result. However, if you patch the second code byte within a few instructions before actually executing it, the code byte sequence has already been prefetched and decoded by the CPU, such that the change has no effect at all.
    What this boils down to is a piece of code that cannot be disassembled and behaves differently when executed normally versus being single-stepped in a debugger (where said code modification actually has an effect), thus making analysing the code hard to almost impossible.
    Nowadays, this is effectively useless since executable code segments are usually flagged read-only in memory.

    • @JNCressey
      @JNCressey 3 years ago +1

      So, what was the valid reason?

    • @jeffreyscott4997
      @jeffreyscott4997 3 years ago +1

      @JNCres The alternative would be to have a branching structure to multiple sets of said code. Or a different instruction that referenced a memory location or register.
      What was described saves a couple of bytes and a couple of clock ticks. Today's fast and highly optimized CPUs don't need that level of detailed optimization. But on the earliest computers you needed all you could get.

    • @Internetzspacezshipz
      @Internetzspacezshipz 3 years ago

      Just because code segments are in read-only memory doesn’t mean you can’t change them to read-write memory. There are API calls to Windows that let you do that, at least. I don’t know about other environments though.

    • @passerby4507
      @passerby4507 3 years ago

      This is absolutely amazing, you could theoretically just make your debugger also "prefetch" too couldn't you?

    • @congenio
      @congenio 3 years ago

      @passerby4507 Not via the method that a debugger usually employs for single-stepping: It sets a breakpoint after the next instruction sequence and catches the state after that. Since the running code is interrupted by the debugger itself, the prefetch queue is cleared. As a matter of fact, you have to disable interrupts there to be sure that something similar does not happen even without the debugger. If you really simulated the hardware (including prefetch logic) instead, you might be able to do it - but how could a simulator cope with an instruction that is not even documented (like 0xd4, 0x09)?

  • @jay_sensz
    @jay_sensz 3 years ago +6

    One potentially legitimate use case for self-modifying code is to intercept library calls and substitute your own code (hooking). This can be very useful for purposes of debugging, reverse engineering, etc.
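
    As a high-level analogue only: native hooking usually patches machine code or import tables, which is not shown here. In Python the same interception effect can be sketched by rebinding a library function at run time. The wrapper name `hooked_sqrt` and the `calls` log are hypothetical.

```python
import math

calls = []                  # log of intercepted arguments
original_sqrt = math.sqrt   # keep a handle on the real implementation


def hooked_sqrt(x):
    # Record the call, then forward to the original function.
    calls.append(x)
    return original_sqrt(x)


math.sqrt = hooked_sqrt     # install the hook
try:
    value = math.sqrt(16.0) # callers going through math.sqrt now hit the hook
finally:
    math.sqrt = original_sqrt  # always restore the original

print(value, calls)  # 4.0 [16.0]
```

    Code that imported the function directly (`from math import sqrt`) before the hook was installed keeps the original reference, which is one reason native hooks patch the callee rather than the name.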

  • @hellterminator
    @hellterminator 3 years ago +12

    While all modern operating systems make code read-only by default, they also give you ways to get around it, be it a flag in the binary or an API you call at runtime.

  • @user-lt9oc8vf9y
    @user-lt9oc8vf9y 3 years ago +9

    Me:
    Me: YES!! Show me
    Me:
    Me: what why not?

  • @KuraIthys
    @KuraIthys 3 years ago +37

    Self-modifying code is interesting.
    It lets you do things that might otherwise be impossible.
    But as expected of something so powerful, it can very easily cause massive problems if used poorly.
    In a modern context the downsides vastly outweigh the upsides.
    However, it's worth remembering how people worked around historical limitations.
    For instance, imagine having a CPU that doesn't contain any stack based instructions.
    You don't HAVE to imagine, because these DID exist.
    This might not immediately seem like a problem, but without a stack, you can't do function calls, because you won't be able to store a return address. (It doesn't matter whether your CPU has explicit 'function call' instructions. If it doesn't, you can mimic one using jump instructions combined with manually pushing and popping things onto a stack. However, if you don't have a stack... Life gets really complicated.)
    So... How do you implement function calls (to be clear on what a function is at this extremely low level - it's where you change the flow of program execution to a different location in memory, then execute some instructions before returning to approximately the same location in memory that you started from) without a stack?
    As it turns out, the answer is self-modifying code; You include a jump instruction at the end of your 'function', and before calling the 'function' you write the return address to the memory location that contains the returning jump instruction of that function.
    It's an interesting example of a workaround, but also of some of the features of even late 70's CPU designs that we now take for granted but which weren't always present.
    You might not immediately know why a CPU needs a stack (as opposed to it just being a nice convenience feature, as a lot of more complex/later instructions are; Nobody absolutely needs a dedicated instruction to store zero to memory for instance, nor a matrix multiply, nor pretty much anything that falls under the category of a SIMD instruction), but as it turns out, a CPU design that has no stack has some very awkward limitations...
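
    The return-address trick described above can be sketched on a toy machine. Everything here is an assumption made for illustration: the four opcodes (`PATCH`, `JMP`, `ADD`, `HALT`), the memory layout, and the single accumulator do not correspond to any real CPU.

```python
def run(memory):
    """Execute a toy program whose instructions live in a mutable list."""
    acc = 0
    pc = 0
    while True:
        op, *args = memory[pc]
        if op == "PATCH":
            # Overwrite the subroutine's final jump with the return address.
            jump_addr, return_addr = args
            memory[jump_addr] = ("JMP", return_addr)
            pc += 1
        elif op == "JMP":
            pc = args[0]
        elif op == "ADD":
            acc += args[0]
            pc += 1
        elif op == "HALT":
            return acc


memory = [
    ("PATCH", 6, 2),  # 0: first call: make the subroutine return to 2
    ("JMP", 5),       # 1: "call" the subroutine at address 5
    ("PATCH", 6, 4),  # 2: second call: make it return to 4
    ("JMP", 5),       # 3
    ("HALT",),        # 4
    ("ADD", 10),      # 5: subroutine body
    ("JMP", 0),       # 6: return jump; its operand is rewritten by callers
]
result = run(memory)
print(result)  # 20
```

    Because there is only one slot for the return address, a subroutine that called itself would overwrite it, which matches the recursion problem raised in the replies.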

    • @passerby4507
      @passerby4507 3 years ago +1

      Isn't that just cheating out a "stack" out of (instruction) memory? It also screws over recursion, kills your kittens and opens Pandora's box.

    • @SimonBuchanNz
      @SimonBuchanNz 3 years ago +1

      Couldn't you just reserve some register as the stack pointer and manually push/pop? Genuinely curious.

    • @bluerizlagirl
      @bluerizlagirl 3 years ago +3

      @SimonBuchanNz Not if you can only write to an exact location which is embedded directly into the instruction itself and not modified by an index register. See the SC/MP for an example. The Z-80 and 6502 processors, which were the main fuel for the 1980s home micro scene, both had hardware stacks; the 6502's hardware stack was more limited than the Z-80's, but its architecture allowed you to implement separate data stacks (plural intended!) in software.

    • @srenkoch6127
      @srenkoch6127 3 years ago +4

      He also mentioned bootstrapping as a place where self modifying code is used.
      I have used it for my home-built 256-byte Computer/CPU (yes, it literally only has 256 bytes of RAM + 2K of EEPROM, being the biggest, slowest and least user friendly computer in my house :-).
      When I want to load a program (if for instance I want to do multiplication, which the hardware does not supply) from the EEPROM, I have an EEPROM program which loads a program from EEPROM to memory.
      But in order to run that, I need to input a program which runs the loader which then overwrites the program used to run the loader....

    • @cigmorfil4101
      @cigmorfil4101 3 years ago +1

      @SimonBuchanNz
      When using a Babbage subroutine from C on a GEC 4000 series processor the parameters normally passed on a stack were accessed using the RY index register (without transferring any stack pointer to RY).
      I think the GEC 4000 processors had no stack, using a single location for each subroutine to hold a return address - trying recursion natively would overwrite the return address to never return...

  • @majormunky
    @majormunky 3 years ago +3

    Way back when all the browsers implemented different event handler functions, I remember reading about a JavaScript function that would check your current browser, figure out the way to implement the event handler, and re-write the function to only run the event handler that matches the current user's browser.
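
    The original JavaScript is not quoted, so this is only a sketch of the same pattern (a function that replaces itself after its first call), written in Python. `detect_api`, `add_handler`, and the returned strings are placeholders invented for the example; no real browser API is being called.

```python
def detect_api():
    # Hypothetical, "expensive" environment check; counts how often it runs.
    detect_api.calls += 1
    return "modern"


detect_api.calls = 0


def add_handler(name):
    # First call: pick an implementation, then rebind the global name so
    # later calls go straight to the chosen version and skip the check.
    global add_handler
    if detect_api() == "modern":
        def add_handler(name):
            return f"addEventListener({name})"
    else:
        def add_handler(name):
            return f"attachEvent(on{name})"
    return add_handler(name)


first = add_handler("click")
second = add_handler("keydown")
print(first, second, detect_api.calls)
```

    The environment check runs once; every later call hits the replacement directly. References to the function taken before the first call would still point at the original version.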

  • @banderfargoyl
    @banderfargoyl 3 years ago +82

    Saw a self modifying coder once. He had a piercing and a tat.

    • @totaltotalmonkey
      @totaltotalmonkey 3 years ago +2

      I had a friend who was a self modifying brain surgeon.

    • @jonathanrichards593
      @jonathanrichards593 3 years ago +7

      Ah, but did the tattoo say "N=19; while N>1 wait N days; apply Collatz operation to N; get another tattoo with this instruction appended; end while".

  • @alexeski4109
    @alexeski4109 3 years ago +5

    I appreciate this series so much. Thank you for addressing the criticisms of the last episode!

    • @maciejmanna9246
      @maciejmanna9246 3 years ago

      I did not know that was part of a series, but if you say that it improved, and for me that video was (in terms of quality of explaining concepts considered here) of very low quality, I think I'll pass.

    • @alexeski4109
      @alexeski4109 3 years ago

      Maciej Manna he references an episode about ‘real programmers’, which this video continues on some of the points made in that video and addresses some of the criticisms of the last video. As far as self modifying code, this is more related as to why it’s not used in commercial software. Essentially, it is near impossible to debug. However, there are some great examples of self modifying code online, if you’re interested.

  • @RupertBruce
    @RupertBruce 3 years ago +8

    I remember my excitement when I found out that Amiga Basic allowed a variable as the destination of a GOTO 🙂

  • @leonidas14775
    @leonidas14775 9 months ago +1

    The Forte programming language was designed to be self modifying. It's like BASIC, but you can redefine the values of integers and shuffle line numbers as the interpreter moves through your program.

  • @oisnowy5368
    @oisnowy5368 3 years ago +2

    I just used it to change absolute addresses in 6502 opcodes as an alternative to putting pointers in zero page and using indirect addressing. Saved quite a bunch of cycles in inner loops.

  • @MidnightSt
    @MidnightSt 3 years ago +3

    My secret dream is to write a self-modifying program where the original source code that has been written, upon running, writes a second-order source code in-memory, which, when executed, writes a third-order code, which when run, actually does what the program was meant to do.
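
    For a trivially small "program", the three-stage idea can be sketched as follows. The helper `load` and the stage names are invented for this illustration, and the final task (summing 1 to 10) merely stands in for "what the program was meant to do".

```python
def stage_one():
    # First-order code: writes the second-order source as text.
    return (
        "def stage_two():\n"
        "    # Second-order code: writes the third-order source.\n"
        "    return 'def work():\\n    return sum(range(1, 11))\\n'\n"
    )


def load(source, name):
    """Compile source text in memory and return the object called name."""
    namespace = {}
    exec(source, namespace)
    return namespace[name]


stage_two = load(stage_one(), "stage_two")  # run the first-order output
work = load(stage_two(), "work")            # run the second-order output
print(work())  # 55
```

    Only `stage_one` exists in the written source; the other two stages exist solely as strings until they are compiled in memory.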

    • @minoxs
      @minoxs 3 years ago

      I think I can do this... (for a really simple program)

  • @rkalle66
    @rkalle66 3 years ago +3

    My very first program was self-modifying. A mixture of BASIC and 6510 assembler on a Commodore C64. A function plotter where I coded the input of the function as part of the BASIC code ... changing from BASIC running mode into editor mode and back into running mode as part of the program. Some basic graphical routines (line plotting) were written in assembler for performance reasons. PEEK and POKE at its finest.

  • @nihonam
    @nihonam 3 years ago +14

    I think demoscene is all about this. Otherwise how can they fit all THAT stuff in 4K?

  • @mcfluffcakes
    @mcfluffcakes 3 years ago +29

    I never really understand what these videos are about from a technical perspective, but god damn I enjoy listening.

    • @PhyshStycc
      @PhyshStycc 3 years ago +6

      Meanwhile I'm over here (pursuing a bachelor's in computer science) getting PTSD from having to write assembly programs in my processor architecture class

    • @diemaco
      @diemaco 3 years ago +2

      I use them to fall asleep

    • @AllahDoesNotExist
      @AllahDoesNotExist 3 years ago +3

      @Stay EZ My Friends Come back when you hand solder your own processors.

    • @mylesskinner7331
      @mylesskinner7331 3 years ago

      @Stay EZ My Friends Pfft. The trick is to arrange the subatomic components of those materials so the process emerges organically from your starting conditions.

  • @julianperry4856
    @julianperry4856 3 years ago +2

    Microsoft/Commodore BASIC 2.0 used sorta self-modifying code in its interpreter. This was used on the VIC-20 and C64 computers.
    One of the main routines (CHRGET/CHRGOT) (get next/current byte of BASIC program) is dropped into Zero page (kinda high-speed short-addressable scratchpad memory for 6502's) at startup. The character address pointer is a 2-byte operand of an LDA (Load Accumulator) instruction. This 2-byte pointer is modified by the CHRGET subroutine itself every time it is called.
    I presume this is done to utilize ABSOLUTE mode access to the pointer (4 clock cycles, no index register), rather than tie up an index register (y) and an additional 1 or 2 cycles in execution. Its 24-byte subroutine is dropped into precious Zero page so that the zero-page mode of the INC (Increment) instruction can be used, saving another clock cycle. Altogether pretty dodgy.
    I have used self-modifying 6502 code a little when writing code that runs inside Commodore disk drives. With so little RAM to play with, and little control over where your program was loaded, the code had to partially rewrite itself to execute properly (unless the program was written as a programmatic lipogram, 6502 code is somewhat non-relocatable). On occasion these programs had to self-relocate and rewrite themselves on the fly to avoid being stomped on by the DOS itself. Fun times indeed.
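
    A much-simplified simulation of the self-incrementing operand described above, as a sketch only. The addresses are arbitrary choices for this example rather than the real ROM layout, and the real routine's extra work (such as skipping spaces and setting flags) is left out. `0xAD` is the 6502 opcode for LDA absolute.

```python
LDA_ABS = 0xAD      # 6502 opcode: LDA absolute
LDA_ADDR = 0x10     # where the LDA instruction lives (arbitrary for this sketch)
TEXT_ADDR = 0x200   # where the program text lives (arbitrary for this sketch)

ram = bytearray(0x400)
text = b"PRINT"
ram[TEXT_ADDR:TEXT_ADDR + len(text)] = text

# Assemble "LDA TEXT_ADDR-1": opcode, then the operand in little-endian order.
start = TEXT_ADDR - 1
ram[LDA_ADDR:LDA_ADDR + 3] = bytes([LDA_ABS, start & 0xFF, start >> 8])


def chrget():
    # INC the operand's low byte; if it wraps to zero, INC the high byte.
    ram[LDA_ADDR + 1] = (ram[LDA_ADDR + 1] + 1) & 0xFF
    if ram[LDA_ADDR + 1] == 0:
        ram[LDA_ADDR + 2] = (ram[LDA_ADDR + 2] + 1) & 0xFF
    # "Execute" the modified LDA: read the byte its operand now points at.
    operand = ram[LDA_ADDR + 1] | (ram[LDA_ADDR + 2] << 8)
    return ram[operand]


fetched = bytes(chrget() for _ in range(len(text)))
print(fetched)  # b'PRINT'
```

    The pointer has no separate variable: it is the operand bytes of the instruction itself, so advancing the pointer means rewriting the code.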

    • @0LoneTech
      @0LoneTech 3 years ago

      This reminds me of how the MIT Handy Board relocated code. Since it reused the external RAM bus for the LCD, the display routine was copied into internal RAM and executed from there.

    • @cigmorfil4101
      @cigmorfil4101 3 years ago

      CHRGOT also set the carry flag to indicate if the character retrieved was not a number, or was.
      By having this routine in RAM it was possible to overwrite it (or the first 6 bytes) so that you could intercept every byte of a BASIC program and interpret new commands before the BASIC interpreter tried and failed (usually with a syntax error).

  • @maxusboostus
    @maxusboostus 3 years ago +5

    I had a project at college that required variables to store a count for refreshing an LED display.
    I couldn't get the assembler compiler to recognise variable names, so I modified it to
    increment the memory address directly after the command so the next loop the number was higher.
    Job done. Luckily the code was copied to RAM before it ran so it could be modified every loop.
    I thought I had been clever to get around the compiler problem, but was marked down for ingenuity!
    I could understand for a massive project but for a single one off device I thought it was a little mean.

    • @uuu12343
      @uuu12343 3 years ago +2

      Ingenuity? They should be fired for marking someone down for using alternatives to make things work

  • @aaronj1084
    @aaronj1084 3 years ago

    Love the video, keep up with the amazing content man and Stay Home and Stay Safe. Will be looking forward to more amazing content

  • @sharpfang
    @sharpfang 3 years ago +2

    In Minecraft, the entire domain of Slimestone is all self-modifying code. Redstone is in essence a graphical very low-level programming language, in the same vein as PLC ladder languages or VHDL. Blocks, items, redstone lines etc. constitute instructions. Then there are pistons that can move blocks, and slime that can bundle a bunch of blocks together for moving. And in the end you can build contraptions that modify themselves, say, moving infinitely, traversing a specified path, or even traversing terrain and retrieving it block by block and sending them to storage.

    • @jonathanrichards593
      @jonathanrichards593 3 years ago

      In which sense a cellular automaton like Conway's Life is a self-modifying program, in which the instructions are written into the fabric of its universe. We may also live in just such a universe...

    • @skyeturner5003
      @skyeturner5003 3 years ago

      Being a slimestoner, I would actually disagree in saying that it is self-modifying, I would say only 0.1% of machines actually are

    • @sharpfang
      @sharpfang 3 years ago

      @skyeturner5003 If you represent the program as it is represented physically, region files containing chunks, where every block has xyz position, block state, maybe NBT, and suddenly the data representing your contraption sits in a different .mca file I'd argue it modified itself. But even if you disregard this, take the XYZ-addressable missiles. Machines of different speed launched at different time, upon meeting up a segment splits up flying in a different direction, only to be encountered by yet another machine, combining into a segment that launches along third axis, hitting the exact target - this is very much programs modifying each other.

    • @skyeturner5003
      @skyeturner5003 Před 3 lety

      @@sharpfang my point is that while their positioning may change they still retain their original structure and relative position, the second example you gave is definitely self modifying, but I'm saying that is probably a tiny minority of all slimestone

  • @mennoknol8693
    @mennoknol8693 Před 3 lety +5

    Great video! I did miss a bit of the 'fun' and 'playfulness' element of writing self-modifying code. I agree that such code should never be used in a production environment. However, it can be great fun (and educational) to play around with code purely for research and experimentation. I'd say the same goes for obfuscated code and coding in weird languages like brainf#ck. Or even assembly.
    Playing around with uselessly complex code in a sandbox environment can be good training for dealing with/getting your head around *actual* complex production code. But it should be kept where it belongs: in the sandbox.

  • @TheSMasa
    @TheSMasa Před 3 lety +3

    The idea of recipes for breakfast, dinner, lunch, etc. is evolving now... After seeing this I only need one recipe for each of them O.o

    • @sergey1519
      @sergey1519 Před 3 lety

      And you can even balance your ration by adding coin flips or dice rolls to the recipe to choose the meal.

  • @mentatphilosopher
    @mentatphilosopher Před 3 lety +13

    Let’s all program in Malbolge now.

  • @Amonimus
    @Amonimus Před 3 lety +16

    The recipe is more an example of an if/else case; it's not modified on the run.

    • @Amonimus
      @Amonimus Před 3 lety +2

      I don't see a problem with the terminology; it's just that the example doesn't match the description. The macros, recursion, jumps, or async threads that actually modify an executable's behavior would be worth mentioning.

    • @0LoneTech
      @0LoneTech Před 3 lety +1

      It is self modifying code in that there's no instruction to copy the list of ingredients to working memory; the instruction on the recipe actually says to alter the recipe (delete baking powder from ingredients, not put the baking powder away). You're just applying too much common sense to the act of reading it. It still could have been more explicit, e.g. "change step 4" to write in code memory rather than data memory. This level of misunderstanding is not uncommon; it's how OpenSCAD mixed up the concepts of functional programming and parametric design to end up with a weird assignment sequence effect thing.

    • @piotrarturklos
      @piotrarturklos Před 3 lety

      ​@@0LoneTech You could interpret the example your way ("ingredients is a section of the recipe") if you really wanted to, but the video doesn't say what "ingredients" are, so they can be also interpreted as variables, a dictionary, struct fields or any other dynamic state. Hence, the example is bad.
      Also, "working memory" is not clearly defined in the world of cake recipes. :).

    • @0LoneTech
      @0LoneTech Před 3 lety

      @@piotrarturklos We agree that the example is bad, particularly because it demonstrates at least two other very bad obfuscations (mixing units and uses, and spreading out conditionals with no reason or demarcation) and doesn't give a decent sense of the supposed topic.

  • @cbmeeks
    @cbmeeks Před 3 lety +7

    I literally just started working on a Commodore 64 project (yes, Commodore 64) that uses self-modifying code. It's not that difficult to know what's happening if you use proper modules, comments, etc. Self-modifying code certainly can have a place in a constrained computer like the C64.

    • @williamdrum9899
      @williamdrum9899 Před rokem

      I'll often set aside a few null bytes that are only intended to be used in a specific function as temp variables.

  • @AJMansfield1
    @AJMansfield1 Před 3 lety +4

    One type of self-modification I've encountered is when setting up super-fast interrupt handling stuff, where you set up the interrupts ahead of time by pre-determining the sequence of register writes and then writing literal instructions into the interrupt vector, paying all the indirection penalties up front so you don't need to dereference any memory or do anything else slow during the time-critical section. This is a very structured kind of self-modification though; it only gets weird when you start combining that with lookup table shenanigans or re-entrancy.

    • @0LoneTech
      @0LoneTech Před 3 lety +1

      A little hard to follow there, but it reminds me of inlining ISRs on those architectures where the interrupt vector table is code, not pointers.

    • @AJMansfield1
      @AJMansfield1 Před 3 lety

      @@0LoneTech Yeah, for inline ISRs especially, if you can make the _very_ first thing the interrupt does be "tweak this output bit" and only then jump into some more elaborate service routine to set up the next ISR.

    • @williamdrum9899
      @williamdrum9899 Před rokem

      I've gone as far as copying interrupt service routines to RAM just to avoid using CALL statements. On Game Boy a single CALL takes 20 cycles. Ouch

  • @TheSalakiller
    @TheSalakiller Před 3 lety

    I feel like the teacher I had on my first year while doing computer science should have seen this video, having your first final about self modifying code is no joke *What NOT to do*

  • @markwhatever256
    @markwhatever256 Před 3 lety +23

    Self-modifying code was essential with older CPUs like the Z80. E.g., you can't use an index register with a dynamic offset, so it's easier to save processing by modifying the instruction in memory so that the behaviour of a loop changes. Same with the Z80 not having div/mul, and effective use of lookup tables.
    Of course, nowadays it shouldn't be used, as mentioned, due to caches, branch prediction, etc.

    • @nuk1964
      @nuk1964 Před 3 lety +3

      I do recall that on some systems there wasn't really a set of CALL/RET instructions nor a machine stack. A subroutine call was implemented using an instruction that stores the address of the following instruction into a register (i.e. the machine register is assigned the value of the return address) and then branches to the subroutine address. One of the first instructions of the subroutine would take the return address and modify the final instruction (which was a branch to an address). Obviously, with such an arrangement you couldn't run recursive code.
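The call convention described above can be sketched on a toy machine. This is only an illustration under stated assumptions: the opcode names (`BAL`, `PATCH`, `JMP`, `PRINT`, `HALT`) and the memory layout are invented here and do not correspond to any particular real instruction set.

```python
# Toy machine with no stack: memory cells are [opcode, operand] lists.
# All opcode names here are invented for illustration.

def run(memory, max_steps=1000):
    """Run from cell 0 until HALT; return the list of printed strings."""
    output = []
    link = None   # single register that receives the return address
    pc = 0
    for _ in range(max_steps):
        op, arg = memory[pc]
        if op == "HALT":
            return output
        if op == "PRINT":
            output.append(arg)
            pc += 1
        elif op == "BAL":      # branch and link: remember the return address, then jump
            link = pc + 1
            pc = arg
        elif op == "PATCH":    # store the link register into the operand of cell `arg`
            memory[arg][1] = link
            pc += 1
        elif op == "JMP":
            pc = arg
        else:
            raise ValueError(f"unknown opcode {op!r}")
    raise RuntimeError("step limit reached")


program = [
    ["BAL", 5],                          # 0: call the subroutine
    ["PRINT", "back from first call"],   # 1
    ["BAL", 5],                          # 2: call it again
    ["PRINT", "back from second call"],  # 3
    ["HALT", None],                      # 4
    ["PATCH", 7],                        # 5: subroutine entry rewrites its own exit jump
    ["PRINT", "in subroutine"],          # 6
    ["JMP", None],                       # 7: exit jump, operand filled in at run time
]

result = run(program)
```

Because there is only one patched exit jump per subroutine, a nested call to the same subroutine would overwrite the earlier return address, which is why recursion does not work without a hand-built stack.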

    • @nuk1964
      @nuk1964 Před 3 lety +3

      I also remember writing some self-modifying interpretive BASIC code, where it would modify instruction tokens (e.g. PRINT for LPRINT) so that I could use the same subroutine for output to console and output to printer -- just scan through program memory and modify the PRINTs to LPRINT (or LPRINT back to PRINT) just before GOSUB to that subroutine.

    • @0LoneTech
      @0LoneTech Před 3 lety

      @@nuk1964 Technically you could run recursive code, but only by painstakingly implementing a stack in your recursive routines instead of the usual deposit return address. It's actually worse on some architectures that do have a hardware stack, because they're not always accessible by other means than call and return.

    • @rdoetjes
      @rdoetjes Před 3 lety +1

      I used it a lot in 6502 and on the Amiga to update the copper list.

    • @nuk1964
      @nuk1964 Před 3 lety +1

      @@0LoneTech Yep, that's what I saw with code generated by various compilers that supported recursion (e.g. Pascal) on some systems (e.g. Control Data Cyber 750) -- the library code included stack-management routines. The code generated by FORTRAN on those systems generally tended NOT to support recursion (where it was typical to "unroll" recursive algorithms to avoid this issue).
      I'd suspect it was common to implement a software stack on microcomputers based on MOS Technology 6502 -- the hardware stack was limited to 256 bytes.

  • @stellabckw2033
    @stellabckw2033 Před 3 lety +3

    i remember asking about this in the 3rd year of high school, and my teacher challenged me to write a simple program in assembly that did so. i did it, and it worked. but it took so long for me to explain what i did, because i didn't remember what i had written the week before. lol

  • @entyropy3262
    @entyropy3262 Před 3 lety +3

    Life is self-optimizing code, since life comes with the genetic (and epigenetic) code individual to each lifeform.

  • @sfperalta
    @sfperalta Před 3 lety +2

    The only practical reason I could think of for using this technique is for embedded systems where a patch may be required to fix a particularly egregious bug. Physically remote systems (such as unmanned spacecraft) may use this technique to correct the behavior of the mission and a facility is usually designed into the software to allow for such patching while the mission is underway.

  • @zirconia3
    @zirconia3 Před 3 lety +1

    I used to do some SMC on my 6809-based Tandy Color Computer 3, writing blitting routines to replace the slow and clunky BASIC HGET/HPUT commands. I have a pixel I need to save for later, but pushing to and pulling from the stack is expensive, so I may as well save that data straight into the future instruction that needs it.

  • @iabervon
    @iabervon Před 3 lety +1

    There was an early computer architecture with no instructions for math operations on two registers. If you wanted to add two variables, you would store one of them into the constant in the add instruction. There have also been computers without either "return" or "jump to value of register", and people would get back from subroutines by modifying the jump instruction at the end of the subroutine just before jumping to the start of the subroutine.

    • @0LoneTech
      @0LoneTech Před 3 lety

      Still exist, as a matter of perspective. Look up TTA (transport triggered architecture). TCE is a current parametric implementation.

    • @iabervon
      @iabervon Před 3 lety

      @@0LoneTech There are still architectures that don't use math operations on registers, but nobody these days has only "constants" in the program as values, which then require self-modifying code to work with.

    • @0LoneTech
      @0LoneTech Před 3 lety

      @@iabervon I see. You mean modifying an immediate operand, like the PDP-1 DAP instruction (although that was mostly indirect, not immediate, hence Address Part) did, but in order to parameterize instructions which require immediate operands. Pardon my sloppy reading. Funny how the more direct instruction set would require more indirections in the actual program.

    • @iabervon
      @iabervon Před 3 lety

      @@0LoneTech Yup. You'd have something like:
      LOAD left
      ADD (opcode for SUB instruction in the correct part of the word)
      STORE label
      LOAD right
      label:
      NOP
      IIRC, the opcode for ADD was 0, so you could just write your accumulator into your program if you wanted to add your accumulator now to your accumulator when you reached that location. I think the clock cycle was such that there wasn't time to both access memory for a value and do an addition, so you had to make do with reading and writing the accumulator and looking at the instruction you're executing.
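The trick of writing the accumulator straight into the program can be simulated in a few lines. This is a hedged sketch, not a model of the machine being recalled: the decimal word encoding (`opcode * 1000 + operand`), the opcode numbers, and the memory layout are all assumptions chosen so that opcode 0 means "add immediate".

```python
# Hypothetical accumulator machine. Each memory word is opcode * 1000 + operand.
# Because ADD-immediate is opcode 0, a plain number n (0..999) stored into the
# program is itself the instruction "ADD n".
ADD, LOAD, STORE, HALT = 0, 1, 2, 9

def run(mem):
    """Execute from address 0 until HALT; returns the (modified) memory."""
    acc, pc = 0, 0
    while True:
        opcode, operand = divmod(mem[pc], 1000)
        pc += 1
        if opcode == ADD:        # acc += constant embedded in the instruction
            acc += operand
        elif opcode == LOAD:     # acc = mem[operand]
            acc = mem[operand]
        elif opcode == STORE:    # mem[operand] = acc (may overwrite code)
            mem[operand] = acc
        elif opcode == HALT:
            return mem
        else:
            raise ValueError(f"unknown opcode {opcode}")

memory = [
    1006,  # 0: LOAD 6   acc = left
    2003,  # 1: STORE 3  overwrite the instruction at address 3 with acc
    1007,  # 2: LOAD 7   acc = right
    0,     # 3: placeholder, becomes "ADD left" at run time
    2008,  # 4: STORE 8  result = acc
    9000,  # 5: HALT
    12,    # 6: left
    30,    # 7: right
    0,     # 8: result
]

run(memory)
```

After the run, address 3 holds the word `12`, which the machine decodes as "ADD 12", and address 8 holds the sum. The sketch only works for operands in the range 0 to 999, a limit of the invented encoding.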

  • @prateekpravanjan2905
    @prateekpravanjan2905 Před 3 lety +1

    I have been wanting to see a video of this topic,thanks a lot

  • @TerrAkon3000
    @TerrAkon3000 Před 3 lety +16

    I got 3 ad breaks throughout the video, each time interrupting in the middle of a sentence. That is freaking obnoxious!!!

    • @amciaapple1654
      @amciaapple1654 Před 3 lety +1

      Use the Brave browser instead of the shite you are using. You will not get commercials inside YT videos then.

    • @nem81
      @nem81 Před 3 lety +4

      Install uBlock Origin or similar

    • @0LoneTech
      @0LoneTech Před 3 lety +1

      @@nem81 and "enhancer for youtube", if I recall correctly.

  • @guiorgy
    @guiorgy Před 3 lety +2

    A valid case for self-modifying code is optimization! Suppose your code is very low level, directly utilizing CPU instructions wherever possible for optimal performance. This means that you'll have to carry extra code for the cases where a CPU does or does not support an instruction set, which needlessly uses space. On top of that, the code would have to check every time it wants to use an instruction set whether the CPU supports it, which negates some of the benefit of doing so in the first place! The solution? Simple: have your code do the checks on the first run, and modify the code such that any unsupported instructions, extra code and checks are removed, including the code that does the modification! The result is code that is super optimized for your platform. The drawback is that if you swap your CPU, the code might no longer work and will need a reinstall!
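A high-level analogue of "check once, then remove the check" can be sketched in Python by overwriting a function slot after the first call. This does not patch machine code; it only mirrors the idea. The names `PatchableFunction`, `cpu_has_fast_path`, `sum_squares_fast` and `sum_squares_generic` are hypothetical, and the probe is a stand-in for a real CPU feature query.

```python
class PatchableFunction:
    """Callable whose implementation slot is overwritten after the first call."""

    def __init__(self, probe, fast, generic):
        self._probe, self._fast, self._generic = probe, fast, generic
        self.probe_calls = 0
        self.impl = self._first_call   # slot initially points at the resolver

    def _first_call(self, values):
        # Runs at most once: pick an implementation, then replace ourselves.
        self.probe_calls += 1
        self.impl = self._fast if self._probe() else self._generic
        return self.impl(values)

    def __call__(self, values):
        return self.impl(values)


def sum_squares_generic(values):
    total = 0
    for v in values:
        total += v * v
    return total


def sum_squares_fast(values):
    return sum(v * v for v in values)


def cpu_has_fast_path():
    # Hypothetical stand-in for a real CPU feature query.
    return True


sum_squares = PatchableFunction(cpu_has_fast_path, sum_squares_fast, sum_squares_generic)
first = sum_squares([1, 2, 3])   # triggers the probe and the patch
second = sum_squares([4, 5])     # goes straight to the chosen implementation
```

Unlike the scheme described above, nothing is written back to disk here, so the choice is redone on every program start and a CPU swap is harmless.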
    PS. I've never done this myself, but I know some libraries do that for best performance!

  • @abuzarov
    @abuzarov Před 3 lety +4

    Core wars game is another reason you might want to write self-modifying code

  • @mikezmit340
    @mikezmit340 Před 3 lety +1

    First: I am not a programmer, so when I resort to coding to solve a problem I just muck about until I get something that does the job (satisfactorily enough).
    Once, way back, I thought up a way of doing some "advanced" recursion that needed to change its action depending on the depth of the run. I came up with a (theoretical) solution, using an interpreted language like Perl or Python and having the program append code to its source file and then call the file itself again to include the new code.
    I aired my brilliant idea on a forum for programmers, where I was told it was an interesting, but awkward way to go about my problem, and it would most likely not work.
    When asked in the forum what I was trying to achieve, I gladly explained and after 20 minutes or so I got a reply from some helpful fellow containing a single line (some 160 characters) of seemingly random symbols to be run by the perl interpreter with the data file as a parameter and it would then modify the data file as I had described. It worked flawlessly on all input I ever fed it, and I have never had any clue to how it worked. This led me to believe, that when the ultimate truth to the universe is unveiled it will be a perl one-liner.

  • @Longuncattr
    @Longuncattr Před 3 lety

    I've only needed to use self-modifying code once, to achieve an animated color effect applied to a wide 48-pixel bitmap on the Atari 2600; the code replaces a zero-page memory load with a runtime-updated immediate load. Each loop of the code needs to run within one scanline of time, and the code as it is runs in *EXACTLY* one scanline per loop; it only just works!

  • @aaron552au
    @aaron552au Před 3 lety +6

    Isn't JIT compilation also an example of self-modifying code?
    My (basic) understanding of JIT compilation is that not-yet-compiled functions start out as a jump to "compile this code fragment", which then gets replaced with the actual compiled function once the JIT compile is complete.

    • @AySz88
      @AySz88 Před 3 lety +1

      Depends on implementation - like if the jump is using a function pointer in data, you can put code in newly allocated memory (and mark it as executable) and change the pointer to the newly written section. It would be more self-modifying if you always jump to the same place in memory, and write to a preallocated space... (Though I notice that the function would return back to the next instruction after the line where you called the JIT, so you'd have to be careful with what instructions are there, or some trick to pop or rewrite the stack, or the like.)
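The "function pointer in data" variant mentioned here can be sketched with a dispatch table whose entries start out as compile-on-first-call stubs. This is an illustration only: the table layout, `make_stub`, and the `SOURCES` dictionary are assumptions made for the sketch, and Python's built-in `compile`/`exec` stand in for a real JIT back end.

```python
# Source text waiting to be compiled, keyed by function name (hypothetical data).
SOURCES = {"square": "def square(x):\n    return x * x\n"}

compile_count = {"square": 0}   # how often each function was actually compiled
table = {}                      # the "function pointers" live here, in data

def make_stub(name):
    """Return a stub that compiles `name` on first use and repoints the table."""
    def stub(*args):
        code = compile(SOURCES[name], f"<jit:{name}>", "exec")
        namespace = {}
        exec(code, namespace)
        compile_count[name] += 1
        table[name] = namespace[name]   # replace the pointer, not the caller
        return table[name](*args)
    return stub

table["square"] = make_stub("square")

first = table["square"](7)    # goes through the stub, which compiles and repoints
second = table["square"](8)   # calls the compiled function directly
```

Only the table entry changes, so no existing code is overwritten; that is the sense in which this variant is less self-modifying than patching a fixed code location.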

    • @ricosrealm
      @ricosrealm Před 3 lety

      Technically yes, but those types of programs run in some sandboxed run-time environment to support a feature where the process for self-modification is simply a (re)compilation step before the next iteration and is also very generalized for any program. So it isn't in the same bad-practice category as a program that is trying to modify steps in its own process in an ad-hoc manner, since the 'compiler' can be run independently of a specific 'program' that is loaded.

  • @TheGreatAtario
    @TheGreatAtario Před 3 lety +1

    Self-modifying code doesn't have to be a nightmare. One of my earliest programs when I was a kid was a BASIC program to draw the graph of a function, which you would type in response to a prompt. It would then lay the formula you typed into a specific line number of its own code, which was surrounded by the support code to make it do the thing. Essentially, I was just avoiding writing my own expression parser by leveraging the one built in to BASIC. Interpreted languages for the win.
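The same idea, reusing the language's own expression parser for a typed-in formula, can be sketched in Python with an allow-list check before anything is evaluated. This is a minimal sketch under stated assumptions, not a hardened sandbox: the helper name `compile_formula` and the sets of permitted syntax and names are choices made for this example.

```python
import ast
import math

# Syntax the formula may contain; anything else is rejected before evaluation.
ALLOWED_NODES = (
    ast.Expression, ast.BinOp, ast.UnaryOp, ast.Call, ast.Name, ast.Load,
    ast.Constant, ast.Add, ast.Sub, ast.Mult, ast.Div, ast.USub, ast.UAdd,
)
ALLOWED_FUNCS = {"sin": math.sin, "cos": math.cos, "sqrt": math.sqrt}

def compile_formula(text):
    """Turn a formula in x, such as 'x * x + 1', into a Python callable."""
    tree = ast.parse(text, mode="eval")
    for node in ast.walk(tree):
        if not isinstance(node, ALLOWED_NODES):
            raise ValueError(f"disallowed syntax: {type(node).__name__}")
        if isinstance(node, ast.Constant) and not isinstance(node.value, (int, float)):
            raise ValueError("only numeric constants are allowed")
        if isinstance(node, ast.Name) and node.id != "x" and node.id not in ALLOWED_FUNCS:
            raise ValueError(f"unknown name: {node.id}")
        if isinstance(node, ast.Call) and not isinstance(node.func, ast.Name):
            raise ValueError("only direct calls to allowed functions are permitted")
    code = compile(tree, "<formula>", "eval")
    env = {"__builtins__": {}, **ALLOWED_FUNCS}
    return lambda x: eval(code, env, {"x": x})

f = compile_formula("x * x + 2 * x + 1")
points = [(x, f(x)) for x in range(-2, 3)]   # values a plotting loop could draw
```

The allow-list is what the original BASIC trick lacked: without it, whatever the user types runs with the full power of the interpreter.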

    • @danmerillat
      @danmerillat Před 3 lety +1

      BASIC injection vulnerability decades before '; drop table users; -- was even invented.

    • @TheGreatAtario
      @TheGreatAtario Před 3 lety +1

      @@danmerillat Indeed! Although on that platform you were pretty intrinsically limited in the damage you could do, so ¯\_(ツ)_/¯

  • @fletcherreder6091
    @fletcherreder6091 Před 3 lety +9

    That's the thing about these 'real programmer' stories, they all take place in an era of highly constrained machines, hex dumps, and 'teams' as small as one person who *stay there their whole careers*. Some of these techniques are still useful where highly constrained systems still exist, microcontrollers and the like, but even then it usually isn't worth it, especially for network connected stuff where the lack of maintainability can turn into a Problem.

    • @BertGrink
      @BertGrink Před 3 lety +2

      Some microcontrollers, such as the Intel 8051 family and its derivatives, are unable to run self-modifying code because they use the Harvard architecture: completely separate memory spaces for code and for data, and the CPU core is physically unable to alter anything in the code space.

    • @fletcherreder6091
      @fletcherreder6091 Před 3 lety +1

      @@BertGrink Also PIC and AVR, which together with 8051 account for most of the stuff you're likely to run into. I wasn't talking about self-modifying code so much as the full suite of 'real programmer' techniques: bit twiddling, per-cycle behavior dependence, intentional race conditions, all that jazz. I felt that the comment was still apropos since this is part of a series discussing these techniques, and I apologize for the misunderstanding I've caused.

  • @randalllasini8772
    @randalllasini8772 Před 3 lety

    Many decades ago (mid 80’s), on my 2nd computer, a C128D, I made my own audio sample card for its single expansion slot. I was in high school and didn’t have access to PCBs or etching, so I literally used really thick cardboard, poked the resistors, caps and IC (ADC chip) through it and soldered it all together with wire underneath. Then I soldered it to an expansion slot female plug (I wasn’t allowed to solder it directly to the C128 as it was the family's computer).
    Anyway, I then had to write the code to sample at a fixed interval (via an NMI tied to TV refresh line 200) that would read in the 8-bit sample, convert it to 4-bit (the Commodore had a 4-bit volume register), then store it in memory.
    Then I wrote the program to play it back via the on-board sound chip at the same rate.
    And to make the most of the memory I stored two 4-bit samples in a single byte and then unpacked them on playback (using a LUT in memory as it was faster).
    And then someone asked me to play what we sampled backwards.....
    So to save memory I used self-modifying code: depending on whether it was sampling forwards or backwards, the INC command was changed to DEC and the 4-bit packer changed the packing order as well, and vice versa. Similarly, depending on whether it was reading from the expansion slot or writing to the on-board sound, those sections of code would be changed on the fly.
    Ahh fun times.
    About a year later I finally got a c compiler for the c64 to play with and learn other bad habits. But that’s another story...

  • @TheSulross
    @TheSulross Před 3 lety +1

    Modern OS like to assume they can always go back to the executable file if a page of code gets paged out but later needs to reload it for execution again. If the text segment (code) pages are set to not be writable, then code pages never have to be written to the swap file when paged out. Makes things tidy for the OS to deal with. Just memory map the text segment from the executable file image.

  • @amciaapple1654
    @amciaapple1654 Před 3 lety +2

    MS-Windows is using self-modifying code to generate GDI raster ops (ROPs) on the fly. The reason is significant speedup.

    • @az09letters92
      @az09letters92 Před 3 lety +1

      Was using a long time ago. Think Windows 3.11 times in early nineties. Perhaps Windows 95/98/ME. After that, I'd guess not.

  • @lotationx8987
    @lotationx8987 Před 3 lety +1

    It'd be awesome to have your videos subtitled

  • @TheJaguar1983
    @TheJaguar1983 Před 3 lety +4

    Another use case: Redcode in Core War. In fact, it's an important part of the game.

    • @fllthdcrb
      @fllthdcrb Před 3 lety

      Well, it is a game, with things running on a deliberately insecure platform. The whole point is to attack other programs, either killing them-by modifying their code-or surviving their modifying attacks. Not really a normal programming environment. 😆

  • @lukefairbanks8622
    @lukefairbanks8622 Před 3 lety

    makes me think about having a feedback mechanism where machine asks moderator such as person for approval of big changes, with maybe a gui interface or something. Or having different roles for program nodes to manage changes & guide towards a goal

  • @andrewharrison8436
    @andrewharrison8436 Před 3 lety

    I have written code generating programs, that was mind stretching. Code modifying - only seen it once - it was unintentional - it converted the code space (including pointers) to upper case, beyond my ability to work out what had gone wrong.
    I believe COBOL had a "feature" that allowed some form of code modification.

  • @rdoetjes
    @rdoetjes Před 3 lety

    Self modifying code is a very normal procedure.
    I used it a lot on the C64/6502 and 8086. I mainly kept to changing an offset.

  • @Phlip45
    @Phlip45 Před 3 lety

    One of my favorite cards in Netrunner

  • @jeroenleiden
    @jeroenleiden Před 3 lety

    The boot process doesn't use self-modifying code. The boot sector is loaded to a specific address and then executed. That code can copy itself to another location and then jump there. From the new location it can load a next stage at the first location, overwriting its own code, which is no longer being executed at that location. After the loading it will jump to that next stage.
    The decrypting (or decompression) of code also isn't self-modifying code. The decrypting/decompression code knows nothing of the other code and just sees it as a big blob of data to decompress, then jumps to the entry point when decompression is done.
    When a virus does some encryption (actually it was more obfuscation) it can be an XOR or something similarly simple with a changing parameter. That parameter can be some byte(s) at a fixed location, but that location can be inside an opcode to save a little space. When the location is inside an opcode it can be seen as self-modifying code, but the modified code won't be executed until the program is executed again.
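The "big blob of data" view of a decoder can be sketched with a single-byte XOR. This is a deliberately harmless illustration under stated assumptions: the key value, the payload (one line of Python source), and the helper names `xor_bytes` and `load_and_run` are invented for the example.

```python
def xor_bytes(data, key):
    """XOR every byte with a one-byte key; applying it twice restores the input."""
    return bytes(b ^ key for b in data)

KEY = 0x5A
plain_source = b"answer = 6 * 7\n"
blob = xor_bytes(plain_source, KEY)   # the obfuscated form that would be stored

def load_and_run(blob, key):
    # The loader knows nothing about the payload: it decodes bytes, then hands
    # control to whatever came out. It never modifies its own instructions.
    source = xor_bytes(blob, key).decode("ascii")
    namespace = {}
    exec(compile(source, "<decoded>", "exec"), namespace)
    return namespace

result = load_and_run(blob, KEY)
```

In this sketch the key lives in an ordinary variable. The borderline case described above arises only when the key byte is stored inside one of the loader's own instructions.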

  • @opium32
    @opium32 Před 2 lety

    I just started writing a blackjack game in C64 assembler. The picture cards are three blocks of 7*10 characters. The best way I could work out was to use a two-dimensional array of pointers... The C64 is an 8-bit system with a 16-bit address space, so the pointer array is a list of three low bytes, then a list of three high bytes. So with one index 0 to 3, I can read the low byte, offset the index, and read the corresponding high byte for where the card data is stored.... Then I modify the code that reads the characters to change an LDA command and use my low byte and high byte values of the address for the actual data. When you combine this with labels it's pretty easy to read and understand. If I was going to do this without self-mod code.... I don't even know where to start really. There's also self-mod code for reading the colour info, and for modifying the screen memory and colour memory locations to write the data to... Once you've changed the code it's easy to iterate to draw the card. Without modified code, there is no way to indirectly specify a 16-bit address using two other 8-bit addresses.... like with STA you can specify a 16-bit address to store A, e.g. STA $3000, but you can't say "store A in the address made up of the values in address $4000 and $4001", which store $30 and $00 respectively.... I think you would have to have three blocks of code for each card, and you'd probably have to iterate over the entire screen of 40*25 characters and have some pretty complex maths to identify when you should draw a card or do nothing...

  • @JerrodVolzka
    @JerrodVolzka Před 3 lety

    I wrote a utility that runs a couple of queries against a database and generates classes based on the results. MalwareBytes hits on this application and Malicious AI. I was so proud :)

  • @NickTaylorRickPowers
    @NickTaylorRickPowers Před 3 lety +10

    The recipe really makes it obvious how complex and undesirable self modified cake would taste

  • @cmd2tuts
    @cmd2tuts Před 3 lety +3

    Well put me on a list, I'm only writing self modifying code from now on.

  • @TheMacfruit
    @TheMacfruit Před 3 lety +2

    My company used a fair bit of self-modifying code. Nothing related to modifying the binaries, but there's a lot of string values that are then dereferenced to class names. This was definitely a mistake.

  • @Flankymanga
    @Flankymanga Před 3 lety

    Boy now i have finally understood the bootstrapping FINALLY!!!

  • @kevincozens6837
    @kevincozens6837 Před 3 lety

    I have also seen self-modifying code be used in the early days of 8-bit computers for program copy protection.

  • @Ohmriginal722
    @Ohmriginal722 Před 3 lety

    What if you make a program which modifies its code in a new file and then compiles the new code and runs it to replace itself? Then you have a separate folder for all the versions the program creates, separate from the starting program.

  • @PerryCodes
    @PerryCodes Před 3 lety +1

    His bookshelf looks like mine... including The Art of Programming books that I've tried starting about 35 times. But Code Complete and Advanced Windows... those have gotten some major mileage! Even my Perl Cookbook would have been used quite a bit back in the day... not so much now :(

  • @RickeyBowers
    @RickeyBowers Před 3 lety +6

    Sometimes it's desirable to have a program that is very difficult to read.

  • @Omnifarious0
    @Omnifarious0 Před 3 lety

    As I understand it, at one point in time indexed addressing modes didn't exist. You could laboriously do all the adding, or you could just modify the instruction so it has a next address. :-)

  • @hlavaatch
    @hlavaatch Před 3 lety +25

    Self modifying code is only bad on modern hardware, with deeply pipelined architecture and caches. It was perfectly OK on the ol' 6502

    • @EgonOlsen71
      @EgonOlsen71 Před 3 lety

      Exactly!

    • @hbp2m
      @hbp2m Před 3 lety +4

      Or on an ARM2. I've done it a few times, like when I had to divide millions of 16 bits samples by the same value. No division instruction, so the fastest way was to generate a short piece of code that depended on the value.
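One common way to specialise code for a divisor known only at run time is to replace the division with a multiply and a shift, and to generate the routine once the divisor is known. The sketch below is an assumption about the general technique, not a reconstruction of the ARM code described above; `make_divider` and the generated source text are invented for illustration.

```python
def make_divider(d, bits=16):
    """Generate a function computing x // d for 0 <= x < 2**bits without dividing.

    Uses the reciprocal trick: with shift = bits + ceil(log2(d)) and
    magic = ceil(2**shift / d), (x * magic) >> shift equals x // d.
    """
    if d <= 0:
        raise ValueError("divisor must be positive")
    shift = bits + (d - 1).bit_length()   # bits + ceil(log2(d))
    magic = -(-(1 << shift) // d)         # ceiling division
    source = f"def divide(x):\n    return (x * {magic}) >> {shift}\n"
    namespace = {}
    exec(source, namespace)               # "assemble" the specialised routine
    return namespace["divide"], source

divide_by_10, generated = make_divider(10)
samples = [0, 9, 10, 12345, 65535]
scaled = [divide_by_10(s) for s in samples]
```

Printing `generated` shows the constants baked into the routine, the Python counterpart of instructions whose immediates depend on the runtime value.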

    • @simpletongeek
      @simpletongeek Před 3 lety

      @@hbp2m Something like
      dv=1 / N
      Samples = Samples times dv, for all samples?

    • @hbp2m
      @hbp2m Před 3 lety +2

      @@simpletongeek It was a realtime lowpass/hipass filter in an audio application, probably on a 35 MHz ARM3, not an ARM2. The loop was rewritten depending on the cutoff frequency. Can't remember much more, it was about 26 years ago...

    • @gianluca.g
      @gianluca.g Před 3 lety

      @@hbp2m you mean a look up table of 65536 entries, each entry containing the precomputed divided value?

  • @dracenmarx
    @dracenmarx Před 3 lety +1

    You forgot that it is often used in copy protection, where the EXE is encrypted with the CD ROM signature

  • @TimothyWhiteheadzm
    @TimothyWhiteheadzm Před 3 lety

    When talking about the CPU instruction pipeline you did not mention that the CPU may actually do prediction and execute some instructions ahead of time either simple parallel execution where it knows what is coming and knows future instructions don't depend on previous ones, or where it predicts what branch you will take and starts fetching/executing some instructions down that path ahead of time.
    I wonder if there are any bugs in CPUs with regards to this ie if they always correctly recognize when you overwrite an instruction.

  • @Diggnuts
    @Diggnuts Před 3 lety +2

    Arguably, the brain is fundamentally a system that uses self-modifying "code" to function. I do not think that general AI is even remotely possible without it on some level. Not saying that a von Neumann machine is the platform to implement it, but at some point we'll need to embrace the concept and actually understand its immersive properties instead of labelling it too hard to understand or debug.

  • @oberguga
    @oberguga Před rokem

    The Forth system runtime is basically self-modifying code and it's pretty neat.

  • @charlieangkor8649
    @charlieangkor8649 Před 3 lety

    I wanted to use self-modifying code in Twibright Links, which is one of the fastest graphical browsers, despite doing all image processing in photolinear 48 bit color depth, gamma correction, LCD optimization subsampling and dithering even for 24 bit monitor! But due to the modern environment which doesn't allow to modify code pages, I ended up with a wasteful template that generated about 20 different routines for each possible memory organization of the videoram. I basically simulated the self-modifying code in templates. Twibright Links is shipped in every major Linux distribution.

  • @AgentM124
    @AgentM124 Před 3 lety

    Funny thing about the recipes. We actually have recipes here that change the instructions.
    There are 2 options.
    - regular
    - with extra veggies
    For regular, follow steps 1-4 then 6-7
    For extra veggies follow 1-2 then 4-7
    Also put less water in when using the veggies recipe and you can opt to put no meat.
    Or something along those lines.

    • @tcritt
      @tcritt Před 3 lety

      Thats just an if statement or a switch though.

  • @Norman_Fleming
    @Norman_Fleming Před 3 lety

    Started out writing in a near assembly language. We wrote lots of self modifying code. If you know how to write it, you can read it, no problem. Comments help. Of course we used regular patterns so you learned what it looked like.

  • @Liggliluff
    @Liggliluff Před 3 lety +1

    (8:00) That's a nice set of units, except tsp/tbsp. If someone were to actually write a recipe, stick to only using metric. tsp/tbsp aren't universal.

  • @ScorpioneOrzion
    @ScorpioneOrzion Před 3 lety

    Is a safer way to have a file that the program can modify and read from, with the program executing the instructions from there?

  • @piotrarturklos
    @piotrarturklos Před 3 lety

    One can make the argument that the cake example is not really self-modifying code; it is just some bad code that changes variable values in unexpected places. In the world of cake recipes, any value can be interpreted as a dynamic variable rather than a part of an instruction.
    Hence, to be self-modifying in the world of cake recipes, it would have to modify instructions after they are executed, and then execute them again with a jump, such as a "go back to X" instruction or a loop. Numbering would have to be introduced to keep track of what we are modifying and where we are jumping.

  • @fllthdcrb
    @fllthdcrb Před 3 lety +4

    Objection: The second recipe shown is _not_ self-modifying code, at least not in the way you surely mean. It only modifies the ingredients, which is akin to a program modifying its variables, which they do literally all the time. Granted, it is written in a confusing way, but that's a little different problem. An actual self-modifying code analog might be, e.g., if we take the first recipe and add a first line to the instructions that says, "Replace the phrase 'one at a time' below with 'two at a time'", or "Insert the word 'soy' before 'milk' below". Those are rather tame examples, of course, but you get the idea.
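The distinction drawn above can be sketched in Python, assuming a toy "recipe interpreter" that is invented here for illustration. A `("patch", target, old, new)` step rewrites the text of a later instruction before it is carried out, which is the analog of code modifying code rather than code modifying its variables. The recipe wording is hypothetical.

```python
def run(recipe):
    """Carry out steps in order; a patch step rewrites the text of another step."""
    steps = list(recipe)          # work on a copy so the caller's recipe is untouched
    performed = []
    for i in range(len(steps)):
        step = steps[i]           # re-read each time, so earlier patches are seen
        if isinstance(step, tuple) and step[0] == "patch":
            _, target, old, new = step
            steps[target] = steps[target].replace(old, new)
        else:
            performed.append(step)
    return performed

# Hypothetical recipe: the first line edits the third line.
recipe = [
    ("patch", 2, "one at a time", "two at a time"),
    "Cream the butter and sugar.",
    "Add the eggs one at a time.",
]
```

Here `run(recipe)` performs "Add the eggs two at a time." even though that sentence appears nowhere in the recipe as written, which hints at why such code is hard to read and test.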

  • @JimCoder
    @JimCoder Před 3 lety +1

    Also, self modifying code can make proper testing nightmarish because the code to be tested won't exist until after the program begins to run. It can be fun to play with though.

  • @marianoV612
    @marianoV612 Před 3 lety +1

    Now I want to do this

  • @allanrichardson1468
    @allanrichardson1468 Před 3 lety

    I once worked in a shop that used IBM S/360 hardware with two different operating systems, one to run (or simulate) a high throughput, fast responding, online system (back when even mainframe CPUs were rated in MHz rather than GHz, and the largest real RAM, using magnetic cores, was under a megabyte) for responding to text-based transactions with text-based replies; and the other one, at first DOS/360 and later OS/360, to maintain batch production data and programs for batch and online systems. A critical program for building batch-mode test scripts for a test copy of the online system, written in IBM S/360 Assembler for DOS and later for OS, contained a first-time-through switch to fall through a setup routine when the program was first loaded and executed, but to bypass it if the program was executed again without reloading. So far, OK, but ...
    RATHER THAN using a byte of the data area as a switch, this program used the binary op code of the switch testing instruction itself AS the switch. Even in those days this was considered bad code, but somebody at IBM wrote it and shipped it with the online system and its accessory programs. The switch testing instruction was CODED as NI, And-Immediate, which replaced one data byte with the logical And of itself vs an “immediate” byte coded in the instruction. This set the condition code so the next instruction would not branch around the first time code. But the byte that was altered by the NI instruction was ITS OWN OPCODE, changed by the carefully chosen “mask” byte to a TM, Test Under Mask instruction, so that subsequent passes through the program WOULD branch around the first time code (TM uses the byte of immediate data to select only the bits of the data byte (which is now the TM opcode itself) for which the mask has a 1, and out of ONLY those bits, sets the condition code to all-1s, all-0s, or mixed 1s and 0s).
    Personally, I would have given this guy praise for his cleverness and made him rewrite the code to use a switch declared as data. I think it was redesigned to be re-entrant, i.e. pure procedure, when it was updated for the higher levels of OS/360.
    Self modifying code is almost always bad, except for puzzles!
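As a rough high-level analog of the two designs described above (not a reconstruction of the S/360 code), the Python sketch below contrasts a first-time switch kept in data with one that works by rebinding the routine itself. All class and method names are invented for illustration, and rebinding a method is only a loose stand-in for patching an opcode.

```python
class FlagVersion:
    """First-time switch stored as ordinary data."""
    def __init__(self):
        self.first_time = True
        self.log = []

    def run(self):
        if self.first_time:
            self.log.append("setup")
            self.first_time = False
        self.log.append("work")


class SelfPatchingVersion:
    """First-time switch implemented by replacing the routine after its first call."""
    def __init__(self):
        self.log = []
        self.run = self._first_run      # 'run' initially points at the setup path

    def _first_run(self):
        self.log.append("setup")
        self.run = self._later_run      # rewrite what 'run' means from now on
        self._later_run()

    def _later_run(self):
        self.log.append("work")
```

Both versions log "setup" once and "work" on every call. The flag version keeps the program text fixed, which is the property a re-entrant (pure procedure) design needs.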

  • @dwagner6
    @dwagner6 Před 3 lety

    Love to see KDE installed!

  • @refactorear
    @refactorear Před 3 lety +1

    This reminds me of my time cracking stuff by writing 0x90s to nop other instructions, usually jumps after tests or comparisons👍
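For readers who have not seen the trick, here is a small Python sketch of what "writing 0x90s" means, applied to a byte string rather than a real executable. The helper `nop_out` and the example fragment are made up for illustration; the encodings used (0x90 for NOP, 0x74 for a short JE, 83 F8 01 for `cmp eax, 1`) are standard x86.

```python
NOP = 0x90  # x86 single-byte no-op

def nop_out(code, offset, length):
    """Return a copy of 'code' with 'length' bytes at 'offset' replaced by NOPs."""
    patched = bytearray(code)
    patched[offset:offset + length] = bytes([NOP]) * length
    return bytes(patched)

# Hypothetical fragment: cmp eax, 1 (83 F8 01) followed by je +5 (74 05).
fragment = bytes([0x83, 0xF8, 0x01, 0x74, 0x05])

# Overwrite the two-byte conditional jump so execution always falls through.
patched = nop_out(fragment, 3, 2)
```

The comparison is still executed, but its result no longer affects control flow because the jump that consumed it is gone.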

    • @fixups6536
      @fixups6536 Před 3 lety +1

      Count me in!
      OK, that was a long time ago, no need to call the police. :)
      Anyway, I stopped after filling a box with games on 5¼-inch floppies. I had cracked all of them, but never played any, because programming and running things inside a debugger was more fun...

  • @JNCressey
    @JNCressey Před 3 lety

    Before we start, lemme guess, it's the loop with no exit condition but an overflow of a variable being repeatedly incremented eventually turns into a jump instruction.

    • @0LoneTech
      @0LoneTech Před 3 lety

      That would have actually connected this to the Mel tales... too bad the video didn't contain that much fun.

  • @homomorphic
    @homomorphic Před 3 lety +1

    All of the software that defends against malware also uses self modifying code to prevent the malware authors from being able to disable the defensive software.

  • @vhm14u2c
    @vhm14u2c Před 3 lety

    I did this in ASP for my personal homepage in the past: if someone tried to access something they were not supposed to, their IP address would be blacklisted and appended to a linked ASP page. Every page called that IP-check code at the beginning, and the page would literally stop if the IP was blacklisted.

  • @ryanhaart
    @ryanhaart Před 3 lety +2

    Wouldn't the MMU in modern computers prevent you from writing into RAM containing the program code?

    • @0LoneTech
      @0LoneTech Před 3 lety

      If instructed to, yes. That function is actually from MPUs, one of the predecessors of modern MMUs.

    • @briansonof
      @briansonof Před 3 lety

      The Not eXecutable flag for memory protection also normally prevents dynamic memory allocations from becoming executable. Of course, you are often given facilities (the operating system's "system calls") to modify your memory mappings and replace a page of program code with a copy-on-write or copied, writable page, or to make your dynamically allocated memory executable.

  • @jay_sensz
    @jay_sensz Před 3 lety +1

    Modern operating systems protect you from _accidentally_ overwriting your code. But you can still explicitly request write permission to code segments via a system call (VirtualProtect on Windows, mprotect on Unix).
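A minimal Unix-only sketch of the `mprotect` call mentioned above, made from Python through `ctypes`. It toggles write permission on an anonymous data page rather than on a real code segment, so it only illustrates the permission request itself. The function name `toggle_write_permission` is invented, and loading libc with `ctypes.CDLL(None)` is assumed to work, as it does on typical Linux and macOS systems.

```python
import ctypes
import mmap

# Load the C library of the running process (assumption: Unix-like system).
libc = ctypes.CDLL(None, use_errno=True)
libc.mprotect.argtypes = [ctypes.c_void_p, ctypes.c_size_t, ctypes.c_int]
libc.mprotect.restype = ctypes.c_int

def toggle_write_permission():
    """Make a page read-only, then explicitly ask for write access again."""
    size = mmap.PAGESIZE
    page = mmap.mmap(-1, size, prot=mmap.PROT_READ | mmap.PROT_WRITE)
    view = ctypes.c_char.from_buffer(page)
    addr = ctypes.addressof(view)       # page-aligned start of the mapping

    page[0:1] = b"A"                    # allowed: page is writable

    if libc.mprotect(addr, size, mmap.PROT_READ) != 0:
        raise OSError(ctypes.get_errno(), "mprotect(PROT_READ) failed")
    # A write at this point would crash the process with a segmentation fault.

    if libc.mprotect(addr, size, mmap.PROT_READ | mmap.PROT_WRITE) != 0:
        raise OSError(ctypes.get_errno(), "mprotect(PROT_READ|PROT_WRITE) failed")
    page[0:1] = b"B"                    # allowed again after the explicit request

    result = page[0:1]
    del view                            # release the buffer export before closing
    page.close()
    return result
```

The same pattern, with `PROT_EXEC` added and a real code address, is what a program would need before patching its own instructions; on Windows the corresponding call is `VirtualProtect`.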