Self Compiling Compilers - Computerphile

Sdílet
Vložit
  • čas přidán 19. 07. 2019
  • Using T-Diagrams, Professor Brailsford shows us how to take our compiler to the next level.
    Previous video on t-diagrams: • Bootstrapping with T-D...
    / computerphile
    / computer_phile
    This video was filmed and edited by Sean Riley.
    Computer Science at the University of Nottingham: bit.ly/nottscomputer
    Computerphile is a sister project to Brady Haran's Numberphile. More at www.bradyharan.com
  • Auta a dopravní prostředky

Komentáře • 245

  • @Kyrelel
    @Kyrelel Před 4 lety +135

    Back in the 80' s we used to do this sort of thing all the time, to extend languages. You would write a compiler that understood the new language extensions, compile it under your existing compiler, then write another compiler that used the language extensions, and compile that using the extended compiler.
    e.g. if your language does not have constants, you write a compiler system that could lex/parse/codegen the term "CONST "
    Then write a new compiler that uses "CONST"'s and compile it with the CONST-aware compiler (you could even re-feed this back through itself to generate a better compiler)
    A few simple tweaks to the compiler code and you can now have ENUMs, and so on
    The most fun was writing optimising compilers and feeding them through iterations of themselves :)
    FORTH was an especially great language to do this process on

  • @Kotfluegel
    @Kotfluegel Před 5 lety +89

    I once read warning in a book about this, that a malicious developer could insert a piece of code that detects if the compiler compiles itself and injects the source code for injecting that backdoor again. And when the compiler compiles something else it could just install the backdoor into the program. After the compiler has compiled itself with the malicious code once, the developer can even purge his code from version control in order to cover his tracks. This is the main reason, why you must keep all the versions of your compiler. If the backdoor is detected only several versions later, you have to go back to the last non backdoored version and start the bootstrapping process again.

    • @AlanCanon2222
      @AlanCanon2222 Před 3 lety +31

      That would be "Reflections on Trusting Trust", Ken Thompson's ACM Turing Award lecture.

    • @oleh6738
      @oleh6738 Před 2 lety +4

      @@AlanCanon2222 Thank you. That was a very interesting read.

    • @AlanCanon2222
      @AlanCanon2222 Před 2 lety +2

      @@oleh6738 It is. I read it 1996, standing in the Stanford University campus bookstore for two hours, my legs aching. I'm from Louisville, Kentucky, but have visited every landmark of computing I can afford, and my experience of Stanford University consists of 45 minutes of walking the incredibly lovely campus, and 4.5 hours standing reading books I couldn't afford in the bookstore.

    • @AlanCanon2222
      @AlanCanon2222 Před 2 lety +1

      @@oleh6738 The thing that fascinates me most about it, and which I struggle to understand the mechanics of, is the task of getting a computer program to output its own source code.... that for me is a more deeply interesting aspect of his lecture than the horrifyingly obvious-in-retrospect light it casts on computer security.
      The paper is doubly interesting because it has deep results in it from pure computer science, and also this vastly troubling real-world implication to it. Perfect example of the math-meets-engineering nature of the field, with all its thousands of years of political and social consequences. Computer Science is a subset of the Humanities.

    • @compilercompiler6216
      @compilercompiler6216 Před 2 lety

      I've heard from others about the Linus version of this

  • @sharkuc
    @sharkuc Před 5 lety +294

    Just realized this is why gentoo compiles gcc twice when doing an upgrade (or used to at least). One of those "it's glaringly obvious but I had to have someone else point it out to me before I noticed" moments.

    • @peter_castle
      @peter_castle Před 5 lety +31

      it's not glaringly obvious. 99.99% of people would have never guessed that.

    • @iabervon
      @iabervon Před 5 lety +36

      It's not Gentoo in particular that compiles gcc twice; gcc's own build process compiles its source code twice. It also at least used to leave out support for features its own source didn't use, such as Fortran or floating point math, since the first build was only going to ever compile one program.

    • @JodyAlford
      @JodyAlford Před 5 lety +13

      Three times I thought. Once with the basic copiler. Once to optimize the compiler. Once to build itself with the optimized compiler.

    • @Camoceltic
      @Camoceltic Před 5 lety +3

      Same for OpenJDK, as long as you have the "jbootstrap" use flag enabled.

    • @sabujpattanayek8799
      @sabujpattanayek8799 Před 5 lety

      Why is it not possible to fully optimize with -O2 or -O3 the first time?

  • @threeMetreJim
    @threeMetreJim Před 5 lety +92

    I remember doing similar when I was 14... Had an assembler running in BASIC that was extremely slow, so wrote a new one in Assembler, got the BASIC assembler to assemble that, and then used the binary version of the assembler to improve and complete itself. Was back in the day, when 6502 assembler was a thing. Was happy with the result, assembled itself something like 100 times faster than the BASIC version in the end.

    • @profdaveb6384
      @profdaveb6384 Před 5 lety +11

      Thank you - and also to a few others on here - who've contributed some nice first-hand bootstrapping tales. Your speed-up factor of 100 must have made it all seem worthwhile ...

  • @VishvakaRanasinghe
    @VishvakaRanasinghe Před 5 lety +249

    Fun fact: C# compiler is written in C#. Also Typescript transpiler is also written in Typescript. The idea of a self compiling compiler has many examples.

    • @xplinux22
      @xplinux22 Před 5 lety +32

      Indeed, it's quite a fascinating process and a rite of passage for many languages. The Rust compiler is written in Rust, but there are two different versions used in the bootstrapping process. The very first original compiler was written in OCaml and was once used for bootstrapping, but it has been replaced by a more advanced version called *mrustc* written in C which is the one typically used for bootstrapping *rustc* these days.

    • @sundhaug92
      @sundhaug92 Před 5 lety +13

      Also: pypy is written in a subset of python

    • @chillyvanilly6352
      @chillyvanilly6352 Před 5 lety +15

      Really...? And nobody mentions Rust-written Rust compiler?! O.o
      I mean of all those mentioned, this is the most revolutionary, don't u think? Hh

    • @TomStorey96
      @TomStorey96 Před 5 lety +8

      Golang is also "self hosted" I believe.

    • @connorhorman
      @connorhorman Před 5 lety +7

      The Oracle Java Compiler is a Java Library.

  • @fullerdb
    @fullerdb Před 5 lety +52

    The machining tools analogy really unlocked this for me! 👍

  • @jasongladen82
    @jasongladen82 Před 5 lety +23

    I Remember a story from "the Jargon file" about someone modifying the compiler to install a backdoor in the login software and a backdoor seed in the compiler. then compiling itself and removing all traces.

    • @Ryndae-l
      @Ryndae-l Před 4 lety +8

      Probably Reflections in trusting trust by Ken Thompson... Terrifying thoughts

  • @cmdlp4178
    @cmdlp4178 Před 5 lety +82

    The best videos on this channel are those with Professor Brailsford.
    I guess the compiler series also has some episodes about disassembly and then decompilation. (Decompilation is hard, because of UB (undefined behavior) in C (the target language))
    Edit: the series should also contain a video about undefined behavior in C and C++, because it is a serious topic.

    • @LordDice1
      @LordDice1 Před 5 lety +1

      Try Decompiling Visual C++.... 😄

    • @noxabellus
      @noxabellus Před 5 lety +2

      Decompilation is hard because its akin to unscrambling an egg. Not sure why you think its hard because of UB. Thats a small part of the equation. UB is simply a consequence of low level code. Not every use case can be known at compile time. If you look at a language like Rust, there is very little UB in the safe part of the language, but if you use the unsafe escape hatch you can produce the same kinds of UB as C.

    • @cmdlp4178
      @cmdlp4178 Před 5 lety +3

      @@noxabellus When UB would be ignored, the decompiled code might be incorrect, the generated code has to use either compiler builtins to correctly reflect the binary or insert checks everywhere.

    • @obinator9065
      @obinator9065 Před 5 lety

      My Love & Hate :: C++

    • @subschallenge-nh4xp
      @subschallenge-nh4xp Před 5 lety +1

      Try also to decompile app apk with this app into java , ...etc

  • @incelstate2466
    @incelstate2466 Před 3 lety +7

    Ever since I learned that self compiling compilers exist I live in fear

  • @nicflatterie7772
    @nicflatterie7772 Před 2 lety +2

    Brings back difficult memories from university. I had to write a Scheme interpreter in Scheme. Compilation was the only class I abandoned.

  • @rationalityfirst
    @rationalityfirst Před 5 lety +14

    Great to see that the stock of dot matrix printing paper is still going strong.

  • @MonochromeWench
    @MonochromeWench Před 5 lety +25

    For compilers that aren't C the first step is probably going to involve a simple compiler for some new language written in C rather than written in ASM we can do that cause someone has already written a working C compiler for most every platform and C is much easier to use than ASM

  • @makego
    @makego Před 8 měsíci +1

    I appreciate that he's using up his remaining stock of tractor feed green bar by pseudocoding on it.

  • @JoshuaHillerup
    @JoshuaHillerup Před 5 lety +16

    Another way of dealing with the bugs in the bootstrapping process is to write a really simple to compile C compiler that BinA can compile and can self compiler, then write a second version of a C compiler that's more advanced, and work on getting the first C compiler code to be able to compile that. Adding in a third C compiler could even be needed, and if you're in a very complicated language I could see even more stages being helpful.

  • @zxuiji
    @zxuiji Před 4 lety +9

    Quite interesting, recently started writing my own compiler (mitsy if anyone interested) and right now I'm working on the most essential part of it, namely the bit that takes C literals and transforms them into hard values, part I'm struggling with is the exponent on floats, everything else has been a relative breeze as far as literals go

  • @topsecret1837
    @topsecret1837 Před 5 lety +6

    This greatly reminds me of the RepRap Project, which is especially clear when you may or may not knowingly referred to it when talking about self replicating 3D printers. They very much exist and they are commercially viable given how they’ve gotten the desktop 3D printing industry to drop prices like mad.

  • @Richardincancale
    @Richardincancale Před 5 lety +26

    Very interesting angle on self compiling compilers - improving quality of code and compiler. My own career experience (forty years!) was doing this as a technique to achieve portability - to run software on new machines / chipsets etc. Excellent - thanks!

  • @NextLevelNoob
    @NextLevelNoob Před 5 lety +13

    Please make a video on Reflections on trusting trust by Ken Thompson

  • @LazyToad
    @LazyToad Před 5 lety +72

    Cf. Ken Thompson's 1984 Turing Award lecture "Reflections on Trusting Trust"

    • @666Tomato666
      @666Tomato666 Před 5 lety +4

      if there is _one_ paper that anybody talking about security of computer systems should really read it's that one
      (especially if we're talking about any kind of electronic voting systems)

    • @okuno54
      @okuno54 Před 5 lety +3

      That is a beautiful paper. Not the most practical, but absolutely beautiful in its paranoia!

    • @666Tomato666
      @666Tomato666 Před 5 lety

      @@okuno54 if you are talking about elections in a large country, spending few million to get the result you want _is exactly what big corporations are already doing_

    • @okuno54
      @okuno54 Před 5 lety

      @@666Tomato666 I was not... where even...?
      I mean also, try a few billion if we're totaling it all up, and congrats on the quadruple post.

    • @666Tomato666
      @666Tomato666 Před 5 lety

      @@okuno54 "try a few billion if we're totaling it all up"
      go to opensecrets, you'd be surprised how cheaply politicians can be bought

  • @PrettyBlueThings
    @PrettyBlueThings Před 5 lety +4

    Thank you Professor Brailsford!

  • @LordomusPL
    @LordomusPL Před 5 lety +160

    When i see prof. Brailsford i click faster than Thanos can snap.

    • @sebastianelytron8450
      @sebastianelytron8450 Před 5 lety +9

      Then how fast do you click when it's Pound?

    • @grainfrizz
      @grainfrizz Před 5 lety +1

      But you watched Nerdist news first

    • @deamon6681
      @deamon6681 Před 5 lety +1

      I've watched end game, and can say he doesn't snap very fast.

    • @Blox117
      @Blox117 Před 5 lety

      my click was stored in cache

  • @jansustar4565
    @jansustar4565 Před 10 měsíci +2

    So if I understand correctly:
    Have a compiler v1
    Write a compiler v2
    Compile v2 with v1 to produce v2_slow
    Use v2_slow to compile v2 to get v2_fast

  • @PatrickPoet
    @PatrickPoet Před 5 lety +4

    I wrote an assembler using a monitor on the vic-20 (6502) then used the assembler to write a better assembler with labels, then used that to write a small-c compiler, then used that to write a c-compiler. That's how I learned to program.

    • @AZOffRoadster
      @AZOffRoadster Před 3 lety

      HESMon? I used it to reverse engineer the C64 Kernel, stripped out all the cassette routines, and replace it with bank swapping (using the cassette control circuit) on a 4x sized rom. Built in all the tools I wanted.

  • @LLoydsensei
    @LLoydsensei Před 5 lety +3

    I'm glad you made that video because I read somewhere that when compiling a compiler, it would need to get compiled by itself again and again to make sure it wouldn't break and shoot bugs but it was not explained that the quality of the latest versions would be better, so I thought it would become worse. Your explanation was very clear as to why not :)

  • @MisterBr0wn
    @MisterBr0wn Před rokem

    thank you so much for the video.
    finally it answered the many questions that I had

  • @will16320
    @will16320 Před 5 lety +10

    legendary professor this one

    • @ikickss
      @ikickss Před 5 lety +1

      Yoda of Computer Science. Yup.

  • @gigaherz_
    @gigaherz_ Před 5 lety +4

    The original executable format for DOS was the COM program format, which was a "bare" (no headers, single code and data segment, meaning up to 64kb for code+data) 16bit executable. DOS would put that single section at a fixed location in memory and then execute from a fixed adress. The DOS EXE "MZ" file format (MZ being the initials of Mark Zbikowski, who defined the format) was the advanced format that would allow multiple segments, separate code and data, etc. 16bit windows EXE files use a newer format called NE (New Executable), and 32bit/64bit windows EXE files use PE (Portable Executable -- the portable came from how the same file format could be used in all the platforms Windows used to work on at the time).
    EDIT: Added some details.
    Addendum: If you have ever opened an NE or PE EXE file in a hex editor (or even in notepad, but don't do that, it would corrupt the file if you save by mistake), you'd see it has MZ at the beginning, which is there for anyone who tried to run a windows program in DOS, who would receive a message saying the program can't be run in DOS.

    • @MonochromeWench
      @MonochromeWench Před 5 lety

      dos itself thinks that windows exe are just fine to run because it ignores all the windows specific code and data. the message saying the exe can not be run in dos is output by a small dos only code segment in the exe that dos sees and executes

    • @gigaherz_
      @gigaherz_ Před 5 lety

      @@MonochromeWench Yes. The "dos segment" you speak of is the beginning of the file, the header contains a number of reserved bytes before the actual data starts, and those bytes are used to create the little dos program that prints an error message. Both NE and PE executables have this.

    • @daniellatikaynen1389
      @daniellatikaynen1389 Před rokem

      MZ is not "there for anyone who tried to run a windows program in DOS". DOS .exe files have had the MZ preamble well before Windows. The thing they did with Windows executables was to include a .TEXT stub which does nothing else than writing "This program cannot be run in DOS mode." to standard output, at the same time keeping the format backwards compatible and exit gracefully when started from DOS. the Windows executable loader would look for the NE/PE after that stub, and ignore it.

  • @TheDarkOne629
    @TheDarkOne629 Před 5 lety +4

    I was asked this question so often, for every language you can imagine...
    The second question is always: "But why would you write it again, if it can already be compiled?"
    There is one thing you could have mentioned, because it's always the third question. "If a compiler can make code executable for machine A, how can it make the code executable for machine B?"
    I apologize if I just didn't hear you when you said it.

    • @wfzyx
      @wfzyx Před 5 lety +2

      The second question can be answered as: Because writing an C compiler in C, is easier than ASM, so the final product tends to be better, and its kinda of a self "unit test".
      The third question can be answered as: If machine B have the same architecture, you don't have a problem, just copy the C-MK2 compiler over. If it is a different architecture, then you need to use machine-B's ASM to write a C-MK1, and then feed your already coded C-MK2 in this B-C-MK1, to have the same C-MK2 as machine A.
      Quoting back to first video of this series, you always only need to rewrite this small C-MK1, in each new architecture in order to generate MK2.

    • @TheDarkOne629
      @TheDarkOne629 Před 5 lety +2

      @@wfzyx The answers seemed too obvious to me, so I did not bother. This was really ignorant of me.
      I learned all of this years ago and have been writing my own compiler for some very specific stuff. This made me forget how hard it was to learn all of this from zero.
      Anyways, thanks for answering :)

  • @krishnabirla16
    @krishnabirla16 Před 2 lety

    Miss him.

  • @MrNickolay1986
    @MrNickolay1986 Před 2 měsíci +1

    What I don't understand is how something made with quality 0 binary can produce better code (let alone faster)? I mean, if I wrote a C compiler in assembly how my self hosting version could possibly be faster? It can't be faster than original version made in assembly (let's call it version 0) from scratch, right? Or the point is that during bootstrapping optimizations possibilities open up, which are hard / impossible to do directly in assembly in version 0?

  • @Goodvvine
    @Goodvvine Před 5 lety +6

    I learned this with Go 1.5+, still a "pain" to build from source

  • @Rob9
    @Rob9 Před 5 lety +1

    Could you have a bug in BinA that doesn't show up until say BinC? In that case would you have to fix it and iterate through A, B, and C again or could you somehow fix it in just your next iteration?

    • @mrmimeisfunny
      @mrmimeisfunny Před 3 lety

      Not really. BinA shouldn't change what BinB does. Theoretically BinB on BinB should do the exact same thing as BinB on BinA but better.

  • @oglothenerd
    @oglothenerd Před 4 dny

    I am making a compiled language, but I am still not sure how I would allow it to work on other CPUs if I bootstrap.

  • @JaseyStudios
    @JaseyStudios Před 5 lety +2

    What an enticing title!

  • @chaosopher23
    @chaosopher23 Před 5 lety +4

    I was going to write a spoof on Monty Python's "Decomposing Composers" with "Decompiling Compilers" but there aren't enough abandonware compilers to fit the bill.
    Ouroborous compiling...

  • @Rfc1394
    @Rfc1394 Před 2 lety

    You don't necessarily have to write the bootstrap compiler in assembler. Nicklaus Worth write the first Pascal compiler using FORTRAN. He wrote the compiler originally in itself, then hand-translated it into FORTRAN. Once he built that, the compiler could be used to compile itself, and once all bugs were fixed, all newer versions. Another possibility is to write a subset compiler,, just enough to compile the smallest subset than can do something, then once that is compiled, add more of the compiler to itself until you get it to where it can compile the full language.

  • @vega1287
    @vega1287 Před 5 lety +4

    fun fact : the reproducing rapid prototyping machene or reprap for short, is the basis of modern desktop 3d printing the idea behind the reprap project was to make a machene that you can use to replicate the machene given enogh time and money

    • @toddmarshall7573
      @toddmarshall7573 Před 5 lety

      Where does something like yacc (yet another compiler compiler) fit into this discussion?

    • @okuno54
      @okuno54 Před 5 lety +2

      @@toddmarshall7573 Yacc isn't really a compiler compiler as advertised; it only compiles a parser. Handy, but not nearly the interesting bit.

  • @theja63
    @theja63 Před 5 lety

    There's this subtle movement of the cam that makes it very computerphile-ish. Is it an intentional aesthetic you stick with? or is it because you just don't bother setting up tripods and stuff?

    • @asinzuar
      @asinzuar Před 5 lety +1

      Also the way he zooms in to the face sometimes.

  • @tertrih9078
    @tertrih9078 Před 5 lety +1

    It makes me think about the human mind actually. Programming, quantum physics, brain surgery is absolutely nothing compared to the thing that produced it which is the human mind. We use our mind to try understand our mind and things obviously go wonky sometimes :D To improve yourself you have to think in a way that is "computable" for your current thinking capacity. You have to think about problems you are having and work out problems in your thinking with the thing that has problems and hopefully make your new thinking a bit better. You have to start at something simple and build on that. The human mind is always building on itself which is fascinating to me :)

  • @xybersurfer
    @xybersurfer Před 5 lety +1

    thanks. great video. i had to rewatch the previous one to get back up to speed, but it was worth it

  • @Rfc1394
    @Rfc1394 Před 2 lety

    The creation of a compiler that can compile itself is called "self-hosting." All assemblers started this way. This is common for Pascal compilers; I know of at least 5 (XDPascal, Stanford Pascal, AAEC 8000 Pascal, UCSD Pascal, and Free Pascal, all of which I've had reason to read their sources). As of now, all "world-class" C++ compilers, including LLVM's Clang and GNU GCC, are self-hosting. So is Free Basic. And there are others, Any programming language where it can read text, and write to a disk file, can be self-hosting, even if it started from something else.

  • @johnnychang4233
    @johnnychang4233 Před 5 lety +14

    The Hamlet of computing, To-C or not To C ;)

  • @Enigma758
    @Enigma758 Před 2 lety

    The same is true of compiler-compiler generators (e.g. yacc written in yacc)

  • @AZOffRoadster
    @AZOffRoadster Před 3 lety +1

    And no mention of Forth? Sun's SPARCstation bios is Forth. PostScript looks like it's Forth to me.

  • @Soulthym
    @Soulthym Před 5 lety +3

    I must not have been following so well, I am still confused as I can't think of a system that only builds on itself and is able to run on a specific architecture, without that architecture translation layer

    • @TheExalaber
      @TheExalaber Před 5 lety +2

      That is what the compiler is. It translates from a high level language to a low level language. Once you have the compiler working, you no longer need the low level references. For instance when installing Linux you go through a tool chain. You start by downloading a working image with a working compiler. Then you use the image compiler to cross compile a compiler for your system. Then you boot your system and recompile your compiler with the cross compiled compiler to create a natively compiled compiler which you use for the rest of your installations. At no point in this process did you use a program written in assembly or in binary because you were able to start with a working compiler.

    • @xybersurfer
      @xybersurfer Před 5 lety +1

      it still remains. the C compiler will describe how to create the binary file. the difference is that it's described in C instead of assembler

    • @codycast
      @codycast Před 5 lety +1

      Soulthym there is nothing to follow. This was just random rambling rantings

    • @xybersurfer
      @xybersurfer Před 5 lety +2

      @@codycast you're wrong. this video makes sense

    • @agsystems8220
      @agsystems8220 Před 4 lety

      Pretty much all high level languages compile to C and let the hardware vendors do the work from there. They still face the problem of getting C code to run well on their new hardware, we just don't see it.

  • @snoopyjc
    @snoopyjc Před 4 lety

    Back in the day, we wrote the FORTRAN VII compiler in FORTRAN VII

  • @connorbishop6819
    @connorbishop6819 Před 3 lety

    How was the assembler written which produced BIN A?

  • @chillyvanilly6352
    @chillyvanilly6352 Před 5 lety +18

    I would so wish for prof. Brailsford to compare his ever so loving C with the Rust programming language!
    Really curious what his thoughts are on it...

    • @noxabellus
      @noxabellus Před 5 lety +3

      He probably doesn't care much for all the hype. If you've been working with manual memory management for a few years it doesn't really bother you anymore so the tradeoff of huge compile times doesn't have so much appeal

  • @soraaoixxthebluesky
    @soraaoixxthebluesky Před 5 lety +1

    Believe me I’m PC enthusiast but I’m no scientist, PHD holder or engineer in computer world. My question is why would we want a new compiler for our current language if current compiler is working as expected I mean, it can traslate english (computer language) to binary and as long as it’s translated to a binary (executable code) with no intermediate or algorithm that need to be run during run-time like, I suppose, Java, it is good enough for any programme that we write. So why would we want to change how the compiler work?

    • @ChrisLee-yr7tz
      @ChrisLee-yr7tz Před 5 lety +3

      To create more efficient binary and also to develop the language.

  • @esra_erimez
    @esra_erimez Před 2 lety +1

    Professor Brailsford please take care of yourself. You are a treasure to humanity

  • @thorodinson8022
    @thorodinson8022 Před 4 lety

    Is he talking about cross compiling?

  • @BrendanMetcalfe
    @BrendanMetcalfe Před 5 lety

    Crazy!!

  • @windowslogo3577
    @windowslogo3577 Před 5 lety +5

    I love this dude :D

  • @andrewgebbie6410
    @andrewgebbie6410 Před 5 lety +3

    Can anyone remember a program called the 'Last One'. It was really hyped about 30 years ago to be the last compiler program you would need. The program would produce another runnable program from your Input statements. It was a spectacular failure. Did you buy a copy and what was it really like?

    • @noxabellus
      @noxabellus Před 5 lety +1

      Sounds like "esolangs" we have today that try to emulate natural language

  • @xiongbenjamin
    @xiongbenjamin Před 2 lety

    Wow! Bootstrapping in a way use pointers to rid of the unwanted product.

  • @stevenm4319
    @stevenm4319 Před rokem

    Wait a second Computerphile = Computer file OMFG

  • @jyoules9833
    @jyoules9833 Před 5 lety +3

    I'm not sure why all these complicated diagrams and notations are needed to describe something quite straightforward.
    It all started with a piece of software written in machine code to make writing and running programs easier ("Initial Orders").
    Then a compiler was developed using the above to convert assembler instructions to machine code.
    Then that compiler was rewritten in assembler, compiled and became the new version of the compiler.
    The process of enhancing the compiler to introdeuce new language features continued, using the previous version of the compiler.
    New languages were developed and compilers written in languages for which there were working compilers, including producing code for other types of computer.
    So it is no different to any other technological process - we create new and improved tools using existing tools.

  • @eternaldoorman5228
    @eternaldoorman5228 Před rokem

    I believe that these days the GNU C compiler needs a C++ compiler to compile it. I suppose that was seen as inevitable, like eventually writing the assembler in C, ...

  • @imveryangryitsnotbutter
    @imveryangryitsnotbutter Před 5 lety +1

    There are psychological horror games that are premised on the game itself seemingly changing its fundamental behavior, but an examination of the code reveals that the "aberrant" behavior was always programmed in, and can be de/activated by changing one variable.
    Could you make a game that actually rewrites its own code, to mess with even the players that snoop around in the game's files?

    • @davidmcgill1000
      @davidmcgill1000 Před 5 lety

      like Doki Doki Literature Club?

    • @Markd315
      @Markd315 Před 5 lety

      Certainly. That would be a fun challenge although I'm not sure how accessible/marketable it would be as a horror gimmick, I doubt it would sell well by itself.

    • @xGOKOPx
      @xGOKOPx Před 5 lety

      @@davidmcgill1000 DDLC doesn't rewrite its code though

    • @SimonBuchanNz
      @SimonBuchanNz Před 5 lety +2

      Doublefine's game "Hack 'n' Slash" has you at first edit logical properties of objects in memory, then to edit the raw memory of the scripting runtime, and eventually edit the raw, on disk byte code (like assembler for a scripting language) of the game. All in a Zelda-style game interface.

    • @Alex-dn7jq
      @Alex-dn7jq Před 5 lety +1

      You would need to install a compiler inside the game itself.

  • @retepaskab
    @retepaskab Před 5 lety +1

    why is it hard writing a self-compiling compiler?

    • @TheDarkOne629
      @TheDarkOne629 Před 5 lety +1

      Because if you wrote a very basic compiler for C in assembly (BinA in the video) and then again in C (BinB), you had to write it twice. After that, it gets easier and more readable with later versions.

  • @TheStevenWhiting
    @TheStevenWhiting Před 5 lety +1

    Has he retired now?

  • @AexisRai
    @AexisRai Před 5 lety +10

    The visualizations for this T-diagram series are still confused about the concept being explained. 5:53 does not show the right thing at all.

    • @SimonBastienFiliatrault
      @SimonBastienFiliatrault Před 5 lety +1

      I was wondering the same couldn't wrap my brain.

    • @AexisRai
      @AexisRai Před 5 lety +9

      @@SimonBastienFiliatrault In my opinion (and I said this in my comment on the last video) the critical part of this visualization should be to show how code-as-instructions and code-as-data are *interchangeable.* That is, the code when it acts as a running program, and the code when it acts as the data being transformed by that program.
      The T-diagram ought to represent the code as a program that transforms its input into its output.
      What they should be doing is taking another T-diagram, *shrinking it down and passing it through* the first T-diagram to be transformed into a different one - to show how the C-code representation of that program gets turned into a binary-code representation of it.

    • @SimonBastienFiliatrault
      @SimonBastienFiliatrault Před 5 lety +1

      @@AexisRai I can't agree more. That would've been clearer.

  • @RH-zr5or
    @RH-zr5or Před 5 lety +5

    I was accused of being a computerphile

  • @Kbumb001
    @Kbumb001 Před 5 lety +3

    When I first learned of this in the mid-seventies, it dawned on me that computers would eventually take over the world. Not like now with our PCs, tablets and smartphones, but an absolute, complete takeover.

    • @bytefu
      @bytefu Před 5 lety +1

      Self-modifying AGI is the future, no doubt about that. Though, not the future of humanity, but _the future_, since biology will inevitably become obsolete with all shortcomings it has.

    • @viciouswaffle
      @viciouswaffle Před 5 lety +1

      @@bytefu Luckily computers, and family, have their shortcomings too :)

  • @amilasenadheera9365
    @amilasenadheera9365 Před 2 lety

    Golang compiler is also rewritten in Golang

  • @skilz8098
    @skilz8098 Před 4 lety

    How about designing a compiler that at its core is based off of Machine Learning, Pattern Recognition and AI where the structure of the program that compiles your language of choice is a self contained polymorphic mutating algorithm. A compiler that learns and becomes self aware the more you use it and adjusts itself accordingly. So in a sense if person A writes a series or programs in some language and compiles those programs their compiler will adapt to the habits of that programmer where person B using the same compiler writes their sets of programs in some language and it adapts to their habits. The only thing that is unified between the two is the syntax and rules of the language itself. However, at the end of the day the same compiler on two different machines that was used between two different people end up being two completely different entities so to speak. Would such a construct such as this be feasible? As I stated the off the shelf compilers for person A & B are exactly the same during installation. The compilers on both machines will accept Language (L) with its alphabet, identifiers, keywords, language syntax and rules that is the same for both parties, so they have to write coherent code in said language. However over time and through the use of the compiler the compiler will adjust itself or self modifying to the behaviors of each user based on the set of source code it is trying to compile. This way the compiler can optimize the code more efficiently for that user. For example; user A programs 3D applications using Vulkan over and over again while user B is writing embedded driver codes for micro controllers and user C is writing source code for a specific architecture Operating System. So over a period of time; each of the 3 users or code developers start off with the same off the shelf compiler, but through time and use the compiler itself begins to self mutate as it learns the sets of data it is processing from that user. So in user's A case it will efficiently compile and optimize Vulkan API calls where user B it will optimize driver, kernel or library type code base and user C's version will effectively optimize code based for an operating system on a specific architecture. Of course this can be and is also fine tuned by the user since a user can changed the default settings of the compiler to their desired settings, but much of this could also be automated through Machine Learning - A.I. with the use of polymorphic mutating algorithms which would allow the compiler itself to evolve over time. I know this is different than what is proposed in this video, but it's just something to think about.

  • @Petrolhead99999
    @Petrolhead99999 Před 2 lety

    Sounds like the von Neumann machine... Preferentially selecting for the creation of the new-best version of itself.

  • @Car0linaPh03nix
    @Car0linaPh03nix Před 5 lety +2

    @Computerphile in the future, when a host wants to cover up part of the paper, don't let the use a flipped-over piece of used paper. Really visually confusing because you can pretty clearly see through the paper. Just use a clean sheet.
    EDIT: eg @ 7:05

  • @SevenDeMagnus
    @SevenDeMagnus Před 4 lety

    Cool

  • @venkateshbabu1504
    @venkateshbabu1504 Před 3 lety

    I need a Japanese c compiler.

  • @lucidmoses
    @lucidmoses Před 5 lety +6

    I would think cross compiling is used more now. Maybe that's an idea for an episode.

    • @noxabellus
      @noxabellus Před 5 lety +2

      Weird to compare which is used more since they are totally different things

    • @lucidmoses
      @lucidmoses Před 5 lety +2

      @@noxabellus They are both ways of getting new opcode based systems off the ground. If you don't know why that's directly relavent to the subject at hand then that shows there there is a need for an episode on the topic.

    • @okuno54
      @okuno54 Před 5 lety +1

      @@noxabellus Just to be explicit about what makes them similar: imagine that the Bin_A and Bin_B didn't represent two qualities of binary on the same architecture as in brailsford's example, but two different architectures' binary formats.

  • @pradipdhakal2665
    @pradipdhakal2665 Před 4 lety

    when i see wise men with a tutorial, i know i'm not being cat fish

  • @MikeTrieu
    @MikeTrieu Před 5 lety +6

    "feeding me with myself". Isn't that auto-cannibalism? Dr. Hannibal Lector would be proud.

  • @Jone952
    @Jone952 Před 5 lety +5

    You could say it's "machine reproduction"

  • @ClaytonMacleod
    @ClaytonMacleod Před 5 lety +1

    *assembly

  • @sameash3153
    @sameash3153 Před 3 lety

    Me head exploded

  • @russellchido
    @russellchido Před 5 lety +1

    These T-diagrams over-complicate the explanation in my opinion. Bootstrapping is simply:
    ./bina binb.c > binb
    ./binb binc.c > binc
    and so on.
    With the most tedium step being getting bina in the first place. This can be done which can be done in any language which you have available to you or by hand.

  • @nodalFx
    @nodalFx Před 3 měsíci

    I need to buy a box of continuous feed green stripe.

  • @SudaNIm103
    @SudaNIm103 Před 5 lety

    Just now from the next room: “... it drinks in semen! . . . What the F@&# are you watching‽" lolz

  • @strum007
    @strum007 Před 5 lety

    Compilerphile

  • @RogerBarraud
    @RogerBarraud Před rokem

    03:45 Oroboric Compilation? :-)

  • @grainfrizz
    @grainfrizz Před 5 lety +10

    A 3d printer that prints a 3d printer

    • @UpcycleElectronics
      @UpcycleElectronics Před 5 lety +6

      The industrial revolution is largely based on lathes making better lathes.

    • @AlRoderick
      @AlRoderick Před 5 lety +3

      Look up a video by Prusa research about their factory. They make all the plastic parts of their printers on their printers, as a combination test environment and factory floor. It's part of the original reprap project ethos, it's a family of machines that can be made out of only parts they make themselves and off the shelf parts not specific to 3D printers. They haven't reached that ideal yet, since the actual print head involves custom metal parts specific to 3d printers but they're getting closer.

    • @hellcoreproductions
      @hellcoreproductions Před 5 lety +2

      Consumer 3d printing is basically a thing because of this.

    • @ekremdincel1505
      @ekremdincel1505 Před 4 lety

      @Zero Cool so first human was created by Allah.

    • @dreamingwarlord6151
      @dreamingwarlord6151 Před 3 lety

      @@ekremdincel1505 there's no thing called "first human" because **evolution**

  • @AmeshaSpentaArmaiti
    @AmeshaSpentaArmaiti Před 5 lety +1

    Compiled compilers compiling compilers

    • @Kyrelel
      @Kyrelel Před 4 lety

      Compiled compilers compiling compiler compilers

  • @koppadasao
    @koppadasao Před 5 lety +1

    Don Henley - Actual miles…

    • @profdaveb6384
      @profdaveb6384 Před 5 lety +1

      Yes - just occasionally Mike Campbell's plaintive guitar riff on "Boys of Summer" seems to be the ideal track to bring memories of California flooding back ....

    • @sebastianelytron8450
      @sebastianelytron8450 Před 4 lety

      @@profdaveb6384 Impeccable taste in profession, impeccable taste in music. What are you, the definition of "man of culture"???

  • @Antiorganizer
    @Antiorganizer Před rokem

    The throwing away the previous mess that got you there.... and then, out of the blue, you find yourself working around a bug in the latest greatest. If only you kept the entire chain.... better to use a versioned naming convention, and keep them all...
    Thank me later.

  • @4.0.4
    @4.0.4 Před 4 lety

    0:31 - "It drinks in C program statements and it spits out at the other end [...]"
    That's not how it works at the other end.

    • @Kyrelel
      @Kyrelel Před 4 lety

      If you are thinking anatomically then that is exactly how it works.

  • @Roxor128
    @Roxor128 Před 5 lety +2

    Don't forget the classic put-down for when someone won't shut up about their favourite new language: "Has it been used for anything other than its own compiler?"

  • @christopherg2347
    @christopherg2347 Před 5 lety +4

    But shouldn't the next version of a C compiler be "A++" rather then B?

  • @blackAngel88it
    @blackAngel88it Před 5 lety

    I can see how it's an advantage to not depend on a compiler of a different language that might not be maintained in the long run. But why is the compiler written in its own language automatically faster and better, as Professor Brailsford claims?

    • @maximkovalkov1334
      @maximkovalkov1334 Před 5 lety +1

      Well, the initial premise of the video is that there is no compiler for any language, really, so writing a C compiler directly in assembly and then a C compiler in C is the sensible thing to do, in that it requires the least work

    • @MadocComadrin
      @MadocComadrin Před 5 lety +1

      It's not. You could write a horrible C compiler in C.
      However, assembly can be tedious to write, so you might skip out on faster but more complex algorithms or compile-time optimizations; instead, including them in your compiler written in C.

  • @MrTStat
    @MrTStat Před 5 lety

    bootstrapping

  • @oofyeetmcgee
    @oofyeetmcgee Před 3 lety

    Terry Davis wants to know your location

  • @bluxer4225
    @bluxer4225 Před 5 lety

    Какой умный народ англичане

  • @Fingerblasterstudios
    @Fingerblasterstudios Před 3 lety

    I've bootstrapped 3D printer repairs before...it's an eye opening experience...

  • @IARRCSim
    @IARRCSim Před 2 lety

    It sounds like software cannibalism.

  • @pierreabbat6157
    @pierreabbat6157 Před 5 lety

    The first C++14 compiler could not be written in C++14. But it could be written in C++11.

  • @jaydenritchie1992
    @jaydenritchie1992 Před rokem

    self compiling compiler i thought you were talking about jump the gun slipping the sear, turns out your just talking about hacking machines really

  • @cutliss
    @cutliss Před 5 lety +2

    I'm a little worried about how he said "amd"

    • @SimonBuchanNz
      @SimonBuchanNz Před 5 lety +2

      ARM, not AMD. It's a different *type* of CPU, that runs different instructions. It's used in phones and the like, since it can use much less power for the same amount of computation.

    • @cutliss
      @cutliss Před 5 lety

      @@SimonBuchanNz 😯😮

  • @Markd315
    @Markd315 Před 5 lety +6

    This seemed like a mildly hard concept (I mean to grasp, not to execute) explained in a way that made it sound more challenging. Poor visuals and instructional planning I think.

    • @TheAsymmetrical
      @TheAsymmetrical Před 5 lety

      I feel like the concept itself is straight forward, only complicated by the idea that it feels like a monstrous undertaking at the very least in C. Maybe it might be more straight forward in more modern languages but by golly I feel like it will be forever above my patience and (consequently) paygrade.

  • @jmasterX
    @jmasterX Před 5 lety +1

    Yes?

  • @Phroggster
    @Phroggster Před 5 lety +3

    A C interpreter... Now there's something that'd either be a nightmare to program, or a security nightmare to operate.