"Tree-sitter - a new parsing system for programming tools" by Max Brunsfeld

  • Added on 27 June 2024
  • Developer tools that support multiple programming languages generally have very limited, regex-based code-analysis capabilities. Tree-sitter is a new parsing system that aims to change this paradigm. It provides a uniform C API for parsing an ever-growing set of languages. It features high-performance incremental parsing and robust error recovery, which allow it to be used to parse code in real-time in a text editor. There are bindings for using Tree-sitter from Node.js, Haskell, Ruby and Rust.
    We're in the process of integrating Tree-sitter into both GitHub.com and the Atom text editor, which will allow us to analyze code accurately and efficiently, paving the way for better syntax highlighting, code navigation, and refactoring. We'll demo some new features that Tree-sitter has enabled in GitHub.com and Atom, discuss some of the interesting algorithms that it uses, and share thoughts on some potential future applications.
    Speaker: Max Brunsfeld
  • Science & Technology

Comments • 34

  • @atrevino1989 11 months ago +7

    4 years old and still relevant, super nice!

  • @gloverelaxis 2 years ago +35

    This is a really, really, cool idea and approach. This is going to be **phenomenally** useful and will improve how *all* code across the world is authored by everyone. I really hope this can be integrated into every editor and web-based code display. Thank you so much for this essential labour you've expended on this. I'm actually shocked at how badly most "code editors" (actually glorified text editors) understand the code they're editing. This addresses a lot of the problems of treating code as text instead of ordered trees (which is what programs really are on a conceptual *and* literal/mechanical level). I hope one day we can move into a much more productive world where most code authors can use good tree-based tools and stop wasting their time dealing with confusing parentheses, forgetting semicolons, and naming things without using spaces.
    "Expand selection" was my favourite feature of Sublime that I've missed since migrating to VSCode. I love that you remember which child nodes you "expanded selection" from, so you can contract them back again without mistakenly choosing (eg) the first child! Also, that idea of using "expand selection" with multiple nodes is SO powerful and useful.
    Just blown away by this, bravo and thank you again!

    • @danielorodriguez1689 1 year ago +4

      It is not "it will"; it already does. It is already an important tool in my workflow.
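The "expand selection" behaviour praised in the comment above can be sketched in a few lines. This is only an illustration, not Tree-sitter's or any editor's actual implementation: it uses Python's built-in `ast` module as a stand-in for a Tree-sitter syntax tree, assumes Python 3.8+ and ASCII source, and the names `node_spans` and `Selection` are hypothetical. Expanding picks the smallest syntax node that strictly encloses the current selection; a history stack lets contraction return to exactly the span you came from.

```python
import ast


def node_spans(source):
    """Collect the (start, end) character span of every positioned AST node."""
    # Offsets at which each line begins, so (line, column) can become one index.
    line_starts = [0]
    for line in source.splitlines(keepends=True):
        line_starts.append(line_starts[-1] + len(line))
    spans = set()
    for node in ast.walk(ast.parse(source)):
        if getattr(node, "end_col_offset", None) is not None:
            start = line_starts[node.lineno - 1] + node.col_offset
            end = line_starts[node.end_lineno - 1] + node.end_col_offset
            spans.add((start, end))
    return spans


class Selection:
    """A selection that grows and shrinks along syntax-node boundaries."""

    def __init__(self, source, start, end):
        self.source = source
        self.spans = node_spans(source)
        self.current = (start, end)
        self.history = []  # earlier selections, so contract() can retrace them

    def text(self):
        return self.source[self.current[0]:self.current[1]]

    def expand(self):
        start, end = self.current
        enclosing = [
            (s, e) for (s, e) in self.spans
            if s <= start and end <= e and (s, e) != (start, end)
        ]
        if enclosing:
            self.history.append(self.current)
            # The smallest enclosing node is the next step outward.
            self.current = min(enclosing, key=lambda span: span[1] - span[0])
        return self.text()

    def contract(self):
        if self.history:
            self.current = self.history.pop()
        return self.text()


# Cursor placed inside the identifier "tax".
selection = Selection("total = price * (1 + tax)\n", 22, 22)
steps = [selection.expand(), selection.expand(), selection.expand()]
back = selection.contract()
```

Multiple cursors would simply be several such `Selection` objects expanded together.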

  • @noclaf78 5 years ago +54

    What a fantastic presentation. A very useful project and a great explanation of how it works!

  • @douglasmennella4525 1 year ago +2

    Just want to add another comment about how great a presentation this was.

  • @PatrickKellyLoneCoder 4 years ago +14

    We've been working on a remarkably similar concept. Hot damn. Actually glad I'm not the only one.

  • @SimonLeinen 4 years ago +12

    Awesome job, both the tree-sitter work and the presentation. Thank you! Just stumbled across this via a pointer on the emacs-devel list (in case anyone wonders :-). The one small thing I have an issue with was the critique of existing code-highlighting practices in the "motivation" part of the talk. In some cases I personally liked the "old-style" highlighting better than the treesitter-generated ones; in particular, I have a preference for NAMES OF NEWLY DEFINED THINGS to be highlighted, and that was what the "old" style did in the examples. In fact I think that EVEN OLDER algorithms did it the way Max prefers, i.e. type names one color, variable names a second color, etc.
    Anyway, that's a detail. This is REALLY COOL work with lots of potential. Also kudos for respecting the elderly (-: as shown here, old methods such as (G)LR parsing may still have untapped potential. Now excuse me while I get a case of fresh punch cards from the basement and try to write a tree-sitter grammar for LISP

    • @IanKjos 1 year ago +1

      Grammar for LISP? I see what you did there...

  • @astrosticks 1 year ago +17

    tree-sitter is now part of GNU Emacs :)

    • @mndtr0 1 year ago +4

      It is strange that something as cool as tree-sitter is used in only two major, popular code editors. I mean, only NeoVim and Emacs use it (if we talk about popular solutions), and other editors like VSCode, SublimeText4, etc. use regex-based syntax highlighting 🤢

  • @user-zw8uq1rj9m 1 year ago

    So informative and the idea is just mind-blowing! Thanks for your well organized presentations.

  • @satishgoda 4 years ago +4

    Absolutely amazing presentation (and software)

  • @melodyogonna 3 months ago

    Now Treesitter has given Neovim superpowers by allowing it to understand ASTs

  • @tompov227 4 years ago +9

    You know, VSCode is so popular right now, but I still love and use Atom as my primary editor. I hope that, now that Microsoft owns GitHub and Atom, they don't get rid of it in favor of VSCode. Also, I didn't know about this; the syntax highlighting always did look better to me in Atom, but I just assumed the themes were made better. I didn't realize all this tech was going on under the hood.

    • @chasecaleb 2 years ago +15

      Two years later and the day has come. What are you using now?

  • @_ashout 4 years ago +2

    Wow, this is crazy cool. Might have to give Atom a try after seeing this

  • @MrRobWalter 4 years ago +4

    This is basically getting default features of structural (aka projectional) editing into textual editing. Interesting.

  • @pankaj_jangid 4 years ago

    Fantastic work!

  • @samr.7515 5 years ago +2

    Fantastic

  • @mgetommy 4 years ago +1

    FANTASTIC

  • @rj00a 4 years ago

    Awesome!!!

  • @kaerafeliceoira-turner4164

    Looks interesting. Reminds me of Roslyn for C# in Visual Studio.

  • @IanKjos 1 year ago +2

    Slick! Leveraging GLR for error recovery is a great idea. Question, though: How would you handle a confused lexer? Consider the insertion of a quote mark near the beginning of a file...

  • @s1n7ax 3 years ago +16

    Treesitter ships with neovim now

  • @RasikaPereraGovinnage 5 years ago

    If you need to know what Tree-sitter is, jump to czcams.com/video/Jes3bD6P0To/video.html

  • @tretretretre747 4 years ago

    Thanks for your presentation. I am learning tree-sitter, but I have a problem. I am parsing source code from a file, and I need to get each function's name, not the node name. How can I do that? I have read all the docs but can't find the solution. Thanks.
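The distinction raised in the question above, a node's type name versus the name the programmer gave the function, can be illustrated with a small sketch. It deliberately uses Python's built-in `ast` module rather than Tree-sitter, so it shows the idea only and not Tree-sitter's API; `SOURCE` and `function_names` are hypothetical names chosen for the example.

```python
import ast

SOURCE = """
def parse(text):
    return text.split()

class Editor:
    def highlight(self):
        pass
"""


def function_names(source):
    """Return (node_type, declared_name) pairs for every function definition."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # type(node).__name__ is the kind of node;
            # node.name is what the programmer called the function.
            found.append((type(node).__name__, node.name))
    return found


pairs = function_names(SOURCE)
for node_type, name in pairs:
    print(f"{node_type}: {name}")
```

With Tree-sitter the same idea usually means locating the identifier child of the function-definition node and slicing the source text between that child's start and end byte offsets. In the C API that would typically involve `ts_node_child_by_field_name`, `ts_node_start_byte` and `ts_node_end_byte`, but which field holds the name (often `name`) depends on the grammar, so treat that as an assumption to check against the grammar in use.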

  • @gzoechi 4 years ago +2

    I'm not convinced.
    For auto-completion and error reporting, language servers already need to parse and analyze the code.
    I'd expect that to be much more efficient if information for syntax highlighting also comes from that same parser and analyzer.
    I'm not much into the Language Server Protocol, but I got the impression that semantic highlighting is already supported (even though not implemented for every language).
    LSP is also designed for incremental changes as far as I know.

    • @neikms 4 years ago +1

      Agreed, the main problem with LSP is that it communicates via RPC, which can be quite slow, especially for things that need to be re-highlighted on every keystroke.
      Tree-sitter can be embedded directly into an editor as a quick, standard (not necessarily correct) syntax highlighter.

    • @Megalcristo2 3 years ago

      Also, if there are errors on that line, you are going to lose the syntax highlighting or fall back to the basic one.

    • @jeetadityachatterjee6995 2 years ago +3

      I mean, as someone who has worked with both, LSP is way too slow for this job. Syntax highlighting is something that should happen in the editor, and quickly. Plus, tree-sitter is not making a value add like LSP does. People can and do whatever they want with the tree, which is something that LSP (at the moment) does not provide.

    • @CianMcsweeney 4 months ago

      LSPs are too slow; it was a stupid idea to have to run an entire server that uses JSON for a task that requires minimal latency. It should instead have been a standardized ABI that different compilers/interpreters implement, which instantly cuts out the majority of the performance concerns and also removes a ton of complexity. Treesitter is a project that aims to patch over the mistake of LSPs, since unfortunately it's too late to go back and replace them entirely.