Parsing Explained - Computerphile
Vložit
- čas přidán 12. 11. 2019
- How ambiguity is dangerous! Professor Brailsford simplifies parsing.
EXTRA BITS: • EXTRA BITS: How Chomsk...
Angle Brackets: • Angle Brackets - Compu...
/ computerphile
/ computer_phile
This video was filmed and edited by Sean Riley.
Computer Science at the University of Nottingham: bit.ly/nottscomputer
Computerphile is a sister project to Brady Haran's Numberphile. More at www.bradyharan.com
Hi everyone, I mentioned in the video that multiply over the positive integers was commutative but as some of you have pointed out, the more precise issue -- in terms of ambiguity side-effects from differing parses -- lies in the associative properties of multiply and divide . Thus, most languages will interpret 8 * 4 * 2 as being (8 *4) * 2 (i.e. left associative) but, in the case of *, the right associative version: 8 * (4 * 2) gives the same answer . This is not so for the / operation where left and right associative versions give different answers. In the absence of any parentheses, as in 8 / 4 / 2, the compiler's job is to enforce the default of: (8 /4) / 2 , i.e. left associativity, to get the correct result. I'm hoping to explain, at some stage, how compilers can do this .
Yeah but your shirt was amazing!
If I recall correctly, bison the parser generator allow designation of operator left/right associativity. But I am not sure about the inner working of it. After all, my favourite language APL has completely eliminated this headache.
Bought from Boden (UK) a few years ago. Sadly no longer available.
Thanks for realising that it *is* a shirt - and not pyjamas .....
Yes. I'm hoping in a later video to use the yacc (bison) parser generator to show how, by using left recursion, one can enforce left associativity. But you're quite right that yacc also lets you write overtly ambiguous grammars and gives you the ability to specify operator precedence with things like %prec
I wish I went to Notts uni now. You have a fantastic way of explanation!
"the robot stroked two furry dice"
--Professor Brailsford 2019
I paid £12.99 for that late night film in a holiday-inn.
V D underrated
But what did the Professor initially program the robot for? xD
The 2020 edition will allow people to insert words like "my" and "your" between "stroked" and "two".
Deep
Every time I watch one of these videos I thank the man that put these great men in front of a camera and made it available to anyone for free.
Grateful of being able to meet them and learn so much
After watching this, calling "programming language" a "language" makes waaay more sense than before.
First time I see a video from this guy and I liked it very much. He reminds me of those few teachers at college that explained very complex issues in ways that are easy to understand.
Big thumbs up! I spent months explaining to the language designers of a 3rd generation language in ICL in 1979 that their proposed syntax had loops / ambiguities! We used an automated tool called SAG (Syntax Analyser Generator) that generates a pseudo code for a little FSM to perform the parsing. It also allowed you to call ‘action routines’ to put the syntactic elements on a binary tree on the go.
The issue is not commutativity, it is associativity
Even further down ( or up ) the line of syntactical translation, it really comes down to contextuality. Do we merge, or do we move? The root of the issue in a nutshell.
Man, I love your videos, you remind me one (only one) of my teachers - his classes were so cool - like your presentations. Thank you
Always a good day when there is a new Prof Brailsford video on Computerphile!
This was excellent, very clear informative and interesting. I'd love to see more on this topic. Especially anything to do with the difference between Parse Trees and Abstract Syntax Trees, what they are, and what use they serve. Also, anything on the Shunting Yard Algorithm and its relation to top-down parsing would be great. Thanks!
I was a second year undergraduate in 1973 studying biology when I first met BNF. I had to take a 6 lesson course in Algol 60 and the lecturer started off with BNF before immersing us in the language. At the time I wrote BNF off as something I had to get through to complete the module. It was only later I discovered how very useful it is for analysing and learning programming languages (a lot of us back then ended up in computing for paid work, Biology graduates were ten for a penny) I’m even using it now to help me learn Hungarian. So thanks, Prof, for this walk down memory lane.
I would like to see the bnf for Hungarian!
@@WebMarketingStgs Me too; I’d love to see it in an actual language!
Some textbooks call it Backus-Naur and others call it Backus Normal form.
i keep thinking the seinfeld theme is going to play every time an item is popped off the stack
Yeah I was waiting for the funky bass to start. Always leave them wanting more, I suppose.
Among many other things I used to write data editors, that is tools that checked the validity of inputs. I used a grammar building tool with EBNF notation. It had a learning curve but I caught on. That grammar building tool taught me that defining a grammar is quite a job.
Thank you so much for the concise explanation! I'm taking advanced programming languages and data structures as courses right now and I hate the way they introduce parse trees. I spent over an hour reading and re-reading my required reading on parse trees and just couldn't understand it. I got more out of this video in 15 minutes than I did sitting with my thumb up an ungodly location parsing through the text.
Today I got an A from our (allegedly notoriously hard) algorithms exam. I've picked RPN and syntactical analysis. It was trough these types of videos from Prof. dfb that I've learned so much and sometimes much more then there was in our materials. Oftentimes these videos explain the concepts in a better and more concise manner than any lecture would. Thank your, professor, for what you have done for me and surely many more people trough these videos.
Parsing in the computer science is lexical analysis and grammar analysis. Such approach simplify the job. Lexical analysis is usually regular grammar done by finite automata with checking one symbol ahead to determine the next step it's input are characters and output lexeme structures. Grammar analysis often works on context free grammar and usually checking one symbol ahead to determine next step it's input are lexeme structures and output is often tree of parsed structures of the language.
Yay the Professor's back!
Couldnt have come at a perfect time, watching whil(st)e writing one for a compiler as I watch.
Whilste?
@@superscatboy While I work. /whistles
That part at 12:47 was so subtle. The hilarious part is that he is so used to Binary that it is easier for him to think in terms of "32 twos" instead of "Two 32s".
Also, "The robot stroked the fuzzy dice". Love it.
I think it has more to do with the order of the numbers. I suppose he could have said "32 times two" or "32 two times", but "32 twos" is a bit more natural.
I started computerlinguistics just weeks ago. We are learning those very things right now b
Thanks a lot, i am making my own programming language and this made me understand what a parser is, thanks you a lot magic man ;) ❤
I'm a simple man, I see Professor Brailsford, I click.
I'm a simple man too, I see comments about Professor Brailsford, I click.
I'm a simple man.
I'm a simple man, I see a simple man comment and I comment.
13:06 small mistake: mathematicians will tell you it's *associative*, because 8(4*2) = (8*4)2 is associativity.
@Paul Kovalov except the operation itself was still 8*4*2 in that order for both cases. the tree itself constitutes the parantheses.
Came here for this :)
Oh of course I was like an hour late on this. Just made a comment that said the exact same thing~
@Paul Kovalov The example he gives says that 8(4*2) = (8*4)*2, and he explained it using commutativity (a*b = b*a), but this is a mistake. This equality is because of the associativity of (*), which says that (a*b)*c = a*(b*c)
In other words, we can write a*b*c without parentheses because the multiplication is associative (and not necessarily commutative)
@@gwennaneliezer8490 you also said it wrong xd
This is my favorite type of ASMR.
I would love to hear Professor Brailsford read some of Shakespeare’s sonnets.
Yes and steer away from explaining anything technical
“Shall I compare thee to a summer’s day? Nay, thou dost not cause a case of melanoma!”
Came for the parsing; stayed for the tractor paper... (ok and clear and interesting teaching that made me smile as my brain hurt a bit from getting filled)
This is a much better description then the boring infix calculator example I had to learn in college.
13:05 the property that matters here isn't commutativity (you can multiply a * b and b * a and get the same outcome - i.e. order of values doesn't matter), it's associativity (you can mutiply (a * b) * c and a * (b * c) and it gives the same outcome. I.e. parentheses do not matter or, equivalently, parse tree structure doesn't matter)
13:06
It is not because a*b = b*a (commutation) that we can do the operation on either way, it is because (a*b)*c = a*(b*c) (which is associativity).
A great way to play with parsing is using Raku (née Perl 6) that has a built in grammar parser (basically, regex on steroids) that makes implementing BNF, etc, very easy.
When I use the divide sign on my calculator the result of 8:4:2 is 1 and when I use the fancy 2d fractions it results in 4, but a single pixel indicated the grouping, so I can force it into being 1 again, if I use a weird way to construct the fraction.
I would say that 8/4/2 is indeed 1 in the absence of brackets, because then it would be evaluated left to right.
Well, 8/4/2 is ambiguous anyways. Give this to a mathematician, and he will say it doesn't have much sense since this notation is not well defined
If you think of divide as multiply by the reciprocal then with fractions it becomes obvious:
8 ÷ 4 ÷ 2 = 8 × 1/4 × 1/2
and now you can do either multiply first:
(8 × 1/4) × 1/2 = 2 × 1/2 = 1
8 × (1/4 ×'1/8) = 8 × (1×1)/(2×4) = 8 × 1/8 = 1
In a fraction the line between the numerator and denominator does mean divide, but what is never overtly taught is with a/b/c is the b a denominator of a/b which then becomes the numerator for c or is it the numerator of b/c which then becomes the denominator of a. The convention is to work left to right with most operators (of the sane precedence to get (a/b)/c; power is the opposite (working right to left) with a^b^c meaning a^(b^c) not (a^b)^c.
This video needs to be seen by anyone interested in tools like ANTLR and Xtext but with no knowledge of parsing
If you're wondering how the described mathematical associativity could have relevance in a normal english sentence to highlight parser ambiguity, the classic "Eats shoots and leaves" joke demonstrates how english sometimes relies on punctuation to hint the correct associativity - without it, the exact same trouble arises and you'll get a different meaning if you build your "tree" in a different order.
And of course, "Time flies" is a common example of the difficulty of parsing spoken language. A longer example is the contrast: "Time flies like an arrow," and "Fruit flies like a banana."
In fact an earlier Professor Brailsford video discusses this exact example :) >Sean
I like garden path sentences am delightful
While Bob ate an apple was in the basket
In writing, make sure to put commas where they belong and in maths make sure to put parenthesis where they belong to avoid parsing difficulties :)
reflexive: x=x
symmetric a=b -> b=a
transitive: a=b and b=c -> a=c
irreflexive: ~(a
So we would ask, ‘what are option of choices?’ when parsing?
12:27 I have created class of octonions overriding multiplication symbol. The two trees give you different results because of non-associativity of octonions i.e. (A*B)*C ≠ A*(B*C).
What do you mean 'it makes sense'? How do you define 'sensible'?
i love this dude
Professor Brailsford - You could explain Quantum Physics to a Kindergartner and they would understand it. You are the master of story telling and explanation.
HOly hell I wish this would have been shown to me in first grade math!
13:05 Commutation has nothing to do with this. The ambiguity here does not matter because multiplication is associative.
When I read the title I immediately thought the Professor will be talking of compilers. It was as if he read my mind.
computer science && parsing => compilers. it's pretty obvious, da?
@@menyasavut3959 Menya Savut Kolya. =)))
@@londonnight937 Привет, Коля, приятно познакомиться
If the ambiguity leads you to two outcomes that would be unequal, then the parse description isn't properly created. I prefer EBNF with slight personal modification to standard BNF....it denies ambiguity from the start.
1:29 redirect to "Angle Brackets" --> redirects to --> "Chomsky Hierarcy" & "Finite State Automata"
14:53 wow, the extra bits part seems interesting
> _"john, now that you're aware that this..."_
i vote for longer video 8)
What about Chomsky?
Context needed!! Your question is context free.
The Indo half of the Indo-European languages tend to be SOV, like Hindi with the verb last. German is actually V-2, which means the verb always comes second, but other parts can move around a bit. That's for very simple three-word sentences. In more complex sentences with more than one verb, it's all the verbs except the main verb that move to the end. Even this is a gross simplification too of course.
irish gaelic is generally vso and i think every sentence form begins with the verb
And then there are Slavic languages (like Czech I use) that use flexion (slight changes in word form) and otherwise have a lot of freedom when it comes to word order (for example, using it to indicate stress).
@@Jamie-st6of That is a Punic (Semitic) substrate. It found ist way into Icelandic, where the verb can appear on Position 1 or 2 without difference in meaning.
And of course, in OSV Yoda speaks.
@@menachemsalomon And The writers of the Torah often used VSO, as illustrated by Genesis 1:1
“In-beginning created Elohim the-heavens and- the-earth.”
In- = “be-“ as a prefix
= “et”
The- = “ha-“ as a prefix
And- = “ve-“ as a prefix
And-= “ve-et”
"If the answer is 1 demand your money back"
I'd demand my money back if it were 4. Division is left-associative.
Oh, but if 2^3^4 comes out as 4096, you should demand your money back. Exponentiation is right-associative :D
Yes, mine came out = 1
I think he said "If the answer *isn't* 1 ...", but it's difficult to hear.
Quite so! I said "isn't" ....
Why cut off when it was just getting interesting? I wanted to hear about Chompsky.
A mathematician would actually tell you that * is associative. Not commutative. Associativity is what gives you this property.
Multiplication is also commutative, but I agree, the issue in this case is associativity.
@@slash_me yeah it's just that commutativity doesn't matter here at all. This would be equally true for matrices or quaternions, say. But not for octonions.
I wish this professor was my grandpa
Actually, the property of multiplication that you are using is associativity, not commutativity.
Minor error: In the given example it is important that multiplication is ASSOCIATIVE, the presenter said COMMUTATIVE.
I just love the Professor; he’s like the “𝑗𝑜𝑙𝑙𝑦” 爪ᗩ𝕊Tᗴ尺 𝚈𝓞𝔇ᗩ of Computer Science fundamentals!
This extra video ("EXTRA BITS: How Chomsky Fits In - Computerphile") quite delightful you might find ;-)
Didiodu Law I unfortunately, cannot find an Extra Bits video on Chomsky; only the original Computerphile episode on Chomsky hierarchies. :( If you have a link please consider posting it.
Unfortunately this is extremely complex for me. I hope i can understand better in time :)
C is awesome
Operator precedence.
The tree structures would be simpler if they had a * as the root, rather than E.
The Dragon book condensed into fifteen minutes. Maybe not quite.
Not even close.
Is this why languages read right to left?
Thumbs up for the Chomsky mention!
Interesting
Parsing, yes
@MichaelKingsfordGray Parson's nose
Parsing has very little to do with compiling. Parsing has a great deal to do with how to live a good life.
removing bones from fish
the man was holding himself not to talk about compilers lol
What a total ledge
i wish it was "furry balls" instead of dice. so much funnier :D
That robot lost no nut November
The example sentences are correct (the obey the Grammar rules), but they are not valid, because there is no situation in reality where they would be uttered. That at least how linguists put it. Speaking of linguistics, the tree model is not the way, language actually works.
It's because really languages aren't context free.
> Speaking of linguistics, the tree model is not the way, language actually works.
A parse tree, or grammar tree is a representation of the concept of the generative grammar which was developed by Noam Chomsky. See also the Chomsky hierarchy of grammars to see were the computer scientists got their ideas from.
@@Conenion Bullseye! And it is dismissed by , say, 99 percent of all linguists. It already was, when I studied linguistics decades ago. It might be usefull for primitive things like XML, DOMs and modern programming languages, but natural language is 3.2 million years ahead of that.
The property of multiplication that is shown is not commutativity (a*b = b*a) but associativity: (a*b)*c = a*(b*c)
50years experience to make token
i dun understand a single thing he is saying ...
I parsed your sentence and analysed it. Compiler Error encountered at "dun". You mean "do not" or "don't"? Or perhaps "did not" or "didn't"? We can probably ignore the lack of capitalisation.
@@another3997 wouldn't it be a parsing error?
Here is a good example of where grammar matters in your writing. Commas matter.
Let’s eat, Grandma.
vs.
Let’s eat Grandma.
Even capitalisation matters: consider the difference between
Peter helped his Uncle Jack off the horse.
and
Peter helped his uncle jack off the horse.
OMG
The first domain I ever registered was parsed.net... worst mistake ever. People would always type my email incorrectly, or ask me to explain what it meant.
So now Robot can Bite Dogs
"if the answer isn't 1 demand your money back"
~ 1:00 I am sorry, but I am afraid Prof. Brailsford mixes up verbs and predicates here. "John saw the man running." A verb at the end, but it is not the predicate, that is ok, even in English ;). So anywhere the word "verb" comes up in this vid: what's really meant is the predicate.
Generally languages are defined SOV SVO etc. You'll see that in grammar books. It doesn't matter that there are modifying elements within those structures. For the same reason the article a/the modifying robot is all part of the subject, running man is all part of the object. Then you can drill down further to find more specific structure.
This was something that confused me when I first started learning a SOV language, because I wondered how they make sentences with more than one verb if it always has to come at the end. Unfortunately it's a problem of semantics.
@@CaptainWumbo it's SPO or SOP. Again, it's the predicate that describes part of a sentence' structure. The term verb describes a class of words, just like the term noun. I haven't seen an actual grammar book that uses 'verb' to describe the predicate. Maybe you can name the book title?
DiAL033 Page 16 of A Dictionary of Basic Japanese Grammar, Seiichi Makino and Michio Tsutsumi, part one of a series of 3 pretty deep grammar books.
It could be that they chose the more common and widely understood word to get across the point, but at the same time they're not afraid to throw words like copula at you which are just as poorly understood. I hear SOV SVO in plenty of videos as well from people repeating what they've read. It could be that predicate is more technically accurate.
My book here does define and make use of the word predicate (and core predicate) in many places. And as you say it doesn't have to be a verb, although much of the time it is.
I think maybe the point SOV gets across is to answer the question where does the verb go, rather than to describe all sentence structures.
I'm so adjective, I verb nouns.
top down bottom up
I tried doing the math problem with my calculator and couldn’t get it .....
Sexo na praia.
He looks like he went to work in his Pyjamas
Two furry dice bit a dog.
They are flying airplanes. Are they pilots or are they airplanes?
Yes.
Just finished a parser, lol
@MichaelKingsfordGray yup, found this video after my parser was finished, haha. I spent my weekend rebuilding a syntax analyzer, and this really could've been a help
Ah yes, furry dice.
I'm getting nightmares of CFGs from my misspent youth taking CS courses. lol
From the cover picture I expected to see a video of the professor picking his nose. Was disappointed.
That footage is only available on the Director's Cut Special Edition DVD ;)
Do I need to be a furry to roll those dices?
yacc
coming soon (I hope) !
@@profdaveb6384 excellent^2
*notices furry dice*
what's this? OwO
its fun listening to him say paaaaahhhzzeng instead of parsing lol
He's saying "parsing" instead of "parsing".
hello
nth comment
(n+7)th comment
@@downstream0114 (n+infinite) th
Whats up people
Boring....... but important tho 😐
no whatsapp
templeos