Can ChatGPT Pass the Oxford University Admissions Test?

  • Published 11 May 2023
  • Oxford Mathematician Dr Tom Crawford puts ChatGPT through its paces with the Oxford Maths Admissions Test. Is the AI chatbot clever enough to pass the entrance exam?
    Accompanying Maple Learn worksheet: learn.maplesoft.com/doc/z4tp3...
    Sign-up for Maple Learn Premium using the code TOMROCKSMATHS for a discounted subscription. Head to getlearn.maplesoft.com/ for more information.
    The exam is the 2021 Maths Admissions Test (MAT) which is taken by candidates applying to study Undergraduate Maths at the University of Oxford. The syllabus is based on material from the penultimate year of high school, which in the UK would mean the first year of A-level Maths.
    You can download the exam paper here: tomrocksmaths.files.wordpress...
    And the mark scheme is here: tomrocksmaths.files.wordpress...
    Watch Tom take more exams via the designated playlist here: • Exams
    A-level Maths: • Oxford University Math...
    A-level Further Maths: • Oxford University Math...
    GCSE Maths: • Oxford University Math...
    GCSE Further Maths: • Oxford University Math...
    SAT Maths: • Oxford University Math...
    Cambridge University Admissions Test (STEP Paper) Part 1: • Oxford University Math...
    Cambridge University Admissions Test (STEP Paper) Part 2: • Oxford University Math...
    Check your working using the Maple Calculator App - available for free on Google Play and the App Store.
    Android: play.google.com/store/apps/de...
    Apple: apps.apple.com/us/app/maple-c...
    Find out more about the Maple Calculator App and Maple Learn on the Maplesoft YouTube channel: / @maplesoft
    Produced by Dr Tom Crawford at the University of Oxford.
    Tom is an Early-Career Teaching and Outreach Fellow at St Edmund Hall: www.seh.ox.ac.uk/people/tom-c...
    For more maths content check out Tom's website tomrocksmaths.com/
    You can also follow Tom on Facebook, Twitter and Instagram @tomrocksmaths.
    Get your Tom Rocks Maths merchandise here:
    beautifulequations.net/collec...

Comments • 329

  • @TomRocksMaths
    @TomRocksMaths  1 year ago  +46

    Don't forget to check out the accompanying FREE Maple Learn worksheet: learn.maplesoft.com/doc/z4tp3rbsw6

    • @anubhavpanchal1339
      @anubhavpanchal1339  1 year ago  +2

      Try the ISI UGA and UGB from India. They're for admission to ISI after high school, and their questions are amazing; if you only have time for one of them, do the UGB.
      UGA and UGB together form the entrance test for the Indian Statistical Institute: in UGA you're given 30 objective questions worth 4 marks each to solve in 2 hours, and in UGB you're given 8 questions worth 10 marks each to do in 2 hours.
      There's an interview after that, and at the end they select at most 50 students for admission.

    • @deadkachu7678
      @deadkachu7678  1 year ago

      Bro, I took my SQL test with it and got only 54% because of ChatGPT. I would have scored more if I hadn't taken ChatGPT's help. In simple words, ChatGPT goes onto the web and finds an algorithm similar to the question.

    • @deadkachu7678
      @deadkachu7678  1 year ago

      Ruined my life; wanted 100% but failed eventually.

    • @AlleyKatPr0
      @AlleyKatPr0  10 months ago

      I think if you phrased it like "the following question has one of 4 answers... the question is [insert question] and the answers can only be [insert answers]", then you'd find it comprehends the logic faster.

    • @peterbenoit5886
      @peterbenoit5886  8 months ago

      I stayed through the ten parts of the first question and (not surprisingly) got 40/40 points. I'll come back to questions 2-5 when I have more time. GPT looks like a brute-force "thinker," though not an overly clever one.

  • @notnek12
    @notnek12  1 year ago  +1145

    ChatGPT is very good at convincing people that it knows what it's talking about even if it's talking complete nonsense, and I think it tricked Tom in some of the "show that" questions 😂

    • @facts-ec4yi
      @facts-ec4yi  1 year ago  +76

      It actually is a very useful tool, to be fair. It definitely struggles with maths a lot more than with other subjects.

    • @notnek12
      @notnek12  1 year ago  +37

      @@facts-ec4yi Yes it's great. Just this week it has helped me design a rotation algorithm for a badminton club and also gave me a detailed itinerary for a holiday I'm planning.

    • @facts-ec4yi
      @facts-ec4yi  1 year ago  +5

      @@notnek12 The future's exciting!

    • @milanek1527
      @milanek1527  1 year ago  +5

      Yeah GPT 4 fixes some of those issues though

    • @alansmithee419
      @alansmithee419  1 year ago  +7

      @@facts-ec4yi it seems to be very good at maths. It just gets answers wrong anyway because it's atrocious at arithmetic.

  • @willbutplural
    @willbutplural  1 year ago  +753

    I would recommend trying GPT-4! Its logical reasoning skills are leagues better than the version you are currently using.

    • @Vinvin5114
      @Vinvin5114  1 year ago  +55

      Don't forget to start your prompts with "As a math expert, solve ..."

    • @adzx4
      @adzx4  1 year ago  +127

      I don't understand why you would even make this video without using gpt4 in the first place

    • @YOOOOOOOOOOOOOOOOOOOOOOOOOOOO
      @YOOOOOOOOOOOOOOOOOOOOOOOOOOOO  1 year ago  +5

      @@adzx4 Can regular people even get access to it currently? I thought you had to be put on a waiting list and it costs money for every prompt.

    • @adzx4
      @adzx4  1 year ago  +20

      @@YOOOOOOOOOOOOOOOOOOOOOOOOOOOO Regular people can, you just need to pay for premium. It's not very expensive, $30 a month.

    • @milanek1527
      @milanek1527  1 year ago  +49

      ​@@adzx4 So the answer is no. A regular person isn't paying that for some AI they use once a month.

  • @sofiaknyazeva
    @sofiaknyazeva  1 year ago  +18

    "^" means exclusive-OR (XOR) in many programming languages. I've seen ChatGPT evaluate "^" sometimes as a power and sometimes as XOR. It's better to write something like pow(x, y). In LaTeX, however, "^" means a power.
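A quick, concrete illustration of the ambiguity described in the comment above (a minimal sketch; Python shown, since Python, like C-family languages, uses `^` for bitwise XOR, while LaTeX uses `^` for powers):

```python
# In Python, "^" is bitwise XOR, not exponentiation.
xor_result = 2 ^ 3       # binary 10 XOR 11 = 01
power_result = 2 ** 3    # exponentiation is spelled **
also_power = pow(2, 3)   # or the built-in pow()

print(xor_result, power_result, also_power)  # 1 8 8
```

Writing pow(x, y) explicitly, as the comment suggests, removes the ambiguity regardless of how the reader interprets the caret.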

  • @DavidWhite-rh8gg
    @DavidWhite-rh8gg  1 year ago  +15

    It normally messes up basic maths but has good logic. With GPT-4 and its Mathematica plugin, I found it was able to score around a 2:1, and even a 1st in some of the old modules I took in my physics degree; I had some old past papers and thought it would be a fun little test.

  • @dragoda
    @dragoda  5 months ago  +1

    Your video on Gabriel's trumpet was amazing. Still my favourite so far.

  • @RC32Smiths01
    @RC32Smiths01  1 year ago  +86

    It's interesting seeing how far ChatGPT has come in terms of utilization across many contexts and platforms. Cheers for the lengthy and detailed showcase, man!

    • @yuseifudo6075
      @yuseifudo6075  8 months ago

      And it's so sad 😢
      AI should be sealed away before it's too late

    • @kaplingnag7267
      @kaplingnag7267  1 month ago

      ​@@yuseifudo6075 AI can give humanity numerous benefits.

  • @My-Say
    @My-Say  1 year ago  +2

    An old manager of mine was Tom Crawford. He was from the Pittsburgh area of PA.

  • @AJBTemplar
    @AJBTemplar  11 months ago  +9

    Interesting experiment. Would very much like to see this done again with latest version 4 and maths plug in. (I studied maths at Oxford as post grad and am astounded at how rusty I am when I watch your vids Tom).

  • @scragar
    @scragar  1 year ago  +13

    I think for 1E it worked out the probability 3 ways and added them together; that's why it got 45/64 instead of 15/64.

    • @johnshortt3006
      @johnshortt3006  1 year ago  +1

      The equations for x and y only have 1 solution, yet it found 3.

    • @fplancke3336
      @fplancke3336  10 months ago  +1

      ​@@johnshortt3006 Yes, 2 of the 3 solutions are spurious; interestingly, those 2 spurious solutions involve negative numbers, so it could have discarded them and landed back on its feet...

  • @rasmus8408
    @rasmus8408  1 year ago  +33

    You should try this once GPT-4 becomes available! It has vast improvements. Also, you'll be able to give the program pictures/screenshots of the questions so you don't have to type it all in 😆

    • @thesuperenderdragon
      @thesuperenderdragon  1 year ago  +3

      I ran it through with GPT-4 and the Wolfram plugin and it got the multiple choice all right.

  • @Foggydew931
    @Foggydew931  1 year ago  +2

    It's actually really good for helping you with some maths problems

  • @thesuperenderdragon
    @thesuperenderdragon  1 year ago  +2

    I just ran through all the multiple choice with GPT4 and the Wolfram Plugin and it got them all right the first time.

  • @user-pv5hd1vu1t
    @user-pv5hd1vu1t  1 year ago  +2

    13:30
    I'm not really a mathematician, but rather an aspiring applied linguist but I will give this a shot.
    When trying to set up the simultaneous equations, it got the second equation wrong (the x and y are meant to swap around)
    What it wrote (12:33):
    x + 3y = 10
    y + 2x = 8
    (this leads to the solution x = 14/5, y = 12/5, where neither is an integer and the third condition, x + y = 6, is not satisfied, given there are six vectors)
    What it should have written down:
    x + 3y = 10
    x + 2y = 8
    The second mistake it made was treating the '3 possible solutions' as actual solutions and trying its best to use the binomial formula with all 3. This is a system of linear equations; there should be only 1 solution. It didn't bother to reject any of the spurious solutions. (13:01)
    It deduced from its incorrect working that 3y and x need to be even. It chose 'correct y values' that satisfy this condition, but sometimes I have no idea how it got those x-values. It got (4,0) as a solution by using the condition that y needs to be even, so y = 0, 2, 4 and substituting y = 0 into the second equation gives x = 4. I do not know what it did with y = 2, 4 and why it only considered those 3 y values. (12:53)
    Later on, it did interpret the binomial coefficient for P(X=4, Y = 0) as 6 choose 4 [which does end up being the right one to choose as this leads to 15/64, the correct soln]. However, pulling 6 choose 4 like that from seeing 6 vectors and x = 4 sends a few red flags as in this case y should equal 2 for x + y = 6 (11:10)
    However, it treated the other two as valid solutions and tried to sum all 3 which leads to another handwavy 6 choose 2 and 6 choose 4 being calculated with both being equal to each other due to the symmetry of Pascal's triangle. (11:14)
    This led to ChatGPT's answer being three times too big. (11:16)
    It's pretty funny looking at its working.
    In a way, it sort of got it right with incorrect logic. It tried to insert 6 choose 4 or 6 choose 2 somewhere out of its incorrect working. It tried its best to do sensible things, but the logic was not quite there.
    In summary, it made a mistake by writing the second linear equation incorrectly. However, that was not its only mistake. It did not use correct logic based on the incorrect simultaneous equations to get (6 choose 4) or (6 choose 2), even though those were the correct binomial coefficients to use. Lastly, it failed to recognise that a pair of independent simultaneous linear equations has exactly one solution.
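The corrected working in the comment above is easy to check mechanically. A minimal sketch (assuming, as in the comment, that x counts choices of the vector (1,1) and y counts choices of (3,2) over 6 fair coin flips):

```python
from fractions import Fraction
from math import comb

# Solve x + 3y = 10 and x + 2y = 8 by elimination:
y = 10 - 8         # subtracting the second equation from the first leaves y = 2
x = 8 - 2 * y      # back-substitute: x = 4
assert x + y == 6  # six vectors are chosen in total

# Probability of exactly 4 choices of (1,1) among 6 fair flips:
probability = Fraction(comb(6, 4), 2 ** 6)
print(probability)  # 15/64
```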

  • @ptyxx
    @ptyxx  6 months ago  +4

    So I plugged some of these questions (around 15) into gpt4 and it got all of them correct

  • @OR-hl5lc
    @OR-hl5lc  9 months ago  +5

    I did the first part of the test (multiple choice) with GPT4 twice. Once mentioning the solutions, once not mentioning them. GPT4 performed significantly worse when not knowing the possible answers. Which makes sense, a student would be the same way.
    With answers given, it had only 2 questions wrong. Without answers given, it failed 5.
    Edit: Also, GPT has a character limit to its responses. So when it just stops half way through an answer, you can say "go on" and it will continue where it left off.

  • @simoncouture2970
    @simoncouture2970  1 year ago  +1

    ChatGPT is able to render equations with the same appearance as those presented on paper or by hand. However, you absolutely must specify that you want this kind of notation, rather than keyboard notation that is very complicated to read. Just copy and paste the following text:
    -Now, if you need to write a mathematical expression, use KaTeX notation and
    follow these rules:
    1. If it is a block equation, display it in a single P element and wrap it in double
    dollar signs, like this:
    $$e=mc^{2}$$
    2. If it is an inline equation, use KaTeX's double backslash and parenthesis notation,
    like this: \(e^{i \pi}-1=0\).-
    Another thing to keep in mind is that ChatGPT-4 is vastly better than ChatGPT-3.5. Keep in mind that version 3.5 is essentially a demo, and the vast majority of errors are fixed in version 4. I subscribe to version 4 myself and want nothing more to do with version 3.5!
    Otherwise, great video! Very interesting!

  • @johnchessant3012
    @johnchessant3012  1 year ago  +25

    Really interesting experiment! Some of its mistakes are fascinating, like at 6:33 it correctly identifies the substitution that turns it into a quadratic equation, but ends up switching what y and y^2 should be; I wonder what a model trained specifically for math could do. What are your thoughts on how ChatGPT or automated theorem provers like Lean could transform research-level math in the next few decades?

  • @demeterflores8937
    @demeterflores8937  1 year ago

    Love your videos Sir Crawford. ❤Love from Spain.

  • @milesdouglas3443
    @milesdouglas3443  1 year ago

    So interesting, loved the video.

  • @ritwik3023
    @ritwik3023  1 year ago  +2

    ChatGPT is an AI language model trained on words and phrases, not really on mathematical equations. If you ask it a question, it searches its whole database for an answer; it doesn't actually calculate the equation itself, and if it's given an equation it doesn't find in its database, it will just say it can't solve it as it's an AI language model.

  • @euclid2718
    @euclid2718  1 year ago  +18

    Thanks Tom. Love your content, and this was lots of fun. Made me think for a moment that the Turing Test should incorporate some of these Oxford Uni maths questions, and then I realised that most humans would score far worse. So far I've found GPT-4 a step up from ChatGPT, but maths still seems a (relative) soft spot in the AI.

    • @christophniessl9279
      @christophniessl9279  10 months ago  +1

      Well, AIs must be trained as well, and you just can't learn maths by reading through lots of literature and articles, sorting them in some very high-dimensional vector space, and retrieving them by applying lots of linear algebra algorithms (at least that is what pattern recognition does; other parts may be different, but I don't think so). Neither do you learn maths by knowing the axioms and definitions. You learn maths by doing the computations and symbol manipulations by hand until the mathematical objects take on a life of their own in your mind, with unique characteristics and behaviours.
      It would be interesting to let an AI do all the exercises from several maths textbooks, beginning with adding numbers up to 10, and feed the results back into its own self-learning.
      By browsing through heaps of texts and sorting by keywords you get a good storyteller, but not a problem solver.

  • @My-Say
    @My-Say  1 year ago  +34

    Pretty interesting. I sometimes use it for VB programming and it writes really good code. You need to be as detailed as possible and it always gives me correct results. I'm very impressed by it. I have been entering some of my old code and it is simplifying it for me, saving me a great deal of time.

    • @nickrobinson7096
      @nickrobinson7096  1 year ago  +1

      Same! Saves so much time!

    • @CheeseBlackOps
      @CheeseBlackOps  11 months ago  +1

      @@nickrobinson7096 It does sometimes make python libraries up out of thin air for some reason

    • @isaacclark9825
      @isaacclark9825  10 months ago

      I find that ChatGPT frequently does get things wrong when coding. I can often get it much closer by asking follow-up questions related to what was previously wrong. You may not be asking it to write code that is very complex.
      Also, if you tell ChatGPT it is wrong, it generally folds and apologizes pretty quickly.

    • @CentaurisNomadus
      @CentaurisNomadus  9 months ago

      ​@@isaacclark9825 So it is actually cool, man. Programming is becoming more accessible to the masses, then.
      I think it is great for doing the "dull" parts of your work, assuming you scan the output for obviously wrong parts. But I am talking about a non-programming, mostly educational experience.
      P.S. I am a management major who watches coding tutorials for free and doesn't know what to start with. Now I am leaning towards making a website with original features, and probably games in it, and doing it before 2026 💀

    • @JansthcirlU
      @JansthcirlU  2 months ago

      I pity you for having to write VB, hope you can switch to C# one day!

  • @Yorichii_109
    @Yorichii_109  1 year ago  +5

    My dream is to get admission to Oxford; I am still trying my best to succeed... Huge fan, I love your explanations... keep up the good work 🎉

  • @jkadoodle
    @jkadoodle  7 months ago  +1

    I used this to help a friend answer medical quiz questions, such as suggesting treatments based on symptoms. It was right about 90% of the time.

    • @thewhat2
      @thewhat2  6 months ago

      Can't even pass an entrance exam smh.

    • @zugzwangelist
      @zugzwangelist  4 days ago

      10% of the patients died.

  • @isaacclark9825
    @isaacclark9825  10 months ago

    Once you find that the central angle is 30°, the area is 12 × (1/2)ab·sin 30°, where a and b are both radii: 12 × (1/2 × 1 × 1 × 1/2) = 3. This is ninth-grade math. GPT's error was using 30 degrees (and a special triangle) instead of 15 degrees for the half-triangle; that is why it got the height wrong.
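The arithmetic in the comment above can be double-checked with the general formula: a regular n-gon inscribed in a circle of radius r consists of n isosceles triangles, each of area (1/2)r²·sin(2π/n). A minimal sketch:

```python
import math

def inscribed_polygon_area(n: int, r: float = 1.0) -> float:
    """Area of a regular n-gon inscribed in a circle of radius r:
    n isosceles triangles, each with area (1/2) * r^2 * sin(2*pi/n)."""
    return n * 0.5 * r * r * math.sin(2 * math.pi / n)

area = inscribed_polygon_area(12)  # 12 * (1/2) * sin(30 degrees) = 3
print(area)
```

For n = 12 this gives exactly 3 (up to floating-point error), which is also comfortably below π, the area of the circumscribing circle.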

  • @marionascimento450
    @marionascimento450  7 months ago  +1

    15:52 To be fair, GPT said "(...) also passes through the point (2,0) for two values of a, namely (...)". You could interpret that as "it passes through for another value that I derived before, but also for these two" :D

  • @sergeipravosud1848
    @sergeipravosud1848  9 months ago

    The same thing happened with the Russian Unified State Exam (ЕГЭ). ChatGPT wasn't able to cope with that test, especially the second part of the exam, where detailed answers are required: a complex trigonometric equation, a complex logarithmic inequality, an economics problem (credit), stereometry, geometry, an equation with x and an unknown parameter, and finally a problem from number theory.

  • @JT-vy7nk
    @JT-vy7nk  9 months ago  +4

    Particularly in the multiple-choice section, it would have been interesting to prompt the model in the identical way the question is asked for humans, alongside giving the options, and to give the exact representation of the integral in LaTeX notation rather than describing it in English. My experience is that it varies wildly in GPT-3.5 but is significantly more robust in GPT-4!

    • @georgeparnell6026
      @georgeparnell6026  7 months ago  +1

      Yeah it understands LaTeX formatting nicely, and I think giving the options would force it to work towards one of them, which would definitely increase the scores

  • @LogosNigrum
    @LogosNigrum  10 months ago

    On the problem at 46:40, it is true that b and d are zero, and that is the requirement for a polynomial being an even function. It's just crazy it didn't explicitly show it by expanding f(x) using that prior form.
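To spell out the point in the comment above: for a quartic f(x) = ax⁴ + bx³ + cx² + dx + e, evenness f(x) = f(−x) holds exactly when the odd coefficients b and d are zero. A small numeric sketch (the coefficient values are arbitrary, chosen only for illustration):

```python
def quartic(x, a, b, c, d, e):
    return a * x**4 + b * x**3 + c * x**2 + d * x + e

# With b = d = 0 the odd-power terms vanish, so f(x) = f(-x):
for x in (0.5, 1.0, 2.0, 3.75):
    assert quartic(x, 2, 0, -1, 0, 5) == quartic(-x, 2, 0, -1, 0, 5)

# A nonzero b breaks the symmetry:
print(quartic(1, 1, 1, 0, 0, 0), quartic(-1, 1, 1, 0, 0, 0))  # 2 0
```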

  • @leonsteed606
    @leonsteed606  1 year ago  +4

    I think it got question 1C correct for the wrong reason. It came to the conclusion that a = p - ln2 and b = q - ln2, and therefore that p - a = q - b (since both are ln2). In actual fact a = p - 1 and b = q - 1, and hence p - a = q - b, but because both are equal to 1, not ln2.

    • @Rodhern
      @Rodhern  1 year ago  +1

      I think you are absolutely correct there sir. It is interesting to see that complex parts of the questions are broken down into simpler pieces, the very easy parts are answered, and then the 'medium easy' parts, ChatGPT just goes "set y = 0 and solve for x" and voila - some rather random answer emerges!

  • @samayahone3497
    @samayahone3497  1 year ago  +15

    Did I just watch an hour and twenty minutes of a guy making Chat GPT do maths instead of revising for my exams? Yes. Yes I did. Do I regret it? Not one bit.

    • @notreallydavid
      @notreallydavid  1 year ago

      Came for the sums, stayed for the hair?

    • @TomLeg
      @TomLeg  1 year ago

      Good luck with your exams!

  • @kurtkeissingermckenzie4290
    @kurtkeissingermckenzie4290  5 months ago

    Wow I spent my Christmas afternoon watching. I found this video riveting. 😂

  • @jannord1546
    @jannord1546  1 year ago  +2

    I used it for programming and it's sooo helpful. I don't know if I would have needed 10x more time for the same task without it, or if I would even have been able to do it at all.

  • @asherang7
    @asherang7  10 months ago

    I'm curious to see you try the Singaporean Further Maths exam and see how it compares to the one in the UK.

  • @ryanjohnson2844
    @ryanjohnson2844  1 year ago  +3

    One very big thing I think you might've missed out on is correcting the mistakes, for example making a follow-up query such as "I'm not convinced by your algebra after step ..." or "can you explain why you did ... in step 3". The follow-up queries are what have helped me learn a lot while cheating on my homework, since they make you understand the question deeply enough to know which part of the response is even the answer.

  • @TheVoitel
    @TheVoitel  10 months ago  +1

    The first one: ChatGPT simply took the solution for the hexagon (where you really do get equilateral triangles) and replaced the factor 6 with the factor 12. Which is nice, but 3sqrt(3) is definitely larger than pi. In fact, looking at the options, all but b) and e) are larger than pi. Basically ChatGPT is saying: I know a solution to a similar problem, and I will solve this one by simply changing each occurrence of the number 6 to 12.
    Also for E) (the combinatorics one): ChatGPT used the equations x + 3y = 10 and y + 2x = 8; it should have been x + 3y = 10 and x + 2y = 8, which gives the correct solution y = 2 and x = 4, which then gives the correct answer (1/2⁶)(6 choose 2) = 15/64.
    G) is one heck of a question. It can be solved by including sin(0)² (which is 0) and recognising that the sum is the same as summing over the cosine squares (just in reverse direction), so 2A = 1 + 1 + ... + 1 = 91. This could be generalised: let µ be a measure on [0, pi/2] that is symmetric around pi/4 (so µ(A) = µ(pi/2 − A)). Then integral_[0, pi/2] sin²(x) dµ(x) = (1/2) µ([0, pi/2]).
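The pairing argument for G) in the comment above is easy to verify numerically: sin²(k°) + sin²((90−k)°) = 1, so the 91 terms from k = 0 to 90 sum to 91/2. A minimal sketch:

```python
import math

# Sum sin^2(k degrees) for k = 0, 1, ..., 90.
total = sum(math.sin(math.radians(k)) ** 2 for k in range(91))
print(total)  # 45.5 up to floating-point error, i.e. 91/2
```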

  • @notwelcome2452
    @notwelcome2452  1 year ago

    Hi,
    have you ever considered doing a podcast in which you discuss the latest developments in maths?

  • @PBMS123
    @PBMS123  9 months ago  +1

    For question 2, the error seems to occur at 5:05. It says it is substituting y = a^(3/2),
    but in doing so it turns the a^3 into y, which isn't right. I don't know why it substitutes instead of going straight to factorising; that's pretty easy, as a^(3/2) and a^3 clearly have a common power, and 3/2 + 3/2 = 3...

  • @davidlukin5441
    @davidlukin5441  7 days ago

    From David: ChatGPT version 6, 7 or 8 will 100 percent pass the exams once it gains access to the data (including conceptual information not yet included in this earlier beta version of ChatGPT). Future tests will need to be constructed to measure levels of human creative ability, as opposed to learning from past data drawn from previous human discoveries.

  • @delusion110
    @delusion110  3 months ago

    10:16 Of course it got this right, because it helped me solve the exact same question in my homework.

  • @alfrede.newman1838
    @alfrede.newman1838  9 months ago  +2

    Absolutely fascinating exercise (verification?). A similar exercise has been done internally at MSFT Research.
    I wonder how we could define a comprehensive public review / expert-feedback process for ChatGPT's math "ability" using the latest version. The ChatGPT version here does not include all known corpora of maths knowledge(?).

  • @jakobj55
    @jakobj55  1 year ago  +3

    ChatGPT is like a good politician:
    it makes sure you think it knows what it's talking about, even when it's complete nonsense.

  • @Finsternis..
    @Finsternis..  10 months ago  +1

    Remember when working with these kinds of AIs: always stay vigilant.
    Collect its answers, doubt the answers, and at least make it check its answers against itself.
    For example: I was looking for measurement units where two use the same abbreviation. It said there are none. This made me sceptical, as there are two kinds of tons that aren't standardized. So I straight up asked it whether long tons and short tons use the same abbreviation. It told me no - quite confidently even, it used the word "nope" - reasoning that short tons use "t" and long tons use "T". I asked it who decided that, and it replied that international organisations like the SI do. Since I know the ton isn't an SI unit (or base unit at least), I pressed it again on who decided that this ton is t and that ton is T, and it then admitted there is no standardization for this.
    Yes, it is a great tool for skimming a lot of information quickly. But sadly, it is just skimming, and people still need to put in actual work double-checking the responses.

  • @sayarsine6479
    @sayarsine6479  1 year ago

    love this one

  • @NickMunch
    @NickMunch  9 months ago  +6

    I'm curious how ChatGPT would have done if you had listed the choices for each question in the multiple choice section.

    • @christopherdigirolamo9879
      @christopherdigirolamo9879  9 months ago  +1

      Not including the multiple-choice options is a pretty big oversight given the stated goal of seeing how ChatGPT would do on a multiple-choice test (or at least on a test that is X percent multiple choice).

  • @brendanmiralles3415
    @brendanmiralles3415  1 year ago  +1

    I gave ChatGPT a variation of one of the questions from the 2022 MAT (I believe question 5) and it came up with an incorrect (albeit very interesting) solution, so I'm curious to see how it handles the rest of the MAT paper.

  • @dk3153
    @dk3153  6 months ago

    Often if the result is wrong, I just type "are you sure about your results?" and if there is an actual mistake in the calculation, it will write a second attempt at the question.

  • @potterlover96
    @potterlover96  5 months ago

    It's fascinating how often it gets tripped up by a simple algebraic mistake. Like this one at 13:27: it mixed up the x and the y, since it should be x + 3y = 10 and x + 2y = 8, because y is how often (3,2) is chosen. From there it could have solved the equations. Also, later on in the same question, it classed x = -1 and x = -6 as valid solutions, but if x is how often (1,1) is chosen then x can't ever be negative.
    Nice to know ChatGPT won't be taking over any time soon 😂

  • @ai-spacedestructor
    @ai-spacedestructor  10 months ago

    Other people have already pointed out that GPT-4 is a large logic improvement over ChatGPT, but I think it's also worth pointing out that this model doesn't "think" or "know" anything.
    It's just a lot of data from various sources fed into a really fancy algorithm. It's pretty good at finding patterns and figuring out the logic, but the more detailed the results you ask for, the worse it gets, because (and this is true in maths even more than in other topics) every minor error adds up and can end in a very wrong result.
    So it will do a lot better if you formulate the question in a way that lets it ignore some of the finer details, or give it questions that don't require such fine detail to answer.
    As someone in the comments already pointed out, it also does a lot better if the possible options are provided, because instead of finding ANY solution it "thinks" is valid, it will go through each option and answer with the one it is most confident is right, which helps gloss over some of its errors, because it just picks whatever is closest to the answer it gets.
    Unless it mixes up the answers as well, which I think it did in one of the questions, where it used the number of one answer but the value belonging to a different answer. The answer value was correct, but it's worth pointing out that even with a simple list of 5 options it sometimes confuses which number in the list has which option.
    AI "thinks" very differently from how humans do, because computers have always processed information differently, and that means you have to formulate the question for something that thinks like this AI model, not like you would for a human, to get results comparable to humans.
    This won't fully get rid of all the logic errors it made, but that combined with using GPT-4 would probably have eliminated most of its mistakes.

  • @wessltov
    @wessltov  10 months ago

    The first time I used ChatGPT for maths, I wanted to know the battery life under a constant rate of expenditure.
    It started out well, but I was pretty disappointed when ChatGPT multiplied the total runtime in hours (which it had calculated) by the number of weeks in 3 months.

  • @kennethha8173
    @kennethha8173  8 months ago  +1

    I've noticed that ChatGPT is good at most things except maths.

  • @user-zh8ux8kw8w
    @user-zh8ux8kw8w  4 months ago  +1

    Some slight critique: I think it would have been more representative if, for every multiple-choice question, all the answer options had been provided to GPT. I reckon most applicants use those options to cross-check whether their answer is correct or whether they went in the wrong direction.
    Cheers

    • @neilhegarty-mq4dd
      @neilhegarty-mq4dd  1 month ago

      Correct. It shows: in the one question where you gave it the answers, it worked back from the answers given. Not giving the answers is not a valid test.

  • @the_kid777
    @the_kid777  18 days ago

    10 years later, chatGPT is going to fight back, getting every single question right and breaking the test

  • @bcwbcw3741
    @bcwbcw3741  9 months ago

    On question 1, you know the area is less than pi, since the polygon is inside a unit circle, so b) and e) are the only possible answers.

  • @ShadowOfAchilles
    @ShadowOfAchilles  7 months ago

    With the very first question, if you type in the assumption provided in the question, you'll get the right answer. For example: "Assume the area of a single triangle is 0.25. A regular dodecagon is a 12-sided polygon with all sides the same length and all internal angles equal. If I construct a regular dodecagon by connecting 12 equally-spaced points on a circle of radius 1, then the area of the polygon is?"

  • @nitenuc2341
    @nitenuc2341  10 months ago  +1

    I think you should have given ChatGPT the options too, because then, if it got an answer that is not among the options, it would at least have made a guess or tried to correct itself. Then it would definitely have gotten more answers correct.

  • @thomasmairowski9101
    @thomasmairowski9101 10 months ago

    I am German and maybe I have not really understood your evaluation and your opinion about the efficiency of ChatGPT.
    I was astonished that ChatGPT was able to solve some of these maths problems.

  • @aster_11
    @aster_11 7 months ago +1

    The fact that ChatGPT is so good genuinely terrifies me. I'm so scared of AI's potential, and no, not in the "it'll grow so smart that it'll control humans like the Matrix" way. More in the sense that it has the ability to take over many jobs, and I worry about a lack of careers in the future...

  • @Shotgunz999
    @Shotgunz999 1 year ago +4

    It is really frustrating that you didn't give it all the answer options... I don't think it is fair, considering a student would have access to them.

  • @BaileyMxX
    @BaileyMxX 1 year ago +8

    If it's multiple choice, at least give it the multiple-choice options each time. With its true/false verification, it would be interesting to see if any of the zero-mark questions actually gave you a positive response with the extra info

  • @ianvollbracht7228
    @ianvollbracht7228 27 days ago

    Tom - I have absolutely no commercial interest, but I strongly suggest re-running this test with the Wolfram Alpha plug-in for GPT-4. WA does the computation; GPT-4 just translates from English to Wolfram Language and back again. So it should massively outperform GPT-3.5. All good wishes, Ian

  • @mingsun6249
    @mingsun6249 1 year ago

    You can ask ChatGPT to put the answer into LaTeX too. It can talk, but it makes lots of basic arithmetic mistakes.

  • @TheSparkLabs
    @TheSparkLabs 1 year ago +3

    I'd want to see how GPT-4 would do.

  • @MORTON__
    @MORTON__ 1 year ago

    I wonder if there would be any difference if you also provided the answer options? An MCQ is different from a SAQ after all

  • @DuncanBooth
    @DuncanBooth 1 year ago +1

    I don't think you needed to edit the questions nearly as much as you did. I copied the first question into ChatGPT exactly as written, with the answer options, and it came up with the same reasoning and the same wrong answer but stated more clearly: "Therefore, the correct answer is (d) 3√3."

  • @tsuchan
    @tsuchan 1 year ago +1

    I tried the first question in ChatGPT-4, and it got the right answer, (e) 3.
    ---
    tsuchan1:
    Ok. Let's do this step by step to avoid mistakes. This is a multiple-choice question and one answer is correct.
    A) A regular dodecagon is a 12-sided polygon with all sides of the same length and all internal angles equal. If I construct a regular dodecagon by connecting 12 equally-spaced points on a circle of radius 1, then the area of this polygon is:
    a) 6 + 3*sqrt(3)
    b) 2*sqrt(2)
    c) 3*sqrt(2)
    d) 3*sqrt(3)
    e) 3
    Which answer is correct?
    ChatGPT:
    To find the area of a regular polygon, you can use the following formula:
    Area = 1/2 * n * r^2 * sin(2π/n)
    Where:
    n is the number of sides (for a dodecagon, n=12),
    r is the radius of the circumcircle (in this case, r=1), and
    π is a constant (approximately 3.14159).
    Substituting these values into the formula, we have:
    Area = 1/2 * 12 * 1^2 * sin(2π/12)
    = 6 * sin(π/6)
    Since sin(π/6) = 1/2, the area of the dodecagon is:
    Area = 6 * 1/2 = 3.
    So, the correct answer is (e) 3.
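GPT's derivation above uses the standard inscribed-polygon area formula, which is easy to sanity-check in a few lines of Python (a sketch of the same calculation, nothing beyond the formula quoted in the transcript):

```python
import math

def regular_polygon_area(n, r=1.0):
    # Area of a regular n-gon inscribed in a circle of radius r:
    # n congruent triangles, each of area (1/2) * r^2 * sin(2*pi/n).
    return 0.5 * n * r**2 * math.sin(2 * math.pi / n)

area = regular_polygon_area(12)  # dodecagon on the unit circle
print(area)  # ~3: each of the 12 triangles has area 1/4
```

For n = 12 and r = 1 this gives 6·sin(π/6) = 6·(1/2) = 3, matching option (e).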

    • @snooks5607
      @snooks5607 1 year ago

      not the same question asked in the video though, he didn't give it the options

    • @tsuchan
      @tsuchan 1 year ago

      @@snooks5607 Yes, that's true. Only fair though, it was a multiple-choice question. ;-)

  • @medhurstt
    @medhurstt 1 year ago +3

    As others have said, GPT-4 would likely do better, and to be fair, I think you should have included the possible answers when asking the question, in the same way a human has access to the possibilities.

    • @medhurstt
      @medhurstt 1 year ago

      Replying to myself... on the third question you gave it the answer options and it got it right. Of course giving it the answers makes a difference, especially when the answer it calculates isn't one of the options. In general, if the path to the answer includes a target (as it went through each option), then it's "thinking" about it differently.

  • @QLY.
    @QLY. 1 year ago

    I would love to see it redone with the multiple-choice answers given to the AI

  • @timpreece945
    @timpreece945 1 year ago +2

    This means very little if you are using ChatGPT rather than GPT-4. Surely any student will be using GPT-4 or better.

  • @vinicius123933
    @vinicius123933 1 year ago

    Item e one must ignore the cases where x

  • @AlanZucconi
    @AlanZucconi 1 year ago

    Shame I missed the premiere on this!!! 🤖

  • @davidplanet3919
    @davidplanet3919 1 year ago +1

    For the first one the answer is close to pi, so 3.

  • @larryharson510
    @larryharson510 1 year ago

    What an eccentric individual

  • @Bububar87
    @Bububar87 10 months ago

    With multiple-choice questions, ChatGPT should get every choice to check against for better results, so if it messes up on its way to the answer it can recheck

  • @sammarks9146
    @sammarks9146 1 year ago +3

    I'm obviously no expert, but I doubt "exceptionally entertaining" is the standard for the Oxford Admissions test :)

    • @Rodhern
      @Rodhern 1 year ago

      I have no clue either, but somehow I guess you typically would not want to be the one to deliver an "exceptionally entertaining" performance for your admissions test. :-)

  • @djfmitv
    @djfmitv 11 months ago

    A better question Doc, is to ask why anyone would want to program any AI to pass things like the Oxford Math(s) Admissions Test in the first place and whether that is an ethical thing to do(?) 😛

  • @Think4aChange
    @Think4aChange 1 year ago

    Try it again at some stage now that it has access to the Wolfram Alpha plugin.

  • @piccoloconte5967
    @piccoloconte5967 5 months ago

    Actually, for question j ChatGPT got it right: it got answer d, which implies that all of those equalities are true, but answers a, b, c were wrong because of the if-and-only-if form, which doesn't hold since all 3 conditions have to be true at the same time

  • @aditivmain
    @aditivmain 1 year ago +2

    I feel like it's because that's the GPT-3 model, but GPT-4 is far better and well trained + now there are plugins available to make things easier and more accurate, so maybe we need a part 2 of this video sir😜, amazing video as always!!!

  • @KrazyCouch2
    @KrazyCouch2 1 year ago

    Only just started watching the video, but I find it interesting that for the first question, 3 of the 5 answers are greater than π.

  • @David_K_Booth
    @David_K_Booth 5 months ago

    My algebra teacher used to say *"You have thrown out the baby with the bath water!"* if someone made an error like that division by a-1.

  • @Nickle314
    @Nickle314 1 year ago +4

    It's not going to pass a maths test unless it's lucky. It doesn't get rules-based logic to the extent that's required.

  • @Zultchy
    @Zultchy 11 months ago +1

    If you provide the multiple choice answers, it will check against them. Which is what a human would do. So I think it's only fair to include them in future episodes. Note that the one time you provided all multiple choice possibilities, it got the answer right.

  • @Axhar1
    @Axhar1 1 year ago

    Sometimes it presents logically flawed information. I asked it to show me section 55 of the Contract Act; it said there is no section 55 in the Contract Act as there are only 75 sections in that act.😅

  • @drumstruck751
    @drumstruck751 6 months ago

    Laws of Potential Energy
    1 Vacuum
    Once you take the contents out of matter the potential of that matter rises and simply converts to kinetic energy.
    Reasoning states that vacuums want to collapse upon itself, but this observation can only be explained by gravity. Matter must be the only thing that can attract or repel matter. Furthermore, any surmountable equivalently rational of matter can attract that matter and any matter less than itself. Presumably walls of a vacuum must be attracting each other.
    ______________________________________________________________________
    2 Matter in a Vacuum
    Matter surrounded by a vacuum, that matter's potential energy will rise.
    Reasoning states that if we have nothing around us we'd move through space easier potentially.
    ______________________________________________________________________
    3 The Complexity of Matter
    The combination of matter causes the matter to lower its potential energy.
    Reasoning states matter in a material-environment; the matter slows down thus dropping its potential energy.
    ______________________________________________________________________
    Newton just simply didn't understand that space was a vacuum. Believe it or not, I didn't do this for nothing.
    Once you compare matter to the vacuum of space it gives you, perpetual potential energy. Perpetuity of potential energy creates a circuit for matter, and that natural event is called gravity. When the perpetuity of potential energy is met with the vacuum of space the only theories these laws of potential energy and Newtonian laws can describe how vacuums affect as; is a conversion from perpetual potential energy to perpetual kinetic energy giving us an orbital circuit.
    The perpetual sum of potential energy equals to the perpetual sum of kinetic energy.
    A really good example of how gravity works is how tungsten doesn't just explode in Edison's light bulb. Tungsten begins to suppress itself when exposed to the vacuum through gravitational force increasing the amount needed to change its state of matter. Knowing this we could presume that when exposed to a vacuum that its complete spectrum of potential energy is directly increased.
    Accreditations to essential vacuums.
    To leave one quote to summarize what gravity is; I would like to say this.
    Gravity can not act upon other gravity it must be circuited from point a to b. Although gravity can act on matter, as it is a circuit, like copper to electricity.
    (a small excerpt about molecular gravity)
    If I were to explain the determinants of an outcome with a fifty-fifty percent chance, then I would start to explain that the first piece of information determines the outcomes by randomly choosing the first time and choosing it one hundred percent of the time thereafter. Since we know what we know about potential energy, we can assume that when dealing with matter it can only affect other forms of matter; as long as force is provided through acceleration, two objects will affect each other. This gravity alone could imprint and store energy from the accelerant. This alone could provide a blockade enough to explain the problems within quantum mechanics.
    Let's assume that gravity can store energy in vacuums like between molecules in which the space provided by definition is inert. What that means exactly is that the vacuum itself can not and will not move. Whereas vacuums themselves are massless thus volumeless, densless, and shapeless.
    Conjugating this reason as indisputable facts you can take it to say electricity can be held in superposition due to the lack of matter. This statement alone could explain superpositional frames of energy in quantum physics. Electricity exerts an opposite force on its circuit. Which is why it gets rid of electrons while also suspending the electrons. Though the circuit will store the imprint of that matter, like wax for a record player - down to copper for a voice. Audio particles can be shown by this same trait. Basically the gravitational space is being taken up, and any energy in any state of matter can just simply tell which path was better due to changing its state from kinetic energy to potential energy.
    The third law is very special to me. This came to be for the oddest and most backwards way it could come to fortuition. In highschool we were studying catalysts when I realized that when something like mentos is introduced to cola that liquid state of matter is very futile. The only way the reaction is possible is that the potential energy changes `state by lowering the potential energy of the liquids; that directly touch to some extent the catalyst. Anything beyond that natural event of potential energy lowering; Newtonian laws take over. Basically the high reaction of changing states of matter is because of the high kinetics that's interacting with the liquid which had that basic law of complex matter acting upon it.
    The important part of this is that -when you suck out a sandwich baggy... What's that bond keeping the sides together? More so to say the natural state of gravity at a kinetic state is bonding the matter together. To a point that it won't separate at least without breaking the vacuum itself.
    It's like real physicists are trying to tell you guys; that the only crushing force is force itself. Oh by the way force is a trait of matter.
    The only trait I can think that non matter does is that it changes potential energy in some way.
    catalytic potential problems
    P~ k#1-(k#2)=y
    P~ y+k#1=D
    x equals an expressed version of kelvin as r before or equal to the point matter changes states of matter; within an absolution of force.~ "but not anymore because now we just call it" kelvin.(Allan Adams)
    assume the potential state of matter.
    ________________________________
    k#1- m#2 / k#2 - m#1 =g field
    ~P-k potential as kelvin to express the limit to which its state of matter can hold.
    ________________________________

  • @Roham_planet__explains
    @Roham_planet__explains 1 month ago

    I think that Ɓ / sum of 6²=π²/4*4=f(10)

  • @RealVoidex
    @RealVoidex 10 months ago

    This goes to show you can't trust ChatGPT... I've seen a lot of people just trusting it blindly and being surprised they got it wrong... always do the work yourself... and check your work.

  • @mwoky
    @mwoky 1 year ago

    For the first question, the dodecagon necessarily has to be circumscribed by the circle, whose area is 3. Meaning the dodecagon must have an area less than the circle. The only option being B. You don't even have to do the maths for that question, unless my logic is wrong somehow?

    • @davidrobins1021
      @davidrobins1021 1 year ago

      The area of the circle is pi (approx 3.141)
      Both 2root2 (B) and 3 (E) are less than pi, so this reasoning narrows it down to one of these two, but it doesn't tell us which of these it is.
      3:47 The area is 3 (E).

    • @mwoky
      @mwoky 1 year ago

      @@davidrobins1021 Right, idk why I thought the area of the circle was 3. Honestly couldn't tell you where that came from.

  • @iteerrex8166
    @iteerrex8166 1 year ago

    It may do better used as a tool, dividing each query into smaller questions. But obviously that's not the title of the video.

  • @jamesheseltine8283
    @jamesheseltine8283 1 year ago

    Hello, I really enjoyed your video. I'm a mature student who failed maths as a teenager and I want to retake it. Are there any websites you recommend? I love Khan Academy, but I worry that it's tailored towards the American system?

  • @youtuber-tg1pu
    @youtuber-tg1pu 9 months ago

    You should use GPT-4; this is the very basic version and lags in certain areas.

  • @sskar9390
    @sskar9390 1 year ago

    Just wondering if the marking for real applicants is as strict (so to say: any answer not exactly the same as the solution gets a zero mark)

  • @srudkin1
    @srudkin1 1 year ago

    Large language models are only as good as the prompt they are given.

  • @jannord1546
    @jannord1546 1 year ago

    Pokeball on the arm lol :D

  • @theglitch8742
    @theglitch8742 1 year ago +1

    Can you try taking the Turkish university exams? Across the parts there are 80 questions total

  • @schulem1409
    @schulem1409 1 year ago +1

    Hey guys, do any of you have any good maths books to recommend, in order to have a chance of getting a good mark on this test?

    • @mayonakao2488
      @mayonakao2488 1 year ago

      The Barron's EZ Math textbook series. But textbooks won't give familiarity with the creative logic needed to answer the silly questions. Rely on YouTube videos for that, like 3blue1brown or Numberphile. Most of this test is trivially easy, and can be guessed both quickly and accurately with an understanding of algebra and calculus. Decoding the absolute proof of what's fairly obvious is what's tricky: understanding exactly what the Question is matters more than being aware of its Answer.

    • @schulem1409
      @schulem1409 1 year ago

      @@mayonakao2488 thanks

    • @schulem1409
      @schulem1409 1 year ago

      @@mayonakao2488 like in Hitchhikers Guide to the Galaxy😁

  • @nanamacapagal8342
    @nanamacapagal8342 11 months ago +1

    Took the test myself, and got a score of 24-12-12-15-10 for Q1 to Q5. Most of the errors were mistakes like accidentally using the wrong inequality, messing up the algebra, or skipping an important detail. But overall it was pretty fun
    I haven't watched the whole video yet, but I'm interested to see how ChatGPT solves these problems, and what strategies it might use

  • @brucerosner3547
    @brucerosner3547 11 months ago

    The real issue is to take next year's test and see if there is an improvement.

  • @jarradtait5322
    @jarradtait5322 9 months ago

    Scarily enough, GPT-4 aces those first two questions, especially when given the full question (including the choices). I didn't have the wherewithal to type any more in, but I guess it has since figured out maths to a higher degree of fidelity