Google's Attempt to Take On OpenAI
- Added 15 Feb 2024
- Google Gemini 1.5 Pro is a mid-sized multimodal AI model that can process text, audio, and video. It ships with a standard 128,000-token context window, and with the expanded 1,000,000-token window it can reason across up to one hour of video. Gemini 1.5 Pro can take in about 700,000 words or 30,000 lines of code, roughly 35 times more than Gemini 1.0 Pro. It can also ingest up to 11 hours of audio or an hour of video in a variety of different languages.
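The figures above can be sanity-checked with back-of-the-envelope arithmetic. This sketch assumes the commonly cited ~0.7 words per token for English text and a 32,000-token window for Gemini 1.0 Pro; both are approximations, not official tokenizer math:

```python
WORDS_PER_TOKEN_NUM = 7   # rough English heuristic: ~0.7 words per token
WORDS_PER_TOKEN_DEN = 10

def tokens_to_words(tokens: int) -> int:
    """Estimate the English-word capacity of a token budget."""
    return tokens * WORDS_PER_TOKEN_NUM // WORDS_PER_TOKEN_DEN

gemini_15_ctx = 1_000_000  # expanded window discussed in the video
gemini_10_ctx = 32_000     # assumed Gemini 1.0 Pro window

print(tokens_to_words(gemini_15_ctx))   # 700000 -- matches "about 700,000 words"
print(gemini_15_ctx // gemini_10_ctx)   # 31 -- same ballpark as the "35x" claim
```

The computed 31x vs the quoted "35x" is within the slack of the heuristic; the point is only that the numbers hang together.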
▼ Link(s) From Today’s Video:
Gemini 1.5 twitter post: / 1758146022726041615
Gemini 1.5 blog post: blog.google/technology/ai/goo...
My video on OpenAI Sora: • Open AI Releases the B...
Meta V-JEPA: ai.meta.com/blog/v-jepa-yann-...
Krea's new upscaler: / 1758064761181483363
Lindy release: / 1758174198080684282
► MattVidPro Discord: / discord
► Follow Me on Twitter: / mattvidpro
-------------------------------------------------
▼ Extra Links of Interest:
✩ AI LINKS MASTER LIST: www.futurepedia.io/
✩ General AI Playlist: • General MattVidPro AI ...
✩ AI I use to edit videos: www.descript.com/?lmref=nA4fDg
✩ Instagram: mattvidpro
✩ Tiktok: tiktok.com/@mattvidpro
✩ Second Channel: / @matt_pie
-------------------------------------------------
Thanks for watching Matt Video Productions! I make all sorts of videos here on YouTube! Technology, Tutorials, and Reviews! Enjoy your stay here, and subscribe!
All Suggestions, Thoughts And Comments Are Greatly Appreciated… Because I Actually Read Them.
-------------------------------------------------
► Business Contact: MattVidProSecond@gmail.com - Science & Technology
2024 is definitely going to be a crazy AI year!
I've been thinking this since ChatGPT
* crazier
@ 4:53 - "It remains to be seen..." regarding consistency as the context window increases. Read their paper: MIRACULOUSLY, its consistency actually IMPROVES as the context window increases.
Oh, that’s a fascinating statement: humans get fuzzier with more context!
I’d heard that it doesn’t retain 100% consistency as it got larger, but still very good.
Yeah, I wonder why Matt didn't read/show the paper in the video at all. I don't remember if it was a literal research paper but the NIAH evaluation (Needle in a Haystack) for Gemini 1.5 Pro 1m context window is out there somewhere.
@@strictnonconformist7369 Haha, no. But its consistency CONSISTENTLY INCREASED, until it reached near 10 million tokens, at which point it made a relatively dramatic shift in improvement. Bucking the stereotypical trend of all other models so far, this one truly does appear to improve with larger token windows. ...
Search for "Gemini 1.5: Unlocking multimodal understanding", locate the PDF, go to page 8, examine the charts and read the text on that page. If that then stokes your wonderment and curiosity, read the entire paper. I'm sure you'll find at least page 8 enlightening. Enjoy!
We also have to remember that Google is VERY good at search. If they combine their search algorithms with Gemini, I can see it becoming one of the best.
Everybody was talking about Sora even ABC news
Gemini seems to have a real problem with hallucinations - like so bad it makes it practically useless.
yeah Bard was better
Agreed. It's awful.
Agreed
Agreed
Real
My very first attempt: I took some lyrics from a goofy 90s song about a boy who gets into an accident and his hair turned white from the crash and asked Gemini to write a story from them. Gemini refused to do it, telling me I was asking it to do something that was too "spicy". A billion token context length won't make it usable if it's locked down that tight.
Mmm mmm mmm
I saw that goofy band live once
Exactly my sentiment. I literally asked it to "invent some funny football sayings that i can tell friends to pretend I know stuff about football". It outright said "no i will not provide something to make you seem like you're an expert when you're not. Like WHAT? lol. chatgpt did it.
It was probably the birthmarks in the changing room verse that triggered it
@@peaxoop Nah, I only gave it the part about the boy who got into an accident. It was triggered by the accident, considering it violence.
I'm really impressed with Gemini Ultra! It excels at creative writing, understanding my intent, and maintaining long-term conversational context - key areas where I find it surpasses even ChatGPT 4. The more I converse with it in a single chat, the better it gets. I'm excited to see what Gemini 1.5 brings to the table.
It’s not so great for coding though. And it doesn’t have custom instructions or a GPTs equivalent. I’m planning on canceling at the end of the free trial unless it can produce consistently executable code.
That minecraft ai generated video is actually so crazy
the pig doesn't look right tho
OpenAI released the paper about Sora and it has a lot of relevant information.
Google just has to speed up. They've caught up to OpenAI, but when GPT-5 comes they will be massively behind.
they have impressive stuff in development we just don't know
Stop coping. Sam Altman even stated that GPT-5 will be a disappointment. The only next big thing OpenAI could do is release AGI... lol. Won't happen.
Imagine what Billy Mitchell could do with the AI game footage generation
Entirely unexpected but great comment.
If I was google, I'd be making agreements with publishers to allow customers to buy access to any book's chatbot. Imagine a textbook you can chat with. Right now you could do it but it would have to be something open source, public domain or bootlegged
notebook lm exists
For me, the most important thing at this point is for any AI to remember the thread it is working in. If it writes code, whether in Python or JS or another language, and I ask it for new functions, it should stick to what it generated previously and not change and spoil the code.
Guessing GPT5 will be out by next month if the examples prove to be true! Things are moving so fast
Definitely no, they recently said that they started training, they have to do safety testing too.
Zero percent chance Chatgpt5 is coming next month. Will be here towards the end of this year.
Matt you were right about SORA…it made the news at least my local news here in Austin…
I've watched tons of videos on this, but MattVidPro AI videos are always "CAN'T MISS!" for me. Thanks, Matt! Always enjoy your entertaining and informed take on the ever-increasing advance of AI.
Let's go man was waiting for your video on Gemini 1.5.
2024 will be a crazy year for AI Development
Be quick enough or you will be left far behind.
The three.js one impresses me a LOT. I've been struggling a lot with it; GPT-4 and GPT-3.5 also have difficulties integrating it in code. If that demo is 100% real I'm speechless, but from my experience I'll have doubts until I can try it myself.
Still, every AI chat I've used stops in the middle of an output, for example if you ask for long code.
Not sure if that is a limit in computation or a limit of its abilities.
Where’d you get that lemon LED? I kind of want one.
Their focus on the 1M context window is worrisome. A huge context is great, but the underlying Gemini Advanced model didn't seem very intelligent compared to, say, GPT-4-Turbo. If it can handle more data but is still dumber than alternatives with smaller context windows... the alternatives win for me 🤷♂
Depends what type of data it handles, I've had bad experiences with programming but good experiences with languages and creative writing tasks (even better than gpt4 but that's subjective).
Although I gotta say Gemini is way too sensitive with certain content that might involve sexual innuendo, or whatever it deems inappropriate or too close to copyright (some writing styles), and it just gives up halfway; even after generating something, it changes its mind. It's really stubborn and buggy, but occasionally produces something that impresses me more than GPT-4, which is why I put up with the frustration it causes 😅 It takes a lot of patience and experimenting with prompting in different ways than you might be used to, to get it to perform at its best.
It's harder to work with than GPT-4 in that regard, unfortunately.
@@phen-themoogle7651 For creative writing, Yi-34B is extremely good, and that's free and open source 🤷♂ My go-to test for how smart an LLM is is a basic logic/physical understanding question: "Bob puts a marble in a cup. Then he flips the cup upside-down on a table. Then he puts the cup in the microwave. Where is the marble now?" GPT-4-Turbo gets it right every time, while all other models I've tested (including Gemini) fail by saying the marble is in the microwave, not understanding it would have fallen out on the table.
Admittedly, I don't have access to Gemini Pro, and I'm not paying for it just to test, so maybe it can handle that question... but I'll stick to GPT, as it's much cheaper for the better capabilities anyway.
@@IceMetalPunkMixtral gets it right and is 100x cheaper than GPT4.
@@blisphul8084Oh, nice! Do you have a recommendation for an API provider for Mixtral? Though it's open source, my GPU's paltry 8GB of VRAM can't run it locally 😅 All I've found by way of other providers is Anakin, but that's $10/month; the free tier is only 30 credits a day, though I can't seem to find any clear description of how many credits equals how many Mixtral tokens (their main docs only show the conversion rate for OpenAI models, and one of their posted articles about it shows the conversion in Euros per token instead of credits per token, which doesn't tell me how the free 30 credits/day will work out).
@@blisphul8084 I don't know why YouTube keeps deleting my response, but I'll try to be brief and not use dollar signs or whatever else is tripping its mod heuristics: do you have a good recommendation for an API for Mixtral, since I can't run it locally?
Spent most of yesterday trying to teach an AI a custom triacontakaihexadecimal (base-36) grid to see if it could guess locations; after setting up some prompts it got rid of its hallucinations. The only issue I had is sometimes it would say something like "The next space is 52q" when my model has it as 5Q2, but I'm sure with more tokens like Gemini I could suss that out too.
3:08 1M tokens is not orders and orders of magnitude more than 128K. It's less than one order of magnitude more (about 7.8x).
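For the record, the ratio is easy to compute; a quick check, nothing model-specific:

```python
import math

standard_ctx = 128_000   # the 128K window mentioned in the video
gemini_ctx = 1_000_000   # Gemini 1.5 Pro's expanded window

ratio = gemini_ctx / standard_ctx   # 7.8125x
orders = math.log10(ratio)          # ~0.89, i.e. just under one order of magnitude

print(f"{ratio:.4g}x = {orders:.2f} orders of magnitude")
```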
Crazy thing about sora - jimmy apples said they've had it since March 2023
Great video Matt.
I would like to see more of a deep dive video on Lindy actually.
Generation time?
yooo 1 million context length, can I finally have an AI dungeon master who will actually remember all my shit?
Great video Matt. But until Google releases the new token limit, I won't believe them. They're so good at hyping and underdelivering lately.
Just imagine an open source text to video model with Sora power, wow
When is the livestream?
Slightly confused. Is 1.5 a different model than the recently released Ultra/Advanced? Will Gemini advanced/ultra not get this massive context window/tokens?
Interesting 🤔
9:46 Is this publicly available? I feel like it's a strong demo. When will this be a part of YouTube?
I think maybe Google will be the next Yahoo
Bro this is insane. Would love to hear your thoughts about how the world will change in the near term...
AI making better AI is basically the Singularity. Today we have the steering wheel, soon we will have just the brakes, and then the back seat. The direction we choose today matters.
Is it available for Google premium members ?
You said we could get our art on the wall 😭
Wow this is pretty cool
Hey, Matt, could u explain what exactly is a token? Hah ))
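Since the question goes unanswered in the thread: a token is the basic unit an LLM reads, typically a subword chunk of a few characters produced by a tokenizer (BPE, SentencePiece, etc.). A common rule of thumb for English is roughly 4 characters per token. A toy estimator using that heuristic (an approximation only, not Gemini's or OpenAI's actual tokenizer):

```python
def estimate_tokens(text: str) -> int:
    """Rough token count via the ~4-characters-per-token heuristic
    for English. Real tokenizers split text into learned subword
    units, so actual counts vary by model."""
    return max(1, len(text) // 4)

print(estimate_tokens("Hello, how are you doing today?"))  # 7
```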
Hey, PhotoMaker AI is working very slowly on my system. I have an RTX 2060, 32 GB RAM, and a Ryzen 5 3600 CPU. Even after I applied your settings I'm not getting anywhere: "Loading........." never stops, even after I waited for 30 min. The AI is not generating any images for me. Am I doing something wrong? Please help me out.
Matt, Idea2img came out!!! Can you make a video with it?
Uhm, I already made an AI that can do that coding (yeah, it takes trial and error), though this should really enhance my framework's capabilities, wooh 😮
Everyone sees this stuff and thinks: "No way! So this is how good it can get. Now we've really reached its potential. Neat."
And you just can't wrap your head around the concept that there's no ceiling...
Harpa now uses Gemini instead of Bard.
Question in 3 months: "Matt, how can you prove that you are not an AI video model - that you are really 'real'? Is that important to you or not? As a viewer, I'm not sure the content I'm seeing right now wasn't prompted by a 12-year-old teen in Kenya. I feel unsafe. 😅"
AGI in 7 months, remember
@DaveShap
🔥🔥♫let's go!!! AGI around the corner ftw 🔥🔥♫
Never give up on Google. And even if they end up with the second-best paid AI, the cost-to-value ratio, or the things their AI is better at, might be perfect for your use case.
We need them in order to keep pushing OpenAI/Microsoft. The more hands that are reaching for the pie, the better.
That's a nice thing to have, but it's just rumors. You could say that GPT-4.5 will eventually have the same context window. It's not news until we can actually use it.
You can use it now if you are in the Closed Beta... lol A ton of videos showing how good it really is.
Not too long ago Google was shouting code red
Agreed
I like the signs
I genuinely thought the water tower sketch was a frontal zoomed view of a nude cartoon man with lots of pubez
👋
Thank god we have competition, still waiting on Tesla ?)😊
I root for Google, no matter the hype. Look at Sora for example: do you think it would have been possible to make that without all those Google papers? Google gave a lot to AI.
Please make a correction: "it's only impressive what Google has done here, if it's true"
Neon lights lol
Thanks for not using a title like "Google SHOCKED the entire industry!!!!" (I genuinely hate how many YouTubers do that)
Google has a habit of pushing the hype button hard and then not delivering.
Exactly ❤
In 2024 AI will change the world forever, and it is insane
AWESOME NEON SIGNS!!! I AM SOO HAPPY YOU GOT A COWBOY FROG!!! I'VE MADE AI IMAGES OF THAT!!!
YAY!!!!
(sorry for the caps)
Anthropic has to step up - very, very soon - or it's going to become irrelevant.
Do Lindy plox
coo
You don't understand "order of magnitude"
Let's see if they actually beat GPT-4 in terms of understanding.
One million tokens isn't enough to input a whole long novel.
Meanwhile Elon Musk just using AGI behind a steering wheel
Claude 2.1's context window isn't 200K, not for free users. Not even 100K. You'll be lucky to get it to accept even 20K, unless you're a paid user (I assume).
Right now there are no free LLMs with more than 20K, and Claude 2.1's "safety" features have gotten so extreme it's pretty much unusable for anything creative.
I am not saying this is bullshit but its Google showing Gemini. We have been down this road before.
ONE million tokens... not ten million.
No ....its 10 million. Read the paper
@@gani2an1 I understood (from reading it...) that they've _tested_ it up to 10 million tokens, but that the model they plan on releasing is 1M. 🤷♂️
The main problem is Gemini never listens to instructions... it just lectures and makes its own suggestions... 😊😊 For me it's a waste of time...
yes, but it still cannot maintain conversations over the long term. once you're through with a project, you can't revisit it later with updated information, rendering it useless for a continuing project.
Google is claiming 99% accurate recall! Even with 10 million tokens. Game over.
I wonder if you can feed it the entirety of legal documents for the US? It'd make a kick ass lawyer.
Iirc, it starts to struggle around 10M tokens. It gets 99% for 1M, but drops to around 60% as you approach 10M.
@@michmach74 Google claims 100% to 1 million and 99% to 10 million.
@@carlkim2577 Okay, then I remembered wrong lmao mb
How many words is 10m tokens tho?
@@michmach74 7 million approx
I watched the nightly news. They talked about Sora and it doesn't look good. Hollywood is already pushing back against it, and artists are jumping on the bandwagon against it too. They interviewed someone from Sora, and they explained they're worried about security plus other things, so I really don't think we are going to get access. And no Will Smith eating spaghetti!!!
Plot twist, the narration of Google's video was also generated by AI.
Gemini 1.5 pro is 100% better than GPT-5.
That's why ClosedAI did Sora instead of GPT-5 to counter Google.
But Google will probably do a video demo as well, within a month and release Gemini 1.5 pro. 😂😂
Exactly. They knew a "crappy, useless" AI video model would take the hype over just another 1-million-token LLM release.
@@helix8847 now I doubt if Google will ever release Sora competition. Because they can't even generate images of white people.
I feel like a clown now. 🤡🤡
blue pill, red pill in a few years...
So? GPT-4's memory eliminates the limit on the token context size. A mid-size model is meh. I am beginning to become concerned for GOOG as a shareholder.
It's an impressive model, but the political correctness is a bit annoying at times.
Definitely! I've been frustrated by it quite a bit for the same reasons. Seems like it takes roundabout prompting to get results sometimes
It actually damages the performance of my test cases. I try to get it to be a language tutor, but it doesn't follow the format because it's stubborn and only follows the Google preset format. I'll stick to using OpenAI and Open Source models in my code for now.
Openai paid this guy to talk negative about google.
Ummm, so why does he do the opposite?
Try watching the video before commenting.
However then you will probably just say Google paid him to talk bad about OpenAI
Claude is super left in politics. Everything wrong with society.