One is for natural language understanding and the other is for natural language generation
true
Remember it as INPUT > MODEL > OUTPUT.
MODEL INPUT (NLU) - text recognition, vision recognition (image/video), sound recognition (voice)
MODEL OUTPUT (NLG) - text generation, image/video generation, sound/voice generation + tool integration
MODEL PROCESSING - basic (classification, summarization, extraction), advanced (reasoning, planning, orchestration)
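To make that NLU/NLG split concrete, here's a minimal sketch using the Hugging Face transformers pipeline API (the stock bert-base-uncased and gpt2 checkpoints are just examples; any BERT or GPT variant behaves the same way):

```python
# pip install transformers torch
from transformers import pipeline

# NLU side: BERT is trained to fill in masked tokens, so it reads context
# in both directions -- good for classification/extraction-style tasks.
fill_mask = pipeline("fill-mask", model="bert-base-uncased")
print(fill_mask("The capital of France is [MASK]."))

# NLG side: GPT-2 is trained to predict the next token left to right,
# so it naturally generates continuations.
generate = pipeline("text-generation", model="gpt2")
print(generate("Transformers are", max_new_tokens=20))
```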
This is very useful. Just wanted to add that the GPT decoder doesn't have cross-attention in the transformer block.
What is cross attention?
@Tech_kenya It's when word vectors attend to vectors from another sequence (such as the encoder's output) rather than just to each other within the same sequence.
@methylphosphatePOET So kinda the opposite of self-attention?
@imran7TW Not necessarily opposite; perhaps "adjacent" is a better word.
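Since cross-attention keeps coming up in this thread, here is a minimal single-head sketch (the learned Q/K/V projections, masking, and multi-head split are all omitted, and the tensor names are made up for illustration):

```python
import torch
import torch.nn.functional as F

def attention(q, k, v):
    # Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V
    d = q.size(-1)
    scores = q @ k.transpose(-2, -1) / d ** 0.5
    return F.softmax(scores, dim=-1) @ v

x = torch.randn(10, 64)      # decoder-side token vectors (10 tokens, dim 64)
memory = torch.randn(7, 64)  # encoder output vectors (7 tokens)

# Self-attention: Q, K, V all come from the SAME sequence, so tokens
# attend to each other (and to themselves). This is all GPT's blocks use.
self_out = attention(x, x, x)

# Cross-attention: Q comes from the decoder, but K and V come from the
# encoder's output, so decoder tokens attend to the other sequence.
# Present in encoder-decoder models (original Transformer, BART, T5);
# absent from GPT's decoder-only blocks, as noted above.
cross_out = attention(x, memory, memory)
```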
Wonderfully put.
Thanks a lot :)
I hadn't known that BERT was an acronym and had been wondering why the Swedish LLM was called Bert. I wonder if this is why. Thanks for the info!
Great explanation. For example, if I have to read all the client emails, understand their requirements, and auto-create tasks based on that prediction, which model should I go for: BERT or GPT?
Can you please explain their training process?
What if I stack both encoders and decoders? Do I get some BERT-GPT hybrid?
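That "stack both" idea is basically what encoder-decoder models (the original Transformer, BART, T5) already are, and Hugging Face transformers can even warm-start one from BERT and GPT-2 checkpoints. A sketch, with the caveat that the cross-attention layers it wires in are freshly initialized, so the hybrid needs seq2seq fine-tuning before it is useful:

```python
from transformers import EncoderDecoderModel

# Warm-start an encoder-decoder model: BERT as the encoder, GPT-2 as the
# decoder. The cross-attention weights connecting them are randomly
# initialized, so fine-tune on a seq2seq task (e.g. summarization) first.
model = EncoderDecoderModel.from_encoder_decoder_pretrained(
    "bert-base-uncased", "gpt2"
)
```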
There is also the Whisper model, which has a text decoder similar to Facebook's BART, but with an audio encoder instead of a text encoder.
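A quick way to see that structure for yourself (the openai/whisper-tiny checkpoint is just the smallest published one):

```python
from transformers import WhisperForConditionalGeneration

# Whisper is an encoder-decoder transformer: the encoder consumes audio
# (log-Mel spectrogram frames) and the decoder generates text tokens.
model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-tiny")
print(type(model.get_encoder()).__name__, "->", type(model.get_decoder()).__name__)
```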
So BERT doesn't have a decoder? Did I misunderstand?
Bert also drives a Trans Am!
Jokes aside, I do appreciate your videos!
good
Transformer models are usually run in parallel, right?
Not when it's decoding, no.
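To unpack that a bit: in training (teacher forcing) every position is processed in one parallel pass under a causal mask, but generation has to loop because each step consumes the token the previous step produced. A rough sketch with a dummy stand-in (the random-logits `model` below is purely illustrative, not a real transformer):

```python
import torch

vocab_size = 100

# Stand-in for any decoder-only transformer: maps token ids of shape
# (batch, seq_len) to next-token logits of shape (batch, seq_len, vocab).
def model(ids):
    return torch.randn(ids.shape[0], ids.shape[1], vocab_size)

# Training / teacher forcing: the whole sequence goes through in ONE
# parallel pass; a causal mask (inside a real model) stops each position
# from seeing later tokens.
input_ids = torch.randint(vocab_size, (1, 16))
logits = model(input_ids)  # (1, 16, vocab_size) computed all at once

# Decoding: inherently sequential. Each step needs the previous step's
# output token, so the loop can't be parallelized across time steps
# (KV caching only avoids recomputing the old positions).
tokens = torch.randint(vocab_size, (1, 4))  # some prompt
for _ in range(8):
    step_logits = model(tokens)                     # run over current prefix
    next_token = step_logits[:, -1].argmax(dim=-1)  # greedy pick
    tokens = torch.cat([tokens, next_token[:, None]], dim=-1)
```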
I love you ❤
Awesome 👏
Thanks so much