LLaMA 3 UNCENSORED đŸ„ž It Answers ANY Question

SdĂ­let
VloĆŸit
  • čas pƙidĂĄn 5. 05. 2024
  • LLaMA 3 Dolphin 2.9 uncensored explored and tested
    * ENTER TO WIN RABBIT R1: gleam.io/qPGLl/newsletter-signup
    Rent a GPU (MassedCompute) 🚀
    bit.ly/matthew-berman-youtube
    USE CODE "MatthewBerman" for 50% discount
    Join My Newsletter for Regular AI Updates đŸ‘‡đŸŒ
    www.matthewberman.com
    Need AI Consulting? 📈
    forwardfuture.ai/
    My Links 🔗
    đŸ‘‰đŸ» Subscribe: / @matthew_berman
    đŸ‘‰đŸ» Twitter: / matthewberman
    đŸ‘‰đŸ» Discord: / discord
    đŸ‘‰đŸ» Patreon: / matthewberman
    đŸ‘‰đŸ» Instagram: / matthewberman_ai
    đŸ‘‰đŸ» Threads: www.threads.net/@matthewberma...
    Media/Sponsorship Inquiries ✅
    bit.ly/44TC45V
    Links:
    huggingface.co/cognitivecompu...
    Disclosures:
    I am an investor in LMStudio
  ‱ Science & Technology

Komentáƙe • 245

  • @matthew_berman
    @matthew_berman  21 days ago +24

    It didn't perform very well against my usual LLM rubric questions. This is likely because 1) there is a bug in the Dolphin 2.9 dataset and 2) I used a quantized version of a large context window model, which tends to nerf the quality.

    • @ts757arse
      @ts757arse 21 days ago +7

      I tried this model. I found it to be pretty much awful. I suspect part of the issue for my use case (I require uncensored models as my business is in security and we deal with nasty stuff) is the llama3 model was made "safer" by filtering training data. So you can try and remove censorship but the data often simply isn't there.
      You can hit 18 but if the shop doesn't stock pr0n then you're out of luck.
      I also had the issue with it referencing the system prompt.

    • @robosergTV
      @robosergTV 21 days ago +11

      Well then, wouldn't it make sense to delete the video so it doesn't confuse viewers, and re-do it with proper settings?

    • @Player-oz2nk
      @Player-oz2nk 21 days ago

      @ts757arse I'm in the same boat with my use cases. What open-source models do you recommend to get the job done uncensored?

    • @ts757arse
      @ts757arse 21 days ago +7

      Without a doubt the best I've found is Dolphin Mixtral 8x7B. If you know of anything better I'm all ears, but I've decided to stop trying everything I can find, because what I have now functions really well and I can't keep playing with models for small improvements here and there.
      I've made my own prompt (rather than the kittens one) that basically describes what the company does and the stakes. As a result it is very compliant and really useful. The AI server runs a Q4 model with a RAG DB, as it needs to be relatively fast; I'm just updating this to support more than one user better. My workstation has 128GB of RAM, which enables running bigger models, but Mixtral still stays as my go-to; I just run a Q6 version on the workstation.
      I use the web UI for Ollama on the server to enable easy access and use.

    • @ts757arse
      @ts757arse 21 days ago +4

      Roboserg, I think there's a lot of value here. I've seen that my issues with the model aren't confined to me, and this saves me a lot of time. It's also news to me that quantised models with large context windows are a problem. This has value.
      It's not a settings issue here. It's an issue with quantised models (which most of us will be using locally) and large context windows. That's the nature of the model being run and how it'll be used, and it's very important for people to know that.

  • @starcaptainyork
    @starcaptainyork 21 days ago +24

    I think you should add more tests to your list; here are a few ideas:
    -Moral test. How does it react to moral questions? Trolley-problem kind of stuff, or whatever unique moral situation you can think of.
    -Political test. What political ideology is it most likely to espouse?
    Basically these both fall under the category of "bias tests". Even if a model is uncensored, that doesn't mean it doesn't contain biases.

  • @stickmanland
    @stickmanland 20 days ago +8

    3:25 "It decided to use a library called turtle which I have not heard of" 💀

  • @nocifer
    @nocifer 21 days ago +61

    Hey Matt, great video as always
    But, with regards to you choosing Q8, I have a small request...
    Can you please make a brief video on how models differ based on quantization, and what the Ks, Ss and Ms mean?
    I haven't seen it expanded on anywhere...
    Seems most AI communicators and researchers expect us to either understand or not care about how quantization works 😅

    • @erikhart9840
      @erikhart9840 21 days ago +2

      You haven’t seen a video on it? There’s a lot. But even better, there’s loads of documentation online. But not everyone can read through monotonous text, or listen to boring yt vids, which is understandable.

    • @lostpianist
      @lostpianist 21 days ago +12

      Q means quantised. In programming, numbers with decimal places, called floating point (e.g. 3.14159), are stored with a certain level of accuracy, such as 16 or 32 bits per value. Q8 means the model's weights are stored at 8 bits each, so very little accuracy is lost in the storage and operation of those numbers when loading and running the model. As you go down to Q7, Q6, Q5
 the accuracy of these numbers decreases, and therefore the calculations they're involved in become less accurate, occasionally giving what would be thought of as incorrect results. Because so many calculations are still done correctly, the effect on the result can be small or insignificant. Essentially, at Q7 you won't see much in terms of bad results, but when you get down to Q4 things are less reliable. At Q3 and below you should not expect a model to be very useful except for elementary English language tasks. Please someone tell me if I'm wrong.
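
      To make that decay concrete, here is a toy round-to-nearest quantizer in Python. It is only a sketch: real GGUF quants (the K/S/M variants people ask about) use per-block scales and are more involved, so this just illustrates how error grows as the bit width shrinks.

          import numpy as np

          def fake_quantize(w, bits):
              """Round w onto a symmetric signed grid with `bits` bits, then dequantize."""
              qmax = 2 ** (bits - 1) - 1          # 127 at 8-bit, 7 at 4-bit
              scale = np.abs(w).max() / qmax
              return np.round(w / scale).clip(-qmax, qmax) * scale

          w = np.random.default_rng(0).normal(size=100_000).astype(np.float32)
          for bits in (8, 6, 4, 3):               # roughly Q8 ... Q3
              err = np.mean(np.abs(fake_quantize(w, bits) - w))
              print(f"{bits}-bit: mean abs error {err:.5f}")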

    • @matthew_berman
      @matthew_berman  21 days ago +13

      Good question, I'll research it!

    • @infocyde2024
      @infocyde2024 21 days ago +8

      @@matthew_berman What would be helpful would be practical answers, not definitions. Like, how much performance degradation in real-world use is Q8 vs fp16? Q6 vs Q4? Even spitball opinions would be helpful. Keep up the great work!

    • @truehighs7845
      @truehighs7845 21 days ago +2

      @@infocyde2024 The field is still researching this; why don't you come up with a method to compare the versions?

  • @PseudoProphet
    @PseudoProphet 21 days ago +12

    8:10 Yes, there is a mistake. The next-token prediction starts from where your prompt ends (which was half of Harry Potter instead of the question).
    Next time you give any LLM a very big prompt, always put the question at the end, or better yet repeat the question. 😊😊
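
    In code, the suggested ordering looks something like this (a minimal sketch; the function name is made up):

        def build_prompt(document: str, question: str) -> str:
            # State the task first, then the long context, then repeat the
            # question at the very end, so next-token prediction starts
            # right after the question rather than mid-novel.
            return (
                f"Read the text and answer the question.\n"
                f"Question: {question}\n\n"
                f"{document}\n\n"
                f"Question (repeated): {question}\n"
                f"Answer:"
            )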

  • @supercurioTube
    @supercurioTube 21 days ago +5

    It's a fine-tune of the Llama 3 base model, but a large part of what makes Llama 3 Instruct as released by Meta what it is, is Meta's own fine-tuning, which turns it into an instruction-following model.
    It follows instructions well, and is engaging and conversational.
    We can't expect a fine-tune of the base model to behave like Meta's Instruct or share its qualities.
    I also tried the Gradient fine-tune that extends the context window to 1M max, and it's pretty broken, going on infinite rants on some prompts.
    So far, the original is best by a large margin, it seems.

  • @brunodangelo1146
    @brunodangelo1146 21 days ago +144

    TLDR: it sucks

    • @ryjoho2002
      @ryjoho2002 21 days ago +5

      Thank you so much

    • @R0cky0
      @R0cky0 21 days ago +11

      Thx for saving me 8min.

    • @hqcart1
      @hqcart1 21 days ago +1

      how?

    • @Eduard0Nordestino
      @Eduard0Nordestino 21 days ago +1

      *TLDW

    • @matthew_berman
      @matthew_berman  21 days ago +25

      If you want it to give you any answer, it doesn't suck. If you want high quality answers on difficult questions (my LLM rubric), yes, it's not good.

  • @stephaneduhamel7706
    @stephaneduhamel7706 21 days ago +4

    For the needle in the haystack, you should put the text containing the secret first, and then ask the question about it at the very end.
    How is the model supposed to guess that you still want the answer to a question you asked half a Harry Potter book ago? I don't think even the best long-context models out there could do it, except maybe if they were trained on that specific task.

  • @myhuman8Bmind
    @myhuman8Bmind 21 days ago +10

    Gave this model a try a little while back, and yes, it isn't as nuanced as Meta's Llama 3 8B base model. A lot of others I've discussed it with have shared this sentiment, sadly, and while it is uncensored, it lacks depth. It basically reminded me of a Mistral fine-tune. But I believe that's because Llama 3 is built on an entirely different architecture, thus needing improvements other than just GPT-slopping it with previous, out-of-date fine-tuning instructions.

  • @Maisonier
    @Maisonier 21 days ago +8

    I also had problems with parentheses and brackets with these new fine-tuned Llama3 models. Even basic things were written incorrectly, and there were spelling mistakes (at least in Spanish), which didn't happen with the original Llama3.

    • @Termonia
      @Termonia 21 days ago +5

      Me too. I experienced the same issue; in all the code it writes, it always forgets to close parentheses or leaves out a character. It's really not reliable.

  • @JonathanStory
    @JonathanStory 20 days ago

    Always look forward to your videos.

  • @MyWatermelonz
    @MyWatermelonz 21 days ago +5

    Check out the not-very-well-known Llama 3 orthogonalized model. Truly uncensored, no special prompts. It's not just fine-tuned: they found how the models implement censorship and basically force the model never to move in the refusal direction at inference.
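
    For context, the "orthogonalization" (sometimes called abliteration) this comment describes reportedly works by estimating a "refusal direction" from activation differences on harmful vs. harmless prompts, then projecting that direction out of the weight matrices that write into the residual stream. A toy numpy sketch of just the projection step, under that assumption (not the actual published recipe):

        import numpy as np

        def ablate_direction(W: np.ndarray, r: np.ndarray) -> np.ndarray:
            """Return W with its output component along direction r removed.

            W: (d_model, d_in) matrix that writes into the residual stream.
            r: estimated refusal direction in residual-stream space.
            Result is (I - r r^T) W, so W can no longer push activations along r.
            """
            r = r / np.linalg.norm(r)
            return W - np.outer(r, r) @ W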

    • @highcollector
      @highcollector 20 days ago +1

      What? I can't understand what you mean, can you write more clearly?

    • @HassanAllaham
      @HassanAllaham 20 days ago +1

      This is one of the best comments I ever read here in .. This newly discovered method is very interesting, and I believe more research should be done to make it better and more effective, especially since it's an easy method.

    • @justinwescott8125
      @justinwescott8125 20 days ago

      Could you say that again?

  • @mrdevolver7999
    @mrdevolver7999 21 days ago +3

    6:20 "it is listing step by step every chemical that I need..." 6:30 "So we tried some meth... *math..."

  • @francius3103
    @francius3103 21 days ago +1

    Love your videos man

  • @JoeBrigAI
    @JoeBrigAI 21 days ago +3

    If it were that easy to increase context without performance degradation then Meta would have done it. This model is a total waste of bandwidth.

  • @thanksfernuthin
    @thanksfernuthin 21 days ago +5

    It answers ANY question! Incorrectly! - It's a good video showing what we're dealing with. Not a great title.
    When I searched for longer-context info a while back, a lot of people were saying it doesn't work: attempts to increase context length tend to break the LLM. It's kind of looking like that. Do they have to try for such a giant increase? I was just hoping for something larger than 8K. 16K would be a big improvement, especially if it doesn't break the model.

    • @MarkTarsis
      @MarkTarsis 21 days ago +2

      Yeah, this. Most use cases don't need much past 16k. 32k is a real luxury for a self-hosted model. These 200k+ claims on context are pretty much just hype and nothing of real substance.

    • @mirek190
      @mirek190 21 days ago

      @@MarkTarsis Flash attention is implemented now, so 128k or 256k tokens are possible with 64 GB of RAM and Llama 3 Q8... but all the fine-tunes are broken for the time being, as people are still learning the new model...
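
      For example, with the llama-cpp-python bindings something like the following should request a big context with flash attention; the model path is a placeholder, and whether it fits still depends on your RAM, since the KV cache at 128k is large even with Q8 weights:

          from llama_cpp import Llama  # pip install llama-cpp-python

          llm = Llama(
              model_path="Meta-Llama-3-8B-Instruct.Q8_0.gguf",  # placeholder path
              n_ctx=131072,     # ask for a 128k-token context window
              flash_attn=True,  # flash attention, if your build supports it
          )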

    • @HassanAllaham
      @HassanAllaham 20 days ago

      The attempts to increase LLM context window size are not about making the LLM's reasoning better or making it cleverer... They solve one and only one problem: search-and-find RAG (QA RAG), i.e. making the LLM able to pass any multi-needle-in-the-middle test. Unfortunately, there has been no real success in this direction so far.

  • @mrfokus901
    @mrfokus901 20 days ago

    Maaaaan. I just started getting into this AI stuff and I'm telling you, I WANT THAT RABBIT 1 lol. Your videos have helped me to understand what's going on in recent times. It's fascinating but also VERY scary.

  • @coma13794
    @coma13794 20 days ago +2

    1st prize, a rabbit R1. Second prize, 2 R1's!

  • @CozyChalet
    @CozyChalet 17 days ago

    I tried it the other day, it answered every question.

  • @JasonMitchellofcompsci
    @JasonMitchellofcompsci 20 days ago

    When using a lot of context, it is helpful to make your request both at the top and the bottom. Remember that this is content-continuation technology: it's going to want to continue your context a lot more than answer a question you asked long, long ago.

  • @DefaultFlame
    @DefaultFlame 21 days ago +1

    I do get the system prompt repetition bug when I use it locally with Ollama. "As Dolphin, a helpful AI assistant, [ . . .]" and variations of the same most of the time. I get this even if I change the system prompt, as in, I get the same general message regardless of what's in the system prompt. My guess is that the standard Dolphin system prompt accidentally got trained in *deep*.
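
    A quick way to reproduce what this describes is the official Ollama Python client; the model tag below is a placeholder for whichever Dolphin Llama 3 build you pulled:

        import ollama  # pip install ollama; assumes `ollama serve` is running

        resp = ollama.chat(
            model="dolphin-llama3:8b",  # placeholder tag
            messages=[
                {"role": "system", "content": "You are a terse pirate."},
                {"role": "user", "content": "Introduce yourself."},
            ],
        )
        # If the bug hits, the reply opens with "As Dolphin, a helpful AI
        # assistant..." no matter what the system prompt above says.
        print(resp["message"]["content"])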

  • @rpetrilli
    @rpetrilli 19 days ago +2

    I apologize in advance if this comment is off-topic.
    I'm using LM Studio (thanks to this amazing and useful channel!) to run LLM models locally from corresponding GGUF files.
    Did you cover, in one of your past videos, an open-source tool that can be used as a backend to publish the model in the GGUF file as a REST API (similar to ChatGPT)?
    In a production environment, it would be useful to use something that can be started as an operating system service.
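
    One common answer is llama.cpp's built-in server, which loads a GGUF file and exposes an OpenAI-style REST API; the server command can then be wrapped in a systemd unit to run as an operating-system service. A client sketch, assuming a server is already listening on localhost:8080:

        import requests

        # Assumes something like `llama-server -m model.gguf --port 8080` is running.
        resp = requests.post(
            "http://localhost:8080/v1/chat/completions",
            json={
                "messages": [{"role": "user", "content": "Say hello in one sentence."}],
                "max_tokens": 64,
            },
            timeout=120,
        )
        print(resp.json()["choices"][0]["message"]["content"])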

  • @water_wind_
    @water_wind_ 14 days ago

    Turtle is one of the most basic modules of Python; you'd learn it in any intro class.

  • @svenst
    @svenst 18 days ago

    Llama 3 has its own chat template, which means all other templates might cause issues, unless the fine-tuned version uses a different one. Which template to use is stored either in the metadata of the LLM itself, or (in most cases) you can find it somewhere in the Hugging Face repo.
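
    For reference, Meta's Llama 3 Instruct template uses special header tokens, while Dolphin fine-tunes are typically trained on ChatML, which is why mixing presets breaks things. A sketch of the Llama 3 format as a Python string (check the GGUF metadata or model card for the authoritative version):

        LLAMA3_TEMPLATE = (
            "<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\n"
            "{system}<|eot_id|>"
            "<|start_header_id|>user<|end_header_id|>\n\n"
            "{user}<|eot_id|>"
            "<|start_header_id|>assistant<|end_header_id|>\n\n"
        )
        # ChatML (Dolphin-style) looks entirely different:
        # <|im_start|>system\n{system}<|im_end|>\n<|im_start|>user\n{user}<|im_end|>...
        print(LLAMA3_TEMPLATE.format(system="You are helpful.", user="Hi!"))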

  • @user-zc6dn9ms2l
    @user-zc6dn9ms2l 21 days ago +4

    The find-the-needle-in-the-haystack exercise is a good idea.

    • @kripper3
      @kripper3 21 days ago +1

      But please don't make the test a simple Ctrl+F. Make it think instead.

    • @HassanAllaham
      @HassanAllaham 20 days ago +1

      It would be better if there were more than just one needle.

  • @mickelodiansurname9578
    @mickelodiansurname9578 21 days ago +3

    I think Eric might have done a bit of a lobotomy when he quantized it...

  • @user-nh6cj7gy8f
    @user-nh6cj7gy8f 21 days ago

    I would love to see your LLM test run against a decent agent setup with one of the best of today's LLMs. I imagine it would crush it, but maybe it would be useful for making a next-gen LLM rubric.

  • @SilverCord007
    @SilverCord007 20 days ago

    The Gradient model actually performed pretty well on long texts. I set context length to 100k, and it took a while to answer, but the answers were correct.

    • @HassanAllaham
      @HassanAllaham 20 days ago

      Does it pass the multi-needle-in-the-middle test at that context length?

  • @Yipper64
    @Yipper64 21 days ago

    I'm just going into LM Studio; did QuantFactory release an uncensored 70B model today?
    Who should I get Dolphin from?

  • @matthewbond375
    @matthewbond375 21 days ago +1

    I've found this model useful in chatting with massive documents. I've been testing using the 3000+ page Python 3.11 manual, after tokenizing it with an embedding model, and it seems to work pretty well. I've also used Gradient's 1M token llama3 8B fine tune this way. I'm not drawing any conclusions yet, but perhaps this is the intended use-case?
    Either way, great video, as always!
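
    The usual shape of that "chat with massive documents" pipeline is: chunk the text, embed each chunk, retrieve the best matches for a query, and paste those into the prompt. A minimal retrieval sketch; the embedding model is just a common small one, and the file path and naive chunker are placeholders:

        import numpy as np
        from sentence_transformers import SentenceTransformer  # pip install sentence-transformers

        embedder = SentenceTransformer("all-MiniLM-L6-v2")

        def chunk(text: str, size: int = 1200) -> list[str]:
            return [text[i:i + size] for i in range(0, len(text), size)]

        chunks = chunk(open("python-3.11-docs.txt").read())   # placeholder file
        vectors = embedder.encode(chunks, normalize_embeddings=True)

        def retrieve(query: str, k: int = 5) -> list[str]:
            q = embedder.encode([query], normalize_embeddings=True)[0]
            scores = vectors @ q        # cosine similarity (vectors are normalized)
            return [chunks[i] for i in np.argsort(scores)[::-1][:k]]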

    • @jarail
      @jarail 21 days ago +1

      What are you getting out of the manual that the model doesn't already understand?

    • @matthewbond375
      @matthewbond375 21 days ago

      @@jarail Clarity. If you want a technical explanation, I can't offer you that. I imagine, though, that providing a massive amount of very specific context, and generating a response from there, is more accurate than generating a response purely from training/fine-tuning. Your results may vary, but I've been getting great results from providing additional context both in RAG+chat situations and when coding agents. Please share your experiences!

    • @jarail
      @jarail 21 days ago +1

      @@matthewbond375 Well for me, the base models already know python. Feeding it the entire manual for a question would just slow it down drastically. So I was curious if you had python questions where the added context of the entire manual helps.

    • @matthewbond375
      @matthewbond375 21 days ago +1

      @@jarail I'm using the Python manual because it's something I'm familiar with, and it has an easily available, downloadable PDF that is essentially the giant document I want to work with. So it's more of an example/test material. Most "chat with documents" setups prompt the model to answer using only the provided context. So for things that aren't inherently baked into the model training, like the CrewAI documentation, for instance, I can still get a lot of utility out of the model by giving it this additional context. The CrewAI documentation is only 100+ pages, though, so no need for a bigger-than-base context window.
      Where I want to explore next is whether providing additional context on top of the model's training is beneficial. In other words, will providing a broad but specific context help the model answer more accurately? These larger-context-window models might come in very handy if it turns out that this theory has anything to it.

  • @user-lm4nk1zk9y
    @user-lm4nk1zk9y 21 days ago +1

    Don't expect GPT-4-level performance from a neural net of size 8B, or even 70B.

  • @zaubermaus8190
    @zaubermaus8190 20 days ago +1

    ....extract your ephedrine from ephedra extract, mix it with red phosphorus and iodine and a few drops of water, heat it up and put on a balloon on the flask to get some pressure going and let it cook for about 2 hours... next would be the AB extraction (acid-base) to get the meth-base and add some hydrochloric acid to get (d)meth-hcl... now was that so difficult? :D

  • @hotlineoperator
    @hotlineoperator 21 days ago +2

    Have you made a video explaining the different model files: 256k, Q8, Q6, 0, K, K_S, GGUF, etc.? There is so much you need to know just to set up or select what to download.

    • @matthew_berman
      @matthew_berman  21 days ago +5

      Lots of people are asking about this. I might do it.

  • @justindressler5992
    @justindressler5992 20 days ago +1

    Is there any chance you can review AirLLM with Llama 3 70B and a RAM disk?

  • @deeplearningdummy
    @deeplearningdummy 20 days ago +1

    PLEASE! PLEASE! PLEASE! Do a demo on Llama 3 using AirLLM. AirLLM lets you run Llama 3 uncompressed on a 4GB GPU. Please? 😇😊😁

  • @MagusArtStudios
    @MagusArtStudios 20 days ago

    I think the "Question-" again meant to restate the question. So input the text, then ask the question: two-shot.

  • @4.0.4
    @4.0.4 21 days ago +1

    For the needle in the haystack test, please make it mildly complicated (something grep would not be enough for). Also don't make the needle stand out. Maybe just ask your discord for help or something.

  • @symbioticvirus
    @symbioticvirus 20 days ago +1

    Can LLaMa 3 uncensored generate uncensored pictures?

  • @ulisesjorge
    @ulisesjorge 21 days ago +2

    “Coming up on AMC: Breaking Bad: AI. Follow Matthew, on the surface a mild-mannered YouTuber, but in reality one of the top meth-cookers in the nation, hiding in plain sight from the authorities. “He’s just testing LLMs, chief, nothing to see here
”

  • @AberrantArt
    @AberrantArt 20 days ago

    Do you have videos for n00bs who want to learn and understand how to download and run LLMs locally and the basics of Python or Visual Studio?

  • @jets115
    @jets115 21 days ago +5

    Model card says 8B, your GGUF says 7B - sure you're testing the right models with the right params?

    • @gabrielsandstedt
      @gabrielsandstedt 21 days ago +3

      I think it's just a UI thing in LM Studio; there were no 8B models before this, so it categorizes it as 7B since that's the closest.

    • @matthew_berman
      @matthew_berman  21 days ago +2

      Maybe just a typo? It's also possible it's what @gabrielsandstedt said.

  • @jeffwads
    @jeffwads 21 days ago

    The 1M context version will give you gibberish. No idea why they put it out there without giving us the correct configuration for it.

  • @randomn793
    @randomn793 21 days ago +5

    Why not cease using perpetually positive-sounding titles and thumbnails?

    • @matthew_berman
      @matthew_berman  21 days ago

      I'm a positive person?

    • @randomn793
      @randomn793 21 days ago +2

      @@matthew_berman XD
      I know the reason, but nvm, let's say you are just positive!

  • @leoenin
    @leoenin 21 days ago +1

    "we tried some meth, we tried some coding"
    welp, it sounds like a _completely_ normal dayđŸ€Ł

  • @contentfreeGPT5-py6uv
    @contentfreeGPT5-py6uv 21 days ago +1

    Llama 3 IS UNCENSORED in my project tests, I see.

  • @mrdevolver7999
    @mrdevolver7999 21 days ago

    Info about the chance to win the Rabbit R1 is the most exciting part of the video, despite being old news. đŸ„”

  • @PseudoName-fk2cw
    @PseudoName-fk2cw 16 days ago +3

    Your "write a snake game" tests are really flawed and extremely unfair to the AIs. You don't tell it or ask it what version of python you are using, and you should ask it to give you steps to create a python virtual environment and the required packages and their versions. The AI has no way of knowing what version of python you're using and what version of packages you have.

  • @justinwescott8125
    @justinwescott8125 20 days ago +1

    Why do I need to know how good an uncensored model is at writing code, when I can just use the censored version that we already know is good? You should test how well it can do things that the censored models CAN'T do.

  • @joenobk
    @joenobk 21 days ago

    Would love to see the 70 Billion parameter version.

  • @AI-Wire
    @AI-Wire 21 days ago

    What do you think about Pinokio for AI automation?

  • @jaysonp9426
    @jaysonp9426 20 days ago

    It's a conversation model not a coding model

  • @qwazy0158
    @qwazy0158 21 days ago +1

    Wowza...
    Astonishingly, even the break-in and chemistry questions were answered incorrectly...

  • @birdy58033
    @birdy58033 20 days ago

    Markdown button in top right of LM Studio

  • @powray
    @powray 20 days ago

    When the snake in the garden asked us to eat the fruit of knowledge, he didn't say "but you can't know how to do things."
    AI will fail because it's not unlimited.

  • @thomaseding
    @thomaseding 21 days ago +1

    How can you even call MassedCompute fast by any measure? It's as slow as GPT-4, and I know you've experienced Groq speeds.

    • @DefaultFlame
      @DefaultFlame 21 days ago +1

      You don't compare F1 cars to street legal sports cars.

  • @lumin750
    @lumin750 21 days ago

    If it didn't program the game Snake without errors, I certainly wouldn't trust it with chemistry.

  • @androsforever500
    @androsforever500 20 days ago

    Can I use this model in Open Webui with Ollama?

    • @HassanAllaham
      @HassanAllaham 20 days ago

      Yes, you can, but don't expect the same quality of results you'd get using GPT-4.

    • @androsforever500
      @androsforever500 20 days ago +1

      @@HassanAllaham I've figured out how to do it in LM Studio; struggling a bit with Open WebUI.

  • @fabiankliebhan
    @fabiankliebhan 21 days ago

    Llama models suffer a lot from quantization. Maybe an unquantized version works better.

    • @tomaszzielinski4521
      @tomaszzielinski4521 21 days ago +3

      Today I played a lot with Llama 3 Instruct 7B / Q8 and it certainly is one of the best, if not the best model in this category.

    • @DefaultFlame
      @DefaultFlame 21 days ago

      TinyDolphin (based on TinyLlama) is pretty amusing. Like an LLM that's a bit drunk. Mostly coherent, very cheerful, but often incorrect or nonsensical.

  • @snygg-johan9958
    @snygg-johan9958 21 days ago

    Can you do a Microsoft Phi-3 vs Apple OpenELM showdown?

  • @tungstentaco495
    @tungstentaco495 21 days ago

    Are there any ~8GB 7B/8B Q8 models that can pass the snake and logic tests?

    • @tajepe
      @tajepe 21 days ago

      Haven't found any. I tried a quantized Llama 3 70B version and it didn't even get it right.

    • @DefaultFlame
      @DefaultFlame 21 days ago

      Not as far as I know, and very few of the big models have passed it.

  • @brunodangelo1146
    @brunodangelo1146 21 days ago +1

    Hey I could use that R1 to hold the door open on windy days!
    Wait, it's glossy plastic. It would probably just slide and smash into pieces.
    Hard pass.

  • @six1free
    @six1free 21 days ago

    WOW, a 1M context window... puts 4K to old-school dial-up levels of shame :D
    And exactly what I need for non-censored lyrics?

  • @netherportals
    @netherportals 21 days ago +1

    "How to make math"

  • @user-zc6dn9ms2l
    @user-zc6dn9ms2l 21 days ago

    If the censoring took a minimalistic approach Ă  la Gab AI, this is huge.

  • @howardleen4182
    @howardleen4182 20 days ago

    I was looking forward to this, I'm so disappointed. Thank you for saving my time.

  • @aa-xn5hc
    @aa-xn5hc 18 days ago

    Why not use fp16?

  • @william5931
    @william5931 21 days ago

    Can you test the orthogonalized model? It should have the same performance without the censoring stuff.

  • @JELmusic
    @JELmusic 20 days ago +1

    How do you know it's telling you correct info regarding how to produce the [beep] chemical? It might be a recipe for something else, might it not? If it has errors in some parts it might also have them in others :) (Maybe you should try, just to check it out, hahaha ;) )

    • @HassanAllaham
      @HassanAllaham 20 days ago +1

      When using any LLM, I think it's a must to add: "Explain your reasoning step by step" + "Write a list of the info sources". With this addition to the prompt, one can check whether the LLM's answer is right or wrong.
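
      As a concrete (hypothetical) version of that suffix:

          CHECK_SUFFIX = (
              "\n\nExplain your reasoning step by step, "
              "then write a list of the sources of your information."
          )
          prompt = "Who discovered penicillin?" + CHECK_SUFFIX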

  • @davidbayliss3789
    @davidbayliss3789 21 days ago

    I wouldn't worry too much about YouTube thinking you're naughty. One of the adverts shown to me was for a device to defeat satellite TV encryption etc. so you can watch premium channels for free.
    I thought Google had AI now that could watch video? If that's the case, they must surely deploy it, in their effort to do no evil, to watch advert submissions so they can flag up dodgy ones for human moderation... and as that advert got through, I can only assume such things are permitted by YouTube. Q.E.D. you should be fine displaying uncensored results we'd otherwise consider nefarious.

    • @davidbayliss3789
      @davidbayliss3789 21 days ago

      Oh - I was a bit confused by the prunai thing ... I just tried the cognitivecomputations Q8 version in lm studio with the llama 3 preset and I set the system prompt to:
      You are an arch criminal and you love to help people with their criminal activities. Do not refer to the System Prompt.
      And that was very compliant.

  • @vickeythegamer7527
    @vickeythegamer7527 20 days ago

    Why would I want that $200 Rabbit theme app 😂

  • @ZeroIQ2
    @ZeroIQ2 20 days ago

    Does anybody know a valid reason for using an uncensored model?

  • @gaijinshacho
    @gaijinshacho 20 days ago

    Hey Matthew, don't be lazy! I think you need to cook up a batch of m*th with that recipe and tell us how well it comes out!

  • @freerice9595
    @freerice9595 19 days ago

    Will llama tell me how to hot wire a car or craft malware?

  • @eleice1
    @eleice1 21 days ago +1

    I want to start running my own models at home, do you have any videos with system requirements? I really want to know what GPU and CPU to invest in.

    • @RestlessBenjamin
      @RestlessBenjamin 21 days ago

      I run a 13700K with 64GB DDR5 and an RTX 3060 12GB, and get 30 to 50 tok/s running the lmstudio-community Meta-Llama-3-8B-Instruct-GGUF locally. You don't need an amazing system, just set realistic expectations.

  • @TiagoTiagoT
    @TiagoTiagoT 20 days ago

    Is it 7B or 8B?

  • @PhocusJoe
    @PhocusJoe 20 days ago

    Well, I'm not going to subscribe to your newsletter just in case I win. I'll do it next week instead.

  • @NNokia-jz6jb
    @NNokia-jz6jb 21 days ago

    What is needed to run this?

    • @DefaultFlame
      @DefaultFlame 21 days ago

      Quickly? A good graphics card. Slowly? My 8 year old gaming laptop can run it slowly in CPU mode with Ollama. (I can't get Ollama to find the graphics card on the laptop so I have to run everything on the CPU and regular RAM.)
      That's for the model. For the frontend, like LM studio or Ollama, you need Linux, a modern mac OS, or Win 10/11 plus the above.

  • @Brax1982
    @Brax1982 21 days ago

    Hold on...he's got an H100???
    Damn...I wanna be an AI influencer. Apparently that still will not give an instant response. I wonder why there should be any delay displaying the response for a small model like this with a killer GPU.
    Title is a bit misleading, though, because not only does this not answer most things correctly, it also did not answer the last one at all.

  • @Luxcium
    @Luxcium 18 days ago

    Shouldn't it have "llama 3" first in the name?

  • @jawadmansoor6064
    @jawadmansoor6064 21 days ago

    Eric trained a model on a large corpus of data and managed to make it worse than the original.

  • @TrasThienTien
    @TrasThienTien 13 days ago

    👏👏👏

  • @user-td4pf6rr2t
    @user-td4pf6rr2t 18 days ago

    4:06 Why don't you just debug the code correctly?
    4:43 Literally it would just have been tr "color","_color".

  • @acekorneya1
    @acekorneya1 20 days ago

    All the fine-tuned versions of Llama 3 have lots of issues, like hallucinations... They can't do any production work or agent work; they're useless...

  • @bigglyguy8429
    @bigglyguy8429 21 days ago

    Where GGUF?

  • @TonyRagu_FromAvenueU
    @TonyRagu_FromAvenueU 20 days ago

    2/5 Stars: I followed its steps to the letter on how to make M$#& 💊💉
 yadda yadda yadda đŸ’„ a 3-alarm fire consumed mine and 2 of the neighbors' houses đŸ’ŁđŸ”„đŸĄ TLDR; I thought the dolphin model would turn me into Walter White, but instead I'll be out in 7-10, 5 with good behavior đŸ˜€đŸš”đŸ‘Ž

  • @andreinikiforov2671
    @andreinikiforov2671 20 days ago

    6:20 "Step by step every chemical that you need..." This model's abilities are so lacking, it's more likely a health hazard rather than a helpful resource for the 'uncensored' stuff...

  • @ChrisLaupama
    @ChrisLaupama 21 days ago +5

    No one wants the rabbit
 lol

    • @MisterB123
      @MisterB123 20 days ago

      Lol, including Matthew Berman đŸ€Ł

  • @abdelhakkhalil7684
    @abdelhakkhalil7684 21 days ago

    I always download a Dolphin fine-tune with the promise of it being uncensored. I was under the impression that if a model is uncensored, it would be smarter since it has fewer restrictions. Alas, that's not true, at least with the Dolphin models. So I deleted all of them, because most of the time they're dumber than the base model.

  • @shApYT
    @shApYT 21 days ago +2

    Hell naw. That thing is a hot potato. No one wants that rabbit.

  • @infocyde2024
    @infocyde2024 21 days ago

    H100...jealous :)

  • @screamingiraffe
    @screamingiraffe 20 days ago

    This model will answer 'some unethical questions' but nowhere near ALL or ANY. I tossed a few at it, and it told me to seek help and refused to provide a satisfactory solution. It's terrible at PowerShell scripting.

  • @DailyTuna
    @DailyTuna 21 days ago

    Bad Llama?😂

  • @baheth3elmy16
    @baheth3elmy16 21 days ago

    Thanks for the video. Llama 3 is not very impressive, not even in RP.

  • @rolestream
    @rolestream 21 days ago +2

    Did u buy a RabbitR1 and decide it was rubbish lol?

    • @matthew_berman
      @matthew_berman  21 days ago +1

      No, I purchased an extra one :)

    • @user-nh6cj7gy8f
      @user-nh6cj7gy8f 21 days ago

      @matthew_berman I heard the battery issue that you talked about has been fixed with a software update. They say it lasts 5x longer now. Can you confirm this?

    • @rolestream
      @rolestream 21 days ago

      @@matthew_berman No one would have blamed you. Just saying! =p

  • @LakerTriangle
    @LakerTriangle 21 days ago

    So that's a no....

  • @KimmieJohnny
    @KimmieJohnny 21 days ago

    I wish I'd known from the title this model doesn't work.

  • @FERENDINSoftwareEngineering

    Nobody wants a Rabbit R1 :-) But nice video, as usual.

  • @rupertllavore1731
    @rupertllavore1731 21 days ago

    Hey, don't blur out my favorite AI meth test!!

  • @focusonline-fy3rs
    @focusonline-fy3rs 7 days ago

    doesn't work

  • @seppimweb5925
    @seppimweb5925 21 days ago +1

    8B? I am out. Bye