Prompt Engineering
Why Cartesia-AI's Voice Tech is a Game-Changer You Can't Ignore!
In this video, I'm excited to introduce Cartesia AI's revolutionary real-time text-to-speech system, Sonic, which offers 135ms model latency and lifelike generative voice capabilities. I'll demonstrate how this versatile API can be integrated into your projects, including a step-by-step guide on obtaining and using the API key. With a variety of voices to choose from, including options for emotion customization, this platform stands out for its quality and speed. I'll also cover setting up a voice-to-voice chat assistant and how you can configure the voices for your needs. Stay tuned for more on voice cloning and advanced setups in upcoming videos!
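
As a rough sketch of the kind of API call covered in the video (the endpoint path, header name, and payload fields below are assumptions for illustration only; check the docs at play.cartesia.ai for the current API reference):

    # Hypothetical sketch of a single Sonic TTS request; items marked "assumed" are not official.
    import os
    import requests

    API_KEY = os.environ["CARTESIA_API_KEY"]          # key created in the playground

    resp = requests.post(
        "https://api.cartesia.ai/tts/bytes",          # assumed endpoint path
        headers={"X-API-Key": API_KEY},               # assumed header name
        json={
            "model_id": "sonic-english",              # assumed model identifier
            "voice_id": "YOUR_VOICE_ID",              # copy a voice ID from the playground
            "transcript": "Hello! This is Sonic speaking.",
            "output_format": {"container": "wav", "sample_rate": 44100},  # assumed fields
        },
        timeout=30,
    )
    resp.raise_for_status()

    with open("sonic_output.wav", "wb") as f:
        f.write(resp.content)                         # raw audio bytes returned by the API

The comments below also suggest requesting streamed audio rather than one blocking response to cut the time to the first audible chunk.
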
#tts #aivoice #voicechat
🦾 Discord: discord.com/invite/t4eYQRUcXB
☕ Buy me a Coffee: ko-fi.com/promptengineering
|🔴 Patreon: www.patreon.com/PromptEngineering
💼Consulting: calendly.com/engineerprompt/consulting-call
📧 Business Contact: engineerprompt@gmail.com
Become Member: tinyurl.com/y5h28s6h
💻 Pre-configured localGPT VM: bit.ly/localGPT (use Code: PromptEngineering for 50% off).
Signup for Advanced RAG:
tally.so/r/3y9bb0a
LINKS:
play.cartesia.ai/
Verbi Github: github.com/PromtEngineer/Verbi
TIMESTAMPS
00:00 Introduction to Cartesia AI's Text-to-Speech System
00:51 Demonstrating Voice Generation Speed and Quality
01:20 Exploring Different Voice Profiles
03:03 Setting Up Your Account and API Key
03:40 Customizing Voice Parameters
05:22 Implementing the Text-to-Speech System
05:53 Running the Standalone Example
10:36 Voice-to-Voice Chat Assistant Project
13:03 Conclusion and Future Plans
All Interesting Videos:
Everything LangChain: czcams.com/play/PLVEEucA9MYhOu89CX8H3MBZqayTbcCTMr.html
Everything LLM: czcams.com/play/PLVEEucA9MYhNF5-zeb4Iw2Nl1OKTH-Txw.html
Everything Midjourney: czcams.com/play/PLVEEucA9MYhMdrdHZtFeEebl20LPkaSmw.html
AI Image Generation: czcams.com/play/PLVEEucA9MYhPVgYazU5hx6emMXtargd4z.html
Views: 8,280

Videos

Marker: This Open-Source Tool will make your PDFs LLM Ready
17K views · 4 hours ago
In this video, I discuss the challenges of working with PDFs for LLM applications and introduce you to an open-source tool called Marker. Marker simplifies the conversion of complex PDF files into structured Markdown, making data extraction much easier. I compare Marker with Nougat, showing its superior performance in preserving document structure accurately. Additionally, I give a detailed tuto...
Master Fine-Tuning Mistral AI Models with Official Mistral-FineTune Package
5K views · 14 hours ago
In this video, I walk you through the official Mistral AI fine-tuning guide using their new Mistral FineTune package. This lightweight code base enables memory-efficient and high-performance fine-tuning of Mistral models. I delve into the detailed data preparation process and explain how to format your datasets correctly in JSONL format to get the best results. We'll also set up an example trai...
Advanced Function Calling with Mistral-7B - Multi function and Nested Tool Usage
4.8K views · 19 hours ago
Testing Multi and Nested Function Calls with Mistral 7b In this video, I explore the advanced function calling capabilities of the Mistral 7b v3 model, including multi-function and nested function calls. Using a Google Colab notebook by Uncle Code, I demonstrate how to set up, install the Mistral inference package, and log into Hugging Face hub. Practical examples of the model handling multiple...
ChatGPT Desktop App: First Impressions and What's Missing!!!
3.8K views · 21 hours ago
Official ChatGPT Desktop App for Mac OS - Early Access Review and Features In this video, I explore and review the newly released official ChatGPT desktop app for Mac OS. After downloading it from the ChatGPT website, I walk you through the installation process, launching the app, logging in, and utilizing its various features such as text input, uploading files, and voice conversations. I also...
NEW MISTRAL: Uncensored and Powerful with Function Calling
7K views · 1 day ago
In this video, I explore the new Mistral 7B-v0.3 model, now available on Hugging Face. I'll show you how to install the Mistral inference package, download the model, and run initial queries. We also test its performance and highlight its new features like uncensored responses and function calling. Stay tuned for future videos on fine-tuning this model! #mistral #functioncalling #llm 🦾 Discord:...
INSANELY FAST Talking AI: Powered by Groq & Deepgram
7K views · 1 day ago
Fastest Voice Chat Inference with Groq and DeepGram In this video, I show how to achieve the fastest voice chat inference using Groq and DeepGram APIs. I compare their speeds to OpenAI’s Whisper and demonstrate how to set up and code the process. Learn about handling rate limits, buffering issues, and how to get started with these services. Stay tuned for future videos on local model implementa...
Creating JARVIS - Your Voice Assistant with Memory
6K views · 14 days ago
In this video, you will see a demo of a voice assistant that can remember past conversations. We use external APIs like OpenAI's Whisper for audio transcription, GPT-4 for generating responses, and OpenAI's voice engine for text-to-speech conversion. The main focus is on using modular code and OpenAI's tools to construct a conversational assistant with a memory feature. 🦾 Discord: discord.com/i...
Creating J.A.R.V.I.S.
3.5K views · 14 days ago
A sneak peek of voice-to-voice chat assistant. 🦾 Discord: discord.com/invite/t4eYQRUcXB ☕ Buy me a Coffee: ko-fi.com/promptengineering |🔴 Patreon: www.patreon.com/PromptEngineering 💼Consulting: calendly.com/engineerprompt/consulting-call 📧 Business Contact: engineerprompt@gmail.com Become Member: tinyurl.com/y5h28s6h 💻 Pre-configured localGPT VM: bit.ly/localGPT (use Code: PromptEngineering for...
First Impressions of Gemini Flash 1.5 - The Fastest 1 Million Token Model
7K views · 14 days ago
Just checked out Google's new Gemini Flash at Google I/O. It's a super-fast AI model designed for handling big tasks - think processing videos, audios, or huge codebases, all while keeping costs low. I put it through its paces against giants like GPT 3.5 and GPT 4.0, looking at performance, costs, and how it handles real-world tasks. I even tried confusing it with tricky questions and coding ch...
Google IO: Agents is The Future - Demos
3.4K views · 14 days ago
Google I/O was all about agents. Here are some of the demos that were shown. 🦾 Discord: discord.com/invite/t4eYQRUcXB ☕ Buy me a Coffee: ko-fi.com/promptengineering |🔴 Patreon: www.patreon.com/PromptEngineering 💼Consulting: calendly.com/engineerprompt/consulting-call 📧 Business Contact: engineerprompt@gmail.com Become Member: tinyurl.com/y5h28s6h 💻 Pre-configured localGPT VM: bit.ly/localGPT (use Code: P...
Getting Started with GPT-4o API, Image Understanding, Function Calling and MORE
8K views · 14 days ago
Getting Started with GPT-4o: A Comprehensive Tutorial. This video tutorial guides you through the basics of getting started with the GPT-4o API, including comparisons with GPT-4 Turbo, exploring capabilities like text generation, image understanding, and function calling. 🦾 Discord: discord.com/invite/t4eYQRUcXB ☕ Buy me a Coffee: ko-fi.com/promptengineering |🔴 Patreon: www.patreon.com/Prompt...
GPT-4o: OpenAI's NEW OMNI-MODEL Can DO it ALL
4.3K views · 14 days ago
In this video we look at GPT-4 OmniModel, a groundbreaking AI model capable of processing and responding to audio, vision, and text in real-time. Demonstrating its versatility, the video showcases various scenarios including customer support, language translation, and educational tutoring, highlighting the OmniModel's ability to understand and interact in near-human response times. 🦾 Discord: d...
Yi-1.5: True Apache 2.0 Competitor to LLAMA-3
6K views · 14 days ago
In this video, we will look at the Yi-1.5 series of models, which were just released by 01-AI. This update includes 3 different models with sizes ranging from 6 billion to 34 billion parameters and training on up to 4.1 trillion tokens. All models are released under the Apache 2.0 license. 🦾 Discord: discord.com/invite/t4eYQRUcXB ☕ Buy me a Coffee: ko-fi.com/promptengineering |🔴 Patreon: www.patreon.com/PromptEn...
NVIDIA ChatRTX: Private Chatbot for Your Files, Image Search via Voice | How to get started
8K views · 21 days ago
This video provides an in-depth review and tutorial of NVIDIA's ChatRTX, a new tool designed for users with RTX GPUs on Windows PCs. The tool leverages Retrieval-Augmented Generation and TensorRT-LLM alongside RTX acceleration to chat with documents and use voice interaction. It now supports local photo and image search with improvements in its features. The application requires spe...
Free LOCAL Copilot to Take Your Coding to the NEXT LEVEL
6K views · 21 days ago
Free Copilot to Take Your Coding to the NEXT LEVEL
13K views · 21 days ago
Llama-3 🦙 with LocalGPT: Chat with YOUR Documents in Private
9K views · 28 days ago
Extending Llama-3 to 1M+ Tokens - Does it Impact the Performance?
11K views · 1 month ago
Get your own custom Phi-3-mini for your use cases
11K views · 1 month ago
How Good is LLAMA-3 for RAG, Routing, and Function Calling
8K views · 1 month ago
How Good is Phi-3-Mini for RAG, Routing, Agents
10K views · 1 month ago
Does Size Matter? Phi-3-Mini Punching Above its Size on "BENCHMARKS"
5K views · 1 month ago
Llama-3 Is Not Really THAT Censored
7K views · 1 month ago
MIXTRAL 8x22B: The BEST MoE Just got Better | RAG and Function Calling
4.1K views · 1 month ago
Insanely Fast LLAMA-3 on Groq Playground and API for FREE
24K views · 1 month ago
LLAMA-3 🦙: EASIEST WAY To FINE-TUNE ON YOUR DATA 🙌
45K views · 1 month ago
LLAMA 3 Released - All You Need to Know
11K views · 1 month ago
WizardLM 2 - First Open Model Outperforming GPT-4
16K views · 1 month ago
Create Financial Agents with Vision 👀 - Powered by Claude 3 Haiku & Opus
6K views · 1 month ago

Comments

  • @aifortune · an hour ago

    I'm all in. Better price than ElevenLabs.

  • @Cedric_0 · 4 hours ago

    Was working on a project where I need to use my local language, but I'm having issues with the Coqui AI TTS library. Any other alternative that would be helpful and easy to use? Thank you.

  • @Beetgrape · 6 hours ago

    Is it faster than Deepgram?

    • @engineerprompt · 6 hours ago

      Yes, on the playground. The Cartesia team recommends streaming. I am going to test that and report back.

  • @hnb13686 · 7 hours ago

    This is not completely open-source, so don't report it as such with the clarification only coming midway through the vid.

  • @GAllium14 · 7 hours ago

    What software do you use for those super smooth zooms?

    • @engineerprompt · 6 hours ago

      It's called Screen Studio. It's only for Mac.

  • @gorripotinikhileswar7087 · 8 hours ago

    Hey, can we use this offline?

  • @cristian_palau · 8 hours ago

    Thank you for sharing these excellent tools!

  • @MrLogansrun35 · 10 hours ago

    Why do they censor these models? AI should remain unbiased and present facts when asked, not give you reasons why it cannot answer a question just because the truth may offend. Facts don't care about feelings. Glad they have overcome censorship.

  • @Larsbor · 10 hours ago

    OK, as usual the lack of a GUI destroys it for me.. 😢

  • @Larsbor · 10 hours ago

    I am uncertain about Marker. It is for scientific use, but it says it removes footers, which is where you normally put your sources and appendix links... so?!

  • @themorethemerrier281 · 11 hours ago

    This sounds very interesting, but I will need to learn some Python environment basics before I can put this to the test. A solution like this could help me a lot!

  • @johntdavies · 11 hours ago

    Thanks for posting Verbi. I wanted to get it to speak more than just English. I couldn't find any Cartesia models that were anything other than English or American, but ElevenLabs has great multilingual support. The following change in text_to_speech() enables ElevenLabs to speak quite a few languages:

        elif model == 'elevenlabs':
            ELEVENLABS_VOICE_ID = "Rachel"
            client = ElevenLabs(api_key=api_key)
            audio = client.generate(
                text=text,
                voice=ELEVENLABS_VOICE_ID,
                output_format="mp3_22050_32",
                model="eleven_multilingual_v2"
            )
            elevenlabs.save(audio, output_file_path)

  • @chjpiu · 12 hours ago

    Hi, do you know how much RAM is required for this application? I tried, but it said that it was out of memory. My laptop has 16 GB RAM w/o an Nvidia GPU. Thanks a lot.

  • @drgutman · 13 hours ago

    Meh, I thought it was a better local TTS... oh well.

  • @user-yi2mo9km2s · 15 hours ago

    Nobody would pay for services when we can do it on our own PC locally.

  • @user-yi2mo9km2s · 15 hours ago

    No thanks for advertising.

  • @michalbiros6221 · 17 hours ago

    Oh boy, it's three times more expensive than Google's premium voices and only includes English. Skipped.

  • @pawan3133 · 19 hours ago

    Thanks for another great video!! Can you please make a video, or at least share the material, on fine-tuning a quantized Mistral v0.3 model?

    • @engineerprompt · 17 hours ago

      In general, you want to load the model in 4-bit. Look at my fine-tuning videos using Unsloth.
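
      A minimal sketch of the 4-bit loading step, using transformers with bitsandbytes rather than Unsloth; the checkpoint name and quantization settings are illustrative assumptions, not taken from the video:

          # Sketch: 4-bit (NF4) quantized loading with transformers + bitsandbytes.
          import torch
          from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

          model_id = "mistralai/Mistral-7B-v0.3"   # assumed checkpoint name

          bnb_config = BitsAndBytesConfig(
              load_in_4bit=True,
              bnb_4bit_quant_type="nf4",
              bnb_4bit_compute_dtype=torch.bfloat16,
          )

          tokenizer = AutoTokenizer.from_pretrained(model_id)
          model = AutoModelForCausalLM.from_pretrained(
              model_id,
              quantization_config=bnb_config,   # weights are loaded quantized to 4-bit
              device_map="auto",                # place layers on the available GPU(s)
          )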

  • @KCM25NJL · 19 hours ago

    They still have natural cadence issues, which is a hard problem to solve.

    • @engineerprompt · 17 hours ago

      Yes, I think this is just the alpha version, so hopefully it will get better over time.

  • @mohsenghafari7652 · 20 hours ago

    Thanks

  • @ScottzPlaylists · 20 hours ago

    I'm interested in open source only... can't finish watching. Thumbs down, sorry.

  • @Beetgrape · 21 hours ago

    Dude, I wanna deploy this on Hugging Face as an API. Make a tutorial on this.

    • @engineerprompt · 17 hours ago

      A deployment series is coming soon; it will give you an idea of how to do this.

  • @gregsLyrics · 22 hours ago

    Brilliant vid - it is a godsend. OCRing a PDF is just not workable, period. I gave up on attempting to parse PDFs. This new information is amazing and I am once again excited.

  • @MrKarlyboy · 22 hours ago

    If you wanted to plug this into a chatbot, the pricing does not add up. I've done some crunching; it won't even get you far with a basic, smallish customer doing, say, 1,000-3,000 chats a month, which isn't a lot. Most engines price audio in 15-second or 1-minute increments. More good engines are emerging. For our low-end customers we usually see 3 to 5 concurrency anyway, and that's like the smallest model. Currently we have done hundreds of millions of chats, hundreds of millions of live chats too, so getting into the billions. The market is competitive. Some of the new Google Studio voices are comparable, Deepgram too. Sure, these are nice voices, but for a streaming API at this cost in a competitive market, sorry but no, unless the pricing model radically improves. It's early days, so hopefully there will be new models, new options and a realization. Suggestion: take, say, 5,000, 10,000, 30,000 and 100,000 chats, work out the average transcript size on the bot side, and average out the characters. You will see my point!

    • @engineerprompt · 17 hours ago

      That's a valid argument. Hopefully they will be able to reduce their price as they scale.
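
      To make the suggested back-of-the-envelope calculation concrete, a small sketch; the $5 per 100,000 characters figure is the entry-tier price quoted in the pricing thread further down, the average transcript length is an assumption, and real plans may not scale linearly:

          # Back-of-the-envelope TTS cost for the bot side of chat transcripts.
          PRICE_PER_100K_CHARS = 5.00      # USD, entry-tier figure quoted in the pricing thread
          AVG_BOT_CHARS_PER_CHAT = 1_200   # assumption: characters spoken by the bot per chat

          for chats_per_month in (5_000, 10_000, 30_000, 100_000):
              chars = chats_per_month * AVG_BOT_CHARS_PER_CHAT
              cost = chars / 100_000 * PRICE_PER_100K_CHARS
              print(f"{chats_per_month:>7,} chats -> {chars:>13,} chars -> ${cost:>9,.2f}/month")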

  • @christopherchilton-smith6482

    I wonder how far away we are from arbitrarily high accuracy on tasks like this.

    • @engineerprompt · 16 hours ago

      To be honest, when it comes to voice models, open-source models are lagging behind!

  • @greymooses · a day ago

    If you do make a video about scraping data, please go over content that requires JavaScript to load. It's been difficult to find a clear guide specifically for capturing this data for LLM usage. I loved this video, thank you!

    • @engineerprompt · 17 hours ago

      I haven't looked into it before, so let me see what I can come up with.

  • @tx3851 · a day ago

    They do not sound good at all...

  • @arjundesai2715 · a day ago

    Thanks for the feature! Super excited to keep building here. For the best experience w/ the API, I recommend using `stream=True` to get the first audio back super fast. Audio will come back in chunks. We'll add more info about how to use this in our docs.

    • @engineerprompt · 17 hours ago

      Thanks for pointing it out. I do feel the docs need more work; I am going to explore it further. Thanks for putting it together.

  • @keithprice3369 · a day ago

    Has anyone done a demo of a single Cartesia voice outputting something like podcast length - 20 to 30 minutes? The human quality on short text is stunning, but I worry that over longer text it will fall into a repetitive cadence. The fact that voices are cloned from just a 20-second sample reinforces my concern. Have you tested that?

    • @engineerprompt · 17 hours ago

      Interesting point, I will do a test and report back. It will be a fun experiment.

  • @danielpicassomunoz2752

    Anything to convert to EPUB? Getting rid of headers and footers?

  • @Jayden-qq1ei · a day ago

    Markdown from PDFs for LLMs 😁

  • @canaldetestes4517 · a day ago

    Thanks, but I'm Brazilian and didn't find Portuguese in it.

    • @engineerprompt · 17 hours ago

      At the moment, it's English only.

    • @canaldetestes4517 · 11 hours ago

      @engineerprompt Hi, OK. Thank you for your attention and answer.

  • @DevsDoCode · a day ago

    Hey Prompt Engineer, if you don't mind, could I also be a contributor to your project? I have some wonderful features which could help you make your Verbi AI even better and a perfect voice assistant 🥹 It's a request to add me to the group. I wouldn't disappoint you 😼

    • @engineerprompt · 17 hours ago

      Yes, would love contributions. Please open a PR. We have a dedicated channel on the Discord server. Feel free to join the discussion there.

  • @unclecode · a day ago

    Reminds me of ElevenLabs' early days. I think they use stream mode in their playground, measuring the time it takes to generate the first audio segment. That's why it seems very fast. What do you think?

    • @engineerprompt · 17 hours ago

      That's exactly how they are doing it. Their cofounder pointed it out and suggested enabling streaming via the API as well. On the Discord, a contributor to project Verbi said it's possible to get about 200-400ms with streaming. I might redo this.

  • @P-G-77 · a day ago

    This... incredible... awesome. NICE WORK!!

  • @avi7278 · a day ago

    Appreciate your efforts, but why the heck would you need an API call to get the ID of the voice you want to use, or other seemingly static parameters? Also, the API latency is terrible compared to their playground. Either you're still doing something unnecessary or their infrastructure is poor, which defeats the purpose of their supposedly low latency. Further, the text-to-speech piece should be chunked into sentences and streamed to the TTS service instead of waiting for the full response. This is OK for one- or two-sentence responses, but if latency increases linearly then it's no good. Is there endpointing? Interruption?

    • @arjundesai2715 · a day ago

      Thanks for the feedback @avi7278. 1. You can get the voice_id straight from the playground! We'll have support very soon for passing that in directly. 2. For the best experience w/ the API, I recommend using `stream=True` to get the first audio back super fast 🚀. Audio will come back in chunks; we'll add more info about this to our docs. 3. You can definitely send text chunks over the wire; we'll have more native support for text streaming soon.
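
      A hedged sketch of the pattern being suggested here: split the text into sentences, hand each to a streaming synthesis call, and time the first audio chunk. `synthesize_stream` is a hypothetical stand-in for whatever streaming client call you use (for example Cartesia with `stream=True`); it is not a real function name.

          import re
          import time
          from typing import Callable, Iterable, Iterator

          def sentences(text: str) -> Iterator[str]:
              # Naive sentence splitter; good enough for chunking TTS input.
              for s in re.split(r"(?<=[.!?])\s+", text.strip()):
                  if s:
                      yield s

          def speak(text: str, synthesize_stream: Callable[[str], Iterable[bytes]]) -> None:
              """Send one sentence at a time and consume audio chunks as they arrive."""
              t0 = time.perf_counter()
              first_chunk_seen = False
              for sentence in sentences(text):
                  for chunk in synthesize_stream(sentence):   # yields audio bytes incrementally
                      if not first_chunk_seen:
                          print(f"time to first audio chunk: {time.perf_counter() - t0:.3f}s")
                          first_chunk_seen = True
                      # hand `chunk` to an audio player or output buffer here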

  • @gsagum · a day ago

    The free plan is "10,000" characters, while the lowest $5-per-month plan gets you "100,000 characters per month". I re-read that again: it's in "characters", not "words". Am I dreaming? So one letter is one character, right? Is that correct? Isn't that super expensive?

    • @3750gustavo · a day ago

      It's cheap compared to the other paid voice service (ElevenLabs), which gives only 30k characters for 5 dollars; the same 100k characters costs over 20 dollars on ElevenLabs, 4x more expensive. But yeah, compared to other AI services where you pay once and get almost unlimited usage, like Infermatic for text AI, it's expensive.

    • @simongus · a day ago

      And with characters, they can count every space as a character.

    • @BackTiVi · a day ago

      Yup, that's only characters. On average, 1,000 characters is about 1 minute of audio IIRC, so the free tier is 10 minutes of audio. For the same price ($5), the starter pack of ElevenLabs is only 30,000 characters per month, so only half an hour.

    • @ronilevarez901 · 14 hours ago

      @BackTiVi I'll stay with Coqui.
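
      Putting this thread's numbers together: 1,000 characters ≈ 1 minute of audio is the rule of thumb quoted above, and the plan sizes and prices are the figures quoted in these comments, so treat the output as approximate.

          # Plan size in characters per month and price in USD, as quoted in this thread.
          plans = {
              "Cartesia free":      (10_000, 0.0),
              "Cartesia $5 tier":   (100_000, 5.0),
              "ElevenLabs starter": (30_000, 5.0),
          }
          CHARS_PER_MINUTE = 1_000   # rough rule of thumb from the comment above

          for name, (chars, price) in plans.items():
              minutes = chars / CHARS_PER_MINUTE
              per_minute = price / minutes if minutes else 0.0
              print(f"{name:<20} {chars:>8,} chars  ~{minutes:4.0f} min  (${per_minute:.3f}/min)")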

  • @anandgs · a day ago

    I had another question: are you also on Udemy?

    • @engineerprompt · 17 hours ago

      I am not on Udemy, but I'm just launching my RAG course here: prompt-s-site.thinkific.com/courses/rag

  • @anandgs · a day ago

    Thank you very much!!! I was looking for something like this for a long time. I work for a large bank but with a very small budget for my project. Due to the budget crunch we cannot afford to buy third-party tools, so this sounds like a perfect fit, but since there is a $5M limit we may not qualify to use this for free. Would you suggest going with Nougat, or do you have a better alternative for my use case? Really appreciate your content!

    • @engineerprompt · 17 hours ago

      Nougat can be an option, or look into Unstructured.io. Also, I would recommend looking into Claude or GPT-4o with vision if data privacy is not a big issue. Some of these proprietary tools have good data privacy terms based on their ToS.

    • @anandgs · 7 hours ago

      @engineerprompt Thanks for the prompt response!!

  • @MeinDeutschkurs · a day ago

    I'm always so impressed by models like this. But where are all the open-source solutions for this topic? Research is crazy!

  • @tribuzeus · a day ago

    Multi-language?

  • @Zale370 · a day ago

    The more models I use, the less I want to pay for the APIs.

    • @14supersonic · a day ago

      Yeah, we really need this stuff free and open source. The only real limiting factor is the affordability of the GPU(s) needed to run this locally. There's stuff out there, but local open-source audio stuff is sadly behind text- and image-based models; maybe soon that'll change.

    • @Zale370 · a day ago

      @14supersonic Actually, you can use Stable Diffusion locally with a mid-range 12 GB GPU for image generation, and audio models as well; also, quantized LLM models are very good for simpler tasks like summarization.

    • @14supersonic · a day ago

      @Zale370 I know, that's why I said audio-based AI models are behind text- and image-based solutions. When you compare something like local Llama 3 or SD3 to local audio-based AI models, there's no audio modality comparable to them yet in terms of local usage.

    • @Alex29196 · a day ago

      Indeed, there are no optimal and swift text-to-speech (TTS) solutions for local LLM inference. I personally believe this is not solely due to GPU memory constraints but also driven by security considerations.

    • @ts757arse · a day ago

      Yeah, I'm a security consultant and the risks inherent in this are just insane. I won't ever say open source should slow down, but I appreciate the time we are getting to communicate what's coming. Amazingly, the EU AI legislation classifies voice-cloning AI as lower risk. I don't think they've ever gotten a phone call from their doctor asking them to stop a particular medication, or from their wife saying she's being held hostage and the captors are demanding all the money. It gets darker from there.

  • @GameofLifeChannel · a day ago

    Liking without watching coz I know this is gonna be amazing.

  • @manjula_1 · a day ago

    This is very useful! Now, in the next video, show how to fine-tune a model (with a long context length, like "Phi-3-mini-128k-instruct") on this Markdown data 😍😍

  • @engineerprompt · a day ago

    If you are interested in learning more about how to build robust RAG applications, check out this course: prompt-s-site.thinkific.com/courses/rag

  • @drmartinbartos · a day ago

    Around 7 minutes in, having installed a conda environment, you select pip rather than conda when installing PyTorch - any reason why? If there's a working conda option, doesn't it make sense to keep using conda and only use pip when you absolutely have to? Just wondering. (Thanks for the video, btw - I had just been wondering about effective ways of making content reliably available to RAG, and the video is super useful.)

    • @engineerprompt · a day ago

      I usually use pip because it has most of the Python packages available; conda is somewhat limited in which Python packages are available. Conda will also work in this case, but it's more of my own habit at this point :)

  • @Sneakylamah · a day ago

    On my M1 Mac I have tried this out, installing these dependencies:

        dependencies = [
            "torch>=2.3.0",
            "torchvision>=0.18.0",
            "torchaudio>=2.3.0",
            "marker-pdf>=0.2.13",
        ]

    Then when I try out just a single PDF it fails on a simple Python import:

        marker_single 26572517.pdf OUTPUT --max_pages 2 --langs English

        Traceback (most recent call last):
          File "marker/.venv/bin/marker_single", line 5, in <module>
            from convert_single import main
          File "marker/.venv/lib/python3.12/site-packages/convert_single.py", line 5, in <module>
            from marker.convert import convert_single_pdf
        ModuleNotFoundError: No module named 'marker.convert'

    Anyone getting the same? Tried with Python 3.10 and 3.12.

    • @engineerprompt · a day ago

      Are you using a virtual environment? Use this command:

          python -m pip install marker-pdf

      This will ensure it installs the package into the current virtual env.

    • @Sneakylamah · a day ago

      Using Rye, and yes, it is there in my virtual env.

    • @Sneakylamah · a day ago

      The marker scripts are there to be called.

    • @Sneakylamah · a day ago

      @engineerprompt OK, the problem seems to be with the way Rye handled the imports, sorry about that. Creating the virtual env normally, I can run the commands. Thanks for the video; I have been looking for how to do this for a long time.
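
      For anyone hitting the same import error, a small sanity-check sketch: confirm which interpreter is active and where the `marker` package resolves from, then run the same `marker_single` command from this thread. Only standard-library calls and the CLI invocation already shown above are used; the PDF file name is the commenter's example.

          import importlib.util
          import subprocess
          import sys

          # Confirm the active interpreter belongs to the venv where marker-pdf was installed,
          # and where the `marker` package actually resolves from.
          print("interpreter:", sys.executable)
          print("marker spec:", importlib.util.find_spec("marker"))

          # Same invocation as in the comment above, driven from Python.
          subprocess.run(
              ["marker_single", "26572517.pdf", "OUTPUT", "--max_pages", "2", "--langs", "English"],
              check=True,
          )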

  • @DavidJNowak · a day ago

    What I want is for LLMs to cook my next meal.

  • @drmetroyt · a day ago

    Docker version, please.