Very interesting and insightful. Thank you very much, Sam.
This is such a cool example. I was looking for this for a long time. Cheers for that!
Thanks a lot. Great content!
You can save this text metadata back into your image files' EXIF, so it always travels hand-in-hand with the image without extra files lying around.
Great insight! Can you please provide more details for those of us getting started? Many thanks in advance!
A quick search shows that "EXIF metadata is restricted in size to 64 kB in JPEG images, because according to the specification, this information must be contained within a single JPEG APP1 segment." The relevant metadata tag is ImageDescription.
@@WhySoBroke

import piexif
import piexif.helper  # must be imported explicitly

def add_description_to_exif(image_file, description):
    # Load the existing EXIF data
    exif_dict = piexif.load(image_file)
    # Add or update the EXIF tag with your description,
    # for example using the UserComment tag
    exif_dict['Exif'][piexif.ExifIFD.UserComment] = piexif.helper.UserComment.dump(description)
    # Write the modified EXIF data back to the image
    exif_bytes = piexif.dump(exif_dict)
    piexif.insert(exif_bytes, image_file)

# Usage example
description = "Generated description of the image."
image_file = "path_to_your_image.jpg"
add_description_to_exif(image_file, description)
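Given the 64 kB APP1 limit mentioned above, a quick size check before inserting can avoid write errors. A minimal sketch (the constant reflects the single-APP1-segment cap from the spec; the check itself is library-agnostic):

```python
# JPEG EXIF data must fit in a single APP1 segment (64 kB per the spec)
APP1_LIMIT = 64 * 1024

def fits_in_app1(exif_bytes):
    """Return True if a dumped EXIF blob is within the APP1 size cap."""
    return len(exif_bytes) <= APP1_LIMIT
```

You would call this on the output of piexif.dump(exif_dict) before piexif.insert, and trim the description (or the thumbnail) if it comes back False.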
I am really struggling with getting the data into places where the tools I use will actually read or display it. Plus it appears that the Windows thumbnail uses up most of the available EXIF space, so I will need to drop that piece. On top of all that, the libraries like decompressing the images, which I really don't like.
That is exactly what I was looking for, thanks a lot!
This is right into the awesomeness space! Thanks for sharing this project! (Yesterday I was working on a similar solution using ComfyUI + Python exporting, but this is way cleaner.)
I really like the look of ComfyUI. I need to make some time to play with it.
Great video, thanks very much!
This is great! Combined with the idea of putting the result into the EXIF metadata, this would be awesome 😎
Thanks for sharing, this is very useful and it's a good resource that I keep coming back to.
These are the four questions I ask llava, and then I put the results manually into the comment section of the EXIF metadata:
describe this image in great detail
write the 10 most relevant questions for this image
answer the 10 above questions in the correct order
write the 20 most relevant tags for instagram
I will try to automate this workflow to keyword my photo collection, thanks for this tutorial!
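A starting sketch for that automation, using the ollama Python client (the model name and the offline placeholder fallback are my own assumptions; the four prompts are the ones listed above):

```python
PROMPTS = [
    "describe this image in great detail",
    "write the 10 most relevant questions for this image",
    "answer the 10 above questions in the correct order",
    "write the 20 most relevant tags for instagram",
]

def caption_image(image_path, client=None, model="llava"):
    """Run each prompt against one image and join the answers.

    If no ollama Client is passed, emit placeholders so the plumbing
    can be exercised without a running server.
    """
    answers = []
    for prompt in PROMPTS:
        if client is None:
            # no server available: keep a placeholder per prompt
            answers.append(f"[{prompt}]")
            continue
        with open(image_path, "rb") as f:
            resp = client.generate(model=model, prompt=prompt,
                                   images=[f.read()])
        answers.append(resp["response"])
    return "\n\n".join(answers)
```

With a server running it would be something like `from ollama import Client; text = caption_image("photo.jpg", Client())`, and the joined text can then go into the EXIF comment as discussed above.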
I am trying to do this myself. I am struggling with the EXIF writing; I keep getting space-limitation errors. I think it's due to the Windows thumbnails.
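If it helps, one workaround sketch for the thumbnail problem using piexif (assuming `pip install piexif`; piexif.load() returns a dict whose '1st' and 'thumbnail' entries hold the embedded thumbnail):

```python
def strip_thumbnail(exif_dict):
    """Drop the embedded thumbnail IFD to reclaim APP1 space.

    piexif.load() returns a dict with '0th', 'Exif', 'GPS', '1st' and
    'thumbnail' keys; the embedded thumbnail lives in the last two.
    """
    exif_dict["thumbnail"] = None
    exif_dict["1st"] = {}
    return exif_dict

def write_description(image_file, description):
    # piexif imported here so strip_thumbnail stays usable without it
    import piexif
    import piexif.helper
    exif_dict = strip_thumbnail(piexif.load(image_file))
    exif_dict["Exif"][piexif.ExifIFD.UserComment] = (
        piexif.helper.UserComment.dump(description))
    piexif.insert(piexif.dump(exif_dict), image_file)
```

Dropping the thumbnail before dumping should free most of the 64 kB segment for the description text.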
Excellent!!! Was just playing around with moondream. Perfect timing ;)
I was hoping they would put Moondream in here as well. I also played with that and was impressed by what it could do for its size.
Awesome video. Appreciate you demystifying the process and tying in queuing, dataframe, and RAG concepts; some powerful stuff. It will be interesting to do an apples-to-apples comparison with GPT Vision and Gemini Vision functionality.
Great video, really what I was looking for: some useful real-world cases for how to use LLMs locally (instead of paying a company to do this for us; of course it's more secure and private too). What I would love to see is how to integrate this example to create a tweet for us about the image, store it in the CSV file, and then post the image with that tweet at intervals directly, maybe using Twitter's API? Not very tech savvy myself, but very interested in putting LLMs to some real-world use and automation. Thanks for making these videos.
Good job, and thank you again for sharing your knowledge and showing us how to do useful stuff. I'd also be interested in seeing how you create a professional web user interface for this and other projects going forward. What are some good ways of doing this which are easy to make look good and modern, and which run on all major browsers?
Good to know this. I look forward to an example of how to create a professional front end using NextJS if you'd like to recommend a tutorial or create one here @@samwitteveenai.
I can't wait until these multimodal local models can read charts and graphs reliably.
just what the doctor ordered.
Thank you for sharing, Sam! I tried the same thing here, but nothing happens; the process seems to be stuck and shows only:
Processing ./images\1.png
Any idea why?
Any tips on getting a more consistent response with only the necessary text I want extracted from an image? I’ve played around with the prompt quite a bit and even provided an output example.
I have a loop where I generate a response, then have another prompt ask whether the response is correct for the image. If not, it tries again. I like the big llava for the first writing and a smaller llava or moondream for the checking. It can take a couple of minutes for the multiple attempts, but that's ok.
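The write-then-check loop described here can be sketched model-agnostically; the writer/checker callables are where the big-llava and moondream calls would go (names and structure are my own, not from the video):

```python
def caption_with_check(image_bytes, writer, checker, max_attempts=3):
    """Draft a caption, then verify it with a second model.

    writer(image)  -> caption string (e.g. a big llava generate call)
    checker(image, caption) -> bool   (e.g. moondream asked yes/no)
    Returns the first accepted caption, or the last attempt.
    """
    caption = ""
    for _ in range(max_attempts):
        caption = writer(image_bytes)
        if checker(image_bytes, caption):
            break
    return caption
```

With ollama, writer might call client.generate(model='llava:34b-v1.6', ...) and checker could prompt a smaller model with something like "Does this caption match the image? Answer yes or no." and test the reply.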
Why pass the file to ollama as bytes and not as an image file? Is it faster that way? Also, do you know any hacks to get ollama to return precisely a specific number of words (or a range) every time?
Can you do a tutorial about AI agents for image and video?
Please do some more examples of identifying difficult screen shots.
Have you also thought about how boxing could improve this process?
@@christopherd.winnan8701 I haven't tried it with this model, but I tried it using the Moondream model with red bounding boxes and it was able to work out what was inside. I've been working on getting it to give me bounding-box coordinates for things.
@@samwitteveenai - Thank you for the great research you are doing. Always looking forward to more of your excellent vids!
Is there any way to indicate the base model? It is not on localhost in my case. Thanks!
Ollama normally supports the instruction-tuned models rather than the base models. You can do a custom install of any model, including base models, if they are converted to the right format. If you mean the model that gets loaded, yes, you can set that in the API.
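For reference, a custom install along those lines is just a Modelfile pointing at a converted GGUF file (the file and model names here are made up for illustration; the GGUF conversion has to be done first):

```
# Modelfile -- point Ollama at a locally converted base model
FROM ./my-base-model.gguf
```

Then `ollama create my-base-model -f Modelfile` registers it and `ollama run my-base-model` loads it.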
Ollama's llava and bakllava handle PNG. What are you gaining by converting to bytes?
For me, I was having issues getting it to work with PNGs; that is the reason I added it. I'll have a look again and see if maybe I just had something set wrong the first time.
llava:34b-v1.6 is running very slowly and not using the GPU, whereas llava:13b-v1.6 is working fine.
My system specs:
RAM: 32 GB
GPU: NVIDIA 3060 12 GB
Are you using GPU? Or all on CPU RAM?
He is using a Mac Mini, which has a unified memory architecture. So while the GPUs are used, they do not have their own dedicated memory.
@mshonle is totally right, no NVIDIA GPU is used, just the built-in Mac one.
Can you try showing a video of an application and asking Gemini to code an application with similar functionality and design? Something simple.
Ur welcome 😂
Microsoft stole your idea
Just an FYI for others needing to reference their local Ollama instance:

from ollama import Client

client = Client(host='192.168.0.25:11434')
response = client.generate(
    model='llava:34b-v1.6',
    prompt='describe this image and make sure to include anything notable about it (include text you see in the image):',
    images=[image_bytes],
)
Being a Windows user... I am still waiting...
Windows sucks. It really really sucks 😂
Get off that Microsoft telemetry machine while you still can. (All in jest I don't actually care which OS you use. )
Yes, please do a tutorial building even more functionality onto this example! 😀