Bioinformatics in Python: DNA Toolkit. Part 1: Validating and counting nucleotides.

  • date added 22. 08. 2024

Comments • 124

  • @JohnnyUtah13 4 years ago +37

    I spent two days figuring out how to count nucleotides by converting strings to lists and using an overly complicated list of if/else commands. You just showed me a far superior method in less than 10 minutes. I am already thankful and can't wait to watch the entire series.

    • @rebelScience 4 years ago +8

      Awesome! I know the feeling. We will try diving deeper into more complex but more interesting stuff soon. It is important to make sure you understand Python fundamentals to be able to use it effectively. Make sure you watch Corey's video series. You will be surprised when we make 2-3 Pythonic lines of code out of 10.

    • @rebelScience 4 years ago +4

      I just wanted to add that you can join our community chat and we will try helping you next time, so you save 2 days of figuring things out on your own.

  • @tiamat1628 1 year ago +4

    I am an MD and I want to become a bioinformatician. I have zero experience in programming, and I found your video very easy to understand and digest.
    Thank you very much, you earned a new sub.

  • @akanimohosutuk928 16 days ago +1

    Currently running all this code in a decentralised Cartesi VM for a side project. Thanks for these videos

    • @rebelScience 12 days ago

      Sounds amazing. I know the Cartesi Blockchain project.

    • @akanimohosutuk928 12 days ago +1

      @@rebelScience I will share with you when I am done next week

  • @matthewmarshall5730 11 months ago +1

    Thank you, I like the pace of the teaching and the relevant examples used for bioinformatics.

  • @rebelScience 4 years ago +5

    FYI: at 3:25 it is ASCII Table: t.ly/jyGG8. In ASCII table 'a' = 97 and 'A' = 65. So 97 != 65 and 'a' != 'A'

    • @broytingaravsol 3 years ago +1

      i'll go through all ur work on bioinformatics in python, i'm on both
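
The ASCII point in this thread ('a' = 97, 'A' = 65) can be checked directly with Python's built-in `ord`. This is only a small sketch; the `seq` value and the idea of upper-casing input before validating it are assumptions for illustration, not necessarily what the video does.

```python
# Each character has its own code point, so "a" and "A" are different values.
lower_code = ord("a")
upper_code = ord("A")
print(lower_code, upper_code)  # 97 65
print("a" == "A")              # False

# Hypothetical input: upper-casing first makes the check case-insensitive.
seq = "atgc"
normalized = seq.upper()
print(normalized)              # ATGC
```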

  • @gabrielevetrugno6089 4 years ago +4

    Amazing! Love to see and try new stuff about the topic 💪🏻

    • @rebelScience 4 years ago

      Thanks! We will cover some very exciting and interesting research in the future.

  • @carosfine 9 months ago

    jesus, i'm in love with this playlist. thank you so much

  • @boogywoogy2395 4 years ago +7

    great content...really helpful and well explained

  • @GGLazyJJ 4 years ago

    your recommendations are always the best part of your videos!

  • @amitrupani9898 4 years ago +1

    Thanks rebelCoder! Enjoyed learning from this lesson. Look forward to upcoming lessons!

    • @rebelScience 4 years ago +1

      Thank you for watching! I am glad you liked it. We are just getting started! We will cover some complex stuff after we cover all the basics.

    • @rebelScience 4 years ago +1

      And please feel free to comment and suggest things as I want to have an open and collaborative approach!

    • @amitrupani9898 4 years ago +1

      @@rebelScience Sure. As a bioinformatician, I often come across situations where I have to compare multiple files (sometimes hundreds of GB in size) based on genomic coordinates to create one or more new files.
      It would be nice to see something similar in one of the lessons. Also, methods of code optimization for quick comparisons of larger files would be great!
      :-)

    • @rebelScience 4 years ago +1

      @@amitrupani9898 Sounds interesting! I will be covering memory optimizations, speed optimizations and multithreaded approaches too. I plan to cover writing super fast routines in C++ or Rust and hooking into them from Python. It was hard to figure out where to start this series, and I decided to go with the basics first and build up. There is so much to cover...

    • @amitrupani9898 4 years ago +1

      @@rebelScience I think it's a great way to start (given basic programming skills are a prerequisite). Looking forward to a great learning experience! :-)

  • @dylanneal8244 3 years ago +1

    So cool. Thanks for this video!

  • @mattgraves3709 3 years ago

    Agreed Corey's videos are really good.
    What I know of Python I got much from him.

  • @nazaninrahimirad7344 4 years ago +2

    I couldn't read the code clearly. I think it would have been better to zoom in on your screen or use a high-contrast theme.

  • @apoorvwatsky 4 years ago +3

    Amazing content! Looking forward. :)
    I'd prefer not to cast the Counter object to a dictionary, and use it as is. Whatever operations you can perform on dictionaries, you can perform on Counters too. They are mutable, fast and already come with out-of-the-box features like most_common etc.

    • @NirielWinx 3 years ago +1

      Yeah, but each time you operate on them you risk changing the order (Counter.update disrespects the original order) or losing zeros (Counter.__add__ decides to remove keys when values reach zero). Furthermore, even though Counter dicts have implicit zeros for __getitem__, they break equality with dictionaries that have implicit zeros, so testing is a mess. The behavior of the Counter object is too chaotic for me: I want to rely on the promise of OrderedDict, I want equality to work, and I don't want the zeros to disappear for no reason. So I use {n:seq.count(n) for n in NUCLEOTIDES}.
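
The trade-off discussed in this thread can be seen side by side in a short sketch. The name `NUCLEOTIDES`, its order, and the sample `seq` are assumptions chosen for illustration.

```python
from collections import Counter

NUCLEOTIDES = ["A", "C", "G", "T"]  # assumed name and order
seq = "GGATTA"                      # hypothetical sequence with no "C"

# Counter keys follow first appearance and omit absent nucleotides.
counted = Counter(seq)
print(dict(counted))    # {'G': 2, 'A': 2, 'T': 2}
print(counted["C"])     # 0 (implicit zero, the key is not stored)

# The dict comprehension keeps a fixed key order and explicit zeros.
fixed = {n: seq.count(n) for n in NUCLEOTIDES}
print(fixed)            # {'A': 2, 'C': 0, 'G': 2, 'T': 2}

# Counter addition keeps only positive counts, so a stored zero disappears.
summed = Counter(A=1, C=0) + Counter()
print(dict(summed))     # {'A': 1}
```

Note that `seq.count(n)` scans the sequence once per nucleotide, while `Counter` scans it once in total, so the comprehension trades a little speed for predictable output.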

  • @felipepedro1678 2 years ago +1

    Great content!

  • @williamcowan4936 3 years ago +2

    at 1:12 should we have downloaded something other than python and our IDE? or should we make those files/projects exactly as we see on the video?

    • @rebelScience 3 years ago +1

      Hi. We don't download anything in our videos. We create everything from scratch. I have a video on setting up the code editor also.

  • @zeination 3 years ago +1

    I'm a Computer Science student and I'm so interested in the field of Bioinformatics.
    It's just that I'm lost as to where I should start in order to catch up with your videos.
    Thank you so much

    • @rebelScience 3 years ago +1

      Hey. Join our chat and check out my last article as it is about your question.

    • @zeination 3 years ago +1

      @@rebelScience thank you so much!

    • @cognosagedev 2 years ago

      @@rebelScience plz mention that article i also have the same case?

  • @tekomichael2667 3 years ago +1

    Thank you man, it's by far a great video from what I saw, though the text on the screen is very small and sometimes hard to read.

    • @rebelScience 3 years ago

      Thanks! Are you watching on a mobile device? I adjusted the font size and tested on small screens in the next videos, so it should be better.

    • @tekomichael2667 3 years ago

      @@rebelScience Thank you for responding so fast, you're a man of your word. Yes, you're right, I usually watch on a mobile device, because I don't have a PC where I mostly stay and work for the time being.

  • @daniocionini7043 3 years ago

    really great! Thank you for that

  • @kaansimsek7986 3 years ago +2

    hello i am a biomedical engineering student. I chose DNA analysis with python as my last thesis and can you help with software? I really need it. thank you.

  • @jaswanthchotu6068 8 months ago

    Please do more useful information about bioinformatics

  • @HanhNguyen-ue6oq 3 years ago

    Really helpful! Thanks a lot

  • @1973vgc 3 years ago

    Thank you for this!!!

  • @user-ey8cz9ow4i 10 months ago

    thank you very much

  • @nardineharrab 4 months ago

    Hello, i have this problem, it gave me this output {'A': 16, 'C': 12, 'T': 6, 'G': 16} while I entered the dictionary in this order {"A": 0, "C": 0, "G": 0, "T": 0} why it switches the T and the G order in the output ?? please help me cuz I'm stuck here
    thanks
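
One possible cause of the order change, assuming the counts were produced with `collections.Counter` (or any dictionary built in order of first appearance) rather than by updating the pre-filled dictionary: keys then follow the order in which nucleotides first occur in the sequence. The sequence below is hypothetical; the sketch shows the effect and one way to restore a fixed A, C, G, T order.

```python
from collections import Counter

seq = "ACTTGA"  # hypothetical sequence in which T appears before G

# Keys are stored in order of first appearance: A, C, T, G.
counts = dict(Counter(seq))
print(counts)   # {'A': 2, 'C': 1, 'T': 2, 'G': 1}

# Rebuild the dictionary in a fixed order; .get supplies 0 for absent keys.
ordered = {nuc: counts.get(nuc, 0) for nuc in "ACGT"}
print(ordered)  # {'A': 2, 'C': 1, 'G': 1, 'T': 2}
```

The counts are the same in both dictionaries; only the display order differs.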

  • @is44ct37 4 months ago

    I get the error: no module named DNAToolkit - I tried installing the DNA toolkit from PIP, and I thought it would work, but still giving me the same error. I copied the code, from what I could tell, exactly. Any thoughts?

  • @gowrang456 4 years ago +1

    Great content now I am able to understand how to apply python in bioinformatics. For the random joining of the nucleotide sequence does the nucleotide arrangement happen in a defined way or there is no pattern for the generation of nucleotide?

    • @rebelScience 4 years ago +1

      Hi! Well, randomness is exactly what it is: random generation of characters. If we had what you call "a pattern" or "a defined way", that would not be randomness, right? We use that just for tests.

  • @diegoavendanohernandez9908

    awesome content, great channel

  • @ivanviveros 3 years ago +1

    Your videos are so good man! Thank you. Btw, which vs code theme is that? I love the color scheme!

    • @rebelScience 3 years ago

      Hey! Thank you. I have configured my theme a long time ago and interestingly enough, it was changing on its own by becoming darker. I think extensions I was using for my theme kept getting updated and that is why it changed for me with the time. I will try figuring out my config and share it with you as a few other people were interested in this.

    • @wilku1039 3 years ago

      @@rebelScience hey, any updates on that? the theme looks really good, and i couldn't find any information about it from you

  • @irinalaivina8664 4 years ago +1

    Very interesting! I like it!

  • @daltonham2821 3 years ago +1

    What if you want to include N's which represents any of the four nucleotides?

    • @rebelScience 3 years ago +2

      In our case, we are working with standard Nucleotides for now as most of the raw data will be in this format. Adding a lot of other variants and logic would make our first lesson overcomplicated. This is a beginner level set of tutorials. We will be adding a lot of cool and complex stuff in our next series "Genome Toolkit", which will use "DNA Toolkit"! Stay tuned.

  • @cristianperalta5022 4 years ago +6

    Hi! First of all, thank you very much for these videos, they are absolutely fascinating.
    I have a problem: I was following your instructions and then, suddenly, the code wouldn't work. The problem is related to the module import: from DNAToolkit import *. The error says the following: "ModuleNotFoundError: No module named 'DNAToolkit'".
    It's absolutely hilarious, because a few minutes before the code was working. I'm using Sublime Text, please help me.
    Thanks

    • @rebelScience 4 years ago +1

      Hey! Thanks! I enjoy sharing this information very much.
      About the error: it looks like Code Editor (Sublime Text) problem or file naming problem. Hard to tell what it is without looking at logs. I would suggest joining our chat on Telegram or Matrix (links are in video description) so you can share screenshots and output information.
      For now, make sure all of your files are named correctly (DNAToolkit.py or dnatoolkit.py)
      Try creating a new folder for the project, copy all files, make sure names are correct and try running the code again.
      Where it says "ModuleNotFoundError", does it say something about a temp file?

    • @cristianperalta5022 4 years ago

      @@rebelScience I've even deleted all the files and created it again, though, I realized that a file named "__pycache__" was created. It said nothing about "temp file".
      I might try VS Code as a Code Editor.

    • @not_him...1 2 years ago

      @@rebelScience thanks Sir, I have the same problem. I just couldn’t get to import the files from DNAtoolkit. I really don’t know what to do. I really enjoy your explanations and I’m sure I understand them, but I have a problem importing the tools to work on bioinformatics.

    • @not_him...1 2 years ago

      @@rebelScience I’ll be really glad if you can reply as soon as you’re able, I am eager to learn more but I can’t if I cannot practice myself.

    • @not_him...1 2 years ago

      @@rebelScience yeah, I just joined the platform on Telegram. I can't drop questions there either. So please, your help's needed 🙏
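
For the "No module named 'DNAToolkit'" reports in this thread, one thing worth checking is the file-naming advice given above: the module written in the series is an ordinary file next to the main script, and Python matches the name in the import statement case-sensitively. The helper below is a hypothetical diagnostic (the function name `check_module` and the default file name `DNAToolkit.py` are assumptions), demonstrated on a throwaway folder.

```python
import tempfile
from pathlib import Path


def check_module(folder, name="DNAToolkit"):
    """Report whether <name>.py exists in folder, plus any .py files
    whose names differ from <name> only in letter case."""
    folder = Path(folder)
    exact = (folder / f"{name}.py").is_file()
    near_misses = sorted(
        p.name
        for p in folder.glob("*.py")
        if p.stem.lower() == name.lower() and p.stem != name
    )
    return exact, near_misses


# Demo: a temporary folder holding a file whose name differs only in case.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "dnatoolkit.py").write_text("Nucleotides = ['A', 'C', 'G', 'T']\n")
    exact, near_misses = check_module(tmp)
    print(near_misses)  # ['dnatoolkit.py']
```

In a real project, calling `check_module(".")` from the folder that holds the main script shows whether the file is there under the exact name used in `from DNAToolkit import *`. It does not cover editor-specific causes, such as a code runner starting the script from a different working directory.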

  • @what_the_really 10 months ago

    I'm studying with your videos, but when I print the result, it shows a different result each time I run it. If the random sequence starts with G, the dictionary's result also starts with G. Is that OK? If I use join with a list comprehension, how can I know which value is A, G, C or T? If I want the dictionary to start from A, C, G and T, how should I write the code?

    • @what_the_really 10 months ago

      also... if I run "print(' '.join([str(val) for key, val in result.items()]))", the printed values are separated by blanks. Should I use '' (without a blank) instead of ' ' (with a blank)? But if I use '', the printed result shows 20121721, not 20 12 17 21... I couldn't find out what is different from yours.
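
A small sketch of the two points raised here, using a hypothetical `result` dictionary whose keys are deliberately not in A, C, G, T order. The string in front of `.join` is the separator placed between items, and looking the values up in a fixed order makes the output independent of how the dictionary was built.

```python
# Hypothetical counts, stored in an arbitrary key order.
result = {"G": 17, "A": 20, "T": 21, "C": 12}

# ' ' puts one space between the values; '' glues them together.
spaced = " ".join(str(result[nuc]) for nuc in "ACGT")
glued = "".join(str(result[nuc]) for nuc in "ACGT")

print(spaced)  # 20 12 17 21
print(glued)   # 20121721
```

Because the loop walks over "ACGT" rather than over `result.items()`, the first number is always the count for A, the second for C, and so on.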

  • @TTy5361 3 years ago +1

    Super dumb question but what IDE are you using to run these python scripts?

    • @rebelScience 3 years ago

      I have a video on my channel, titled Development Tools. It has all the answers ;)

  • @Jonix-redhat 4 years ago +1

    Thank you for the great video! I know this is a newbie question because I just started to learn bioinformatics with python (I'm a biomedicine master student), but anyway: why do you use "[" and "]" in join([random.choice(Nucleotides)... and not just "(" and ")"?

    • @rebelScience 4 years ago

      [random.choice("ACGT") for x in range(10)] is a list comprehension.
      We then pass it to the join method, and every method/function call uses (), hence join().
      Try this: test = [random.choice("ACGT") for x in range(10)]
      and this: test = random.choice("ACGT") for x in range(10)
      The second line is a syntax error on its own. Inside a call, though, Python accepts the same expression without brackets as a generator expression, so this works: seq = ''.join(random.choice("ACGT") for x in range(10))
      I use [ ] in the videos to keep the code readable, since [ ] clearly marks a list comprehension.

    • @Jonix-redhat 4 years ago

      @@rebelScience Ok! thanks a lot for the answer, I understand! looking forward to more good videos with bioinformatics! take care!
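
The difference between the bracketed and unbracketed forms discussed in this thread can be checked in a few lines. This is a minimal sketch; the fixed seed is an assumption added only so that repeated runs behave the same.

```python
import random

random.seed(0)  # assumed fixed seed, only to make runs repeatable

# Square brackets build a complete list in memory first.
as_list = [random.choice("ACGT") for _ in range(10)]
print(type(as_list).__name__)  # list

# Parentheses create a generator that yields items one at a time.
as_gen = (random.choice("ACGT") for _ in range(10))
print(type(as_gen).__name__)   # generator

# As the sole argument of a call, the extra parentheses can be dropped.
seq = "".join(random.choice("ACGT") for _ in range(10))
print(len(seq))                # 10
```

Both forms passed to `join` produce a string of the same length and alphabet; which one to use is mostly a matter of readability.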

  • @et504383 3 years ago

    I made the python code on Jupyter Notebook, but it can not work well.

  • @TragoudistrosMPH 3 years ago +1

    Hi, I noticed your DNAtoolkit file is not on your gitlab folder DNA Toolset. I clicked the history and found the file.
    I noticed that when I tried to import the file.
    Hopefully that helps (and I'm not foolishly misunderstanding anything haha)

    • @rebelScience 3 years ago +1

      Hey. We are not importing anything. DNA Toolkit is not a Python module. DNA Toolkit is a tool we write from scratch in Python.

    • @TragoudistrosMPH 3 years ago

      @@rebelScience I see, DNA Toolkit is in the history, and you were importing it while writing it. I had never imported a file I was working on
      (from DNA Toolkit import *)
      I happened to be using jupyter notebook, so I overlooked the idea :P (I'm a biostatistician, so coding is a secondary skill lol)

    • @rebelScience 3 years ago +1

      If you follow every video from 1st to last you should have a good idea of what we are doing. I am not sure how to import additional files onto Notebooks. Try searching for it on the internet.

    • @TragoudistrosMPH 3 years ago

      @@rebelScience no worries! I was reporting back that I figured it out and that you were correct :)

  • @kalyanirajalingham1286

    Amazing! I love your videos!

  • @bhatwasim6741 3 years ago

    Why these codes r not running for me...em copying exactly

  • @MirjamGrebenc 4 years ago +1

    What is the software you use? I have been using jupyter but prefer the layout you have

    • @rebelScience 4 years ago +2

      Hey. I have a video on that. Search for Development Tools in my videos.

    • @MirjamGrebenc 4 years ago

      @@rebelScience thank you so much

  • @maheshrani6609 5 months ago

    please share the gitlab link.

  • @mariasira5808 3 years ago

    Can sb who has studied bioinformatics work in the laboratory , or is it more of a computer/coding job ?

  • @lizixiao9316 3 years ago +1

    How is your cursor line highlighted?

    • @rebelScience 3 years ago

      It is just an extension for the code editor, called line highlighter. Try search for it in the extensions library

  • @MehranKhan-he1lh 1 year ago

    Link for structuring the project/access to the project (the step at 1:10 in your video)? Please

    • @rebelScience 1 year ago

      "Link for structuring the project/access to the project"
      1. We are creating this project, so if you follow the video series, you will see how we are structuring it.
      2. If you are looking for a git repository for it, it is in the video description.

    • @MehranKhan-he1lh 1 year ago

      @@rebelScience Thanks

  • @EuphoriaSkater3 4 years ago +1

    def validate_seq(seq):
        for nuc in seq:
            if nuc not in nucleotides:
                print("Invalid sequence. Only A, C, T, and G are accepted characters.\n")
                randomvschoice()
            return seq
    Hi rebelCoder,
    I have this coded for a more user friendly version. I get a strange result though.
    When I input a sequence with a mixture of correct and incorrect nucleotides it ignores this statement. But when I input only incorrect characters it works fine. I am not sure why, please help.

    • @Rossboe1 4 years ago

      I have the same problem

    • @juanmaruizrobles5867 4 years ago +3

      @@Rossboe1 just check the indentation of the last line... I suggest it should be at the level of the "for", not in the "if"
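
The indentation fix suggested in this thread can be sketched as follows. It assumes the symptom comes from `return seq` sitting inside the loop body, so the function returns after checking only the first character. The `NUCLEOTIDES` list and the choice to return `None` for invalid input are assumptions for illustration; the original comment's `randomvschoice()` call is left out because its behaviour is not shown.

```python
NUCLEOTIDES = ["A", "C", "G", "T"]  # assumed list of valid characters


def validate_seq(seq):
    """Return seq if every character is a valid nucleotide, else None."""
    for nuc in seq:
        if nuc not in NUCLEOTIDES:
            print("Invalid sequence. Only A, C, T, and G are accepted characters.")
            return None  # stop at the first invalid character
    # Same level as the "for": reached only after every character passed.
    return seq


print(validate_seq("ACGT"))  # ACGT
print(validate_seq("ACXT"))  # prints the warning, then None
```

With the `return` moved out of the loop, a sequence that starts with valid characters but contains an invalid one later is now rejected as well.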

  • @varisingermany 4 years ago

    i cant run the file why tho ? I wrote all the things like you but cant run it

    • @rebelScience 4 years ago +1

      Well, you would need to do two things: add more details of what OS/Editor/code runner/Plugins you use and how you are trying to run things, or join our chat in Telegram/Matrix and share some screenshots with above information.
      Have you set-up your code editor like we did here: rebelscience.club/2020/04/lets-set-up-a-code-editor-for-python-and-bioinformatics/

  • @aysha.h5608 3 years ago +1

    did you use pycharm?

    • @rebelScience 3 years ago +1

      Yes, I have used PyCharm. It is a good code editor for Python. I use VSCode as I think it is the best one. Also, if you are writing in more than one programming language, which I do, VSCode is perfect. It supports any language, while PyCharm editor is strictly Python. The important point is, if you are familiar with the tool (PyCharm in your case?), and it does everything you need, just keep using it. Also, I have an article and a video about that here: rebelscience.club/2020/04/lets-set-up-a-code-editor-for-python-and-bioinformatics/

  • @divz2646 1 year ago

    How do you create such functions, and have you downloaded the module?

    • @rebelScience 1 year ago

      No, we do not use any modules. We create the DNA Toolkit module from scratch in this series of videos.

  • @mohammedahmedjalloh531

    You didn’t give any instructions on how to install the toolkit which is difficult for some of us to even start

    • @rebelScience 3 years ago +1

      Hello. We are not installing any toolkit. We are developing it from scratch in plain Python in this series of videos. Did you watch my introduction video?

    • @mohammedahmedjalloh531 3 years ago

      @@rebelScience Hello, thanks for the instant reply. I am a beginner in this course, both python and Bioinformatics... I just want to ask if I can use Pycharm as my code editor.

    • @rebelScience 3 years ago +1

      Yes. You can use whatever you want. Whatever is easier for you. I talk about that in the Introduction video.

  • @marzijahan5502 3 years ago

    Which environment are you writing in? It is not Python. Also, I have a problem with the space character. When I use spaces in Python there are some errors!

    • @rebelScience 3 years ago

      Sorry, what do you mean by it is not Python ? I have Introduction video, and the other one is called Development tools. You should watch those two videos to understand what environment I use and how to set it up.

    • @marzijahan5502 3 years ago

      @@rebelScience Ok. TnQ

    • @marzijahan5502 3 years ago

      @@rebelScience I could not find the Development tools among yr videos! :(

    • @rebelScience 3 years ago

      I only have 27 videos ;) czcams.com/video/81Eb_YXmV4g/video.html

  • @linuxuser1234 3 years ago +1

    What python software did you use

    • @rebelScience 3 years ago

      Hey! Sorry, I am not sure what you mean. Are you asking about what code editor I use to write Python code in?

    • @linuxuser1234 3 years ago +1

      @@rebelScience yes

    • @rebelScience 3 years ago

      I have a video about the code editor here: czcams.com/video/81Eb_YXmV4g/video.html

    • @linuxuser1234 3 years ago +1

      @@rebelScience thanks

  • @sujanmahmud1038 1 year ago +1

    what ide is this?

    • @rebelScience 1 year ago

      Hey! You should start with the Introduction video, where I explain what you need to work with this series of video, including the code editor. I also have a video of how to set it up.

  • @rajarshimondal 3 years ago +1

    Sir I just joined the telegram channel mentioned in the chat box. I'm a Biotechnology student and I want to learn bioinformatics. So I joined the telegram channel. I think without any reason I'm banned.

    • @rajarshimondal 3 years ago

      Please invite me back bro.🥺

    • @rebelScience 3 years ago

      Hey. When you join, Bot asks you a question you need to type an answer for in the chat. If you don't, Bot kicks you. This is a Spam protection. Try joining again and see what Bot asks you.

    • @rajarshimondal 3 years ago

      @@rebelScience ok thanks

    • @rajarshimondal 3 years ago

      It's saying chat is no longer accessible. Please unban me. I'm BABU GUDDU.

  • @md.mahfuzurrahmanbhuyan9351

    What software are u using

    • @rebelScience 3 years ago

      Hey, sorry what do you mean? Are you asking about the Code Editor I use? It was mentioned in the introduction video and I also have a Development Tools video where I show how to set it up.

  • @alexandergapak4883 4 years ago +2

    Good! Are there any Russians here?

  • @marzijahan5502 3 years ago

    I am a total beginner in Python. Should I memorize these functions?

    • @rebelScience 3 years ago

      Please watch the Introduction video. I explain everything in that video. Yes, you should be good with Python before watching these videos. Make sure you learn Python first. My Introduction video has links and suggestions.

  • @borispyakillya4777 3 years ago +1

    Thank you a lot for the video!