Duration 1:42:54

Python for Bioinformatics - Drug Discovery Using Machine Learning and Data Analysis

479 952 watched
0
15.3 K
Published 2 Jun 2021

Learn how to use Python and machine learning to build a bioinformatics project for drug discovery.

✏️ Course developed by Chanin Nantasenamat (aka Data Professor). Check out his YouTube channels for more bioinformatics and data science tutorials: /dataprofessor/ and /codingprofessor/
🔗 And Medium blog posts for more data science tutorials: https://data-professor.medium.com/

⭐️ Code ⭐️
💻 Parts 1-5: https://github.com/dataprofessor/bioinformatics_freecodecamp/
💻 Part 6: https://github.com/dataprofessor/bioactivity-prediction-app

⭐️ Course Contents ⭐️
⌨️ (0:00) Introduction
⌨️ (4:29) Part 1 - Data collection
⌨️ (26:57) Part 2 - Exploratory data analysis
⌨️ (49:41) Part 3 - Descriptor calculation
⌨️ (1:01:51) Part 4 - Model building
⌨️ (1:10:41) Part 5 - Model comparison
⌨️ (1:18:15) Part 6 - Model deployment

🎉 Thanks to our Champion and Sponsor supporters:
👾 Wong Voon jinq
👾 hexploitation
👾 Katia Moran
👾 BlckPhantom
👾 Nick Raker
👾 Otis Morgan
👾 DeezMaster
👾 Treehouse

--
Learn to code for free and get a developer job: https://www.freecodecamp.org
Read hundreds of articles on programming: https://freecodecamp.org/news


Comments - 322
  • @
    @Alecor_studio3 years ago Never in my life would I think I would get a video for this. There's not many resources on bioinformatics on YouTube. This is godsend! Thank you very much. 388
  • @
    @louzewdie91042 years ago As someone with a bioinformatics minor who always wants to learn more, i cannot express how ecstatic and grateful i am to find your content. Thank you good sir! 12
  • @
    @helloworld20543 years ago I want to study bioinformatics and I'm currently learning Python. So grateful for this course! 138
  • @
    @clodsire_3 years ago Wish this vid came out sooner. I'm about to graduate as a bioinformatics major and I remember struggling so much bc of the lack of online resources and m ... 43
  • @
    @donutsandpenguins28393 years ago Can't believe how lucky I am! I was just considering my options among the hundreds of computer science branches yesterday, and bioinformatics and computational ... 38
  • @
    @emmanuelonah45962 years ago Thank you professor. It's so heartwarming to have this quality of information shared for free. The best I have seen on ... all my questions answered and opened my eye to the many facets of computational drug modeling and simulation. Thank you once again, prof. 11
  • @
    @omaralam79753 years ago Im starting a job at astrazeneca next month. Thank you sir for the course, i will use the knowledge to the best of my ability. 96
  • @
    @miltonkambarami81293 years ago I am from the virology side of bioinformatics and it's wonderful to find a tutorial on a learning platform. I embrace more of the computational side to answer biological questions and the two are starting to merge as one. Sooner or later programming is going to be a compulsory skill just like writing, but what will matter is the types of questions and problems you can solve using programming. 39
  • @
    @shakuntalabaichoo79223 years ago This is the best bioinformatics video/tutorial on youtube, i have ever come across. Thank you so much for such a great. 3
  • @
    @jaggyjut3 years ago This is pure gold. Thank you for creating this tutorial. 13
  • @
    @stringpriest8 months ago Thanks so much for this video. I obtained an MSc in pharmacognosy and am taking another masters in regulatory science. Trying to go into tech, I started a ...
  • @
    @bassamtork76793 years ago Huge work is put in this video, wonderful method of delivering the course. Many thanks to Dr. Chanin Nantasenamat. 3
  • @
    @apoorvasetu58403 years ago Hello Chanin,
    I am using EC50 standard_type for ... when labeling compounds; is it the same as IC50 or is it different?
    1
  • @
    @khanmubeen3 years ago Respected professor! I'm quite amazed at how you stepped out to share and create such a knowledgeable course for all of us. I love that. 2
  • @
    @MrViperHiggins3 years ago Love it man keep up the great work. Really appreciate people like yourself making data science accessible to those of us coming through the process of learning and working with data. 6
  • @
    @leonardomontes29753 years ago You are always on top guys! I really appreciate the way you share the knowledge! Thanks Quincy! 8
  • @
    @benjamintwumasi24806 months ago Thanks so much for this video, professor. You have saved a million lives of people who want to learn python for drug discovery. Thank you once again.
  • @
    @jannmikoingelrabagogamingc60122 years ago Oh hey! I am so grateful and excited for this one man! I am an upcoming 3rd year bs pharmacy student and planning to do my undergraduate thesis on this area specifically. 2
  • @
    @palakgupta87283 years ago It's a great video, really helpful. At ... you are not able to see the other standard_type values because you've already set the filter to IC50, so it shows only IC50 entries. Once you remove the standard_type="IC50" filter, you'll see an array with Inhibition, Ki, EC50, Kd, Activity ... 3
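The effect described in the comment above can be illustrated without the ChEMBL client. This is a minimal sketch, assuming activity records are plain dictionaries with a `standard_type` key; the sample records, the IDs, and the helper name `unique_standard_types` are hypothetical and are not data or code from the course.

```python
# Hypothetical activity records; real ChEMBL results carry many more fields.
records = [
    {"molecule_id": "MOL_A", "standard_type": "IC50", "standard_value": "7200"},
    {"molecule_id": "MOL_B", "standard_type": "Ki", "standard_value": "150"},
    {"molecule_id": "MOL_C", "standard_type": "IC50", "standard_value": "9400"},
    {"molecule_id": "MOL_D", "standard_type": "EC50", "standard_value": "310"},
    {"molecule_id": "MOL_E", "standard_type": "Inhibition", "standard_value": "45"},
]


def unique_standard_types(rows):
    """Return the distinct standard_type values, sorted for stable output."""
    return sorted({row["standard_type"] for row in rows})


# Before filtering, every measurement type is visible.
all_types = unique_standard_types(records)

# After keeping only IC50 rows, IC50 is the only type left to inspect.
ic50_only = [row for row in records if row["standard_type"] == "IC50"]
filtered_types = unique_standard_types(ic50_only)

print(all_types)
print(filtered_types)
```

Inspecting the distinct values before applying the filter is a quick way to confirm which measurement types a target actually has.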
  • @
    @LM-ch8rh3 years ago Hi. I'm trying to follow the course but what order do I watch them? I started with this video but he then refers to a previous video where he showed us how to download data. Is there a place where I can get the recommended order of videos to watch? Thank you so much.
  • @
    @Teflon20003 years ago I've studied industrial chemistry in college. This video is perfect. 7
  • @
    @0307ismail3 years ago Hi data professor, i am doing phd in bioinformatics and my research topic is the same as of this video. Thnx alot for this video. I think i will need more of your support and guideline for my research. 4
  • @
    @azadjain85343 years ago I was searching for this content for long time. Finally this arrived. Thank you so much for invaluable content. 7
  • @
    @gurudeebanselvaraj88883 years ago Excellent session. Hats off to you @data professor. I really enjoyed the whole lecture. It is quite informative for budding researchers in. 3
  • @
    @knockknockyoo58123 years ago I truly appreciate it for sharing this video for free. 2
  • @
    @brunobustos963 years ago Computational neuroscience next please! Love your channel. 12
  • @
    @joyrainbowdress3 years ago I'm a dentistry student but I love computers as much as I love biology. I also would love to explore the world of clinical research and I think I'd take up bioinformatics in my masters along with my clinical practice; it seems like the best intersection between healthcare and IT. 17
  • @
    @sohithreddy4 months ago descriptor_output.csv file used in is same for anything we take right? like i mean padel descriptor.
  • @
    @christianscientist39633 years ago Since you used the chembl database to look at targets and small molecules in your course, is there a library code to extract zinc and pubchem drug database to use their small molecule? 3
  • @
    @prajyotprabhu8273 years ago Excellent! Much appreciated efforts you folks are putting in. Thank you very much. 2
  • @
    @georgevoknerech2283 years ago Thank you so much. I've never dreamed that we would've get a lesson on. 1
  • @
    @andrewchen23493 years ago So good to see data professor on fcc again! Thank you! 2
  • @
    @OliverShey3 years ago Hi sir, thanks so much for the wonderful presentation. I have never watched such a clear video on the subject. Keep it up. I have another concern. I changed ... 4
  • @
    @deeptibhanot7622 years ago Hello professor, I want to work on parameter tuning of 3D bioprinters before they print, so it will take as input the tissue type or bone type that we want. Please help me with where I can get a dataset for 3D bioprinted organs as well as a dataset of 3D bioprinters. 1
  • @
    @BerkshireHathawayCRE3 years ago I hope you guys can make more bioinformatics videos! Such an important application! 4
  • @
    @e.m.26553 years ago Been hoping you would make this video! I have a background in biology and a very small bit of experience in informatics. 9
  • @
    @curiousresearcher308last year Hello! I have a question. The bioactivity class is not showing for me. This is the error I am receiving:
    "'bioactivity_class' not in index"
  • @
    @user-fe2oh8oj2u3 years ago Looking forward for "Python for Finance" 44
  • @
    @kamalikabhattacharjee3363 years ago In part 5, for "compare ML ..." the code is showing me "cannot import name ... from ..."; any idea how to resolve that? 1
  • @
    @iam1.last year Would it be recommended to split the test and train data differently? I would assume that if we had less to train and more to test on then the results would be more accurate, but if I'm wrong please somebody explain.
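On the split question above: in general a smaller training set gives the model less to learn from, while a larger test set mainly makes the performance estimate less noisy rather than making the model itself more accurate. The sketch below shows how a split ratio works, using only the standard library. The helper `split_train_test` and the 80/20 default are assumptions for illustration, not the course's code.

```python
import random


def split_train_test(rows, test_fraction=0.2, seed=42):
    """Shuffle a copy of rows and split it into (train, test) lists."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be between 0 and 1")
    shuffled = list(rows)
    # A seeded generator makes the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


samples = list(range(100))  # stand-in for 100 compounds
train, test = split_train_test(samples, test_fraction=0.2)
print(len(train), len(test))
```

Changing `test_fraction` trades training data for a steadier evaluation; trying a few ratios and comparing the spread of scores is one way to see the effect on a given dataset.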
  • @
    @ccuny13 years ago Fantastic. Incredible material on this channel and this is no exception. Thank you. 3
  • @
    @mahmoudal-bassam25077 months ago This is very helpful! Thank you so much. I tried something like this to compare distribution of normal and -log10 data (importas plt
    fig, ax =nrows=2)
    ax=ax[0]
    x: 50, ax = ax[1]
    ...
  • @
    @lukecahalane18113 years ago When i try and display the dataframe in the data collection section in pycharm i get a mixture of parentheses and not all the data that is actually present on chembl, any help would be appreciated.
  • @
    @vpundir30243 years ago Its very useful for doctors who are learning coding. 7
  • @
    @jondoe86583 years ago Thank you and to all the people who share their knowledge for free. 2
  • @
    @LujoSey5 months ago Hi prof, I was following your session closely but I got lost on the mounting of Colab after creating the 3 dataframes. Would you please explain the Colab Jupyter interface for a layman like me? Thanks.
  • @
    @kevindelgado29822 years ago Glad this information exists on YouTube. In my thesis I'll be using the MegaMolBART model from NVIDIA:
  • @
    @user-yb6hm1jz6s9 months ago Thank you very much data professor for sharing the great learning resource!
  • @
    @ayeshaafzal4884last year Loved the whole video. Recommendation for part 4 in colab, to prevent error. In the last scatterplot building chunk use: ax =y=y_pred, . 1
  • @
    @WelingtonSilvaMusica3 years ago That's exactly what I was searching for, what rich material! Thank u so much!
  • @
    @pacifio3 years ago Man be expert in biology and computer science while me just being dumb 24/7 epic style. 23
  • @
    @crismo77533 years ago Thank you very much for this great tutorial! Sometimes I am asking myself how much do you wish to have a better equipment, but I don't dare to ask you that. I wish you great success and good health!
  • @
    @paulynamagana75603 years ago I'm doing my PhD in pharmacy and I wish I had seen your videos before. I have discovered my love for ... too late to change my PhD tho. 6
  • @
    @aashishkatyal3 years ago Dear professor,
    It is always awesome to watch and learn from your sessions. Can you make a video about docking using Python?
    2
  • @
    @dojaibi2753 years ago Am currently study mscthank you for this package. 10
  • @
    @shenglinjing73502 years ago In part 5, running ... doesn't give any output, and if I do ... it returns (0, 4). Do these 2 lines in the original
    =x_train, y_train, y_train)
    =x_test, y_train, y_test)
    ...
  • @
    @chennakeshvapodila53672 years ago Hey, are we taking a drug from the ChEMBL database and predicting the best drug, or are we taking a disease-causing protein and finding a cure for it? Please explain.
  • @
    @GCKteamKrispy2 years ago Just found a book about this and got excited about this field. It sounds interesting.
  • @
    @MoonSahab2 years ago I am trying to save the molecule.smi file as txt to upload in the prediction app but it doesn't work out. Can someone please guide me to get the bioactivity? I have implemented the complete code but it results out as csv and conversion doesn't work for me.
    Thanks
  • @
    @hoyingan65772 years ago Dear Data Professor, how could I use other bioinformatics database webpages, like RCSB PDB or KEGG data, and adapt them to your Colab?
  • @
    @exons-codingforthebest82952 years ago Dear sir, model building ended up in an error: Input contains NaN, infinity or a value too large for ... =y_train)
    r2 =y_test)
    r2) Can you please help in this regard?
    1
  • @
    @lllll72602 years ago Does the coronavirus target protein in the beginning mean protein that binds to coronavirus? I am confused with that.
  • @
    @VyshnavieRSarma-rb7ur3 years ago Wow thank you professor. Really enjoy learning a lot from your videos. 2
  • @
    @sinakoohbour3 years ago I am not able to install the libraries. It shows "In [0]" next to every item and I am not able to get past that part. Any ideas would be appreciated. Thanks.
  • @
    @lo88852 years ago Beginning part 4, big clap for all of us
    hope to find the thesis subject.
  • @
    @moca3513 years ago What modules do i need to get the basics for this course?
  • @
    @user-uj8up5jd6j3 years ago Somehow when I follow along, from ... I get "NaN" for items 128-132 in the bioactivity class column instead of "inactive". This throws off my results for the rest of the procedures. Does anyone know how I can fix this?
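For the missing class labels reported above, one possible cause (an assumption, since the commenter's data is not shown) is that some rows have a missing or non-numeric standard value, or that labels built in a separate list no longer line up with the table's rows after filtering. The sketch below labels each value directly and returns `None` for anything unusable, so gaps are explicit. The function name and the cutoffs of 1,000 nM and 10,000 nM are illustrative assumptions.

```python
import math


def label_bioactivity(ic50_nm, active_max=1000.0, inactive_min=10000.0):
    """Label one IC50 value (nM) as active, inactive, or intermediate.

    Returns None when the value is missing or cannot be read as a number.
    """
    try:
        value = float(ic50_nm)
    except (TypeError, ValueError):
        return None
    if math.isnan(value):
        return None
    if value >= inactive_min:
        return "inactive"
    if value <= active_max:
        return "active"
    return "intermediate"


# Mixed inputs: strings, numbers, and two kinds of missing value.
labels = [label_bioactivity(v) for v in ["350", 5000, "25000", None, float("nan")]]
print(labels)
```

Computing the label from each row's own value, rather than joining a separately indexed list back onto the table, avoids one class of alignment problems.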
  • @
    @scign2 years ago In the deployed app, you should highlight any significant differences in the removed columns since those parts of the fingerprint could be important for ... Also, what was the purpose of calculating the Lipinski descriptors if only the PubChem fingerprint was going to be used in the model? 1
  • @
    @ranggawrnt3 years ago Thank you so much for your wonderful video presentation, this is really helpful!
  • @
    @DavidRamirezdmramirezs3 years ago Wow great video tutorial. I will try to implement it in my lab.
    thanks!
    1
  • @
    @ayeshaafzal48849 months ago Can this whole process be regarded as a qsar model? 1
  • @
    @OfficialBunnE2 years ago What microphone and camera do you use?
  • @
    @DrNoureddinSadawi3 years ago This is a fantastic tutorial, many thanks!
  • @
    @militant_dilettante6 months ago Thank you for such wonderful lesson! So much useful information, and so tightly packed!
    I have questions, though. In the beginning of both notebooks is this step specifically for calculation of the Lipinski's descriptors? Because later we concatenate the descriptors dataframe with the original dataframe.
    Also, I am doing that lesson more than 2 years after the video was posted, and in the data ChEMBL fed me at least one value of 0. What do you recommend replacing it with: 1 or NaN?
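On the zero-value question above: the pIC50 transform is the negative base-10 logarithm of the molar concentration, and the logarithm of 0 is undefined. Substituting 1 nM would assert a pIC50 of 9, which is a very potent reading that was never measured. One option, sketched below under that reasoning, is to treat non-positive values as missing. The function name `pic50_from_nm` is hypothetical and this is not the course's implementation.

```python
import math


def pic50_from_nm(ic50_nm):
    """Convert an IC50 in nM to pIC50 = -log10(IC50 in mol/L).

    Missing or non-positive inputs return NaN because log10 is undefined there.
    """
    if ic50_nm is None:
        return math.nan
    value = float(ic50_nm)
    if math.isnan(value) or value <= 0:
        return math.nan
    # 1 nM = 1e-9 mol/L, so -log10(value * 1e-9) equals 9 - log10(value).
    return 9.0 - math.log10(value)


values = [1000.0, 10.0, 0.0]
converted = [pic50_from_nm(v) for v in values]
print(converted)
```

Rows that come back as NaN can then be dropped or reviewed by hand before model building.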
  • @
    @molecule_mindslast year How to use our created models? Like i created a model forhow to use it like this app?
  • @
    @noshintasnia43026 months ago Hello Data Professor, how can I use your bioactivity app without creating your code? I mean, if I download your code file then how could I run it, and from which software could I run it?
  • @
    @dumbkiddo31892 years ago This channel should receive an academic award in pedagogy. What you guys are doing is amazing!
  • @
    @shivanipawar22963 years ago Thank you so much sir for this useful information. It' s very difficult to get any information about bioinformatics on youtube. I' m currently studying in bsc(looking forward for more information. 2
  • @
    @tejiyo3 years ago In part 3, why did you convert the df to a .smi file and use the command line to apply the PaDEL function? Can you explain the line "! bash ..." please? 1
  • @
    @DooDooDaddyTV3 years ago I have my bachelors in biology and have worked in a microbiology lab for the last 4 years. I'm currently learning Python and R through sites like Codecademy. Do ...
  • @
    @miracleuche5862 years ago Thank you so much for this amazing tutorial.
  • @
    @molecule_mindslast year In part 4 im getting an error. In scatter plot it is showing error like regplot takes from 0 to 1 positional arguments but 2 positional arguments were given.
  • @
    @febaelsamathew93483 years ago This was a topic I'm searching for several months. Since I'm from a pharma background and started learning Python, but don't have an idea how this will work for job applications to the pharmaceutical industry.
  • @
    @valentinaortiz99356 months ago So, correct me if I am wrong, but what you just taught us is how to potentially find a hit through a computational screening?
  • @
    @michaelfrimpong63443 years ago Dear Data Professor. Thanks for the wonderful videos. Please, I have been able to follow along the video up to part 3 where I am struggling. The command ... 1
  • @
    @tinacole14503 years ago Hey. Having trouble saving a fastq file and/or txt file. It prints well in my editor (VS) but I only get 1 line of data in the file when I save it.
    The file (which is part of a bigger context manager):
        for fastq_obj in fastqfile:
            fastq_obj[0]
            sequence = fastq_obj[1][0:5]
            bartrim = fastq_obj[1][5:]  # to trim sequence barcode
            data = clinical_data.loc[clinical_data.Barcode==sequence]  # we loop thru fastq
    First method:
        with open('data','w+') as m:
            print(data, file=m, end='')
            m.close()
    Next method:
        csv_data = data.to_csv(path_or_buf = '\ndata.csv', index = True)
    If I type print(data) there are tons of lines which print out, but saving only has 1 line.
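One common cause of the symptom in the comment above, assuming the `open(...)` call sits inside the loop (the comment does not show where it is placed), is that mode `'w'` or `'w+'` truncates the file every time it is opened, so only the last iteration's output survives. The sketch below contrasts the two patterns with the standard library; the row strings and file names are hypothetical stand-ins.

```python
import os
import tempfile

rows = ["read_1,AAAAA", "read_2,CCCCC", "read_3,GGGGG"]  # hypothetical stand-in data
workdir = tempfile.mkdtemp()

# Pattern A: reopening in "w" mode inside the loop truncates the file each time,
# so only the last row is left on disk.
overwritten_path = os.path.join(workdir, "overwritten.txt")
for row in rows:
    with open(overwritten_path, "w") as handle:
        print(row, file=handle)

# Pattern B: open once, outside the loop, and write every row to the same handle.
complete_path = os.path.join(workdir, "complete.txt")
with open(complete_path, "w") as handle:
    for row in rows:
        print(row, file=handle)

with open(overwritten_path) as handle:
    overwritten_lines = handle.read().splitlines()
with open(complete_path) as handle:
    complete_lines = handle.read().splitlines()

print(len(overwritten_lines), len(complete_lines))
```

The `with` statement already closes the file when the block ends, so an explicit `close()` inside it is not needed.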
  • @
    @theaieducator15953 years ago Bioinformatics isfor the helpful videos.
  • @
    @kazishahjalal685210 months ago Can anyone be kind enough to help me install the libraries? I have been trying to figure them out for about 2.5 hours and the import still does not seem to work. Whenever I write it in PyCharm it shows an error.
  • @
    @indumatisharma36463 years ago Thanks for this video. Can the model be deployed in rshiny? 1
  • @
    @MolecularMatt08217 months ago The GitHub items are not coming up for me, would you mind maybe uploading them? Other than that, these videos look amazing! My thesis is bioinformatics ... 1
  • @
    @michaelmoore75682 years ago How do you say that the best way to learn data sciences to do data science does it have anything to do with abs rule for example? Also our bio informatics . ...Expand
  • @
    @aleksandonov84133 years ago Back in the day i was reading perl in bioinformatics and it was mostly regular expressions. I guess these days it is python and machine learning. 1
  • @
    @kyleerb94733 years ago When writing the Mann-Whitney function, why not put it in a for loop to allow for multiple descriptors to be added for each call of the function? This reduced ...
  • @
    @jaggyjut3 years ago Would you be able to do a session on HIPAA-compliant application design? It will be interesting to know how to design the.