qile0317 / sofocompbio22

Updates and analysis scripts for my SoFo project.

License: BSD 3-Clause "New" or "Revised" License

Julia 20.19% Jupyter Notebook 79.81%

sofocompbio22's Introduction

SoFoCompBio22 - work in progress

If you miraculously came here from the physical copy of my research report, hi! I hope you enjoyed it, and that more progress has been made on the project since.

If you stumbled upon this repo, you may be confused. SoFoCompBio is an abbreviation of "SOmmarFOrskarskola i berakningsbiologi och bioinformatik" (summer research school in computational biology and bioinformatics), a computational biology research internship at Karolinska Institutet. I had the pleasure of attending its first iteration in the summer of 2022, as one of 5 students selected nationally in Sweden, at the Karlsson Hedestam/Murrell Lab. I was also fortunate enough to be accepted to the "sommarforskaskola med biomedicinskt inriktning" (summer research school with a biomedical focus), Karolinska Institutet's long-running wet-lab summer research program. Together, these gave me the unique opportunity to conduct a joint wet-lab and computational research project.

The project aimed to expand the known camelid germline VHH repertoire in two ways: NGS of PBMC mRNA samples from a huarizo, and a computational genome mining algorithm.

The genome mining algorithm has been uploaded as a preliminary package at https://github.com/Qile0317/KmerGMA.jl

Project abstract - (Project is unfinished)

The alpaca adaptive immune system produces, in part, heavy-chain-only antibodies, characterized by variable and constant regions referred to as VHH and CHH. Procuring a comprehensive and diverse alpaca VHH gene repertoire is essential for understanding B cell biology and has numerous benefits in therapeutics and research through the use of alpaca nanobodies. However, the known repertoire is far from complete. Here, we contribute to the repertoire by creating and executing a modified 5' RACE protocol for next-generation sequencing of both VHH and conventional VH mRNA transcripts from a huarizo (Vicugna pacos × Lama glama). The resulting sequenced repertoire contained over 600 thousand high-quality VDJ and VHH transcripts, including 300 thousand IgM transcripts that can be processed in subsequent studies with germline inference tools and experimental verification. Rudimentary phylogenetic and V-gene assignment analyses pointed strongly to the existence of novel germline alleles in our sequenced repertoire compared to the IMGT database, and subsequent analyses suggested a strong correlation between isotype frequency and the presence of nanobody hallmark mutations. Additionally, we propose a novel, fast genome mining algorithm for V gene discovery. Our unoptimized Julia implementation of the algorithm was applied to camelid V gene loci and the VicPac3.2 full-coverage alpaca genome, and successfully found both exact and approximate matches in linear time. In conclusion, our study used two approaches to advance the current alpaca V gene repertoire toward completion, with considerable potential for improvement.
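
The abstract does not spell out how the genome mining algorithm works, so the following is only a generic illustration of how a k-mer based scan can find exact and approximate matches in time linear in the genome length. It is a minimal Python sketch, not the KmerGMA.jl implementation. Everything in it is an assumption made for illustration: the function names (`kmer_counts`, `sq_distance`, `scan_genome`), the choice of squared Euclidean distance between k-mer count vectors, and the `k` and `threshold` parameters.

```python
from collections import Counter


def kmer_counts(seq, k):
    """Count every overlapping k-mer in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def sq_distance(a, b):
    """Squared Euclidean distance between two sparse k-mer count vectors."""
    return sum((a.get(key, 0) - b.get(key, 0)) ** 2 for key in a.keys() | b.keys())


def scan_genome(genome, reference, k=3, threshold=0):
    """Return (start, distance) for each window whose k-mer profile lies
    within `threshold` of the reference profile (hypothetical helper)."""
    w = len(reference)
    if w < k or len(genome) < w:
        return []

    # diff[kmer] = count in current window - count in reference.
    diff = kmer_counts(genome[:w], k)
    diff.subtract(kmer_counts(reference, k))
    dist = sum(v * v for v in diff.values())

    hits = [(0, dist)] if dist <= threshold else []
    for start in range(1, len(genome) - w + 1):
        leaving = genome[start - 1:start - 1 + k]    # k-mer dropped on the left
        entering = genome[start + w - k:start + w]   # k-mer gained on the right
        # Update the distance incrementally instead of recounting the window.
        for kmer, delta in ((leaving, -1), (entering, 1)):
            old = diff[kmer]
            new = old + delta
            dist += new * new - old * old
            diff[kmer] = new
        if dist <= threshold:
            hits.append((start, dist))
    return hits


if __name__ == "__main__":
    # Toy input: the reference is embedded in the genome at position 4.
    print(scan_genome("TTTTACGTACGGTTTT", "ACGTACGG", k=3, threshold=0))
```

Each window shift touches only two k-mers, so the scan costs a constant amount of work per genome position for a fixed `k`. A distance of zero means the k-mer profiles agree, which does not by itself guarantee an identical sequence, so in this sketch hits are best treated as candidates to confirm by alignment. Raising `threshold` admits approximate matches.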

Overview of the Repo

  • Data: some of the sequence data used in the paper, limited to files that weren't too large.
  • Figures: a collection of cool figures generated in the analysis that didn't make it into the paper due to the word limit.
  • AnalysisScripts: the scripts (except the Vsearch commands) described in my methods section that were used to process the rep-seq data.
