
kmerificator's Introduction

kMerificator

kmerificator's People

Contributors

alexanderdilthey, evanbiederstedt


Forkers

evanbiederstedt

kmerificator's Issues

Future feature enhancement: update README, add --help flag to source code

I should probably add a --help flag to this code so users can see the parameters and what they mean.

The README should document them as well. The current argument parsing in kMerificator.cpp:

if(arguments.at(i) == "--vcf")
{
	vcfFile = arguments.at(i+1);
}

if(arguments.at(i) == "--referenceGenome")
{
	referenceGenome = arguments.at(i+1);
}

if(arguments.at(i) == "--deBruijnGraph")
{
	deBruijnGraph = arguments.at(i+1);
}

if(arguments.at(i) == "--outputDirectory")
{
	outputDirectory = arguments.at(i+1);
}


if(arguments.at(i) == "--cortex_height")
{
	cortex_height = Utilities::StrtoI(arguments.at(i+1));
}

if(arguments.at(i) == "--cortex_width")
{
	cortex_width = Utilities::StrtoI(arguments.at(i+1));
}

if(arguments.at(i) == "--k")
{
	kMer_size = Utilities::StrtoI(arguments.at(i+1));
}

if(arguments.at(i) == "--threads")
{
	threads = Utilities::StrtoI(arguments.at(i+1));
}

if(arguments.at(i) == "--onlyPASS")
{
	onlyPASS = Utilities::StrtoI(arguments.at(i+1));
}

if(arguments.at(i) == "--maximumHaplotypePairs")
{
	maximumHaplotypePairs = Utilities::StrtoI(arguments.at(i+1));
}

if(arguments.at(i) == "--regions")
{
	regions = arguments.at(i+1);
}	
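
A --help flag along these lines could be sketched as follows. This is a minimal sketch, not the project's implementation: the flag names are taken from the parsing code above, the value kinds (string or integer) are read off whether the value goes through Utilities::StrtoI, and optionDocs, usageText, helpRequested and valueAfter are hypothetical helper names. No parameter meanings are filled in, since the issue does not state them. valueAfter also shows one way to replace the bare arguments.at(i+1) lookups, which throw std::out_of_range when a flag is the last argument.

```cpp
#include <cstddef>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// One documented command-line option. Flag names come from kMerificator.cpp;
// the value kind reflects how the value is parsed there (plain string vs.
// Utilities::StrtoI). Descriptions are deliberately left out (not in the issue).
struct OptionDoc {
    std::string flag;
    std::string valueKind;
};

// Hypothetical helper: the table that --help would print.
inline std::vector<OptionDoc> optionDocs() {
    return {
        {"--vcf", "string"},
        {"--referenceGenome", "string"},
        {"--deBruijnGraph", "string"},
        {"--outputDirectory", "string"},
        {"--cortex_height", "integer"},
        {"--cortex_width", "integer"},
        {"--k", "integer"},
        {"--threads", "integer"},
        {"--onlyPASS", "integer"},
        {"--maximumHaplotypePairs", "integer"},
        {"--regions", "string"},
    };
}

// Build the usage text as a string so it can be printed or unit-tested.
inline std::string usageText(const std::string& program) {
    std::ostringstream out;
    out << "Usage: " << program << " [options]\n\nOptions:\n";
    for (const OptionDoc& opt : optionDocs()) {
        out << "  " << opt.flag << " <" << opt.valueKind << ">\n";
    }
    out << "  --help\n";
    return out.str();
}

// True if any argument is exactly "--help".
inline bool helpRequested(const std::vector<std::string>& arguments) {
    for (const std::string& arg : arguments) {
        if (arg == "--help") {
            return true;
        }
    }
    return false;
}

// Return the value following arguments[i], or throw a readable error
// when the flag is the last argument and has no value.
inline std::string valueAfter(const std::vector<std::string>& arguments,
                              std::size_t i) {
    if (i + 1 >= arguments.size()) {
        throw std::invalid_argument("Missing value after " + arguments.at(i));
    }
    return arguments[i + 1];
}
```

In main, a check such as if (helpRequested(arguments)) { std::cout << usageText("kMerificator"); return 0; } placed before the parsing loop would print the list and exit.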

Installing on Mac OS and Ubuntu, future README revision

This is more of a note to myself, but it could also feed into a future README revision.

To compile on a Mac (circa 2020, macOS v10.15.2):

Recall that gcc on macOS isn't GCC: by default the gcc command invokes Apple's clang. And the LLVM libraries installed via brew do not contain the standard C++ libraries.
https://stackoverflow.com/questions/14128298/using-homebrew-gcc-and-llvm-with-c-11
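
To check which toolchain a given command actually resolves to, a small probe built with the same compiler and flags can help. The following is a sketch under assumptions: compilerName, standardLibraryName and openmpEnabled are hypothetical helper names, and the checks rely only on the predefined macros __apple_build_version__, __clang__, __GNUC__, _LIBCPP_VERSION, __GLIBCXX__ and _OPENMP.

```cpp
#include <string>

// Name the compiler that built this translation unit.
// Order matters: clang also defines __GNUC__ for compatibility,
// and Apple clang additionally defines __apple_build_version__.
inline std::string compilerName() {
#if defined(__apple_build_version__)
    return "Apple clang";
#elif defined(__clang__)
    return "clang";
#elif defined(__GNUC__)
    return "GCC";
#else
    return "unknown";
#endif
}

// Name the C++ standard library in use. These macros become visible
// once any standard header (here <string>) has been included.
inline std::string standardLibraryName() {
#if defined(_LIBCPP_VERSION)
    return "libc++";
#elif defined(__GLIBCXX__)
    return "libstdc++";
#else
    return "unknown";
#endif
}

// True when the compiler was invoked with OpenMP enabled (e.g. -fopenmp).
inline bool openmpEnabled() {
#ifdef _OPENMP
    return true;
#else
    return false;
#endif
}
```

Printing these three values from a one-line main, compiled once with gcc and once with the clang++ line used in the Makefile, shows whether the two commands really differ on a given machine.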

So the Makefile I wrote up uses the following:

## boost libraries installed via brew
INC_BOOST = /usr/local/Cellar/boost/1.72.0/include
LIB_BOOST = /usr/local/Cellar/boost/1.72.0/lib
INCS = -I$(INC_BOOST) 
LIBS = -L$(LIB_BOOST) -lboost_system -lboost_filesystem
### explicitly set these
DIR_OBJ = /Users/evanbiederstedt/kMerificator/obj
DIR_BIN = /Users/evanbiederstedt/kMerificator/bin
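## I suspect I've named these incorrectly: by GNU Make convention, CXX is the
## C++ compiler and CXXFLAGS its flags, while CPP and CPPFLAGS refer to the
## C preprocessor.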
CPP = clang++ -std=c++11 -stdlib=libc++
CPPFLAGS = -I/usr/local/opt/llvm/include -fopenmp
LDFLAGS = -L/usr/local/opt/llvm/lib
COPTS  = -ggdb -O2 -fopenmp -std=gnu++0x -fstack-protector-all
## CFLAGS = 
COMPILE = $(CPP) $(COPTS) $(INCS) $(CPPFLAGS) $(LDFLAGS)

Here is the entire Makefile, which worked:

# LIBRARY SETTINGS - SET AS NECESSARY


INC_BOOST = /usr/local/Cellar/boost/1.72.0/include
LIB_BOOST = /usr/local/Cellar/boost/1.72.0/lib
INCS = -I$(INC_BOOST) 
LIBS = -L$(LIB_BOOST) -lboost_system -lboost_filesystem


MKDIR_P = mkdir -p

.PHONY: directories
	
# END LIBRARY SETTINGS

#
# object and binary dirs
#

DIR_OBJ = /Users/evanbiederstedt/kMerificator/obj
DIR_BIN = /Users/evanbiederstedt/kMerificator/bin

CPP = clang++ -std=c++11 -stdlib=libc++
CPPFLAGS = -I/usr/local/opt/llvm/include -fopenmp
LDFLAGS = -L/usr/local/opt/llvm/lib
CXX    = g++
COPTS  = -ggdb -O2 -fopenmp -std=gnu++0x -fstack-protector-all
## CFLAGS = 
COMPILE = $(CPP) $(COPTS) $(INCS) $(CPPFLAGS) $(LDFLAGS)

VPATH = Data:Graph:NextGen:hash:hash/deBruijn:hash/sequence:GraphAligner:GraphAlignerUnique:readFilter
        
OBJS = \
        $(DIR_OBJ)/Utilities.o \
        $(DIR_OBJ)/DeBruijnGraph.o \
        $(DIR_OBJ)/DeBruijnElement.o \
        $(DIR_OBJ)/basic.o \
        $(DIR_OBJ)/binarykMer.o \
        $(DIR_OBJ)/Hsh.o \
        $(DIR_OBJ)/Validator.o
#
# list executable file names
#
EXECS = kMerificator

OUT_DIR = ../obj ../bin

directories: ${OUT_DIR}


#
# compile and link
#
default:
	@echo
	@echo " to build:"
	@echo "    make all"
	@echo
	@echo " to clean:"
	@echo "    make clean"
	@echo "    make realclean"
	@echo

all: directories $(EXECS)

$(EXECS): $(OBJS)
	$(foreach EX, $(EXECS), $(COMPILE) $(EX).cpp -c -o $(DIR_OBJ)/$(EX).o;)
	$(foreach EX, $(EXECS), $(COMPILE) $(OBJS) $(DIR_OBJ)/$(EX).o -o $(DIR_BIN)/$(EX) $(LIBS);)

$(DIR_OBJ)/%.o: %.cpp %.h
	$(COMPILE) $< -c -o $@


#
# odds and ends
#
clean:
	/bin/rm $(DIR_OBJ)/*

realclean: clean
	/bin/rm $(DIR_BIN)/*

${OUT_DIR}:
	${MKDIR_P} ${OUT_DIR}

README additions, outputs

Additions to README

Method: With / without VCF

  • Total characters: Length of the optimal pair of considered haplotypes (the method assumes diploidy, so this is 2 x chromosome length)

  • Total non-gap characters: Non-gap characters in the optimal pair of considered haplotypes

  • kMers: Number of kmers in the optimal pair of haplotypes

  • kMers invalid: Of these, the number of invalid kmers (e.g. containing unexpected characters)

  • kMers present: Of the valid kmers, the number recovered from the de Bruijn graph

  • Unweighted optimality: Proportion of kmers in the optimal pair of considered haplotypes classified as 'recovered'

  • Coverage-weighted optimality: ??
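
The summary fields above can be illustrated with a small sketch. It is not kMerificator's implementation and rests on several labeled assumptions: the gap character is '_', a kmer is valid only if it consists of uppercase A, C, G and T, kmers are read from the gap-stripped sequence, the de Bruijn graph is stood in for by a std::unordered_set of kmer strings, reverse complements are ignored, and the unweighted optimality divides by all kmers (invalid ones included). All type and function names are hypothetical. Coverage-weighted optimality is left out because the notes do not define it.

```cpp
#include <cstddef>
#include <string>
#include <unordered_set>

// Summary counters named after the README fields (names are hypothetical).
struct HaplotypePairSummary {
    std::size_t totalCharacters = 0;
    std::size_t nonGapCharacters = 0;
    std::size_t kMers = 0;
    std::size_t kMersInvalid = 0;
    std::size_t kMersPresent = 0;

    // Assumption: the denominator is the count of all kmers in the pair.
    double unweightedOptimality() const {
        if (kMers == 0) {
            return 0.0;
        }
        return static_cast<double>(kMersPresent) / static_cast<double>(kMers);
    }
};

// Assumption: a kmer is valid only if it contains nothing but A, C, G, T.
inline bool isValidKMer(const std::string& kMer) {
    return kMer.find_first_not_of("ACGT") == std::string::npos;
}

// Add one haplotype's counts to the summary.
inline void addHaplotype(const std::string& haplotype, std::size_t k,
                         const std::unordered_set<std::string>& graphKMers,
                         HaplotypePairSummary& summary) {
    const char gap = '_';  // assumed gap character
    std::string ungapped;
    for (char c : haplotype) {
        if (c != gap) {
            ungapped.push_back(c);
        }
    }
    summary.totalCharacters += haplotype.size();
    summary.nonGapCharacters += ungapped.size();
    if (k == 0 || ungapped.size() < k) {
        return;  // no kmers can be formed
    }
    for (std::size_t i = 0; i + k <= ungapped.size(); ++i) {
        const std::string kMer = ungapped.substr(i, k);
        ++summary.kMers;
        if (!isValidKMer(kMer)) {
            ++summary.kMersInvalid;
        } else if (graphKMers.count(kMer) > 0) {
            ++summary.kMersPresent;  // valid and found in the stand-in graph
        }
    }
}

// Summarize a pair of haplotypes (diploid case: two sequences).
inline HaplotypePairSummary summarizePair(
    const std::string& h1, const std::string& h2, std::size_t k,
    const std::unordered_set<std::string>& graphKMers) {
    HaplotypePairSummary summary;
    addHaplotype(h1, k, graphKMers, summary);
    addHaplotype(h2, k, graphKMers, summary);
    return summary;
}
```

If the intended denominator is the number of valid kmers instead, only unweightedOptimality would need to change, dividing by kMers minus kMersInvalid.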


- Level: Row ID, 0-indexed
- ReferenceCoordinate: Location of the base pair, 1-indexed
- kMersInduced:
- kMersInvalid:
- kMersOK:
- InducedkMersInReference:
- DiploidChromotype:
- ChromotypeLostPhase:
- GapsAtLevel:
