nsh87 / receptormarker

Source for 'receptormarker' package for R: antibody receptor and phenotypic marker analysis

Home Page: http://receptormarker.com

License: BSD 2-Clause "Simplified" License

R 22.20% Shell 0.40% JavaScript 39.95% CSS 0.11% Python 0.14% Perl 37.19% Raku 0.03%

receptormarker's People

Contributors

catterbu, daniel0128, mma5usf, nsh87

receptormarker's Issues

No hover effect for radial phylo widget

The pointer-events: none CSS in 076aad8 is meant to make the "Save Image" text link work in RStudio, but it removes the hover effects on the phylogram. You used that CSS because in RStudio the <div id=htmlwidgets_container> gets put on top of the "Save Image" link, causing the link to not work. The link actually doesn't work in RStudio regardless, so maybe just take this CSS out or play with the SizingPolicy of the widget so the div doesn't lie on top of the text link in RStudio.

Conditionally limit maximum number of plots per row for cluster membership plot

I've capped the site at 20 clusters, but even with 12 clusters the x-axis labels of the cluster membership plot start to overlap.

12 clusters
[Screenshot: cluster membership plot with 12 clusters]

Additionally, the individual plots become very compressed along the x-axis.

20 clusters
[Screenshot: cluster membership plot with 20 clusters]

If you have control over the number of plots per row, can you just add a check on num_clust and if it's less than 10 then keep up to five plots per row; if it's 10-15 then limit it to 4 plots per row; and perhaps if it's 16-20 then 3 plots per row? I think doing this would take care of both the issues (the overlapping x-axis labels and the illegible graph at ~20 clusters).
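
The proposed check could look roughly like the sketch below. The function name plots_per_row() is hypothetical and not part of the package; the thresholds are the ones suggested above. If the plot is built with ggplot2's facet_wrap(), the returned value could presumably be passed as its ncol argument, but that is an assumption about how the plot is drawn.

```r
# Hypothetical helper (not in receptormarker): pick the maximum number of
# plots per row from the number of clusters, using the thresholds above.
plots_per_row <- function(num_clust) {
  if (num_clust < 10) {
    5   # fewer than 10 clusters: up to five plots per row
  } else if (num_clust <= 15) {
    4   # 10-15 clusters: four plots per row
  } else {
    3   # 16-20 clusters (the site's cap): three plots per row
  }
}

cols_12 <- plots_per_row(12)
cols_20 <- plots_per_row(20)
```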

Convergence not running locally

I think this has something to do with my version of Perl?

> library(receptormarker)
> convergence(d, seqs_col='clone', verbose=TRUE)
Unrecognized switch: --textfile=/var/folders/w_/vqlk70xx1l954jj0qsswxsnc0000gn/T//RtmpsJBGGA/63341c83ea67/sequences-deduped-633418921ed1.txt  (-h will show valid options).
Error: Convergence output not found: no convergence groups, or error.

This happens when calling the function from R, but not when running the file from my terminal. It seems to work on the site, though.

Typo in docs

Typo in clust_boxplot() documentation: It utlizes facet_wrap.

No legend when annotating columns by specific value

When using rings=c(col='all') then every column value gets annotated and there's a legend to explain which color corresponds to which value. If we instead annotate a column by a specific value, there's no legend: rings=c(col='some_val'). This becomes particularly troublesome when we annotate specific values in multiple columns because then we don't know which ring corresponds to what.

See PR #68

Unable to pass additional arguments to kmeans() in multi_clust()

The function is defined by

multi_clust <- function(d, krange = 2:10, iter.max = 300, runs = 10, method = "kmeans", ...)

but the ellipsis isn't included in the call to kmeans:

kmm <- stats::kmeans(d, k, iter.max = iter.max, nstart = 10)

Docs say ... Further arguments to be passed to kmeans.
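
A minimal sketch of what forwarding the ellipsis might look like. The wrapper name kmeans_with_dots() is made up for illustration and is not the package's code; only the stats::kmeans() call mirrors the line quoted above.

```r
# Hypothetical wrapper (not the package's multi_clust()): forwards `...`
# to stats::kmeans() alongside the arguments that are already hard-coded.
kmeans_with_dots <- function(d, k, iter.max = 300, ...) {
  stats::kmeans(d, centers = k, iter.max = iter.max, nstart = 10, ...)
}

set.seed(1)
d <- matrix(rnorm(200), ncol = 2)

# `algorithm` is a real kmeans() argument and reaches it through `...`.
fit <- kmeans_with_dots(d, 3, algorithm = "Lloyd")
```

One thing to watch with this approach: because nstart = 10 is hard-coded in the call, a caller who also passes nstart through ... would get a "matched by multiple actual arguments" error, so nstart would need to become a named parameter if it should be configurable.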

Reduce size of cluster membership plot's outliers

It can be very hard to read plots with such large dots for the outliers. They're outliers, but they take over, and it's hard to see the bars, which are really the important part. Can the size of them be reduced (significantly)?

[Screenshot: cluster membership plot with large outlier points]

Checking usage of NbClust with existing data sets on the dev site

The purpose of this issue is to track tests of the current dev version of the package against the master version.

Why? The current dev version contains two significant changes:

  1. The modification of the multi_clust() function to use NbClust to determine the optimal k
    • Has the ability to change results of tests on the same data set
    • Has the potential to error more often than before (on various data sets) due to the increased complexity of NbClust
    • Has the ability to take an inordinate amount of time due to the number of algorithms tested by NbClust
  2. The transition of the multiClust object to a multiClust S4 class
    • Has the ability to break images on the site if the new class is not referenced properly in the code

The testing will involve running the same data sets on the staging site (which has the master branch of receptormarker installed) and a local dev version (which has the dev branch installed) and comparing the outcome of various data sets.

Checked boxes below indicate that the same result is obtained with NbClust as before, and that none of the other potential issues described above have been observed.

Estimate k (select "Replace empty cells with: 0" on the site):

  • cmv-fluidym.csv
    • It has been ~15 mins and the task is still not complete. I don't know if it's still running or if it has hung.
    • Update: the local dev machine with 1.5GB RAM has run out of memory, as indicated by Rserve. Increasing RAM to 2GB.
    • Also added the ability to dictate what index NbClust should use ('all' or 'alllong') to try 'all' to see if it decreases memory requirements and increases performance - 84778c7.
    • Ran again - the task exceeded the 1hr time limit. This means that although the task is still maxing out the CPU of the backend server, the frontend considers it to have errored out and will not receive the response even if the task finishes.
  • single-cell-boolean.csv
    • This is also taking a ridiculous amount of time, even after upgrading the local site to 2GB RAM. It hasn't finished in ~20 mins. Wait, there is an error: ..Error in multiclust[["k_best"]] : this S4 class is not subsettable... This was due to the clustering task on the frontend server using the old list notation for the multiClust structure. I've updated it to use multiclust@k_best, for the new S4 class. Test again.
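
For reference, the list-versus-S4 difference behind that error can be reproduced with a stand-in class. The class definition below is a simplified assumption with a single k_best slot, not the package's actual multiClust definition.

```r
library(methods)

# Stand-in S4 class with one slot; the real multiClust class may differ.
setClass("multiClust", representation(k_best = "numeric"))
multiclust <- new("multiClust", k_best = 3)

# Old list-style access fails on an S4 object...
old_style <- try(multiclust[["k_best"]], silent = TRUE)

# ...while slot access with @ works.
new_style <- multiclust@k_best
```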

FYI, @catterbu.

Missing doc for multiClust class

The class should have documentation that tells you which methods are available and/or the names of the objects in the list. Hadley docs.

Rather than describing the class in the Return Value section of multi_clust(), link from there to the documentation of the class. You could then also add a "see also" link to the multiClust documentation from your plotting functions.

See documentation for ?muscle::muscle for an example.

Kmeans iter.max not being respected in multi_clust()

So the call to kmeans in multi_clust() is

kmm <- stats::kmeans(d, k, iter.max = iter.max, nstart = 10)
# multi_clust() makes iter.max = 300 by default

which is all good and should work properly, but for some reason kmeans is trying only 10 iterations:

fclust <- multi_clust(f, krange=2:20)
Warning messages:
1: did not converge in 10 iterations 
2: did not converge in 10 iterations 
3: did not converge in 10 iterations 
4: did not converge in 10 iterations 
5: did not converge in 10 iterations 
6: did not converge in 10 iterations 

No matter what arguments I pass to multi_clust(), I cannot get kmeans to run anything other than 10 iterations. If I call kmeans separately, it works fine:

# Here's me using 3 iterations instead of 10
stats::kmeans(f, centers=20, iter.max=3, nstart=10)
Warning messages:
1: did not converge in 3 iterations 
2: did not converge in 3 iterations 
3: did not converge in 3 iterations 
4: did not converge in 3 iterations 
5: did not converge in 3 iterations 
6: did not converge in 3 iterations 
7: did not converge in 3 iterations 
8: did not converge in 3 iterations 
9: did not converge in 3 iterations 
10: did not converge in 3 iterations 
