nsh87 / receptormarker

Source for 'receptormarker' package for R: antibody receptor and phenotypic marker analysis

Home Page: http://receptormarker.com

License: BSD 2-Clause "Simplified" License

R 22.20% Shell 0.40% JavaScript 39.95% CSS 0.11% Python 0.14% Perl 37.19% Raku 0.03%

receptormarker's People

Contributors

Stargazers

Watchers

Forkers

zinedinshiqin catterbu djq99 mma5usf daniel0128 teaguesterling weiqi0608 biolchen lh848

receptormarker's Issues

Package fails to build because lintr incorrectly lints .onLoad and .onAttach functions

Already asked the developer how to get lintr to ignore the line...now waiting for a response. There seems to be a fix on GitHub but that version won't install on R 3.2. You either have to:

wait for the fix to be pushed to CRAN
see if the developer tells you how to ignore the line
try to integrate the fix on GitHub into this package yourself

Exists as of e4babee.

Fix PCA plot legend colors being the same

See image below:

No hover effect for radial phylo widget

The pointer-events: none CSS in 076aad8 enables the "Save Image" text link to work in RStudio, but removes the hover effects on the phylogram. You used that CSS because in RStudio the <div id=htmlwidgets_container> gets put on top of the "Save Image" link, causing the link to not work. It actually doesn't work in RStudio regardless, so maybe just take this CSS out or play with the SizingPolicy of the widget so the div doesn't lie on top of the text link in RStudio.

Conditionally limit maximum number of plots per row for cluster membership plot

I've capped the site at 20 clusters, but even with 12 clusters the x-axis labels of the cluster membership plot start to overlap.

12 clusters

Additionally, the individual plots become very compressed along the x-axis.

20 clusters

If you have control over the number of plots per row, can you just add a check on num_clust and if it's less than 10 then keep up to five plots per row; if it's 10-15 then limit it to 4 plots per row; and perhaps if it's 16-20 then 3 plots per row? I think doing this would take care of both the issues (the overlapping x-axis labels and the illegible graph at ~20 clusters).
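The thresholds proposed above could be sketched as a small helper. This is only an illustration: `plots_per_row()` is a hypothetical name, not a function in the package, and it assumes the plot's layout accepts a column count (for example the `ncol` argument of `ggplot2::facet_wrap()`).

```r
# Hypothetical helper: map the number of clusters to a plots-per-row limit,
# following the thresholds suggested in this issue.
plots_per_row <- function(num_clust) {
  stopifnot(is.numeric(num_clust), length(num_clust) == 1, num_clust >= 1)
  if (num_clust < 10) {
    min(5, num_clust)  # up to five plots per row
  } else if (num_clust <= 15) {
    4                  # 10-15 clusters
  } else {
    3                  # 16-20 clusters (the site caps at 20)
  }
}

# Assumed usage inside the plotting code (column name `cluster` is illustrative):
#   ggplot2::facet_wrap(~ cluster, ncol = plots_per_row(num_clust))
```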

Convergence not running locally

I think this has something to do with my version of Perl?

> library(receptormarker)
> convergence(d, seqs_col='clone', verbose=TRUE)
Unrecognized switch: --textfile=/var/folders/w_/vqlk70xx1l954jj0qsswxsnc0000gn/T//RtmpsJBGGA/63341c83ea67/sequences-deduped-633418921ed1.txt  (-h will show valid options).
Error: Convergence output not found: no convergence groups, or error.

This happens when the function is called from R, but not when I run the file from my terminal. It seems to work on the site, though.

Typo in docs

Typo in clust_boxplot() documentation: It utlizes facet_wrap.

No legend when annotating columns by specific value

When using rings=c(col='all') then every column value gets annotated and there's a legend to explain which color corresponds to which value. If we instead annotate a column by a specific value, there's no legend: rings=c(col='some_val'). This becomes particularly troublesome when we annotate specific values in multiple columns because then we don't know which ring corresponds to what.

See PR #68

Unable to pass additional arguments to kmeans() in multi_clust()

The function is defined by

multi_clust <- function(d, krange = 2:10, iter.max = 300, runs = 10, method = "kmeans", ...)

but the ellipsis isn't included in the call to kmeans:

kmm <- stats::kmeans(d, k, iter.max = iter.max, nstart = 10)

The docs say: "... Further arguments to be passed to kmeans."
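A minimal sketch of what forwarding the ellipsis might look like. `multi_clust_sketch()` is a stripped-down stand-in written for this illustration, not the package's actual implementation; only the `stats::kmeans()` call mirrors the line quoted above.

```r
# Stand-in for multi_clust(): runs kmeans for each k and forwards `...`.
multi_clust_sketch <- function(d, krange = 2:10, iter.max = 300, ...) {
  fits <- vector("list", length(krange))
  for (i in seq_along(krange)) {
    # `...` is passed through, so callers can set e.g. `algorithm`
    fits[[i]] <- stats::kmeans(d, krange[i], iter.max = iter.max,
                               nstart = 10, ...)
  }
  fits
}

set.seed(1)
d <- matrix(rnorm(200), ncol = 2)
fits <- multi_clust_sketch(d, krange = 2:3, algorithm = "Lloyd")
```

Because `nstart = 10` is hard-coded in the call, passing `nstart` through `...` in this sketch would fail with a "matched by multiple actual arguments" error; exposing it as a named argument would avoid that.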

Have `multi_clust` handle NA's

This just needs to remove all incomplete cases.
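One way this could look, as a sketch: `drop_incomplete()` is a hypothetical helper name, and the warning is an optional extra not requested in the issue. It relies only on `stats::complete.cases()`.

```r
# Hypothetical helper: keep only rows with no missing values.
drop_incomplete <- function(d) {
  keep <- stats::complete.cases(d)
  if (!all(keep)) {
    # Optional: tell the caller how many rows were dropped
    warning(sprintf("Removed %d row(s) containing NA values", sum(!keep)))
  }
  d[keep, , drop = FALSE]
}

d <- data.frame(a = c(1, NA, 3), b = c(4, 5, NA), c = c(7, 8, 9))
clean <- suppressWarnings(drop_incomplete(d))
```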

Fix "could not find function "aes"" error on Cluster Membership plot

Need to fix error caused by using ggplot instead of ggplot2 in f21e609.

Reduce size of cluster membership plot's outliers

It can be very hard to read plots with such large dots for the outliers. They're outliers, but they take over, and it's hard to see the bars, which are really the important part. Can the size of them be reduced (significantly)?

Checking usage of NbClust with existing data sets on the dev site

The purpose of this issue is to track tests of the current dev version of the package against the master version.

Why? The current dev version contains two significant changes:

The modification of the multi_clust() function to use NbClust to determine the optimal k
- Has the ability to change results of tests on the same data set
- Has the potential to error more often than before (on various data sets) due to the increased complexity of NbClust
- Has the ability to take an inordinate amount of time due to the number of algorithms tested by NbClust
The transition of the multiClust object to a multiClust S4 class
- Has the ability to break images on the site if the new class is not referenced properly in the code

The testing will involve running the same data sets on the staging site (which has the master branch of receptormarker installed) and a local dev version (which has the dev branch installed) and comparing the outcome of various data sets.

Checked boxes below indicate that the same result is obtained with NbClust as before, and that none of the other potential issues described above have been observed.

Estimate k (select "Replace empty cells with: 0" on the site):

cmv-fluidym.csv
- It has been ~15 mins and the task is still not complete. I don't know if it's still running or if it has hung.
- Update: the local dev machine with 1.5GB RAM has run out of memory, as indicated by Rserve. Increasing RAM to 2GB.
- Also added the ability to dictate what index NbClust should use ('all' or 'alllong') to try 'all' to see if it decreases memory requirements and increases performance - 84778c7.
- Ran again - the task exceeded the 1hr time limit. This means that although it's still maxing out the CPU of the backend server, the frontend considers the task to have errored out and will not receive the response even if the task finishes.
single-cell-boolean.csv
- This is also taking a ridiculous amount of time, after upgrading local site to 2GB RAM. It hasn't finished in ~20 mins. Wait, there is an error: ..Error in multiclust[["k_best"]] : this S4 class is not subsettable... This was due to the clustering task on the frontend server using the old notation for the multiClust structure. I've updated it to use multiclust@k_best, for the new S4 class. Test again.
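The difference between list-style and slot-style access can be reproduced with a toy class. `multiClustDemo` below is a made-up class for illustration only; the real multiClust class has its own definition in the package.

```r
library(methods)

# Toy S4 class with a single slot, standing in for multiClust.
setClass("multiClustDemo", representation(k_best = "numeric"))
obj <- new("multiClustDemo", k_best = 4)

# Slots are read with `@` (or methods::slot()), not `[[`.
k_at   <- obj@k_best
k_slot <- methods::slot(obj, "k_best")

# List-style access errors for an S4 object that defines no `[[` method;
# the error message is captured here instead of stopping the script.
res <- tryCatch(obj[["k_best"]], error = function(e) conditionMessage(e))
```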

FYI, @catterbu.

Test case for arg validation should use data frame

See comment on bed3b2c.

Missing doc for multiClust class

The multiClust class should have documentation that tells you which methods are available and/or the names of the objects in the list. See the Hadley docs.

Rather than describing the class in the Return Value section of the multi_clust() documentation, link from there to the documentation of the class. You could then also add a "see also" pointing to the multiClust documentation from your plotting functions.

See documentation for ?muscle::muscle for an example.

[DUPLICATE] Test case for arg validation should use data frame

Kmeans iter.max not being respected in multi_clust()

So the call to kmeans in multi_clust() is

kmm <- stats::kmeans(d, k, iter.max = iter.max, nstart = 10)
# multi_clust() makes iter.max = 300 by default

which is all good and should work properly, but for some reason kmeans is trying only 10 iterations:

fclust <- multi_clust(f, krange=2:20)
Warning messages:
1: did not converge in 10 iterations 
2: did not converge in 10 iterations 
3: did not converge in 10 iterations 
4: did not converge in 10 iterations 
5: did not converge in 10 iterations 
6: did not converge in 10 iterations

No matter what I do to the arguments into multi_clust(), I cannot change kmeans from doing just 10 iterations. If I call kmeans separately, it works fine:

# Here's me using 3 iterations instead of 10
stats::kmeans(f, centers=20, iter.max=3, nstart=10)
Warning messages:
1: did not converge in 3 iterations 
2: did not converge in 3 iterations 
3: did not converge in 3 iterations 
4: did not converge in 3 iterations 
5: did not converge in 3 iterations 
6: did not converge in 3 iterations 
7: did not converge in 3 iterations 
8: did not converge in 3 iterations 
9: did not converge in 3 iterations 
10: did not converge in 3 iterations