Comments (10)

jkrijthe commented on August 17, 2024

Thanks for the bug report. Do you have a reproducible example to make it easier to debug the issue?

from rtsne.

ChenMorSays commented on August 17, 2024

I do. The following script runs on a matrix with about 3 million columns and about 150 rows:

################################## Install packages if required ##################################
list.of.packages <- c("ggplot2", "Rtsne", "gridExtra")
new.packages <- list.of.packages[!(list.of.packages %in% installed.packages()[,"Package"])]
if(length(new.packages)) install.packages(new.packages)
library(Rtsne)
library(ggplot2)  # needed by plot_cluster() below
##################################################################################################

#1. load and transpose the example matrix
exampleMatrix = read.csv('example.matrix', sep='\t', na.strings="NA", header=TRUE, row.names=1)

##we must normalize the example matrix and then transpose
example_t = t(scale(exampleMatrix, center = TRUE, scale = TRUE));
srs <- read.csv("all-available.txt", sep='\t',na.strings="NA", header=TRUE, row.names=1)

filter_example_matrix <- function (use_case_srs) {
  sr_annotations_with_example_t <- which( row.names(example_t) %in% row.names(use_case_srs) )
  use_case_samples <- example_t[sr_annotations_with_example_t,]
  
  return(use_case_samples)
}

use_case_samples <- filter_example_matrix(srs)
example_df <- as.data.frame(use_case_samples)## essential for K means
example_df.matrix <-data.matrix(example_df)

## Curating the database for analysis with both t-SNE and PCA
Labels <- row.names(example_df)

## for plotting
colors = rainbow(length(unique(Labels)))
names(colors) = unique(Labels)

# ~~~~~~~~~~~~~~~~~~~~~~~~~~ WILL CRASH AFTER THE FOLLOWING LINE! ~~~~~~~~~~~~~~~~~~~~~~~~~~

## Executing the algorithm on curated data
d_tsne_1 <- Rtsne(example_df[,-1], dims = 3, perplexity=30, verbose=TRUE, max_iter = 500)
#exeTimeTsne <- system.time(Rtsne(example_df[,-1], dims = 3, perplexity=30, verbose=TRUE, max_iter = 500))

## keeping the t-SNE coordinates (columns V1, V2, V3) as a data frame for clustering and plotting
d_tsne_1_original=as.data.frame(d_tsne_1$Y)

print("Executing k-means")

## Creating k-means clustering model, and assigning the result to the data used to create the tsne
fit_cluster_kmeans=kmeans(scale(d_tsne_1$Y), 3)

## Export clusters into a CSV file for verification purposes
tsne_clusters <- fit_cluster_kmeans$cluster
write.csv(tsne_clusters, file="tsne_clusters.csv")

d_tsne_1_original$cl_kmeans = factor(fit_cluster_kmeans$cluster)

## Creating hierarchical cluster model, and assigning the result to the data used to create the tsne
fit_cluster_hierarchical=hclust(dist(scale(d_tsne_1$Y)))

## setting 3 clusters as output
d_tsne_1_original$cl_hierarchical = factor(cutree(fit_cluster_hierarchical, k=3))  

#Plotting the cluster models onto t-SNE output
#Now time to plot the result of each cluster model, based on the t-SNE map.

plot_cluster=function(data, var_cluster, palette)  
{
  ggplot(data, aes_string(x="V1", y="V2", color=var_cluster)) +
    geom_point(size=0.25) +
    guides(colour=guide_legend(override.aes=list(size=6))) +
    xlab("") + ylab("") +
    ggtitle("") +
    theme_light(base_size=20) +
    theme(axis.text.x=element_blank(),
          axis.text.y=element_blank(),
          legend.direction = "horizontal", 
          legend.position = "bottom",
          legend.box = "horizontal") + 
    scale_colour_brewer(palette = palette) 
}

plot_k=plot_cluster(d_tsne_1_original, "cl_kmeans", "Accent")  
plot_h=plot_cluster(d_tsne_1_original, "cl_hierarchical", "Set1")

## and finally: putting the plots side by side with gridExtra lib...
library(gridExtra)  
grid.arrange(plot_k, plot_h,  ncol=2)

## Export the plot into a PDF file for further analysis
## (the plot must be drawn while the PDF device is open)
pdf("tsne_quality_control_plot.pdf",width=7,height=5)
grid.arrange(plot_k, plot_h,  ncol=2)
dev.off()


jkrijthe commented on August 17, 2024

Since I do not have the data, I cannot run the script directly, and so far I have been unable to reproduce the behavior on my machine. My first guess is that something goes wrong when the data.frame is converted to the matrix used in Rtsne. Have you tried running Rtsne with example_df.matrix instead of example_df? In any case, a simpler example script that uses generated data and triggers the same bug would help me reproduce the problem.

jkrijthe commented on August 17, 2024

Unless we have more information/a reproducible example, I'm afraid I can't be of much help. Thanks for reporting the error.

swati051 commented on August 17, 2024

Same issue, please help!

jkrijthe commented on August 17, 2024

Do you have a reproducible example or more details that would help us track down the issue?

Troelsmou commented on August 17, 2024

I had the same issue as this one. I was able to fix it by converting the data to a numeric matrix: I removed every column that was not integer or double and then converted the rest with as.matrix.
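
A minimal base R sketch of that kind of conversion follows. The data frame `df` and its column names are hypothetical stand-ins, not the data from this thread, and whether the conversion avoids the crash for a given dataset is not confirmed here.

```r
# Hypothetical data frame mixing non-numeric and numeric columns.
df <- data.frame(
  id    = c("s1", "s2", "s3"),
  snp_a = c(0L, 1L, 2L),
  snp_b = c(2L, 0L, 1L),
  score = c(0.5, 1.5, 2.5),
  stringsAsFactors = FALSE
)

# Keep only integer or double columns, then convert to a matrix.
numeric_cols <- vapply(df, is.numeric, logical(1))
x <- as.matrix(df[, numeric_cols, drop = FALSE])

# Store every value as double so the matrix has a single numeric type.
storage.mode(x) <- "double"
```

The resulting `x` can be passed to `Rtsne()` in place of the data frame.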

jkrijthe commented on August 17, 2024

Thanks for the report and solution. Do you happen to have a small reproducible example for when this behaviour occurs, or was it specific to the dataset you were working on?

Troelsmou commented on August 17, 2024

I think it's specific to the dataset. The dataset consists of 848259 integer columns with values of 0, 1 or 2, and 157 rows.
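
Generated data with the same shape could serve as a starting point for the reproducible example requested above. The sketch below is an assumption-laden stand-in: the column count is cut to 2000 so it runs quickly, the values are drawn uniformly at random, and nothing in this thread confirms that such data triggers the crash.

```r
set.seed(1)

n_rows <- 157
n_cols <- 2000  # hypothetical; far fewer than the 848259 columns reported

# Integer matrix with values 0, 1 or 2, mimicking the described dataset.
x <- matrix(sample(0:2, n_rows * n_cols, replace = TRUE),
            nrow = n_rows, ncol = n_cols)

# Data frame with integer columns, as in the reported use case.
example_df <- as.data.frame(x)
```

Raising `n_cols` toward the reported size, and then passing first `example_df` and then `x` to `Rtsne()`, would show whether the data.frame input is what matters.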

SamGG commented on August 17, 2024

I am quite dubious about using tSNE to represent a small number of points. OK, it works on iris, but still.
a) I would try running tSNE with a couple of seeds and a lower perplexity to check how stable the output is.
b) When the number of dimensions is higher than 50, a PCA is applied by default. I would try computing it beforehand and giving the result to Rtsne, which probably avoids all those strange behaviors that I have never encountered.
c) When pre-computing the PCA, I would use the first 2 components to set the Y_init parameter and check whether early exaggeration needs to be restored, as setting Y_init skips it. Check the doc about this.
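
Suggestions (a) to (c) could be sketched roughly as follows. The random matrix `x`, the perplexity of 10 and the choice of three seeds are illustrative assumptions. The `Rtsne()` calls run only if the package is installed, and this sketch does not attempt to restore early exaggeration.

```r
set.seed(42)
x <- matrix(rnorm(150 * 200), nrow = 150)  # hypothetical stand-in data

# (b) Pre-compute the PCA and keep the first 50 components.
pca <- prcomp(x, center = TRUE, scale. = FALSE)
pcs <- pca$x[, 1:50]

# (c) First two components as a candidate initial layout.
y_init <- pcs[, 1:2]

if (requireNamespace("Rtsne", quietly = TRUE)) {
  # (a) Several seeds with a lower perplexity, to compare the layouts.
  runs <- lapply(1:3, function(seed) {
    set.seed(seed)
    Rtsne::Rtsne(pcs, dims = 2, perplexity = 10, pca = FALSE)$Y
  })

  # PCA-initialised run; pca = FALSE because the PCA was already applied.
  fit_init <- Rtsne::Rtsne(pcs, dims = 2, perplexity = 10,
                           pca = FALSE, Y_init = y_init)
}
```

Plotting the matrices in `runs` side by side gives a quick visual check of how much the layout changes between seeds.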
