
Illegal instruction (higashi issue, closed)

sshen82 commented on August 16, 2024
Illegal instruction

from higashi.

Comments (8)

ruochiz commented on August 16, 2024

Ah... I see. Still not sure why it didn't work on the previous machine. An illegal instruction error can also be caused by things like a non-Intel CPU trying to use the mpl library; basically an incompatible combination of CPU, OS, and compiled Python library. But your PyTorch example itself worked... I'm thinking it could be some "advanced" functions in PyTorch that are used by Higashi... Anyway, good to see that it works on the other machine.
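
One way to look into the CPU / compiled-library mismatch described above is to compare the instruction-set flags that the two machines report. The following is a minimal sketch, assuming Linux (where /proc/cpuinfo exists). The helper names and the list of flags to check are assumptions for illustration; the thread does not establish which instructions the installed libraries actually require.

```python
def cpu_flags(cpuinfo_text):
    """Return the set of CPU feature flags found in /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # x86 kernels label the line "flags"; ARM kernels use "Features".
        if sep and key.strip() in ("flags", "Features"):
            return set(value.split())
    return set()


def missing_flags(cpuinfo_text, wanted=("sse4_2", "avx", "avx2", "fma")):
    """List the flags from `wanted` (an assumed list) that the CPU does not report."""
    have = cpu_flags(cpuinfo_text)
    return [flag for flag in wanted if flag not in have]


# Usage on a Linux machine:
#     with open("/proc/cpuinfo") as fh:
#         print(missing_flags(fh.read()))
```

Running this on both the failing and the working machine and comparing the results would show whether the failing CPU lacks an instruction set that the other one has.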


sshen82 commented on August 16, 2024

I tried both pytorch 1.4.0 and 1.10.0 and got the same error. Is this because my platform only has CPUs?


ruochiz commented on August 16, 2024

That's a strange bug. I tested on a CPU-only device and this error did not appear. Could you do the following to help with debugging?

  1. Try running a small pytorch neural network training example to see if it can finish. (A small example with forward / backward would be fine.)
  2. Instead of ./xx.sh, run the command python main_cell.py -xxx directly to see if there is more detailed error information.
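
If running the command directly prints nothing more, Python's standard faulthandler module may help: when enabled, it dumps the Python traceback on fatal signals, including SIGILL. Below is a rough sketch, not part of Higashi. The helper names are made up for illustration, and the script arguments are passed through unchanged.

```python
import signal
import subprocess
import sys


def run_with_faulthandler(args):
    """Run a script in a child interpreter with faulthandler enabled.

    `args` is the script path followed by its own arguments.
    """
    cmd = [sys.executable, "-X", "faulthandler", *args]
    return subprocess.run(cmd, capture_output=True, text=True)


def describe_exit(returncode):
    """Turn a subprocess return code into a short human-readable description."""
    if returncode < 0:
        # On POSIX, a negative return code -N means the child died from signal N.
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            name = f"signal {-returncode}"
        return f"killed by {name}"
    return f"exited with code {returncode}"
```

Calling run_with_faulthandler with main_cell.py and the same arguments the shell script passes, then printing describe_exit(proc.returncode) and proc.stderr, should show which Python line was executing when the process was killed.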


sshen82 commented on August 16, 2024

For the first part, I tried all the examples in https://github.com/jcjohnson/pytorch-examples that only use the pytorch package, and all of them work well.

For the second, it is actually showing fewer messages...

[Screenshot: 1636948976(1)]


ruochiz commented on August 16, 2024

Could you try:

In the JSON file, set cpu_num = 1

This helps to check whether the error is triggered by the multiprocessing part of the code. Also, is the machine you are running Higashi on a cluster? Some clusters have strict mpl limitations configured that disable launching child processes within Python.
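
The config change above can also be scripted rather than edited by hand. This is a minimal sketch that assumes the config file is valid JSON with a top-level "cpu_num" key; the function name and the path argument are hypothetical.

```python
import json


def set_cpu_num(config_path, cpu_num=1):
    """Rewrite the JSON config at `config_path` with the given cpu_num."""
    with open(config_path) as fh:
        config = json.load(fh)
    config["cpu_num"] = cpu_num  # 1 keeps the run on a single worker
    with open(config_path, "w") as fh:
        json.dump(config, fh, indent=2)
    return config
```

All other keys are written back unchanged, so the run differs from the original only in the number of workers.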


sshen82 commented on August 16, 2024

It still gives an illegal instruction error. By the way, below is the JSON file.
({"config_name": "Ren2019", "data_dir": "/p/keles/schic/volumeA/Ren2019/higashi11132021",
"temp_dir": "/p/keles/schic/volumeA/Ren2019/higashi11132021/temp", "genome_reference_path": "/u/s/s/sshen82/Rfile/Hi-C/mm9.chrom.sizes",
"cytoband_path": "/u/s/s/sshen82/Rfile/Hi-C/mm9_cytoBand.txt",
"chrom_list": ["chr1", "chr2", "chr3", "chr4", "chr5", "chr6", "chr7", "chr8", "chr9", "chr10", "chr11", "chr12", "chr13", "chr14", "chr15", "chr16", "chr17", "chr18", "chr19", "chrX"],
"resolution": 1000000, "resolution_cell": 1000000, "local_transfer_range": 1, "dimensions": 64, "loss_mode": "zinb", "rank_thres": 1, "embedding_name": "ren_embed",
"impute_list": ["chr1"], "minimum_distance": 1000000, "maximum_distance": -1, "neighbor_num": 3, "cpu_num": 1, "gpu_num": 0})

I don't think I am using a cluster. I think it is just a "computer" with a lot of CPUs (not very sure). Is there a way to check?


ruochiz commented on August 16, 2024

Are the jobs submitted through SLURM or any other management system? If not, it's a PC as you said.

Is there any other machine you can test it on? If not, would it be possible for you to share a small fraction of the dataset for me to test on my end?
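
The question of whether jobs go through SLURM or another management system can be checked roughly from Python. This is a sketch under assumptions: it looks for job-ID environment variables that SLURM, PBS/Torque, and LSF set inside a running job, and for their submit commands on the PATH. A negative result does not rule out a cluster, for example when run on a login node without the client tools installed.

```python
import os
import shutil

# Job-ID variables set inside a running job by common schedulers.
SCHEDULER_VARS = {
    "SLURM_JOB_ID": "SLURM",
    "PBS_JOBID": "PBS/Torque",
    "LSB_JOBID": "LSF",
}

# Submit commands that suggest a scheduler client is installed.
SUBMIT_COMMANDS = ("sbatch", "qsub", "bsub")


def detect_scheduler(environ=None):
    """Return the scheduler name if the process runs inside a job, else None."""
    env = os.environ if environ is None else environ
    for var, name in SCHEDULER_VARS.items():
        if env.get(var):
            return name
    return None


def submit_commands_on_path():
    """Return the scheduler submit commands found on the PATH."""
    return [cmd for cmd in SUBMIT_COMMANDS if shutil.which(cmd)]
```

If detect_scheduler() returns None and submit_commands_on_path() is empty, the machine is most likely a standalone workstation or server.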


sshen82 commented on August 16, 2024

I tried it on another machine and it worked. I thought it would be the same on all machines since they are connected... You can ignore the email I sent you. I will also ask my collaborator to run it. Thank you for your help!
