kmansouri / opera

Free and open-source application (command line and GUI) providing QSAR model predictions, as well as applicability domain and accuracy assessments, for physicochemical properties, environmental fate, and toxicological endpoints. Download the latest compiled version from the "releases" tab and run the executable installer.

License: MIT License

MATLAB 84.24% Makefile 0.06% Java 9.71% M4 5.98% HTML 0.01%

opera's People

Contributors

kmansouri, rakeshsahu24

opera's Issues

Impossible use of OPERA CL in Red Hat

Dear Kamel,
Many thanks for providing OPERA to all.
I would like to use version 2.9 from the command line on a Linux platform.
I have already installed the software, with a bit of difficulty, but that seems fine now.
Running "OPERA_CL mcr_directory -h" works without problems and prints the help to the screen.
But when I try to compute ADME properties on the Sample_50.sdf file, I get the following messages:

Endpoints to be calculated:
FUB, CLINT and CACO2

Initializing and loading models...
Error using OPERA
Default install folder was changed during installation. Update OPERA_installdir.txt file in /home/silvio/.mcrCache9.12/OPERA_1/..

As I didn't change anything during the installation process, I don't understand this message.
Do you have any idea how to deal with this problem?
Many thanks for your help.
Regards,

Silvio

problem with starting app.

When I start OPERA_UI or OPERA_CL, the app crashes or no command line is displayed.
There were no problems during installation.
Please tell me how to solve this.

Error with path in OPERA2_linux.m

Using the newest offline installation for linux, I ran into the following issue:

PaDEL calculating 2D descriptors...
Error: Unable to access jarfile /usr/local/bin/OPERA/papplication/adel-full-1.00.jar
End of descriptors calculation: Undefined function or variable 'linux'.

Digging into the source code a bit, I believe the problematic line is 419 in OPERA2_linux.m, where the file path is indeed specified as above.

Obviously this seems to just be an error with the placement of the p. However, I do believe there is another issue with the file. Upon installation, one can choose to install somewhere other than /usr/local/bin/, as I did. However, the path seems to be unchanged in this file. I can't test this with this installation since I can only find the executable, but this may be an issue.

Thanks,
Matt
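The second concern above, a jar path that stays fixed when the install location changes, can be illustrated with a minimal sketch. It assumes the corrected layout implied by the error message (an "application" folder containing padel-full-1.00.jar); the function name and the idea of passing the install directory as a parameter are hypothetical and are not taken from OPERA2_linux.m.

```python
from pathlib import PurePosixPath


def padel_jar_path(install_dir: str) -> PurePosixPath:
    """Build the PaDEL jar path from a configurable install directory.

    The folder and file names are inferred from the error message in the
    report; they are assumptions, not verified against the OPERA source.
    """
    # Joining components avoids hand-typed separators and stray characters.
    return PurePosixPath(install_dir) / "application" / "padel-full-1.00.jar"


if __name__ == "__main__":
    print(padel_jar_path("/usr/local/bin/OPERA"))
    print(padel_jar_path("/opt/tools/OPERA"))
```

Deriving the path from one configurable root means a non-default install location only has to be recorded in one place.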

Unable to open descriptors file

I'm running into an error when running OPERA that has started since upgrading from version 2.3 to 2.5. The calculation initiates, but after a minute or so it throws this error:

"Error using OPERA (line 743)
Unable to open descriptors file"

I've tried loosening the access privileges and running as administrator, but it's still throwing the same error. Is there anything else I could try to get around this error? Or any idea what would be causing this issue (I'm assuming it's not happening to everyone or this would probably already be an issue on here, i.e., maybe user error on my part)? I'm on Windows 64-bit running the CLI version with MCR included.

Original Smiles in the output file are wrong

Dear OPERA developer,

I am using OPERA 2.6 and I found that the Original Smiles in the output are wrong. I think they are Canonical_QSARr, because they are exactly the same as that column but different from my original Smiles. Some of my inputs were incompatible with the prediction, so the number of rows in the results differs from that in my input file; the original Smiles are therefore important for correlating my input with the results.

many thanks.

best regards,
Sukis

OPERA 2.2: Issues with PaDEL Calculations

Hello Kamel,

I've tried to download and run the new OPERA 2.2 code in MATLAB r2018a (Windows, 64 bit operating system). I've successfully downloaded and installed the 2.2 package provided on the github site. However, I keep encountering the following issues when running the source code during the PaDEL calculations:

Exception in thread "Thread-84" java.lang.ArrayIndexOutOfBoundsException: 1
at libpadeldescriptor.CDK_Descriptor.run(Unknown Source)
Exception in thread "Thread-91" java.lang.ArrayIndexOutOfBoundsException: 0
at Jama.Matrix.<init>(Matrix.java:113)
at libpadeldescriptor.BurdenModifiedEigenvaluesDescriptor.calculate(Unknown Source)
at libpadeldescriptor.CDK_Descriptor.run(Unknown Source)
Exception in thread "Thread-123" java.lang.ArrayIndexOutOfBoundsException: 0
at libpadeldescriptor.TopologicalChargeDescriptor.calculate(Unknown Source)
at libpadeldescriptor.CDK_Descriptor.run(Unknown Source)
Exception in thread "Thread-124" java.lang.ArrayIndexOutOfBoundsException: -1
at Jama.EigenvalueDecomposition.tred2(EigenvalueDecomposition.java:173)
at Jama.EigenvalueDecomposition.<init>(EigenvalueDecomposition.java:884)
at libpadeldescriptor.TopologicalDistanceMatrixDescriptor.calculate(Unknown Source)
at libpadeldescriptor.CDK_Descriptor.run(Unknown Source)
Exception in thread "Thread-99" java.lang.ArrayIndexOutOfBoundsException: -1
at Jama.EigenvalueDecomposition.tred2(EigenvalueDecomposition.java:173)
at Jama.EigenvalueDecomposition.<init>(EigenvalueDecomposition.java:884)
at libpadeldescriptor.DetourMatrixDescriptor.calculate(Unknown Source)
at libpadeldescriptor.CDK_Descriptor.run(Unknown Source)

These errors are causing the code to produce the same physico-chemical property (i.e., VP, BCF, etc.) estimates no matter the input files used (even when I input other chemical sdf or mol files). I am not sure what the problem is here. Thank you for your assistance.

Regards,

Derek Manheim

Knime workflow is missing in OPERA2.9_CL_par

LD_LIBRARY_PATH is .:/usr/local/MATLAB/MATLAB_Runtime/v912/runtime/glnxa64:/usr/local/MATLAB/MATLAB_Runtime/v912/bin/glnxa64:/usr/local/MATLAB/MATLAB_Runtime/v912/sys/os/glnxa64:/usr/local/MATLAB/MATLAB_Runtime/v912/sys/opengl/lib/glnxa64

 All properties will be calculated: 
General structural properties, Physchem, Env. fate, ADME and Tox Endpoints (CERAPP, CoMPARA and CATMoS)  

 Initializing and loading models...

========== Structures standardization ==========
Input structures: 3353.
Generating QSAR-ready structures...
Error using OPERA_par
No matching files named '/usr/OPERA/application/knime_4.5.1/knime-workspace/QSAR-ready_2.5.10' were found.

System: Ubuntu 22-04

Yours
Tobias

What are the minimum memory requirements for OPERA calculations?

Hi,
I deployed libOPERA_Py on a CentOS 7 server with a capacity of 1G CPU and 2G RAM. While running a melting point prediction for one molecule, the process was killed by the system.
So what are the minimum requirements for OPERA model prediction?

[root@VM-0-9-centos local]# python3
Python 3.7.0 (default, Aug 9 2021, 15:58:55)
[GCC 4.8.5 20150623 (Red Hat 4.8.5-39)] on linux
Type "help", "copyright", "credits" or "license" for more information.
import libOPERA_Py as OPERA
opera = OPERA.initialize()
pred = opera.OPERA('--SMI','/usr/local/test.smi','-o','pred.txt','-MP','-v',1)

Endpoint to be calculated: MP

Initializing and loading models...

============= Molecular Descriptors ============
Loaded structures: 1.
PaDEL calculating 2D descriptors...
PaDEL descriptors calculated for: 1 molecules.
Loading of PaDEL descriptors file...
Checking loaded variables.
Loaded 1444 PaDEL descriptors for 1 molecules.

============== Running The Models ==============
Predicting MP values (Deg. C)...
Killed
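A bare "Killed" line on Linux often means the kernel stopped the process for using too much memory, although the log alone does not confirm that here. One way to estimate the requirement is to run the same prediction on a larger machine and read the peak memory afterwards. The sketch below uses only the Python standard library; the helper name is hypothetical, and the note about child processes is an assumption about how descriptor tools are launched.

```python
import resource
import sys


def peak_memory_mb(include_children: bool = True) -> float:
    """Return the peak resident memory of this process, in megabytes.

    ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    """
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    if include_children:
        # Child processes that have finished (for example a Java descriptor
        # run, if it is started as a subprocess) are tracked separately.
        peak = max(peak, resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss)
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor


if __name__ == "__main__":
    # Call this after the prediction step has finished.
    print(f"Peak memory so far: {peak_memory_mb():.1f} MB")
```

Calling peak_memory_mb() right after the opera.OPERA(...) call in the session above would give a rough lower bound for the RAM a server needs.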

Coloured and Rearranged Output [Feature Request]

I really like having the neighbours' information within the table output. Since it's just a CSV, it is a little difficult to get a good overview of the results.
Could you consider also adding an xls output, maybe with a different layout and colour codes?
I tried to build an example table with KNIME; maybe it helps to show what I mean.
This is an example of what it could look like:
[image]

OPERA2.3_Pred_CERAPP.xlsx

If you want to have a look at the workflow it's here:
opera output to XLS 50 sample.zip

Regarding Agonist antagonist and Binders

I am using OPERA to predict activity against AR. For some chemicals the output is agonist 0, antagonist 0 and binder 1, and for others it is agonist 1, antagonist 0 and binder 0. How can a chemical be an agonist but not a binder? Is it because different datasets were used in training?
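If the three calls come from separate models, as the question suggests, nothing forces them to agree, so it can help to flag the contradictory rows before interpreting them. The following is a minimal sketch; the keys "agonist", "antagonist" and "binder" are hypothetical stand-ins for whatever the actual output columns are called.

```python
from typing import Dict, List, Sequence


def inconsistent_calls(rows: Sequence[Dict[str, int]]) -> List[int]:
    """Return indices of rows predicted active but not predicted as binders."""
    flagged = []
    for index, row in enumerate(rows):
        # Key names are placeholders, not the real OPERA column headers.
        active = row["agonist"] == 1 or row["antagonist"] == 1
        if active and row["binder"] == 0:
            flagged.append(index)
    return flagged


if __name__ == "__main__":
    predictions = [
        {"agonist": 0, "antagonist": 0, "binder": 1},
        {"agonist": 1, "antagonist": 0, "binder": 0},
    ]
    print(inconsistent_calls(predictions))
```

Flagged rows are candidates for a closer look at the applicability domain and confidence values of each individual call.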

MatLab link broken

Hi,
The link for the MATLAB Runtime download in release 2.3 is broken.

Best wishes
Tobias

Using the python libraries

Hi @kmansouri ,

I am interested in using OPERA, but to adapt it to our workflow I would really like to use it within Python. I managed to get everything running, but now that I have imported the library I am a bit stuck. How do I proceed from here to access the models? Could you point me to any documentation or a few examples that could get me started?
In addition: does the Python library need a working installation of the OPERA GUI/CL? (So far I have only installed the Python libraries.)

Thanks a lot!
Best,
Jennifer

Problems with Opera CL that are not present in the GUI version

Hello,

While trying to use OPERA CL I encountered a couple of errors that weren't present in the GUI version:

  1. I keep getting a KNIME error for the QSAR standardization. It gives me exit error code 4. I'm not sure, but this might be a problem with my version of KNIME, so I am currently trying to troubleshoot this issue.

  2. While running the CL version I get the following error, which is not present for the GUI version:

======================================================================
============== Running The Models ==============
Generating the general structural properties...
---------- PhysChem properties ----------
Predicting LogP values (Log10)...
Weighted kNN model with 9 descriptors
Predicting MP values (Deg. C)...
Weighted kNN model with 15 descriptors
Predicting BP values (Deg. C)...
Weighted kNN model with 13 descriptors
Index exceeds the number of array elements. Index must not exceed 7860.

Error in OPERA (line 2067)

MATLAB:badsubscript

Silent Installation-mode

Hi Kamel,

Is it possible to install OPERA for Windows silently?
We have some users for whom we have to install and update it. I tried on my own, but the installer seems to wait for input when run with a /s parameter.

Cheers,
Andreas

pka-b value

Hi @kmansouri,
I have a question about the pKa-b value in the output table. Is this the base pKa? Unfortunately, this is not documented.

Best wishes,
Tobias

Java vulnerability in KNIME dependency

Hello,

OPERA has been super useful, I'm a huge fan.

I have run into one problem though, which is that the underlying version of KNIME (4.1.1) uses an older version of the Java Runtime Environment (JRE), which is _jre.linux.x86_64_1.8.0.202-b08. This JRE has been flagged by my institution as being a security vulnerability.

I'm running OPERA in a docker container with an Ubuntu image, and I actually do install Ubuntu's default JRE (a more current and safe version) in the dockerfile, but the underlying version of KNIME still uses the old JRE. Is there a way to use the more current JRE? I don't know if installing the newer JRE in a certain folder might fix that.

Otherwise, can the underlying version of KNIME be updated? I imagine a more recent version of KNIME would probably have a newer JRE too.

Thanks!

Melting points for organic molecule with 90 atoms (C & H only)

Dear Prof. kmansouri
I would like to calculate the melting and boiling points (M.P. and B.P.) of some organic molecules with 90 atoms (C and H only), keeping the molecular weight below 2000 g/mol.

Could you please confirm whether OPERA can predict accurate M.P. and B.P. values for such large molecules, for which no experimental or other data are available in the literature?

Thanks

Regards
Bhamu

Java installation is required?

Hi @kmansouri

I just tried using OPERA 2.9 and encountered the error which says: "Command java not found". Does OPERA require an additional JAVA installation? If so, are there any requirements on the Version? (Sorry I cannot upload a screenshot, somehow my upload is blocked)

Thanks in advance for the answer!
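Before installing anything, it can be useful to confirm whether a java executable is visible on the PATH and which version it reports. This is a minimal sketch using only the Python standard library; it says nothing about which Java version OPERA actually requires, and the helper name is hypothetical.

```python
import shutil
import subprocess
from typing import Optional


def java_version_banner() -> Optional[str]:
    """Return the text printed by `java -version`, or None if java is absent."""
    java = shutil.which("java")
    if java is None:
        return None
    completed = subprocess.run(
        [java, "-version"], capture_output=True, text=True, check=False
    )
    # `java -version` writes its banner to stderr rather than stdout.
    return (completed.stderr or completed.stdout).strip()


if __name__ == "__main__":
    banner = java_version_banner()
    print(banner if banner is not None else "java not found on PATH")
```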

OPERA 2.6 command line parallel version crashing

Hi Kamel,

The command line parallel version of OPERA 2.6 is crashing for an input that works with the normal version:

INFO: Adding explicit H false
Aug 22, 2020 6:14:08 AM net.guha.apps.cdkdesc.CDKdescBatch batchDescriptor
INFO: Will evaluate 50 descriptors
Aug 22, 2020 6:14:08 AM net.guha.apps.cdkdesc.CDKdescBatch batchDescriptor
INFO: Got 50 descriptor instances
Exception in thread "main" java.lang.NullPointerException
        at net.guha.apps.cdkdesc.CDKDescUtils.isSMILESFormat(CDKDescUtils.java:74)
        at net.guha.apps.cdkdesc.CDKdescBatch.batchDescriptor(CDKdescBatch.java:227)
        at net.guha.apps.cdkdesc.CDKdesc.main(CDKdesc.java:510)
Aug 22, 2020 6:14:08 AM net.guha.apps.cdkdesc.CDKdescBatch batchDescriptor
INFO: output:   CDKtemp/CDKDesc_8_temp.csv
Aug 22, 2020 6:14:08 AM net.guha.apps.cdkdesc.CDKdescBatch batchDescriptor
INFO: type:     all
Aug 22, 2020 6:14:08 AM net.guha.apps.cdkdesc.CDKdescBatch batchDescriptor

From using CDK, I know that it is not thread safe - does it need a lock around CDK functionality?
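The locking idea raised above can be sketched in general terms. The example is in Python rather than Java and uses a hypothetical stand-in class for a calculator that must not be entered by two threads at once; it illustrates the technique and is not a description of how OPERA or CDK handles concurrency.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from typing import List, Sequence


class SharedCalculator:
    """Hypothetical stand-in for a library object that is not thread safe."""

    def __init__(self) -> None:
        self.active = 0
        self.max_active = 0  # highest number of simultaneous callers seen

    def calculate(self, smiles: str) -> int:
        self.active += 1
        self.max_active = max(self.max_active, self.active)
        result = len(smiles)  # placeholder for the real descriptor work
        self.active -= 1
        return result


def calculate_all(calculator: SharedCalculator, inputs: Sequence[str],
                  workers: int = 4) -> List[int]:
    """Run calculations from several threads, one at a time inside the lock."""
    lock = threading.Lock()

    def guarded(smiles: str) -> int:
        with lock:  # only one thread may use the calculator at any moment
            return calculator.calculate(smiles)

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(guarded, inputs))


if __name__ == "__main__":
    calc = SharedCalculator()
    print(calculate_all(calc, ["C", "CC", "CCC"]), calc.max_active)
```

A lock serializes the guarded step, so it trades away the speed-up for that part of the run; running each worker in its own process is the usual alternative when the library keeps shared state.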

Customising of pH for log D calculations

Hi Kamel, it would be great if the pH used could be customised. We also need pH values other than the standard ones. I suggest something like opera -s input.sdf -O output.csv -logD7.4 -logD5.0 -x -v 0 or so.

Yours
Tobias
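To show why the pH matters, here is a minimal sketch of the textbook relation between logP, pKa and logD for a monoprotic compound, assuming only the neutral species partitions into the organic phase. This is not OPERA's logD model, and the function name and arguments are hypothetical.

```python
import math


def log_d(log_p: float, pka: float, ph: float, kind: str = "acid") -> float:
    """Estimate logD at a given pH for a monoprotic acid or base.

    acid: logD = logP - log10(1 + 10**(pH - pKa))
    base: logD = logP - log10(1 + 10**(pKa - pH))
    """
    if kind == "acid":
        exponent = ph - pka
    elif kind == "base":
        exponent = pka - ph
    else:
        raise ValueError("kind must be 'acid' or 'base'")
    return log_p - math.log10(1.0 + 10.0 ** exponent)


if __name__ == "__main__":
    # Illustrative values only, not OPERA predictions.
    for ph in (5.0, 7.4):
        print(ph, round(log_d(3.0, 4.4, ph, "acid"), 3))
```

Under these assumptions an acid's logD falls as the pH rises above its pKa, which is why a value computed at one pH cannot simply be reused at another.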

Experimental results in Caco2_QR.csv

Hi,
For some datasets it is not quite clear where the experimental data field is in the provided csv files. For example, in Caco2_QR.csv I can only find 'CalculatedLogPapp', so the file seems to lack the experimental endpoint. Am I missing something here?

Storage of PaDEL and CDK descriptors and fingerprints in internal sqldb for re-use

Hi Kamel,
I have a suggestion for improving OPERA. The main bottleneck of OPERA is the slow calculation of PaDEL descriptors and fingerprints. As the descriptors and fingerprints of a given structure will not change, I suggest adding an internal database that stores the descriptors together with the SMILES, InChIKey, or other identifiers characterizing the structure. For compounds that match an entry, the stored values are used, and descriptors are calculated only for new compounds. This is also good for the climate, as unnecessary computing is avoided.

Best
Tobias
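The caching idea can be sketched with Python's built-in sqlite3 module. Everything here is hypothetical: the table layout, the class and function names, and the compute callback that stands in for the expensive descriptor run. It shows the lookup-then-compute pattern, not an existing OPERA feature.

```python
import json
import sqlite3
from typing import Callable, Dict, List, Optional

Descriptors = Dict[str, float]


class DescriptorCache:
    """Store descriptor values keyed by a structure identifier (e.g. InChIKey)."""

    def __init__(self, path: str = ":memory:") -> None:
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS descriptors ("
            "structure_id TEXT PRIMARY KEY, payload TEXT NOT NULL)"
        )

    def get(self, structure_id: str) -> Optional[Descriptors]:
        row = self.conn.execute(
            "SELECT payload FROM descriptors WHERE structure_id = ?",
            (structure_id,),
        ).fetchone()
        return json.loads(row[0]) if row else None

    def put(self, structure_id: str, values: Descriptors) -> None:
        self.conn.execute(
            "INSERT OR REPLACE INTO descriptors VALUES (?, ?)",
            (structure_id, json.dumps(values)),
        )
        self.conn.commit()


def descriptors_for(ids: List[str], cache: DescriptorCache,
                    compute: Callable[[str], Descriptors]) -> Dict[str, Descriptors]:
    """Return descriptors for every id, computing only the cache misses."""
    result = {}
    for structure_id in ids:
        values = cache.get(structure_id)
        if values is None:
            values = compute(structure_id)  # expensive step, run once per id
            cache.put(structure_id, values)
        result[structure_id] = values
    return result
```

In practice the key would probably also need to include the descriptor software version, so that cached values are not reused after an upgrade changes the calculation.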

question about the datasets used for training

Hello,
For the provided OPERA datasets, I am a bit confused about what the update files mean (for example, LogP_2.9_update.csv). Are they additional experimental data used for training the models, in addition to the original sdf files (ending in QR), or are they estimates from the predictions? I ask especially because the activity values provided in these updates are given as 'value_point_estimate'.

Thanks,
Marawan

Loading Fingerprint files

Dear Kamel,

Thanks for sharing the project, it's working out very well for me.

To save time, I'm trying to load both the descriptor file and the fingerprint file for a new calculation.
The descriptor file works fine but I can't get the program to load the fingerprints.
Is this supported?

Thanks,

Peter

Java 8 dependency of PaDEL

Hi Kamel,
In enterprise environments, Java is not free anymore. Either the company pays for the runtime JRE or it switches to OpenJDK. Could OPERA also support OpenJDK, if possible?

Thanks and best wishes,
Tobias

CDK Exception

The following exception is thrown when processing the attached SDF file. Using CL version of OPERA. All outputs are enabled.

Note: file extension changed from .sdf to .txt because GitHub wouldn't let me upload the SDF file for some reason.

=============== Exceptions ===============
441326609 lengthOverBreadth org.openscience.cdk.exception.CDKException: Error in center of mass calculation, has exact mass been set on all atoms?

ARV-471.txt

Problem generating padel descriptors

Hi,

Thank you for such a useful library. One question: why do I sometimes get an error saying that PaDEL descriptors have not been generated for some compounds? And in other cases I get NaNs for pKa. Why?

Experimental values and predictions

Hi @kmansouri ,

I have been using OPERA and currently it works well in the GUI and also via python. Now I would like to do a comparison of OPERA to our internal tools for KOC prediction. However, when I used the validation data from OPERA (downloaded from the QMRF report) and predicted this dataset with the OPERA KOC model I got an RMSE of 0.1 with an R² of 0.99.

It looks like I do not get predictions, rather I get the experimental values stored in the training dataset. I could not find anything documented for this behavior so I just would like to make sure that my assumption is correct and I always get an experimental value if a molecule is present in the training (or test?) dataset. Is this correct? If so, is there a way to check which returned values are actual predictions and which were part of the training/test data? This would be essential if we want to compare OPERA to our internal tools.

Thanks in advance for your clarification.
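For a comparison like the one described above, it helps to compute the statistics explicitly and to list which returned values coincide exactly with the reference values, since those may be stored data rather than model output. This is a minimal sketch; R² is computed here as the coefficient of determination, which may differ from the definition used in the QMRF report, and the tolerance is an arbitrary assumption.

```python
import math
from typing import List, Sequence, Tuple


def rmse_and_r2(observed: Sequence[float],
                predicted: Sequence[float]) -> Tuple[float, float]:
    """Return (RMSE, R^2) with R^2 = 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two sequences of equal length (at least 2)")
    n = len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    mean_obs = sum(observed) / n
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values are constant; R^2 is undefined")
    return math.sqrt(ss_res / n), 1.0 - ss_res / ss_tot


def exact_matches(observed: Sequence[float], predicted: Sequence[float],
                  tol: float = 1e-6) -> List[int]:
    """Indices where the returned value equals the reference within tol."""
    return [i for i, (o, p) in enumerate(zip(observed, predicted))
            if abs(o - p) <= tol]
```

Recomputing the statistics after dropping the indices from exact_matches gives a rough picture of performance on compounds that were actually predicted.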

Potentially missing setup.py from libOPERA2.8_Py.tar.gz?

I recently upgraded from v2.6 to v2.8.2. For v2.6 I've been using Docker to build and install OPERA: the build downloads the *_Py.tar.gz release, unzips it, cd's into the unzipped directory, and runs "python setup.py install". This method works with v2.6.

For v2.8 (https://github.com/kmansouri/OPERA/releases/download/v2.8.2/libOPERA2.8_Py.tar.gz), when I unzip the contents and look at what's inside the unzipped folder, I am no longer seeing a setup.py file and my Docker build fails as it can't find the setup module.

Has the installation of libOPERA2.8_Py changed since version 2.6?

thanks!
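A quick way to see how an archive is meant to be installed is to list any install entry points it contains before the Docker build depends on them. The sketch below uses Python's tarfile module; the set of file names it looks for is an assumption about common packaging layouts, not a statement about what the v2.8 archive contains.

```python
import tarfile
from pathlib import PurePosixPath
from typing import List


def find_install_files(archive_path: str) -> List[str]:
    """List members of a tar archive that look like Python install entry points."""
    wanted = {"setup.py", "pyproject.toml"}  # assumed names, adjust as needed
    with tarfile.open(archive_path, "r:*") as archive:
        return sorted(
            name for name in archive.getnames()
            if PurePosixPath(name).name in wanted
        )


if __name__ == "__main__":
    import sys

    for member in find_install_files(sys.argv[1]):
        print(member)
```

An empty result would suggest that the archive has to be installed some other way, or that the setup file sits inside a nested archive or installer.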

Standalone QSAR ready model

Hi @kmansouri, It would be great, if you could modify the OPERA UI such that it is possible to calculate QSAR ready smiles without selection of any model in order to use them for other purposes than Opera.

Txs,
Tobias

Consistency of endpoints

Hi,
The endpoints and their abbreviations in the field headers should be harmonised and well documented. This would be helpful for the automated processing of the output files.

One example:
AOH: LogOH, but AD_AOH.

Examples like this may be typos, but this part should be revised and harmonised.

Yours
Tobias

does opera GUI perform de novo estimates?

I am interested in using OPERA to estimate physicochemical parameters, specifically pKa. I downloaded the GUI, and it seems that I only get pKa values that are already in the CompTox dashboard: when the structure matches an existing structure I see the results, otherwise it reports that the compound is not in the database. Is it possible to get de novo predictions for compounds NOT currently in the CompTox database? Thanks.

Opera 2.8 - issues with the structural-standardization

Hello,
I just installed the new OPERA 2.8 version, and I'm encountering the same error no matter which input file I use (mol, sdf, smi, txt): No structures passed standardization. Check input file!
This issue also occurs with the mol files provided by the US EPA CompTox Chemicals Dashboard.
I currently have KNIME version 4.5.2 installed on my PC. Could this version of KNIME be causing problems with the implemented QSAR-ready workflow? Have other users experienced similar issues with this new version of OPERA?
Thank you for your feedback!
Best,
Lidia

Trouble installing

I may be daft, but I can't find the OPERA2.9_CL_win.zip file anywhere in the repo?
