cdk / cdk-paper-3

Repository with the Latex source code for the CDK III paper.

TeX 55.02% Makefile 0.60% R 1.92% Java 14.02% C++ 27.38% Shell 0.74% Perl 0.33%
cdk-paper cheminformatics java

cdk-paper-3's Introduction

The Chemistry Development Kit (CDK)

Copyright © 1997-2024 The CDK Development Team

License: LGPL v2, see LICENSE.txt

Home Page | JavaDoc | Wiki | Issues | Mailing List

Introduction

The CDK is an open-source Java library for cheminformatics and bioinformatics.

Key Features:

  • Molecule and reaction valence bond representation.
  • Read and write file formats: SMILES, SDF, InChI, Mol2, CML, and others.
  • Efficient molecule processing algorithms: Ring Finding, Kekulisation, Aromaticity.
  • Coordinate generation and rendering.
  • Canonical identifiers for fast exact searching.
  • Substructure and SMARTS pattern searching.
  • ECFP, Daylight, MACCS, and other fingerprint methods for similarity searching.
  • QSAR descriptor calculations.
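
Fingerprint-based similarity searching, as mentioned in the list above, is commonly scored with the Tanimoto coefficient: the number of bits set in both fingerprints divided by the number of bits set in either. The following is a minimal sketch of that calculation on plain java.util.BitSet values. It does not use the CDK's own API; the class name TanimotoSketch and the choice to return 0.0 for two empty fingerprints are assumptions made for illustration.

```java
import java.util.BitSet;

/** Illustrative only: Tanimoto similarity on plain BitSets, not the CDK API. */
final class TanimotoSketch {

    /**
     * Returns |A AND B| / |A OR B|.
     * Assumption: two empty fingerprints are given a similarity of 0.0.
     */
    static double tanimoto(BitSet a, BitSet b) {
        // Work on copies so the caller's fingerprints are not modified.
        BitSet intersection = (BitSet) a.clone();
        intersection.and(b);
        BitSet union = (BitSet) a.clone();
        union.or(b);

        int unionBits = union.cardinality();
        if (unionBits == 0) {
            return 0.0;
        }
        return (double) intersection.cardinality() / unionBits;
    }

    public static void main(String[] args) {
        BitSet fp1 = new BitSet(8);
        fp1.set(0);
        fp1.set(2);
        fp1.set(5);

        BitSet fp2 = new BitSet(8);
        fp2.set(0);
        fp2.set(5);
        fp2.set(7);

        // Two shared bits (0, 5) out of four distinct bits (0, 2, 5, 7).
        System.out.println(tanimoto(fp1, fp2)); // prints 0.5
    }
}
```

In practice the bit sets would come from one of the fingerprint methods listed above rather than being set by hand.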

Install

The CDK is a class library intended to be used by other programs; it will not run as a stand-alone program.

The library is built with Apache Maven and currently requires Java 1.7 or later. From the root of the project, run the following command to build the JAR files for each module. Afterwards, the bundle/target/ directory contains the main JAR with all dependencies included:

$ mvn install

You can also download a pre-built library JAR from releases.

Include the main JAR on the Java classpath when compiling and running your code:

$ javac -cp cdk-2.9.jar MyClass.java
$ java -cp cdk-2.9.jar:. MyClass

If you are using Maven, you can use the uber cdk-bundle to grab everything. Note that it is much more efficient to include only the modules you need:

<dependency>
  <artifactId>cdk-bundle</artifactId>
  <groupId>org.openscience.cdk</groupId>
  <version>2.9</version>
</dependency>

If you are a Python user, the Cinfony project provides access via Jython. Noel O'Boyle's Cinfony provides a wrapper around the CDK and other toolkits, exposing core functionality as a consistent API. ScyJava can also be used, as explained in ChemPyFormatics.

Further details on building the project in integrated development environments (IDEs) are available on the wiki.

Getting Help

The Toolkit-Rosetta Wiki Page provides some examples for common tasks. If you need help using the CDK or have questions, please use the user mailing list, [email protected] (you must subscribe before you can post).

Acknowledgments

The CDK developers use YourKit to profile and optimise code.

YourKit supports open source projects with its full-featured Java Profiler. YourKit, LLC is the creator of YourKit Java Profiler and YourKit .NET Profiler, innovative and intelligent tools for profiling Java and .NET applications.

cdk-paper-3's People

Contributors

egonw, gilleain, johnmay, jonalv, miquelrojascherto, olas, rajarshi, steinbeck, tomas-pluskal

cdk-paper-3's Issues

more optimistic conclusion

"Looks good, but perhaps CDK paper deserves more optimistic conclusion? In fact performance in CDK 1.5 is much better than previous version (apart from personal experience, John had shown stats on this)."

Sentence not clear "That former" - New Builders

The sentence "Originally, the CDK was developed as a shared library between JChemPaint and Jmol. The former used a MVC ..."
In my opinion, it is not clear what "the former" refers to: the CDK, JChemPaint, or Jmol.

More performance benchmarks?

It would be nice to show performance benchmarks comparing the latest release (or current master) with a 1.4 or older release - to make it clear that performance has significantly improved. Features to benchmark could include

  • type perception (speed and coverage)
  • SMILES parsing/generation
  • Fingerprinting (depending on fingerprint type this will include aromaticity, SMARTS performance)

The downside is that this is a good chunk of work, though maybe the test cases could be reused for this purpose.
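
As a rough illustration of what such a comparison could look like, here is a minimal timing harness that uses only the Java standard library. It is a sketch under stated assumptions, not an existing CDK tool: the class name BenchmarkSketch, the method meanNanos, and the placeholder workload are all hypothetical. The idea would be to run the same workload (for example, parsing a fixed set of SMILES strings) once against each CDK version and compare the reported means. For rigorous numbers a dedicated framework such as JMH would be the better choice.

```java
import java.util.function.Supplier;

/** Hypothetical timing harness; the task passed in would be the CDK call under test. */
final class BenchmarkSketch {

    /**
     * Calls the task `warmup` times untimed, then returns the mean
     * wall-clock nanoseconds per call over `runs` timed calls.
     */
    static <T> double meanNanos(Supplier<T> task, int warmup, int runs) {
        if (runs <= 0) {
            throw new IllegalArgumentException("runs must be positive");
        }
        // Keep the last result reachable so the JIT is less likely to discard the work.
        Object sink = null;
        for (int i = 0; i < warmup; i++) {
            sink = task.get();
        }
        long start = System.nanoTime();
        for (int i = 0; i < runs; i++) {
            sink = task.get();
        }
        long elapsed = System.nanoTime() - start;
        if (sink == null) {
            System.err.println("note: task returned null");
        }
        return (double) elapsed / runs;
    }

    public static void main(String[] args) {
        // Placeholder workload; replace with the operation to compare across versions.
        double ns = meanNanos(() -> "c1ccccc1".toUpperCase(), 1_000, 10_000);
        System.out.printf("mean: %.1f ns/op%n", ns);
    }
}
```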

LICSS

"Nice Paper! LICSS (current development version 3.2.1; released July 2016) uses CDK 1.5.13. The project is now on GitHub: https://github.com/KevinLawson/excel-cdk. The original paper was: "LICSS - a Chemical Spreadsheet in Microsoft Excel", Kevin R Lawson and Jonty Lawson, Journal of Cheminformatics, 2012, 4:3
DOI: 10.1186/1758-2946-4-3.

I would be very happy to see a reference to LICSS in your paper if you thought it appropriate."

More recent bib info for Segler2016

The article was published today as an accepted article in Chem. Eur. J.
http://onlinelibrary.wiley.com/doi/10.1002/chem.201604556/full

here is the bibtex information, copied from that website.
Maybe you want to change the Journal title to Chem. Eur. J.

@article{Segler2016,
  author = {Segler, Marwin H.S. and Waller, Mark},
  title = {Modelling Chemical Reasoning to Predict and Invent Reactions},
  journal = {Chemistry – A European Journal},
  issn = {1521-3765},
  url = {http://dx.doi.org/10.1002/chem.201604556},
  doi = {10.1002/chem.201604556},
  pages = {n/a--n/a},
  keywords = {Reaction Prediction, Graph Theory, Artificial Intelligence, Chemical Reasoning},
  year = {2016},
}

If CDK-paper3 gets accepted fast enough, or a DOI is already available, I can add it to the references of the above article as well, as I have not submitted the proofs yet.

Make a list with all files to add as Supp Info

I'm going to need this list during resubmission, to ensure I don't forget files... and I don't want to put it in a zip file, and try to take full advantage of the new Figshare functionality...

Moving internal paragraph conclusion - Improved Coding Standards

After the sentence "The next sections describe some approaches the project have adopted that allows us to maintain the CDK library as it is today." the reader expects the list of sections, but another paragraph follows instead. I would suggest moving the paragraph "However, perhaps the biggest factor ..." to after all the listed sections (Modularization, Documentation, etc.) have been described.

Code Examples

Maybe we should render the code examples to PDFs. I'm not sure I trust the formatting to get this looking nice - the verbatim doesn't look great at the moment. The problem is these would then be inline images which would need to be labelled?

add MDL molfile S group support?

"John has had fantastic blog posts and posts on google plus on the SDG. All of that, in particular the graphics, should go into the article.
I am talking about posts like:

http://efficientbits.blogspot.de/2015/11/bringing-molfile-sgroups-to-cdk.html
http://efficientbits.blogspot.de/2015/09/bringing-molfile-sgroups-to-cdk-demo.html

but in fact many more that John recently posted. Do not hesitate to include all of this in the article - it will make it much richer. Happy to help."

add ReactPRED

We have used CDK 1.5 to build a reaction prediction system. The tool was recently published in the journal Bioinformatics. The details are as follows:

  1. Tool Name: ReactPRED
  2. Publication details: 10.1093/bioinformatics/btw491
  3. CDK version used: 1.5
  4. Website: https://sourceforge.net/projects/reactpred/

missing address

\address[id=aff2]{
  %\orgname{??},
  %\street{D\"{u}sternbrooker Weg 20},
  %\postcode{24105},
  %\city{Kiel},
  \cny{UK}
}

Change to more optimistic conclusion - AtomType

In my opinion, the sentence "That makes the code complex, and at times slow, but so far shown sufficiently performant for many applications" reads as too pessimistic a conclusion. A possible alternative could be:

  • That makes the code complex, with a consequent reduction in speed, but it still performs very successfully for many applications.

Paper title

I think it would be better not to use the number 3 in the title of the paper. It will inevitably create some confusion in the future when CDK version 3 is released.
