kate-fie / syndirella

Generates and scores synthetically practical elaborations from fragment screens

Home Page: https://syndirella.readthedocs.io/en/latest/

Python 15.31% Jupyter Notebook 83.72% Makefile 0.03% Shell 0.94%

syndirella's People

Contributors

kate-fie, rsanchezgarc

syndirella's Issues

Format retrosynthesis output for manual review

To support manual route input, a function is needed that outputs the top 10 (?) retrosynthesis routes from Manifold. Human curation then selects the exact route to input into Syndirella.

Send Max two step examples

For Max to be prepared for two-step samples, I need to give him an example of what the metadata .csv will look like.

Profiling Pipeline

I need to profile my pipeline to understand:

  1. Which of my own functions are inefficient. These show up when ordering by internal time.
  2. Which functions are inefficient because they call a lot of other functions. These show up when ordering by cumulative time.

Internal vs. Cumulative Time:
Internal time focuses on the performance of the function's own code, useful for direct code optimization.
Cumulative time can indicate "hot spots" in code that are expensive due to both their own logic and their use of other functions, making it essential for understanding overall system performance.
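
The two orderings can be compared with the standard library's cProfile and pstats modules. This is a minimal sketch: `run_pipeline` and `helper` are hypothetical stand-ins for the real pipeline entry point, and only the sort keys ("tottime" for internal time, "cumulative" for cumulative time) matter here.

```python
import cProfile
import io
import pstats


def helper(n: int) -> int:
    # Hypothetical leaf function: cheap per call, but called many times.
    return sum(i * i for i in range(n))


def run_pipeline() -> int:
    # Hypothetical stand-in for the real pipeline entry point.
    return sum(helper(200) for _ in range(500))


def profile_report(func, sort_key: str, limit: int = 10) -> str:
    """Profile func() and return a text report sorted by sort_key."""
    profiler = cProfile.Profile()
    profiler.enable()
    func()
    profiler.disable()
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.sort_stats(sort_key).print_stats(limit)
    return stream.getvalue()


# "tottime" orders by time spent in a function's own code (internal time).
internal_report = profile_report(run_pipeline, "tottime")
# "cumulative" orders by time spent in a function plus everything it calls.
cumulative_report = profile_report(run_pipeline, "cumulative")
```

Printing `internal_report` and `cumulative_report` side by side shows which functions are slow on their own and which are slow because of what they call.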

Place analogues in chunks by number of atoms added

Currently, placement runs iteratively: all analogues of one base compound are placed before moving on to the next base compound. A more flexible approach would be to place X analogues for each base in chunks like:

  1. 0–3 atoms
  2. 4–6 atoms
  3. 7–10 atoms

This approach would allow the stopping of a placement campaign early or stopping in accordance with a strict deadline.
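
A possible ordering scheme is sketched below, assuming each analogue already carries a count of atoms added relative to its base compound. The function names and the `(base, analogue, atoms_added)` tuple layout are hypothetical, not part of Syndirella. Because the order is produced lazily by a generator, a campaign can stop after any chunk.

```python
from collections import defaultdict

# Chunk boundaries from the list above: (min, max) atoms added, inclusive.
CHUNKS = [(0, 3), (4, 6), (7, 10)]


def chunk_index(atoms_added):
    """Return the index of the chunk containing atoms_added, or None."""
    for i, (low, high) in enumerate(CHUNKS):
        if low <= atoms_added <= high:
            return i
    return None


def placement_order(analogues, per_base):
    """Yield (base, analogue) pairs chunk by chunk.

    analogues: iterable of (base, analogue, atoms_added) tuples.
    per_base:  maximum number of analogues (X) to place per base per chunk.
    """
    analogues = list(analogues)
    grouped = defaultdict(list)
    for base, analogue, atoms_added in analogues:
        idx = chunk_index(atoms_added)
        if idx is not None:  # analogues outside every chunk are skipped
            grouped[(idx, base)].append(analogue)
    bases = sorted({base for base, _, _ in analogues})
    for idx in range(len(CHUNKS)):
        for base in bases:
            for analogue in grouped[(idx, base)][:per_base]:
                yield base, analogue
```

Iterating over `placement_order(...)` and breaking out of the loop when a deadline passes leaves every base with its smallest elaborations already placed.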

If no output.csv but output folder, still handle successes

  • Add successful placements to the success dirs even if output.csv does not exist, by checking for minimised.json
  • Also add a second value to each success_dirs.json entry so the value is a tuple (success, total placed), where total placed is the number of dirs in output before moving to success
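
The two points above could be sketched as follows. Everything here is an assumption for illustration: the function names are hypothetical, each placement is assumed to have its own subdirectory in the output folder, and a placement is counted as successful when that subdirectory contains a file whose name ends in `minimised.json`.

```python
import json
from pathlib import Path


def collate_successes(output_dir: Path) -> tuple[int, int]:
    """Return (success, total_placed) for one output folder."""
    placement_dirs = [d for d in output_dir.iterdir() if d.is_dir()]
    # Total placed: number of dirs in output before anything is moved.
    total_placed = len(placement_dirs)
    # Success: a minimised.json file exists, regardless of output.csv.
    success = sum(1 for d in placement_dirs if any(d.glob("*minimised.json")))
    return success, total_placed


def update_success_dirs(json_path: Path, output_dir: Path) -> dict:
    """Store [success, total_placed] for output_dir in success_dirs.json."""
    record = json.loads(json_path.read_text()) if json_path.exists() else {}
    # JSON has no tuple type, so the pair is stored as a two-item list.
    record[str(output_dir)] = list(collate_successes(output_dir))
    json_path.write_text(json.dumps(record, indent=2))
    return record
```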

Template Selection

Template selection for placement is extremely important as some experimental fragment poses clash with templates from other fragment hits. I'm thinking there could be a hierarchy of templates to test with the goal of increasing base compound placement success and therefore elaborations of those bases.

Potential pipeline for a given base compound, template, parenthit1, and parenthit2:

  1. Assert that there is no clash with the template for both parent hits.
  2. If there is a clash, change template to the apo structure of the parent hit with the previous clash. Check for clash with both parent hits.
  3. If there is still a clash, raise a Warning, then proceed with energy minimisation of the template around the fragment with the less severe clash (determined by the distance of the closest ligand atom to a protein residue).
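
The three steps above can be sketched with plain coordinate lists. This is only an illustration under stated assumptions: the 2.0 Å clash cutoff, the function names, and the dictionary layout are all hypothetical, and a clash is reduced to "closest ligand atom to protein atom distance below the cutoff".

```python
import math
import warnings

CLASH_CUTOFF = 2.0  # Å; assumed threshold, not taken from the issue


def min_distance(ligand_xyz, protein_xyz):
    """Distance from the closest ligand atom to any protein atom."""
    return min(math.dist(a, b) for a in ligand_xyz for b in protein_xyz)


def clashes(ligand_xyz, protein_xyz, cutoff=CLASH_CUTOFF):
    return min_distance(ligand_xyz, protein_xyz) < cutoff


def select_template(template, apo_templates, hits):
    """Return (template_coords, hit_to_minimise_around_or_None).

    template:      protein coordinates of the initial template
    apo_templates: hit name -> protein coordinates of that hit's apo structure
    hits:          hit name -> ligand coordinates of that parent hit
    """
    # Step 1: no clash with either parent hit, keep the template.
    clashing = [name for name, xyz in hits.items() if clashes(xyz, template)]
    if not clashing:
        return template, None
    # Step 2: switch to the apo structure of the (first) clashing hit.
    candidate = apo_templates[clashing[0]]
    still_clashing = [n for n, xyz in hits.items() if clashes(xyz, candidate)]
    if not still_clashing:
        return candidate, None
    # Step 3: warn, then minimise around the hit with the less severe clash,
    # i.e. the one whose closest atom is furthest from the protein.
    warnings.warn(f"Clash remains with {still_clashing}; minimisation needed")
    least_severe = max(hits, key=lambda n: min_distance(hits[n], candidate))
    return candidate, least_severe
```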

Add to_hippo format for input to pipeline

Add an option to run_pipeline to accept input in to_hippo format, with the reactions separated out. It could also check whether intermediate products are missing and fill those in, but it must check that the final product built from the intermediates matches the input final compound.

Add more specific check for chirality in SMARTS

I've noticed that some reactant catalog data is not stereochemistry-specific. A stricter check should be added to make sure that the chirality of all queried reactants is correct.

Example of the problem for an epoxide + amine coupling:
Epoxide:
image
Amine:
image
Product:
image

But the reactants in the epoxide metadata look like this:
{'catalogName': 'enamine_bb', 'catalogId': 'EN300-44840', 'smiles': 'c1cc2c(cc1C1CO1)OCO2', 'link': 'https://www.enaminestore.com/catalog/EN300-44840', 'purchaseInfo': {'isBuildingBlock': True, 'isScreening': False, 'bbLeadTimeWeeks': 1.0, 'bbPriceRange': '< $100 / g'}}, {'catalogName': 'enamine_made', 'catalogId': 'BBV-49121744', 'smiles': 'c1cc2c(cc1[C@H]1CO1)OCO2', 'purchaseInfo': {'isBuildingBlock': True, 'isScreening': False, 'bbLeadTimeWeeks': 6.0, 'bbPriceRange': '$500-1k / g'}}

Clearly the first reactant and its catalog info should be removed since it does not define chirality information.
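
As a first pass, entries without any stereo label could be dropped with a crude string-level filter like the sketch below. The function names are hypothetical. It only tests whether a SMILES string defines tetrahedral stereo at all (an `@` label); it does not verify that the defined chirality matches the query, which would need a cheminformatics toolkit to compare stereocentres properly.

```python
def has_stereo_label(smiles: str) -> bool:
    # '@' / '@@' mark tetrahedral stereocentres in SMILES.
    # Crude: double-bond stereo ('/' and '\') is ignored.
    return "@" in smiles


def drop_stereo_ambiguous(entries, query_smiles):
    """Keep only catalog entries that define stereo when the query does."""
    if not has_stereo_label(query_smiles):
        return list(entries)  # nothing to enforce for an achiral query
    return [entry for entry in entries if has_stereo_label(entry["smiles"])]


# Catalog entries from the metadata above, trimmed to the relevant keys.
catalog_entries = [
    {"catalogId": "EN300-44840", "smiles": "c1cc2c(cc1C1CO1)OCO2"},
    {"catalogId": "BBV-49121744", "smiles": "c1cc2c(cc1[C@H]1CO1)OCO2"},
]
kept = drop_stereo_ambiguous(catalog_entries, "c1cc2c(cc1[C@H]1CO1)OCO2")
```

Here `kept` contains only the BBV-49121744 entry, since EN300-44840 carries no stereo label.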

Speed improvement for Fragmenstein placement and success collating

Placing with Fragmenstein is much slower than I expected. Right now it takes 24 hours to place ~70,000 compounds. With an average success rate of 43%, that yields ~30,000 successful compounds per day, which is far below the 1 million placements that I thought was possible.

There could be multiple areas of speed improvement:

  1. Placing without PyRosetta
  2. Better implementation of deleting extra output directories
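
For scale, the figures quoted above work out as follows. This is plain arithmetic on the numbers in this issue; the per-placement time is an aggregate wall-clock figure and says nothing about how many placements run in parallel.

```python
placed_per_day = 70_000          # ~70,000 compounds placed in 24 hours
success_rate = 0.43              # average success rate
target_placements = 1_000_000    # the hoped-for 1 million placements

# Successful placements per day at the current rate.
successes_per_day = placed_per_day * success_rate

# Aggregate wall-clock seconds spent per placement.
seconds_per_placement = 24 * 3600 / placed_per_day

# Days needed for 1 million placements at the current rate, which is also
# the speed-up factor needed to reach 1 million placements per day.
days_for_target = target_placements / placed_per_day
```

So the current rate is roughly 30,100 successes per day at about 1.2 s per placement, and about 14 days (or a ~14x speed-up) separates it from 1 million placements.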

Errors in reactions

@mwinokan has pointed out examples of synthesis route errors in this issue.

Things to check/Fix/Implement:

  • All deprotection SMARTS
  • Exact number of atom removal expected for deprotections (implemented with hippo.chem)
  • Not elaborating on protection groups (which should be fixed when the SMARTS are exact)
  • Check if reactant can be both reactants in given rxn --> flag

InChIKey uniqueness check after doing any reaction (EDIT: not a priority because I'm already checking for duplicates)

I also read that products from this command are not sanitised...

  • Add sanitisation step after getting products in SlipperSynthesizer

Place reactants instead of products

As of now, Fragmenstein places products in the pocket which are then fed into HIPPO. Those that cannot be minimized within the time constraint or have an RMSD over the set threshold are thrown out. This is the bottleneck of the pipeline, placing all the chosen reactant combinations.

What if I placed the reactants instead of the full products?

  • If the reactant clashes, then the product won't fit
  • Atom mapping from reactant to fragment should not be too difficult -> would have to determine which fragment the reactant corresponds to. Could do reactant to fragment labelling from base compound.

Make manual input more user friendly

Right now the manual route specification is very strict: tuples and lists must be written out with single quotation marks so they can be interpreted directly as those data types. I'd like to drop that dependency so input is easier and more intuitive.

Not like this:

reactants reaction_names
[('OB(O)c1cccc2cc[nH]c12', 'Ic1cccc(I)n1'), ('Ic1cccc(-c2cccc3cc[nH]c23)n1', 'CCCB(O)O')] ['Sp2-sp2_Suzuki_coupling', 'Sp3-sp2_Suzuki_coupling']
[('COC(=O)NCCB(O)O', 'Cn1nccc1I')] ['Sp3-sp2_Suzuki_coupling']
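
One possible alternative is a long format with one row per reaction step, so no Python literals need to be typed. The sketch below is hypothetical: the column names (`route`, `step`, `reactant_1`, `reactant_2`, `reaction_name`) and the `parse_routes` helper are assumptions for illustration, not an existing Syndirella input format. It rebuilds the same reactant tuples and reaction name lists as the rows above.

```python
import csv
import io
from collections import defaultdict

# Same routes as above, written as one row per step (hypothetical layout).
EXAMPLE = """route,step,reactant_1,reactant_2,reaction_name
1,1,OB(O)c1cccc2cc[nH]c12,Ic1cccc(I)n1,Sp2-sp2_Suzuki_coupling
1,2,Ic1cccc(-c2cccc3cc[nH]c23)n1,CCCB(O)O,Sp3-sp2_Suzuki_coupling
2,1,COC(=O)NCCB(O)O,Cn1nccc1I,Sp3-sp2_Suzuki_coupling
"""


def parse_routes(text):
    """Group step rows by route and rebuild the tuple/list structure."""
    steps_by_route = defaultdict(list)
    for row in csv.DictReader(io.StringIO(text)):
        # An empty reactant_2 cell means a single-reactant step.
        reactants = tuple(r for r in (row["reactant_1"], row["reactant_2"]) if r)
        steps_by_route[row["route"]].append(
            (int(row["step"]), reactants, row["reaction_name"])
        )
    routes = {}
    for route, steps in steps_by_route.items():
        steps.sort(key=lambda step: step[0])  # order by step number
        routes[route] = {
            "reactants": [reactants for _, reactants, _ in steps],
            "reaction_names": [name for _, _, name in steps],
        }
    return routes


routes = parse_routes(EXAMPLE)
```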

SMARTS-errors to fix

There are a number of elaborations via pipeline that are ending in errors due to SMARTS handling or Reaction handling. I'll list some examples I've seen here to fix:

2024-08-08 16:31:24,680 - syndirella.Cobbler.Cobbler - INFO - Final product: CCC(=O)c1ccc2c(c1)NC(=O)CO2
2024-08-08 16:31:24,684 - syndirella.cobblers_workshop.CobblersWorkshop.CobblersWorkshop - INFO - Step 1 in this route using Amidation
2024-08-08 16:31:24,686 - syndirella.SMARTSHandler.SMARTSHandler - ERROR - Reactant could not be matched to only SMARTS in reaction.
2024-08-08 16:31:24,695 - syndirella.cobblers_workshop.CobblersWorkshop.CobblersWorkshop - ERROR - An error occurred in the route elaboration: Traceback (most recent call last):
  File "/opt/xchem-fragalysis-2/kfieseler/syndirella/syndirella/cobblers_workshop/CobblersWorkshop.py", line 66, in get_final_library
    current_library = cobbler_bench.find_analogues()
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/xchem-fragalysis-2/kfieseler/syndirella/syndirella/cobblers_workshop/CobblerBench.py", line 77, in find_analogues
    self.reaction.find_reaction_atoms_for_all_reactants()
  File "/opt/xchem-fragalysis-2/kfieseler/syndirella/syndirella/cobblers_workshop/Reaction.py", line 188, in find_reaction_atoms_for_all_reactants
    if len(self.matched_smarts_to_reactant) == 0:
       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: object of type 'NoneType' has no len()
2024-08-10 17:03:25,256 - syndirella.Cobbler.Cobbler - INFO - Final product: CC(=O)c1ccc2c(c1)NC(=O)CO2
2024-08-10 17:03:25,700 - syndirella.cobblers_workshop.CobblersWorkshop.CobblersWorkshop - INFO - Step 1 in this route using Williamson_ether_synthesis
2024-08-10 17:03:25,996 - syndirella.cobblers_workshop.CobblersWorkshop.CobblersWorkshop - ERROR - An error occurred in the route elaboration: Traceback (most recent call last):
  File "/opt/xchem-fragalysis-2/kfieseler/syndirella/syndirella/cobblers_workshop/CobblersWorkshop.py", line 66, in get_final_library
    current_library = cobbler_bench.find_analogues()
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/xchem-fragalysis-2/kfieseler/syndirella/syndirella/cobblers_workshop/CobblerBench.py", line 75, in find_analogues
    self.reaction.find_attachment_ids_for_all_reactants()
  File "/opt/xchem-fragalysis-2/kfieseler/syndirella/syndirella/cobblers_workshop/Reaction.py", line 175, in find_attachment_ids_for_all_reactants
    raise ReactionError("No attachment points found for reaction {}".format(self.reaction_name))
syndirella.error.ReactionError: No attachment points found for reaction Williamson_ether_synthesis
2024-08-13 15:17:54,391 - syndirella.Cobbler.Cobbler - INFO - Final product: CC(C)(C)OC1CC(NC(=O)N2CCCC2CS(N)(=O)=O)C1
2024-08-13 15:17:54,395 - syndirella.cobblers_workshop.CobblersWorkshop.CobblersWorkshop - INFO - Step 1 in this route using Formation_of_urea_from_two_amines
2024-08-13 15:17:54,396 - syndirella.SMARTSHandler.SMARTSHandler - ERROR - The reactants are the same in reaction Formation_of_urea_from_two_amines in mol NS(=O)(=O)CC1CCCN1 and CC(C)(C)OC1CC(N)C1.
2024-08-13 15:17:54,406 - syndirella.cobblers_workshop.CobblersWorkshop.CobblersWorkshop - ERROR - An error occurred in the route elaboration: Traceback (most recent call last):
  File "/opt/xchem-fragalysis-2/kfieseler/syndirella/syndirella/cobblers_workshop/CobblersWorkshop.py", line 66, in get_final_library
    current_library = cobbler_bench.find_analogues()
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/xchem-fragalysis-2/kfieseler/syndirella/syndirella/cobblers_workshop/CobblerBench.py", line 77, in find_analogues
    self.reaction.find_reaction_atoms_for_all_reactants()
  File "/opt/xchem-fragalysis-2/kfieseler/syndirella/syndirella/cobblers_workshop/Reaction.py", line 188, in find_reaction_atoms_for_all_reactants
    if len(self.matched_smarts_to_reactant) == 0:
       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: object of type 'NoneType' has no len()
2024-08-10 20:27:23,823 - syndirella.Cobbler.Cobbler - INFO - Final product: CC(=O)Nc1nn(C)c(NC(C)=O)c1CC(=O)NC(CNC(=O)CCl)c1cccnc1
2024-08-10 20:27:23,827 - syndirella.cobblers_workshop.CobblersWorkshop.CobblersWorkshop - INFO - Step 1 in this route using Amidation
2024-08-10 20:27:23,829 - syndirella.SMARTSHandler.SMARTSHandler - ERROR - The reactants are the same in reaction Amidation in mol Cn1nc(Br)c(CC(=O)O)c1N and CC(=O)O.
2024-08-10 20:27:23,831 - syndirella.cobblers_workshop.CobblersWorkshop.CobblersWorkshop - ERROR - An error occurred in the route elaboration: Traceback (most recent call last):
  File "/opt/xchem-fragalysis-2/kfieseler/syndirella/syndirella/cobblers_workshop/CobblersWorkshop.py", line 66, in get_final_library
    current_library = cobbler_bench.find_analogues()
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/xchem-fragalysis-2/kfieseler/syndirella/syndirella/cobblers_workshop/CobblerBench.py", line 77, in find_analogues
    self.reaction.find_reaction_atoms_for_all_reactants()
  File "/opt/xchem-fragalysis-2/kfieseler/syndirella/syndirella/cobblers_workshop/Reaction.py", line 188, in find_reaction_atoms_for_all_reactants
    if len(self.matched_smarts_to_reactant) == 0:
       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: object of type 'NoneType' has no len()
2024-08-08 04:29:54,271 - syndirella.Cobbler.Cobbler - INFO - Final product: CC(=O)Nc1cc(CC(=O)NC(CNC(=O)CN)c2ccc3ccccc3c2)cnn1
2024-08-08 04:29:54,276 - syndirella.cobblers_workshop.CobblersWorkshop.CobblersWorkshop - INFO - Step 1 in this route using N-nucleophilic_aromatic_substitution
2024-08-08 04:29:54,278 - syndirella.SMARTSHandler.SMARTSHandler - ERROR - The reactants do not match the reaction SMARTS in reaction N-nucleophilic_aromatic_substitution in mol O=C(O)Cc1cnnc(Cl)c1 and N.
2024-08-08 04:29:54,285 - syndirella.cobblers_workshop.CobblersWorkshop.CobblersWorkshop - ERROR - An error occurred in the route elaboration: Traceback (most recent call last):
  File "/opt/xchem-fragalysis-2/kfieseler/syndirella/syndirella/cobblers_workshop/CobblersWorkshop.py", line 66, in get_final_library
    current_library = cobbler_bench.find_analogues()
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/xchem-fragalysis-2/kfieseler/syndirella/syndirella/cobblers_workshop/CobblerBench.py", line 77, in find_analogues
    self.reaction.find_reaction_atoms_for_all_reactants()
  File "/opt/xchem-fragalysis-2/kfieseler/syndirella/syndirella/cobblers_workshop/Reaction.py", line 188, in find_reaction_atoms_for_all_reactants
    if len(self.matched_smarts_to_reactant) == 0:
       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: object of type 'NoneType' has no len()
2024-08-08 06:31:43,975 - syndirella.Cobbler.Cobbler - INFO - Running retrosynthesis analysis for CC(=O)Nc1cncc(CC(=O)N[C@@H](CNC(=O)CCl)c2ccc3ccccc3c2)c1 |a:13|...
2024-08-08 06:31:46,230 - syndirella.Fairy.Fairy - INFO - The first route found is 3 steps. The forward synthesis is: ['Amide_Schotten-Baumann_with_amine', 'Ester_amidation', 'Amide_Schotten-Baumann_with_amine']
2024-08-08 06:31:46,230 - syndirella.Fairy.Fairy - INFO - Additional reaction for 'Amide_Schotten-Baumann_with_amine' found in fairy filters. Getting additional routes containing 'Amidation'...
2024-08-08 06:31:46,230 - syndirella.Fairy.Fairy - INFO - Additional reaction for 'Amide_Schotten-Baumann_with_amine' found in fairy filters. Getting additional routes containing 'Amidation'...
2024-08-08 06:31:46,230 - syndirella.Cobbler.Cobbler - INFO - Syndirella 👑 will elaborate the following route:
2024-08-08 06:31:46,230 - syndirella.Cobbler.Cobbler - INFO - 
 Step 1: 
 ['CCOC(=O)Cc1cncc(N)c1', 'CC(=O)Cl'] -> Amide_Schotten-Baumann_with_amine
2024-08-08 06:31:46,230 - syndirella.Cobbler.Cobbler - INFO - 
 Step 2: 
 ['NCC(N)c1ccc2ccccc2c1', 'CCOC(=O)Cc1cncc(NC(C)=O)c1'] -> Ester_amidation
2024-08-08 06:31:46,230 - syndirella.Cobbler.Cobbler - INFO - 
 Step 3: 
 ['O=C(Cl)CCl', 'CC(=O)Nc1cncc(CC(=O)NC(CN)c2ccc3ccccc3c2)c1'] -> Amide_Schotten-Baumann_with_amine
2024-08-08 06:31:46,230 - syndirella.Cobbler.Cobbler - INFO - Final product: CC(=O)Nc1cncc(CC(=O)NC(CNC(=O)CCl)c2ccc3ccccc3c2)c1
2024-08-08 06:31:46,235 - syndirella.Cobbler.Cobbler - INFO - Syndirella 👑 will elaborate the following route:
2024-08-08 06:31:46,235 - syndirella.Cobbler.Cobbler - INFO - 
 Step 1: 
 ['CCOC(=O)Cc1cncc(N)c1', 'CC(=O)Cl'] -> Amide_Schotten-Baumann_with_amine
2024-08-08 06:31:46,235 - syndirella.Cobbler.Cobbler - INFO - 
 Step 2: 
 ['NCC(N)c1ccc2ccccc2c1', 'CCOC(=O)Cc1cncc(NC(C)=O)c1'] -> Ester_amidation
2024-08-08 06:31:46,235 - syndirella.Cobbler.Cobbler - INFO - 
 Step 3: 
 ['O=C(O)CCl', 'CC(=O)Nc1cncc(CC(=O)NC(CN)c2ccc3ccccc3c2)c1'] -> Amidation
2024-08-08 06:36:22,819 - syndirella.slipper.slipper_synthesizer.SlipperSynthesizer.SlipperSynthesizer - INFO - Filtering analogues of reactants on SMARTS...
2024-08-08 06:36:22,821 - syndirella.slipper.slipper_synthesizer.SlipperSynthesizer.SlipperSynthesizer - INFO - Filtered 802 rows (100.0%) from r1 dataframe.
2024-08-08 06:36:22,821 - syndirella.slipper.slipper_synthesizer.SlipperSynthesizer.SlipperSynthesizer - INFO - Ordering analogues of r1 before finding products...
2024-08-08 06:36:22,828 - syndirella.cobblers_workshop.CobblersWorkshop.CobblersWorkshop - ERROR - An error occurred in the route elaboration: Traceback (most recent call last):
  File "/opt/xchem-fragalysis-2/kfieseler/syndirella/syndirella/cobblers_workshop/CobblersWorkshop.py", line 70, in get_final_library
    slipper.get_products()
  File "/opt/xchem-fragalysis-2/kfieseler/syndirella/syndirella/slipper/Slipper.py", line 59, in get_products
    self.products: pd.DataFrame = slipper_synth.get_products()
                                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/xchem-fragalysis-2/kfieseler/syndirella/syndirella/slipper/slipper_synthesizer/SlipperSynthesizer.py", line 60, in get_products
    self.filter_analogues()
  File "/opt/xchem-fragalysis-2/kfieseler/syndirella/syndirella/slipper/slipper_synthesizer/SlipperSynthesizer.py", line 120, in filter_analogues
    df = self.order_analogues(df, reactant_prefix)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/xchem-fragalysis-2/kfieseler/syndirella/syndirella/slipper/slipper_synthesizer/SlipperSynthesizer.py", line 132, in order_analogues
    base_reactant = df[f"{reactant_prefix}_mol"].iloc[0]
                    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^
  File "/opt/xchem-fragalysis-2/kfieseler/conda/envs/syndirella/lib/python3.12/site-packages/pandas/core/indexing.py", line 1191, in __getitem__
    return self._getitem_axis(maybe_callable, axis=axis)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/xchem-fragalysis-2/kfieseler/conda/envs/syndirella/lib/python3.12/site-packages/pandas/core/indexing.py", line 1752, in _getitem_axis
    self._validate_integer(key, axis)
  File "/opt/xchem-fragalysis-2/kfieseler/conda/envs/syndirella/lib/python3.12/site-packages/pandas/core/indexing.py", line 1685, in _validate_integer
    raise IndexError("single positional indexer is out-of-bounds")
IndexError: single positional indexer is out-of-bounds
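
Two of the failure modes in these tracebacks are unguarded values: `len()` called on a `None` match result, and `.iloc[0]` called after every analogue was filtered out. A minimal sketch of explicit guards follows. The helper names are hypothetical, `ReactionError` is a local stand-in for `syndirella.error.ReactionError`, and plain lists and dicts stand in for the real attributes and DataFrames.

```python
class ReactionError(Exception):
    """Local stand-in for syndirella.error.ReactionError."""


def require_smarts_matches(matched, reaction_name):
    # `matched` plays the role of self.matched_smarts_to_reactant.
    # `not matched` is true for both None and an empty container, so the
    # TypeError from len(None) becomes a descriptive ReactionError.
    if not matched:
        raise ReactionError(
            f"No reactants matched the SMARTS for reaction {reaction_name}"
        )
    return matched


def first_analogue(rows, reactant_prefix):
    # Counterpart of df[f"{reactant_prefix}_mol"].iloc[0], which raises
    # IndexError when the SMARTS filter has removed 100% of the rows.
    if not rows:
        raise ReactionError(
            f"All {reactant_prefix} analogues were filtered out; "
            "no base reactant left to order against"
        )
    return rows[0]
```

Raising one domain-specific error in both places would let the route-level handler log a clear reason instead of a bare TypeError or IndexError.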

Error with structuring to_hippo output

For base compound: UASUHFYDLOTSPQ-UHFFFAOYSA-N I can't get the to_hippo.pkl.gz... Luckily only 72 elaborations

Traceback (most recent call last):
  File "/data/xchem-fragalysis/kfieseler/syndirella/syndirella/pipeline.py", line 112, in _elaborate_from_cobbler_workshops
    slipper.write_products_to_hippo(uuid=uuid) # only write at the end after placement, to get correct uuid
  File "/data/xchem-fragalysis/kfieseler/syndirella/syndirella/slipper/Slipper.py", line 104, in write_products_to_hippo
    hippo_df = self._structure_products_for_hippo(placements_df=placements,
  File "/data/xchem-fragalysis/kfieseler/syndirella/syndirella/slipper/Slipper.py", line 138, in _structure_products_for_hippo
    hippo_df = self._put_hippo_dfs_together(hippo_dfs)
  File "/data/xchem-fragalysis/kfieseler/syndirella/syndirella/slipper/Slipper.py", line 192, in _put_hippo_dfs_together
    hippo_df_step_last[f'{step}_product_name'] = hippo_df_step_last.apply(self.find_matches,
  File "/data/xchem-fragalysis/kfieseler/conda/envs/fragmenstein/lib/python3.9/site-packages/pandas/core/frame.py", line 10037, in apply
    return op.apply().__finalize__(self, method="apply")
  File "/data/xchem-fragalysis/kfieseler/conda/envs/fragmenstein/lib/python3.9/site-packages/pandas/core/apply.py", line 837, in apply
    return self.apply_standard()
  File "/data/xchem-fragalysis/kfieseler/conda/envs/fragmenstein/lib/python3.9/site-packages/pandas/core/apply.py", line 963, in apply_standard
    results, res_index = self.apply_series_generator()
  File "/data/xchem-fragalysis/kfieseler/conda/envs/fragmenstein/lib/python3.9/site-packages/pandas/core/apply.py", line 979, in apply_series_generator
    results[i] = self.func(v, *self.args, **self.kwargs)
  File "/data/xchem-fragalysis/kfieseler/syndirella/syndirella/slipper/Slipper.py", line 173, in find_matches
    similarity = self.calculate_inchi_similarity(row[f'{step + 1}_r{row[f"{step + 1}_r_previous_product"]}_smiles'],
  File "/data/xchem-fragalysis/kfieseler/conda/envs/fragmenstein/lib/python3.9/site-packages/pandas/core/series.py", line 1040, in __getitem__
    return self._get_value(key)
  File "/data/xchem-fragalysis/kfieseler/conda/envs/fragmenstein/lib/python3.9/site-packages/pandas/core/series.py", line 1156, in _get_value
    loc = self.index.get_loc(label)
  File "/data/xchem-fragalysis/kfieseler/conda/envs/fragmenstein/lib/python3.9/site-packages/pandas/core/indexes/base.py", line 3797, in get_loc
    raise KeyError(key) from err
KeyError: '3_rNone_smiles'
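
The key `'3_rNone_smiles'` shows that the value read from `3_r_previous_product` was `None` when the column name was built. A small guard could surface that directly. This is a sketch: the helper name is hypothetical, and a plain dict stands in for the pandas row.

```python
def previous_product_smiles(row, step):
    """Look up the SMILES of the reactant that came from the previous step.

    row:  mapping of column name -> value for one product
    step: current step number, so the lookup targets step + 1
    """
    next_step = step + 1
    which_reactant = row.get(f"{next_step}_r_previous_product")
    if which_reactant is None:
        # Without this check the f-string below becomes e.g. '3_rNone_smiles'.
        raise ValueError(
            f"Step {next_step} does not record which reactant is the "
            "previous product"
        )
    return row[f"{next_step}_r{which_reactant}_smiles"]
```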

Better handling for unsuccessful base minimisation

Sometimes the base compound does not successfully minimise, I need to handle these cases better and output errors. Need to add the following implementation:

  1. Place the base compound first. If it does not minimise, re-place it up to 4 more times until it does. Record a flag if it had to be re-placed, along with the errors that were output.
  2. After a successful minimisation, run intramolecular validity checks similar to PoseBusters. If it does not pass, re-place up to 4 more times.

This gives up to 5 attempts to place the base compound and run the validity checks, since it is pivotal to confirm the base compound is valid before elaborating from it.
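
The retry logic could look like the sketch below, assuming a single shared budget of 5 attempts. The `place` and `passes_checks` callables are hypothetical placeholders for the real placement and validity-check functions, with `place` assumed to return `None` when minimisation fails.

```python
MAX_ATTEMPTS = 5  # 1 initial placement + up to 4 re-placements


def place_base(place, passes_checks, max_attempts=MAX_ATTEMPTS):
    """Place the base compound, retrying on failed minimisation or checks.

    place:         callable returning a pose, or None if minimisation failed
    passes_checks: callable taking a pose and returning True if it is valid
    """
    errors = []
    for attempt in range(1, max_attempts + 1):
        try:
            pose = place()
        except Exception as exc:  # record the error and try again
            errors.append(f"attempt {attempt}: {exc}")
            continue
        if pose is None:
            errors.append(f"attempt {attempt}: did not minimise")
            continue
        if not passes_checks(pose):
            errors.append(f"attempt {attempt}: failed validity checks")
            continue
        return {"pose": pose, "attempts": attempt,
                "re_placed": attempt > 1, "errors": errors}
    # Every attempt failed: report that, with the collected errors.
    return {"pose": None, "attempts": max_attempts,
            "re_placed": max_attempts > 1, "errors": errors}
```

The returned `re_placed` flag and `errors` list correspond to the record-keeping described in step 1.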

Specific exit vector expansion functionality

The pocket size and shape are not considered at all for elaborations. Syndirella could direct elaborations off identified points on the compound that point towards free space in the pocket. If the expansion point is an inaccessible atom that cannot be synthetically elaborated, that atom could be exchanged for an 'elaborateable' atom.

There are definitely tools out there that already do this, and Steph has done this for the Fragment Network. The question is whether this is worth the time to do.

Edge cases for base compound placement

There are cases where the base compound is not included as a successful placement. This could be due to:

  1. It does not have a successful energy minimization.
  2. It does not have a negative ΔΔG value.

I need to add functionality for adding the original conformation of the base compound (as determined by another method) to the success directory.

ChemicalParserException not handled as error

2024-09-04 10:30:30,474 - syndirella.route.CobblersWorkshop - INFO -

        Syndirella 👑 will elaborate the following route for COc1c(Nc2ccc(O)cc2)cnc2c1NCCC2 | OIDLUPDKBQACIL-UHFFFAOYSA-N:
        Route UUID: ixiapz
        Reaction Names: ['Sp3-sp2_Suzuki_coupling', 'N-nucleophilic_aromatic_substitution', 'N-nucleophilic_aromatic_substitution']
        Number of Steps: 3

2024-09-04 10:30:30,474 - syndirella.route.CobblersWorkshop - INFO - Step 1 in this route using Sp3-sp2_Suzuki_coupling
2024-09-04 10:30:30,474 - syndirella.route.Library - INFO - Looking for analogue library .pkl.gz if already created...
2024-09-04 10:30:30,502 - syndirella.fairy - INFO - Reaction name 'Sp3-sp2_Suzuki_coupling' found in reactant filters. Getting cheaper reactants...
2024-09-04 10:30:30,503 - syndirella.fairy - INFO - Performing 'add_bromine'...
2024-09-04 10:30:30,503 - syndirella.Postera - INFO - Running superstructure search for COc1c(Cl)cnc(Cl)c1Cl. Only searching for building blocks.
2024-09-04 10:31:21,540 - syndirella.Postera - INFO - Found 3 hits for COc1c(Cl)cnc(Cl)c1Cl before filtering.
2024-09-04 10:31:21,541 - syndirella.Postera - INFO - Running superstructure search for COc1c(Br)cnc(Br)c1Br. Only searching for building blocks.
2024-09-04 10:32:12,447 - syndirella.Postera - INFO - Found 1 hits for COc1c(Br)cnc(Br)c1Br before filtering.
2024-09-04 10:32:12,449 - syndirella.fairy - INFO - Found 4 before filtering.
2024-09-04 10:32:12,450 - syndirella.fairy - INFO - Removing repeat analogues...
2024-09-04 10:32:12,450 - syndirella.fairy - INFO - Removed 1 molecules (25.0%) by simple filters.
2024-09-04 10:32:12,451 - syndirella.fairy - INFO - Removing chirality from analogues...
2024-09-04 10:32:12,452 - syndirella.fairy - INFO - Removing repeat analogues...
2024-09-04 10:32:12,452 - syndirella.route.Library - INFO - Removed 0 invalid or repeated molecules (0.0%) of r1 analogues.
2024-09-04 10:32:12,452 - syndirella.route.Library - INFO - Checking if analogues contain SMARTS pattern of original reactant...
2024-09-04 10:32:12,452 - syndirella.route.Library - INFO - Checking if analogues contain SMARTS pattern of other reactant...
2024-09-04 10:32:12,454 - syndirella.route.Library - INFO - Saving r1 analogue library to /opt/xchem-fragalysis-2/kfieseler/CHIKV-Mac//OIDLUPDKBQACIL-UHFFFAOYSA-N/extra/OIDLUPDKBQACIL-UHFFFAOYSA-N_ixiapz_Sp3-sp2_Suzuki_coupling_r1_1of3.pkl.gz

2024-09-04 10:32:12,472 - syndirella.route.Library - INFO - Looking for analogue library .pkl.gz if already created...
2024-09-04 10:32:12,473 - syndirella.fairy - INFO - Reaction name 'Sp3-sp2_Suzuki_coupling' found in reactant filters. Getting cheaper reactants...
2024-09-04 10:32:12,473 - syndirella.fairy - INFO - Performing 'add_boronate_ester'...
2024-09-04 10:32:12,473 - syndirella.Postera - INFO - Running superstructure search for NCCCB(O)O. Only searching for building blocks.
2024-09-04 10:32:13,029 - syndirella.Postera - WARNING - Rate limit exceeded. Waiting for 0.5 seconds before retrying...

2024-09-04 10:32:14,063 - syndirella.Postera - WARNING - Rate limit exceeded. Waiting for 1.0 seconds before retrying...
2024-09-04 10:32:15,595 - syndirella.Postera - WARNING - Rate limit exceeded. Waiting for 2.0 seconds before retrying...
2024-09-04 10:32:18,171 - syndirella.Postera - WARNING - Rate limit exceeded. Waiting for 4.0 seconds before retrying...
2024-09-04 10:32:22,707 - syndirella.Postera - WARNING - Rate limit exceeded. Waiting for 8.0 seconds before retrying...
2024-09-04 10:38:42,946 - syndirella.Postera - INFO - Found 331 hits for NCCCB(O)O before filtering.
2024-09-04 10:38:42,947 - syndirella.Postera - INFO - Running superstructure search for CC1(C)OB(CCCN)OC1(C)C. Only searching for building blocks.
2024-09-04 10:44:50,011 - syndirella.Postera - INFO - Found 296 hits for CC1(C)OB(CCCN)OC1(C)C before filtering.
2024-09-04 10:44:50,099 - syndirella.fairy - INFO - Found 627 before filtering.
2024-09-04 10:44:50,141 - syndirella.fairy - INFO - Removing repeat analogues...
2024-09-04 10:44:50,550 - syndirella.fairy - INFO - Removed 392 molecules (62.52%) by simple filters.
2024-09-04 10:44:50,877 - syndirella.fairy - INFO - Removing chirality from analogues...
2024-09-04 10:44:50,890 - syndirella.fairy - INFO - Removing repeat analogues...
2024-09-04 10:44:51,018 - syndirella.route.Library - INFO - Removed 0 invalid or repeated molecules (0.0%) of r2 analogues.
2024-09-04 10:44:51,018 - syndirella.route.Library - INFO - Checking if analogues contain SMARTS pattern of original reactant...
2024-09-04 10:44:51,022 - syndirella.route.Library - INFO - Checking if analogues contain SMARTS pattern of other reactant...
2024-09-04 10:44:51,046 - syndirella.route.Library - INFO - Saving r2 analogue library to /opt/xchem-fragalysis-2/kfieseler/CHIKV-Mac//OIDLUPDKBQACIL-UHFFFAOYSA-N/extra/OIDLUPDKBQACIL-UHFFFAOYSA-N_ixiapz_Sp3-sp2_Suzuki_coupling_r2_1of3.pkl.gz

2024-09-04 10:44:51,120 - syndirella.slipper.slipper_synthesizer.SlipperSynthesizer - INFO - Filtering analogues of reactants on SMARTS...
2024-09-04 10:44:51,122 - syndirella.slipper.slipper_synthesizer.SlipperSynthesizer - INFO - Filtered 0 rows (0.0%) from r1 dataframe.
2024-09-04 10:44:51,123 - syndirella.slipper.slipper_synthesizer.SlipperSynthesizer - INFO - Ordering analogues of r1 before finding products...
2024-09-04 10:44:51,125 - syndirella.slipper.slipper_synthesizer.SlipperSynthesizer - INFO - Filtering analogues of reactants on SMARTS...
2024-09-04 10:44:51,126 - syndirella.slipper.slipper_synthesizer.SlipperSynthesizer - INFO - Filtered 0 rows (0.0%) from r2 dataframe.
2024-09-04 10:44:51,126 - syndirella.slipper.slipper_synthesizer.SlipperSynthesizer - INFO - Ordering analogues of r2 before finding products...
2024-09-04 10:44:52,743 - syndirella.slipper.slipper_synthesizer.SlipperSynthesizer - INFO - Adding metadata to products...
2024-09-04 10:45:02,806 - syndirella.slipper.slipper_synthesizer.SlipperSynthesizer - INFO - Since these products are not the final products they will be saved in the /extra folder.

2024-09-04 10:45:02,807 - syndirella.slipper.slipper_synthesizer.SlipperSynthesizer - INFO - Saving products to /opt/xchem-fragalysis-2/kfieseler/CHIKV-Mac//OIDLUPDKBQACIL-UHFFFAOYSA-N/extra/OIDLUPDKBQACIL-UHFFFAOYSA-N_ixiapz_Sp3-sp2_Suzuki_coupling_products_1of3.pkl.gz

2024-09-04 10:45:02,917 - syndirella.route.CobblersWorkshop - INFO - Step 2 in this route using N-nucleophilic_aromatic_substitution
2024-09-04 10:45:02,917 - syndirella.route.Library - INFO - Since this is an internal or final step looking for the products .pkl from previous step...
2024-09-04 10:45:02,937 - syndirella.route.Library - INFO - Found /opt/xchem-fragalysis-2/kfieseler/CHIKV-Mac//OIDLUPDKBQACIL-UHFFFAOYSA-N/extra/OIDLUPDKBQACIL-UHFFFAOYSA-N_ixiapz_Sp3-sp2_Suzuki_coupling_products_1of3.pkl.gz as the products .pkl from previous step.
2024-09-04 10:45:03,213 - syndirella.fairy - INFO - Removing chirality from analogues...
2024-09-04 10:45:03,318 - syndirella.fairy - INFO - Removing repeat analogues...
2024-09-04 10:45:09,493 - syndirella.route.Library - INFO - Removed 0 invalid or repeated molecules (0.0%) of r1 analogues.
2024-09-04 10:45:09,493 - syndirella.route.Library - INFO - Checking if analogues contain SMARTS pattern of original reactant...
2024-09-04 10:45:09,696 - syndirella.route.Library - INFO - Saving r1 analogue library to /opt/xchem-fragalysis-2/kfieseler/CHIKV-Mac//OIDLUPDKBQACIL-UHFFFAOYSA-N/extra/OIDLUPDKBQACIL-UHFFFAOYSA-N_ixiapz_N-nucleophilic_aromatic_substitution_r1_2of3.pkl.gz

2024-09-04 10:45:10,037 - syndirella.slipper.slipper_synthesizer.SlipperSynthesizer - INFO - Filtering analogues of reactants on SMARTS...
2024-09-04 10:45:10,040 - syndirella.slipper.slipper_synthesizer.SlipperSynthesizer - INFO - Filtered 984 rows (59.13%) from r1 dataframe.
2024-09-04 10:45:10,040 - syndirella.slipper.slipper_synthesizer.SlipperSynthesizer - INFO - Ordering analogues of r1 before finding products...

2024-09-04 10:45:10,068 - syndirella.structure_outputs - ERROR - Could not structure pipeline outputs.
2024-09-04 10:45:10,071 - syndirella.structure_outputs - ERROR - Traceback (most recent call last):
  File "/opt/xchem-fragalysis-2/kfieseler/syndirella/syndirella/pipeline.py", line 64, in elaborate_from_cobbler_workshops
    final_library = workshop.get_final_library()
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/xchem-fragalysis-2/kfieseler/syndirella/syndirella/route/CobblersWorkshop.py", line 217, in get_final_library
    slipper.get_products()
  File "/opt/xchem-fragalysis-2/kfieseler/syndirella/syndirella/slipper/Slipper.py", line 67, in get_products
    self.products: pd.DataFrame = slipper_synth.get_products()
                                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/xchem-fragalysis-2/kfieseler/syndirella/syndirella/slipper/slipper_synthesizer/SlipperSynthesizer.py", line 66, in get_products
    self.products = self.get_products_from_single_reactant()
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/xchem-fragalysis-2/kfieseler/syndirella/syndirella/slipper/slipper_synthesizer/SlipperSynthesizer.py", line 312, in get_products_from_single_reactant
    products: pd.DataFrame = reactant.apply(self.apply_reaction_single, axis=1)
                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/xchem-fragalysis-2/kfieseler/conda/envs/syndirella/lib/python3.12/site-packages/pandas/core/frame.py", line 10374, in apply
    return op.apply().__finalize__(self, method="apply")
           ^^^^^^^^^^
  File "/opt/xchem-fragalysis-2/kfieseler/conda/envs/syndirella/lib/python3.12/site-packages/pandas/core/apply.py", line 916, in apply
    return self.apply_standard()
           ^^^^^^^^^^^^^^^^^^^^^
  File "/opt/xchem-fragalysis-2/kfieseler/conda/envs/syndirella/lib/python3.12/site-packages/pandas/core/apply.py", line 1063, in apply_standard
    results, res_index = self.apply_series_generator()
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/xchem-fragalysis-2/kfieseler/conda/envs/syndirella/lib/python3.12/site-packages/pandas/core/apply.py", line 1081, in apply_series_generator
    results[i] = self.func(v, *self.args, **self.kwargs)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/xchem-fragalysis-2/kfieseler/syndirella/syndirella/slipper/slipper_synthesizer/SlipperSynthesizer.py", line 343, in apply_reaction_single
    products = reaction.RunReactants((r1,))
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ValueError: ChemicalParserException: Number of reactants provided does not match number of reactant templates.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/xchem-fragalysis-2/kfieseler/syndirella/syndirella/structure_outputs.py", line 320, in structure_pipeline_outputs
    output_df: pd.DataFrame = structure_route_outputs(error_message=error_message,
                              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/xchem-fragalysis-2/kfieseler/syndirella/syndirella/structure_outputs.py", line 289, in structure_route_outputs
    row.update(additional_info)
TypeError: 'NoneType' object is not iterable
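
The log above shows two separate failures: `RunReactants` was called with one reactant on a reaction whose template count differs, and the error handler then crashed because `additional_info` was `None` when passed to `row.update`. A minimal sketch of guards for both cases follows. The helper names `run_single_reactant` and `merge_additional_info` are hypothetical and not part of Syndirella. It assumes `reaction` is an RDKit `ChemicalReaction`, which exposes `GetNumReactantTemplates()` and `RunReactants()`.

```python
from typing import Any, Optional


def merge_additional_info(row: dict[str, Any],
                          additional_info: Optional[dict[str, Any]]) -> dict[str, Any]:
    """Return a copy of `row` updated with `additional_info`.

    `None` is treated as "nothing to add", which avoids
    `TypeError: 'NoneType' object is not iterable` from `dict.update(None)`.
    """
    merged = dict(row)
    merged.update(additional_info or {})
    return merged


def run_single_reactant(reaction: Any, r1: Any) -> Optional[tuple]:
    """Run a one-reactant reaction, or return None if it is not one-reactant.

    Checking the template count first avoids the ValueError that
    `RunReactants` raises when the number of reactants supplied does not
    match the number of reactant templates.
    """
    if reaction.GetNumReactantTemplates() != 1:
        return None
    return reaction.RunReactants((r1,))
```

Returning `None` for a mismatched reaction is one possible policy. Raising a descriptive error that names the reaction and its expected reactant count may be preferable if the mismatch signals a bug upstream in route handling.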

Make timeline

Need to make a timeline for Syndirella and A71 2A

Place using Wictor

Placing compounds with the Wictor class could be a significant speed improvement. After chatting with Matteo today, minimizing with PyRosetta might be unnecessary. The elaborated compounds are similar to the fragments, and when a fragment-bound structure is used, the placement template is the protein conformation in an already bound state. Residues should therefore not move significantly.

I will have to run tests to quantify the effect on speed and on interactions.
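
The speed half of that comparison could be measured with a small timing harness. This is only a sketch: `time_placements` is a hypothetical helper, and `place` stands in for whichever placement call is being tested (with or without PyRosetta minimization). Neither is an existing Syndirella or Fragmenstein function.

```python
import time
from statistics import mean
from typing import Callable, Iterable


def time_placements(place: Callable[[str], object], smiles: Iterable[str]) -> float:
    """Return the mean wall-clock seconds per compound for `place`.

    `place` is any callable that takes a SMILES string and performs one
    placement. At least one SMILES must be supplied, otherwise
    `statistics.mean` raises `StatisticsError`.
    """
    durations = []
    for smi in smiles:
        start = time.perf_counter()
        place(smi)
        durations.append(time.perf_counter() - start)
    return mean(durations)
```

Running the same set of SMILES through both placement methods and comparing the two means gives a first estimate of the speed difference. The interaction comparison needs a separate metric.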

Possible change: structuring route into synthons

Instead of getting routes from manifold retrosynthesis output, break compound into synthons. Could use https://github.com/Laboratoire-de-Chemoinformatique/Synt-On.

Pros:

  • Smaller starting reactants
  • Automatic labeling of every reaction a reactant can be used in, which makes it easier to extend to finding multiple routes
  • Not restricted to Manifold

Cons:

  • Implementation work required to get Synt-On working in the pipeline
  • More complex handling of diverse reactants
