zaixizhang / flag
Implementation of ICLR23 paper "Molecule Generation for Target Protein Binding with Structural Motifs"
Hi, author. I'm still having trouble getting the trainer running.
A closed issue mentions that some dataset files are missing, and they are still missing today.
I have tried everything I could think of and read through the code, but I still cannot find or generate those files, including './data/cross docked_pocket10/index.pt', './data/pdbbind_pocket10/*', and '/n/holyscratch01/mzitnik_lab/zaixizhang/pdbbind_pocket10/index.pt'.
Could you please shed some light on this?
Indexing: 1%|█ | 2200/166398 [00:20<25:42, 106.46it/s]
Traceback (most recent call last):
  File "train.py", line 63, in <module>
    dataset, subsets = get_dataset(config=config.dataset, transform=transform, )
  File "/data/zhouzihan/FLAG-main/utils/datasets/__init__.py", line 11, in get_dataset
    dataset = PocketLigandPairDataset(root, *args, **kwargs)
  File "/data/zhouzihan/FLAG-main/utils/datasets/pl.py", line 64, in __init__
    self._precompute_name2id()
  File "/data/zhouzihan/FLAG-main/utils/datasets/pl.py", line 132, in _precompute_name2id
    data = self.__getitem__(i)
  File "/data/zhouzihan/FLAG-main/utils/datasets/pl.py", line 153, in __getitem__
    data = self.transform(data)
  File "/home/ahmu/ENTER/envs/flag_env/lib/python3.8/site-packages/torch_geometric/transforms/compose.py", line 24, in __call__
    data = transform(data)
  File "/data/zhouzihan/FLAG-main/utils/transforms.py", line 471, in __call__
    bfs_perm, bfs_focal = self.get_bfs_perm_motif(data['moltree'], self.vocab)
  File "/data/zhouzihan/FLAG-main/utils/transforms.py", line 449, in get_bfs_perm_motif
    node.wid = vocab.get_index(node.smiles)
  File "/data/zhouzihan/FLAG-main/utils/mol_tree.py", line 24, in get_index
    return self.vmap[smiles]
KeyError: 'C1CC2CCC(O1)O2'
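The KeyError above is raised when a motif SMILES string is looked up in the vocabulary mapping and is not there. One way to see how widespread the problem is would be to collect every missing motif before indexing starts. The following is only a sketch under assumptions: `SimpleVocab` and `find_missing_motifs` are hypothetical names, not code from the FLAG repository, and the vocabulary is assumed to behave like a plain dict from SMILES to index, as `self.vmap[smiles]` in the traceback suggests.

```python
from typing import Iterable, List


class SimpleVocab:
    """Hypothetical stand-in for the vocabulary in utils/mol_tree.py."""

    def __init__(self, smiles_list: Iterable[str]):
        self.vocab = list(smiles_list)
        # Same idea as `vmap` in the traceback: SMILES string -> integer index.
        self.vmap = {s: i for i, s in enumerate(self.vocab)}

    def get_index(self, smiles: str) -> int:
        # Raises KeyError for a motif that is not in the vocabulary.
        return self.vmap[smiles]


def find_missing_motifs(vocab: SimpleVocab, motifs: Iterable[str]) -> List[str]:
    """Return motif SMILES not covered by the vocabulary, in first-seen order."""
    missing, seen = [], set()
    for smi in motifs:
        if smi not in vocab.vmap and smi not in seen:
            seen.add(smi)
            missing.append(smi)
    return missing


# Toy data: the second motif is the one from the KeyError above.
vocab = SimpleVocab(["C1CCCCC1", "CC", "CO"])
motifs = ["CC", "C1CC2CCC(O1)O2", "CO", "C1CC2CCC(O1)O2"]
missing = find_missing_motifs(vocab, motifs)
print(missing)
```

A non-empty result would suggest that the vocabulary file was built from different data, or with different preprocessing, than the dataset being indexed. Note that the comparison is on raw strings, so two differently written SMILES for the same fragment would also count as missing.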
Hello author, thank you very much for your great work. Would you be able to share your pre-trained model? sample.yml refers to the file "./pretrained/model.pt". If you could provide this trained model file, I would be very grateful. Thank you.
During sampling, whenever the UFF exception is triggered, an error is raised at this line (https://github.com/zaixizhang/FLAG/blob/main/motif_sample.py#L295) and the program exits:
ValueError: Bad Conformer Id
Could you explain this or suggest how to address it?
I ran "python train.py" but it seemed that the training job didn't finish normally. How can I resume the training process with a checkpoint file? Many thanks!
Hi, thank you for the nice work.
May I ask how you built the dataset files such as pdbbind_pocket10_xxx?
Would you please share your code?
The call "cands = enum_assemble(self, neighbors)" on line 91 of mol_tree.py in the utils folder fails because enum_assemble is not defined. I also could not find any import statement for it. Could you please let me know where I can find this function? Thank you.
We are trying to reproduce the results from the original FLAG paper. We have been able to tweak the training code to make it work, but we still run into some knotty issues during the sampling/generation stage. Following the original instructions from README.md, we start the sampling process by running the following command:
python motif_sample.py
The Python interpreter gives the following error:
Traceback (most recent call last):
  File "motif_sample.py", line 18, in <module>
    from models.maskfill import MaskFillModel
ModuleNotFoundError: No module named 'models.maskfill'
Then I realized that there is no such file as ./models/maskfill.py in the GitHub repo. I searched for the file and found one with the same name in the 3DSBDD repo (https://github.com/luost26/3D-Generative-SBDD/blob/main/models/maskfill.py). However, the arguments of the class's __init__() function do not match. In motif_sample.py:412:
model = MaskFillModel(
    ckpt['config'].model,
    protein_atom_feature_dim=protein_featurizer.feature_dim,
    ligand_atom_feature_dim=ligand_featurizer.feature_dim,
    vocab=vocab,
    weight=weight,
).to(args.device)
The vocab and weight arguments do not exist in the 3DSBDD version of maskfill.py. I assume the FLAG authors made substantial changes to maskfill.py but did not upload it to GitHub. I cannot continue reproducing the experiments until the FLAG version of maskfill.py is uploaded.
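The mismatch described above can be checked mechanically before trying to construct the model. The following is a minimal sketch, not FLAG code: `module_exists`, `unsupported_kwargs`, and `OtherRepoModel` are hypothetical names, and the feature dimensions are placeholder numbers. It uses only the standard library to test whether a module can be found and which keyword arguments a constructor would reject.

```python
import importlib.util
import inspect
from typing import Any, Dict, List


def module_exists(name: str) -> bool:
    """True if `name` can be located without importing it."""
    try:
        return importlib.util.find_spec(name) is not None
    except ModuleNotFoundError:
        # Raised when a parent package (e.g. `models`) is itself missing.
        return False


def unsupported_kwargs(cls: type, kwargs: Dict[str, Any]) -> List[str]:
    """List keyword names that `cls.__init__` would reject."""
    params = inspect.signature(cls.__init__).parameters
    # A **kwargs parameter accepts any keyword, so nothing can be rejected.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return []
    return [k for k in kwargs if k not in params]


class OtherRepoModel:
    """Hypothetical stand-in for a MaskFillModel taken from another repository."""

    def __init__(self, config, protein_atom_feature_dim, ligand_atom_feature_dim):
        self.config = config


call_kwargs = {
    "protein_atom_feature_dim": 27,  # placeholder values for illustration
    "ligand_atom_feature_dim": 13,
    "vocab": None,
    "weight": None,
}
print(module_exists("models.maskfill"))
print(unsupported_kwargs(OtherRepoModel, call_kwargs))
```

Run from the repository root, the first check would report whether models/maskfill.py is importable at all, and the second would list the arguments (here vocab and weight) that a borrowed class does not accept.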
I downloaded the source code and the checkpoint files.
I want to use motif_sample.py, but it stops with the following error.
$ python motif_sample.py
[2024-03-24 01:11:33,984::sample::INFO] Namespace(config='./configs/sample.yml', data_id=1, device='cuda:0', num_workers=64, outdir='./outputs', vocab_path='vocab.txt')
[2024-03-24 01:11:33,984::sample::INFO] {'dataset': {'name': 'pl', 'path': './data/pdbbind_pocket10', 'split': './data/split_by_name.pt'}, 'model': {'checkpoint': './checkpoints/pretrained.pt', 'hidden_channels': 256, 'random_alpha': False}, 'sample': {'seed': 2024, 'num_samples': 100, 'num_retry': 5, 'max_steps': 12, 'batch_size': 10, 'num_workers': 4, 'n_samples': 5}}
[2024-03-24 01:11:33,984::sample::INFO] Loading data...
Segmentation fault (core dumped)
I checked the code in motif_sample.py, and the crash happens at line 507:
data = testset[args.data_id]
Do you have any ideas on how to solve this?
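A segmentation fault gives no Python traceback by default, which makes it hard to tell which call inside the dataset lookup crashed. One option is the standard-library faulthandler module, which dumps the Python stack when the interpreter receives a fatal signal. The sketch below is an assumption-laden illustration: `enable_crash_traces`, `load_item`, and `toy_testset` are hypothetical names, and the list merely stands in for the `testset` object in motif_sample.py.

```python
import faulthandler
import sys


def enable_crash_traces() -> bool:
    """Ask CPython to dump the Python stack on fatal signals such as SIGSEGV."""
    if not faulthandler.is_enabled():
        try:
            # all_threads=True also reports threads other than the main one.
            faulthandler.enable(file=sys.stderr, all_threads=True)
        except (AttributeError, ValueError, OSError, RuntimeError):
            # stderr may be missing or lack a real file descriptor.
            return False
    return faulthandler.is_enabled()


def load_item(dataset, index: int):
    """Fetch one item, with a bounds check so a bad index fails cleanly."""
    n = len(dataset)
    if not 0 <= index < n:
        raise IndexError(f"data_id {index} is outside [0, {n})")
    return dataset[index]


enabled = enable_crash_traces()
# `toy_testset` is a placeholder for the `testset` object in motif_sample.py.
toy_testset = ["pocket_0", "pocket_1"]
item = load_item(toy_testset, 1)
print(enabled, item)
```

The same effect is available without editing any code by running `python -X faulthandler motif_sample.py`. The resulting stack dump should show which line inside the dataset code was executing when the crash occurred.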