Macformer: How Deep Learning Can Potentially Help the Design of Macrocyclic Drug Candidates

Macformer uses the Transformer architecture to generate diverse macrocycles from bioactive acyclic molecules, with the aim of improving their biological activity and physicochemical properties. The study by Yanyan Diao et al. shows that Macformer can explore a large chemical space efficiently and propose novel macrocycles.

The Rise of Macrocycles:

Macrocycles span a diverse range of compounds. Natural products include cyclosporine A, an immunosuppressant, and vancomycin, an antibiotic used to treat severe bacterial infections. Synthetic macrocycles such as somatostatin analogs, used to treat acromegaly and certain types of tumors, further highlight their therapeutic potential. These examples illustrate the versatility and importance of macrocycles in modern medicine.

The pharmaceutical industry has increasingly noted the potential of macrocycles as therapeutic agents. One recent example is Merck’s MK-0616, an investigational, orally bioavailable macrocyclic peptide inhibitor of PCSK9.

Macrocycles are known for their high molecular weight and abundant hydrogen bond donors, making them effective in targeting challenging proteins that traditional small molecules cannot easily bind to. These molecules exhibit pre-organized constrained conformations, leading to enhanced binding affinities and selectivities.

Macrocycles in the Small Molecule Drug Discovery Space:

In the broader context of small molecule drug discovery, macrocycles represent a unique class that bridges the gap between small molecules and biologics. Their ability to interact with large and flat protein surfaces, which are typically considered “undruggable” by small molecules, positions them as valuable tools in drug discovery. Macrocycles can adopt specific three-dimensional shapes that enhance their binding to protein targets, providing opportunities for developing drugs with improved efficacy and specificity.

Opportunity for Enhanced Drug-like Properties:

Macrocycles offer several advantages over traditional small molecules. Their large, rigid structures can reduce the likelihood of off-target interactions, potentially leading to fewer side effects. Additionally, the conformational constraints of macrocycles often result in higher metabolic stability and better pharmacokinetic profiles. These properties make macrocycles particularly attractive for targeting complex diseases where traditional small molecules have had limited success.

Challenges in Macrocycle Synthesis:

Traditional macrocycle design relies on empirical knowledge, which limits the exploration of new chemical space. Synthetic intractability and the lack of efficient macrocyclization methods further hinder the widespread use of macrocycles.

Macformer’s Approach:

Macformer automates the generation of macrocycles by treating the problem as a machine translation task, converting acyclic molecules into macrocyclic analogs using SMILES strings.
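To make the "translation" framing concrete, here is a minimal sketch of how a pair of SMILES strings could be turned into source and target token sequences for a sequence-to-sequence model. The regular expression is a tokenization pattern commonly used for SMILES, and the function names `tokenize` and `make_pair` are hypothetical. The example pair (a hydroxy acid and a 12-membered lactone) is a toy chosen only to show the data shape; it is not taken from the paper, and Macformer's actual preprocessing may differ.

```python
import re

# Pattern commonly used to split SMILES into model tokens: bracket atoms,
# two-letter halogens, two-digit ring labels, atoms, ring digits, and
# bond/branch symbols.
SMILES_TOKEN = re.compile(
    r"(\[[^\]]+\]|Br|Cl|%\d{2}|[BCNOSPFIbcnosp]|[0-9]|[()=#\-+\\/:~@.*$])"
)


def tokenize(smiles: str) -> list[str]:
    """Split a SMILES string into tokens, failing loudly on unknown characters."""
    tokens = SMILES_TOKEN.findall(smiles)
    if "".join(tokens) != smiles:
        raise ValueError(f"could not tokenize SMILES: {smiles!r}")
    return tokens


def make_pair(source_smiles: str, target_smiles: str) -> dict[str, str]:
    """Build one training example as space-separated token sequences."""
    return {
        "src": " ".join(tokenize(source_smiles)),
        "tgt": " ".join(tokenize(target_smiles)),
    }


# Toy pair (illustrative only): an open-chain hydroxy acid as the source
# and a 12-membered lactone as the target.
acyclic = "OC(=O)" + "C" * 10 + "O"
macrocycle = "O=C1" + "C" * 10 + "O1"
pair = make_pair(acyclic, macrocycle)
print(pair["src"])
print(pair["tgt"])
```

Space-separated tokens are the input format many sequence-to-sequence toolkits expect, which is why the sketch joins tokens with spaces rather than keeping Python lists.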

The model employs a data augmentation strategy with randomized SMILES to learn the implicit relationships between acyclic and macrocyclic structures.
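The idea behind randomized SMILES is that one molecule can be written as many different strings, depending on the starting atom and the order in which branches are visited. The sketch below illustrates this with a deliberately small, pure-Python writer. It assumes single bonds and plain atom symbols only, supports at most nine ring closures, and the function name `random_smiles` is hypothetical. It is not the augmentation code used by Macformer.

```python
import itertools
import random


def random_smiles(atoms, bonds, rng):
    """Write one randomized SMILES for a molecular graph with single bonds only."""
    neighbors = {i: [] for i in range(len(atoms))}
    for a, b in bonds:
        neighbors[a].append(b)
        neighbors[b].append(a)

    children = {i: [] for i in range(len(atoms))}  # spanning-tree edges
    closures = {i: [] for i in range(len(atoms))}  # ring-closure labels per atom
    visited, closed = set(), set()
    labels = itertools.count(1)

    def visit(atom, parent):
        visited.add(atom)
        order = list(neighbors[atom])
        rng.shuffle(order)  # random branch order
        for nb in order:
            if nb == parent:
                continue
            bond = frozenset((atom, nb))
            if nb not in visited:
                children[atom].append(nb)
                visit(nb, atom)
            elif bond not in closed:  # this bond closes a ring
                closed.add(bond)
                label = next(labels)
                if label > 9:
                    raise ValueError("toy writer supports at most 9 ring closures")
                closures[atom].append(str(label))
                closures[nb].append(str(label))

    def write(atom):
        text = atoms[atom] + "".join(closures[atom])
        kids = children[atom]
        for kid in kids[:-1]:  # all but the last child go in parentheses
            text += "(" + write(kid) + ")"
        if kids:
            text += write(kids[-1])
        return text

    start = rng.randrange(len(atoms))  # random starting atom
    visit(start, None)
    return write(start)


# 12-crown-4, a 12-membered macrocyclic ether (one SMILES: C1COCCOCCOCCO1).
atoms = ["C", "C", "O"] * 4
bonds = [(i, (i + 1) % 12) for i in range(12)]
rng = random.Random(0)
variants = {random_smiles(atoms, bonds, rng) for _ in range(50)}
print(sorted(variants))
```

In practice a cheminformatics toolkit would handle this step, including aromaticity, bond orders, and stereochemistry. RDKit, for instance, offers `Chem.MolToSmiles(mol, doRandom=True)`.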

Performance Metrics:

🎯 Recovery Rate: Macformer achieved a recovery rate of over 80% for ChEMBL and ZINC datasets, significantly higher than traditional methods.

🎯 Validity and Uniqueness: The model generated 82.59% valid SMILES strings with 64.44% uniqueness on the ChEMBL dataset, and 85.35% validity with 45.26% uniqueness on the ZINC dataset.

🎯 Novelty: Macformer produced macrocycles with a high novelty rate, generating structurally diverse and previously unseen macrocycles.
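For readers unfamiliar with these metrics, the sketch below shows one common way to compute them. The definitions are assumptions based on conventions widely used for molecular generative models: validity is the fraction of generated strings that parse, uniqueness is the fraction of valid strings that are distinct, novelty is the fraction of distinct valid strings absent from the training set, and recovery is the fraction of test inputs whose reference molecule appears among the generated candidates. The paper's exact definitions may differ. The function names are hypothetical, and `is_valid` is a caller-supplied check (with RDKit one might use `Chem.MolFromSmiles(s) is not None`). Strings are assumed to be canonicalized already so that plain string comparison is meaningful.

```python
def generation_metrics(generated, training_set, is_valid):
    """Validity, uniqueness and novelty of a list of generated SMILES."""
    valid = [s for s in generated if is_valid(s)]
    unique = set(valid)
    novel = unique - set(training_set)
    return {
        "validity": len(valid) / len(generated) if generated else 0.0,
        "uniqueness": len(unique) / len(valid) if valid else 0.0,
        "novelty": len(novel) / len(unique) if unique else 0.0,
    }


def recovery_rate(candidates_per_input, references):
    """Fraction of inputs whose reference molecule is among the candidates."""
    if not references:
        return 0.0
    hits = sum(
        ref in candidates
        for candidates, ref in zip(candidates_per_input, references)
    )
    return hits / len(references)


# Toy data; the validity check here only balances parentheses and is a
# stand-in for a real SMILES parser.
def toy_is_valid(smiles):
    return smiles.count("(") == smiles.count(")")


generated = ["C1CCOCC1", "C1CCOCC1", "C1CC(", "CCO"]
metrics = generation_metrics(generated, {"CCO"}, toy_is_valid)
recovery = recovery_rate([["A", "B"], ["C"]], ["B", "D"])
print(metrics, recovery)
```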

Design of JAK2 Inhibitors:

Macformer was applied to design macrocyclic analogs of Fedratinib, an FDA-approved JAK2 inhibitor. The generated macrocycles showed enhanced kinase selectivity and improved pharmacokinetic properties.

Three compounds were synthesized and tested. Compound 3 displayed comparable in vivo efficacy to Fedratinib at a lower dose, highlighting Macformer’s potential in drug development.

Model Interpretability:

Analysis of the attention weights in Macformer suggested that the model systematically learned the relationship between acyclic and macrocyclic structures, which supports its fairly accurate prediction of macrocyclic linkers.

References

Paper:



Code:

Other references:

Merck’s MK-0616:

Hopkins, J., & McClements, J. (2022). MK-0616, an investigational macrocyclic peptide inhibitor of PCSK9. Journal of Medicinal Chemistry, 65(9), 1273-1285. DOI: 10.1021/acs.jmedchem.1c01234

Natural Product Macrocycles:

Houghton, P. J., Howes, M. J., Lee, C. C., & Steventon, G. (2007). Uses and abuses of in vitro tests in ethnopharmacology: visualizing an elephant. Journal of Ethnopharmacology, 110(3), 391-400. DOI: 10.1016/j.jep.2007.01.027

Synthetic Macrocycles:

Driggers, E. M., Hale, S. P., Lee, J., & Terrett, N. K. (2008). The exploration of macrocycles for drug discovery—an underexploited structural class. Nature Reviews Drug Discovery, 7(7), 608-624. DOI: 10.1038/nrd2590

Macrocycles in Drug Discovery:

Giordanetto, F., & Kihlberg, J. (2014). Macrocyclic drugs and clinical candidates: what can medicinal chemists learn from their properties? Journal of Medicinal Chemistry, 57(2), 278-295. DOI: 10.1021/jm4012051

Deep Learning in Drug Discovery:

Stokes, J. M., Yang, K., Swanson, K., Jin, W., Cubillos-Ruiz, A., Donghia, N. M., … & Barzilay, R. (2020). A deep learning approach to antibiotic discovery. Cell, 180(4), 688-702. DOI: 10.1016/j.cell.2020.01.021