Using the InteractiveMolecule widget

This example requires RDKit. The command below installs it with conda from the conda-forge channel.

!conda install -c conda-forge rdkit

Now we can import trident_chemwidgets, pandas, and the Chem module from RDKit.

[1]:
import trident_chemwidgets as tcw
import pandas as pd
from rdkit import Chem

Now we can create a small function to featurize our molecules with basic information per atom.

IMPORTANT: the order of the data rows in the pandas DataFrame or dict must match the standard ordering of atoms as returned by the RDKit ``.GetAtoms()`` function. You can generate this data any way you see fit (e.g. calculated values from RDKit, as in the function below, or attention values from a Graph Attention Network). The only constraint is the atom ordering. If you are using RDKit-based featurizers like those from DeepChem, this standard ordering should already be the default. Take care when using custom featurizers.
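As a minimal sketch of a sanity check for this constraint, the snippet below verifies that every column of a feature dict has exactly one entry per atom. The helper name ``check_atom_data`` and the example values are hypothetical and not part of trident_chemwidgets or RDKit. In practice the atom count would come from RDKit, for example via ``mol.GetNumAtoms()``. The check only catches a mismatch in row counts; it cannot detect rows that are present but listed in the wrong order.

```python
def check_atom_data(feature_dict, num_atoms):
    """Raise ValueError unless every column has exactly one row per atom.

    Hypothetical helper for illustration only; it checks row counts,
    not whether the rows follow the RDKit atom ordering.
    """
    for name, values in feature_dict.items():
        if len(values) != num_atoms:
            raise ValueError(
                f"Column {name!r} has {len(values)} rows, expected {num_atoms}"
            )
    return True


# Illustrative per-atom values for ethanol (SMILES 'CCO'), listed in
# atom index order: C, C, O. The values are typed by hand, not computed.
example = {
    'Formal Charge': [0, 0, 0],
    'Total Hs': [3, 2, 1],
}

check_atom_data(example, num_atoms=3)
```

A check like this is most useful with custom featurizers, where dropped or duplicated atoms would otherwise attach values to the wrong atoms without any error.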

[2]:
def featurize_mol(smiles):
    """Return a DataFrame with one row of basic features per atom, in RDKit atom order."""
    # Parse the SMILES string; RDKit returns None if it cannot be parsed
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f'Could not parse SMILES: {smiles}')

    # Init feature dict
    feature_dict = {
        'Chiral Tag': [],
        'Formal Charge': [],
        'Mass': [],
        'Total Hs': [],
        'Total Valence': []
    }

    # Use RDKit to get all the atom properties, in GetAtoms() order
    for atom in mol.GetAtoms():
        feature_dict['Chiral Tag'].append(atom.GetChiralTag())
        feature_dict['Formal Charge'].append(atom.GetFormalCharge())
        feature_dict['Mass'].append(atom.GetMass())
        feature_dict['Total Hs'].append(atom.GetTotalNumHs())
        feature_dict['Total Valence'].append(atom.GetTotalValence())

    return pd.DataFrame.from_dict(feature_dict)

Here we’ll explore the atom features of the ibuprofen molecule, SMILES string CC(C)CC1=CC=C(C=C1)C(C)C(=O)O. We’ll use the function defined above to compute data at the atom level.

[3]:
atom_data = featurize_mol('CC(C)CC1=CC=C(C=C1)C(C)C(=O)O')
atom_data.head()

Now we can use the InteractiveMolecule widget to explore the data attached to each atom.

[4]:
w = tcw.InteractiveMolecule('CC(C)CC1=CC=C(C=C1)C(C)C(=O)O', data=atom_data)
# w # Uncomment this line to run locally

The widget's smiles attribute will match the SMILES string you passed in.

[5]:
w.smiles