dscript.models

dscript.models.embedding

Embedding model classes.

class dscript.models.embedding.FullyConnectedEmbed(nin, nout, dropout=0.5, activation=ReLU())[source]

Bases: torch.nn.modules.module.Module

Protein Projection Module. Takes an embedding from the language model and outputs a low-dimensional, interaction-aware projection.

Parameters
  • nin (int) – Size of language model output

  • nout (int) – Dimension of projection

  • dropout (float) – Proportion of weights to drop out [default: 0.5]

  • activation (torch.nn.Module) – Activation for linear projection model [default: torch.nn.ReLU()]

forward(x)[source]
Parameters

x (torch.Tensor) – Input language model embedding \((b \times N \times d_0)\)

Returns

Low dimensional projection of embedding

Return type

torch.Tensor

class dscript.models.embedding.IdentityEmbed[source]

Bases: torch.nn.modules.module.Module

Does not reduce the dimension of the language model embeddings, just passes them through to the contact model.

forward(x)[source]
Parameters

x (torch.Tensor) – Input language model embedding \((b \times N \times d_0)\)

Returns

Same embedding

Return type

torch.Tensor

class dscript.models.embedding.SkipLSTM(nin=21, nout=100, hidden_dim=1024, num_layers=3, dropout=0, bidirectional=True)[source]

Bases: torch.nn.modules.module.Module

Language model from Bepler & Berger.

Loaded with pre-trained weights in embedding function.

Parameters
  • nin (int) – Input dimension of amino acid one-hot [default: 21]

  • nout (int) – Output dimension of final layer [default: 100]

  • hidden_dim (int) – Size of hidden dimension [default: 1024]

  • num_layers (int) – Number of stacked LSTM models [default: 3]

  • dropout (float) – Proportion of weights to drop out [default: 0]

  • bidirectional (bool) – Whether to use biLSTM vs. LSTM [default: True]

to_one_hot(x)[source]

Transform numeric encoded amino acid vector to one-hot encoded vector

Parameters

x (torch.Tensor) – Input numeric amino acid encoding \((N)\)

Returns

One-hot encoding vector \((N \times n_{in})\)

Return type

torch.Tensor
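The encoding described above can be illustrated with a minimal pure-Python sketch. It does not call dscript or torch; the function name mirrors the method only for readability, and the integer codes in the example are hypothetical, since the mapping from amino acids to integers is not given here.

```python
def to_one_hot(encoded, nin=21):
    """Return an N x nin one-hot matrix (list of lists) for integer codes.

    Illustrative stand-in for the documented method, not the library code.
    """
    one_hot = []
    for code in encoded:
        if not 0 <= code < nin:
            raise ValueError(f"code {code} is outside [0, {nin})")
        row = [0.0] * nin   # one column per amino acid class
        row[code] = 1.0     # mark the position of this residue's code
        one_hot.append(row)
    return one_hot


encoded = [0, 5, 20]          # hypothetical numeric encoding of three residues
matrix = to_one_hot(encoded)  # 3 rows (N), 21 columns (n_in)
```

Each row contains a single 1.0, so the result has the documented \((N \times n_{in})\) shape.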

transform(x)[source]
Parameters

x (torch.Tensor) – Input numeric amino acid encoding \((N)\)

Returns

Concatenation of all hidden layers \((N \times (n_{in} + 2 \times \text{num_layers} \times \text{hidden_dim}))\)

Return type

torch.Tensor
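As a worked check of the output width stated above, the following sketch evaluates \(n_{in} + 2 \times \text{num_layers} \times \text{hidden_dim}\) for the default constructor arguments. The helper name is hypothetical, and using a factor of 1 for the unidirectional case is an assumption; the formula above only covers the bidirectional factor of 2.

```python
def skip_lstm_output_dim(nin=21, hidden_dim=1024, num_layers=3, bidirectional=True):
    """Width of the concatenated one-hot input and all LSTM hidden layers."""
    # Assumption: a unidirectional LSTM contributes one direction per layer.
    directions = 2 if bidirectional else 1
    return nin + directions * num_layers * hidden_dim


# Defaults: 21 + 2 * 3 * 1024
default_dim = skip_lstm_output_dim()
```

With the defaults this gives 6165 features per residue.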

dscript.models.contact

Contact model classes.

class dscript.models.contact.ContactCNN(embed_dim=100, hidden_dim=50, width=7, activation=Sigmoid())[source]

Bases: torch.nn.modules.module.Module

Residue Contact Prediction Module. Takes embeddings from the Projection module and produces the contact map, which is the output of the Contact module.

Parameters
  • embed_dim (int) –

    Output dimension of dscript.models.embedding model \(d\) [default: 100]

  • hidden_dim (int) – Hidden dimension \(h\) [default: 50]

  • width (int) – Width of convolutional filter \(2w+1\) [default: 7]

  • activation (torch.nn.Module) – Activation function for final contact map [default: torch.nn.Sigmoid()]

broadcast(z0, z1)[source]

Calls dscript.models.contact.FullyConnected.

Parameters
  • z0 (torch.Tensor) – Projection module embedding \((b \times N \times d)\)

  • z1 (torch.Tensor) – Projection module embedding \((b \times M \times d)\)

Returns

Predicted contact broadcast tensor \((b \times N \times M \times h)\)

Return type

torch.Tensor

forward(z0, z1)[source]
Parameters
  • z0 (torch.Tensor) – Projection module embedding \((b \times N \times d)\)

  • z1 (torch.Tensor) – Projection module embedding \((b \times M \times d)\)

Returns

Predicted contact map \((b \times N \times M)\)

Return type

torch.Tensor

predict(B)[source]

Predict contact map from broadcast tensor.

Parameters

B (torch.Tensor) – Predicted contact broadcast \((b \times N \times M \times h)\)

Returns

Predicted contact map \((b \times N \times M)\)

Return type

torch.Tensor
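The shapes documented for broadcast, predict, and forward can be traced with a small bookkeeping sketch. It only manipulates shape tuples and does not run the model; the helper name and the sequence lengths in the example are hypothetical.

```python
def contact_shapes(z0_shape, z1_shape, hidden_dim=50):
    """Return (broadcast tensor shape, contact map shape) for two projections."""
    b0, n, d0 = z0_shape
    b1, m, d1 = z1_shape
    if b0 != b1 or d0 != d1:
        raise ValueError("z0 and z1 must share batch size b and embedding dim d")
    broadcast_shape = (b0, n, m, hidden_dim)  # output of broadcast(z0, z1)
    map_shape = (b0, n, m)                    # output of predict(B) and forward(z0, z1)
    return broadcast_shape, map_shape


# Hypothetical pair: proteins of length 120 and 85, projected to d = 100.
broadcast_shape, map_shape = contact_shapes((1, 120, 100), (1, 85, 100))
```

The contact map has one entry per residue pair, so the hidden dimension \(h\) appears only in the intermediate broadcast tensor.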

class dscript.models.contact.FullyConnected(embed_dim, hidden_dim, activation=ReLU())[source]

Bases: torch.nn.modules.module.Module

Performs part 1 of Contact Prediction Module. Takes embeddings from Projection module and produces broadcast tensor.

Input embeddings of dimension \(d\) are combined into an MLP input \(z_{cat}\) of length \(2d\), where \(z_{cat} = [z_0 \ominus z_1 | z_0 \odot z_1]\).

Parameters
  • embed_dim (int) –

    Output dimension of dscript.models.embedding model \(d\)

  • hidden_dim (int) – Hidden dimension \(h\)

  • activation (torch.nn.Module) – Activation function for broadcast tensor [default: torch.nn.ReLU()]

forward(z0, z1)[source]
Parameters
  • z0 (torch.Tensor) – Projection module embedding \((b \times N \times d)\)

  • z1 (torch.Tensor) – Projection module embedding \((b \times M \times d)\)

Returns

Predicted broadcast tensor \((b \times N \times M \times h)\)

Return type

torch.Tensor
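The construction of \(z_{cat}\) can be sketched in plain Python for every residue pair. This is an illustration under stated assumptions, not the library implementation: \(\ominus\) is read here as the element-wise absolute difference and \(\odot\) as the element-wise product, and the MLP that maps the \(2d\) features to \(h\) hidden units is left out. The helper name and input values are hypothetical.

```python
def pairwise_features(z0, z1):
    """Build z_cat for all residue pairs.

    z0: N x d and z1: M x d (lists of lists). Returns an N x M x 2d nested list.
    """
    features = []
    for a in z0:
        row = []
        for b in z1:
            # Assumption: the "ominus" operator is the absolute difference.
            diff = [abs(x - y) for x, y in zip(a, b)]
            # Element-wise (Hadamard) product.
            prod = [x * y for x, y in zip(a, b)]
            row.append(diff + prod)  # concatenate into a length-2d vector
        features.append(row)
    return features


z0 = [[1.0, 2.0], [0.5, -1.0], [3.0, 0.0]]  # N = 3, d = 2
z1 = [[2.0, 2.0], [1.0, 4.0]]               # M = 2, d = 2
z_cat = pairwise_features(z0, z1)
```

Both halves of \(z_{cat}\) are symmetric in their arguments under this reading, so the features for a pair do not depend on which protein is passed first.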

dscript.models.interaction

Interaction model classes.

class dscript.models.interaction.LogisticActivation(x0=0, k=1, train=False)[source]

Bases: torch.nn.modules.module.Module

Implementation of a generalized sigmoid. Applies the element-wise function:

\(\sigma(x) = \frac{1}{1 + \exp(-k(x-x_0))}\)

Parameters
  • x0 (float) – The value of the sigmoid midpoint [default: 0]

  • k (float) – The slope of the sigmoid, \(k \geq 0\) [default: 1]

  • train (bool) – Whether \(k\) is a trainable parameter [default: False]

forward(x)[source]

Applies the function to the input elementwise

Parameters

x (torch.Tensor) – \((N \times *)\) where \(*\) means, any number of additional dimensions

Returns

\((N \times *)\), same shape as the input

Return type

torch.Tensor
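The effect of \(x_0\) and \(k\) can be seen in a scalar sketch of the formula above. It uses only the standard library and is not the module itself; the function name and the sample values are hypothetical.

```python
import math


def logistic(x, x0=0.0, k=1.0):
    """Generalized sigmoid: 1 / (1 + exp(-k * (x - x0)))."""
    return 1.0 / (1.0 + math.exp(-k * (x - x0)))


midpoint = logistic(0.5, x0=0.5, k=20.0)  # at x = x0 the output is exactly 0.5
steep = logistic(1.0, x0=0.5, k=20.0)     # large k pushes values toward 0 or 1
shallow = logistic(1.0, x0=0.5, k=1.0)    # small k gives a gentler transition
```

Shifting \(x_0\) moves the point at which the output crosses 0.5, while increasing \(k\) sharpens the transition around that point.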

class dscript.models.interaction.ModelInteraction(embedding, contact, pool_size=9, theta_init=1, lambda_init=0, gamma_init=0, use_W=True)[source]

Bases: torch.nn.modules.module.Module

Main D-SCRIPT model. Contains an embedding and contact model and offers access to those models. Computes pooling operations on contact map to generate interaction probability.

Parameters
  • embedding (dscript.models.embedding.FullyConnectedEmbed) – Embedding model

  • contact (dscript.models.contact.ContactCNN) – Contact model

  • use_cuda (bool) – Whether the model should be run on GPU

  • pool_size (int) – width of max-pool [default: 9]

  • theta_init (float) – initialization value of \(\theta\) for weight matrix [default: 1]

  • lambda_init (float) – initialization value of \(\lambda\) for weight matrix [default: 0]

  • gamma_init (float) – initialization value of \(\gamma\) for global pooling [default: 0]

  • use_W (bool) – whether to use the weighting matrix [default: True]

cpred(z0, z1)[source]

Project input language model embeddings down to low dimension using the projection module, then predict the contact map.

Parameters
  • z0 (torch.Tensor) – Language model embedding \((b \times N \times d_0)\)

  • z1 (torch.Tensor) – Language model embedding \((b \times M \times d_0)\)

Returns

Predicted contact map \((b \times N \times M)\)

Return type

torch.Tensor

embed(z)[source]

Project down input language model embeddings into low dimension using projection module

Parameters

z (torch.Tensor) – Language model embedding \((b \times N \times d_0)\)

Returns

D-SCRIPT projection \((b \times N \times d)\)

Return type

torch.Tensor

map_predict(z0, z1)[source]

Project input language model embeddings down to low dimension using the projection module, then predict the contact map and the probability of interaction.

Parameters
  • z0 (torch.Tensor) – Language model embedding \((b \times N \times d_0)\)

  • z1 (torch.Tensor) – Language model embedding \((b \times M \times d_0)\)

Returns

Predicted contact map, predicted probability of interaction \((b \times N \times M), (1)\)

Return type

torch.Tensor, torch.Tensor

predict(z0, z1)[source]

Project input language model embeddings down to low dimension using the projection module, then predict the probability of interaction.

Parameters
  • z0 (torch.Tensor) – Language model embedding \((b \times N \times d_0)\)

  • z1 (torch.Tensor) – Language model embedding \((b \times M \times d_0)\)

Returns

Predicted probability of interaction

Return type

torch.Tensor