dscript.models¶

dscript.models.embedding¶

Embedding model classes.

class dscript.models.embedding.FullyConnectedEmbed(nin, nout, dropout=0.5, activation=ReLU())[source]¶

Bases: torch.nn.modules.module.Module

Protein Projection Module. Takes the embedding from the language model and outputs a low-dimensional, interaction-aware projection.

Parameters

nin (int) – Size of language model output
nout (int) – Dimension of projection
dropout (float) – Probability of dropping each activation [default: 0.5]
activation (torch.nn.Module) – Activation for linear projection model [default: torch.nn.ReLU()]

forward(x)[source]¶

Parameters: x (torch.Tensor) – Input language model embedding \((b \times N \times d_0)\)
Returns: Low-dimensional projection of embedding \((b \times N \times d)\)
Return type: torch.Tensor
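
The shape change described above can be illustrated with a minimal sketch. It assumes the projection is a single linear layer followed by the activation and dropout; the class name `ProjectionSketch`, the layer order, and the sizes used are assumptions for illustration, not the D-SCRIPT source.

```python
import torch
import torch.nn as nn


class ProjectionSketch(nn.Module):
    """Hypothetical stand-in for a projection module (not the D-SCRIPT source)."""

    def __init__(self, nin, nout, dropout=0.5, activation=None):
        super().__init__()
        # nn.Linear acts on the last dimension only, so (b, N, nin) -> (b, N, nout)
        self.linear = nn.Linear(nin, nout)
        self.activation = activation if activation is not None else nn.ReLU()
        self.drop = nn.Dropout(p=dropout)

    def forward(self, x):
        return self.drop(self.activation(self.linear(x)))


proj = ProjectionSketch(nin=64, nout=16)  # example sizes, chosen arbitrarily
proj.eval()  # dropout is inactive in eval mode
x = torch.randn(2, 50, 64)  # (b, N, d0)
z = proj(x)
projection_shape = tuple(z.shape)  # (b, N, nout)
```

Only the last dimension changes; the batch and sequence-length dimensions pass through untouched.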

class dscript.models.embedding.IdentityEmbed[source]¶

Bases: torch.nn.modules.module.Module

Does not reduce the dimension of the language model embeddings; passes them through unchanged to the contact model.

forward(x)[source]¶

Parameters: x (torch.Tensor) – Input language model embedding \((b \times N \times d_0)\)
Returns: Same embedding
Return type: torch.Tensor

class dscript.models.embedding.SkipLSTM(nin=21, nout=100, hidden_dim=1024, num_layers=3, dropout=0, bidirectional=True)[source]¶

Bases: torch.nn.modules.module.Module

Language model from Bepler & Berger.

Loaded with pre-trained weights in embedding function.

Parameters

nin (int) – Input dimension of amino acid one-hot [default: 21]
nout (int) – Output dimension of final layer [default: 100]
hidden_dim (int) – Size of hidden dimension [default: 1024]
num_layers (int) – Number of stacked LSTM layers [default: 3]
dropout (float) – Dropout probability [default: 0]
bidirectional (bool) – Whether to use biLSTM vs. LSTM [default: True]

to_one_hot(x)[source]¶

Transform a numerically encoded amino acid vector into a one-hot encoding.

Parameters: x (torch.Tensor) – Input numeric amino acid encoding \((N)\)
Returns: One-hot encoding vector \((N \times n_{in})\)
Return type: torch.Tensor
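
As a rough illustration of this transformation, the sketch below uses `torch.nn.functional.one_hot`. The helper name `to_one_hot_sketch` is hypothetical, and the library's own method may handle devices and dtypes differently.

```python
import torch
import torch.nn.functional as F


def to_one_hot_sketch(x, nin=21):
    """Hypothetical helper: (N,) integer codes in [0, nin) -> (N, nin) one-hot rows."""
    return F.one_hot(x.long(), num_classes=nin).float()


codes = torch.tensor([0, 5, 20])  # three residues, numerically encoded
one_hot = to_one_hot_sketch(codes)
one_hot_shape = tuple(one_hot.shape)  # (N, nin)
```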

transform(x)[source]¶

Parameters: x (torch.Tensor) – Input numeric amino acid encoding \((N)\)
Returns: Concatenation of all hidden layers \((N \times (n_{in} + 2 \times \text{num_layers} \times \text{hidden_dim}))\)
Return type: torch.Tensor
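
The output width in the formula above can be checked with a little arithmetic. The helper below is a hypothetical illustration; generalizing the factor of 2 to a `bidirectional` switch is an assumption, since the formula as stated covers only the bidirectional case.

```python
def transform_output_dim(nin=21, hidden_dim=1024, num_layers=3, bidirectional=True):
    """Width of the one-hot input concatenated with every layer's hidden state."""
    directions = 2 if bidirectional else 1  # assumption: one direction halves the LSTM part
    return nin + directions * num_layers * hidden_dim


default_dim = transform_output_dim()  # 21 + 2 * 3 * 1024 = 6165
```

With the default constructor arguments, each residue is therefore represented by a 6165-dimensional vector.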

dscript.models.contact¶

Contact model classes.

class dscript.models.contact.ContactCNN(embed_dim=100, hidden_dim=50, width=7, activation=Sigmoid())[source]¶

Bases: torch.nn.modules.module.Module

Residue Contact Prediction Module. Takes embeddings from the Projection module and produces the contact map, which is the output of the Contact module.

Parameters

embed_dim (int) – Output dimension of dscript.models.embedding model \(d\) [default: 100]
hidden_dim (int) – Hidden dimension \(h\) [default: 50]
width (int) – Width of convolutional filter \(2w+1\) [default: 7]
activation (torch.nn.Module) – Activation function for final contact map [default: torch.nn.Sigmoid()]

broadcast(z0, z1)[source]¶

Calls dscript.models.contact.FullyConnected.

Parameters

z0 (torch.Tensor) – Projection module embedding \((b \times N \times d)\)
z1 (torch.Tensor) – Projection module embedding \((b \times M \times d)\)

Returns

Predicted contact broadcast tensor \((b \times N \times M \times h)\)

Return type

torch.Tensor

forward(z0, z1)[source]¶

Parameters

z0 (torch.Tensor) – Projection module embedding \((b \times N \times d)\)
z1 (torch.Tensor) – Projection module embedding \((b \times M \times d)\)

Returns

Predicted contact map \((b \times N \times M)\)

Return type

torch.Tensor

predict(B)[source]¶

Predict contact map from broadcast tensor.

Parameters: B (torch.Tensor) – Predicted contact broadcast \((b \times N \times M \times h)\)
Returns: Predicted contact map \((b \times N \times M)\)
Return type: torch.Tensor
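
To make the shapes concrete, here is a minimal sketch of a convolutional head that maps a broadcast tensor to a contact map. It assumes a single `Conv2d` with `hidden_dim` input channels, one output channel, and padding of `width // 2`; the class name `ContactHeadSketch` is hypothetical, and any normalization or symmetry constraint the library applies is left out.

```python
import torch
import torch.nn as nn


class ContactHeadSketch(nn.Module):
    """Hypothetical contact head (not the D-SCRIPT source)."""

    def __init__(self, hidden_dim=50, width=7, activation=None):
        super().__init__()
        # For an odd width 2w+1, padding w keeps the N x M spatial size.
        self.conv = nn.Conv2d(hidden_dim, 1, kernel_size=width, padding=width // 2)
        self.activation = activation if activation is not None else nn.Sigmoid()

    def forward(self, B):
        # Conv2d expects channels first: (b, N, M, h) -> (b, h, N, M)
        C = self.conv(B.permute(0, 3, 1, 2))
        # Drop the singleton channel: (b, 1, N, M) -> (b, N, M)
        return self.activation(C).squeeze(1)


head = ContactHeadSketch()
B = torch.rand(2, 30, 40, 50)  # (b, N, M, h)
cmap = head(B)
cmap_shape = tuple(cmap.shape)  # (b, N, M)
```

With the sigmoid activation, every entry of the result lies between 0 and 1 and can be read as a contact probability.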

class dscript.models.contact.FullyConnected(embed_dim, hidden_dim, activation=ReLU())[source]¶

Bases: torch.nn.modules.module.Module

Performs part 1 of Contact Prediction Module. Takes embeddings from Projection module and produces broadcast tensor.

Input embeddings of dimension \(d\) are combined into a length-\(2d\) MLP input \(z_{cat}\), where \(z_{cat} = [z_0 \ominus z_1 | z_0 \odot z_1]\), with \(\ominus\) the element-wise absolute difference and \(\odot\) the element-wise product.

Parameters

embed_dim (int) – Output dimension of dscript.models.embedding model \(d\)
hidden_dim (int) – Hidden dimension \(h\)
activation (torch.nn.Module) – Activation function for broadcast tensor [default: torch.nn.ReLU()]

forward(z0, z1)[source]¶

Parameters

z0 (torch.Tensor) – Projection module embedding \((b \times N \times d)\)
z1 (torch.Tensor) – Projection module embedding \((b \times M \times d)\)

Returns

Predicted broadcast tensor \((b \times N \times M \times h)\)

Return type

torch.Tensor
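
The pairwise combination can be sketched with tensor broadcasting. This example assumes \(\ominus\) means the element-wise absolute difference and uses a plain linear layer to map \(2d\) features to \(h\); the class name `BroadcastSketch` and the choice of layer are assumptions, not the D-SCRIPT source.

```python
import torch
import torch.nn as nn


class BroadcastSketch(nn.Module):
    """Hypothetical pairwise feature module (not the D-SCRIPT source)."""

    def __init__(self, embed_dim, hidden_dim, activation=None):
        super().__init__()
        self.linear = nn.Linear(2 * embed_dim, hidden_dim)
        self.activation = activation if activation is not None else nn.ReLU()

    def forward(self, z0, z1):
        a = z0.unsqueeze(2)  # (b, N, 1, d)
        c = z1.unsqueeze(1)  # (b, 1, M, d)
        z_dif = torch.abs(a - c)  # (b, N, M, d), assumed meaning of the ominus term
        z_mul = a * c  # (b, N, M, d), element-wise product
        z_cat = torch.cat([z_dif, z_mul], dim=-1)  # (b, N, M, 2d)
        return self.activation(self.linear(z_cat))  # (b, N, M, h)


fc = BroadcastSketch(embed_dim=100, hidden_dim=50)
z0 = torch.randn(2, 30, 100)  # (b, N, d)
z1 = torch.randn(2, 40, 100)  # (b, M, d)
broadcast_shape = tuple(fc(z0, z1).shape)
```

Every residue pair \((i, j)\) gets its own \(h\)-dimensional feature vector, which is why the result carries both length dimensions.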

dscript.models.interaction¶

Interaction model classes.

class dscript.models.interaction.LogisticActivation(x0=0, k=1, train=False)[source]¶

Bases: torch.nn.modules.module.Module

Implementation of a generalized sigmoid. Applies the element-wise function:

\(\sigma(x) = \frac{1}{1 + \exp(-k(x-x_0))}\)

Parameters

x0 (float) – The value of the sigmoid midpoint [default: 0]
k (float) – The slope of the sigmoid, \(k \geq 0\) [default: 1]
train (bool) – Whether \(k\) is a trainable parameter [default: False]

forward(x)[source]¶

Applies the function to the input elementwise

Parameters: x (torch.Tensor) – \((N \times *)\) where \(*\) means, any number of additional dimensions
Returns: \((N \times *)\), same shape as the input
Return type: torch.Tensor
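
A minimal sketch of the formula above as a PyTorch module follows. The class name `LogisticSketch` is hypothetical, and storing \(k\) as an `nn.Parameter` whose `requires_grad` follows the `train` flag is an assumption about how trainability might be wired.

```python
import torch
import torch.nn as nn


class LogisticSketch(nn.Module):
    """Hypothetical generalized sigmoid: 1 / (1 + exp(-k (x - x0)))."""

    def __init__(self, x0=0.0, k=1.0, train=False):
        super().__init__()
        self.x0 = x0
        # k receives gradients only when train=True (assumed wiring)
        self.k = nn.Parameter(torch.tensor([float(k)]), requires_grad=train)

    def forward(self, x):
        return 1.0 / (1.0 + torch.exp(-self.k * (x - self.x0)))


act = LogisticSketch(x0=0.5, k=20.0)
x = torch.tensor([0.0, 0.5, 1.0])
y = act(x)  # the midpoint x0 maps to 0.5; larger k gives a sharper transition
```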

class dscript.models.interaction.ModelInteraction(embedding, contact, pool_size=9, theta_init=1, lambda_init=0, gamma_init=0, use_W=True)[source]¶

Bases: torch.nn.modules.module.Module

Main D-SCRIPT model. Contains an embedding and contact model and offers access to those models. Computes pooling operations on contact map to generate interaction probability.

Parameters

embedding (dscript.models.embedding.FullyConnectedEmbed) – Embedding model
contact (dscript.models.contact.ContactCNN) – Contact model
use_cuda (bool) – Whether the model should be run on GPU
pool_size (int) – Width of max-pool [default: 9]
theta_init (float) – Initialization value of \(\theta\) for weight matrix [default: 1]
lambda_init (float) – Initialization value of \(\lambda\) for weight matrix [default: 0]
gamma_init (float) – Initialization value of \(\gamma\) for global pooling [default: 0]
use_W (bool) – Whether to use the weighting matrix [default: True]

cpred(z0, z1)[source]¶

Project the input language model embeddings down to low dimension with the projection module, then predict the contact map.

Parameters

z0 (torch.Tensor) – Language model embedding \((b \times N \times d_0)\)
z1 (torch.Tensor) – Language model embedding \((b \times M \times d_0)\)

Returns

Predicted contact map \((b \times N \times M)\)

Return type

torch.Tensor
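
The composition this method performs (embed both inputs, then hand them to the contact model) can be sketched as follows. The stand-in modules are hypothetical: `TinyContact` is a toy scorer that uses dot products, and `nn.Linear` plays the role of the embedding model; neither is the D-SCRIPT implementation.

```python
import torch
import torch.nn as nn


class TinyContact(nn.Module):
    """Hypothetical contact model: sigmoid of pairwise dot products."""

    def forward(self, e0, e1):
        # (b, N, d) @ (b, d, M) -> (b, N, M)
        return torch.sigmoid(torch.bmm(e0, e1.transpose(1, 2)))


def cpred_sketch(embedding, contact, z0, z1):
    """Project each protein's embedding, then predict the contact map."""
    e0 = embedding(z0)  # (b, N, d0) -> (b, N, d)
    e1 = embedding(z1)  # (b, M, d0) -> (b, M, d)
    return contact(e0, e1)


embedding = nn.Linear(64, 16)  # stand-in projection with example sizes
contact = TinyContact()
z0 = torch.randn(1, 30, 64)  # protein of length N = 30
z1 = torch.randn(1, 40, 64)  # protein of length M = 40
C = cpred_sketch(embedding, contact, z0, z1)
contact_shape = tuple(C.shape)  # (b, N, M)
```

The two proteins may have different lengths; the same projection is applied to both before they are compared.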

embed(z)[source]¶

Project down input language model embeddings into low dimension using projection module

Parameters: z (torch.Tensor) – Language model embedding \((b \times N \times d_0)\)
Returns: D-SCRIPT projection \((b \times N \times d)\)
Return type: torch.Tensor

map_predict(z0, z1)[source]¶

Project the input language model embeddings down to low dimension with the projection module, then predict the contact map and the probability of interaction.

Parameters

z0 (torch.Tensor) – Language model embedding \((b \times N \times d_0)\)
z1 (torch.Tensor) – Language model embedding \((b \times M \times d_0)\)

Returns

Predicted contact map, predicted probability of interaction \((b \times N \times M), (1)\)

Return type

torch.Tensor, torch.Tensor
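
How a contact map might be reduced to a single interaction score is sketched below. This section does not give the exact pooling formula, so the sketch is only a placeholder: it applies a max-pool of width `pool_size` with a stride of 1 and then takes a plain mean, ignoring the weight matrix and \(\gamma\). The function name `interaction_prob_sketch` and both of those choices are assumptions.

```python
import torch
import torch.nn.functional as F


def interaction_prob_sketch(cmap, pool_size=9):
    """Hypothetical reduction of a (b, N, M) contact map to one score per pair."""
    x = cmap.unsqueeze(1)  # add a channel dimension: (b, 1, N, M)
    # Local max-pool; stride 1 and padding pool_size // 2 keep the N x M size.
    pooled = F.max_pool2d(x, kernel_size=pool_size, stride=1, padding=pool_size // 2)
    # Placeholder global pooling: a plain mean over the pooled map.
    return pooled.mean(dim=(1, 2, 3))


cmap = torch.rand(2, 30, 40)  # contact probabilities in [0, 1)
p_hat = interaction_prob_sketch(cmap)  # one score per protein pair
```

Because max-pooling can only raise each entry, the pooled mean is never lower than the mean of the raw contact map.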

predict(z0, z1)[source]¶

Predict the probability of interaction from the input language model embeddings.

Parameters

z0 (torch.Tensor) – Language model embedding \((b \times N \times d_0)\)
z1 (torch.Tensor) – Language model embedding \((b \times M \times d_0)\)

Returns

Predicted probability of interaction

Return type

torch.Tensor