|
FreeSASA |
|
======== |
|
|
|
These pages document the |
|
|
|
- @ref CLI |
|
- @ref API "FreeSASA C API" |
|
- @ref Python "FreeSASA Python interface" |
|
- @ref Config-file |
|
- @ref Selection |
|
- @ref Geometry |
|
|
|
The library is released under the [MIT license](license.md). |
|
|
|
Installation instructions can be found in the [README](README.md) file. |
|
|
|
@page CLI Command-line Interface |
|
|
|
Building FreeSASA creates the binary `freesasa`, which is installed by |
|
`make install`. Calling |
|
|
|
$ freesasa -h |
|
|
|
displays a help message listing all options. The following text |
|
explains how to use most of them. |
|
|
|
@section CLI-Default Run using defaults |
|
|
|
In the following we will use the RNA/protein complex PDB structure |
|
3WBM as an example. It has four protein chains A, B, C and D, and two |
|
RNA strands X and Y. To run a simple SASA calculation using default |
|
parameters, simply type: |
|
|
|
$ freesasa 3wbm.pdb |
|
|
|
This generates the following output |
|
|
|
## FreeSASA 2.0 ## |
|
|
|
PARAMETERS |
|
algorithm : Lee & Richards |
|
probe-radius : 1.400 |
|
threads : 2 |
|
slices : 20 |
|
|
|
INPUT |
|
source : 3wbm.pdb |
|
chains : ABCDXY |
|
atoms : 3714 |
|
|
|
RESULTS (A^2) |
|
Total : 25190.77 |
|
Apolar : 11552.38 |
|
Polar : 13638.39 |
|
CHAIN A : 3785.49 |
|
CHAIN B : 4342.33 |
|
CHAIN C : 3961.12 |
|
CHAIN D : 4904.30 |
|
CHAIN X : 4156.46 |
|
CHAIN Y : 4041.08 |
|
|
|
The results are all in the unit Ångström-squared. |
|
|
|
@section parameters Changing parameters |
|
|
|
If higher precision is needed, the command |
|
|
|
$ freesasa -n 100 3wbm.pdb |
|
|
|
specifies that the calculation should use 100 slices per atom instead of |
|
the default 20. The command |
|
|
|
$ freesasa --shrake-rupley -n 200 --probe-radius 1.2 --n-threads 4 3wbm.pdb |
|
|
|
instead calculates the SASA using Shrake & Rupley's algorithm with 200 |
|
test points, a probe radius of 1.2 Å, using 4 parallel threads to |
|
speed things up. |
|
|
|
If the user wants to use their own atomic radii the command |
|
|
|
$ freesasa --config-file <file> 3wbm.pdb |
|
|
|
Reads a configuration from a file and uses it to assign atomic |
|
radii. The program will halt if it encounters atoms in the PDB input |
|
that are not present in the configuration. See @ref Config-file for |
|
instructions how to write a configuration. |
|
|
|
To use the atomic radii from NACCESS call |
|
|
|
$ freesasa --radii=naccess 3wbm.pdb |
|
|
|
Another way to specify a custom set of atomic radii is to store them as |
|
occupancies in the input PDB file |
|
|
|
$ freesasa --radius-from-occupancy 3wbm.pdb |
|
|
|
This option allows the user to first use the option `--format=pdb` (see @ref CLI-PDB) to |
|
write generate a PDB file with the radii used in the calculation, |
|
modify the radii of individual atoms in that file, and then recalculate |
|
the SASA with these modified radii. |
|
|
|
|
|
@section Output Output formats |
|
|
|
In addition to the standard output format above FreeSASA can export |
|
the results as @ref CLI-JSON, @ref CLI-XML, @ref CLI-PDB, @ref |
|
CLI-RSA, @ref CLI-RES and @ref CLI-SEQ using the option |
|
`--format`. The level of detail of JSON and XML output can be |
|
controlled with the option `--output-depth=<depth>` which takes the |
|
values `atom`, `residue`, `chain` and `structure`. If `atom` is |
|
chosen, SASA values are shown for all levels of the structure, |
|
including individual atoms. With `chain`, only structure and chain |
|
SASA values are printed (this is the default). |
|
|
|
The output can include relative SASA values for each residues. To |
|
calculate these a reference SASA value is needed, calculated using the |
|
same atomic radii. At the moment such values are only available for |
|
the ProtOr and NACCESS radii (selected using the option `--radii`), if |
|
other radii are used relative SASA will be excluded (in RSA output all |
|
REL columns will have the value 'N/A'). |
|
|
|
The reference SASA values for residue X are calculated from Ala-X-Ala |
|
peptides in a stretched out configuration. The reference |
|
configurations are supplied for reference in the directory |
|
`rsa`. Since these are not always the most exposed possible |
|
configuration, and because bond lengths and bond angles vary, the |
|
relative SASA values will sometimes be larger than 100 %. At the |
|
moment there is no interface to supply user-defined reference values. |
|
|
|
|
|
@subsection CLI-JSON JSON |
|
|
|
The command |
|
|
|
$ freesasa --format=xml --output-depth=residue 3wbm.pdb |
|
|
|
generates the following |
|
|
|
~~~~{.json} |
|
{ |
|
"source":"FreeSASA 2.0", |
|
"length-unit":"Ångström", |
|
"results":[ |
|
{ |
|
"input":"3wbm.pdb", |
|
"classifier":"ProtOr", |
|
"parameters":{ |
|
"algorithm":"Lee & Richards", |
|
"probe-radius":1.3999999999999999, |
|
"resolution":20 |
|
}, |
|
"structures":[ |
|
{ |
|
"chain-labels":"ABCDXY", |
|
"area":{ |
|
"total":25190.768387067546, |
|
"polar":13638.391677017404, |
|
"apolar":11552.376710050148, |
|
"main-chain":3337.1622502425053, |
|
"side-chain":21853.606136825045 |
|
}, |
|
"chains":[ |
|
{ |
|
"label":"A", |
|
"n-residues":86, |
|
"area":{ |
|
"total":3785.4864049452635, |
|
"polar":1733.8560208488598, |
|
"apolar":2051.6303840964056, |
|
"main-chain":723.34358684348558, |
|
"side-chain":3062.1428181017791 |
|
} |
|
"residues":[ |
|
{ |
|
"name":"THR", |
|
"number":"5", |
|
"area":{ |
|
"total":138.48216994006549, |
|
"polar":56.887951514571867, |
|
"apolar":81.594218425493622, |
|
"main-chain":38.898190013033592, |
|
"side-chain":99.583979927031905 |
|
}, |
|
"relative-area":{ |
|
"total":104.05152148175331, |
|
"polar":113.98106895325961, |
|
"apolar":98.093554250413092, |
|
"main-chain":96.330336832673567, |
|
"side-chain":107.414496739329 |
|
}, |
|
"n-atoms":7 |
|
}, |
|
|
|
... |
|
|
|
}, |
|
|
|
... |
|
|
|
] |
|
} |
|
] |
|
} |
|
] |
|
} |
|
~~~~ |
|
|
|
Where ellipsis indicates the remaining residues and chains. |
|
|
|
@subsection CLI-XML XML |
|
|
|
The command |
|
|
|
$ freesasa --format=xml 3wbm.pdb |
|
|
|
Generates the following |
|
|
|
~~~~{.xml} |
|
<?xml version="1.0" encoding="UTF-8"?> |
|
<results xmlns="http://freesasa.github.io/" source="FreeSASA 2.0" lengthUnit="Ångström"> |
|
<result classifier="ProtOr" input="3wbm.pdb"> |
|
<parameters algorithm="Lee & Richards" probeRadius="1.400000" resolution="20"/> |
|
<structure chains="ABCDXY"> |
|
<area total="25190.768" polar="13638.392" apolar="11552.377" mainChain="3337.162" sideChain="21853.606"/> |
|
<chain label="A" nResidues="86"> |
|
<area total="3785.486" polar="1733.856" apolar="2051.630" mainChain="723.344" sideChain="3062.143"/> |
|
</chain> |
|
<chain label="B" nResidues="84"> |
|
<area total="4342.334" polar="1957.114" apolar="2385.220" mainChain="853.707" sideChain="3488.627"/> |
|
</chain> |
|
<chain label="C" nResidues="86"> |
|
<area total="3961.119" polar="1838.724" apolar="2122.395" mainChain="782.652" sideChain="3178.468"/> |
|
</chain> |
|
<chain label="D" nResidues="89"> |
|
<area total="4904.298" polar="2332.306" apolar="2571.991" mainChain="977.459" sideChain="3926.838"/> |
|
</chain> |
|
<chain label="X" nResidues="25"> |
|
<area total="4156.455" polar="2919.576" apolar="1236.879" mainChain="0.000" sideChain="4156.455"/> |
|
</chain> |
|
<chain label="Y" nResidues="25"> |
|
<area total="4041.076" polar="2856.815" apolar="1184.261" mainChain="0.000" sideChain="4041.076"/> |
|
</chain> |
|
</structure> |
|
</result> |
|
</results> |
|
|
|
~~~~ |
|
|
|
@subsection CLI-PDB PDB |
|
|
|
The command-line interface can also be used as a PDB filter: |
|
|
|
$ cat 3wbm.pdb | freesasa --format=pdb |
|
REMARK 999 This PDB file was generated by FreeSASA 2.0. |
|
REMARK 999 In the ATOM records temperature factors have been |
|
REMARK 999 replaced by the SASA of the atom, and the occupancy |
|
REMARK 999 by the radius used in the calculation. |
|
MODEL 1 |
|
ATOM 1 N THR A 5 -19.727 29.259 13.573 1.64 9.44 |
|
ATOM 2 CA THR A 5 -19.209 28.356 14.602 1.88 5.01 |
|
ATOM 3 C THR A 5 -18.747 26.968 14.116 1.61 0.40 |
|
... |
|
|
|
The output is a PDB-file where the temperature factors have been |
|
replaced by SASA values (last column), and occupancy numbers by the |
|
radius of each atom (second to last column). |
|
|
|
Only the atoms and models used in the calculation will be present in |
|
the output (see @ref Input for how to modify this). |
|
|
|
@subsection CLI-RES SASA of each residue type |
|
|
|
Calculate the SASA of each residue type: |
|
|
|
$ freesasa --format=res 3wbm.pdb |
|
# Residue types in 3wbm.pdb |
|
RES ALA : 251.57 |
|
RES ARG : 2868.98 |
|
RES ASN : 1218.87 |
|
... |
|
RES A : 1581.57 |
|
RES C : 2967.12 |
|
RES G : 1955.16 |
|
RES U : 1693.68 |
|
|
|
@subsection CLI-SEQ SASA of each residue |
|
|
|
Calculate the SASA of each residue in the sequence: |
|
|
|
$ freesasa --format=seq 3wbm.pdb |
|
# Residues in 3wbm.pdb |
|
SEQ A 5 THR : 138.48 |
|
SEQ A 6 PRO : 25.53 |
|
SEQ A 7 THR : 99.42 |
|
... |
|
|
|
@subsection CLI-RSA RSA |
|
|
|
The CLI can also produce output similar to the RSA format from |
|
NACCESS. This format includes both absolute SASA values (ABS) and |
|
relative ones (REL) compared to a precalculated reference max |
|
value. The only significant difference between FreeSASA's RSA output |
|
format and that of NACCESS (except differences in areas due to |
|
different atomic radii), is that FreeSASA will print the value "N/A" |
|
where NACCESS prints "-99.9". |
|
|
|
$ freesasa --format=rsa 3wbm.pdb |
|
REM FreeSASA 2.0 |
|
REM Absolute and relative SASAs for 3wbm.pdb |
|
REM Atomic radii and reference values for relative SASA: ProtOr |
|
REM Chains: ABCDXY |
|
REM Algorithm: Lee & Richards |
|
REM Probe-radius: 1.40 |
|
REM Slices: 20 |
|
REM RES _ NUM All-atoms Total-Side Main-Chain Non-polar All polar |
|
REM ABS REL ABS REL ABS REL ABS REL ABS REL |
|
RES THR A 5 138.48 104.1 99.58 107.4 38.90 96.3 81.59 98.1 56.89 114.0 |
|
RES PRO A 6 25.53 19.3 11.31 11.0 14.23 47.7 21.67 18.7 3.86 23.9 |
|
... |
|
RES GLY A 15 0.64 0.9 0.00 N/A 0.64 0.9 0.00 0.0 0.64 2.0 |
|
... |
|
RES U Y 23 165.16 N/A 165.16 N/A 0.00 N/A 52.01 N/A 113.15 N/A |
|
RES C Y 24 165.01 N/A 165.01 N/A 0.00 N/A 46.24 N/A 118.77 N/A |
|
RES C Y 25 262.46 N/A 262.46 N/A 0.00 N/A 85.93 N/A 176.52 N/A |
|
END Absolute sums over single chains surface |
|
CHAIN 1 A 3785.5 3062.1 723.3 2051.6 1733.9 |
|
CHAIN 2 B 4342.3 3488.6 853.7 2385.2 1957.1 |
|
CHAIN 3 C 3961.1 3178.5 782.7 2122.4 1838.7 |
|
CHAIN 4 D 4904.3 3926.8 977.5 2572.0 2332.3 |
|
CHAIN 5 X 4156.5 4156.5 0.0 1236.9 2919.6 |
|
CHAIN 6 Y 4041.1 4041.1 0.0 1184.3 2856.8 |
|
END Absolute sums over all chains |
|
TOTAL 25190.8 21853.6 3337.2 11552.4 13638.4 |
|
|
|
Note that each `RES` is a single residue, not a residue type as above |
|
(i.e. has the same meaning as `SEQ` above). This unfortunate confusion |
|
of labels is due to RSA support being added much later than the other |
|
options. Fixing it now would break the interface, and will thus |
|
earliest be dealt with in the next major release. |
|
|
|
@subsubsection RSA-naccess Using the NACCESS configuration |
|
|
|
The reference values for the NACCESS configuration in FreeSASA are not |
|
exactly the same as those that ship with NACCESS, but have been |
|
calculated from scratch using the tripeptides that ship with |
|
FreeSASA. Calling |
|
|
|
$ freesasa 3wbm.pdb --format=rsa --radii=naccess |
|
|
|
will give an RSA file where the ABS columns should be identical to |
|
NACCESS (if the latter is run with the flag `-b`). REL values will |
|
differ slightly, due to the differences in reference values. NACCESS |
|
also gives different results for the nucleic acid main-chain and |
|
side-chain (possibly due to a bug in NACCESS?). FreeSASA defines the |
|
(deoxy)ribose and phosphate groups as main-chain and the base as |
|
side-chain. |
|
|
|
@section CLI-select Selecting groups of atoms |
|
|
|
The option `--select` can be used to define groups of atoms whose |
|
integrated SASA we are interested in. It uses a subset of the Pymol |
|
`select` command syntax, see @ref Selection for full |
|
documentation. The following example shows how to calculate the sum of |
|
exposed surface areas of all aromatic residues and of the four chains |
|
A, B, C and D (just the sum of the areas above). |
|
|
|
$ freesasa --select "aromatic, resn phe+tyr+trp+his+pro" --select "abcd, chain A+B+C+D" 3wbm.pdb |
|
... |
|
SELECTIONS |
|
freesasa: warning: Found no matches to resn 'TRP', typo? |
|
freesasa: warning: Found no matches to resn 'HIS', typo? |
|
aromatic : 1196.45 |
|
abcd : 16993.24 |
|
|
|
The lines shown above are appended to the regular output. This |
|
particular protein did not have any TRP or HIS residues, hence the |
|
warnings (written to stderr). The warnings can be supressed with the |
|
flag `-w`. |
|
|
|
@section Chain-groups Analyzing groups of chains |
|
|
|
Calculating the SASA of a given chain or group of chains separately |
|
from the rest of the structure, can be useful for measuring how buried |
|
a chain is in a given structure. The option `--chain-groups` can be |
|
used to do such a separate calculation, calling |
|
|
|
$ freesasa --chain-groups=ABCD+XY 3wbm.pdb |
|
|
|
produces the regular output for the structure 3WBM, but in addition it |
|
runs a separate calculation for the chains A, B, C and D as though X |
|
and Y aren't in the structure, and vice versa: |
|
|
|
PARAMETERS |
|
algorithm : Lee & Richards |
|
probe-radius : 1.400 |
|
threads : 2 |
|
slices : 20 |
|
|
|
|
|
#################### |
|
|
|
INPUT |
|
source : 3wbm.pdb |
|
chains : ABCDXY |
|
atoms : 3714 |
|
|
|
RESULTS (A^2) |
|
Total : 25190.77 |
|
Apolar : 11552.38 |
|
Polar : 13638.39 |
|
CHAIN A : 3785.49 |
|
CHAIN B : 4342.33 |
|
CHAIN C : 3961.12 |
|
CHAIN D : 4904.30 |
|
CHAIN X : 4156.46 |
|
CHAIN Y : 4041.08 |
|
|
|
|
|
#################### |
|
|
|
INPUT |
|
source : 3wbm.pdb |
|
chains : ABCD |
|
atoms : 2664 |
|
|
|
RESULTS (A^2) |
|
Total : 18202.78 |
|
Apolar : 9799.46 |
|
Polar : 8403.32 |
|
CHAIN A : 4243.12 |
|
CHAIN B : 4595.18 |
|
CHAIN C : 4427.11 |
|
CHAIN D : 4937.38 |
|
|
|
|
|
#################### |
|
|
|
INPUT |
|
source : 3wbm.pdb |
|
chains : XY |
|
atoms : 1050 |
|
|
|
RESULTS (A^2) |
|
Total : 9396.28 |
|
Apolar : 2743.09 |
|
Polar : 6653.19 |
|
CHAIN X : 4714.45 |
|
CHAIN Y : 4681.83 |
|
|
|
@section Input PDB input |
|
|
|
@subsection Hetatom-hydrogen Including extra atoms |
|
|
|
The user can ask to include hydrogen atoms and HETATM entries in the |
|
calculation using the options `--hydrogen` and `--hetatm`. In both |
|
cases adding unknown atoms will emit a warning for each atom. This can |
|
either be amended by using the flag `-w` to suppress warnings, or by |
|
using a custom classifier so that they are recognized (see @ref |
|
Config-file). |
|
|
|
@subsection Halt-skip Skipping unknown atoms |
|
|
|
By default FreeSASA guesses the element of an unknown atom and uses |
|
that elements VdW radius. If this fails the radius is set to 0 (and |
|
hence the atom will not contribute to the calculated area). Users can |
|
request to either skip unknown atoms completely (i.e. no guessing) or |
|
to halt when unknown atoms are found and exit with an error. This is |
|
done with the option `--unknown` which takes one of the three |
|
arguments `skip`, `halt` or `guess` (default). Whenever an unknown |
|
atom is skipped or its radius is guessed a warning is printed to |
|
stderr. |
|
|
|
@subsection Chains-models Separating and joining chains and models |
|
|
|
If a PDB file has several chains and/or models, by default all chains |
|
of the first model are used, and the rest of the file is ignored. This |
|
behavior can be modified using the following options |
|
|
|
- `--join-models`: Joins all models in the input into one large |
|
structure. Useful for biological assembly files were different |
|
locations of the same chain in the oligomer are represented by |
|
different MODEL entries. |
|
|
|
- `--separate-models`: Calculate SASA separately for each model in |
|
the input. Useful when the same file contains several |
|
conformations of the same molecule. |
|
|
|
- `--separate-chains`: Calculate SASA separately for each chain in |
|
the input. Can be joined with `--separate-models` to calculate |
|
SASA of each chain in each model. |
|
|
|
- `--chain-groups`: see @ref Chain-groups |
|
|
|
@page API FreeSASA API |
|
|
|
@section Basic-API Basics |
|
|
|
The API is found in the header [freesasa.h](freesasa_8h.html). The |
|
other source-files and headers in the repository are for internal use, |
|
and are not presented here, but are documented in the source |
|
itself. The file [example.c](example_8c_source.html) contains a simple |
|
program that illustrates how to use the API to read a PDB file from |
|
`stdin` and calculate and print the SASA. |
|
|
|
To calculate the SASA of a structure, there are two main options: |
|
|
|
1. Initialize a structure from a PDB-file, using either the default |
|
classifier or a custom one to determine the radius of each atom, |
|
and then run the calculation. |
|
|
|
2. Provide an array of cartesian coordinates and an array containing |
|
the radii of the corresponding atoms to freesasa\_calc\_coord(). |
|
|
|
@subsection API-PDB Calculate SASA for a PDB file |
|
|
|
The following explains how to use FreeSASA to calculate the SASA of a |
|
fictive PDB file (1abc.pdb). At each step one or more error checks |
|
should have been done, but these are ignored here for brevity. See |
|
the documentation of each function to see what errors can occur. |
|
Default parameters are used at every step, the section @ref |
|
Customizing explains how to configure the calculations. |
|
|
|
@subsubsection API-Read-PDB Open PDB file |
|
|
|
The function freesasa\_structure\_from\_pdb() reads the atom |
|
coordinates from a PDB file and assigns a radius to each atom. The |
|
third argument can be used to pass options for how to read the PDB |
|
file. |
|
|
|
~~~{.c} |
|
FILE *fp = fopen("1abc.pdb"); |
|
const freesasa_classifier *classifier = &freesasa_default_classifier; |
|
freesasa_structure *structure = freesasa_structure_from_pdb(fp, classifier, 0); |
|
~~~ |
|
|
|
@subsubsection API-Calc Perform calculation and get total SASA |
|
|
|
Next we use freesasa\_calc\_structure() to calculate SASA using the |
|
structure we just generated, and then print the total area. The argument |
|
`NULL` means use default freesasa_parameters. |
|
|
|
~~~{.c} |
|
freesasa_result *result = freesasa_calc_structure(structure, NULL); |
|
printf("Total area: %f A2\n",result->total); |
|
~~~ |
|
|
|
@subsubsection API-Classes Get polar and apolar area |
|
|
|
We are commonly interested in the polar and apolar areas of a |
|
molecule, this can be calculated by freesasa\_result\_classes(). To |
|
get other classes of atoms we can either define our own classifier, or |
|
use freesasa\_select\_area() defined in the next section. The return |
|
type ::freesasa\_nodearea is a struct contains the total area and the |
|
area of all apolar and polar atoms, and main-chain and side-chain |
|
atoms. |
|
|
|
~~~{.c} |
|
freesasa_nodearea area = freesasa_result_classes(structure, result); |
|
printf("Total : %f A2\n", area.total); |
|
printf("Apolar : %f A2\n", area.apolar); |
|
printf("Polar : %f A2\n", area.polar); |
|
printf("Main-chain : %f A2\n", area.main_chain); |
|
printf("Side-chain : %f A2\n", area.side_chain); |
|
~~~ |
|
|
|
@see @ref Classification |
|
|
|
@subsubsection API-Select Get area of custom groups of atoms |
|
|
|
Groups of atoms can be defined using freesasa\_selection\_new(), which |
|
takes a selection definition uses a subset of the Pymol select syntax |
|
|
|
~~~{.c} |
|
freesasa_selection *selection = |
|
freesasa_selection_new("aromatic, resn phe+tyr+trp+his+pro", |
|
structure, result); |
|
printf("Area of selection '%s': %f A2\n", |
|
freesasa_selection_name(selection), freesasa_selection_area(selection); |
|
~~~ |
|
|
|
@see @ref Selection |
|
|
|
|
|
@subsubsection structure-node Navigating the results as a tree |
|
|
|
In addition to the flat array of results in ::freesasa\_result, and |
|
the global values returned by freesasa\_result\_classes(), FreeSASA |
|
has an interface for navigating the results as a tree. The leaf nodes |
|
are individual atoms, and there are parent nodes at the residue, |
|
chain, and structure levels. The function freesasa\_calc\_tree() does |
|
a SASA calculation and returns the root node of such a tree. (If one |
|
already has a ::freesasa\_result the function freesasa\_tree\_init() |
|
can be used instead). Each node stores a ::freesasa\_nodearea for the |
|
sum of all atoms belonging to the node. The tree can be traversed with |
|
freesasa\_node\_children(), freesasa\_node\_parent() and |
|
freesasa\_node\_next(), and the area, type and name using |
|
freesasa\_node\_area(), freesasa\_node\_type() and |
|
freesasa\_node\_name(). Additionally there are special properties for |
|
each level of the tree. |
|
|
|
@see node |
|
|
|
@subsubsection export-tree Exporting to RSA, JSON and XML |
|
|
|
The tree structure can also be exported to an RSA, JSON or XML file |
|
using freesasa\_tree\_export(). The RSA format is fixed, but the user |
|
can select which levels of the tree to include in JSON and XML. The |
|
following illustrates how one would generate a tree and export it to |
|
XML, including nodes for the whole structure, chains and residues (but |
|
excluding individual atoms). |
|
|
|
~~~~{.c} |
|
freesasa_node *tree = freesasa_calc_tree(structure, |
|
&freesasa_default_parameters, |
|
&freesasa_default_classifier); |
|
FILE *file = fopen("output.xml", "w"); |
|
freesasa_tree_export(file, tree, FREESASA_XML | FREESASA_OUTPUT_RESIDUE); |
|
fclose(file); |
|
freesasa_node_free(tree); |
|
~~~~ |
|
|
|
|
|
@subsection Coordinates |
|
|
|
If users wish to supply their own coordinates and radii, these are |
|
accepted as arrays of doubles passed to the function |
|
freesasa\_calc\_coord(). The coordinate-array should have size 3*n with |
|
coordinates in the order `x1,y1,z1,x2,y2,z2,...,xn,yn,zn`. |
|
|
|
~~~{.c} |
|
double coord[] = {1.0, /* x */ |
|
2.0, /* y */ |
|
3.0 /* z */ }; |
|
double radius[] = {2.0}; |
|
int n_atoms = 1; |
|
freesasa_result *result = freesasa_calc_coord(coord, radius, n_atoms, NULL); |
|
~~~ |
|
|
|
|
|
@subsection Error-handling |
|
|
|
The principle for error handling is that unpredictable errors should |
|
not cause a crash, but rather allow the user to exit gracefully or |
|
make another attempt. Therefore, errors due to user or system |
|
failures, such as faulty parameters, malformatted config-files, I/O |
|
errors or out of memory errors, are reported through return values, |
|
either ::FREESASA\_FAIL or ::FREESASA\_WARN, or by `NULL` pointers, |
|
depending on the context (see the documentation for the individual |
|
functions). |
|
|
|
Errors that are attributable to programmers using the library, such as |
|
passing null pointers where not allowed, are checked by asserts. |
|
|
|
@subsection Thread-safety |
|
|
|
The only global state the library stores is the verbosity level (set |
|
by freesasa\_set\_verbosity()) and the pointer to the error-log |
|
(defaults to `stderr`, can be changed by freesasa\_set\_err\_out()). |
|
|
|
It should be clear from the documentation when the other functions |
|
have side effects such as memory allocation and I/O, and thread-safety |
|
should generally not be an issue (to the extent that your C library |
|
has threadsafe I/O and dynamic memory allocation). The SASA |
|
calculation itself can be parallelized by using a |
|
::freesasa\_parameters struct with ::freesasa\_parameters.n\_threads |
|
\> 1 (default is 2) where appropriate. This only gives a significant |
|
effect on performance for large proteins or at high precision, and |
|
because not all steps are parallelized it is usually not worth it to |
|
go beyond 2 threads. |
|
|
|
@section Customizing Customizing behavior |
|
|
|
The types ::freesasa\_parameters and ::freesasa\_classifier can be |
|
used to change the parameters of the calculations. Users who wish to |
|
use the defaults can pass `NULL` wherever pointers to these are |
|
requested. |
|
|
|
@subsection Parameters Parameters |
|
|
|
Calculation parameters can be stored in a ::freesasa\_parameters |
|
object. It can be initialized to default by |
|
|
|
~~~{.c} |
|
freesasa_parameters param = freesasa_default_parameters; |
|
~~~ |
|
|
|
The following code would run a high precision Shrake & Rupley |
|
calculation with 10000 test points on the provided structure. |
|
|
|
~~~{.c} |
|
freesasa_parameters param = freesasa_default_parameters; |
|
param.alg = FREESASA_SHRAKE_RUPLEY; |
|
param.shrake_rupley_n_points = 10000; |
|
freesasa_result *result = freesasa_calc_structure(structure, param); |
|
~~~ |
|
|
|
@subsection Classification Specifying atomic radii and classes |
|
|
|
Classifiers are used to determine which atoms are polar or apolar, and |
|
to specify atomic radii. In addition the three standard classifiers |
|
(see below) have reference values for the maximum areas of the 20 |
|
standard amino acids which can be used to calculate relative areas of |
|
residues, as in the RSA output. |
|
|
|
The default classifier is available through the const variable |
|
::freesasa\_default\_classifier. This uses the *ProtOr* radii, defined |
|
in the paper by Tsai et |
|
al. ([JMB 1999, 290: 253](http://www.ncbi.nlm.nih.gov/pubmed/10388571)) |
|
for the standard amino acids (20 regular plus SEC, PYL, ASX and GLX), |
|
for some capping groups (ACE/NH2) and the standard nucleic acids. If |
|
the element can't be determined or is unknown, a zero radius is |
|
assigned. It classes all carbons as *apolar* and all other known atoms |
|
as *polar*. |
|
|
|
Early versions of FreeSASA used the atomic radii by Ooi et |
|
al. ([PNAS 1987, 84: 3086-3090](http://www.ncbi.nlm.nih.gov/pmc/articles/PMC304812/)), |
|
this classifier is still available through ::freesasa_oons_classifier. |
|
|
|
Users can provide their own classifiers through @ref Config-file. At |
|
the moment these do not allow the user to specify reference values to |
|
calculate relative SASA values for RSA output. |
|
|
|
The default behavior of freesasa_structure_from_pdb(), |
|
freesasa_structure_array(), freesasa_structure_add_atom() and |
|
freesasa_structure_add_atom_wopt() is to first try the provided |
|
classifier and then guess the radius if necessary (emitting warnings |
|
if this is done, uses VdW radii defined by [Mantina et al. J Phys Chem |
|
2009, 113:5806](http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3658832/)). |
|
|
|
See the documentation for these functions for what parameters to use |
|
to change the default behavior. |
|
|
|
@page Config-file Classifier configuration files |
|
|
|
The configuration files read by freesasa_classifier_from_file() or the |
|
command-line option `-c` should have two sections: `types:` and |
|
`atoms:`, and optionally the section `name:`. |
|
|
|
The types-section defines what types of atoms are available |
|
(aliphatic, aromatic, hydroxyl, ...), what the radius of that type is |
|
and what class a type belongs to ('polar' or 'apolar', case |
|
insensitive). The types are just shorthands to associate an atom with |
|
a given combination of class and radius. The user is free to define as |
|
many types and classes as necessary. |
|
|
|
The atoms-section consists of triplets of residue-name, atom-name (as |
|
in the corresponding PDB entries) and type. A prototype file would be |
|
|
|
~~~ |
|
name: myclassifier # tag and value must be on the same line (optional) |
|
|
|
types: |
|
C_ALIPHATIC 2.00 apolar |
|
C_AROMATIC 1.75 apolar |
|
N 1.55 polar |
|
|
|
atoms: |
|
ANY N N |
|
ANY CA C_ALIPHATIC |
|
ANY CB C_ALIPHATIC |
|
|
|
ARG CG C_ALIPHATIC |
|
|
|
PRO CB C_AROMATIC # overrides ANY CB |
|
~~~ |
|
|
|
The residue type `ANY` can be used for atoms that are the same in all |
|
or most residues (such as backbone atoms). If there is an exception |
|
for a given amino acid this can be overridden as is shown for `PRO CB` |
|
in the example. |
|
|
|
A few example configurations are available in the directory |
|
[share/](https://github.com/mittinatten/freesasa/tree/master/share). The |
|
configuration-file |
|
[protor.config](https://github.com/mittinatten/freesasa/tree/master/share/protor.config) |
|
is a copy of the default classifier, and can be used to add extra |
|
atoms that need to be classified, while keeping the defaults for the |
|
standard residues (also see the file |
|
[scripts/chemcomp2config.pl](https://github.com/mittinatten/freesasa/tree/master/scripts/) |
|
for instructions on how to generate configurations for new chemical |
|
components semi-automatically). If something common is missing in the |
|
default classifier, [create an |
|
issue](https://github.com/mittinatten/freesasa/issues) on Github so |
|
that it can be added. |
|
|
|
FreeSASA also ships with some configuration-files that mimic other |
|
popular programs, such as |
|
[NACCESS](https://github.com/mittinatten/freesasa/tree/master/share/naccess.config) |
|
and |
|
[DSSP](https://github.com/mittinatten/freesasa/tree/master/share/dssp.config). |
|
|
|
The static classifiers in the API were generated using |
|
[scripts/config2c.pl](https://github.com/mittinatten/freesasa/tree/master/scripts/) |
|
to convert the correspoding configurations in `share` to C code. |
|
|
|
@page Selection Selection syntax |
|
|
|
FreeSASA uses a subset of the Pymol select commands to give users an |
|
easy way of summing up the SASA of groups of atoms. This is done by |
|
the function freesasa\_selection\_new() in the C API, |
|
freesasa.selectArea() in the Python interface and the option |
|
`--select` for the command line tool. All commands are case |
|
insensitive. A basic selection has a selection name, a property |
|
selector and a list of arguments |
|
|
|
<selection-name>, <selector> <list> |
|
|
|
For example |
|
|
|
aromatic, resn phe+tyr+trp+his+pro |
|
|
|
Several selectors can be joined using boolean logic and parentheses, |
|
|
|
<selection-name>, (<s1> <l1>) and not (<s2> <l2> or <s3> <l3>) |
|
|
|
where s1, s2 and s3 are selectors and l1, l2 and l3 are lists. The |
|
operator `and` has precedence over `or`, so the second parentheses is |
|
necessary but not the first, in the example above. The selection name |
|
can include letters, numbers and underscores. The name can't be longer |
|
than ::FREESASA\_MAX\_SELECTION\_NAME characters. |
|
|
|
The following property selectors are supported |
|
|
|
- `resn` Residue names like "ala", "arg", "du", etc |
|
- `resi` Residue index (positive or negative integers) |
|
- `chain` Chain labels (single characters) |
|
- `name` Atom names, such as "ca", "c", "oxt", etc |
|
- `symbol` Element symbols, such as "C", "O", "Se", "Fe", etc. |
|
|
|
A list of residues can be selected using |
|
|
|
resn ala+val+leu+ile+met |
|
|
|
and similarly for the other four selectors. In addition `resi` and |
|
`chain` support ranges |
|
|
|
resi 1-10 (residues 1 to 10) |
|
resi -10 (residues indices < 10) |
|
resi 10- (residues indices > 10) |
|
resi 1-10+20-30+35- (residues 1 to 10, 20 to 30 and above 35) |
|
resi \-20-\-15+\-10-5 (residues -20 to -15 and -10 to 5) |
|
chain A+C-E (chains A and C to E, no open intervals allowed here) |
|
|
|
Combining ranges with plus signs, as in the three last lines, is not |
|
allowed in Pymol but supported by FreeSASA. |
|
|
|
If a selection list contains elements not found in the molecule that |
|
is analyzed, a warning is printed and that part of the list does not |
|
contribute to the selection. Not finding a list element can be because |
|
it specifies a residue that does not exist in the particular molecule, |
|
or because of typos. The selector does not keep a list of valid |
|
elements, residue names, etc. |
|
|
|
@page Python Python interface |
|
|
|
If Python is enabled using |
|
|
|
$ ./configure --enable-python-bindings |
|
|
|
Cython is used to build Python bindings for FreeSASA, and `make |
|
install` will install them. The option `--with-python=...` can be |
|
specified to specify which Python binary to use. |
|
|
|
Below follow some illustrations of how to use the package, see the |
|
@ref freesasa "package documentation" for details. |
|
|
|
@section Python-basics Basic calculations |
|
|
|
Using defaults everywhere a simple calculation can be carried out as |
|
follows (assuming the PDB structure 1UBQ is available) |
|
|
|
~~~{.py} |
|
import freesasa |
|
|
|
structure = freesasa.Structure("1ubq.pdb") |
|
result = freesasa.calc(structure) |
|
area_classes = freesasa.classifyResults(result, structure) |
|
|
|
print "Total : %.2f A2" % result.totalArea() |
|
for key in area_classes: |
|
print key, ": %.2f A2" % area_classes[key] |
|
~~~ |
|
|
|
Which would give the following output |
|
|
|
Total : 4804.06 A2 |
|
Polar : 2504.22 A2 |
|
Apolar : 2299.84 A2 |
|
|
|
The following does a high precision L&R calculation |
|
|
|
~~~{.py} |
|
result = freesasa.calc(structure, |
|
freesasa.Parameters({'algorithm' : freesasa.LeeRichards, |
|
'n-slices' : 100})) |
|
~~~ |
|
|
|
Using the results from a calculation we can also integrate SASA over a selection of |
|
atoms, using a subset of the Pymol selection syntax (see @ref Selection): |
|
|
|
~~~{.py} |
|
selections = freesasa.selectArea(('alanine, resn ala', 'r1_10, resi 1-10'), |
|
structure, result) |
|
for key in selections: |
|
print key, ": %.2f A2" % selections[key] |
|
~~~ |
|
which gives the output |
|
|
|
alanine : 120.08 A2 |
|
r1_10 : 634.31 A2 |
|
|
|
@section Python-classification Customizing atom classification |
|
|
|
This uses the NACCESS parameters (the file 'naccess.config' is |
|
available in the share/ directory of the repository). |
|
|
|
~~~{.py} |
|
classifier = freesasa.Classifier("naccess.config") |
|
structure = freesasa.Structure("1ubq.pdb", classifier) |
|
result = freesasa.calc(structure) |
|
area_classes = freesasa.classifyResults(result, structure, classifier) |
|
~~~ |
|
|
|
Classification can be customized also by extending the Classifier |
|
interface. The code below is an illustration of a classifier that |
|
classes Nitrogens separately, and assigns radii based on element only |
|
(and crudely). |
|
|
|
~~~{.py} |
|
import freesasa |
|
import re |
|
|
|
class DerivedClassifier(Classifier): |
|
def classify(self, residueName, atomName): |
|
if re.match('\s*N', atomName): |
|
return 'Nitrogen' |
|
return 'Not-nitrogen' |
|
|
|
def radius(self, residueName, atomName): |
|
if re.match('\s*N',atomName): # Nitrogen |
|
return 1.6 |
|
if re.match('\s*C',atomName): # Carbon |
|
return 1.7 |
|
if re.match('\s*O',atomName): # Oxygen |
|
return 1.4 |
|
if re.match('\s*S',atomName): # Sulfur |
|
return 1.8 |
|
return 0; # everything else (Hydrogen, etc) |
|
|
|
classifier = DerivedClassifier() |
|
|
|
# use the DerivedClassifier to calculate atom radii |
|
structure = freesasa.Structure("1ubq.pdb", classifier) |
|
result = freesasa.calc(structure) |
|
|
|
# use the DerivedClassifier to classify atoms |
|
area_classes = freesasa.classifyResults(result,structure,classifier) |
|
~~~ |
|
|
|
Of course, this example is somewhat contrived, if we only want the |
|
integrated area of Nitrogen atoms, the simpler choice would be |
|
~~~{.py} |
|
selection = freesasa.selectArea('nitrogen, symbol n', structure, result) |
|
~~~ |
|
|
|
However, extending freesasa.Classifier, as illustrated above, allows |
|
classification to arbitrary complexity and also lets us redefine the |
|
radii used in the calculation. |
|
|
|
@section BioPDB Bio.PDB |
|
|
|
FreeSASA can also calculate the SASA of a Bio.PDB structure |
|
|
|
~~~{.py} |
|
from Bio.PDB import PDBParser |
|
parser = PDBParser() |
|
structure = parser.get_structure("Ubiquitin", "1ubq.pdb") |
|
result, sasa_classes = freesasa.calcBioPDB(structure) |
|
~~~ |
|
|
|
If one needs more control over the analysis the structure can be |
|
converted to a freesasa.Structure using freesasa.structureFromBioPDB() |
|
and the calculation can be performed the normal way using this |
|
structure. |
|
|
|
@page Geometry Geometry of Lee & Richards' algorithm |
|
|
|
This page explains the geometry of the calculations in L&R |
|
and can be used to understand the source code. As far as possible the |
|
code uses similar notation to the formulas here. |
|
|
|
We will use the following notation: An atom \f$i\f$ has a van der |
|
Waals radius \f$r_i\f$, the rolling sphere (or *probe*) has radius |
|
\f$r_\text{P}\f$ and when these are added we get an extended radius |
|
\f$R_i = r_i + r_\text{P}\f$. The sphere of radius \f$R_i\f$ centered |
|
at the position of atom \f$i\f$ represents the volume not accessible |
|
to the center of the probe. The SASA for a molecule is then obtained |
|
by calculating the non-buried surface area of the extended spheres. |
|
|
|
The L&R algorithm calculates the surface area by slicing the |
|
protein, calculating the length of the solvent exposed contours in |
|
each slice and then adding up the length multiplied by the slice |
|
thickness. |
|
|
|
 |
|
|
|
Divide atom \f$i\f$ into \f$n\f$ slices, orthogonal to an arbitrary |
|
axis, of thickness \f$\delta = 2R_i/n\f$. The position of the middle |
|
of the slice along that axis is denoted \f$z\f$, and the center of |
|
atom \f$i\f$, along the same axis, is at \f$z_i\f$. In each slice, the |
|
atom is thus a circle of radius \f[R_i^\prime = |
|
\sqrt{R_i^2-(z-z_i)^2}\,.\f] These circles are either completely |
|
buried inside neighboring atoms, completely exposed, or partially |
|
exposed. |
|
|
|
 |
|
|
|
The exposed arc lengths for each atom can be calculated exactly. For |
|
each pair of atoms \f$i,j\f$, the distance between their centers |
|
projected on the slice is \f$d_{ij}\f$ (independent of \f$z\f$). If |
|
\f$d_{ij} > R_i^\prime + R_j^\prime\f$, there is no overlap. If |
|
\f$d_{ij} < R_j^\prime - R_i^\prime\f$ circle \f$i\f$ is completely |
|
inside \f$j\f$ (and the other way around). If \f$d_{ij}\f$ lies |
|
between these two cases the angle of circle \f$i\f$ that is buried due |
|
to circle \f$j\f$ is |
|
|
|
\f[ \alpha = 2\arccos \bigl[({R_i^\prime}^2_{\,} |
|
+ d_{ij}^2 - {R_{j}^\prime}^2_{\,})/(2R_i^\prime d_{ij})\bigr]. \f] |
|
|
|
If the middle point of this arc on the circle is at an angle |
|
\f$\beta\f$, the arc spans the interval |
|
\f$[\beta-\alpha/2,\beta+\alpha/2]\f$. By adding up these arcs and |
|
taking into account any overlap between them we get the total buried |
|
angle \f$\gamma\f$ in this slices. The exposed arc angle for this atom |
|
and slice is thus \f$2\pi-\gamma\f$ and the total SASA of that atom |
|
|
|
\f[ A_i =R_i \delta \!\! \sum_{s\in\text{slices}} \!\! |
|
\left[2\pi-\gamma_s\right]\,. \f] |
|
|
|
The angle is multiplied by \f$R_i\f$ (not \f$R_i^\prime\f$) to give |
|
the area of a conical frustum circumscribing the sphere at the |
|
slice. Finally, the total area \f$A\f$ is the sum of all \f$A_i\f$. |
|
|
|
In FreeSASA, the L\&R SASA calculation begins by finding overlapping |
|
spheres and storing the contacts in an adjacency list. It then |
|
iterates through all the slices of each atom and checks for overlap |
|
with adjacent atoms in each slice, and adds up the exposed arcs to |
|
calculate the atom's contribution to the SASA of the slice. The |
|
calculations for each atom are completely independent and can thus be |
|
parallelized over an arbitrary number of threads, whereas the |
|
calculation of adjacency lists has not been parallelized. |
|
|