Molecular Structures¶
Molecular structures are represented by the pv.mol.Mol() class. While nothing restricts the type of molecules stored in an instance of pv.mol.Mol(), the data structure is optimized for biological macromolecules and follows the same hierarchical organizing principle. The top level of the hierarchy is formed by chains. Each chain consists of one or more residues. Depending on the type of residues it holds, a chain is interpreted either as a linear chain of residues, e.g. a polypeptide or polynucleotide, or as an unordered collection of molecules such as water. In the former case, residues are ordered from N to C terminus, whereas in the latter the ordering of the molecules does not carry any meaning. Each residue consists of one or more atoms.
Tightly coupled to pv.mol.Mol() is the concept of a structural subset, a pv.mol.MolView(). MolViews have the same interface as pv.mol.Mol() and in most cases behave exactly the same. Thus, from a user perspective, it mostly does not matter whether one is working with a complete structure or a subset thereof. In the following, the APIs for the pv.mol.Mol() and pv.mol.MolView() classes are described together. Where differences exist, they are documented.
Obtaining and Creating Molecular Structures¶
The most common way to construct molecules is through one of the io functions. For example, to import the structure from a PDB file, use pv.io.pdb(). The whole structure, or a subset thereof, can then be displayed on the screen by using one of the rendering functions.
The following code example fetches a PDB file from PDB.org, imports it, and displays the chain with name ‘A’ on the screen. For more details on how to create subsets, see Creating Subsets of a Molecular Structure.
// pdbId (a PDB identifier string) and viewer (a PV viewer instance) are
// assumed to be defined elsewhere; $ is jQuery.
$.ajax('http://pdb.org/pdb/files/' + pdbId + '.pdb')
  .done(function(data) {
    // data contains the contents of the PDB file in text form
    var structure = pv.io.pdb(data);
    var firstChain = structure.select({chain: 'A'});
    viewer.cartoon('firstChain', firstChain);
  });
Alternatively, you can create the structure by hand. That’s typically not required, unless you are implementing your own importer for a custom format. The following code creates a simple molecule consisting of 10 atoms arranged along the x-axis.
var structure = new pv.mol.Mol();
var chain = structure.addChain('A');
for (var i = 0; i < 10; ++i) {
  var residue = chain.addResidue('ABC', i);
  residue.addAtom('X', [i, 0, 0], 'C');
}
Creating Subsets of a Molecular Structure¶
It is quite common to apply operations (coloring, displaying) to only a subset of a molecular structure. These subsets are modelled as views and can be created in different ways.
- The most convenient way to create views is by using pv.mol.Mol.select(). Select accepts a set of predicates and returns a view containing only chains, residues and atoms that match the predicates.
- Alternatively, for more complex selections, one can use pv.mol.Mol.residueSelect() or pv.mol.Mol.atomSelect(), which evaluate a function on each residue/atom and include the residues/atoms for which the function returns true.
- Selection by distance allows selecting parts of a molecule that are within a certain radius of another molecule.
- Views can be assembled manually through pv.mol.MolView.addChain(), pv.mol.ChainView.addResidue() and pv.mol.ResidueView.addAtom(). This is the most flexible but also the most verbose way of creating views.
Loading Molecular Structures¶
The following functions import structures from different data formats.
- pv.io.pdb(pdbData[, options])¶
  Loads a structure from the pdbData string and returns it. In case multiple models are present (as designated by MODEL/ENDMDL), only the first is read. This behavior can be changed by passing loadAllModels : true in the options dictionary. In that case, all models present in the string are loaded and returned as an array. Secondary structure and assembly information is assigned to all of the models.
  The following record types are handled:
  - ATOM/HETATM for the actual coordinate data. Alternative atom locations other than those labelled as A are discarded.
  - HELIX/SHEET for assignment of secondary structure information (helices and strands).
  - REMARK 350 for handling of biological assemblies.
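To illustrate how the ATOM/HETATM handling described above might work, here is a minimal sketch of a fixed-column line parser. The function name parseAtomLine and the layout of the returned object are hypothetical and not part of PV. The column positions follow the PDB file format, and records with an alternative location other than blank or A are skipped, as described above.

```javascript
// Hypothetical helper, not part of the PV API: parses one fixed-column
// ATOM/HETATM line of a PDB file into a plain object.
function parseAtomLine(line) {
  var record = line.substring(0, 6);
  if (record !== 'ATOM  ' && record !== 'HETATM') {
    return null; // not a coordinate record
  }
  var altLoc = line[16];
  if (altLoc !== ' ' && altLoc !== 'A') {
    return null; // discard alternative locations other than A
  }
  return {
    isHetatm: record === 'HETATM',
    atomName: line.substring(12, 16).trim(),
    residueName: line.substring(17, 20).trim(),
    chainName: line[21],
    residueNumber: parseInt(line.substring(22, 26), 10),
    insCode: line[26],
    pos: [
      parseFloat(line.substring(30, 38)),
      parseFloat(line.substring(38, 46)),
      parseFloat(line.substring(46, 54))
    ]
  };
}
```

A real importer would feed such objects into calls like addChain, addResidue and addAtom while walking through the file line by line.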
- pv.io.sdf(sdfData)¶
  Loads small molecules from sdfData and returns them. In case multiple molecules are present, these molecules are returned as separate chains of the same pv.mol.Mol() instance.
  Currently, only a minimal set of information is extracted from SDF files:
  - atom position, element, atom name (set to the element)
  - connectivity information
  - the chain name is set to the structure title
- pv.io.fetchPdb(url, callback[, options])¶
- pv.io.fetchSdf(url, callback)¶
  Performs an AJAX request to the provided URL and loads the data as a structure using either pv.io.pdb() or pv.io.sdf(). Upon success, the callback is invoked with the loaded structure as the only argument. For fetchPdb, options is passed as-is to pv.io.pdb().
Mol (and MolView)¶
- class pv.mol.Mol()¶
  Represents a complete molecular structure which may consist of multiple polypeptide chains, solvent and other molecules. Instances of Mol are typically created through one of the io functions, e.g. pv.io.pdb() or pv.io.sdf().
- class pv.mol.MolView()¶
  Represents a subset of a molecular structure, e.g. the result of a selection operation. Except for a few differences, its API is identical to that of pv.mol.Mol().
- pv.mol.Mol.eachAtom(callback)¶
- pv.mol.MolView.eachAtom(callback)¶
  Invokes callback for each atom in the structure or view. For example, the following code counts the carbon alpha atoms.
    var carbonAlphaCount = 0;
    myStructure.eachAtom(function(atom) {
      if (atom.name() !== 'CA') return;
      if (!atom.residue().isAminoacid()) return;
      carbonAlphaCount += 1;
    });
    console.log('number of carbon alpha atoms', carbonAlphaCount);
- pv.mol.Mol.eachResidue(callback)¶
- pv.mol.MolView.eachResidue(callback)¶
  Invokes callback for each residue in the structure or view.
- pv.mol.Mol.full()¶
- pv.mol.MolView.full()¶
  Convenience function that always links back to pv.mol.Mol(). For instances of pv.mol.Mol(), returns this directly; for instances of pv.mol.MolView(), returns a reference to the pv.mol.Mol() the subset was derived from.
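The relationship between a structure and its views can be pictured with two small stand-in classes. SketchMol and SketchMolView are hypothetical names used only for illustration. They mimic the documented behavior of full() and are not PV's implementation.

```javascript
// Stand-in classes for illustration only, not the PV implementation.
function SketchMol() {}
// A complete structure is its own "full" structure.
SketchMol.prototype.full = function() { return this; };

function SketchMolView(mol) {
  // Always store the complete structure, even when derived from another view.
  this._mol = mol.full();
}
SketchMolView.prototype.full = function() { return this._mol; };

var mol = new SketchMol();
var view = new SketchMolView(mol);
var viewOfView = new SketchMolView(view);
console.log(view.full() === mol, viewOfView.full() === mol);
```

Because both classes answer full(), calling code can get back to the complete structure without checking which of the two it was handed.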
- pv.mol.Mol.atomCount()¶
- pv.mol.MolView.atomCount()¶
  Returns the number of atoms in the structure or view.
- pv.mol.Mol.center()¶
- pv.mol.MolView.center()¶
  Returns the geometric center of all atoms in the structure or view.
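As a rough sketch of what a geometric center is, assuming it means the unweighted mean of all atom positions, the calculation can be written for plain [x, y, z] arrays as follows. The helper geometricCenter is hypothetical and not part of PV.

```javascript
// Hypothetical helper: arithmetic mean of a list of [x, y, z] positions.
function geometricCenter(positions) {
  var sum = [0, 0, 0];
  positions.forEach(function(pos) {
    sum[0] += pos[0];
    sum[1] += pos[1];
    sum[2] += pos[2];
  });
  var n = positions.length;
  return n === 0 ? sum : [sum[0] / n, sum[1] / n, sum[2] / n];
}

// ten atoms along the x-axis, as in the hand-built structure example
var positions = [];
for (var i = 0; i < 10; ++i) {
  positions.push([i, 0, 0]);
}
var center = geometricCenter(positions); // [4.5, 0, 0]
```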
- pv.mol.Mol.chains()¶
- pv.mol.MolView.chains()¶
  Returns an array of all chains in the structure. For pv.mol.Mol(), this returns a list of pv.mol.Chain() instances, for pv.mol.MolView() a list of pv.mol.ChainView() instances.
- pv.mol.Mol.select(what)¶
- pv.mol.MolView.select(what)¶
  Returns a pv.mol.MolView() containing a filtered subset of chains, residues and atoms. what determines how the filtered subset is created. It can be set to a predefined string for commonly required selections, or to a dictionary of predicates that have to match for a chain, residue or atom to be included in the result. Currently, the following predefined selections are accepted:
  - water: selects residues with names HOH and DOD (deuterated water).
  - protein: returns all amino acids found in the structure. Note that this might return amino acid ligands as well.
  - ligand: selects all residues which are neither water nor protein.
  Matching by predicate dictionary provides a flexible way to specify selections without having to write custom callbacks. A predicate is a condition which has to be fulfilled in order to include a chain, residue or atom in the results. Some of the predicates match against chains, e.g. cname, others against residues, e.g. rname, and others against atoms, e.g. ele. When multiple predicates are specified in the dictionary, all of them have to match for an item to be included in the results.
  Available Chain Predicates:
  - cname/chain: A chain is included if and only if its name is equal to cname/chain. To match against multiple chain names, use the plural forms cnames/chains.
  Available Residue Predicates:
  - rname: A residue is included if and only if its name is equal to rname. To match against multiple residue names, use the plural form rnames.
  - rindexRange: includes residues whose position in the chain lies between rindexRange[0] and rindexRange[1]. The residue at rindexRange[1] is also included. Indices are zero-based.
  - rindices: includes residues at certain positions in the chain. Indices are zero-based.
  - rnum: includes residues having the provided residue number value. Only the numeric part is honored, insertion codes are ignored. To match against multiple residue numbers, use the plural form rnums.
  - rnumRange: includes residues with numbers between rnumRange[0] and rnumRange[1]. The residue with number rnumRange[1] is also included.
  Available Atom Predicates:
  - aname: An atom is included if and only if its name is equal to aname. To match against multiple atom names, use the plural form anames.
  - hetatm: An atom is included if and only if its hetatm flag matches the provided value.
  Examples:
    // select chain with name 'A' and all its residues and atoms
    var chainA = myStructure.select({cname : 'A'});
    // select carbon alpha of chain 'A'. Residues with no carbon alpha will not be
    // included in the result.
    var chainACarbonAlpha = myStructure.select({cname : 'A', aname : 'CA'});
  When none of the above selection mechanisms is flexible enough, consider using pv.mol.Mol.residueSelect() or pv.mol.Mol.atomSelect().
  Returns: a pv.mol.MolView() containing the subset of chains, residues and atoms.
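The rule that all predicates in the dictionary have to match can be sketched with plain objects. Everything below (selectSketch, the matches* helpers and the plain-object structure layout) is hypothetical and only covers a few of the predicates. It is meant to show the matching logic, not how PV implements select.

```javascript
// Hypothetical matchers, not PV's implementation. Each function returns true
// only when every predicate that applies to its level matches.
function matchesChain(what, chain) {
  if (what.cname !== undefined && chain.name !== what.cname) return false;
  if (what.cnames !== undefined && what.cnames.indexOf(chain.name) === -1) return false;
  return true;
}
function matchesResidue(what, residue) {
  if (what.rname !== undefined && residue.name !== what.rname) return false;
  if (what.rnumRange !== undefined &&
      (residue.num < what.rnumRange[0] || residue.num > what.rnumRange[1])) {
    return false;
  }
  return true;
}
function matchesAtom(what, atom) {
  if (what.aname !== undefined && atom.name !== what.aname) return false;
  if (what.hetatm !== undefined && atom.hetatm !== what.hetatm) return false;
  return true;
}

// Filters a structure of the form
// {chains: [{name, residues: [{name, num, atoms: [{name, hetatm}]}]}]}
// and returns labels of the matching atoms.
function selectSketch(structure, what) {
  var result = [];
  structure.chains.forEach(function(chain) {
    if (!matchesChain(what, chain)) return;
    chain.residues.forEach(function(residue) {
      if (!matchesResidue(what, residue)) return;
      residue.atoms.forEach(function(atom) {
        if (matchesAtom(what, atom)) {
          result.push(chain.name + '.' + residue.name + residue.num + '.' + atom.name);
        }
      });
    });
  });
  return result;
}
```

A predicate that is absent from the dictionary places no restriction on its level, which is why an empty dictionary selects everything in this sketch.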
- pv.mol.Mol.selectWithin(structure[, options])¶
- pv.mol.MolView.selectWithin(structure[, options])¶
  Returns an instance of pv.mol.MolView() containing chains, residues and atoms which are in spatial proximity to structure.
  Arguments:
  - structure – pv.mol.Mol() or pv.mol.MolView() to which proximity is required.
  - options – An optional dictionary of options to control the behavior of selectWithin (see below).
  Options:
  - radius sets the distance cutoff in Angstrom. The default radius is 4.
  - matchResidues controls whether to use residue matching mode. When set to true, all atoms of a residue are included in the result as soon as one atom is in proximity.
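The effect of the radius and matchResidues options can be sketched with a brute-force distance check on plain objects. selectWithinSketch and its data layout are hypothetical, and treating the cutoff as inclusive is an assumption. A real implementation would likely use a spatial lookup structure rather than comparing every pair of atoms.

```javascript
// Hypothetical brute-force sketch, not PV's implementation. Residues are
// plain objects of the form {name, atoms: [{name, pos: [x, y, z]}]}.
function sqrDist(a, b) {
  var dx = a[0] - b[0], dy = a[1] - b[1], dz = a[2] - b[2];
  return dx * dx + dy * dy + dz * dz;
}

function selectWithinSketch(residues, referencePositions, options) {
  options = options || {};
  var radius = options.radius !== undefined ? options.radius : 4.0;
  var matchResidues = !!options.matchResidues;
  var sqrRadius = radius * radius;
  var isClose = function(atom) {
    return referencePositions.some(function(pos) {
      return sqrDist(atom.pos, pos) <= sqrRadius;
    });
  };
  var selected = [];
  residues.forEach(function(residue) {
    var closeAtoms = residue.atoms.filter(isClose);
    if (closeAtoms.length === 0) return;
    selected.push({
      name: residue.name,
      // in residue matching mode, one close atom pulls in the whole residue
      atoms: matchResidues ? residue.atoms.slice() : closeAtoms
    });
  });
  return selected;
}
```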
- pv.mol.Mol.residueSelect(predicate)¶
- pv.mol.MolView.residueSelect(predicate)¶
  Returns an instance of pv.mol.MolView() only containing residues which match the predicate function. The predicate must be a function which accepts a residue as its only argument and returns true for residues to be included. For all other residues, the predicate must return false. All atoms of matching residues will be included in the view.
  Example:
    var oddResidues = structure.residueSelect(function(res) {
      return res.index() % 2 === 1;
    });
- pv.mol.Mol.atomSelect(predicate)¶
- pv.mol.MolView.atomSelect(predicate)¶
  Returns an instance of pv.mol.MolView() only containing atoms which match the predicate function. The predicate must be a function which accepts an atom as its only argument and returns true for atoms to be included. For all other atoms, the predicate must return false.
  Example:
    var carbonAlphas = structure.atomSelect(function(atom) {
      return atom.name() === 'CA';
    });
- pv.mol.Mol.addChain(name)¶
  Adds a new chain with the given name to the structure.
  Arguments:
  - name – the name of the chain
  Returns: the newly created pv.mol.Chain() instance
- pv.mol.MolView.addChain(chain, includeAllResiduesAndAtoms)¶
  Adds the given chain to the structure view.
  Arguments:
  - chain – the chain to add. Must either be a pv.mol.ChainView() or a pv.mol.Chain() instance.
  - includeAllResiduesAndAtoms – when true, residues and atoms contained in the chain are directly added as new pv.mol.ResidueView() and pv.mol.AtomView() instances. When set to false (the default), the new chain view is created with an empty list of residues.
  Returns: the newly created pv.mol.ChainView() instance
- pv.mol.Mol.chain(name)¶
- pv.mol.MolView.chain(name)¶
  Alias for pv.mol.Mol.chainByName().
Chain (and ChainView)¶
- class pv.mol.Chain()¶
  Represents either a linear chain of molecules, e.g. as in peptides, or an unordered collection of molecules such as water. New instances are created by calling pv.mol.Mol.addChain().
- class pv.mol.ChainView()¶
  Represents a subset of a chain, that is, a selected subset of residues and atoms. New instances are created and added to an existing pv.mol.MolView() instance by calling pv.mol.MolView.addChain().
- pv.mol.Chain.name()¶
- pv.mol.ChainView.name()¶
  The name of the chain. For chains loaded from PDB, the chain names are alphanumeric and no longer than one character.
- pv.mol.Chain.residues()¶
- pv.mol.ChainView.residues()¶
  Returns the list of residues contained in this chain. For pv.mol.Chain() instances, returns an array of pv.mol.Residue() instances, for pv.mol.ChainView() instances returns an array of pv.mol.ResidueView() instances.
- pv.mol.Chain.eachBackboneTrace(callback)¶
- pv.mol.ChainView.eachBackboneTrace(callback)¶
  Invokes callback for each stretch of consecutive amino acids found in the chain. Each trace contains at least two amino acids. Two amino acids are consecutive when their backbones are complete and the carboxyl carbon (C) of the first and the backbone nitrogen (N) of the second could potentially form a peptide bond.
  Arguments:
  - callback – a function which accepts the array of trace residues as an argument
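The splitting of a chain into backbone traces can be sketched as follows. The function names, the plain-object residue layout and the 1.6 Angstrom cutoff for a potential peptide bond are all assumptions made for illustration. The criterion PV actually uses may differ.

```javascript
// Hypothetical sketch, not PV's implementation. Residues are plain objects
// of the form {atoms: {N: [x, y, z], CA: [...], C: [...], O: [...]}}.
var MAX_PEPTIDE_BOND_LENGTH = 1.6; // assumed cutoff in Angstrom

function hasCompleteBackbone(residue) {
  return ['N', 'CA', 'C', 'O'].every(function(name) {
    return residue.atoms[name] !== undefined;
  });
}

function areConsecutive(prev, next) {
  if (!hasCompleteBackbone(prev) || !hasCompleteBackbone(next)) return false;
  var c = prev.atoms.C, n = next.atoms.N;
  var dx = c[0] - n[0], dy = c[1] - n[1], dz = c[2] - n[2];
  return Math.sqrt(dx * dx + dy * dy + dz * dz) <= MAX_PEPTIDE_BOND_LENGTH;
}

function backboneTracesSketch(residues) {
  var traces = [];
  var current = [];
  var flush = function() {
    // a trace needs at least two amino acids
    if (current.length > 1) traces.push(current);
    current = [];
  };
  residues.forEach(function(residue) {
    if (current.length > 0 &&
        !areConsecutive(current[current.length - 1], residue)) {
      flush();
    }
    if (hasCompleteBackbone(residue)) current.push(residue);
  });
  flush();
  return traces;
}
```

Isolated amino acids and residues with an incomplete backbone end the current trace and never start a trace on their own.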
- pv.mol.Chain.backboneTraces()¶
- pv.mol.ChainView.backboneTraces()¶
  Convenience function which returns all backbone traces of the chain as a list. See pv.mol.Chain.eachBackboneTrace().
- pv.mol.Chain.addResidue(name, number[, insCode])¶
  Appends a new residue at the end of the chain.
  Arguments:
  - name – the name of the residue, for example ‘GLY’ for glycine.
  - number – the numeric part of the residue number
  - insCode – the insertion code character. Defaults to ‘\0’.
  Returns: the newly created pv.mol.Residue() instance
- pv.mol.Chain.residueByRnum(rnum)¶
- pv.mol.ChainView.residueByRnum(rnum)¶
  Returns the first residue in the chain with the given numeric residue number. Insertion codes are ignored. In case no residue has the given residue number, null is returned. This function internally uses a binary search when the residue numbers of the chain are ordered, and falls back to a linear search in case the residue numbers are unordered.
  Returns: if found, the residue instance, and null if no such residue exists.
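The combination of binary search and linear fallback described above might look like the sketch below. residueByRnumSketch and isOrderedByNum are hypothetical helpers that work on plain {name, num} objects. For simplicity the ordering is re-checked on every call, whereas a real implementation would presumably keep track of it while residues are added.

```javascript
// Hypothetical sketch, not PV's implementation. Residues are plain objects
// of the form {name, num}.
function isOrderedByNum(residues) {
  for (var i = 1; i < residues.length; ++i) {
    if (residues[i - 1].num > residues[i].num) return false;
  }
  return true;
}

function residueByRnumSketch(residues, rnum) {
  if (!isOrderedByNum(residues)) {
    // unordered residue numbers: fall back to a linear scan
    for (var i = 0; i < residues.length; ++i) {
      if (residues[i].num === rnum) return residues[i];
    }
    return null;
  }
  // ordered residue numbers: a lower-bound binary search finds the first match
  var low = 0, high = residues.length;
  while (low < high) {
    var mid = (low + high) >> 1;
    if (residues[mid].num < rnum) {
      low = mid + 1;
    } else {
      high = mid;
    }
  }
  return low < residues.length && residues[low].num === rnum ?
      residues[low] : null;
}
```

The lower-bound variant matters when several residues share a number and differ only in their insertion code, because it returns the first of them.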
- pv.mol.Chain.residuesInRnumRange(start, end)¶
- pv.mol.ChainView.residuesInRnumRange(start, end)¶
  Returns the list of residues that have a residue number in the range from start to end. Insertion codes are ignored. This function internally uses a binary search to quickly determine the residues included in the range when the residue numbers in the chain are ordered, and falls back to a linear search in case the residue numbers are unordered.
  Example:
    // will contain residues with numbers from 5 to 10.
    var residues = structure.chain('A').residuesInRnumRange(5, 10);
- pv.mol.ChainView.addResidue(residue, includeAllAtoms)¶
  Adds the given residue to the chain view.
  Arguments:
  - residue – the residue to add. Must either be a pv.mol.ResidueView() or a pv.mol.Residue() instance.
  - includeAllAtoms – when true, all atoms of the residue are directly added as new AtomViews to the residue. When set to false (the default), a new residue view is created with an empty list of atoms.
  Returns: the newly created pv.mol.ResidueView() instance
Residue (and ResidueView)¶
- class pv.mol.Residue()¶
  Represents a residue, i.e. a logical unit of atoms, such as an amino acid, a nucleotide, or a sugar. New residues are created and added to an existing pv.mol.Chain() instance by calling pv.mol.Chain.addResidue().
- class pv.mol.ResidueView()¶
  Represents a subset of a residue, i.e. a subset of the atoms the residue contains. New residue views are created and added to an existing pv.mol.ChainView() by calling pv.mol.ChainView.addResidue().
- pv.mol.Residue.name()¶
- pv.mol.ResidueView.name()¶
  Returns the three-letter code of the residue, e.g. GLY for glycine.
- pv.mol.Residue.isWater()¶
- pv.mol.ResidueView.isWater()¶
  Returns true when the residue is a water molecule. Water molecules are recognized by having a residue name of HOH or DOD (deuterated water).
- pv.mol.Residue.isAminoAcid()¶
- pv.mol.ResidueView.isAminoAcid()¶
  Returns true when the residue is an amino acid. Residues which have the four backbone atoms N, CA, C, and O are considered amino acids, all others are not.
- pv.mol.Residue.num()¶
- pv.mol.ResidueView.num()¶
  Returns the numeric part of the residue number, ignoring the insertion code.
- pv.mol.Residue.atoms()¶
- pv.mol.ResidueView.atoms()¶
  Returns the list of atoms of this residue. For pv.mol.Residue(), returns an array of pv.mol.Atom() instances, for pv.mol.ResidueView(), returns an array of pv.mol.AtomView() instances.
- pv.mol.Residue.atom(nameOrIndex)¶
- pv.mol.ResidueView.atom(nameOrIndex)¶
  Gets a particular atom from this residue. nameOrIndex can either be an integer, in which case the atom at that index is returned, or a string, in which case an atom with that name is searched for and returned.
  Returns: For pv.mol.Residue(), a pv.mol.Atom() instance, for pv.mol.ResidueView(), a pv.mol.AtomView() instance. If no matching atom could be found, null is returned.
- pv.mol.Residue.addAtom(name, pos, element)¶
  Adds a new atom to the residue.
  Arguments:
  - name – the name of the atom, for example CA for carbon-alpha
  - pos – the atom position
  - element – the atom element string, e.g. ‘C’ for carbon, ‘N’ for nitrogen
  Returns: the newly created pv.mol.Atom() instance
- pv.mol.ResidueView.addAtom(atom)¶
  Adds the given atom to the residue view.
  Returns: the newly created pv.mol.AtomView() instance
Atom (and AtomView)¶
- class pv.mol.Atom()¶
  Stores properties of an atom such as its position, name and element. Atoms always have a parent residue. New atoms are created by adding them to an existing residue through pv.mol.Residue.addAtom().
- class pv.mol.AtomView()¶
  Represents a selected atom as part of a view. New atom views are created by adding them to an existing pv.mol.ResidueView() through pv.mol.ResidueView.addAtom().
- pv.mol.Atom.element()¶
- pv.mol.AtomView.element()¶
  The element of the atom. When loading structures from PDB, the atom element is taken as is from the element column if it is not empty. In case of an empty element column, the element is guessed from the atom name.
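The text above does not say how the element is guessed from the atom name. One plausible heuristic, shown here purely as an assumption, relies on the PDB convention that one-letter element symbols start in the second column of the four-character atom name field while two-letter symbols start in the first. The function guessElementFromName is hypothetical and not part of PV.

```javascript
// Hypothetical heuristic, not necessarily what PV does. Expects the
// unmodified four-character atom name field (columns 13-16 of the record).
function guessElementFromName(fourLetterName) {
  var first = fourLetterName[0];
  if (first === ' ' || (first >= '0' && first <= '9')) {
    // one-letter element symbols start in the second column of the field
    return fourLetterName[1];
  }
  if (first === 'H' && fourLetterName.trim().length === 4) {
    // four-character names such as 'HD11' are typically hydrogens
    return 'H';
  }
  // otherwise assume a two-letter element symbol such as 'FE' or 'ZN'
  return fourLetterName.substring(0, 2).trim();
}
```

With this rule, the carbon alpha name ' CA ' yields C, whereas a calcium ion written as 'CA  ' yields CA.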
- pv.mol.Atom.isHetatm()¶
- pv.mol.AtomView.isHetatm()¶
  Returns true when the atom was imported from a HETATM record, false if not. This flag is only meaningful for structures imported from PDB files and will return false for other file formats.