# Copyright 2012 by Wibowo Arindrarto. All rights reserved.
# This file is part of the Biopython distribution and governed by your
# choice of the "Biopython License Agreement" or the "BSD 3-Clause License".
# Please see the LICENSE file that should have been included as part of this
# package.
"""Bio.SearchIO object to model search results from a single query."""

from copy import deepcopy
from itertools import chain

from Bio.SearchIO._utils import optionalcascade

from ._base import _BaseSearchObject
from .hit import Hit


class QueryResult(_BaseSearchObject):
    """Class representing search results from a single query.

    QueryResult is the container object that stores all search hits from a
    single search query. It is the top-level object returned by SearchIO's two
    main functions, ``read`` and ``parse``. Depending on the search results and
    search output format, a QueryResult object will contain zero or more Hit
    objects (see Hit).

    You can take a quick look at a QueryResult's contents and attributes by
    invoking ``print`` on it::

        >>> from Bio import SearchIO
        >>> qresult = next(SearchIO.parse('Blast/mirna.xml', 'blast-xml'))
        >>> print(qresult)
        Program: blastn (2.2.27+)
          Query: 33211 (61)
                 mir_1
         Target: refseq_rna
           Hits: ----  -----  ----------------------------------------------------------
                    #  # HSP  ID + description
                 ----  -----  ----------------------------------------------------------
                    0      1  gi|262205317|ref|NR_030195.1|  Homo sapiens microRNA 52...
                    1      1  gi|301171311|ref|NR_035856.1|  Pan troglodytes microRNA...
                    2      1  gi|270133242|ref|NR_032573.1|  Macaca mulatta microRNA ...
                    3      2  gi|301171322|ref|NR_035857.1|  Pan troglodytes microRNA...
                    4      1  gi|301171267|ref|NR_035851.1|  Pan troglodytes microRNA...
                    5      2  gi|262205330|ref|NR_030198.1|  Homo sapiens microRNA 52...
                    6      1  gi|262205302|ref|NR_030191.1|  Homo sapiens microRNA 51...
                    7      1  gi|301171259|ref|NR_035850.1|  Pan troglodytes microRNA...
                    8      1  gi|262205451|ref|NR_030222.1|  Homo sapiens microRNA 51...
                    9      2  gi|301171447|ref|NR_035871.1|  Pan troglodytes microRNA...
                   10      1  gi|301171276|ref|NR_035852.1|  Pan troglodytes microRNA...
                   11      1  gi|262205290|ref|NR_030188.1|  Homo sapiens microRNA 51...
                   ...

    If you just want to know how many hits a QueryResult has, you can invoke
    ``len`` on it. Alternatively, you can simply type its name in the interpreter::

        >>> len(qresult)
        100
        >>> qresult
        QueryResult(id='33211', 100 hits)

    QueryResult behaves like a hybrid of Python's built-in list and dictionary.
    You can retrieve its items (Hit objects) using the integer index of the
    item, just like regular Python lists::

        >>> first_hit = qresult[0]
        >>> first_hit
        Hit(id='gi|262205317|ref|NR_030195.1|', query_id='33211', 1 hsps)

    You can slice QueryResult objects as well. Slicing will return a new
    QueryResult object containing only the sliced hits::

        >>> sliced_qresult = qresult[:3]    # slice the first three hits
        >>> len(qresult)
        100
        >>> len(sliced_qresult)
        3
        >>> print(sliced_qresult)
        Program: blastn (2.2.27+)
          Query: 33211 (61)
                 mir_1
         Target: refseq_rna
           Hits: ----  -----  ----------------------------------------------------------
                    #  # HSP  ID + description
                 ----  -----  ----------------------------------------------------------
                    0      1  gi|262205317|ref|NR_030195.1|  Homo sapiens microRNA 52...
                    1      1  gi|301171311|ref|NR_035856.1|  Pan troglodytes microRNA...
                    2      1  gi|270133242|ref|NR_032573.1|  Macaca mulatta microRNA ...

    Like Python dictionaries, you can also retrieve hits using the hit's ID.
    This is useful for retrieving hits that you know should exist in a given
    search::

        >>> hit = qresult['gi|262205317|ref|NR_030195.1|']
        >>> hit
        Hit(id='gi|262205317|ref|NR_030195.1|', query_id='33211', 1 hsps)

    You can also replace a Hit in QueryResult with another Hit by assigning to
    its hit key string (assignment accepts only string keys; an integer index
    raises ``TypeError``). Note that the replacing object must be a Hit that
    has the same ``query_id`` property as the QueryResult object.

    If you're not sure whether a QueryResult contains a particular hit, you can
    use the hit ID to check for membership first::

        >>> 'gi|262205317|ref|NR_030195.1|' in qresult
        True
        >>> 'gi|262380031|ref|NR_023426.1|' in qresult
        False

    Or, if you just want to know the rank / position of a given hit, you can
    use the hit ID as an argument for the ``index`` method. Note that the values
    returned will be zero-based. So zero (0) means the hit is the first in the
    QueryResult, three (3) means the hit is the fourth item, and so on. If the
    hit does not exist in the QueryResult, a ``ValueError`` will be raised::

        >>> qresult.index('gi|262205317|ref|NR_030195.1|')
        0
        >>> qresult.index('gi|262205330|ref|NR_030198.1|')
        5
        >>> qresult.index('gi|262380031|ref|NR_023426.1|')
        Traceback (most recent call last):
        ...
        ValueError: ...

    To ease working with a large number of hits, QueryResult has several
    ``filter`` and ``map`` methods, analogous to Python's built-in functions with
    the same names. There are ``filter`` and ``map`` methods available for
    operations over either Hit objects or HSP objects. As an example, here we
    are using the ``hit_map`` method to rename all hit IDs within a QueryResult::

        >>> def renamer(hit):
        ...     hit.id = hit.id.split('|')[3]
        ...     return hit
        >>> mapped_qresult = qresult.hit_map(renamer)
        >>> print(mapped_qresult)
        Program: blastn (2.2.27+)
          Query: 33211 (61)
                 mir_1
         Target: refseq_rna
           Hits: ----  -----  ----------------------------------------------------------
                    #  # HSP  ID + description
                 ----  -----  ----------------------------------------------------------
                    0      1  NR_030195.1  Homo sapiens microRNA 520b (MIR520B), micr...
                    1      1  NR_035856.1  Pan troglodytes microRNA mir-520b (MIR520B...
                    2      1  NR_032573.1  Macaca mulatta microRNA mir-519a (MIR519A)...
                    ...

    The other ``map`` and ``filter`` methods work on the same principle: each
    accepts a function, applies it, and returns a new QueryResult object.

    There are also other methods useful for working with list-like objects:
    ``append``, ``pop``, and ``sort``. More details and examples are available in
    their respective docstrings.

    Finally, just like Python lists and dictionaries, QueryResult objects are
    iterable. Iteration over QueryResults will yield Hit objects::

        >>> for hit in qresult[:4]:     # iterate over the first four items
        ...     hit
        ...
        Hit(id='gi|262205317|ref|NR_030195.1|', query_id='33211', 1 hsps)
        Hit(id='gi|301171311|ref|NR_035856.1|', query_id='33211', 1 hsps)
        Hit(id='gi|270133242|ref|NR_032573.1|', query_id='33211', 1 hsps)
        Hit(id='gi|301171322|ref|NR_035857.1|', query_id='33211', 2 hsps)

    If you need access to all the hits in a QueryResult object, you can get
    them in a list using the ``hits`` property. Similarly, access to all hit IDs is
    available through the ``hit_keys`` property::

        >>> qresult.hits
        [Hit(id='gi|262205317|ref|NR_030195.1|', query_id='33211', 1 hsps), ...]
        >>> qresult.hit_keys
        ['gi|262205317|ref|NR_030195.1|', 'gi|301171311|ref|NR_035856.1|', ...]

    """

    # attributes we don't want to transfer when creating a new QueryResult class
    # from this one
    _NON_STICKY_ATTRS = ("_items", "__alt_hit_ids")

    def __init__(self, hits=(), id=None, hit_key_function=None):
        """Initialize a QueryResult object.

        :param id: query sequence ID
        :type id: string
        :param hits: iterator yielding Hit objects
        :type hits: iterable
        :param hit_key_function: function to define hit keys
        :type hit_key_function: callable, accepts Hit objects, returns string

        """
        # default values
        self._id = id
        self._hit_key_function = hit_key_function or _hit_key_func
        self._items = {}
        self._description = None
        self.__alt_hit_ids = {}
        self.program = "<unknown program>"
        self.target = "<unknown target>"
        self.version = "<unknown version>"

        # validate Hit objects and fill up self._items
        for hit in hits:
            # validation is handled by __setitem__
            self.append(hit)

    def __iter__(self):
        """Iterate over hits."""
        return iter(self.hits)

    # ``hits``, ``hit_keys`` and ``items`` are read as attributes throughout
    # this class (for example ``iter(self.hits)``), so they must be properties
    @property
    def hits(self):
        """Hit objects contained in the QueryResult."""
        return list(self._items.values())

    @property
    def hit_keys(self):
        """Hit IDs of the Hit objects contained in the QueryResult."""
        return list(self._items.keys())

    @property
    def items(self):
        """List of tuples of Hit IDs and Hit objects."""
        return list(self._items.items())

    def iterhits(self):
        """Return an iterator over the Hit objects."""
        yield from self._items.values()

    def iterhit_keys(self):
        """Return an iterator over the ID of the Hit objects."""
        yield from self._items

    def iteritems(self):
        """Return an iterator yielding tuples of Hit ID and Hit objects."""
        yield from self._items.items()

    def __contains__(self, hit_key):
        """Return True if hit key in items or alternative hit identifiers."""
        if isinstance(hit_key, Hit):
            return self._hit_key_function(hit_key) in self._items
        return hit_key in self._items or hit_key in self.__alt_hit_ids

    def __len__(self):
        """Return the number of items."""
        return len(self._items)

    def __bool__(self):
        """Return True if there are items."""
        return bool(self._items)

    def __repr__(self):
        """Return string representation of the QueryResult object."""
        return "QueryResult(id=%r, %r hits)" % (self.id, len(self))

    def __str__(self):
        """Return a human readable summary of the QueryResult object."""
        lines = []

        # set program and version line
        lines.append("Program: %s (%s)" % (self.program, self.version))

        # set query id line
        qid_line = "  Query: %s" % self.id
        try:
            seq_len = self.seq_len
        except AttributeError:
            pass
        else:
            qid_line += " (%i)" % seq_len
        lines.append(qid_line)
        if self.description:
            line = "         %s" % self.description
            line = line[:77] + "..." if len(line) > 80 else line
            lines.append(line)

        # set target line
        lines.append(" Target: %s" % self.target)

        # set hit lines
        if not self.hits:
            lines.append("   Hits: 0")
        else:
            lines.append("   Hits: %s  %s  %s" % ("-" * 4, "-" * 5, "-" * 58))
            pattern = "%13s  %5s  %s"
            lines.append(pattern % ("#", "# HSP", "ID + description"))
            lines.append(pattern % ("-" * 4, "-" * 5, "-" * 58))
            for idx, hit in enumerate(self.hits):
                if idx < 30:
                    hid_line = "%s  %s" % (hit.id, hit.description)
                    if len(hid_line) > 58:
                        hid_line = hid_line[:55] + "..."
                    lines.append(pattern % (idx, len(hit), hid_line))
                elif idx > len(self.hits) - 4:
                    hid_line = "%s  %s" % (hit.id, hit.description)
                    if len(hid_line) > 58:
                        hid_line = hid_line[:55] + "..."
                    lines.append(pattern % (idx, len(hit), hid_line))
                elif idx == 30:
                    lines.append("%14s" % "~~~")

        return "\n".join(lines)

    def __getitem__(self, hit_key):
        """Return the Hit for hit_key, or a new QueryResult for a slice."""
        # retrieval using slice objects returns another QueryResult object
        if isinstance(hit_key, slice):
            # should we return just a list of Hits instead of a full blown
            # QueryResult object if it's a slice?
            hits = list(self.hits)[hit_key]
            obj = self.__class__(hits, self.id, self._hit_key_function)
            self._transfer_attrs(obj)
            return obj

        # if key is an int, then retrieve the Hit at the int index
        elif isinstance(hit_key, int):
            length = len(self)
            if 0 <= hit_key < length:
                for idx, item in enumerate(self.iterhits()):
                    if idx == hit_key:
                        return item
            elif -1 * length <= hit_key < 0:
                for idx, item in enumerate(self.iterhits()):
                    if length + hit_key == idx:
                        return item
            raise IndexError("list index out of range")

        # if key is a string, then do a regular dictionary retrieval
        # falling back on alternative hit IDs
        try:
            return self._items[hit_key]
        except KeyError:
            return self._items[self.__alt_hit_ids[hit_key]]

    def __setitem__(self, hit_key, hit):
        """Add an item of key hit_key and value hit."""
        # only accept string keys
        if not isinstance(hit_key, str):
            raise TypeError("QueryResult object keys must be a string.")
        # hit must be a Hit object
        if not isinstance(hit, Hit):
            raise TypeError("QueryResult objects can only contain Hit objects.")
        qid = self.id
        hqid = hit.query_id
        # and it must have the same query ID as this object's ID
        # unless the query ID is None (default for empty objects), in which
        # case we want to use the hit's query ID as the query ID
        if qid is not None:
            if hqid != qid:
                raise ValueError(
                    "Expected Hit with query ID %r, found %r instead." % (qid, hqid)
                )
        else:
            self.id = hqid
        # same thing with descriptions
        qdesc = self.description
        hqdesc = hit.query_description
        if qdesc is not None:
            if hqdesc != qdesc:
                raise ValueError(
                    "Expected Hit with query description %r, found %r instead."
                    % (qdesc, hqdesc)
                )
        else:
            self.description = hqdesc

        # remove existing alt_id references, if hit_key already exists
        if hit_key in self._items:
            for alt_key in self._items[hit_key].id_all[1:]:
                del self.__alt_hit_ids[alt_key]

        # if hit_key is already present as an alternative ID
        # delete it from the alternative ID dict
        if hit_key in self.__alt_hit_ids:
            del self.__alt_hit_ids[hit_key]

        self._items[hit_key] = hit
        for alt_id in hit.id_all[1:]:
            self.__alt_hit_ids[alt_id] = hit_key

    def __delitem__(self, hit_key):
        """Delete item of key hit_key."""
        # if hit_key is an integer or slice, get the corresponding key first
        # and put it into a list
        if isinstance(hit_key, int):
            hit_keys = [list(self.hit_keys)[hit_key]]
        # the same, if it's a slice
        elif isinstance(hit_key, slice):
            hit_keys = list(self.hit_keys)[hit_key]
        # otherwise put it in a list
        else:
            hit_keys = [hit_key]

        for key in hit_keys:
            deleted = False
            if key in self._items:
                del self._items[key]
                deleted = True
            if key in self.__alt_hit_ids:
                del self._items[self.__alt_hit_ids[key]]
                del self.__alt_hit_ids[key]
                deleted = True
            if not deleted:
                raise KeyError(repr(key))

    # properties #
    id = optionalcascade("_id", "query_id", """QueryResult ID string""")
    description = optionalcascade(
        "_description", "query_description", """QueryResult description"""
    )

    # ``fragments`` reads ``self.hsps`` as an attribute, so both are properties
    @property
    def hsps(self):
        """Access the HSP objects contained in the QueryResult."""
        return sorted(
            (hsp for hsp in chain(*self.hits)), key=lambda hsp: hsp.output_index
        )

    @property
    def fragments(self):
        """Access the HSPFragment objects contained in the QueryResult."""
        return list(chain(*self.hsps))

    # public methods #
    def absorb(self, hit):
        """Add a Hit object to the end of QueryResult.

        If the QueryResult already has a Hit with the same ID, append the new
        Hit's HSPs into the existing Hit.

        :param hit: object to absorb
        :type hit: Hit

        This method is used for file formats that may output the same Hit in
        separate places, such as BLAT or Exonerate. In both formats, Hits on
        different strands are reported in different places. SearchIO, however,
        treats them as the same Hit, because a Hit object represents all
        database entries with the same ID, regardless of strand orientation.

        """
        try:
            self.append(hit)
        except ValueError:
            assert hit.id in self
            for hsp in hit:
                self[hit.id].append(hsp)

    def append(self, hit):
        """Add a Hit object to the end of QueryResult.

        :param hit: object to append
        :type hit: Hit

        Any Hit object appended must have the same ``query_id`` property as the
        QueryResult's ``id`` property. If the hit key already exists, a
        ``ValueError`` will be raised.

        """
        # if a custom hit_key_function is supplied, use it to define the hit key
        if self._hit_key_function is not None:
            hit_key = self._hit_key_function(hit)
        else:
            hit_key = hit.id

        if hit_key not in self and all(pid not in self for pid in hit.id_all[1:]):
            self[hit_key] = hit
        else:
            raise ValueError(
                "The ID or alternative IDs of Hit %r exists in this QueryResult."
                % hit_key
            )

    def hit_filter(self, func=None):
        """Create new QueryResult object whose Hit objects pass the filter function.

        :param func: filter function
        :type func: callable, accepts Hit, returns bool

        Here is an example of using ``hit_filter`` to select Hits whose
        description begins with the string 'Homo sapiens', case sensitive::

            >>> from Bio import SearchIO
            >>> qresult = next(SearchIO.parse('Blast/mirna.xml', 'blast-xml'))
            >>> def desc_filter(hit):
            ...     return hit.description.startswith('Homo sapiens')
            ...
            >>> len(qresult)
            100
            >>> filtered = qresult.hit_filter(desc_filter)
            >>> len(filtered)
            39
            >>> print(filtered[:4])
            Program: blastn (2.2.27+)
              Query: 33211 (61)
                     mir_1
             Target: refseq_rna
               Hits: ----  -----  ----------------------------------------------------------
                        #  # HSP  ID + description
                     ----  -----  ----------------------------------------------------------
                        0      1  gi|262205317|ref|NR_030195.1|  Homo sapiens microRNA 52...
                        1      2  gi|262205330|ref|NR_030198.1|  Homo sapiens microRNA 52...
                        2      1  gi|262205302|ref|NR_030191.1|  Homo sapiens microRNA 51...
                        3      1  gi|262205451|ref|NR_030222.1|  Homo sapiens microRNA 51...

        Note that instance attributes (other than the hits) from the unfiltered
        QueryResult are retained in the filtered object::

            >>> qresult.program == filtered.program
            True
            >>> qresult.target == filtered.target
            True

        """
        hits = list(filter(func, self.hits))
        obj = self.__class__(hits, self.id, self._hit_key_function)
        self._transfer_attrs(obj)
        return obj

    def hit_map(self, func=None):
        """Create new QueryResult object, mapping the given function to its Hits.

        :param func: map function
        :type func: callable, accepts Hit, returns Hit

        Here is an example of using ``hit_map`` with a function that discards all
        HSPs in a Hit except for the first one::

            >>> from Bio import SearchIO
            >>> qresult = next(SearchIO.parse('Blast/mirna.xml', 'blast-xml'))
            >>> print(qresult[:8])
            Program: blastn (2.2.27+)
              Query: 33211 (61)
                     mir_1
             Target: refseq_rna
               Hits: ----  -----  ----------------------------------------------------------
                        #  # HSP  ID + description
                     ----  -----  ----------------------------------------------------------
                        0      1  gi|262205317|ref|NR_030195.1|  Homo sapiens microRNA 52...
                        1      1  gi|301171311|ref|NR_035856.1|  Pan troglodytes microRNA...
                        2      1  gi|270133242|ref|NR_032573.1|  Macaca mulatta microRNA ...
                        3      2  gi|301171322|ref|NR_035857.1|  Pan troglodytes microRNA...
                        4      1  gi|301171267|ref|NR_035851.1|  Pan troglodytes microRNA...
                        5      2  gi|262205330|ref|NR_030198.1|  Homo sapiens microRNA 52...
                        6      1  gi|262205302|ref|NR_030191.1|  Homo sapiens microRNA 51...
                        7      1  gi|301171259|ref|NR_035850.1|  Pan troglodytes microRNA...

            >>> top_hsp = lambda hit: hit[:1]
            >>> mapped_qresult = qresult.hit_map(top_hsp)
            >>> print(mapped_qresult[:8])
            Program: blastn (2.2.27+)
              Query: 33211 (61)
                     mir_1
             Target: refseq_rna
               Hits: ----  -----  ----------------------------------------------------------
                        #  # HSP  ID + description
                     ----  -----  ----------------------------------------------------------
                        0      1  gi|262205317|ref|NR_030195.1|  Homo sapiens microRNA 52...
                        1      1  gi|301171311|ref|NR_035856.1|  Pan troglodytes microRNA...
                        2      1  gi|270133242|ref|NR_032573.1|  Macaca mulatta microRNA ...
                        3      1  gi|301171322|ref|NR_035857.1|  Pan troglodytes microRNA...
                        4      1  gi|301171267|ref|NR_035851.1|  Pan troglodytes microRNA...
                        5      1  gi|262205330|ref|NR_030198.1|  Homo sapiens microRNA 52...
                        6      1  gi|262205302|ref|NR_030191.1|  Homo sapiens microRNA 51...
                        7      1  gi|301171259|ref|NR_035850.1|  Pan troglodytes microRNA...

        """
        hits = [deepcopy(hit) for hit in self.hits]
        if func is not None:
            hits = [func(x) for x in hits]
        obj = self.__class__(hits, self.id, self._hit_key_function)
        self._transfer_attrs(obj)
        return obj

    def hsp_filter(self, func=None):
        """Create new QueryResult object whose HSP objects pass the filter function.

        ``hsp_filter`` is the same as ``hit_filter``, except that it filters
        directly on each HSP object in every Hit. If the filtering removes
        all HSP objects in a given Hit, the entire Hit will be discarded, so
        the filtered QueryResult may contain fewer Hit objects than the
        original.

        """
        hits = [x for x in (hit.filter(func) for hit in self.hits) if x]
        obj = self.__class__(hits, self.id, self._hit_key_function)
        self._transfer_attrs(obj)
        return obj

    def hsp_map(self, func=None):
        """Create new QueryResult object, mapping the given function to its HSPs.

        ``hsp_map`` is the same as ``hit_map``, except that it applies the given
        function to all HSP objects in every Hit, instead of the Hit objects.

        """
        hits = [x for x in (hit.map(func) for hit in list(self.hits)[:]) if x]
        obj = self.__class__(hits, self.id, self._hit_key_function)
        self._transfer_attrs(obj)
        return obj

    # marker for default self.pop() return value
    # this method is adapted from Python's built in OrderedDict.pop
    # implementation
    __marker = object()

    def pop(self, hit_key=-1, default=__marker):
        """Remove the specified hit key and return the Hit object.

        :param hit_key: key of the Hit object to return
        :type hit_key: int or string
        :param default: return value if no Hit exists with the given key
        :type default: object

        By default, ``pop`` will remove and return the last Hit object in the
        QueryResult object. To remove a specific Hit object, use its integer
        index or hit key::

            >>> from Bio import SearchIO
            >>> qresult = next(SearchIO.parse('Blast/mirna.xml', 'blast-xml'))
            >>> len(qresult)
            100
            >>> for hit in qresult[:5]:
            ...     print(hit.id)
            ...
            gi|262205317|ref|NR_030195.1|
            gi|301171311|ref|NR_035856.1|
            gi|270133242|ref|NR_032573.1|
            gi|301171322|ref|NR_035857.1|
            gi|301171267|ref|NR_035851.1|

            # remove the last hit
            >>> qresult.pop()
            Hit(id='gi|397513516|ref|XM_003827011.1|', query_id='33211', 1 hsps)

            # remove the first hit
            >>> qresult.pop(0)
            Hit(id='gi|262205317|ref|NR_030195.1|', query_id='33211', 1 hsps)

            # remove hit with the given ID
            >>> qresult.pop('gi|301171322|ref|NR_035857.1|')
            Hit(id='gi|301171322|ref|NR_035857.1|', query_id='33211', 2 hsps)

        """
        # if key is an integer (index)
        # get the ID for the Hit object at that index
        if isinstance(hit_key, int):
            # raise the appropriate error if there is no hit
            if not self:
                raise IndexError("pop from empty list")
            hit_key = list(self.hit_keys)[hit_key]

        try:
            hit = self._items.pop(hit_key)
            # remove all alternative IDs of the popped hit
            for alt_id in hit.id_all[1:]:
                self.__alt_hit_ids.pop(alt_id, None)
        except KeyError:
            try:
                hit = self.pop(self.__alt_hit_ids[hit_key])
            except KeyError:
                # hit_key is not a valid id
                # use the default if it has been set
                if default is not self.__marker:
                    hit = default
                else:
                    raise KeyError(hit_key) from None
        return hit

    def index(self, hit_key):
        """Return the index of a given hit key, zero-based.

        :param hit_key: hit ID
        :type hit_key: string

        This method is useful for finding out the integer index (usually
        correlated with search rank) of a given hit key.

            >>> from Bio import SearchIO
            >>> qresult = next(SearchIO.parse('Blast/mirna.xml', 'blast-xml'))
            >>> qresult.index('gi|301171259|ref|NR_035850.1|')
            7

        """
        if isinstance(hit_key, Hit):
            return list(self.hit_keys).index(hit_key.id)
        try:
            return list(self.hit_keys).index(hit_key)
        except ValueError:
            if hit_key in self.__alt_hit_ids:
                return self.index(self.__alt_hit_ids[hit_key])
            raise

    def sort(self, key=None, reverse=False, in_place=True):
        """Sort the Hit objects.

        :param key: sorting function
        :type key: callable, accepts Hit, returns key for sorting
        :param reverse: whether to reverse sorting results or not
        :type reverse: bool
        :param in_place: whether to do in-place sorting or not
        :type in_place: bool

        ``sort`` defaults to sorting in-place, to mimic Python's ``list.sort``
        method. If you set the ``in_place`` argument to False, it will
        return a new, sorted QueryResult object and keep the initial one
        unsorted.

        """
        if key is None:
            # if reverse is True, reverse the hits
            if reverse:
                sorted_hits = list(self.hits)[::-1]
            # otherwise (default options) make a copy of the hits
            else:
                sorted_hits = list(self.hits)[:]
        else:
            sorted_hits = sorted(self.hits, key=key, reverse=reverse)

        # if sorting is in-place, don't create a new QueryResult object
        if in_place:
            self._items = {self._hit_key_function(hit): hit for hit in sorted_hits}
        # otherwise, return a new sorted QueryResult object
        else:
            obj = self.__class__(sorted_hits, self.id, self._hit_key_function)
            self._transfer_attrs(obj)
            return obj


def _hit_key_func(hit):
    """Map hit to its identifier (PRIVATE).

    Default hit key function for QueryResult.__init__ use.
    """
    return hit.id


# if not used as a module, run the doctest
if __name__ == "__main__":
    from Bio._utils import run_doctest

    run_doctest()