IntSet: Compressed mergeable sets of integers¶
-
class
intset.
IntSet
(wrapped)[source]¶ An IntSet is a compressed immutable representation of a sorted list of unsigned 64-bit integers with fast membership, union and range restriction.
It mostly behaves as if it were a sorted list of deduplicated integer values. In particular, you can index it if it were, and it will sort and compare equal (to other IntSets) as if it were.
Note that unlike lists, intsets may feasibly have more than sys.maxint elements, and calling len() on such an intset may raise an OverflowError. If you wish to avoid this, use .size() instead.
Because IntSet is immutable, unlike list it may also be used as a hash key.
It also supports set operations. In particular, all the boolean operations are supported:
x & y: An IntSet containing the values that are present in both x and y
x | y: An IntSet containing the values present in either x or y
x - y: An IntSet containing the values present in x but not y
x ^ y: An IntSet containing the values present in x or y but not both
- ~x: An IntSet containing all values in the range 0 <= i < 2 ** 64 that
- are not present in x (IntSet can represent this efficiently. It won’t allocate 2 ** 64 integers worth of memory).
IntSets may be constructed either from the dedicated class methods or by calling the class as you usually would for a set. So IntSet([1, 2, 3]) is an IntSet containing the values 1, 2 and 3.
When calling an IntSet this way, non-integer values which are iterable sequences of length 2 will be interpreted as intervals start <= x < end. So e.g. IntSet([1, [10, 100]]) will contain the numbers 1 and 10, ..., 99.
-
class
Builder
[source]¶ An IntSet.Builder is for building up an IntSet incrementally through a series of insertions.
This will typically be much faster than repeatedly calling insert on an IntSet object. The intended usage is to repeatedly call insert() or insert_interval() on a builder, then call build() at the end. Note that you can continue to insert further data into a Builder afterwards if you wish, and this will not affect previously built IntSet instances.
-
classmethod
IntSet.
single
(value)[source]¶ Return an IntSet containing only the single value provided.
-
classmethod
IntSet.
interval
(start, end)[source]¶ Return an IntSet containing only the values x such that start <= x < end
-
classmethod
IntSet.
from_iterable
(values)[source]¶ Return an IntSet containing everything in values, which should be an iterable over intsets in the valid range.
-
classmethod
IntSet.
from_intervals
(intervals)[source]¶ Return a new IntSet which contains precisely the intervals passed in.
-
IntSet.
size
()[source]¶ This returns the same as len() when the latter is defined, but IntSet may have more values than will fit in the size of index that len will allow.
-
IntSet.
insert
(value)[source]¶ Returns an IntSet which contains all the values of the current one plus the provided value.
-
IntSet.
discard
(value)[source]¶ Returns an IntSet which contains all the values of the current one except for the passed in value.
Returns self if the value is not present rather than raising an error
-
IntSet.
restrict
(start, end)[source]¶ Return a new IntSet with all values x in self such that start <= x < end.