util Package
class okcupyd.util.cached_property(func)
Bases: object
Descriptor that caches the result of the first call to resolve its contents.
bust_self(obj)
Remove the value that is being stored on obj for this cached_property object.
Parameters: obj – The instance on which to bust the cache.
classmethod bust_caches(obj, excludes=())
Bust the cache for all cached_property objects on obj.
Parameters: obj – The instance on which to bust the caches.
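The caching behaviour described above can be illustrated with a small descriptor. This is a minimal sketch and not okcupyd's implementation: the class name cached_property_sketch, the Report class, and the choice to store the value in the instance's __dict__ are all assumptions made for illustration, as is treating excludes as a collection of attribute names.

```python
class cached_property_sketch:
    """Hypothetical stand-in for a caching descriptor."""

    def __init__(self, func):
        self.func = func
        self.name = func.__name__

    def __get__(self, obj, objtype=None):
        if obj is None:
            # Accessed on the class: hand back the descriptor itself.
            return self
        value = self.func(obj)
        # Store the result on the instance so later lookups find it in
        # obj.__dict__ and never reach this descriptor again.
        obj.__dict__[self.name] = value
        return value

    def bust_self(self, obj):
        # Drop the stored value; the next access recomputes it.
        obj.__dict__.pop(self.name, None)

    @classmethod
    def bust_caches(cls, obj, excludes=()):
        # Only inspects the instance's own class, not its base classes.
        for name, attr in vars(type(obj)).items():
            if isinstance(attr, cls) and name not in excludes:
                attr.bust_self(obj)


class Report:
    def __init__(self):
        self.calls = 0

    @cached_property_sketch
    def total(self):
        self.calls += 1
        return 42


report = Report()
first = report.total           # computes the value
second = report.total          # served from the instance cache
calls_before_bust = report.calls
Report.total.bust_self(report)
third = report.total           # recomputed after the cache was busted
calls_after_bust = report.calls
```

Because the sketch defines only __get__, it is a non-data descriptor, which is what lets the cached entry in the instance dictionary take precedence on later lookups.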
class okcupyd.util.REMap(re_value_pairs=(), default=<object object>)
Bases: object
A mapping object that matches regular expressions to values.
NO_DEFAULT = <object object>
classmethod from_string_pairs(string_value_pairs, **kwargs)
Build an REMap from str, value pairs by applying re.compile to each string and calling the __init__ of REMap.
pattern_to_value
okcupyd.util.IndexedREMap(*re_strings, **kwargs)
Build a REMap from the provided regular expression strings. Each string will be associated with the index corresponding to its position in the argument list.
Parameters:
- re_strings – The re_strings that will serve as keys in the map.
- default – The value to return if none of the regular expressions match.
- offset – The offset at which to start indexing for regular expressions. Defaults to 1.
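The idea of mapping regular expressions to values, and of indexing patterns by position, can be sketched as follows. This is an illustration under stated assumptions, not the library's code: REMapSketch, its lookup method, and indexed_remap_sketch are hypothetical names, and the real REMap may expose lookups through a different interface.

```python
import re

_NO_DEFAULT = object()  # sentinel meaning "no default was supplied"


class REMapSketch:
    """Hypothetical regex-to-value mapping."""

    def __init__(self, re_value_pairs=(), default=_NO_DEFAULT):
        self.pairs = list(re_value_pairs)
        self.default = default

    @classmethod
    def from_string_pairs(cls, string_value_pairs, **kwargs):
        # Compile each pattern string, then defer to __init__.
        compiled = [(re.compile(s), v) for s, v in string_value_pairs]
        return cls(compiled, **kwargs)

    def lookup(self, text):
        # Return the value of the first pattern that matches.
        for pattern, value in self.pairs:
            if pattern.search(text):
                return value
        if self.default is _NO_DEFAULT:
            raise KeyError(text)
        return self.default


def indexed_remap_sketch(*re_strings, default=_NO_DEFAULT, offset=1):
    # Associate each pattern with its position, counting from offset.
    pairs = [(s, i) for i, s in enumerate(re_strings, start=offset)]
    return REMapSketch.from_string_pairs(pairs, default=default)


frequency = indexed_remap_sketch("never", "rarely", "often", default=0)
rarely_index = frequency.lookup("rarely")
often_index = frequency.lookup("very often")
fallback = frequency.lookup("unknown")
```

With the default offset of 1, the first pattern maps to 1, the second to 2, and so on; text that matches no pattern falls back to the supplied default.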
compose Module
currying Module
okcupyd.util.currying.curry
Curry a function or method.
Applying curry to a function creates a callable with the same functionality that can be invoked with an incomplete argument list to create a partial application of the original function.

    @curry
    def greater_than(x, y):
        return x > y

    >>> less_than_40 = greater_than(40)
    >>> less_than_40(39)
    True
    >>> less_than_40(50)
    False

curry allows functions to be partially invoked an arbitrary number of times:

    @curry
    def add_5_things(a, b, c, d, e):
        return a + b + c + d + e

    # All of the following invocations of add_5_things
    >>> add_5_things(1)(1)(1)(1)(1)
    5
    >>> one_left = add_5_things(1, 1)(3)(4)
    # A one place function that will add 1 + 1 + 3 + 4 = 9
    # to whatever is provided as its argument.
    >>> one_left(5)
    14
    >>> one_left(6)
    15

A particularly compelling use case for curry is the creation of decorators that take optional arguments:

    @curry
    def add_n(function, n=1):
        def wrapped(*args, **kwargs):
            return function(*args, **kwargs) + n
        return wrapped

    @add_n(n=12)
    def multiply_plus_twelve(x, y):
        return x * y

    @add_n
    def multiply_plus_one(x, y):
        return x * y

    >>> multiply_plus_one(1, 1)
    2
    >>> multiply_plus_twelve(1, 1)
    13

Notice that we were able to apply add_n regardless of whether or not an optional argument had been supplied earlier.
The version of curry that is available for import has been curried itself. That is, its constructor can be invoked partially:

    from functools import reduce  # reduce is not a builtin on Python 3

    @curry(evaluation_checker=lambda *args, **kwargs: len(args) > 2)
    def args_taking_function(*args):
        return reduce(lambda x, y: x*y, args)

    >>> args_taking_function(1, 2)
    2
    >>> args_taking_function(2)(3)
    6
    >>> args_taking_function(2, 2, 2, 2)
    16

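To make the mechanism concrete, here is a minimal sketch of how a curry-like decorator could be written with the standard library. It is not okcupyd's curry: curry_sketch is a hypothetical name, it does not support evaluation_checker, and it decides when to call the function simply by testing whether the accumulated arguments satisfy the function's signature.

```python
import functools
import inspect


def curry_sketch(func):
    """Hypothetical, simplified curry built on inspect.signature."""
    signature = inspect.signature(func)

    @functools.wraps(func)
    def curried(*args, **kwargs):
        try:
            # bind() raises TypeError while required arguments are missing.
            signature.bind(*args, **kwargs)
        except TypeError:
            # Not enough arguments yet: remember them and wait for more.
            # (Passing too many arguments also lands here; a fuller
            # version would tell the two cases apart.)
            return functools.partial(curried, *args, **kwargs)
        return func(*args, **kwargs)

    return curried


@curry_sketch
def greater_than(x, y):
    return x > y


@curry_sketch
def add_5_things(a, b, c, d, e):
    return a + b + c + d + e


@curry_sketch
def add_n(function, n=1):
    def wrapped(*args, **kwargs):
        return function(*args, **kwargs) + n
    return wrapped


@add_n(n=12)
def multiply_plus_twelve(x, y):
    return x * y


@add_n
def multiply_plus_one(x, y):
    return x * y


less_than_40 = greater_than(40)
one_left = add_5_things(1, 1)(3)(4)
```

The decorator case works because add_n(n=12) is missing its required function argument, so the sketch returns a partial application that is completed when Python applies it to the decorated function.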
fetchable Module
Most of the collection objects that are returned from function
invocations in the okcupyd library are instances of Fetchable.
In most cases, it is fine to treat these objects as though they are
lists because they can be iterated over, sliced and accessed by index,
just like lists:

    for question in user.profile.questions:
        print(question.answer.text)

    a_random_question = user.profile.questions[2]

    for question in user.profile.questions[2:4]:
        print(question.answer_options[0])

However, in some cases, it is important to be aware of the subtle
differences between Fetchable objects and Python lists.
Fetchable instances construct the elements that they “contain” lazily.
In most of their uses in the okcupyd library, this means that HTTP
requests can be made to populate Fetchable instances as their elements
are requested.
The questions
Fetchable
that is used in the example
above fetches the pages that are used to construct its contents in
batches of 10 questions. This means that the actual call to retrieve
data is made when iteration starts. If you enable the request logger
when you run this code snippet, you get output that illustrates this
fact:
2014-10-29 04:25:04 Livien-MacbookAir requests.packages.urllib3.connectionpool[82461] DEBUG "GET /profile/ShrewdDrew/questions?leanmode=1&low=11 HTTP/1.1" 200 None
Yes
Yes
Kiss someone.
Yes.
Yes
Sex.
Both equally
No, I wouldn't give it as a gift.
Maybe, I want to know all the important stuff.
Once or twice a week
2014-10-29 04:25:04 Livien-MacbookAir requests.packages.urllib3.connectionpool[82461] DEBUG "GET /profile/ShrewdDrew/questions?leanmode=1&low=21 HTTP/1.1" 200 None
No.
No
No
Yes
Rarely / never
Always.
Discovering your shared interests
The sun
Acceptable.
No.
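The batching visible in the log above can be sketched with a generator that only requests the next page when iteration reaches it. This is an illustration, not the library's fetching code: fetch_in_batches and fake_fetch_page are hypothetical names, and the stand-in page function records requests in a list instead of making HTTP calls.

```python
def fetch_in_batches(fetch_page, batch_size=10):
    """Yield items one at a time, requesting a new page only when needed."""
    low = 1
    while True:
        page = fetch_page(low, batch_size)
        if not page:
            return  # an empty page means there is nothing left
        yield from page
        low += batch_size


requested = []  # records the "low" value of every simulated request
ALL_ANSWERS = ["answer %d" % n for n in range(1, 26)]


def fake_fetch_page(low, count):
    # Stand-in for an HTTP request; low is 1-based, as in the log above.
    requested.append(low)
    return ALL_ANSWERS[low - 1:low - 1 + count]


answers = fetch_in_batches(fake_fetch_page)
requests_before_iteration = list(requested)   # nothing requested yet
first = next(answers)                         # triggers the first request
requests_after_first = list(requested)
rest = list(answers)                          # drains the remaining pages
```

Nothing is requested when the generator is created; the first request happens only when the first element is asked for, and later pages follow as iteration crosses each batch boundary.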
Some fetchables will continue fetching content for quite a long time. The search fetchable, for example, will fetch content until okcupid runs out of search results. As such, things like:

    for profile in user.search():
        profile.message("hey!")

should be avoided, as they are likely to generate a massive number of requests to okcupid.com.
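One way to keep such a loop bounded is to cap the number of items consumed. The sketch below uses itertools.islice on a stand-in generator; endless_search is a hypothetical placeholder for a long-running fetchable, and the sketch assumes only that the real object can be iterated like any other iterable.

```python
import itertools


def endless_search():
    # Hypothetical stand-in for a fetchable that keeps producing results.
    n = 0
    while True:
        n += 1
        yield "profile-%d" % n


# Consume at most five results, however many the source could produce.
first_five = list(itertools.islice(endless_search(), 5))
```

Because islice stops pulling from the source after the requested count, a lazily populated source is never asked for more than it needs to supply.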
Another subtlety of the Fetchable class is that its instances cache
their contained results. This means that the second iteration over
okcupyd.profile.Profile.questions in the example below does not result
in any HTTP requests:

    for question in user.profile.questions:
        print(question.text)

    for question in user.profile.questions:
        print(question.answer)

It is important to understand that this means that the contents of a
Fetchable are not guaranteed to be in sync with okcupid.com the second
time they are requested. Calling refresh() will cause the Fetchable to
request new data from okcupid.com when its contents are requested. The
code snippet that follows prints out all the questions that the logged
in user has answered roughly once per hour, including ones that are
answered while the program is running.

    import time

    while True:
        for question in user.profile.questions:
            print(question.text)
        user.profile.questions.refresh()
        time.sleep(3600)

Without the call to user.profile.questions.refresh(), this program would never update the user.profile.questions instance, so every pass of the while loop would print the same questions.
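The interaction between caching and refresh() can be sketched with a small container. This is a simplified illustration rather than the Fetchable implementation: FetchableSketch and fetch_answers are hypothetical names, and the fetcher here is a plain callable instead of an object that makes HTTP requests.

```python
class FetchableSketch:
    """Hypothetical lazy container that caches what it has fetched."""

    def __init__(self, fetch):
        self._fetch = fetch  # zero-argument callable returning an iterable
        self.refresh()

    def refresh(self):
        # Forget everything; the next iteration fetches again.
        self._source = None
        self._cache = []

    def __iter__(self):
        index = 0
        while True:
            if index < len(self._cache):
                yield self._cache[index]  # already fetched: no new call
                index += 1
                continue
            if self._source is None:
                self._source = iter(self._fetch())
            try:
                self._cache.append(next(self._source))
            except StopIteration:
                return


answers = ["Yes"]
fetch_calls = []


def fetch_answers():
    fetch_calls.append(1)
    return list(answers)


questions = FetchableSketch(fetch_answers)
first_pass = list(questions)
answers.append("No")              # the remote data changes
second_pass = list(questions)     # still the cached contents
calls_before_refresh = len(fetch_calls)
questions.refresh()
third_pass = list(questions)      # fetched again, now up to date
calls_after_refresh = len(fetch_calls)
```

The second pass shows the staleness described above: the underlying data has changed, but the cached contents are returned until refresh() is called.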
class okcupyd.util.fetchable.Fetchable(fetcher, **kwargs)
Bases: object
List-like container object that lazily loads its contained items.
refresh(nice_repr=True, **kwargs)
Parameters:
- nice_repr (bool) – Append the repr of a list containing the items that have been fetched to this point by the fetcher.
- kwargs – kwargs that should be passed to the fetcher when its fetch method is called. These are merged with the values provided to the constructor, with the ones provided here taking precedence if there is a conflict.
class okcupyd.util.fetchable.FetchMarshall(fetcher, processor, terminator=None, start_at=1)
Bases: object
class okcupyd.util.fetchable.SimpleProcessor(session, object_factory, element_xpath)
Bases: object
Applies object_factory to each element found with element_xpath.
Accepts session merely to be consistent with the FetchMarshall interface.
misc Module
okcupyd.util.misc.add_command_line_options(add_argument, use_short_options=True)
Parameters:
- add_argument – The add_argument method of an ArgParser.
- use_short_options – Whether or not to add short options.
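The calling convention, where the parser's bound add_argument method is passed in rather than the parser itself, can be sketched as below. Everything here is an assumption for illustration: add_demo_options is a hypothetical function with the same shape, the --verbose option is invented, and the sketch assumes the parser is a standard argparse.ArgumentParser. The options that add_command_line_options actually registers are not listed in this section.

```python
import argparse


def add_demo_options(add_argument, use_short_options=True):
    # Hypothetical option; the real function's options are not shown here.
    names = ["--verbose"]
    if use_short_options:
        names.insert(0, "-v")
    add_argument(*names, action="store_true", help="Enable verbose output.")


# Pass the bound method, not the parser.
parser = argparse.ArgumentParser()
add_demo_options(parser.add_argument)
args = parser.parse_args(["-v"])

# Without short options only the long form is registered.
long_only = argparse.ArgumentParser()
add_demo_options(long_only.add_argument, use_short_options=False)
long_args = long_only.parse_args(["--verbose"])
```

Accepting the add_argument callable keeps the helper usable with anything that exposes a compatible method, such as an argument group.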