.. _oop_lab:

*******************************************
Object Oriented Programming (OOP) in Python
*******************************************

Author: Spencer Lyon

.. ipython:: python
    :suppress:

    import numpy as np
    import pandas as pd
    np.random.seed(42)

So far in your python training you have been introduced to many different ideas and concepts. You have probably heard the term "object" a few times without really understanding what it means. Even so, the code you have written up to this point has been object based -- you have passed around objects in your scripts, used them in expressions, called their methods, etc. For your code to be truly object-oriented, you need to start defining your own classes. This lab will teach you the basics of OOP and give you some hands-on experience using classes in your code.

Introduction
============

The term object oriented programming encapsulates a new way of thinking about your code. Instead of treating a program as a list of instructions to be executed in sequential order when your script is run, OOP takes a more abstract approach that focuses on the relationships between the different parts of your code and on which tasks should be performed by each part.

Why use OOP?
------------

At this point, you might be wondering why you need to learn a whole new way to think about programming. Isn't what you have been doing all along good enough? The answer is ... kind of. What you have been doing up to this point, writing out a list of instructions for the interpreter to execute, is known as procedural programming. It is the foundation of modern computer science. Old, powerful languages like C and Fortran, which do not support OOP by themselves, are classified as procedural programming languages. However, there has been a clear shift in mainstream programming toward newer languages that do support OOP. Some examples are Java, C++, Python (surprise!), Ruby, C#, Objective-C, and many more.

So why are more and more programmers using OOP in their code? I can think of two (there are definitely more) distinct advantages an OOP approach has over a standard procedural approach.

1. Code re-usability: This is probably the biggest plus for OOP. With OOP you define objects that have certain properties. If you then realize that you need an object with very similar features, you can simply subclass the original object and save yourself from writing all that code over again. Although copy/paste is a plausible form of code reuse, it quickly becomes difficult to manage with larger, more complicated projects.

   As an example, say you wanted to code an overlapping generations (OLG) model. One way to do this from an OOP perspective might be to define the following classes with their respective properties:

   - ``Agent``:

     - ``age`` property that specifies the agent's age.
     - ``utility`` method (function) that returns the utility for a given state of the economy and a given choice of consumption and labor supply.
     - ``consume`` method that evaluates the optimal amount of food to consume given the current state of the economy -- uses the ``utility`` method.
     - ``work`` method that returns the optimal amount of labor to supply given the current state of the economy -- uses the ``utility`` method.

   - ``Firm``:

     - ``produce`` method that returns how much of a good is produced given the state of the economy.
     - ``capital`` property that specifies the aggregate capital stock in the current period.
     - ``labor`` property that specifies the total supply of labor in the current period.

   - ``Economy``:

     - ``period`` property that keeps track of the time period.
     - ``agents`` list of all current agents.
     - ``firms`` list of all current firms.
     - ``shocks`` dictionary of ``pandas.Series`` objects that specify the exogenous shocks over time.
     - ``endog_state`` property holding a ``pandas.DataFrame`` of the endogenous state data over time.
     - ``gather_choices`` method that sends each object in the ``agents`` and ``firms`` lists the current data and receives the choice variables.

   You could then create multiple instances of the ``Agent`` class to represent agents of different ages in your economy. You would also need at least one ``Firm`` instance to represent the other half of the market and a single ``Economy`` instance to keep everything organized (you could also have methods in the ``Economy`` class that gather consumption, labor supply, and production results from each of the other instances in the economy).

2. Unite data and functionality: An object is really just a generic container in which you can place data and the methods that act on that data. You have already seen this benefit in the ``BSplineBasis`` class you wrote earlier. Had you tried to do this without object oriented programming, every function that makes up the spline (generating knots, defining basis functions, evaluating or plotting the basis functions, etc.) would have needed additional arguments that specify the details of the spline (things like the degree, the number of knots, the knot vector, or the list of basis functions). Even in this small example you can see how this could get out of hand with a bigger problem.

An Outline - OOP in Python
--------------------------

Python is, at its core, an object oriented programming language. *Everything* you use and every variable you define in python is an object. You can verify this using the ``type`` function. I will demonstrate below.

.. ipython:: python

    type(42)
    type(1.2)
    type('foo')
    type([1, 3, 4])
    type({1:3, 2:4})
    type(('b', [3], 4))
    type(np.array([1, 3, 4]))
    type(pd.Series([1, 2]))

Every object in python has characteristics that fall into two main categories:

1. [data] attributes, and
2. methods.

Attributes are the specific pieces of data that distinguish one object from another. Methods are python functions that act on that data.

As an example consider a python list:

.. ipython:: python

    a = [1, 2, 'string', 2.3]

The data for this list are the items in the list. In this example they are the ``int`` 1, the ``int`` 2, the ``str`` 'string', and the ``float`` 2.3. Let's take a look at what methods are available to a list:

.. ipython:: python

    not_special = lambda x: not x.startswith('__')
    list(filter(not_special, dir(a)))

Calling any of these methods applies that python function to the list, and many of them alter it in place:

.. ipython:: python

    a

    # reverse order of data
    a.reverse()
    a

    # return the last item in the list and remove it from the list
    a.pop()

    # see, it's gone!
    a

The ``class`` keyword is used to define new types of objects in python. A class definition describes what kind of data the class stores and what functions are available for acting on this data (functions defined within classes are referred to as methods). On their own, class definitions are not useful, just like defining a function without calling it is not useful. To make use of a class, you need to create an object of that class or type. These objects are known as instances of the class. It is most common for each instance of a class to have its own unique data, while the methods are common across instances.

Throughout the rest of this lab we will focus on how to create and use new objects we define.

Defining Custom Classes
=======================

As noted above, new classes are defined using the ``class`` keyword. Imagine I wanted to create a class that represented a die. I would begin like this:

.. code-block:: python
    :linenos:

    class Die(object):
        pass

I will break this down one word at a time:

- ``class``: Like the keyword ``def`` for functions, this tells python I am about to define a new class.
- ``Die``: This is the name of the new class. It is common convention in python to capitalize the first letter of each word in a class name and to leave no spaces in multi-word class names (e.g. ``DataFrame`` follows convention).
- ``(object):`` This tells us that we are going to be subclassing the built-in python class ``object``. We will discuss subclasses more later, so don't worry too much about this for now. If you don't know what you want the parent class to be, it is convention to put ``object`` here. Also note the ``:``.
- ``pass``: A python keyword that allows us to not write anything in the body, but also not have the interpreter raise an ``IndentationError`` when the file is executed.

The ``__init__`` method
-----------------------

After having told python that I will be defining a new class, I need to tell it what should happen when a new instance of this class is created. This happens in the ``__init__`` method (the two leading and trailing underscores are important and we will talk more about them shortly). This is the function that is called when we create a new instance of the class. For the ``Die``, it might look something like this:

.. code-block:: python
    :linenos:

    class Die(object):
        def __init__(self, sides=6, dots=None):
            self.nsides = sides
            self.dots = dots

.. note::

    In all the examples in this lab I will omit docstrings and I may, at times, number lines in a non PEP8 compliant way as I have done above. I am doing this just for instructional purposes. When actually writing code that will be executed I wouldn't do this.

I will break this down one line at a time:

- [2] Tells python that I am going to be defining the ``__init__`` method. The first parameter of **ALL** methods defined this way is the instance itself, which by convention is named ``self`` (it is a naming convention, not a keyword). The parameters that follow are specific to the particular class. In this case we can pass in the number of sides and the current value of the dots.
- [3] Set the attribute ``nsides`` equal to the ``sides`` parameter of the function.
- [4] Set the attribute ``dots`` equal to the ``dots`` parameter of the function.

Adding methods
--------------

At this point, I could create an instance of the ``Die`` class, but it wouldn't be able to do much. Before we do that I will add one more method: ``roll``. As I am sure you can guess, this method will simulate the rolling of the die and update the ``dots`` attribute of the object.

.. code-block:: python
    :linenos:

    from random import randint

    class Die(object):
        def __init__(self, sides=6, dots=None):
            self.nsides = sides
            self.dots = dots

        def roll(self):
            self.dots = randint(1, self.nsides)

Explanation by line:

- [1] Import the ``randint`` function from the python ``random`` module.
- [8] Begin the definition of the ``roll`` method. Notice the passing of ``self`` as the only argument and the single line of spacing between class methods.
- [9] Update the ``self.dots`` attribute to be a new random integer between 1 and the ``self.nsides`` attribute (inclusive).

I am now going to create a new instance of my ``Die`` class and roll it a few times.

.. ipython:: python
    :suppress:

    from random import randint

    class Die(object):
        def __init__(self, sides=6, dots=None):
            self.nsides = sides
            self.dots = dots

        def roll(self):
            self.dots = randint(1, self.nsides)

.. ipython:: python

    d = Die(6)  # [1]
    d.nsides  # [2]
    d.dots  # [3]
    d.roll()  # [4]
    d.dots  # [5]
    d.roll()  # [6]
    d.dots  # [7]
    d  # [8]

Explanation by line:

- [1] Create a new instance of the ``Die`` class with 6 sides. Store it in a variable named ``d``.
- [2] Check the ``d.nsides`` attribute.
- [3] Access the ``dots`` attribute of ``d``. Notice that nothing is printed because its current value is the ``__init__`` default, ``None``.
- [4] Call the ``roll`` method of the ``Die`` class and update ``d.dots``.
- [5] Check the value of ``d.dots``.
- [6] Call the ``roll`` method of the ``Die`` class and update ``d.dots``.
- [7] Check the value of ``d.dots``.
- [8] A very unhelpful string representation of our object (see ``__repr__`` and ``__str__`` below).

Notice that when I create an instance or call a method, I treat the ``self`` parameter as if it isn't there: python passes the instance in automatically. I tell the ``__init__`` method the value of its ``sides`` argument by passing it as the first argument to ``Die``.

Other Special Methods
---------------------

In a python class, having a method name start and end with two underscores tells python that it is a special or magic method. These methods are usually used to allow access to an object's data or functions via python keywords or operators. This is a bit abstract and is best understood by example.

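As a minimal sketch of the idea, consider a small two-dimensional vector class. The class name ``Vec2`` and its attributes are hypothetical and chosen only for illustration; they are not part of the dice example. Defining ``__repr__``, ``__add__``, ``__eq__``, ``__getitem__``, and ``__len__`` lets instances work with the interactive prompt, ``+``, ``==``, indexing, and ``len``:

.. code-block:: python

    class Vec2(object):
        """Hypothetical 2-D vector used only to illustrate special methods."""

        def __init__(self, x, y):
            self.x = x
            self.y = y

        def __repr__(self):
            # Shown when the object is echoed at the prompt or passed to repr()
            return "Vec2({0}, {1})".format(self.x, self.y)

        def __add__(self, other):
            # Called for ``self + other``
            return Vec2(self.x + other.x, self.y + other.y)

        def __eq__(self, other):
            # Called for ``self == other``
            return (self.x, self.y) == (other.x, other.y)

        def __getitem__(self, index):
            # Called for ``self[index]``
            return (self.x, self.y)[index]

        def __len__(self):
            # Called for ``len(self)``
            return 2


    v = Vec2(1, 2)
    w = Vec2(3, 4)
    total = v + w              # uses __add__
    same = (v == Vec2(1, 2))   # uses __eq__
    first = w[0]               # uses __getitem__

None of these methods is ever called by name. Python looks them up when it encounters the corresponding operator or built-in function.
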
Below is a table of many of the most common special methods:

+-----------------+--------------------------------------------------------------------------------+--------------------------------+
| Method Name     | Description                                                                    | Usage                          |
+=================+================================================================================+================================+
| ``__repr__``    | Returns the string shown when the object is echoed at the interactive prompt   | ``d`` or ``repr(d)``           |
+-----------------+--------------------------------------------------------------------------------+--------------------------------+
| ``__str__``     | Called when ``str`` or ``print`` is called on the object                       | ``str(d)`` or ``print(d)``     |
+-----------------+--------------------------------------------------------------------------------+--------------------------------+
| ``__sub__``     | Called when using ``-``                                                        | ``d - d``                      |
+-----------------+--------------------------------------------------------------------------------+--------------------------------+
| ``__gt__``      | Called when using ``>``                                                        | ``d1 > d2``                    |
+-----------------+--------------------------------------------------------------------------------+--------------------------------+
| ``__getitem__`` | Called when trying to index the object                                         | ``d1[2]``                      |
+-----------------+--------------------------------------------------------------------------------+--------------------------------+
| ``__cmp__``     | Implements the full suite of comparison operators (Python 2 only)              | ``d > d`` or ``d == d1``, etc. |
+-----------------+--------------------------------------------------------------------------------+--------------------------------+
| ``__isub__``    | Called when using ``-=``                                                       | ``d -= 1``                     |
+-----------------+--------------------------------------------------------------------------------+--------------------------------+
| ``__len__``     | Called when using ``len()``                                                    | ``len(d)``                     |
+-----------------+--------------------------------------------------------------------------------+--------------------------------+
| ``__call__``    | Called when using ``d()``                                                      | ``d()``                        |
+-----------------+--------------------------------------------------------------------------------+--------------------------------+

I have barely scratched the surface of the special methods in python. For example, in addition to ``__sub__`` there are the ``__mul__``, ``__div__`` (``__truediv__`` in Python 3), and ``__add__`` methods that implement other basic mathematical operations. Likewise, in addition to ``__gt__``, there are 5 other comparison operator methods, used when calling ``<``, ``<=``, ``>=``, ``==``, and ``!=``. See the official `Python docs on special methods`_ and this great site on `Operator Overloading`_ for more in-depth examples.

Let's add a few of these to the ``Die`` class:

.. literalinclude:: ../resources/Code/yhatzee.py
    :pyobject: Die
    :linenos:
    :emphasize-lines: 5, 6, 8-33

I will now test some of these out:

.. ipython:: python
    :suppress:

    execfile('source/resources/Code/yhatzee.py')

.. ipython:: python

    d1 = Die(6, 5)
    d2 = Die(6, 4)
    d1
    d2
    d1 > d2
    d2 <= d1
    d1 == d2

Subclassing
-----------

Another term in object-oriented programming is subclass. A subclass is a class that is a child of, or inherits from, another class. When this happens, all methods and data that are part of the parent class (also called the superclass) become part of the subclass. There is the added benefit, however, that you can override the default behavior of the parent class.

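Before turning to a fuller example, here is a minimal sketch of inheritance and overriding. It assumes a stripped-down ``Die`` like the one defined earlier in this lab. The ``LoadedDie`` class is hypothetical and exists only for illustration: it inherits ``__init__`` unchanged and overrides ``roll`` so that the die always lands on its highest face.

.. code-block:: python

    from random import randint


    class Die(object):
        def __init__(self, sides=6, dots=None):
            self.nsides = sides
            self.dots = dots

        def roll(self):
            self.dots = randint(1, self.nsides)


    class LoadedDie(Die):
        """Hypothetical subclass: inherits __init__, overrides roll."""

        def roll(self):
            # Override the parent's behavior: always show the highest face
            self.dots = self.nsides


    fair = Die(6)
    loaded = LoadedDie(6)   # __init__ is inherited from Die
    fair.roll()             # random face between 1 and 6
    loaded.roll()           # always the highest face

Because ``LoadedDie`` only redefines ``roll``, everything else (here just ``__init__``) comes from ``Die``, and ``isinstance(loaded, Die)`` is ``True``.
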
As an example, suppose we wanted to create a new class called ``StandardDie``. This class would be a subclass of the ``Die`` class we wrote earlier, but we would only allow it to have 6 sides. Suppose also that we wanted to update the printing system for the class and give it some very basic ASCII art representing the face of the die. I have done this below:

.. literalinclude:: ../resources/Code/yhatzee.py
    :pyobject: StandardDie
    :linenos:

We will test to make sure the ``StandardDie`` class has all the comparison operators and the ``roll`` method that were only implemented in the ``Die`` class, and also make sure that the new printing system took hold:

.. ipython:: python
    :suppress:

    class StandardDie(Die):
        def __init__(self, dots=1):
            super(StandardDie, self).__init__(sides=6, dots=dots)
            self.dots = dots

        def __repr__(self):
            if self.dots == 1:
                msg = "\n x\n"
            elif self.dots == 2:
                msg = "x\n\n x"
            elif self.dots == 3:
                msg = "x\n x\n x"
            elif self.dots == 4:
                msg = "x x\n\nx x"
            elif self.dots == 5:
                msg = "x x\n x\nx x"
            elif self.dots == 6:
                msg = "x x\nx x\nx x\n"
            else:
                msg = ""
            return msg

.. ipython:: python

    d1 = Die(6, dots=4)
    d2 = StandardDie(4)  # remember only dots is a parameter here
    d1
    d2
    d1 > d2
    d1 == d2
    d2.roll()
    d2

    def check_inheritance(x):
        return isinstance(x, Die), isinstance(x, StandardDie)

    # Should be True, False
    check_inheritance(d1)

    # Both True
    check_inheritance(d2)

Yahtzee Example
---------------

I have taken this dice example and extended it to create the game Yahtzee. You can download the file :download:`here <../resources/Code/yhatzee.py>`, but I will include the other pieces here for completeness.

.. literalinclude:: ../resources/Code/yhatzee.py
    :linenos:

.. todo::

    Using the provided Yahtzee game as an example, create the dice game Farkle. You can find the `Farkle rules`_ at the linked website. A lot of the logic for prompting players and controlling the game is shown in the Yahtzee example.

.. note::

    For this problem and the ones to follow you may need to search the web for more guidance and examples of OOP in practice. I have only included this one example by design, as I feel that at this point in your programming careers you should learn to be comfortable searching for answers on your own. We are, of course, willing to help, but we encourage you to do as much as you can with your fellow boot-campers and the internet.

Distribution Families
=====================

As you learned during an economics lecture earlier in bootcamp, there are many different distributions. Many of these distributions are related in a tree-like manner. Below is an image of a distribution tree for the skewed generalized T (SGT) distribution family:

.. image:: ../resources/SGT_family.png

The pdf for the SGT (top of the tree) is given below:

.. math::

    SGT(\epsilon; \lambda, s, p, q) = \frac{p}{2 s q^{1/p} \beta \left(\frac{1}{p}, q \right) \left(1 + \frac{|\epsilon|^p}{q s^p (1 + \lambda \text{sign}(\epsilon))^p} \right)^{q + 1 / p}}

where :math:`\beta(a, b)` is the beta function of :math:`a` and :math:`b`.

.. todo::

    Implement the entire SGT family from the diagram. For each distribution you need to have attributes representing the support and, when possible, the first two moments (mean and standard deviation) of the distribution. You must also have methods for the pdf and the cdf of the distribution. Note that often the cdf has no closed-form representation. In such cases, approximate the cdf numerically using an integration routine such as ``scipy.integrate.quad``. Get the limits of integration from the support you included as an attribute of the class.

    Hint: By far the easiest way to do this problem is to come up with a good implementation for the distribution at the top of the tree, in this case the SGT. You can then work your way down the tree by subclassing the more general distributions and pinning parameters down, similar to how I subclassed ``Die`` when I created ``StandardDie``.

    Hint2: You will find the function ``scipy.special.beta`` useful.

Economic Application
====================

.. FIXME: Chase, this came from section 3 of the draft of their paper. Talk to Jeremy and/or Kerk if you need a copy of it.

Disclaimer: This problem comes from a working paper by Christian Baker, Jeremy Bejarano, Rick Evans, Ken Judd, and Kerk Phillips (they might be good people to ask for help with understanding the problem). In the working paper, they develop a sales tax model that features households and the government (it is partial equilibrium because they don't compute the production side of the market). In this section of the lab, we will ask you to implement the household's problem using the object oriented concepts you have learned in the lab.

The Model
---------

Let the economy be characterized by :math:`I` different consumption goods :math:`c_i`, where :math:`i = 1, 2, \dots, I`. Define aggregate consumption :math:`C` by the constant elasticity of substitution (CES) aggregator:

.. math::

    C \equiv \left(\sum_{i=1}^I \alpha_i(c_i - \bar{c}_i)^{\frac{\eta - 1}{\eta}} \right) ^{\frac{\eta}{\eta - 1}}

where :math:`\eta \ge 1` is the elasticity of substitution among all of the consumption goods, :math:`\alpha_i \in [0, 1]` is the weight on the consumption of each type of good with :math:`\sum_i \alpha_i = 1`, and :math:`\bar{c}_i \ge 0` is a minimum level of consumption for each type of good.

The constant relative risk aversion (CRRA) utility function for the individual with CES preferences over consumption is the following:

.. math::

    u(C) = \frac{C^{1 - \gamma} - 1}{1 - \gamma}

where the aggregate consumption of an individual :math:`C` is defined above and :math:`\gamma` is the coefficient of relative risk aversion.

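The two definitions above translate almost directly into code. The following is a minimal sketch, not the implementation from the working paper; the function names ``ces_aggregate`` and ``crra_utility`` and the example parameter values are assumptions made for illustration. It assumes :math:`\eta > 1` and :math:`c_i > \bar{c}_i` for every good, and it uses the logarithmic limit of the CRRA form when :math:`\gamma = 1`.

.. code-block:: python

    import math


    def ces_aggregate(c, cbar, alpha, eta):
        """CES aggregate of consumption c above the minimum levels cbar."""
        rho = (eta - 1.0) / eta
        total = sum(a * (ci - cb) ** rho for a, ci, cb in zip(alpha, c, cbar))
        return total ** (1.0 / rho)


    def crra_utility(C, gamma):
        """CRRA utility of aggregate consumption C."""
        if gamma == 1.0:
            # Limit of the CRRA form as gamma approaches 1
            return math.log(C)
        return (C ** (1.0 - gamma) - 1.0) / (1.0 - gamma)


    # Illustrative values only: I = 3 goods with equal weights
    alpha = [1.0 / 3, 1.0 / 3, 1.0 / 3]
    cbar = [0.0, 0.0, 0.0]
    c = [1.0, 1.0, 1.0]
    C = ces_aggregate(c, cbar, alpha, eta=2.0)
    u = crra_utility(C, gamma=2.0)

In an object oriented design, functions like these would naturally become methods of an agent class, with :math:`\eta`, :math:`\gamma`, :math:`\alpha_i`, and :math:`\bar{c}_i` stored as attributes instead of being passed in on every call.
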
Let the price of consumption good :math:`i` be determined by the competitive equilibrium assumption of marginal cost pricing with the additional sales tax levied on good :math:`i`. If :math:`mc_i` is the marginal cost of producing good :math:`i` and :math:`\tau_i` is the sales tax rate on good :math:`i`, then competitive equilibrium implies that the price of each consumption good :math:`p_i` is given by

.. math::

    p_i = (1 + \tau_i) mc_i

We can normalize the marginal cost of each good to unity, :math:`mc_i = 1` for all :math:`i`, by simply changing the units of each good. The price of each good then simplifies to its normalized version.

.. math::

    p_i = 1 + \tau_i

Given a nominal wage of :math:`w`, the household budget constraint is:

.. math::

    \sum_{i=1}^I (1 + \tau_i) c_i \le w

The household's problem is to choose a consumption basket :math:`\{c_i\}_{i=1}^I` to maximize utility subject to this budget constraint.

In this example, we will allow consumers to be heterogeneous in terms of their elasticity of substitution :math:`\eta` among different consumption goods and in terms of income :math:`w`. So :math:`\theta = (\eta, w) \in \Theta = [1, \infty) \times (0, \infty)`. Let the joint distribution over consumer types in the economy be :math:`\Gamma(\eta, w) = \Gamma(\eta) \Gamma(w)`, where :math:`\eta \sim ([\eta_{min}, \eta_{max}])` and :math:`w \sim GB2(a, b, p, q)`.

Now we can write the consumer's optimization problem in terms of vectors of variables,

.. math::

    \max_c u(\boldsymbol{c}; \eta, w, \tau) \text{ s.t. } w \ge \sum_{i=1}^I (1 + \tau_i) c_i \text{ and } c_i \ge \bar{c}_i \forall i

where :math:`\boldsymbol{c} = \{c_i\}_{i=1}^I` and :math:`\tau = \{\tau_i\}_{i=1}^I`. If the budget constraint binds and :math:`c_i \ge \bar{c}_i` for all :math:`i`, the solution to the problem can be summarized by :math:`I - 1` Euler equations:

.. math::

    \alpha_i(c_i - \bar{c}_i)^{\frac{-1}{\eta}} = \alpha_I \frac{1 + \tau_i}{1 + \tau_I}(c_I - \bar{c}_I)^{\frac{-1}{\eta}} \text{ for } i = \{1, 2, \dots, I - 1\}

where :math:`w` and all the :math:`\tau_i` are introduced into this equation because good :math:`c_I` is substituted out of the utility function using the budget constraint.

The solution to this problem is a set of individual consumption functions :math:`c_i(\eta, w, \tau)` that are functions of consumer types :math:`(\eta, w)` and tax rates :math:`\tau`. This household has equilibrium utility :math:`u\left(\boldsymbol{c}(\eta, w, \tau) \right)` and total taxes paid of :math:`r(\eta, w, \tau) = \sum_{i=1}^I \tau_i c_i(\eta, w, \tau)`. However, for some sales tax schedules :math:`\tau` the budget constraint and the constraints :math:`c_i \ge \bar{c}_i` cannot both be satisfied.

.. todo::

    In order to simplify this problem, we will permit you to assume that :math:`I=3`. Now write a python class for the agents in this model where there are subclasses for each agent (i.e. there should be a subclass for each individual). The inputs should be a wage, an elasticity of substitution (:math:`\eta`), and a tax schedule. Each of these subclasses should be able to return the following: the values for consumption, capital, amount paid in taxes, and utility. The class should also be able to return both the steady state and current values of government revenue and overall welfare (utility).

.. todo::

    Given a discrete support for :math:`w` and :math:`\eta` and a tax schedule :math:`\tau`, do some really cool stuff. Your wage and elasticity values will respectively be :math:`(.15, .85, 1.5)` and :math:`(2, 2, 3)`, where each value corresponds with one of the agents. Impose two different tax regimes: the first is a constant tax rate of .35 and the other is a tax bracket where :math:`\tau = .5` if :math:`w>1` and :math:`\tau = .2` if :math:`w \leq 1`. For each of these cases create a table comparing each agent's utility, overall welfare, and government revenue (you can sum their utility and tax revenue to get overall welfare and government revenue). Which tax regime do you think is better, and why?

..
   .. math::

       U = \int_{\eta} \int_w \Gamma(\eta, w) u(\boldsymbol{c}(\eta, w, \tau)) dw d \eta

       R = \int_{\eta} \int_w \Gamma(\eta, w) r(\eta, w, \tau) dw d \eta

.. todo: Give them a distribution tree to do.

.. Links on the page

.. _Python docs on special methods: http://docs.python.org/2/reference/datamodel.html#special-method-names
.. _Operator Overloading: http://docs.python.org/2/reference/datamodel.html#basic-customization
.. _Farkle rules: http://www.buzzle.com/articles/farkle-rules.html