Classes (alternative version, by baseline)

COMMENT: **This is my attempt at rewriting the introduction to the Classes section, to make it... useful. This was originally hosted at parlar.infogami.com**

COMMENT: **The target audience for this page is "people with some programming experience, but with little or no OO experience". For discussion, see the [comments section](#end).**

## What's a class?

If you're coming to Python from an object-oriented programming language (C#/Java/Smalltalk/Ruby, etc.), then you can probably skip over this section (though you might want to skim the parts where I discuss `self`).

If Python is your first programming language, though, or your first encounter with object-oriented programming, then you're definitely going to want to read this.

A class, in simple terms, is a feature that lets you keep a bunch of closely related things "together". Let's take a simple example to see why you would want classes.

### 3rd Grade Class

Assume that you're the teacher of a 3rd grade class (you know, the type of class with a bunch of little kids running around, nothing to do with programming languages). As the teacher, you've decided it's a good idea to stay current with the hip, new technologies (like Python!) to keep your students from getting ahead of you. So, after going through the first few chapters of this tutorial, you decide to build a little Python program to track some stats for your class.

We'll implement the program below in the interactive interpreter, for the sake of the example, but in real life you'd put this stuff into actual modules. I'll intersperse the interpreter code with comments, but keep in mind that all this code goes together.

So the first thing you want is a list of students in your class.
We'll use a list, and not a tuple, because you never know when the administration is going to give you more students!

    >>> student_list = ["Simon", "Mal", "River", "Zoe", "Jane", "Kaylee", "Hoban"]

You know how fickle the administration is, and with the housing market the way it is, people are moving all the time. So you're definitely going to need ways to add and remove kids from your class list:

    >>> def add_student(student):
    ...     student_list.append(student)
    ...
    >>> def remove_student(student):
    ...     student_list.remove(student)
    ...

Ok, so that works nicely. What now? Well, it'd be nice to track the grades of each student. Probably the easiest way to do that is to create a dictionary, where each key in the dict is a student name, and each value is a list of their marks.

    >>> student_marks = {}
    >>> for student in student_list:
    ...     student_marks[student] = []
    ...

So there we've initialized our `student_marks`, with no marks for any student yet. Now we should make a function to add marks.

    >>> def add_mark(student, mark):
    ...     student_marks[student].append(mark)
    ...

What if we want a function to change a mark, though? That's a lot trickier. To change a mark, we need to know a few things. First, we need to know the student; that's easy. Second, we need to know either _where_ in `student_marks[student]` the old mark sits, or what the old mark is. Here is a possible way to do this:

    >>> def change_mark(student, oldmark, newmark):
    ...     # If you know the old mark
    ...     temp_mark_list = student_marks[student]
    ...     position = temp_mark_list.index(oldmark)
    ...     temp_mark_list[position] = newmark
    ...

So we've given a simple way to change the marks, if you know the old mark and the new mark you want to use.

As a final function, let's add a class attendance feature. We'll assume that most days, the entire class will be there. So the function will, by default, say that everyone was there on a certain day.
We will, though, pass in an optional list of names of people who weren't there. First, we need another dictionary, to track attendance:

    >>> student_attendance = {}
    >>> for student in student_list:
    ...     student_attendance[student] = 0
    ...

This time, we're initializing the dictionary with zeros, i.e. the number of days each student has attended class so far. Every day, that number will increase by one for each student who is there.

    >>> def another_day(absent = []):
    ...     for student in student_list:
    ...         if student not in absent:
    ...             old_attendance = student_attendance[student]
    ...             student_attendance[student] = old_attendance + 1
    ...

So now we can call `another_day`, and pass in an optional list of students who aren't there. Everyone else will have their attendance increase by 1.

So this is great and all, and works fine right now. But what if we want to start adding a bunch of other teaching-related functions, lists and dictionaries into this file? Suddenly, at the top level of the file, we'll have a lot of different lists and dictionaries defined (like `student_attendance`, `student_marks`, etc.), a whole lot of functions (`another_day`, `add_mark`, etc.) and no way to tell what goes together with what. In other words, which functions need which variables, and how is everything related?

Answering that is _essentially_ what classes do. They provide "encapsulation", a way of grouping together things that logically relate to each other.

So how do we do this? The first thing we have to do is create a "class". The class is the "thing" that will group together common elements. Let's call our first class `Student`. For each student in the class, so far, we're tracking a lot of different things, in different variables. We're tracking the student's name (in `student_list`), then we have a mapping between their name and their marks (`student_marks`), and a mapping between the name and their attendance (`student_attendance`). It'd be really nice to keep all the information for each student together, in one place.

    >>> class Student:
    ...     def __init__(self, name):
    ...         self.name = name
    ...         self.attendance = 0
    ...         self.marks = []
    ...

Ok, so some of that definitely looks kind of crazy at this point, but some of it probably makes sense. For instance, `self.attendance = 0` and `self.marks = []` should look at least a bit familiar.

So what exactly are we doing here? Well, first off, we're declaring that we are creating a new class, with `class Student:`. The name of this class is `Student`. So, that's fine, nothing too tough there. But what is that next silly looking thing, the `__init__(self, name)`? That's called a "constructor". It is a special function of the class that is called whenever we create a new instance of the class. Wow, lots of terminology there. Maybe a simple example will help.

    >>> class ExampleClass:
    ...     def __init__(self, some_message):
    ...         self.message = some_message
    ...         print "New ExampleClass instance created, with message:"
    ...         print some_message
    ...
    >>> first_instance = ExampleClass("message1")
    New ExampleClass instance created, with message:
    message1
    >>> second_instance = ExampleClass("message2")
    New ExampleClass instance created, with message:
    message2

So what have we done there? Well, we created a new class, called `ExampleClass`. In the constructor (`__init__`), we print out a message when a new instance gets created. After defining the class, we created two new instances, `first_instance` and `second_instance`. When we created them, we can see that the `print` statements in the `__init__` function got called, and more importantly, the value we passed to the class (i.e. "message1" in `ExampleClass("message1")`) gets passed to the `__init__` function.

Ok, so that's fine, but what's up with the `self` as the first argument to the `__init__` function? Every function in a class (functions in classes are actually called "methods"; I'll call them that from now on) has to take `self` as the first argument.
For anyone coming from another object-oriented language, this will seem VERY strange. For new programmers, it will just seem annoying. For now though, have faith that it's needed, and you'll understand why later. After the `self`, you can start putting the "real" arguments to the method, the ones you care about.

So what arguments did we define? Just `some_message`. And what is this `some_message` used for? Well, in this example, we used it when we did `print some_message`, but more interestingly, we used it to do `self.message = some_message`. So what's that all about? By doing `self.message =`, we created something called an "attribute". An attribute (as the name implies) is a piece of information that belongs to the instance. Once we assign that attribute, we can access it from outside the class, like so:

    >>> first_instance.message
    'message1'
    >>> second_instance.message
    'message2'

See that? We assigned the attribute in the `__init__` constructor, and now we can access that attribute from outside the class!

Is the `Student` class making more sense now? Let's create an example instance of it, and see what happens:

    >>> bobby = Student("Bobby")
    >>> bobby.name
    'Bobby'
    >>> bobby.attendance
    0
    >>> bobby.marks
    []

Isn't that MUCH nicer than having to keep three separate lists/dictionaries? All the information for the student "Bobby" is kept in one single place, an instance of the `Student` class.

And remember, it's not just from _outside_ the class that you can access these attributes. You can of course access them from within the class. Any attribute tied to `self` (like we did with `self.name`, `self.attendance` and `self.marks`) essentially becomes a global variable for that *instance*. So anytime you do anything with that instance, the value of the attribute is still around. Any variables you create inside a method that aren't prefixed with `self.` will be local variables, only around during a particular call to that method. Let's see an example of that.
We'll redefine our `Student` class as follows:

    >>> class Student:
    ...     def __init__(self, name):
    ...         self.name = name
    ...         self.attendance = 0
    ...         self.marks = []
    ...         number_of_marks = len(self.marks)
    ...         print "%s marks so far!" % number_of_marks
    ...
    >>> b = Student("Bobby")
    0 marks so far!
    >>> b.marks
    []
    >>> b.number_of_marks
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    AttributeError: Student instance has no attribute 'number_of_marks'

So what happened there? In the `__init__` function, we created three attributes, `name`, `attendance` and `marks`. We know they are attributes because we put `self.` in front of them. We also created a local variable though, `number_of_marks`. As stated above, local variables only hang around for as long as the function is executing. Once the `__init__` function is done, any local variable created in it will go away. That's why when we tried to do `b.number_of_marks`, we got an `AttributeError` exception.

And remember that the values of attributes are unique to each instance. So if we do:

    >>> b = Student("Bobby")
    >>> m = Student("Mary")
    >>> b.name
    'Bobby'
    >>> m.name
    'Mary'

We can see that the `b` instance has its own value for the attribute `name`, and the `m` instance has its own value for that attribute.

So let's get a bit fancier and create a `StudentTracker`. This tracker will receive a list of student names as an argument to its constructor, and then will create a `Student` instance for EACH of those names:

    >>> class StudentTracker:
    ...     def __init__(self, initial_student_list):
    ...         self.student_names = initial_student_list
    ...         self.students = {}
    ...         for name in self.student_names:
    ...             self.students[name] = Student(name)
    ...

So, we created a nice attribute, `self.students`, which is a dictionary of `Student` instances (or objects; it is common to call an instance an "object"). We still need to be able to do stuff with those instances though.
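As a quick check on what that constructor builds, here is a minimal sketch that creates a tracker and looks inside `self.students`. The two classes are repeated (without the extra `print`) so the snippet runs on its own; the variable name `tracker` and the two-name list are only for illustration, and `print` is written with parentheses so the code runs under both Python 2 and Python 3.

```python
class Student:
    def __init__(self, name):
        self.name = name
        self.attendance = 0
        self.marks = []


class StudentTracker:
    def __init__(self, initial_student_list):
        self.student_names = initial_student_list
        self.students = {}
        for name in self.student_names:
            # One Student instance per name, stored under that name
            self.students[name] = Student(name)


# Illustrative names only
tracker = StudentTracker(["Simon", "Mal"])

simon = tracker.students["Simon"]   # look up the instance by name
print(simon.name)                   # Simon
print(simon.attendance)             # 0
print(len(tracker.students))        # 2
```

That covers storing the instances; acting on them is the next step.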
The way we'll do that is by defining some methods in the class. A method is a function that is specific to the class it's defined in. Here's a simple example:

    >>> class Multiplier:
    ...     def __init__(self, number):
    ...         self.number = number
    ...     def multiply_by(self, x):
    ...         return self.number * x
    ...

So this class will have one attribute, `self.number`. It also has one method, `multiply_by`, which takes another number, multiplies it by our original number, and returns the result. Let's see it in action.

    >>> f = Multiplier(10)
    >>> f.number
    10
    >>> f.multiply_by(5)
    50
    >>> f.number
    10

Does that make sense? We created an instance, and called it `f`. We then showed the attribute, `f.number`. We then called the method on the instance, by doing `f.multiply_by(5)`, which returned 5*10. Notice though that in our definition of `multiply_by`, we don't change the value of `self.number`, which is why it remains 10.

It is important to note _how_ we called the method. We can't just do `multiply_by(5)`, we have to say `f.multiply_by(5)`. Why is that? Well, imagine what would happen if we had created two separate instances. How is Python supposed to know which one to call, unless you tell it?

    >>> f = Multiplier(10)
    >>> g = Multiplier(20)
    >>> f.multiply_by(5)
    50
    >>> g.multiply_by(5)
    100

So we told Python which instance to call `multiply_by` on, and it did it, and everything worked perfectly!

So let's get back to our `StudentTracker`. We haven't yet defined any regular methods for it (we defined `__init__`, but that's a special method, and you're not supposed to call it yourself. Having `__` on both sides of the method name means you're not supposed to call it; it's a special method that Python will call by itself). Let's redefine our `Student` and `StudentTracker`, but this time with useful methods:

    >>> class Student:
    ...     def __init__(self, name):
    ...         self.name = name
    ...         self.attendance = 0
    ...         self.marks = []
    ...     def add_mark(self, mark):
    ...         self.marks.append(mark)
    ...     def present(self):
    ...         self.attendance = self.attendance + 1
    ...     def get_average(self):
    ...         return sum(self.marks) / len(self.marks)
    ...     def change_mark(self, oldmark, newmark):
    ...         position = self.marks.index(oldmark)
    ...         self.marks[position] = newmark
    ...     def __str__(self):
    ...         message = "Name: " + self.name + " "
    ...         message = message + "Attendance: " + str(self.attendance) + " "
    ...         message = message + "Average: " + str(self.get_average())
    ...         return message
    ...
    >>> class StudentTracker:
    ...     def __init__(self, initial_student_list):
    ...         self.student_names = initial_student_list
    ...         self.students = {}
    ...         for name in self.student_names:
    ...             self.students[name] = Student(name)
    ...     def another_day(self, absent = []):
    ...         for name in self.student_names:
    ...             if name not in absent:
    ...                 self.students[name].present()
    ...     def add_mark(self, name, mark):
    ...         self.students[name].add_mark(mark)
    ...     def change_mark(self, name, oldmark, newmark):
    ...         self.students[name].change_mark(oldmark, newmark)
    ...     def prettyprint_students(self):
    ...         for student in self.students.values():
    ...             print student
    ...

Almost everything there should be pretty self-explanatory at this point (except the `__str__`), but I'll point out a few key ideas.

The `__str__` method is another special method. It gets called when Python is told to convert something to a string (using the `str()` function), or when Python is told to print an instance. A small example is as follows:

    >>> class Foo:
    ...     def __str__(self):
    ...         return "I am an instance of Foo!!!"
    ...
    >>> f = Foo()
    >>> print f
    I am an instance of Foo!!!
    >>> str(f)
    'I am an instance of Foo!!!'

In our `__str__` method, we build up a nice long message, including the student's name, attendance, and mark average, and return that. Note that in our `__str__` method, we do `self.get_average()`. Just like when a class instance wants to access one of its own attributes, we must put `self.` in front of the method call.
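To see `__str__` and the `self.get_average()` call working together, here is a small sketch using a cut-down copy of the `Student` class above (attendance is left out of the message to keep it short). The marks are made-up float values, and `print` is written with parentheses so the snippet runs under both Python 2 and Python 3.

```python
class Student:
    def __init__(self, name):
        self.name = name
        self.attendance = 0
        self.marks = []

    def add_mark(self, mark):
        self.marks.append(mark)

    def get_average(self):
        return sum(self.marks) / len(self.marks)

    def __str__(self):
        # self.get_average() calls another method on the same instance
        return "Name: " + self.name + " Average: " + str(self.get_average())


bobby = Student("Bobby")
bobby.add_mark(80.0)   # made-up marks; floats so the average is not truncated
bobby.add_mark(90.0)

print(bobby)           # Python calls bobby.__str__() behind the scenes
summary = str(bobby)   # the same text, as a string: 'Name: Bobby Average: 85.0'
```

A `Student` with no marks yet makes `get_average` divide by zero, so printing one raises `ZeroDivisionError`. Add at least one mark before printing.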
Reminder about `self`: Note again that all the methods we defined had `self` as their first argument, but when we actually call a method, we don't pass it. That is a little bit of magic Python is doing for you: it fills in `self` with the instance the method was called on. It should make more sense when you get deeper into Python programming. For now, just trust that when you _define_ a method, you need `self` as the first argument, but when you _call_ a method, you can ignore the `self`.

Notice the shorthand in `another_day` and in the `StudentTracker` versions of `add_mark` and `change_mark`. In `another_day`, we have the following line:

    self.students[name].present()

You've probably figured out what that does, but just in case, I'll explain it. Remember what `self.students` is, right? It's a dictionary, where the keys are the students' names, and the values are instances of the `Student` class. So if we do `self.students[name]`, that returns an instance, right? So, we would normally do:

    student = self.students[name]
    student.present()

But if the only thing we need to do with the instance right now is call one method, why waste space? We can instead just do what we did above, namely:

    self.students[name].present()

So, the `self.students[name]` part of that is executed first, and it returns the instance object. Python then calls `.present()` on that instance object. This is an idiom you'll see all the time in Python code (and in most object-oriented programming languages), so make sure you understand it. We did the exact same thing in the `StudentTracker` version of `add_mark`, namely:

    self.students[name].add_mark(mark)

And that ends our mini introduction to what classes are. The further sections in this chapter will go into more detail.

I leave it as an exercise to the reader to actually try these out. Create a `StudentTracker` instance with some names, play around a bit, try to break the code (there's no error handling, so there should be a few ways to break it).
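As a starting point for that experimenting, here is one possible session, written as a plain script instead of interpreter input. It is only a sketch: the classes are condensed copies of the ones above (the `__str__` and printing methods are left out, and the tracker's `change_mark` takes the student's name as `name`), the names and marks are made up, and `print` uses parentheses so it runs under both Python 2 and Python 3.

```python
class Student:
    def __init__(self, name):
        self.name = name
        self.attendance = 0
        self.marks = []

    def add_mark(self, mark):
        self.marks.append(mark)

    def present(self):
        self.attendance = self.attendance + 1

    def change_mark(self, oldmark, newmark):
        position = self.marks.index(oldmark)
        self.marks[position] = newmark


class StudentTracker:
    def __init__(self, initial_student_list):
        self.student_names = initial_student_list
        self.students = {}
        for name in self.student_names:
            self.students[name] = Student(name)

    def another_day(self, absent=[]):
        for name in self.student_names:
            if name not in absent:
                self.students[name].present()

    def add_mark(self, name, mark):
        self.students[name].add_mark(mark)

    def change_mark(self, name, oldmark, newmark):
        self.students[name].change_mark(oldmark, newmark)


tracker = StudentTracker(["Simon", "Mal", "River"])
tracker.another_day()                  # everyone present
tracker.another_day(absent=["Mal"])    # Mal misses a day
tracker.add_mark("River", 95)
tracker.change_mark("River", 95, 98)

print(tracker.students["Simon"].attendance)   # 2
print(tracker.students["Mal"].attendance)     # 1
print(tracker.students["River"].marks)        # [98]

# One way to break it: a name the tracker has never seen
try:
    tracker.add_mark("Jayne", 70)
except KeyError as error:
    print("No such student: %s" % error)
```

The `try`/`except` at the end is there only so the script keeps running; typed directly into the interpreter, the same call shows a `KeyError` traceback.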
Messing around and experimenting with it will be the best way to learn.

And to continue with Python, it is pretty important that you learn how classes work. Most Python code is written with classes, and most of the standard library is written with classes; it's just the way things are done. So even if you don't ever want to write your own classes, you'll have to understand how they work if you want to use other people's code.

-------------------------------

## Suggested Alternative Intro

### A First Example

Imagine that you're a teacher and, after going through the first few chapters of this tutorial, you've decided to build a little Python program to keep track of some stats for your students.

So how do you start? Well, the first thing you want is a list of your students:

    >>> student_list = ["Simon", "Mal", "River", "Zoe", "Jane", "Kaylee", "Hoban"]

(We'll implement the program in the interactive interpreter, for the sake of the example, but in real life you'd put this stuff into actual modules. I'll intersperse the interpreter code with comments, but keep in mind that all this code goes together.)

You know how fickle the administration is, and with the housing market the way it is, people are moving all the time. So you're definitely going to need ways to add and remove kids from the list:

    >>> def add_student(student):
    ...     student_list.append(student)
    ...
    >>> def remove_student(student):
    ...     student_list.remove(student)
    ...
    >>> remove_student("Mal")
    >>> add_student("Bill")
    >>> print student_list
    ['Simon', 'River', 'Zoe', 'Jane', 'Kaylee', 'Hoban', 'Bill']

Ok, so that works nicely. What now? Well...