Going from beginner to ... to what actually?

When I started this project, I started off hacking my way through Python as if it were a Matlab-clone, using Python tutorials and online resources. Soon enough I realized that wasn’t getting me anywhere, because even opening a file and putting its contents in an array was too much to ask.

After that, I decided to start reading Learning Python to get a better grasp of the basics. After finishing it, I was capable of most of the basic tasks, I could use code samples provided by Stack Overflow users, and I thought I had an idea of what I should be doing. However, every ‘beginner’ book ends with object-oriented programming and classes. They end there for a good reason: using these tools clearly distinguishes you from a beginner and takes you into the realm of journeyman programmers on their way to becoming masters. Sadly, it becomes a lot harder to find good books that help you get beyond the initial beginner status.

That became painfully obvious when I wanted to apply my paw detection to all measurements of a dog. All of a sudden I had to make my script apply itself to multiple files and keep track of them in a sensible way. This takes you into the realm of classes, but if you have ever seen an entry-level example of a class:

# We can create a class that supports that:
class BalanceError(Exception):
    value = "Sorry you only have $%6.2f in your account"

class BankAccount:
    def __init__(self, initialAmount):
        self.balance = initialAmount
        print("Account created with balance %5.2f" % self.balance)

    def deposit(self, amount):
        self.balance = self.balance + amount

    def withdraw(self, amount):
        if self.balance >= amount:
            self.balance = self.balance - amount

you surely understand what a daunting task it is to design one yourself the first time. So I decided to ask another SO-question to help address this problem. Even though S.Lott’s answer was very helpful, I guess I was aiming a bit too high as it didn’t really help me understand how to get the code working the way I intended.

Luckily after some messing around I did manage to get it working, but I ran into my next problem. I had divided the code into three classes:

Dogs, which requires the folder path and the name of the dog; it then lists all the files present for this dog and builds a list of their file paths, so they are easy to load. This class is also intended to apply Measurements to each file and stuff the results into a database.
Measurements, which loads the file that’s passed to it and returns a slice of the array (basically coordinates) and the data from this slice (an array) for each paw.
Paws, which would detect the toes within a paw and return their coordinates and values.
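A rough sketch of how these three classes could hang together, not the actual project code. It assumes each dog's files live in a sub-folder named after the dog, and `detect_paws` is a hypothetical stand-in for the paw detection that returns (coordinates, contact) pairs for one file. The singular class names are also just a choice made for the sketch.

```python
import os


class Paw(object):
    """One paw contact: where it was on the plate and its pressure data."""

    def __init__(self, coordinates, contact):
        self.coordinates = coordinates  # where the contact was found
        self.contact = contact          # pressure values for that region
        self.label = None               # e.g. "left front", unknown until sorted


class Measurement(object):
    """One measurement file, holding the paw contacts found in it."""

    def __init__(self, path, detect_paws):
        self.path = path
        # detect_paws is a hypothetical callable standing in for the paw
        # detection; it returns a list of (coordinates, contact) pairs.
        self.paws = [Paw(c, d) for c, d in detect_paws(path)]


class Dog(object):
    """All measurement files found for one dog."""

    def __init__(self, folder, name, detect_paws):
        self.name = name
        # Assumed layout: <folder>/<name>/<measurement files>
        dog_folder = os.path.join(folder, name)
        self.paths = sorted(
            os.path.join(dog_folder, f) for f in os.listdir(dog_folder)
        )
        self.measurements = [Measurement(p, detect_paws) for p in self.paths]

    def all_paws(self):
        """Every paw contact across all of this dog's measurements."""
        return [paw for m in self.measurements for paw in m.paws]
```

Anything that needs to look at contacts from several files at once can go through `all_paws()`, since only the dog-level object knows about more than one file.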

I haven’t done any coding on a paw level yet, as I haven’t sorted the paws yet. And that’s what’s bothering me: each slice of data needs to know to which paw it belongs. This sorting should be in Measurements, as the measurement knows where a slice of data is relative to the others.

However, I have a measurement log that tells me, for each measurement, which paw touched the plate first. I figured that I could take all the paws, throw them on a heap, cluster them into 4 groups and then count how many of my first paws were in each group. If there is enough similarity between each print of the same paw, but enough difference between the 4 different paws, this should create four groups and let the log tell me what the front paws are. Then I only have to sort the hind paws, which shouldn’t be too much of a problem.
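The counting step of that idea could look roughly like this. It is only a sketch: it assumes some clustering step has already given every contact a cluster id, and the function name and the paw names are made up for illustration.

```python
from collections import Counter


def name_clusters(cluster_ids, first_paw_names):
    """Give each cluster the paw name the log most often assigns to it.

    cluster_ids: one cluster id per contact, from any clustering step.
    first_paw_names: same length; the paw name from the measurement log
        when that contact was the first one of its measurement, else None.
    """
    votes = {}
    for cluster, name in zip(cluster_ids, first_paw_names):
        if name is not None:
            votes.setdefault(cluster, Counter())[name] += 1
    # Clusters that never contain a first contact stay unnamed.
    return {c: counts.most_common(1)[0][0] for c, counts in votes.items()}


# Hypothetical example: seven contacts spread over four clusters.
clusters = [0, 1, 0, 2, 1, 3, 0]
logged = ["left front", "right front", "left front", None,
          "right front", None, "right front"]
print(name_clusters(clusters, logged))
```

Here cluster 0 is named "left front" by two votes to one and cluster 1 is named "right front". Clusters 2 and 3 get no name, which matches the expectation that the log only helps with the front paws.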

BUT! If I want to group up all the measurements, that’s not something done within Measurements. Only Dogs knows there are even multiple files to begin with! So something tells me I don’t really know what I’m doing. Thankfully, I got a couple of additional books and I’m planning to try and work my way through them in the coming weeks. For now, I’ll focus on getting the sorting sorted out!

Currently my code returns a dictionary with the sliced-out array and the slice itself. Let’s call them contact, because the array describes the contact of a paw with the plate, and coordinates, because the slice is basically the X and Y coordinates over time (Z). I’ll put these in my Dropbox, so that I can share them with anyone interested in helping out.

While sorting the paws could be done with clustering, it’s perhaps much easier to keep it on a measurement level. Especially because each measurement should contain enough information to sort them already:

[Figure: entire plate with temporal information]

As you can see, there’s a clear, repeatable pattern to it. So perhaps a better approach would be to apply several heuristics and then decide which paw it is.

You see, the problem is that while it’s not so difficult to sort the paws for healthy dogs, this won’t necessarily be true for lame dogs. (Note: this project is for a veterinary clinic!) So I can’t rely on just one algorithm as it wouldn’t work for all the dogs that would be measured with it.

However, using heuristics should at least be a bit more robust. Some of the rules I’m thinking of are:

I always know which the first paw is, thanks to the log. Due to the tracking of the paw detection, this should hold true for other contacts of this paw.
The front and hind paw are connected both temporally and spatially: when the front paw is lifted, the hind paw should make contact close in both time and space. If not, the dog would simply fall over!
A pressure measurement is like a fingerprint. Each contact of a paw looks alike, so if you can identify one paw correctly, this will be true for the following contacts too. How do they look alike?

The duration of the contact will be very similar and is likely to be different between front and hind paws.
The pressure distribution, basically the location of the peak pressures, will be unique for each paw.
The path of the center of pressure can be used for clustering contacts. Furthermore, the sideways motion indicates whether it’s left or right. The patterns are also different between the front and hind paws, because of the anatomy of the legs and their function.
The pattern of the total pressure under the paw over time, based on the assumption that the ratio of weight bearing between the front and hind paws is 60-40%.
The pattern with which the toes come into contact with the ground and leave it again is another way of distinguishing between left and right and between front and hind paws.
The contact surface of the paws will be different, especially because the hind paws tend to be somewhat smaller.
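Several of these rules boil down to numbers that can be computed per contact. The sketch below covers four of them (duration, contact surface, total pressure over time and the path of the center of pressure). It assumes a contact is a list of frames, each frame a 2D list of pressure values. That layout and the function name are assumptions made for illustration, not how the project actually stores its data.

```python
def contact_features(contact):
    """Summarise one contact given as a list of frames (2D lists of pressure)."""
    totals = []      # total pressure per frame
    path = []        # center of pressure (row, col) per loaded frame
    touched = set()  # every sensor that carried load at some point
    for frame in contact:
        total = row_moment = col_moment = 0.0
        for r, row in enumerate(frame):
            for c, value in enumerate(row):
                if value > 0:
                    total += value
                    row_moment += r * value
                    col_moment += c * value
                    touched.add((r, c))
        totals.append(total)
        if total > 0:
            # Pressure-weighted average position within this frame.
            path.append((row_moment / total, col_moment / total))
    return {
        "duration": sum(1 for t in totals if t > 0),  # in frames
        "surface": len(touched),                      # in sensors
        "total_pressure": totals,
        "center_of_pressure": path,
    }


# Tiny made-up contact: three frames on a 2x2 patch of sensors.
contact = [
    [[0, 1], [0, 1]],
    [[2, 2], [0, 0]],
    [[0, 0], [0, 0]],
]
features = contact_features(contact)
print(features["duration"], features["surface"])
```

The peak pressure locations and the order in which the toes touch down are left out here, as they depend on the toe detection.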

So here we have a set of rules that need to be quantified and compared to a data set that has been manually sorted; after that, I have to figure out how to use them to sort other measurements as well.
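One simple way to compare a new contact with manually sorted data is to average the features per paw and pick the nearest average. This is a swapped-in technique to illustrate the idea, not a method the project has settled on; the function names and all numbers below are hypothetical.

```python
def mean_features(labelled):
    """Average feature vector per paw name from manually sorted contacts.

    labelled: list of (paw_name, [feature values]) pairs.
    """
    sums, counts = {}, {}
    for name, values in labelled:
        running = sums.setdefault(name, [0.0] * len(values))
        for i, v in enumerate(values):
            running[i] += v
        counts[name] = counts.get(name, 0) + 1
    return {n: [s / counts[n] for s in sums[n]] for n in sums}


def closest_paw(values, means):
    """Name of the paw whose average features are nearest to `values`."""
    def distance(name):
        # Squared Euclidean distance to that paw's average features.
        return sum((a - b) ** 2 for a, b in zip(values, means[name]))
    return min(means, key=distance)


# Hypothetical numbers: [duration in frames, surface in sensors].
sorted_by_hand = [
    ("front", [40, 30]), ("front", [42, 32]),
    ("hind", [30, 22]), ("hind", [32, 24]),
]
means = mean_features(sorted_by_hand)
print(closest_paw([41, 29], means))  # prints "front"
```

Features with a large numeric range dominate the distance in this form, so they would need to be scaled before mixing, say, durations with pressures.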

The first thing I’m going to do now is try to quantify these rules; I will update my post later on with the results. Once I have those, I’ll be able to write a Stack Overflow question asking for additional help and for useful built-ins I can use.
