I have been lucky enough to have a chance to do some of my more basic data processing in Python (with numpy/scipy/matplotlib). One of the things I’ve been working on is the effect of interference between the direct path and surface bounce arrivals from a fin whale that is near the surface. Hopefully I’ll have a post dedicated to that problem in particular, but for the time being, I’ve been trying to figure out the best way to animate a series of plots output each time the code goes through a loop.

There are a few options for doing this. From what I can gather, it’s not something that is easily done within Python (someone correct me if that’s wrong!). So instead, I’m creating a series of image files (.png’s), and converting to video outside Python. I looked at a couple of options and settled on ffmpeg, which seemed really simple.

I grabbed it from fink:

fink install ffmpeg

All of my image files were saved in a single folder, with names like 001.png, 002.png, … 198.png. Converting them to video format:

ffmpeg -i %03d.png -vb 1024K video.mpg

The -i flag is for input file(s) and -vb is video bitrate. Here’s the output:

Wow, it has been a long time since I actually played around with Python. My brain has melted into Matlab-only mode, and I am having a tough time remembering how to do the simplest things…

So here is the latest. Jaci was telling me about what she was working on in Python. She has a comma delimited file containing data about different types of plankton, and the plankton are listed by numbers. Then she has a different text file containing information on what the numbers mean. She wants to link the list of data to the list of id numbers and plankton names. It is entirely possible I mis-interpreted her problem, but it was fun to work this one out anyway – had to remember about lists and tuples and dictionaries and slicing, and also about how to read in a file line by line… good thing I still have some of my old scripts so I can copy and paste things that I forget!

Here are my fake data files:

critters.txt, which contains the actual data in comma-delimited format. the first column is the ID number, and the second column is the shoe size. Because different types of plankton wear different shoe sizes:

id,shoesize
10,6
11,7
10,2
10,3
12,4
11,14

idnumbers.txt, which contains the ID numbers in the first column, and the names that go with that plankton in the second column. (guys, I am not a biologist. Just sayin.):

# preparing variables
datalist = [] # actual data – id numbers
datavalue = [] # actual data – data value (shoe size in my file)
dataid = [] # dictionary – data id number
critterlookup = {} # dictionary – critter names

infile1 = open(critterfile)
infile1.readline() # skip header line, could instead save column names.
lines = infile1.readlines()
for line in lines:
splitup = line[:-1].split(‘,’) # ignore newline character, split on com\
ma
datalist.append(splitup[0])
datavalue.append(splitup[1])

infile2 = open(idfile)
infile2.readline()
lines2 = infile2.readlines()
for line in lines2:
splitup2 = line[:-1].split(‘,’)
critterlookup[splitup2[0]] = splitup2[1]

for cdex in datalist:
dataid.append(critterlookup[cdex])

It may not be the most efficient way to do it, but it seems to work! 🙂 Oh, and if you’re forgetful like me, it’s a good idea to hang on to every script or function you write, even the unfinished ones, because they can be super helpful later on.

Now that I’m (somewhat) on my way to getting ObsPy up and running, it sort of makes sense to figure out if I could realistically use it to do my work. You know, not just to play around and make pictures. Right now my data is pretty much all in a couple of Datascope databases, and I access it using Matlab. But a bit of Google searching led me to the Antelope Users Group on Github, where they have user-contributed software designed for Antelope and Datascope. And from what I can tell, there’s a Python interface somewhere in there.

Here are the steps for getting the Antelope Users Group contributed code:

1. Download and unzip the contributed code. It will look something like this:

Well, I’m officially hooked on Google +. I hope my friends and family all join me over there, because it’s way nicer. Still having a hard time convincing John though…

I finally got around to trying out ObsPy… well, okay, I am still trying to figure out how to install it properly. Finally made some headway though. I followed the instructions on the ObsPy installation page. I thought everything was going smoothly. For starters, I had all of the dependencies – most importantly Python, SciPy, NumPy, and Matplotlib. I also had the Easy_install package already from when I was installing Mercurial.

The installations all seemed to go just fine, I was getting messages telling me it was all good. But then it just didn’t work. I opened up iPython, tried to give it a whirl, and it just didn’t recognize those packages at all. Turns out the default installation location was not actually in the Python path. It sounds like it should be pretty easy to fix, but it took me a long time to figure it out.

So, normally, if you want to import from a Python library, you would use something like this:

from numpy import *

But when I tried importing obspy.core, like this:

from obspy.core import read

it couldn’t find the library. After much searching, I finally figured out how to add to my python path. First, say in iPython, import sys, and check the path:

import sys
sys.path

This will list what’s in the path. So for me, it was:

So, you might have noticed (okay, probably not) that I am now on post number 159. Yup, that’s a big jump from the 100-post mark a few days ago. How did I do that, you ask? Well, for one, I finally realized that I could still import posts from my original WordPress.com blog (duh. I’ve even done it before). So I added those 53 posts from late 2009/early 2010 – sweet! Now you can search back through the glorious dorkiness of my pre-grad school days. Well, actually my between grad school days. I used to be so into linux and python. What happened? I am going to make an effort to get back into it. What better time then summer, during the few sunny days we get here in Seattle.

Kim was asking me lots of questions about Python and Linux this afternoon before she left. It made me realize just how much I’ve forgotten (or how little I knew to begin with). And it was a good kick in the butt to remind me to get back to learning about, and embracing, the world of Ubuntu and Numpy and Matplotlib… Nerdy posts will resume! But since a few people very dear to me enjoy my daily sketches, I will endeavor to continue those as well. (I love you guys!)

The seismic data from the Neptune Canada ocean bottom seismometers (OBSs) is stored on the Iris database, and it’s in Miniseed format. This is a standard binary format for seismic data. I have used the Matlab toolbox for miniseed in the past, but was wondering about trying a Python version. Turns out there is a Miniseed reader for Python, and it’s part of the ObsPy python toolbox.

Before I’d stumbled upon this, I had entertained some brief thoughts about writing my own reader in Python, just so I could learn how to read binary data. I checked out a few sites, and was starting get the idea that it would be pretty tough for me to figure it out, and it would probably take me a very long time. Not that that would necessarily stop me, but if there’s already a tool out there, then I’m going to try that first.

Here’s an example showing a waveform plotted using the ObsPy.imaging tools:

I never did manage to resuscitate my old blog, but Google does cache some webpages, so I have access to my old posts that way for the time being. I’m not too worried about most of them, but a few were pretty handy. And since I’m interested in learning about inverse theory, this seems appropriate. So before it disappears forever, I will re-produce it here…

This post is actually a practice in least squares line fitting, linear algebra, Python, and Numpy, all mashed into one. In preparation for the possibility of learning geophysical inverse theory at some point in the future, I have backed up, waaay back to the beginning. Fitting a line to a set of points when there are more points than required to uniquely define the line.

I’m doing this example in 2D, and I’m basically following Gilbert Strang’s Linear Algebra, p 206-211 (section on least squares approximations).

I started with 3 points: (0,6)
(1,0)
(2,0)

I saved these in a file called “points.txt” (just for flexibility)

I started my Python script (lspractice.py) in the normal way, importing what I thought I’d need:

#!/usr/bin/python
# Fitting a line to a set of points
# Following an example from Linear Algebra by Gilbert Strang, p 206-211
# The points are in a file called points.txt
from numpy import *
from pylab import *

–*don’t forget to make executable using chmod +x lspractice.py

Next, I opened the text file and filled in my x and y variables:

# Open text file (comma delimited x/y coords)
f = open('points.txt',"r")
lines = f.readlines()
xi = []
yi = []
for line in lines:
fields = line.split(',')
xi.append(float(fields[0]))
yi.append(float(fields[1]))

Next – setting up the model:

# Filling these points into the equation of a line (the functional model)
# y = C + Mt
# F1: C + M(0) = 6
# F2: C + M(1) = 0
# F3: C + M(2) = 0
# unknowns are C and M. Take partial derivatives of each of the functions wrt C and M. These results will populate the design matrix.
dFdM = np.array(xi)
dFdb = tile(1,shape(dFdM))
A = np.array([dFdb,dFdM])
At = np.transpose(A)
b = (np.array(yi))

And finally, working out a solution for C and M:

# The solution to this type of problem is At*A*xhat = At*b
# OR xhat = inverse(At*A)*(At*b)
xhat = np.dot( linalg.inv(np.dot(A,At)) , np.dot(b,At))

And a little sanity check:

# Now a little graphical check
xtest = arange(min(xi)-3,max(xi)+5,1)
ytest = xhat[0] + xhat[1]*xtest
plot(xi,yi,'o')
plot(xtest,ytest,'.-r')
grid()
title('Fitting a line to 3 points using <strong>least</strong> <strong>squares</strong>')
show()

So this is what it looks like in the end. The original points are shown as blue dots. The line described by the parameters that minimize the sum of the squared residuals is in red. Yeah, it’s embarrassingly simple, but at least it’s a baseline to go on.

Also worth noting – I have not fully figured out Python/Numpy’s matrix multiplication yet. In order to get it to work, I had to use numpy.dot(B,A) when I wanted to do A*B (I had to reverse the order of the arrays). According to the documentation this should not be the case. So I’m stumped. Maybe I’m just not getting the array dimensions right to begin with. That’s what I get for being a Matlab user I guess 🙂

Yet another reason why I love Aquamacs… I love being able to do this in Matlab (Ctrl-T/Ctrl-R), but it always bugged me that I couldn’t figure out how to do this quickly while editing scripts in Emacs. I still don’t know how to do it in plain old Bash emacs, but I did find out how to do it in Aquamacs:

ctrl-c ;

This toggles between commenting and uncommenting. I still haven’t checked to see if I can do the same thing in Aquamacs with Python files. I suspect there’s a way. (actually I suspect there’s a way in command line emacs as well, I just haven’t found it).

Monica just blogged about Pylint, and I thought it was really cool – it checks your code for bugs and for quality based on Python standards. More information can be found at the Python Pylint page.