Python and plankton

Wow, it has been a long time since I actually played around with Python.  My brain has melted into Matlab-only mode, and I am having a tough time remembering how to do the simplest things…

So here is the latest.  Jaci was telling me about what she was working on in Python.  She has a comma-delimited file containing data about different types of plankton, and the plankton are listed by ID number.  Then she has a different text file explaining what the numbers mean.  She wants to link the data to the list of ID numbers and plankton names.  It is entirely possible I misinterpreted her problem, but it was fun to work this one out anyway.  I had to remember about lists and tuples and dictionaries and slicing, and also about how to read in a file line by line… good thing I still have some of my old scripts so I can copy and paste things that I forget!

Here are my fake data files:

critters.txt, which contains the actual data in comma-delimited format.  The first column is the ID number, and the second column is the shoe size.  Because different types of plankton wear different shoe sizes:

id,shoesize
10,6
11,7
10,2
10,3
12,4
11,14

idnumbers.txt, which contains the ID numbers in the first column, and the names that go with that plankton in the second column.  (guys, I am not a biologist.  Just sayin.):

id,name
10,copepod
11,diatom
12,cyanobacteria

And here is the Python script, lookuptest.py:

#!/usr/bin/python

# a quick lookup table: may not be the best way to do this. I’m new at this stuff!

critterfile = 'critters.txt'
idfile = 'idnumbers.txt'

# preparing variables
datalist = [] # actual data: id numbers
datavalue = [] # actual data: data value (shoe size in my file)
dataid = [] # critter name for each row of data, looked up by id number
critterlookup = {} # dictionary: id number -> critter name

infile1 = open(critterfile)
infile1.readline() # skip header line; could instead save the column names
lines = infile1.readlines()
infile1.close()
for line in lines:
    splitup = line.rstrip('\n').split(',') # strip the newline, then split on the comma
    datalist.append(splitup[0])
    datavalue.append(splitup[1])

infile2 = open(idfile)
infile2.readline() # skip header line
lines2 = infile2.readlines()
infile2.close()
for line in lines2:
    splitup2 = line.rstrip('\n').split(',')
    critterlookup[splitup2[0]] = splitup2[1] # id number -> critter name

for cdex in datalist:
    dataid.append(critterlookup[cdex])

It may not be the most efficient way to do it, but it seems to work! 🙂 Oh, and if you’re forgetful like me, it’s a good idea to hang on to every script or function you write, even the unfinished ones, because they can be super helpful later on.
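For comparison, here is a rough sketch of the same lookup using Python's built-in csv module. This is just one way it could be done, not necessarily what Jaci needs. It assumes Python 3, the helper names load_lookup and label_critters are made up for this example, and the two files are faked in memory with io.StringIO so the snippet runs without anything on disk. IDs that are missing from the lookup table get labeled 'unknown', which is my own choice.

```python
import csv
import io


def load_lookup(idfile):
    """Build {id: name} from a file-like object with 'id' and 'name' columns."""
    return {row["id"]: row["name"] for row in csv.DictReader(idfile)}


def label_critters(critterfile, lookup):
    """Return (name, shoesize) pairs; IDs missing from the lookup become 'unknown'."""
    return [
        (lookup.get(row["id"], "unknown"), row["shoesize"])
        for row in csv.DictReader(critterfile)
    ]


# In-memory stand-ins for idnumbers.txt and critters.txt (same fake data as above).
ids = io.StringIO("id,name\n10,copepod\n11,diatom\n12,cyanobacteria\n")
critters = io.StringIO("id,shoesize\n10,6\n11,7\n10,2\n10,3\n12,4\n11,14\n")

lookup = load_lookup(ids)
labeled = label_critters(critters, lookup)
for name, size in labeled:
    print(name, size)
```

With real files, the StringIO objects would be swapped for open('idnumbers.txt', newline='') and open('critters.txt', newline=''). Note that the IDs and shoe sizes stay as strings here, just like in my script; wrap them in int() if you need numbers.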
