Wednesday, January 19, 2011

Generate all possible proteins from ambiguous DNA

This had me stumped for a while, but this works pretty well.  It does NOT handle stop codons or gap characters like '-'.  Requires Biopython.


import itertools
from Bio.Seq import Seq
from Bio.Data import CodonTable
from Bio.Data import IUPACData

# Takes Bio.Seq.Seq object as input
# Returns list of all possible proteins
# Assumes sequence is in frame +1
def generateProtFromAmbiguousDNA(s):
   std_nt = CodonTable.unambiguous_dna_by_name["Standard"]
   nonstd = IUPACData.ambiguous_dna_values
   seq = str(s)  # str() works where the old Seq.tostring() is no longer available
   aa_trans = []
   for i in range(0, len(seq), 3):
      codon = seq[i:i+3]
      # amino acids this (possibly ambiguous) codon can encode
      aa = CodonTable.list_possible_proteins(codon, std_nt.forward_table, nonstd)
      aa_trans.append(aa)
   # one protein for every combination of per-codon choices
   return ["".join(x) for x in itertools.product(*aa_trans)]

def main():
   a = Seq('ATGGCARTTGTAHAC')
   print("DNA: " + str(a))
   print("Proteins:")
   for s in generateProtFromAmbiguousDNA(a):
      print(s)

if __name__ == '__main__':
   main()
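
The version above cannot cope with stop codons. As a rough sketch of one way around that, here is a pure-Python variant that does not use Biopython at all: it expands each ambiguous codon into its concrete codons and looks them up in a hand-built standard table, writing stops as '*'. The names IUPAC, CODON_TABLE, possible_amino_acids and proteins_with_stops are my own inventions for this example, not Biopython API. It still does not handle gap characters, and it silently ignores a trailing partial codon.

```python
import itertools

# IUPAC ambiguity codes mapped to the bases each one can stand for
IUPAC = {
   'A': 'A', 'C': 'C', 'G': 'G', 'T': 'T',
   'R': 'AG', 'Y': 'CT', 'S': 'CG', 'W': 'AT', 'K': 'GT', 'M': 'AC',
   'B': 'CGT', 'D': 'AGT', 'H': 'ACT', 'V': 'ACG', 'N': 'ACGT',
}

# Standard genetic code in TCAG order; '*' marks a stop codon
BASES = 'TCAG'
AMINO_ACIDS = 'FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG'
CODON_TABLE = dict(zip(
   (a + b + c for a in BASES for b in BASES for c in BASES),
   AMINO_ACIDS))

def possible_amino_acids(codon):
   """Return the sorted residues (or '*') an ambiguous codon can encode."""
   expansions = itertools.product(*(IUPAC[base] for base in codon.upper()))
   return sorted(set(CODON_TABLE[''.join(c)] for c in expansions))

def proteins_with_stops(dna):
   """Return every protein string the ambiguous DNA could encode (frame +1)."""
   usable = len(dna) - len(dna) % 3  # drop any trailing partial codon
   per_codon = [possible_amino_acids(dna[i:i+3]) for i in range(0, usable, 3)]
   return [''.join(p) for p in itertools.product(*per_codon)]

for protein in proteins_with_stops('ATGGCARTTGTAHAC'):
   print(protein)
```

A codon like 'TGR' comes back as both '*' and 'W', so the caller has to decide whether to truncate at the stop or keep it as a marker.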

Creating a quick codon table

I didn't think this up; the code comes from Peter Collingridge.  But it is rather elegant.


bases = ['t', 'c', 'a', 'g']
codons = [a+b+c for a in bases for b in bases for c in bases]
amino_acids = "F F L L S S S S Y Y stop stop C C stop W L L L L P P P P H H Q Q R R R R I I I M T T T T N N K K S S R R V V V V A A A A D D E E G G G G".split(' ')
codon_table = dict(zip(codons, amino_acids))
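
As a quick illustration of how the table might be used, here is a minimal translation sketch. The translate function is my own addition, not part of Peter Collingridge's code; it assumes unambiguous DNA in frame +1, halts at the first stop codon, and ignores a trailing partial codon.

```python
bases = ['t', 'c', 'a', 'g']
codons = [a+b+c for a in bases for b in bases for c in bases]
amino_acids = "F F L L S S S S Y Y stop stop C C stop W L L L L P P P P H H Q Q R R R R I I I M T T T T N N K K S S R R V V V V A A A A D D E E G G G G".split(' ')
codon_table = dict(zip(codons, amino_acids))

def translate(dna):
   """Translate frame +1 of a DNA string, halting at the first stop codon."""
   dna = dna.lower()  # the table keys are lowercase
   protein = []
   for i in range(0, len(dna) - len(dna) % 3, 3):
      aa = codon_table[dna[i:i+3]]
      if aa == 'stop':
         break
      protein.append(aa)
   return ''.join(protein)

print(translate('ATGGCAGTTTAA'))  # prints MAV
```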

Thursday, January 13, 2011

Update the locate database on the Mac

This is the command for updating the locate database on Mac OS X:

sudo /usr/libexec/locate.updatedb


I should figure out how to make this run every day.

Wednesday, January 5, 2011

Connecting to PostgreSQL with Python and Psycopg2

Basic syntax for making a database connection, executing a query, and retrieving data:



import sys
import psycopg2 as pg

# create database connection
try:
   conn = pg.connect("dbname='template1' user='dbuser' host='localhost' password='dbpass'")
except pg.Error as e:
   # without a connection nothing below can work, so report and stop
   sys.exit("Unable to connect to database: %s" % e)

# create database cursor
cur = conn.cursor()
# execute SQL and fetch results
cur.execute("""SELECT datname from pg_database""")
rows = cur.fetchall()
print("\nShow database results:\n")
for row in rows:
   print(row[0])

# release the cursor and connection
cur.close()
conn.close()
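
The execute/fetch steps above can be wrapped in a small helper that also passes query parameters separately from the SQL text, which is the safer habit once values come from user input. This is only a sketch: fetch_first_column is a made-up name, and the helper relies on nothing beyond the standard DB-API cursor methods (cursor, execute, fetchall, close), so psycopg2 itself is only shown in the commented usage. Note that psycopg2 uses %s as its placeholder.

```python
from contextlib import closing

def fetch_first_column(conn, sql, params=None):
   """Run a query on a DB-API connection and return column 0 of every row."""
   # closing() makes sure the cursor is released even if the query fails
   with closing(conn.cursor()) as cur:
      if params is None:
         cur.execute(sql)
      else:
         cur.execute(sql, params)
      return [row[0] for row in cur.fetchall()]

# Hypothetical usage with psycopg2 (connection details are placeholders):
#    import psycopg2 as pg
#    conn = pg.connect("dbname='template1' user='dbuser' host='localhost' password='dbpass'")
#    names = fetch_first_column(
#       conn, "SELECT datname FROM pg_database WHERE datname LIKE %s", ("temp%",))
#    for name in names:
#       print(name)
```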