What is the function of this protein?
CSB352H Bioinformatic Methods Summer 2012 - Lab Report 3

CSB352: Lab report 3

For this lab report, you need to articulate the details and outcome of an in silico experiment involving the use of protein-protein interaction (PPI) data and any other bioinformatic techniques you have learned in the course thus far (no trees, though). You are free to BLAST, build BLAST databases, build MSAs, build HMMs from MSAs, search any publicly accessible organism's genome for interesting protein domains or homologous sequences, evaluate Gene Ontology (GO) information and examine any gene you like in whichever organism(s) you'd like. You must, however, make use of PPI data as a core component of your analysis. Also, do not exceed 4 pages in Times Roman 12 or Helvetica 10, double spaced, not including figures, of which exactly 3 (of your choosing) are required.

Good PPI databases for Arabidopsis genes are the BAR's Arabidopsis Interactions Viewer database at https://bar.utoronto.ca/interactions, and BioGRID as per the PPI lab.

You are free to follow the example thought experiments below, but creative exploration on your part is highly encouraged. Have fun, but do think scientifically and logically about your question and approach. And do, of course, visit PubMed, browse some abstracts in an area of interest, and even look at a couple of papers, before formulating your question. You should cite at least 2 primary sources plus all tools/websites/data sources used, using the journal Nature's reference format.

Introduction (3 marks) ~ ½ page
Introduce a gene (or possibly protein domain) of interest and state 1) what question you are asking and why, and 2) how you attempted to answer your question. This could be as simple as "What is the function of this protein?", if its function is unknown. Or it could be a more specific aim to characterize the function via the examination of protein domains and protein-protein interaction data.

Methods (4 marks) ~ ½ page
A succinct description of your analysis with enough information for someone to reproduce your result. Be sure to cite web sources of data and tools. It's enough to say "I blasted accession 'x' against 'nr' with blastp and took all significant hits (E<1e-20)", and "I retrieved all interactions for my gene of interest from Arabidopsis PPI data in BioGRID, filtering by a minimum evidence level of 3 and minimum interaction level of 2".

Results (3 marks) ~ 2 pages
State the results of the experiment and their significance. Focus your "story" around your 3 figures. What results did you find? What can you infer or conclude from your data? Negative results can also be informative.

Discussion (5 marks) ~ 1 page
State what you learned and what you might ask/check next if given the time and interest. Attempt to provide a hypothesis for in vivo activity and propose a follow-up experiment, be it in silico or in vivo.

DUE AT THE BEGINNING OF CSB352 LAB 10

Examples:

Question 1: My favourite protein functions as a "hub" in a protein-protein interaction network. Are all these proteins interacting with a single domain on my protein or with multiple domains? Do the interacting proteins share a potentially common interaction domain among them?
Approach: Determine protein interactions with my protein from a PPI database. Perform motif scanning with Pfam against a collection of sequences for these proteins. Look at the protein domain distribution on my protein and on its interactors, to come up with a potential hypothesis regarding how the proteins interact.

Question 2: My favourite protein is part of a complex. How well is this complex conserved in different eukaryotic species?
Approach: Identify orthologs for my protein in yeast, fly and human. Identify interactors with my protein in each species. Generate sequence data sets for these interaction groups and BLAST the data sets against one another to assess the level of conservation of the complex between species (or use online databases and tools to answer the same question).

Note that we have chosen your assigned Arabidopsis gene/protein such that it should have at least one experimentally determined interacting partner in the BAR's AIV database. It may be useful to examine the interacting partner's interactors if there is just one or if there are just a few interactors for your assigned gene/protein.

Question 3: In how many proteins is my favourite PPI domain (e.g. PDZ, WD40) found in my favourite eukaryotic genome? Are there any interesting patterns within PPI data for proteins possessing my favourite domain?
Approach: Scan your favourite genome with an HMM for the domain and collect the significant hits. Feed the gene identifiers for these genes to BioGRID or the BAR's AIV to determine interactors, examining GO annotation and connectivity between interaction maps for each protein.

A useful tool for performing GO enrichment analysis for lists of Arabidopsis genes or proteins is AgriGO at https://bioinfo.cau.edu.cn/agriGO/ (Du et al., 2010, Nucleic Acids Research; doi: https://dx.doi.org/10.1093/nar/gkq310).
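
The interactor lookups described in the approaches above (filtering by a minimum evidence level, then examining a partner's own interactors) could be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the table layout, the column names (gene_a, gene_b, evidence) and the gene identifiers are hypothetical placeholders, not the actual export format of BioGRID or the BAR's AIV, so adapt them to whatever file you download.

```python
import csv
import io
from collections import defaultdict

# Hypothetical tab-delimited interaction table. The column names and gene
# identifiers are placeholders, NOT the real BioGRID or AIV export format.
EXAMPLE_TABLE = (
    "gene_a\tgene_b\tevidence\n"
    "GENE_X\tGENE_A\t3\n"
    "GENE_X\tGENE_B\t1\n"
    "GENE_A\tGENE_C\t4\n"
    "GENE_B\tGENE_D\t5\n"
)


def load_network(handle, min_evidence=0):
    """Build an undirected adjacency map from rows meeting the evidence cutoff."""
    network = defaultdict(set)
    for row in csv.DictReader(handle, delimiter="\t"):
        if int(row["evidence"]) < min_evidence:
            continue  # drop interactions below the chosen evidence level
        a, b = row["gene_a"], row["gene_b"]
        if a == b:
            continue  # ignore self-interactions in this sketch
        network[a].add(b)
        network[b].add(a)
    return network


def interactors(network, gene):
    """Return the direct interaction partners of a gene."""
    return set(network.get(gene, set()))


def second_shell(network, gene):
    """Return the partners' own interactors, excluding the gene and its direct partners."""
    first = interactors(network, gene)
    second = set()
    for partner in first:
        second |= interactors(network, partner)
    return second - first - {gene}


if __name__ == "__main__":
    net = load_network(io.StringIO(EXAMPLE_TABLE), min_evidence=2)
    print("Direct interactors:", sorted(interactors(net, "GENE_X")))
    print("Interactors of interactors:", sorted(second_shell(net, "GENE_X")))
```

To use it with real data, replace io.StringIO(EXAMPLE_TABLE) with an opened file and rename the columns to match your download. Whatever cutoff you choose for min_evidence should be reported in your Methods section so the result can be reproduced.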