28 Mar 2013 11:00am

Gene prioritisation with brain-specific expression data


Like many other labs around the world, we are using next-generation sequencing data to identify disease-causing variants, mainly those of large effect size, such as the variants observed in single-gene (Mendelian) disorders. Whilst this approach has delivered an avalanche of genes, it also has a high failure rate of ~60%. This failure is unlikely to be due mainly to technical problems; it is thought to stem more from violations of the assumptions made when filtering the hundreds of thousands of variants detected in each sample. Since each family is precious and would like an answer about their genetic disorder, it is worthwhile pursuing additional methods to pinpoint genes that warrant further investigation. One such method is "in silico gene prioritisation". These methods have been around since the early 2000s but, despite a raft of papers, have not yielded many tangible results for gene hunters. I will present our work in this field. We have elected to focus on two recently released tissue-specific gene expression data resources that solely target the brain and are thus very important for our work on diseases of the brain, such as epilepsy and autism. There are quite a few interesting statistical discussion points. I will discuss topics such as (i) measuring co-expression (Pearson's correlation versus modern computational methods), (ii) establishing significance, and (iii) establishing measures to facilitate the ranking of genes, and I will show some applications.
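
For readers unfamiliar with topics (i) to (iii), the following is a minimal sketch of one generic way they can fit together. It is not the method presented in the talk. It assumes a genes-by-samples expression matrix, scores each candidate gene by its mean absolute Pearson correlation with a set of known disease ("seed") genes, attaches an empirical p-value from a simple sample-permutation null, and ranks candidates by that p-value. All function names, gene labels, and the toy data are hypothetical and exist only for illustration.

```python
import numpy as np


def pearson_to_seeds(candidate, seeds):
    """Pearson correlation between one candidate profile (1-D, n_samples)
    and each seed gene profile (2-D, n_seeds x n_samples)."""
    c = candidate - candidate.mean()
    s = seeds - seeds.mean(axis=1, keepdims=True)
    return (s @ c) / (np.linalg.norm(s, axis=1) * np.linalg.norm(c))


def coexpression_score(candidate, seeds):
    """Illustrative score (an assumption): mean absolute correlation with the seeds."""
    return float(np.mean(np.abs(pearson_to_seeds(candidate, seeds))))


def permutation_p_value(candidate, seeds, n_perm, rng):
    """Empirical p-value: shuffle the candidate's samples to break any
    co-expression, then count null scores at least as large as the observed one."""
    observed = coexpression_score(candidate, seeds)
    null = np.array([
        coexpression_score(rng.permutation(candidate), seeds)
        for _ in range(n_perm)
    ])
    # Add-one correction so the p-value is never exactly zero.
    return float((1 + np.sum(null >= observed)) / (1 + n_perm))


def rank_candidates(expr, names, seed_names, n_perm=1000, seed=0):
    """Return (gene, score, p_value) for every non-seed gene,
    sorted by p-value and then by descending score."""
    rng = np.random.default_rng(seed)
    index = {name: i for i, name in enumerate(names)}
    seeds = expr[[index[name] for name in seed_names]]
    results = []
    for name in names:
        if name in seed_names:
            continue
        profile = expr[index[name]]
        score = coexpression_score(profile, seeds)
        p_value = permutation_p_value(profile, seeds, n_perm, rng)
        results.append((name, score, p_value))
    return sorted(results, key=lambda r: (r[2], -r[1]))


def make_toy_data(n_samples=40, seed=0):
    """Synthetic data (not real expression values): three seed genes and one
    candidate share a latent signal; five other candidates are pure noise."""
    rng = np.random.default_rng(seed)
    signal = rng.normal(size=n_samples)
    seeds = signal + 0.5 * rng.normal(size=(3, n_samples))
    cand_a = signal + 0.5 * rng.normal(size=(1, n_samples))
    noise = rng.normal(size=(5, n_samples))
    expr = np.vstack([seeds, cand_a, noise])
    names = (["SEED_1", "SEED_2", "SEED_3", "CAND_A"]
             + [f"NOISE_{i}" for i in range(1, 6)])
    return expr, names
```

Calling `rank_candidates(*make_toy_data(), ["SEED_1", "SEED_2", "SEED_3"])` returns the candidates ordered from most to least strongly co-expressed with the seeds. A real analysis would need to account for issues this sketch ignores, such as multiple testing across thousands of genes, confounding sample structure, and the choice of a correlation measure other than Pearson's.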