[BioC] bioconductor
    Warnes, Gregory R 
    gregory_r_warnes at groton.pfizer.com
       
    Wed Nov 19 16:39:07 MET 2003
    
    
  
We talked about this at the BioCBUG meeting a couple of weeks ago.  The web
site does have clear instructions for *installing* Bioconductor; it is just
not clear what to do once it is installed.
I think that the necessary documentation is available, but it is fragmented:
1) It is not clear from the web site what documentation you need to read to
get started.
2) None of the vignettes that I've looked at show a complete analysis
session from start to finish.
[I think the reason for this is that the people writing the vignettes are
the *package* authors and they have slightly different interests from
*consumers*]
I would suggest:
1) Adding a "Getting Started with Bioconductor" topic to the front page and
the navbar that brings up a page with a small number of vignettes with
titles like:
	Getting Started with Affymetrix Data
	Getting Started with Custom Two-Channel Data
	Getting Started with XXXXX Data
	...
These vignettes should go through a common-case example analysis from start
to finish.  Based on my work, the flow for Affymetrix data should be
something like this:
1) Prerequisites 
	- Software: R, Bioconductor	
	- Data: CEL files, experiment information
	- install the required CDF package
2) Load the data
	
3) Perform standard (technology) Quality Control tests
	- 3'/5' ratios
	- Chip images
	- RNA digestion plots
4) Normalize/scale/standardize the data
5) Perform 'overall' visualizations
	- MDS and PCA for samples using all probesets
6) Apply a statistical model to all probesets
	- ANOVA / ANCOVA
	- Contrasts
7) Apply multiple comparison correction (FDR, ...)
	
8) Filter based on statistical model
	- Select probesets with FDR < 0.05
	[Note that I didn't mention filtering earlier; I think it is a bad
	idea to filter before applying a model!]
9) Add annotation 
10) Generate visualizations
	- PCA/MDS for samples using statistically significant genes
	- Profile plots across experimental conditions / treatments
	- heatmap including 2-way hierarchical clustering
11) Generate tabulations
	- Table of top XX results from statistical model with subset of
	  annotation
12) Generate output dataset for interactive visualization in
Spotfire/Excel/...
	- All results from statistical model with all annotation
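
To make steps 6 through 8 (plus the sample-level PCA and the output table)
concrete, here is a minimal sketch using only base R.  It is an illustration
under stated assumptions, not a recommended pipeline: the data are simulated,
and the names `expr`, `group`, `results` and `out_file`, the chip counts and
the spiked-in effect size are all made up for the example.  A real session
would start from normalized probeset values produced in steps 2 through 4.

```r
set.seed(1)                        # make the simulated data reproducible

n_probes  <- 1000                  # hypothetical number of probesets
n_samples <- 8                     # hypothetical design: 4 control + 4 treated
group <- factor(rep(c("control", "treated"), each = 4))

# Simulated log-scale expression matrix: probesets in rows, chips in columns
expr <- matrix(rnorm(n_probes * n_samples, mean = 8, sd = 0.3),
               nrow = n_probes,
               dimnames = list(paste0("probe", seq_len(n_probes)),
                               paste0("chip", seq_len(n_samples))))

# Spike a treatment effect into the first 50 probesets
expr[1:50, group == "treated"] <- expr[1:50, group == "treated"] + 2

# Step 6: fit the same one-way ANOVA to *all* probesets (no pre-filtering)
p_raw <- apply(expr, 1, function(y) anova(lm(y ~ group))[["Pr(>F)"]][1])

# Step 7: multiple comparison correction (Benjamini-Hochberg FDR)
fdr <- p.adjust(p_raw, method = "BH")

# Step 8: filter on the model results only after the model has been applied
selected <- names(fdr)[fdr < 0.05]

# Steps 5 and 10: PCA of the samples, first on all probesets and then on
# the statistically significant ones
pca_all <- prcomp(t(expr))
pca_sel <- prcomp(t(expr[selected, , drop = FALSE]))

# Steps 11 and 12: tabulate all results and write them out for use in
# Spotfire/Excel/...
results  <- data.frame(probeset = names(fdr), p_raw = p_raw, fdr = fdr,
                       row.names = NULL)
results  <- results[order(results$fdr), ]
out_file <- tempfile(fileext = ".csv")
write.csv(results, out_file, row.names = FALSE)
```

`head(results, 20)` then gives the "top XX" table, and
`plot(pca_sel$x[, 1:2], col = as.integer(group))` gives a quick look at how
the samples separate on the selected probesets.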
For the getting started document I would recommend giving the *simplest*
good-practice method of accomplishing each task.  Each section should also
include a pointer to other documents that can provide further details on how
the algorithms work / alternative commands / etc.
-Greg
> -----Original Message-----
> From: rossini at blindglobe.net [mailto:rossini at blindglobe.net]
> Sent: Wednesday, November 19, 2003 9:51 AM
> To: Roger Vallejo
> Cc: bioconductor at stat.math.ethz.ch
> Subject: Re: [BioC] bioconductor
> 
> 
> 
> You aren't being helpful or explicit.  3-4 hours doing what?  What
> exactly have you read?  How do you expect us to suggest things when
> you don't tell us what you've done?  
> 
> 
> But more importantly, have you tried
> 
> library(tkWidgets)
> vExplorer()
> 
> and looked at the affy vignettes? 
> 
> 
> 
> "Roger Vallejo" <rvallejo at psu.edu> writes:
> 
> > DO you have a manual that shows how to learn to use BIOCONDUCTOR?
> >
> > I have spent 3-4 hrs and I see only lots of bla bla bla but any direct
> > instructions on how to start loading affy genechip data and performing
> > rudimentary microarray data analysis.
> >
> > Many thanks in advance for the help..
> >
> > Roger
> >
> >  
> >
> >  
> >
> > Roger L. Vallejo, Ph.D.
> >
> > Assist. Professor of Genomics/Bioinformatics
> >
> > The Pennsylvania State University
> >
> > Department of Dairy & Animal Science
> >
> > Genomics & Bioinformatics Laboratory
> >
> > 305 Henning Building
> >
> > University Park, PA 16802
> >
> > Phone:        (814) 865-1846 
> >
> > Fax:            (814) 863-6042
> >
> > Email:         rvallejo at psu.edu
> >
> > Website:     http://genomics.cas.psu.edu/
>
>
>  
>
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://www.stat.math.ethz.ch/mailman/listinfo/bioconductor
>
-- 
rossini at u.washington.edu            http://www.analytics.washington.edu/ 
Biomedical and Health Informatics   University of Washington
Biostatistics, SCHARP/HVTN          Fred Hutchinson Cancer Research Center
UW (Tu/Th/F): 206-616-7630 FAX=206-543-3461 | Voicemail is unreliable
FHCRC  (M/W): 206-667-7025 FAX=206-667-4812 | use Email