README.md

Workflow for generating phenotype score combinations and correlating them with biofilm.

There is one rule: no Excel. Every time I use Excel, I have to rename the files, they get lost, and I can't retrace my steps. Without Excel, every step lives in a script, so I can see each one and fix it where I need to.

First things first: 

1. Generate normalized scores from the sorted scores. 
	* A sorted score is the average of the raw scores from the biological replicates. An individual photo is a biological replicate.

	* `score_wrangler.R` takes in the un-normalized scores and generates a normalized column using the `preProcess()` function from the `caret` package. 
	
	* This program will also remove data that we do not want (we removed certain non-albicans *Candida* species that didn't grow under certain conditions).

	* After this, the files are modified with `column_clean.py` (called inside the R script) to remove the leading column and to clean up the column content if necessary. 

	* Finally, the program makes a file with all the score data in it. Repeatability. No Excel. 

	* I also had it combine all the scores. That just made things a lot easier. 

	
2. Correlate all the normalized sum scores with biofilm.  
	* I need a table for these that includes which scores make up each composite score, the media, and the temperature, as well as the correlation metrics.

	* `additive_correlator.R` does this using the `cor.test()` function, as described by [STHDA](http://www.sthda.com/english/wiki/correlation-test-between-two-variables-in-r).
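
The two steps above can be sketched roughly as follows. This is a minimal standard-library Python illustration, not the actual R scripts. It assumes the normalization is center-and-scale (the default `method` of caret's `preProcess()`; this README does not say which method is used) and that the correlation is Pearson (the default for `cor.test()`). The helper names, phenotype names, strain names, and all numbers are made up for the example.

```python
from math import sqrt
from statistics import mean, stdev


def sorted_score(replicates):
    """Average the raw scores from the biological replicates (one per photo)."""
    return mean(replicates)


def center_scale(values):
    """Subtract the mean and divide by the sample SD (assumed normalization)."""
    mu, sd = mean(values), stdev(values)
    return [(v - mu) / sd for v in values]


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ss_x = sum((a - mx) ** 2 for a in x)
    ss_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)


# Toy data, invented for illustration only: three replicate scores per strain.
strains = ["strain_1", "strain_2", "strain_3", "strain_4"]
raw_scores = {
    "phenotype_1": [[1, 1, 1], [2, 3, 4], [5, 5, 5], [3, 4, 2]],
    "phenotype_2": [[0, 0, 0], [1, 1, 1], [2, 2, 2], [1, 2, 0]],
}
biofilm = [0.2, 0.5, 0.9, 0.6]  # hypothetical biofilm measurements

# Step 1: replicate averages ("sorted scores"), then one normalized column each.
normalized = {
    name: center_scale([sorted_score(reps) for reps in per_strain])
    for name, per_strain in raw_scores.items()
}

# Step 2: sum the normalized scores per strain and correlate with biofilm.
composite = [sum(col[i] for col in normalized.values()) for i in range(len(strains))]
r = pearson_r(composite, biofilm)
print(f"phenotype_1 + phenotype_2 vs biofilm: r = {r:.3f}")
```

In R, `cor.test()` also reports a p-value alongside the coefficient, which this sketch leaves out.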