Workflow for generating phenotype score combinations and correlating them to biofilm.

There is one rule: no Excel. Every time I use Excel, I have to rename the file, the copies get lost, and I can't retrace my steps. Without Excel, I can see every step and fix it where I need to.

First things first:

1. Generate normalized scores from the sorted scores.
    * A sorted score is the average of the raw scores from the biological replicates. An individual photo is a biological replicate.
    * `score_wrangler.R` takes in the un-normalized scores and generates a normalized column using the `preProcess()` function from the `caret` package.
    * This program also removes data that we do not want (we removed certain non-albicans *Candida* species that didn't grow under certain conditions).
    * After this, the files are modified with `column_clean.py` (called inside the R script) to remove the leading column and to clean up the column content if necessary.
    * Finally, the program makes a file with all the score data in it. Repeatability. No Excel.
    * I also had it combine all the scores. That just made things a lot easier.
2. Add normalized scores in different combinations.
    * Adhesion, filamentation, and invasion scores need to be summed together in all combinations of pairs and once all together for *each* condition.
    * There are 6 conditions (3 different media and 2 temperatures, which don't match across the biofilm assays).
3. Correlate all the normalized sum scores with biofilm.
    * Use the `cor.test()` function described by [STHDA](http://www.sthda.com/english/wiki/correlation-test-between-two-variables-in-r).
    * I need a table for these that includes which scores went into each composite score, the media, and the temperature, as well as the correlation metrics.
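
The combination and correlation steps can be sketched as follows. This is a minimal sketch in Python (the language of `column_clean.py`), not the R code the workflow actually uses. The row layout, the `biofilm` lookup keyed by strain, and every function name here are assumptions made for illustration. It computes only Pearson's r, whereas `cor.test()` in R also reports a p-value and a confidence interval.

```python
from itertools import combinations
from math import sqrt

# The three phenotypes named in the workflow.
PHENOTYPES = ("adhesion", "filamentation", "invasion")


def phenotype_combinations(phenotypes=PHENOTYPES):
    """Return every pair of phenotypes, followed by the full set."""
    combos = list(combinations(phenotypes, 2))
    combos.append(tuple(phenotypes))
    return combos


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences.

    Raises ZeroDivisionError if either sequence has zero variance.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def correlate_composites(rows, biofilm):
    """Build one result row per (media, temperature, combination).

    rows:    dicts with keys "strain", "media", "temperature" and one
             normalized score per phenotype (hypothetical layout).
    biofilm: dict mapping strain -> biofilm measurement (hypothetical).
    """
    conditions = sorted({(r["media"], r["temperature"]) for r in rows})
    results = []
    for media, temperature in conditions:
        subset = [r for r in rows
                  if r["media"] == media and r["temperature"] == temperature]
        target = [biofilm[r["strain"]] for r in subset]
        for combo in phenotype_combinations():
            # Composite score = sum of the normalized scores in this combo.
            composite = [sum(r[p] for p in combo) for r in subset]
            results.append({
                "scores": "+".join(combo),
                "media": media,
                "temperature": temperature,
                "n": len(subset),
                "r": pearson_r(composite, target),
            })
    return results
```

With three phenotypes there are three pairs plus the full sum, so each condition contributes four rows. Six conditions therefore give a 24-row table whose columns match the ones listed in step 3.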