Adding Unequal Variance to Signature Analysis T-Test (#186)
* make independent t-test have equal variance!

* add tissue balance and rerun separate notebook

* update top features in notebook 2

* scientific data updated the URL!

* add the actual data in case it updates again

* make sure equal variance is false

* remove hard coded best features in visualization

also calculate t-test in figures after filtering tissues

* update full separation plot figure

* rerun full signature pipeline

* make points in figure bigger b/c using test set

* rerun pipeline

* closes #187

* modify figure for rerunning analysis

* rerun pipeline
gwaybio authored Sep 21, 2019
1 parent 323d363 commit 843923e
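
The commit title concerns running the signature t-test with unequal variance. The notebooks that run the test are not shown in this excerpt, so the following is only a minimal sketch of what that switch means, not the repository's code. The function names `welch_t` and `student_t` and the sample values are hypothetical. In SciPy the same choice is made through the `equal_var` argument of `scipy.stats.ttest_ind`, where `equal_var=False` requests Welch's test.

```python
import math
from statistics import mean, variance


def student_t(a, b):
    """Equal-variance (pooled) t statistic and its degrees of freedom."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    t = (mean(a) - mean(b)) / math.sqrt(pooled * (1 / na + 1 / nb))
    return t, na + nb - 2


def welch_t(a, b):
    """Unequal-variance (Welch) t statistic and Welch-Satterthwaite df."""
    na, nb = len(a), len(b)
    # Squared standard error of each group mean, kept separate (no pooling).
    va, vb = variance(a) / na, variance(b) / nb
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df


# Hypothetical values: two groups with clearly different spread.
group_a = [1, 2, 3, 4, 5]
group_b = [2, 4, 6, 8, 10]

t_student, df_student = student_t(group_a, group_b)
t_welch, df_welch = welch_t(group_a, group_b)
```

With equal group sizes the two statistics coincide and only the degrees of freedom differ; with unequal sizes and unequal spread, the statistic itself changes as well, which is why the setting matters when tissue groups are unbalanced.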
Showing 24 changed files with 192,554 additions and 195,349 deletions.
16 changes: 10 additions & 6 deletions 10.gene-expression-signatures/0.download-validation-data.ipynb
@@ -58,7 +58,7 @@
 "data": {
 "text/plain": [
 "('download/2019-01-22-CellLineSTAR-fpkm-2pass_matrix.txt',\n",
-" <http.client.HTTPMessage at 0x7fe8cae44320>)"
+" <http.client.HTTPMessage at 0x7f2d72ae0080>)"
 ]
 },
 "execution_count": 4,
@@ -100,7 +100,7 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"url = \"https://www.nature.com/articles/sdata201733/tables/3\"\n",
+"url = \"https://www.nature.com/articles/sdata201733/tables/4\"\n",
 "name = \"nbl_cellline_phenotype.txt\"\n",
 "path = os.path.join(\"download\", name)"
 ]
@@ -223,12 +223,16 @@
 }
 ],
 "source": [
-"html = requests.get(url).content\n",
+"if not os.path.isfile(path):\n",
+" html = requests.get(url).content\n",
 "\n",
-"pheno_df = pd.read_html(html)[0]\n",
-"pheno_df['Cell Line'] = pheno_df['Cell Line'].str.replace(\"-\", \"\")\n",
+" pheno_df = pd.read_html(html)[0]\n",
+" pheno_df['Cell Line'] = pheno_df['Cell Line'].str.replace(\"-\", \"\")\n",
 "\n",
-"pheno_df.to_csv(path, sep='\\t', index=False)\n",
+" pheno_df.to_csv(path, sep='\\t', index=False)\n",
 "\n",
+"else:\n",
+" pheno_df = pd.read_csv(path, sep=\"\\t\")\n",
+"\n",
 "pheno_df.head()"
 ]
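
The hunk above wraps the download in an `os.path.isfile(path)` check so the table is fetched only when no local copy exists, which fits the commit note about keeping the data "in case it updates again". A minimal standalone sketch of that cache-or-fetch pattern follows. The helper `load_or_fetch`, the stand-in `fake_fetch`, and the placeholder table contents are hypothetical and use only the standard library; the notebook itself uses `requests` and `pandas` as shown in the diff.

```python
import os
import tempfile


def load_or_fetch(path, fetch):
    """Return the text cached at `path`; call `fetch()` only if it is missing."""
    if not os.path.isfile(path):
        text = fetch()
        # Save the fetched copy so later runs do not depend on the remote source.
        with open(path, "w", encoding="utf-8") as handle:
            handle.write(text)
        return text
    with open(path, encoding="utf-8") as handle:
        return handle.read()


calls = []


def fake_fetch():
    """Stand-in for the network request; records each invocation."""
    calls.append(1)
    return "Cell Line\tvalue\nEXAMPLE1\t0\n"  # placeholder content, not real data


with tempfile.TemporaryDirectory() as tmp:
    cache = os.path.join(tmp, "nbl_cellline_phenotype.txt")
    first = load_or_fetch(cache, fake_fetch)   # file missing: fetches and writes
    second = load_or_fetch(cache, fake_fetch)  # file present: reads the cache
```

One consequence of this design is that a stale or truncated cache file is reused silently, so deleting the file is the way to force a fresh download.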