Histone modification data¶

Example using histone modification data downloaded from ENCODE¶

[5]:

import pandas as pd
import numpy as np
from scivae import VAE

# Set the location of the histone modification data
data_dir = '~/Documents/code/scivae_public/tests/data/'
df = pd.read_csv(f'{data_dir}mouse_HM_var500_data.csv')
df

[5]:

	entrezgene_id	external_gene_name	ensembl_gene_id	embryonic-facial-prominence_10.5-days_embryonic_H3K27ac_ChIP-seq_ENCFF003VMR_width	embryonic-facial-prominence_10.5-days_embryonic_H3K27ac_ChIP-seq_ENCFF003VMR_signal	embryonic-facial-prominence_10.5-days_embryonic_H3K27me3_ChIP-seq_ENCFF310NGB_width	embryonic-facial-prominence_10.5-days_embryonic_H3K27me3_ChIP-seq_ENCFF310NGB_signal	embryonic-facial-prominence_10.5-days_embryonic_H3K36me3_ChIP-seq_ENCFF565QAD_width	embryonic-facial-prominence_10.5-days_embryonic_H3K36me3_ChIP-seq_ENCFF565QAD_signal	embryonic-facial-prominence_10.5-days_embryonic_H3K4me1_ChIP-seq_ENCFF053GHW_width	...	stomach_16.5-days_embryonic_H3K4me1_ChIP-seq_ENCFF814BNR_width	stomach_16.5-days_embryonic_H3K4me1_ChIP-seq_ENCFF814BNR_signal	stomach_16.5-days_embryonic_H3K4me2_ChIP-seq_ENCFF501CJA_width	stomach_16.5-days_embryonic_H3K4me2_ChIP-seq_ENCFF501CJA_signal	stomach_16.5-days_embryonic_H3K4me3_ChIP-seq_ENCFF569KWB_width	stomach_16.5-days_embryonic_H3K4me3_ChIP-seq_ENCFF569KWB_signal	stomach_16.5-days_embryonic_H3K9ac_ChIP-seq_ENCFF068FWP_width	stomach_16.5-days_embryonic_H3K9ac_ChIP-seq_ENCFF068FWP_signal	stomach_16.5-days_embryonic_H3K9me3_ChIP-seq_ENCFF544RGQ_width	stomach_16.5-days_embryonic_H3K9me3_ChIP-seq_ENCFF544RGQ_signal
0	497097	Xkr4	ENSMUSG00000051951	838.0	4.64805	2236.0	4.70623	NaN	NaN	841.0	...	459.0	4.17547	2522.0	32.56543	2456.0	37.44113	1852.0	6.81303	NaN	NaN
1	384198	Fam47e	ENSMUSG00000057068	NaN	NaN	NaN	NaN	NaN	NaN	NaN	...	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN
2	12492	Scarb2	ENSMUSG00000029426	2053.0	16.06083	NaN	NaN	4699.0	4.03960	787.0	...	797.0	4.94311	862.0	20.08811	3071.0	61.25575	2503.0	24.87381	NaN	NaN
3	269113	Nup54	ENSMUSG00000034826	1546.0	23.33510	NaN	NaN	8433.0	4.30511	462.0	...	215.0	2.45555	1376.0	35.42474	2128.0	66.67310	1165.0	28.39603	425.0	3.4231
4	15945	Cxcl10	ENSMUSG00000034855	NaN	NaN	984.0	4.95978	NaN	NaN	1086.0	...	641.0	3.41532	794.0	13.95355	661.0	8.53067	NaN	NaN	NaN	NaN
...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...
20395	21762	Psmd2	ENSMUSG00000006998	1431.0	10.42818	NaN	NaN	4157.0	3.39655	1512.0	...	NaN	NaN	1407.0	13.44737	1765.0	61.31395	1525.0	20.17010	NaN	NaN
20396	73047	Camk2n2	ENSMUSG00000051146	2117.0	10.87675	3875.0	5.10808	734.0	3.56510	3664.0	...	656.0	4.96696	3348.0	26.86646	3417.0	25.29333	1863.0	7.71053	NaN	NaN
20397	107522	Ece2	ENSMUSG00000022842	1041.0	7.72166	3514.0	5.33374	NaN	NaN	1335.0	...	301.0	2.85033	1312.0	30.50417	1046.0	31.87724	926.0	10.12819	NaN	NaN
20398	208624	Alg3	ENSMUSG00000033809	2342.0	17.18692	NaN	NaN	754.0	5.22586	1288.0	...	NaN	NaN	3259.0	44.11346	2597.0	66.85134	834.0	9.84724	NaN	NaN
20399	328643	Vwa5b2	ENSMUSG00000046613	666.0	1.94738	7033.0	14.60257	4385.0	7.32809	2502.0	...	724.0	3.69700	1864.0	32.48818	1716.0	28.13862	1324.0	7.19501	NaN	NaN

20400 rows × 997 columns

Normalise the data¶

Before training the VAE, we may want to restrict the input to a subset of the data. Here I’m interested only in marks in the brain at day E10.5. Missing values are set to 0 and the signal values are log2(x + 1)-transformed.

[7]:

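# Treat missing values (no peak for that mark) as zero
df = df.fillna(0)
# Select the HM signal columns for the brain samples at E10.5.
# Note: '10' is matched anywhere in the column name, so brain columns whose
# ENCODE accession contains '10' are kept as well (presumably why a few
# 13.5- and 16.5-day columns appear in the output further down).
# Use '10.5-days' in c for a strict E10.5 filter.
cols = [c for c in df.columns if '10' in c and 'brain' in c and 'signal' in c]
# log2(x + 1) transform the signal values, since their range is very wide
vae_df = pd.DataFrame()
vae_df['external_gene_name'] = df['external_gene_name'].values
new_cols = []
for c in cols:
    # Shorten the column name: drop the assay, accession and measure fields
    new_name = ' '.join(c.split('_')[:-3]).replace('embryonic', '')
    new_cols.append(new_name)
    vae_df[new_name] = np.log2(df[c] + 1)

# Matrix of genes x selected features used as input to the VAE
dataset = vae_df[new_cols].values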
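
To see how the substring filter above behaves, here is a small sketch on invented column names that follow the naming pattern visible in the table. The accessions `ENCFF000AAA`, `ENCFF110BBB` and `ENCFF222CCC` and the helper `parse_column` are made up for illustration and are not part of the dataset or of scivae. It contrasts the loose `'10' in c` test with a stricter match on the stage field, and shows what the log2(x + 1) transform does to a few values.

```python
import numpy as np

# Invented column names following the pattern
# <tissue>_<stage>_embryonic_<mark>_ChIP-seq_<accession>_<measure>
columns = [
    "forebrain_10.5-days_embryonic_H3K27ac_ChIP-seq_ENCFF000AAA_signal",
    "forebrain_10.5-days_embryonic_H3K27ac_ChIP-seq_ENCFF000AAA_width",
    "forebrain_13.5-days_embryonic_H3K36me3_ChIP-seq_ENCFF110BBB_signal",
    "stomach_10.5-days_embryonic_H3K9ac_ChIP-seq_ENCFF222CCC_signal",
]


def parse_column(name):
    """Split a column name into its fields (hypothetical helper)."""
    tissue, stage, _, mark, _, accession, measure = name.split("_")
    return {"tissue": tissue, "stage": stage, "mark": mark,
            "accession": accession, "measure": measure}


def is_e105_brain_signal(name):
    """Strict filter: match on the parsed fields, not on raw substrings."""
    fields = parse_column(name)
    return (fields["stage"] == "10.5-days"
            and "brain" in fields["tissue"]
            and fields["measure"] == "signal")


# Loose filter as used in the notebook cell: '10' can match the accession too
loose = [c for c in columns if "10" in c and "brain" in c and "signal" in c]
# Strict filter: only the E10.5 forebrain signal column survives
strict = [c for c in columns if is_e105_brain_signal(c)]

# log2(x + 1) maps 0 to 0 and compresses large signal values
log_values = np.log2(np.array([0.0, 1.0, 3.0, 1023.0]) + 1)

print(len(loose), len(strict))
print(log_values)
```

Filtering on parsed fields costs a few more lines, but it makes the selection independent of whatever digits happen to occur in an accession.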

Train the VAE¶

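We now configure and train the VAE. The configuration in the next cell sets an MSE reconstruction loss with an MMD distance term, two hidden layers (64 and 32 nodes) in the encoder that are mirrored in the decoder, a 3-node latent space, and the Adamax optimiser. Training runs for 10 epochs with a batch size of 50.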
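
For readers unfamiliar with the `"mmd"` distance metric used in this section, the following is a minimal NumPy sketch of a maximum mean discrepancy (MMD) estimate with a Gaussian kernel, in the form commonly used for MMD-VAEs. It is an assumption that scivae computes the term this way. The layer summary printed during training lists Tile, Sub, Square, Mean and Exp ops that are consistent with such a kernel, but the library code itself is not shown here. The function names `gaussian_kernel` and `mmd` are illustrative only.

```python
import numpy as np


def gaussian_kernel(x, y):
    """Kernel matrix k(x_i, y_j) = exp(-mean((x_i - y_j)^2) / dim)."""
    dim = x.shape[1]
    # Pairwise squared differences, shape (n_x, n_y, dim)
    sq_diff = (x[:, None, :] - y[None, :, :]) ** 2
    return np.exp(-sq_diff.mean(axis=2) / dim)


def mmd(x, y):
    """Biased MMD^2 estimate between samples x and y."""
    return (gaussian_kernel(x, x).mean()
            + gaussian_kernel(y, y).mean()
            - 2.0 * gaussian_kernel(x, y).mean())


rng = np.random.default_rng(0)
prior = rng.normal(size=(200, 3))             # samples from a standard normal prior
close = rng.normal(size=(200, 3))             # latent codes that match the prior
shifted = rng.normal(loc=3.0, size=(200, 3))  # latent codes far from the prior

print(mmd(prior, close), mmd(prior, shifted))
```

The estimate is near zero when the two samples come from the same distribution and grows as they diverge, which is what pushes the latent codes towards the prior during training.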

[14]:

config = {
    "loss": {
        "loss_type": "mse",
        "distance_metric": "mmd",
        "mmd_weight": 1.0
    },
    "encoding": {
        "layers": [
            {"num_nodes": 64, "activation_fn": "relu"},
            {"num_nodes": 32, "activation_fn": "selu"}
        ]
    },
    "decoding": {
        "layers": [
            {"num_nodes": 32, "activation_fn": "selu"},
            {"num_nodes": 64, "activation_fn": "relu"}
        ]
    },
    "latent": {
        "num_nodes": 3
    },
    "optimiser": {
        "params": {"learning_rate": 0.001, "beta_1": 0.8, "beta_2": 0.97},
        "name": "adamax"
    }
}

vae = VAE(dataset, dataset, ["None"] * len(dataset), config, 'vae_rcm')
vae.encode('default', epochs=10, batch_size=50)

None
Model: "encoder"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to
==================================================================================================
default_input (InputLayer)      [(None, 22)]         0
__________________________________________________________________________________________________
dense (Dense)                   (None, 64)           1472        default_input[0][0]
__________________________________________________________________________________________________
dense_1 (Dense)                 (None, 32)           2080        dense[0][0]
__________________________________________________________________________________________________
z_mean (Dense)                  (None, 3)            99          dense_1[0][0]
__________________________________________________________________________________________________
z_log_sigma (Dense)             (None, 3)            99          dense_1[0][0]
__________________________________________________________________________________________________
z (Lambda)                      (None, 3)            0           z_mean[0][0]
                                                                 z_log_sigma[0][0]
==================================================================================================
Total params: 3,750
Trainable params: 3,750
Non-trainable params: 0
__________________________________________________________________________________________________
Model: "decoder"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
z_sampling (InputLayer)      [(None, 3)]               0
_________________________________________________________________
dense_2 (Dense)              (None, 32)                128
_________________________________________________________________
dense_3 (Dense)              (None, 64)                2112
_________________________________________________________________
dense_4 (Dense)              (None, 22)                1430
=================================================================
Total params: 3,670
Trainable params: 3,670
Non-trainable params: 0
_________________________________________________________________
Model: "vae_rcm_scivae"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to
==================================================================================================
default_input (InputLayer)      [(None, 22)]         0
__________________________________________________________________________________________________
encoder (Functional)            [(None, 3), (None, 3 3750        default_input[0][0]
__________________________________________________________________________________________________
decoder (Functional)            (None, 22)           3670        encoder[0][2]
__________________________________________________________________________________________________
dense (Dense)                   (None, 64)           1472        default_input[0][0]
__________________________________________________________________________________________________
dense_1 (Dense)                 (None, 32)           2080        dense[0][0]
__________________________________________________________________________________________________
z_mean (Dense)                  (None, 3)            99          dense_1[0][0]
__________________________________________________________________________________________________
z_log_sigma (Dense)             (None, 3)            99          dense_1[0][0]
__________________________________________________________________________________________________
z (Lambda)                      (None, 3)            0           z_mean[0][0]
                                                                 z_log_sigma[0][0]
__________________________________________________________________________________________________
tf_op_layer_Shape (TensorFlowOp [(2,)]               0           z[0][0]
__________________________________________________________________________________________________
tf_op_layer_strided_slice (Tens [()]                 0           tf_op_layer_Shape[0][0]
__________________________________________________________________________________________________
tf_op_layer_Pack (TensorFlowOpL [(2,)]               0           tf_op_layer_strided_slice[0][0]
__________________________________________________________________________________________________
tf_op_layer_RandomStandardNorma [(None, 3)]          0           tf_op_layer_Pack[0][0]
__________________________________________________________________________________________________
tf_op_layer_Mul (TensorFlowOpLa [(None, 3)]          0           tf_op_layer_RandomStandardNormal[
__________________________________________________________________________________________________
tf_op_layer_Add (TensorFlowOpLa [(None, 3)]          0           tf_op_layer_Mul[0][0]
__________________________________________________________________________________________________
tf_op_layer_Shape_1 (TensorFlow [(2,)]               0           tf_op_layer_Add[0][0]
__________________________________________________________________________________________________
tf_op_layer_Shape_3 (TensorFlow [(2,)]               0           tf_op_layer_Add[0][0]
__________________________________________________________________________________________________
tf_op_layer_Shape_2 (TensorFlow [(2,)]               0           tf_op_layer_Add[0][0]
__________________________________________________________________________________________________
tf_op_layer_Shape_4 (TensorFlow [(2,)]               0           z[0][0]
__________________________________________________________________________________________________
tf_op_layer_Shape_6 (TensorFlow [(2,)]               0           z[0][0]
__________________________________________________________________________________________________
tf_op_layer_Shape_5 (TensorFlow [(2,)]               0           z[0][0]
__________________________________________________________________________________________________
tf_op_layer_Shape_7 (TensorFlow [(2,)]               0           tf_op_layer_Add[0][0]
__________________________________________________________________________________________________
tf_op_layer_Shape_9 (TensorFlow [(2,)]               0           tf_op_layer_Add[0][0]
__________________________________________________________________________________________________
tf_op_layer_Shape_8 (TensorFlow [(2,)]               0           z[0][0]
__________________________________________________________________________________________________
tf_op_layer_strided_slice_1 (Te [()]                 0           tf_op_layer_Shape_1[0][0]
__________________________________________________________________________________________________
tf_op_layer_strided_slice_3 (Te [()]                 0           tf_op_layer_Shape_3[0][0]
__________________________________________________________________________________________________
tf_op_layer_strided_slice_2 (Te [()]                 0           tf_op_layer_Shape_2[0][0]
__________________________________________________________________________________________________
tf_op_layer_strided_slice_4 (Te [()]                 0           tf_op_layer_Shape_4[0][0]
__________________________________________________________________________________________________
tf_op_layer_strided_slice_6 (Te [()]                 0           tf_op_layer_Shape_6[0][0]
__________________________________________________________________________________________________
tf_op_layer_strided_slice_5 (Te [()]                 0           tf_op_layer_Shape_5[0][0]
__________________________________________________________________________________________________
tf_op_layer_strided_slice_7 (Te [()]                 0           tf_op_layer_Shape_7[0][0]
__________________________________________________________________________________________________
tf_op_layer_strided_slice_9 (Te [()]                 0           tf_op_layer_Shape_9[0][0]
__________________________________________________________________________________________________
tf_op_layer_strided_slice_8 (Te [()]                 0           tf_op_layer_Shape_8[0][0]
__________________________________________________________________________________________________
tf_op_layer_Reshape/shape (Tens [(3,)]               0           tf_op_layer_strided_slice_1[0][0]
                                                                 tf_op_layer_strided_slice_3[0][0]
__________________________________________________________________________________________________
tf_op_layer_Reshape_1/shape (Te [(3,)]               0           tf_op_layer_strided_slice_2[0][0]
                                                                 tf_op_layer_strided_slice_3[0][0]
__________________________________________________________________________________________________
tf_op_layer_Reshape_2/shape (Te [(3,)]               0           tf_op_layer_strided_slice_4[0][0]
                                                                 tf_op_layer_strided_slice_6[0][0]
__________________________________________________________________________________________________
tf_op_layer_Reshape_3/shape (Te [(3,)]               0           tf_op_layer_strided_slice_5[0][0]
                                                                 tf_op_layer_strided_slice_6[0][0]
__________________________________________________________________________________________________
tf_op_layer_Reshape_4/shape (Te [(3,)]               0           tf_op_layer_strided_slice_7[0][0]
                                                                 tf_op_layer_strided_slice_9[0][0]
__________________________________________________________________________________________________
tf_op_layer_Reshape_5/shape (Te [(3,)]               0           tf_op_layer_strided_slice_8[0][0]
                                                                 tf_op_layer_strided_slice_9[0][0]
__________________________________________________________________________________________________
tf_op_layer_Reshape (TensorFlow [(None, 1, None)]    0           tf_op_layer_Add[0][0]
                                                                 tf_op_layer_Reshape/shape[0][0]
__________________________________________________________________________________________________
tf_op_layer_Tile/multiples (Ten [(3,)]               0           tf_op_layer_strided_slice_2[0][0]
__________________________________________________________________________________________________
tf_op_layer_Reshape_1 (TensorFl [(1, None, None)]    0           tf_op_layer_Add[0][0]
                                                                 tf_op_layer_Reshape_1/shape[0][0]
__________________________________________________________________________________________________
tf_op_layer_Tile_1/multiples (T [(3,)]               0           tf_op_layer_strided_slice_1[0][0]
__________________________________________________________________________________________________
tf_op_layer_Reshape_2 (TensorFl [(None, 1, None)]    0           z[0][0]
                                                                 tf_op_layer_Reshape_2/shape[0][0]
__________________________________________________________________________________________________
tf_op_layer_Tile_2/multiples (T [(3,)]               0           tf_op_layer_strided_slice_5[0][0]
__________________________________________________________________________________________________
tf_op_layer_Reshape_3 (TensorFl [(1, None, None)]    0           z[0][0]
                                                                 tf_op_layer_Reshape_3/shape[0][0]
__________________________________________________________________________________________________
tf_op_layer_Tile_3/multiples (T [(3,)]               0           tf_op_layer_strided_slice_4[0][0]
__________________________________________________________________________________________________
tf_op_layer_Reshape_4 (TensorFl [(None, 1, None)]    0           tf_op_layer_Add[0][0]
                                                                 tf_op_layer_Reshape_4/shape[0][0]
__________________________________________________________________________________________________
tf_op_layer_Tile_4/multiples (T [(3,)]               0           tf_op_layer_strided_slice_8[0][0]
__________________________________________________________________________________________________
tf_op_layer_Reshape_5 (TensorFl [(1, None, None)]    0           z[0][0]
                                                                 tf_op_layer_Reshape_5/shape[0][0]
__________________________________________________________________________________________________
tf_op_layer_Tile_5/multiples (T [(3,)]               0           tf_op_layer_strided_slice_7[0][0]
__________________________________________________________________________________________________
tf_op_layer_Tile (TensorFlowOpL [(None, None, None)] 0           tf_op_layer_Reshape[0][0]
                                                                 tf_op_layer_Tile/multiples[0][0]
__________________________________________________________________________________________________
tf_op_layer_Tile_1 (TensorFlowO [(None, None, None)] 0           tf_op_layer_Reshape_1[0][0]
                                                                 tf_op_layer_Tile_1/multiples[0][0
__________________________________________________________________________________________________
tf_op_layer_Tile_2 (TensorFlowO [(None, None, None)] 0           tf_op_layer_Reshape_2[0][0]
                                                                 tf_op_layer_Tile_2/multiples[0][0
__________________________________________________________________________________________________
tf_op_layer_Tile_3 (TensorFlowO [(None, None, None)] 0           tf_op_layer_Reshape_3[0][0]
                                                                 tf_op_layer_Tile_3/multiples[0][0
__________________________________________________________________________________________________
tf_op_layer_Tile_4 (TensorFlowO [(None, None, None)] 0           tf_op_layer_Reshape_4[0][0]
                                                                 tf_op_layer_Tile_4/multiples[0][0
__________________________________________________________________________________________________
tf_op_layer_Tile_5 (TensorFlowO [(None, None, None)] 0           tf_op_layer_Reshape_5[0][0]
                                                                 tf_op_layer_Tile_5/multiples[0][0
__________________________________________________________________________________________________
tf_op_layer_Sub (TensorFlowOpLa [(None, None, None)] 0           tf_op_layer_Tile[0][0]
                                                                 tf_op_layer_Tile_1[0][0]
__________________________________________________________________________________________________
tf_op_layer_Sub_1 (TensorFlowOp [(None, None, None)] 0           tf_op_layer_Tile_2[0][0]
                                                                 tf_op_layer_Tile_3[0][0]
__________________________________________________________________________________________________
tf_op_layer_Sub_2 (TensorFlowOp [(None, None, None)] 0           tf_op_layer_Tile_4[0][0]
                                                                 tf_op_layer_Tile_5[0][0]
__________________________________________________________________________________________________
tf_op_layer_Square (TensorFlowO [(None, None, None)] 0           tf_op_layer_Sub[0][0]
__________________________________________________________________________________________________
tf_op_layer_Square_1 (TensorFlo [(None, None, None)] 0           tf_op_layer_Sub_1[0][0]
__________________________________________________________________________________________________
tf_op_layer_Square_2 (TensorFlo [(None, None, None)] 0           tf_op_layer_Sub_2[0][0]
__________________________________________________________________________________________________
tf_op_layer_Mean (TensorFlowOpL [(None, None)]       0           tf_op_layer_Square[0][0]
__________________________________________________________________________________________________
tf_op_layer_Mean_1 (TensorFlowO [(None, None)]       0           tf_op_layer_Square_1[0][0]
__________________________________________________________________________________________________
tf_op_layer_Mean_2 (TensorFlowO [(None, None)]       0           tf_op_layer_Square_2[0][0]
__________________________________________________________________________________________________
tf_op_layer_Neg (TensorFlowOpLa [(None, None)]       0           tf_op_layer_Mean[0][0]
__________________________________________________________________________________________________
tf_op_layer_Cast (TensorFlowOpL [()]                 0           tf_op_layer_strided_slice_3[0][0]
__________________________________________________________________________________________________
tf_op_layer_Neg_1 (TensorFlowOp [(None, None)]       0           tf_op_layer_Mean_1[0][0]
__________________________________________________________________________________________________
tf_op_layer_Cast_1 (TensorFlowO [()]                 0           tf_op_layer_strided_slice_6[0][0]
__________________________________________________________________________________________________
tf_op_layer_Neg_2 (TensorFlowOp [(None, None)]       0           tf_op_layer_Mean_2[0][0]
__________________________________________________________________________________________________
tf_op_layer_Cast_2 (TensorFlowO [()]                 0           tf_op_layer_strided_slice_9[0][0]
__________________________________________________________________________________________________
tf_op_layer_RealDiv (TensorFlow [(None, None)]       0           tf_op_layer_Neg[0][0]
                                                                 tf_op_layer_Cast[0][0]
__________________________________________________________________________________________________
tf_op_layer_RealDiv_1 (TensorFl [(None, None)]       0           tf_op_layer_Neg_1[0][0]
                                                                 tf_op_layer_Cast_1[0][0]
__________________________________________________________________________________________________
tf_op_layer_RealDiv_2 (TensorFl [(None, None)]       0           tf_op_layer_Neg_2[0][0]
                                                                 tf_op_layer_Cast_2[0][0]
__________________________________________________________________________________________________
tf_op_layer_Exp (TensorFlowOpLa [(None, None)]       0           tf_op_layer_RealDiv[0][0]
__________________________________________________________________________________________________
tf_op_layer_Exp_1 (TensorFlowOp [(None, None)]       0           tf_op_layer_RealDiv_1[0][0]
__________________________________________________________________________________________________
tf_op_layer_Exp_2 (TensorFlowOp [(None, None)]       0           tf_op_layer_RealDiv_2[0][0]
__________________________________________________________________________________________________
tf_op_layer_Mean_3 (TensorFlowO [()]                 0           tf_op_layer_Exp[0][0]
__________________________________________________________________________________________________
tf_op_layer_Mean_4 (TensorFlowO [()]                 0           tf_op_layer_Exp_1[0][0]
__________________________________________________________________________________________________
tf_op_layer_Mean_5 (TensorFlowO [()]                 0           tf_op_layer_Exp_2[0][0]
__________________________________________________________________________________________________
tf_op_layer_Sub_4 (TensorFlowOp [(None, 22)]         0           default_input[0][0]
                                                                 decoder[0][0]
__________________________________________________________________________________________________
tf_op_layer_AddV2 (TensorFlowOp [()]                 0           tf_op_layer_Mean_3[0][0]
                                                                 tf_op_layer_Mean_4[0][0]
__________________________________________________________________________________________________
tf_op_layer_Mul_1 (TensorFlowOp [()]                 0           tf_op_layer_Mean_5[0][0]
__________________________________________________________________________________________________
tf_op_layer_Square_3 (TensorFlo [(None, 22)]         0           tf_op_layer_Sub_4[0][0]
__________________________________________________________________________________________________
tf_op_layer_Sub_3 (TensorFlowOp [()]                 0           tf_op_layer_AddV2[0][0]
                                                                 tf_op_layer_Mul_1[0][0]
__________________________________________________________________________________________________
tf_op_layer_Sum (TensorFlowOpLa [(None,)]            0           tf_op_layer_Square_3[0][0]
__________________________________________________________________________________________________
tf_op_layer_Mul_2 (TensorFlowOp [()]                 0           tf_op_layer_Sub_3[0][0]
__________________________________________________________________________________________________
tf_op_layer_AddV2_1 (TensorFlow [(None,)]            0           tf_op_layer_Sum[0][0]
                                                                 tf_op_layer_Mul_2[0][0]
__________________________________________________________________________________________________
tf_op_layer_Mean_6 (TensorFlowO [()]                 0           tf_op_layer_AddV2_1[0][0]
__________________________________________________________________________________________________
add_loss (AddLoss)              ()                   0           tf_op_layer_Mean_6[0][0]
==================================================================================================
Total params: 7,420
Trainable params: 7,420
Non-trainable params: 0
__________________________________________________________________________________________________
None
Epoch 1/10
  1/347 [..............................] - ETA: 0s - loss: 3.4721
WARNING:tensorflow:From /Users/ariane/opt/miniconda3/envs/ml/lib/python3.8/site-packages/tensorflow/python/ops/summary_ops_v2.py:1277: stop (from tensorflow.python.eager.profiler) is deprecated and will be removed after 2020-07-01.
Instructions for updating:
use `tf.profiler.experimental.stop` instead.
WARNING:tensorflow:Callbacks method `on_train_batch_end` is slow compared to the batch time (batch time: 0.0018s vs `on_train_batch_end` time: 0.0139s). Check your callbacks.
347/347 [==============================] - 1s 1ms/step - loss: 1.1860 - val_loss: 0.4848
Epoch 2/10
347/347 [==============================] - 0s 858us/step - loss: 0.4239 - val_loss: 0.3916
Epoch 3/10
347/347 [==============================] - 0s 856us/step - loss: 0.3679 - val_loss: 0.3570
Epoch 4/10
347/347 [==============================] - 0s 854us/step - loss: 0.3421 - val_loss: 0.3354
Epoch 5/10
347/347 [==============================] - 0s 859us/step - loss: 0.3260 - val_loss: 0.3230
Epoch 6/10
347/347 [==============================] - 0s 855us/step - loss: 0.3101 - val_loss: 0.3098
Epoch 7/10
347/347 [==============================] - 0s 870us/step - loss: 0.3001 - val_loss: 0.3030
Epoch 8/10
347/347 [==============================] - 0s 865us/step - loss: 0.2926 - val_loss: 0.2974
Epoch 9/10
347/347 [==============================] - 0s 901us/step - loss: 0.2875 - val_loss: 0.2908
Epoch 10/10
347/347 [==============================] - 0s 932us/step - loss: 0.2822 - val_loss: 0.2883
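
The parameter counts in the model summaries above can be reproduced by hand, which is a quick way to confirm that the network matches the configuration. A dense layer with `n_inputs` inputs and `n_outputs` outputs has `n_inputs * n_outputs` weights plus `n_outputs` biases. The sketch below applies this to the 22 input features, the 64- and 32-node hidden layers and the 3-node latent space. The helper names are illustrative, and the last part assumes that the final batch of each epoch may be partial.

```python
def dense_params(n_inputs, n_outputs):
    """Weights plus biases of a fully connected layer."""
    return n_inputs * n_outputs + n_outputs


def mlp_params(sizes):
    """Total parameters of a chain of dense layers with the given sizes."""
    return sum(dense_params(a, b) for a, b in zip(sizes, sizes[1:]))


n_features, latent_dim = 22, 3

# Encoder: 22 -> 64 -> 32, then two heads (z_mean and z_log_sigma) of size 3
encoder_params = mlp_params([n_features, 64, 32]) + 2 * dense_params(32, latent_dim)
# Decoder: 3 -> 32 -> 64 -> 22
decoder_params = mlp_params([latent_dim, 32, 64, n_features])

print(encoder_params, decoder_params, encoder_params + decoder_params)
# 3750 3670 7420

# 347 steps per epoch at batch size 50 bounds the number of training rows
steps_per_epoch, batch_size, n_genes = 347, 50, 20400
min_rows = (steps_per_epoch - 1) * batch_size + 1
max_rows = steps_per_epoch * batch_size
print(min_rows / n_genes, max_rows / n_genes)
```

The bounds work out to roughly 85% of the 20400 genes, which fits with the `val_loss` column: the remaining rows appear to be held out for validation.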

Quality control!¶

Check that the values of each latent node approximately follow a normal distribution.

[15]:

from scivae import Vis

vis = Vis(vae, vae.u, None)
vis.plot_node_hists(show_plt=True, save_fig=False)

<Figure size 216x216 with 0 Axes>

../_images/examples_histone_example_7_1.png

<Figure size 216x216 with 0 Axes>

../_images/examples_histone_example_7_3.png

<Figure size 216x216 with 0 Axes>

../_images/examples_histone_example_7_5.png

<Figure size 216x216 with 0 Axes>
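
The histograms can be complemented with a numeric check. The sketch below computes the mean, standard deviation, skewness and excess kurtosis of each latent node, all of which should be close to 0, 1, 0 and 0 respectively for a standard normal distribution. The array `encoding` is a random stand-in for the encoded data (one row per gene, one column per node), and how to obtain that array from scivae is not shown here. `moment_summary` is a hypothetical helper.

```python
import numpy as np


def moment_summary(values):
    """Mean, std, skewness and excess kurtosis of a 1-D sample."""
    values = np.asarray(values, dtype=float)
    centred = values - values.mean()
    std = values.std()
    return {
        "mean": values.mean(),
        "std": std,
        "skew": (centred ** 3).mean() / std ** 3,
        "excess_kurtosis": (centred ** 4).mean() / std ** 4 - 3.0,
    }


rng = np.random.default_rng(0)
# Stand-in for the latent encoding: 20400 genes x 3 nodes
encoding = rng.normal(size=(20400, 3))

summaries = [moment_summary(encoding[:, node]) for node in range(encoding.shape[1])]
for node, summary in enumerate(summaries, start=1):
    print(f"Node {node}:", {k: round(float(v), 3) for k, v in summary.items()})

# A clearly non-normal sample for comparison
skewed = moment_summary(rng.exponential(size=20400))
```

A node with large skewness or kurtosis is worth a closer look in its histogram before interpreting it.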

Quality control 2: Visualise the correlation between features.¶

Since the VAE isn’t magic, just good at learning correlations and patterns between input features, it’s sensible to check that correlations between the features and the latent nodes exist.
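
As a rough illustration of what such a check computes, the sketch below calculates the Pearson correlation between every latent node and every input feature with NumPy. Whether scivae uses Pearson or another coefficient is not stated in this example, so treat the choice as an assumption. The data are synthetic and `node_feature_correlation` is a hypothetical helper.

```python
import numpy as np


def node_feature_correlation(encoding, features):
    """Pearson r between each latent node and each feature.

    encoding: (n_samples, n_nodes), features: (n_samples, n_features).
    Returns an array of shape (n_nodes, n_features).
    """
    z = (encoding - encoding.mean(axis=0)) / encoding.std(axis=0)
    f = (features - features.mean(axis=0)) / features.std(axis=0)
    return z.T @ f / len(z)


rng = np.random.default_rng(0)
features = rng.normal(size=(500, 4))
# Synthetic nodes: node 1 tracks feature 1, node 2 mirrors feature 2
encoding = np.column_stack([
    features[:, 0] + 0.1 * rng.normal(size=500),
    -features[:, 1],
])

r = node_feature_correlation(encoding, features)
print(np.round(r, 2))
```

Rows with no sizeable correlation to any feature would suggest a node that has not learnt much from the input.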

[16]:

vis.plot_node_feature_correlation(vae_df, 'external_gene_name', columns=new_cols, show_plt=True, save_fig=False)

1 forebrain 10.5-days  H3K27ac 0.4458199156559925 0.0
1 forebrain 10.5-days  H3K27me3 -0.36822856875234455 0.0
1 forebrain 10.5-days  H3K36me3 0.7182105590240476 0.0
1 forebrain 10.5-days  H3K4me1 0.08458726186139696 1.0252935071364138e-33
1 forebrain 10.5-days  H3K4me3 0.44107422217939335 0.0
1 forebrain 10.5-days  H3K9me3 0.3322304714711835 0.0
1 forebrain 13.5-days  H3K36me3 0.7569123576516704 0.0
1 forebrain 16.5-days  H3K9ac 0.39801675553415206 0.0
1 hindbrain 10.5-days  H3K27ac 0.44252869028028696 0.0
1 hindbrain 10.5-days  H3K27me3 -0.3771528991191724 0.0
1 hindbrain 10.5-days  H3K36me3 0.7321705949738145 0.0
1 hindbrain 10.5-days  H3K4me1 0.06973185019912057 2.036931562717694e-23
1 hindbrain 10.5-days  H3K4me3 0.45126012014510813 0.0
1 hindbrain 10.5-days  H3K9me3 0.3264817001379393 0.0
1 hindbrain 16.5-days  H3K36me3 0.7714628829696493 0.0
1 midbrain 10.5-days  H3K27ac 0.4503315853763465 0.0
1 midbrain 10.5-days  H3K27me3 -0.3666337452858759 0.0
1 midbrain 10.5-days  H3K36me3 0.7305575293886859 0.0
1 midbrain 10.5-days  H3K4me1 0.0840195102765061 2.7591686911311554e-33
1 midbrain 10.5-days  H3K4me3 0.44755223785962295 0.0
1 midbrain 10.5-days  H3K9me3 0.33900035619636754 0.0
1 midbrain 16.5-days  H3K36me3 0.7601814747175523 0.0
2 forebrain 10.5-days  H3K27ac 0.6993835507290372 0.0
2 forebrain 10.5-days  H3K27me3 0.20659462514255594 1.6867100459448565e-195
2 forebrain 10.5-days  H3K36me3 0.4361433127132986 0.0
2 forebrain 10.5-days  H3K4me1 0.5565358671909726 0.0
2 forebrain 10.5-days  H3K4me3 0.7238777304800306 0.0
2 forebrain 10.5-days  H3K9me3 -0.19534750429567438 1.3140794381319174e-174
2 forebrain 13.5-days  H3K36me3 0.3859214672795727 0.0
2 forebrain 16.5-days  H3K9ac 0.6951814583436485 0.0
2 hindbrain 10.5-days  H3K27ac 0.7005059927719833 0.0
2 hindbrain 10.5-days  H3K27me3 0.18556958021502687 1.8188890397226624e-157
2 hindbrain 10.5-days  H3K36me3 0.4300915377001582 0.0
2 hindbrain 10.5-days  H3K4me1 0.5656442933199602 0.0
2 hindbrain 10.5-days  H3K4me3 0.7504783215717985 0.0
2 hindbrain 10.5-days  H3K9me3 -0.19492769494427906 7.481605744627867e-174
2 hindbrain 16.5-days  H3K36me3 0.2815763124778623 0.0
2 midbrain 10.5-days  H3K27ac 0.6979357744327495 0.0
2 midbrain 10.5-days  H3K27me3 0.20427806575300653 4.301548566325711e-191
2 midbrain 10.5-days  H3K36me3 0.4248950750456577 0.0
2 midbrain 10.5-days  H3K4me1 0.5879522185404434 0.0
2 midbrain 10.5-days  H3K4me3 0.7517804636786536 0.0
2 midbrain 10.5-days  H3K9me3 -0.19917129159908467 1.4378527990397066e-181
2 midbrain 16.5-days  H3K36me3 0.28502688680921845 0.0
3 forebrain 10.5-days  H3K27ac 0.2890663800190854 0.0
3 forebrain 10.5-days  H3K27me3 0.5593864732716971 0.0
3 forebrain 10.5-days  H3K36me3 0.23123764557098217 9.403954879613397e-246
3 forebrain 10.5-days  H3K4me1 0.6216729280325257 0.0
3 forebrain 10.5-days  H3K4me3 0.23789807422300102 2.0641959594635796e-260
3 forebrain 10.5-days  H3K9me3 0.02547228690467184 0.0002741905345491083
3 forebrain 13.5-days  H3K36me3 0.2637598624109295 0.0
3 forebrain 16.5-days  H3K9ac 0.3161420021696647 0.0
3 hindbrain 10.5-days  H3K27ac 0.3047330189007952 0.0
3 hindbrain 10.5-days  H3K27me3 0.5625858889736766 0.0
3 hindbrain 10.5-days  H3K36me3 0.24594051955316365 9.835681231685088e-279
3 hindbrain 10.5-days  H3K4me1 0.6282999983783705 0.0
3 hindbrain 10.5-days  H3K4me3 0.22727626909374676 2.967658174597913e-237
3 hindbrain 10.5-days  H3K9me3 0.028915715334947162 3.618164152092494e-05
3 hindbrain 16.5-days  H3K36me3 0.2913522951162253 0.0
3 midbrain 10.5-days  H3K27ac 0.2910320635114295 0.0
3 midbrain 10.5-days  H3K27me3 0.5638753340469604 0.0
3 midbrain 10.5-days  H3K36me3 0.23914964254076615 3.223652555211334e-263
3 midbrain 10.5-days  H3K4me1 0.6065451621570106 0.0
3 midbrain 10.5-days  H3K4me3 0.22789701574539437 1.4169728260263464e-238
3 midbrain 10.5-days  H3K9me3 0.021854889024951515 0.0017981980661522779
3 midbrain 16.5-days  H3K36me3 0.2853159387224747 0.0

<Figure size 216x216 with 0 Axes>

../_images/examples_histone_example_9_2.png

[16]:

	Node 1	Node 2	Node 3	Node 3 padj	labels
0	0.445820	0.699384	0.289066	0.000	forebrain 10.5-days H3K27ac
1	-0.368229	0.206595	0.559386	0.000	forebrain 10.5-days H3K27me3
2	0.718211	0.436143	0.231238	0.000	forebrain 10.5-days H3K36me3
3	0.084587	0.556536	0.621673	0.000	forebrain 10.5-days H3K4me1
4	0.441074	0.723878	0.237898	0.000	forebrain 10.5-days H3K4me3
5	0.332230	-0.195348	0.025472	0.000	forebrain 10.5-days H3K9me3
6	0.756912	0.385921	0.263760	0.000	forebrain 13.5-days H3K36me3
7	0.398017	0.695181	0.316142	0.000	forebrain 16.5-days H3K9ac
8	0.442529	0.700506	0.304733	0.000	hindbrain 10.5-days H3K27ac
9	-0.377153	0.185570	0.562586	0.000	hindbrain 10.5-days H3K27me3
10	0.732171	0.430092	0.245941	0.000	hindbrain 10.5-days H3K36me3
11	0.069732	0.565644	0.628300	0.000	hindbrain 10.5-days H3K4me1
12	0.451260	0.750478	0.227276	0.000	hindbrain 10.5-days H3K4me3
13	0.326482	-0.194928	0.028916	0.000	hindbrain 10.5-days H3K9me3
14	0.771463	0.281576	0.291352	0.000	hindbrain 16.5-days H3K36me3
15	0.450332	0.697936	0.291032	0.000	midbrain 10.5-days H3K27ac
16	-0.366634	0.204278	0.563875	0.000	midbrain 10.5-days H3K27me3
17	0.730558	0.424895	0.239150	0.000	midbrain 10.5-days H3K36me3
18	0.084020	0.587952	0.606545	0.000	midbrain 10.5-days H3K4me1
19	0.447552	0.751780	0.227897	0.000	midbrain 10.5-days H3K4me3
20	0.339000	-0.199171	0.021855	0.002	midbrain 10.5-days H3K9me3
21	0.760181	0.285027	0.285316	0.000	midbrain 16.5-days H3K36me3
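
The table above pairs each latent node with a per-feature correlation and an adjusted p-value. As a rough sketch of how such a table could be built by hand, the snippet below correlates one latent node with each feature column and adjusts the p-values. The choice of Pearson correlation, the Benjamini-Hochberg correction, the helper names `bh_adjust` and `correlate_node`, and the synthetic data are all assumptions made for illustration; the source does not state which statistic or correction produced the values shown.

```python
import numpy as np
import pandas as pd
from scipy.stats import pearsonr


def bh_adjust(pvals):
    """Benjamini-Hochberg adjusted p-values (assumed correction, not confirmed by the source)."""
    p = np.asarray(pvals, dtype=float)
    n = p.size
    order = np.argsort(p)
    # Scale each sorted p-value by n / rank.
    scaled = p[order] * n / np.arange(1, n + 1)
    # Enforce monotonicity starting from the largest p-value, then cap at 1.
    scaled = np.minimum.accumulate(scaled[::-1])[::-1]
    adjusted = np.empty(n)
    adjusted[order] = np.clip(scaled, 0.0, 1.0)
    return adjusted


def correlate_node(features, node_values):
    """Correlate one latent node with every feature column (hypothetical helper)."""
    rows = []
    for label in features.columns:
        r, p = pearsonr(features[label].to_numpy(), node_values)
        rows.append({"labels": label, "corr": r, "p": p})
    out = pd.DataFrame(rows)
    out["padj"] = bh_adjust(out["p"])
    return out


# Synthetic stand-in data, NOT the Encode values shown above.
rng = np.random.default_rng(0)
node = rng.normal(size=200)
toy = pd.DataFrame({
    "mark_positive": node + rng.normal(scale=0.5, size=200),
    "mark_negative": -node + rng.normal(scale=0.5, size=200),
    "mark_noise": rng.normal(size=200),
})
result = correlate_node(toy, node)
print(result)
```

With real data, `features` would hold the normalised input columns and `node_values` one column of the encoded latent space, with both restricted to the same genes in the same order.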

Having fun with inspecting the latent space¶

Now that we are confident the VAE has learnt meaningful structure, let’s see how all of our genes look in the latent space!

[19]:

vis.plot_feature_scatters(vae_df, 'external_gene_name', columns=new_cols, show_plt=True, fig_type="png",
                          save_fig=False,
                          title="latent space")

<Figure size 144x144 with 0 Axes>

../_images/examples_histone_example_11_1.png

../_images/examples_histone_example_11_2.png

../_images/examples_histone_example_11_3.png

../_images/examples_histone_example_11_4.png

../_images/examples_histone_example_11_5.png

../_images/examples_histone_example_11_6.png

../_images/examples_histone_example_11_7.png

../_images/examples_histone_example_11_8.png

../_images/examples_histone_example_11_9.png

../_images/examples_histone_example_11_10.png

../_images/examples_histone_example_11_11.png

../_images/examples_histone_example_11_12.png

../_images/examples_histone_example_11_13.png

../_images/examples_histone_example_11_14.png

../_images/examples_histone_example_11_15.png

../_images/examples_histone_example_11_16.png

../_images/examples_histone_example_11_17.png

../_images/examples_histone_example_11_18.png

../_images/examples_histone_example_11_19.png

../_images/examples_histone_example_11_20.png

../_images/examples_histone_example_11_21.png

../_images/examples_histone_example_11_22.png

Plot specific genes¶

Since we love certain genes, let’s have a look at where they sit in the latent space.

[17]:

# One list of gene names per group, in the same order as the group labels passed below
cool_genes = [
    # Forebrain
    ['Emx1', 'Eomes', 'Tbr1', 'Foxg1', 'Lhx6', 'Arx', 'Dlx1', 'Dlx2', 'Dlx5', 'Nr2e2', 'Otx2'],
    # Spinal cord
    ['Hoxd8', 'Hoxd9', 'Hoxd10', 'Hoxd11', 'Hoxd12', 'Hoxd13', 'Hoxa7', 'Hoxa9', 'Hoxa10', 'Hoxa11',
     'Hoxa13', 'Hoxb9', 'Hoxb13', 'Hoxc8', 'Hoxc9', 'Hoxc10', 'Hoxc11', 'Hoxc12', 'Hoxc13'],
    # Pro. Prolif.
    ['Ccna1', 'Ccna2', 'Ccnd1', 'Ccnd2', 'Ccnd3', 'Ccne1', 'Ccne2', 'Cdc25a',
     'Cdc25b', 'Cdc25c', 'E2f1', 'E2f2', 'E2f3', 'Mcm10', 'Mcm5', 'Mcm3', 'Mcm2', 'Cip2a'],
]

vis.plot_values_on_scatters(vae_df, "external_gene_name", ['Forebrain', 'Spinal cord', 'Pro. Prolif.'],
                            cool_genes, show_plt=True, fig_type=".png",
                            save_fig=False)

<Figure size 144x144 with 0 Axes>

../_images/examples_histone_example_13_1.png

[17]:

<Axes3DSubplot:xlabel='Node 1', ylabel='Node 2'>
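
To complement the scatter plot, it can help to check numerically where each gene group sits. The sketch below filters a dataframe by gene name and averages the latent node values per group. The helper `summarise_groups`, the toy dataframe and its node values are hypothetical stand-ins; only the `external_gene_name` column, the `Node N` naming and the group labels are taken from the cells above.

```python
import pandas as pd

# Toy stand-in for vae_df: gene names come from the lists above, node values are made up.
toy_vae_df = pd.DataFrame({
    "external_gene_name": ["Emx1", "Foxg1", "Hoxd9", "Hoxa10", "Ccnd1", "Actb"],
    "Node 1": [0.9, 1.1, -1.2, -0.8, 0.1, 0.0],
    "Node 2": [0.2, 0.4, -0.5, -0.3, 1.5, 0.0],
})

gene_groups = {
    "Forebrain": ["Emx1", "Foxg1"],
    "Spinal cord": ["Hoxd9", "Hoxa10"],
    "Pro. Prolif.": ["Ccnd1"],
}


def summarise_groups(df, groups, name_col="external_gene_name"):
    """Mean latent value per node for each gene group (hypothetical helper)."""
    node_cols = [c for c in df.columns if c.startswith("Node ")]
    rows = {}
    for group, genes in groups.items():
        # Keep only the rows whose gene name belongs to this group.
        hits = df[df[name_col].isin(genes)]
        rows[group] = hits[node_cols].mean()
    # One row per group, one column per latent node.
    return pd.DataFrame(rows).T


summary = summarise_groups(toy_vae_df, gene_groups)
print(summary)
```

Groups whose means sit far apart on a node are the ones that node separates; genes missing from the dataframe are silently skipped by `isin`, so it is worth checking the size of each group before reading much into the means.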