Histone modification data

Example using histone modification data downloaded from ENCODE.

[5]:
import pandas as pd
import numpy as np
from scivae import VAE

# Set the location of the histone modification data
data_dir = '~/Documents/code/scivae_public/tests/data/'
df = pd.read_csv(f'{data_dir}mouse_HM_var500_data.csv')
df
[5]:
entrezgene_id external_gene_name ensembl_gene_id embryonic-facial-prominence_10.5-days_embryonic_H3K27ac_ChIP-seq_ENCFF003VMR_width embryonic-facial-prominence_10.5-days_embryonic_H3K27ac_ChIP-seq_ENCFF003VMR_signal embryonic-facial-prominence_10.5-days_embryonic_H3K27me3_ChIP-seq_ENCFF310NGB_width embryonic-facial-prominence_10.5-days_embryonic_H3K27me3_ChIP-seq_ENCFF310NGB_signal embryonic-facial-prominence_10.5-days_embryonic_H3K36me3_ChIP-seq_ENCFF565QAD_width embryonic-facial-prominence_10.5-days_embryonic_H3K36me3_ChIP-seq_ENCFF565QAD_signal embryonic-facial-prominence_10.5-days_embryonic_H3K4me1_ChIP-seq_ENCFF053GHW_width ... stomach_16.5-days_embryonic_H3K4me1_ChIP-seq_ENCFF814BNR_width stomach_16.5-days_embryonic_H3K4me1_ChIP-seq_ENCFF814BNR_signal stomach_16.5-days_embryonic_H3K4me2_ChIP-seq_ENCFF501CJA_width stomach_16.5-days_embryonic_H3K4me2_ChIP-seq_ENCFF501CJA_signal stomach_16.5-days_embryonic_H3K4me3_ChIP-seq_ENCFF569KWB_width stomach_16.5-days_embryonic_H3K4me3_ChIP-seq_ENCFF569KWB_signal stomach_16.5-days_embryonic_H3K9ac_ChIP-seq_ENCFF068FWP_width stomach_16.5-days_embryonic_H3K9ac_ChIP-seq_ENCFF068FWP_signal stomach_16.5-days_embryonic_H3K9me3_ChIP-seq_ENCFF544RGQ_width stomach_16.5-days_embryonic_H3K9me3_ChIP-seq_ENCFF544RGQ_signal
0 497097 Xkr4 ENSMUSG00000051951 838.0 4.64805 2236.0 4.70623 NaN NaN 841.0 ... 459.0 4.17547 2522.0 32.56543 2456.0 37.44113 1852.0 6.81303 NaN NaN
1 384198 Fam47e ENSMUSG00000057068 NaN NaN NaN NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
2 12492 Scarb2 ENSMUSG00000029426 2053.0 16.06083 NaN NaN 4699.0 4.03960 787.0 ... 797.0 4.94311 862.0 20.08811 3071.0 61.25575 2503.0 24.87381 NaN NaN
3 269113 Nup54 ENSMUSG00000034826 1546.0 23.33510 NaN NaN 8433.0 4.30511 462.0 ... 215.0 2.45555 1376.0 35.42474 2128.0 66.67310 1165.0 28.39603 425.0 3.4231
4 15945 Cxcl10 ENSMUSG00000034855 NaN NaN 984.0 4.95978 NaN NaN 1086.0 ... 641.0 3.41532 794.0 13.95355 661.0 8.53067 NaN NaN NaN NaN
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
20395 21762 Psmd2 ENSMUSG00000006998 1431.0 10.42818 NaN NaN 4157.0 3.39655 1512.0 ... NaN NaN 1407.0 13.44737 1765.0 61.31395 1525.0 20.17010 NaN NaN
20396 73047 Camk2n2 ENSMUSG00000051146 2117.0 10.87675 3875.0 5.10808 734.0 3.56510 3664.0 ... 656.0 4.96696 3348.0 26.86646 3417.0 25.29333 1863.0 7.71053 NaN NaN
20397 107522 Ece2 ENSMUSG00000022842 1041.0 7.72166 3514.0 5.33374 NaN NaN 1335.0 ... 301.0 2.85033 1312.0 30.50417 1046.0 31.87724 926.0 10.12819 NaN NaN
20398 208624 Alg3 ENSMUSG00000033809 2342.0 17.18692 NaN NaN 754.0 5.22586 1288.0 ... NaN NaN 3259.0 44.11346 2597.0 66.85134 834.0 9.84724 NaN NaN
20399 328643 Vwa5b2 ENSMUSG00000046613 666.0 1.94738 7033.0 14.60257 4385.0 7.32809 2502.0 ... 724.0 3.69700 1864.0 32.48818 1716.0 28.13862 1324.0 7.19501 NaN NaN

20400 rows × 997 columns

Normalise the data

Before training the VAE we may want to restrict it to a subset of the data. Here I’m interested only in the marks in the brain at day E10.5.

[7]:
df = df.fillna(0)
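# Select the signal columns for the brain samples at E10.5.
# Note: '10' is matched anywhere in the column name (presumably also inside the ENCFF
# accession), which is why a few 13.5- and 16.5-day columns appear among the features later on.
cols = [c for c in df.columns if '10' in c and 'brain' in c and 'signal' in c]
# Apply log2(x + 1) to the signal values since they're too diffuse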
vae_df = pd.DataFrame()
vae_df['external_gene_name'] = df['external_gene_name'].values
new_cols = []
for c in cols:
    new_name = ' '.join(c.split('_')[:-3]).replace('embryonic', '')
    new_cols.append(new_name)
    vae_df[new_name] = np.log2(df[c] + 1)

dataset = vae_df[new_cols].values
# Create and train VAE
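
The column selection and renaming above rely on plain substring tests, so it is worth trying them on a handful of names first. The sketch below is illustrative only: the stomach column name is taken from the table above, while the forebrain names use made-up ENCFF accessions. The `strict` filter is an assumed alternative, not what the notebook ran.

```python
# Column names: the stomach one appears in the table above; the forebrain ones
# use made-up (hypothetical) ENCFF accessions purely for illustration.
columns = [
    "forebrain_10.5-days_embryonic_H3K27ac_ChIP-seq_ENCFF000AAA_signal",
    "forebrain_10.5-days_embryonic_H3K27ac_ChIP-seq_ENCFF000AAA_width",
    "forebrain_13.5-days_embryonic_H3K36me3_ChIP-seq_ENCFF210BBB_signal",
    "stomach_16.5-days_embryonic_H3K4me1_ChIP-seq_ENCFF814BNR_signal",
]


def short_name(column):
    """Drop the assay, accession and value-type fields, then the word 'embryonic'."""
    return " ".join(column.split("_")[:-3]).replace("embryonic", "")


# The notebook's filter: '10' may match anywhere, including inside an accession.
loose = [c for c in columns if "10" in c and "brain" in c and "signal" in c]

# A stricter variant (an assumption, not the notebook's code) that anchors on the
# full day field and on the '_signal' suffix.
strict = [
    c for c in columns
    if "_10.5-days_" in c and "brain" in c and c.endswith("_signal")
]

print([short_name(c) for c in loose])
print([short_name(c) for c in strict])
```

With these example names the loose filter also keeps the 13.5-day column, because its hypothetical accession contains `10`. Note that `short_name` leaves a double space where `embryonic` was removed, which matches the labels printed later in this example.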

Train the VAE

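We now configure and train the VAE: two hidden layers in each of the encoder and decoder, three latent nodes, and an MSE reconstruction loss combined with an MMD term.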

[14]:
config = {
    "loss": {
        "loss_type": "mse",
        "distance_metric": "mmd",
        "mmd_weight": 1.0
    },
    "encoding": {
        "layers": [
            {"num_nodes": 64, "activation_fn": "relu"},
            {"num_nodes": 32, "activation_fn": "selu"}
        ]
    },
    "decoding": {
        "layers": [
            {"num_nodes": 32, "activation_fn": "selu"},
            {"num_nodes": 64, "activation_fn": "relu"}
        ]
    },
    "latent": {
        "num_nodes": 3
    },
    "optimiser": {
        "params": {"learning_rate": 0.001, "beta_1": 0.8, "beta_2": 0.97},
        "name": "adamax"
    }
}

vae = VAE(dataset, dataset, ["None"] * len(dataset), config, 'vae_rcm')
vae.encode('default', epochs=10, batch_size=50)
None
Model: "encoder"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to
==================================================================================================
default_input (InputLayer)      [(None, 22)]         0
__________________________________________________________________________________________________
dense (Dense)                   (None, 64)           1472        default_input[0][0]
__________________________________________________________________________________________________
dense_1 (Dense)                 (None, 32)           2080        dense[0][0]
__________________________________________________________________________________________________
z_mean (Dense)                  (None, 3)            99          dense_1[0][0]
__________________________________________________________________________________________________
z_log_sigma (Dense)             (None, 3)            99          dense_1[0][0]
__________________________________________________________________________________________________
z (Lambda)                      (None, 3)            0           z_mean[0][0]
                                                                 z_log_sigma[0][0]
==================================================================================================
Total params: 3,750
Trainable params: 3,750
Non-trainable params: 0
__________________________________________________________________________________________________
Model: "decoder"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
z_sampling (InputLayer)      [(None, 3)]               0
_________________________________________________________________
dense_2 (Dense)              (None, 32)                128
_________________________________________________________________
dense_3 (Dense)              (None, 64)                2112
_________________________________________________________________
dense_4 (Dense)              (None, 22)                1430
=================================================================
Total params: 3,670
Trainable params: 3,670
Non-trainable params: 0
_________________________________________________________________
Model: "vae_rcm_scivae"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to
==================================================================================================
default_input (InputLayer)      [(None, 22)]         0
__________________________________________________________________________________________________
encoder (Functional)            [(None, 3), (None, 3 3750        default_input[0][0]
__________________________________________________________________________________________________
decoder (Functional)            (None, 22)           3670        encoder[0][2]
__________________________________________________________________________________________________
dense (Dense)                   (None, 64)           1472        default_input[0][0]
__________________________________________________________________________________________________
dense_1 (Dense)                 (None, 32)           2080        dense[0][0]
__________________________________________________________________________________________________
z_mean (Dense)                  (None, 3)            99          dense_1[0][0]
__________________________________________________________________________________________________
z_log_sigma (Dense)             (None, 3)            99          dense_1[0][0]
__________________________________________________________________________________________________
z (Lambda)                      (None, 3)            0           z_mean[0][0]
                                                                 z_log_sigma[0][0]
__________________________________________________________________________________________________
tf_op_layer_Shape (TensorFlowOp [(2,)]               0           z[0][0]
__________________________________________________________________________________________________
tf_op_layer_strided_slice (Tens [()]                 0           tf_op_layer_Shape[0][0]
__________________________________________________________________________________________________
tf_op_layer_Pack (TensorFlowOpL [(2,)]               0           tf_op_layer_strided_slice[0][0]
__________________________________________________________________________________________________
tf_op_layer_RandomStandardNorma [(None, 3)]          0           tf_op_layer_Pack[0][0]
__________________________________________________________________________________________________
tf_op_layer_Mul (TensorFlowOpLa [(None, 3)]          0           tf_op_layer_RandomStandardNormal[
__________________________________________________________________________________________________
tf_op_layer_Add (TensorFlowOpLa [(None, 3)]          0           tf_op_layer_Mul[0][0]
__________________________________________________________________________________________________
tf_op_layer_Shape_1 (TensorFlow [(2,)]               0           tf_op_layer_Add[0][0]
__________________________________________________________________________________________________
tf_op_layer_Shape_3 (TensorFlow [(2,)]               0           tf_op_layer_Add[0][0]
__________________________________________________________________________________________________
tf_op_layer_Shape_2 (TensorFlow [(2,)]               0           tf_op_layer_Add[0][0]
__________________________________________________________________________________________________
tf_op_layer_Shape_4 (TensorFlow [(2,)]               0           z[0][0]
__________________________________________________________________________________________________
tf_op_layer_Shape_6 (TensorFlow [(2,)]               0           z[0][0]
__________________________________________________________________________________________________
tf_op_layer_Shape_5 (TensorFlow [(2,)]               0           z[0][0]
__________________________________________________________________________________________________
tf_op_layer_Shape_7 (TensorFlow [(2,)]               0           tf_op_layer_Add[0][0]
__________________________________________________________________________________________________
tf_op_layer_Shape_9 (TensorFlow [(2,)]               0           tf_op_layer_Add[0][0]
__________________________________________________________________________________________________
tf_op_layer_Shape_8 (TensorFlow [(2,)]               0           z[0][0]
__________________________________________________________________________________________________
tf_op_layer_strided_slice_1 (Te [()]                 0           tf_op_layer_Shape_1[0][0]
__________________________________________________________________________________________________
tf_op_layer_strided_slice_3 (Te [()]                 0           tf_op_layer_Shape_3[0][0]
__________________________________________________________________________________________________
tf_op_layer_strided_slice_2 (Te [()]                 0           tf_op_layer_Shape_2[0][0]
__________________________________________________________________________________________________
tf_op_layer_strided_slice_4 (Te [()]                 0           tf_op_layer_Shape_4[0][0]
__________________________________________________________________________________________________
tf_op_layer_strided_slice_6 (Te [()]                 0           tf_op_layer_Shape_6[0][0]
__________________________________________________________________________________________________
tf_op_layer_strided_slice_5 (Te [()]                 0           tf_op_layer_Shape_5[0][0]
__________________________________________________________________________________________________
tf_op_layer_strided_slice_7 (Te [()]                 0           tf_op_layer_Shape_7[0][0]
__________________________________________________________________________________________________
tf_op_layer_strided_slice_9 (Te [()]                 0           tf_op_layer_Shape_9[0][0]
__________________________________________________________________________________________________
tf_op_layer_strided_slice_8 (Te [()]                 0           tf_op_layer_Shape_8[0][0]
__________________________________________________________________________________________________
tf_op_layer_Reshape/shape (Tens [(3,)]               0           tf_op_layer_strided_slice_1[0][0]
                                                                 tf_op_layer_strided_slice_3[0][0]
__________________________________________________________________________________________________
tf_op_layer_Reshape_1/shape (Te [(3,)]               0           tf_op_layer_strided_slice_2[0][0]
                                                                 tf_op_layer_strided_slice_3[0][0]
__________________________________________________________________________________________________
tf_op_layer_Reshape_2/shape (Te [(3,)]               0           tf_op_layer_strided_slice_4[0][0]
                                                                 tf_op_layer_strided_slice_6[0][0]
__________________________________________________________________________________________________
tf_op_layer_Reshape_3/shape (Te [(3,)]               0           tf_op_layer_strided_slice_5[0][0]
                                                                 tf_op_layer_strided_slice_6[0][0]
__________________________________________________________________________________________________
tf_op_layer_Reshape_4/shape (Te [(3,)]               0           tf_op_layer_strided_slice_7[0][0]
                                                                 tf_op_layer_strided_slice_9[0][0]
__________________________________________________________________________________________________
tf_op_layer_Reshape_5/shape (Te [(3,)]               0           tf_op_layer_strided_slice_8[0][0]
                                                                 tf_op_layer_strided_slice_9[0][0]
__________________________________________________________________________________________________
tf_op_layer_Reshape (TensorFlow [(None, 1, None)]    0           tf_op_layer_Add[0][0]
                                                                 tf_op_layer_Reshape/shape[0][0]
__________________________________________________________________________________________________
tf_op_layer_Tile/multiples (Ten [(3,)]               0           tf_op_layer_strided_slice_2[0][0]
__________________________________________________________________________________________________
tf_op_layer_Reshape_1 (TensorFl [(1, None, None)]    0           tf_op_layer_Add[0][0]
                                                                 tf_op_layer_Reshape_1/shape[0][0]
__________________________________________________________________________________________________
tf_op_layer_Tile_1/multiples (T [(3,)]               0           tf_op_layer_strided_slice_1[0][0]
__________________________________________________________________________________________________
tf_op_layer_Reshape_2 (TensorFl [(None, 1, None)]    0           z[0][0]
                                                                 tf_op_layer_Reshape_2/shape[0][0]
__________________________________________________________________________________________________
tf_op_layer_Tile_2/multiples (T [(3,)]               0           tf_op_layer_strided_slice_5[0][0]
__________________________________________________________________________________________________
tf_op_layer_Reshape_3 (TensorFl [(1, None, None)]    0           z[0][0]
                                                                 tf_op_layer_Reshape_3/shape[0][0]
__________________________________________________________________________________________________
tf_op_layer_Tile_3/multiples (T [(3,)]               0           tf_op_layer_strided_slice_4[0][0]
__________________________________________________________________________________________________
tf_op_layer_Reshape_4 (TensorFl [(None, 1, None)]    0           tf_op_layer_Add[0][0]
                                                                 tf_op_layer_Reshape_4/shape[0][0]
__________________________________________________________________________________________________
tf_op_layer_Tile_4/multiples (T [(3,)]               0           tf_op_layer_strided_slice_8[0][0]
__________________________________________________________________________________________________
tf_op_layer_Reshape_5 (TensorFl [(1, None, None)]    0           z[0][0]
                                                                 tf_op_layer_Reshape_5/shape[0][0]
__________________________________________________________________________________________________
tf_op_layer_Tile_5/multiples (T [(3,)]               0           tf_op_layer_strided_slice_7[0][0]
__________________________________________________________________________________________________
tf_op_layer_Tile (TensorFlowOpL [(None, None, None)] 0           tf_op_layer_Reshape[0][0]
                                                                 tf_op_layer_Tile/multiples[0][0]
__________________________________________________________________________________________________
tf_op_layer_Tile_1 (TensorFlowO [(None, None, None)] 0           tf_op_layer_Reshape_1[0][0]
                                                                 tf_op_layer_Tile_1/multiples[0][0
__________________________________________________________________________________________________
tf_op_layer_Tile_2 (TensorFlowO [(None, None, None)] 0           tf_op_layer_Reshape_2[0][0]
                                                                 tf_op_layer_Tile_2/multiples[0][0
__________________________________________________________________________________________________
tf_op_layer_Tile_3 (TensorFlowO [(None, None, None)] 0           tf_op_layer_Reshape_3[0][0]
                                                                 tf_op_layer_Tile_3/multiples[0][0
__________________________________________________________________________________________________
tf_op_layer_Tile_4 (TensorFlowO [(None, None, None)] 0           tf_op_layer_Reshape_4[0][0]
                                                                 tf_op_layer_Tile_4/multiples[0][0
__________________________________________________________________________________________________
tf_op_layer_Tile_5 (TensorFlowO [(None, None, None)] 0           tf_op_layer_Reshape_5[0][0]
                                                                 tf_op_layer_Tile_5/multiples[0][0
__________________________________________________________________________________________________
tf_op_layer_Sub (TensorFlowOpLa [(None, None, None)] 0           tf_op_layer_Tile[0][0]
                                                                 tf_op_layer_Tile_1[0][0]
__________________________________________________________________________________________________
tf_op_layer_Sub_1 (TensorFlowOp [(None, None, None)] 0           tf_op_layer_Tile_2[0][0]
                                                                 tf_op_layer_Tile_3[0][0]
__________________________________________________________________________________________________
tf_op_layer_Sub_2 (TensorFlowOp [(None, None, None)] 0           tf_op_layer_Tile_4[0][0]
                                                                 tf_op_layer_Tile_5[0][0]
__________________________________________________________________________________________________
tf_op_layer_Square (TensorFlowO [(None, None, None)] 0           tf_op_layer_Sub[0][0]
__________________________________________________________________________________________________
tf_op_layer_Square_1 (TensorFlo [(None, None, None)] 0           tf_op_layer_Sub_1[0][0]
__________________________________________________________________________________________________
tf_op_layer_Square_2 (TensorFlo [(None, None, None)] 0           tf_op_layer_Sub_2[0][0]
__________________________________________________________________________________________________
tf_op_layer_Mean (TensorFlowOpL [(None, None)]       0           tf_op_layer_Square[0][0]
__________________________________________________________________________________________________
tf_op_layer_Mean_1 (TensorFlowO [(None, None)]       0           tf_op_layer_Square_1[0][0]
__________________________________________________________________________________________________
tf_op_layer_Mean_2 (TensorFlowO [(None, None)]       0           tf_op_layer_Square_2[0][0]
__________________________________________________________________________________________________
tf_op_layer_Neg (TensorFlowOpLa [(None, None)]       0           tf_op_layer_Mean[0][0]
__________________________________________________________________________________________________
tf_op_layer_Cast (TensorFlowOpL [()]                 0           tf_op_layer_strided_slice_3[0][0]
__________________________________________________________________________________________________
tf_op_layer_Neg_1 (TensorFlowOp [(None, None)]       0           tf_op_layer_Mean_1[0][0]
__________________________________________________________________________________________________
tf_op_layer_Cast_1 (TensorFlowO [()]                 0           tf_op_layer_strided_slice_6[0][0]
__________________________________________________________________________________________________
tf_op_layer_Neg_2 (TensorFlowOp [(None, None)]       0           tf_op_layer_Mean_2[0][0]
__________________________________________________________________________________________________
tf_op_layer_Cast_2 (TensorFlowO [()]                 0           tf_op_layer_strided_slice_9[0][0]
__________________________________________________________________________________________________
tf_op_layer_RealDiv (TensorFlow [(None, None)]       0           tf_op_layer_Neg[0][0]
                                                                 tf_op_layer_Cast[0][0]
__________________________________________________________________________________________________
tf_op_layer_RealDiv_1 (TensorFl [(None, None)]       0           tf_op_layer_Neg_1[0][0]
                                                                 tf_op_layer_Cast_1[0][0]
__________________________________________________________________________________________________
tf_op_layer_RealDiv_2 (TensorFl [(None, None)]       0           tf_op_layer_Neg_2[0][0]
                                                                 tf_op_layer_Cast_2[0][0]
__________________________________________________________________________________________________
tf_op_layer_Exp (TensorFlowOpLa [(None, None)]       0           tf_op_layer_RealDiv[0][0]
__________________________________________________________________________________________________
tf_op_layer_Exp_1 (TensorFlowOp [(None, None)]       0           tf_op_layer_RealDiv_1[0][0]
__________________________________________________________________________________________________
tf_op_layer_Exp_2 (TensorFlowOp [(None, None)]       0           tf_op_layer_RealDiv_2[0][0]
__________________________________________________________________________________________________
tf_op_layer_Mean_3 (TensorFlowO [()]                 0           tf_op_layer_Exp[0][0]
__________________________________________________________________________________________________
tf_op_layer_Mean_4 (TensorFlowO [()]                 0           tf_op_layer_Exp_1[0][0]
__________________________________________________________________________________________________
tf_op_layer_Mean_5 (TensorFlowO [()]                 0           tf_op_layer_Exp_2[0][0]
__________________________________________________________________________________________________
tf_op_layer_Sub_4 (TensorFlowOp [(None, 22)]         0           default_input[0][0]
                                                                 decoder[0][0]
__________________________________________________________________________________________________
tf_op_layer_AddV2 (TensorFlowOp [()]                 0           tf_op_layer_Mean_3[0][0]
                                                                 tf_op_layer_Mean_4[0][0]
__________________________________________________________________________________________________
tf_op_layer_Mul_1 (TensorFlowOp [()]                 0           tf_op_layer_Mean_5[0][0]
__________________________________________________________________________________________________
tf_op_layer_Square_3 (TensorFlo [(None, 22)]         0           tf_op_layer_Sub_4[0][0]
__________________________________________________________________________________________________
tf_op_layer_Sub_3 (TensorFlowOp [()]                 0           tf_op_layer_AddV2[0][0]
                                                                 tf_op_layer_Mul_1[0][0]
__________________________________________________________________________________________________
tf_op_layer_Sum (TensorFlowOpLa [(None,)]            0           tf_op_layer_Square_3[0][0]
__________________________________________________________________________________________________
tf_op_layer_Mul_2 (TensorFlowOp [()]                 0           tf_op_layer_Sub_3[0][0]
__________________________________________________________________________________________________
tf_op_layer_AddV2_1 (TensorFlow [(None,)]            0           tf_op_layer_Sum[0][0]
                                                                 tf_op_layer_Mul_2[0][0]
__________________________________________________________________________________________________
tf_op_layer_Mean_6 (TensorFlowO [()]                 0           tf_op_layer_AddV2_1[0][0]
__________________________________________________________________________________________________
add_loss (AddLoss)              ()                   0           tf_op_layer_Mean_6[0][0]
==================================================================================================
Total params: 7,420
Trainable params: 7,420
Non-trainable params: 0
__________________________________________________________________________________________________
None
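
The long run of `tf_op_layer_*` rows in the summary above appears to be the MMD term requested by `"distance_metric": "mmd"`: three pairwise kernels (Reshape, Tile, Sub, Square, Mean, Neg, RealDiv, Exp), combined as two means minus twice a third. Below is a minimal NumPy sketch of that kind of estimate, assuming a Gaussian kernel scaled by the latent dimension. It is an illustration of the technique, not scivae's own code, and the function names are hypothetical.

```python
import numpy as np


def gaussian_kernel(x, y):
    """Pairwise kernel exp(-mean((x_i - y_j)^2) / dim) between the rows of x and y."""
    dim = x.shape[1]
    diff = x[:, None, :] - y[None, :, :]  # shape (n_x, n_y, dim)
    return np.exp(-np.mean(diff ** 2, axis=2) / dim)


def mmd(x, y):
    """Biased estimate of squared MMD: E[k(x, x)] + E[k(y, y)] - 2 E[k(x, y)]."""
    return (
        gaussian_kernel(x, x).mean()
        + gaussian_kernel(y, y).mean()
        - 2.0 * gaussian_kernel(x, y).mean()
    )


rng = np.random.default_rng(0)
prior = rng.standard_normal((500, 3))          # samples from the N(0, I) prior
close = rng.standard_normal((500, 3))          # latent codes that match the prior
shifted = rng.standard_normal((500, 3)) + 3.0  # latent codes far from the prior

mmd_close = mmd(close, prior)
mmd_shifted = mmd(shifted, prior)
print(mmd_close, mmd_shifted)
```

The estimate is near zero when the two samples come from the same distribution and grows as they move apart, so adding it to the reconstruction loss pushes the encoded values towards the standard normal prior.
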
Epoch 1/10
  1/347 [..............................] - ETA: 0s - loss: 3.4721
WARNING:tensorflow:From /Users/ariane/opt/miniconda3/envs/ml/lib/python3.8/site-packages/tensorflow/python/ops/summary_ops_v2.py:1277: stop (from tensorflow.python.eager.profiler) is deprecated and will be removed after 2020-07-01.
Instructions for updating:
use `tf.profiler.experimental.stop` instead.
WARNING:tensorflow:Callbacks method `on_train_batch_end` is slow compared to the batch time (batch time: 0.0018s vs `on_train_batch_end` time: 0.0139s). Check your callbacks.
347/347 [==============================] - 1s 1ms/step - loss: 1.1860 - val_loss: 0.4848
Epoch 2/10
347/347 [==============================] - 0s 858us/step - loss: 0.4239 - val_loss: 0.3916
Epoch 3/10
347/347 [==============================] - 0s 856us/step - loss: 0.3679 - val_loss: 0.3570
Epoch 4/10
347/347 [==============================] - 0s 854us/step - loss: 0.3421 - val_loss: 0.3354
Epoch 5/10
347/347 [==============================] - 0s 859us/step - loss: 0.3260 - val_loss: 0.3230
Epoch 6/10
347/347 [==============================] - 0s 855us/step - loss: 0.3101 - val_loss: 0.3098
Epoch 7/10
347/347 [==============================] - 0s 870us/step - loss: 0.3001 - val_loss: 0.3030
Epoch 8/10
347/347 [==============================] - 0s 865us/step - loss: 0.2926 - val_loss: 0.2974
Epoch 9/10
347/347 [==============================] - 0s 901us/step - loss: 0.2875 - val_loss: 0.2908
Epoch 10/10
347/347 [==============================] - 0s 932us/step - loss: 0.2822 - val_loss: 0.2883
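
As a sanity check on the architecture, the parameter counts in the encoder and decoder summaries can be reproduced by hand: a dense layer with `n_in` inputs and `n_out` outputs has `n_in * n_out` weights plus `n_out` biases. The helper name below is hypothetical; the layer sizes are those in the config and the 22 input features reported in the summary.

```python
def dense_params(n_in, n_out):
    """Number of weights plus biases in a fully connected layer."""
    return n_in * n_out + n_out


n_features, n_latent = 22, 3

encoder = [
    dense_params(n_features, 64),  # dense
    dense_params(64, 32),          # dense_1
    dense_params(32, n_latent),    # z_mean
    dense_params(32, n_latent),    # z_log_sigma
]
decoder = [
    dense_params(n_latent, 32),    # dense_2
    dense_params(32, 64),          # dense_3
    dense_params(64, n_features),  # dense_4
]

print(encoder, sum(encoder))  # [1472, 2080, 99, 99] 3750
print(decoder, sum(decoder))  # [128, 2112, 1430] 3670
```

The two totals add up to the 7,420 trainable parameters reported for the combined model.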

Quality control!

Check that the values of each latent node approximately follow a normal distribution.

[15]:
from scivae import Vis

vis = Vis(vae, vae.u, None)
vis.plot_node_hists(show_plt=True, save_fig=False)
../_images/examples_histone_example_7_1.png
../_images/examples_histone_example_7_3.png
../_images/examples_histone_example_7_5.png
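
Histograms are judged by eye, so a few summary statistics per node can serve as a complementary check. The sketch below is a generic NumPy illustration, not part of scivae: `fake_latent` is a random stand-in for the encoded values, which you would replace with the real (genes × nodes) array from the trained model, however you obtain it.

```python
import numpy as np


def node_summary(latent):
    """Per-node mean, standard deviation, skewness and excess kurtosis."""
    mean = latent.mean(axis=0)
    std = latent.std(axis=0)
    z = (latent - mean) / std
    skew = (z ** 3).mean(axis=0)
    excess_kurtosis = (z ** 4).mean(axis=0) - 3.0
    return mean, std, skew, excess_kurtosis


# Stand-in data with the same shape as this example (20400 genes, 3 latent nodes).
rng = np.random.default_rng(0)
fake_latent = rng.standard_normal((20400, 3))

mean, std, skew, kurt = node_summary(fake_latent)
for i in range(fake_latent.shape[1]):
    print(f"node {i + 1}: mean={mean[i]:.3f} std={std[i]:.3f} "
          f"skew={skew[i]:.3f} excess_kurtosis={kurt[i]:.3f}")
```

For an approximately normal node, skewness and excess kurtosis should both be close to zero; large values in either suggest a skewed or heavy-tailed node.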

Quality control 2: Visualise the correlation between features.

Since the VAE isn’t magic, just good at learning correlations and patterns between input features, it’s sensible to check that correlations between the features and the nodes exist.

[16]:
vis.plot_node_feature_correlation(vae_df, 'external_gene_name', columns=new_cols, show_plt=True, save_fig=False)
1 forebrain 10.5-days  H3K27ac 0.4458199156559925 0.0
1 forebrain 10.5-days  H3K27me3 -0.36822856875234455 0.0
1 forebrain 10.5-days  H3K36me3 0.7182105590240476 0.0
1 forebrain 10.5-days  H3K4me1 0.08458726186139696 1.0252935071364138e-33
1 forebrain 10.5-days  H3K4me3 0.44107422217939335 0.0
1 forebrain 10.5-days  H3K9me3 0.3322304714711835 0.0
1 forebrain 13.5-days  H3K36me3 0.7569123576516704 0.0
1 forebrain 16.5-days  H3K9ac 0.39801675553415206 0.0
1 hindbrain 10.5-days  H3K27ac 0.44252869028028696 0.0
1 hindbrain 10.5-days  H3K27me3 -0.3771528991191724 0.0
1 hindbrain 10.5-days  H3K36me3 0.7321705949738145 0.0
1 hindbrain 10.5-days  H3K4me1 0.06973185019912057 2.036931562717694e-23
1 hindbrain 10.5-days  H3K4me3 0.45126012014510813 0.0
1 hindbrain 10.5-days  H3K9me3 0.3264817001379393 0.0
1 hindbrain 16.5-days  H3K36me3 0.7714628829696493 0.0
1 midbrain 10.5-days  H3K27ac 0.4503315853763465 0.0
1 midbrain 10.5-days  H3K27me3 -0.3666337452858759 0.0
1 midbrain 10.5-days  H3K36me3 0.7305575293886859 0.0
1 midbrain 10.5-days  H3K4me1 0.0840195102765061 2.7591686911311554e-33
1 midbrain 10.5-days  H3K4me3 0.44755223785962295 0.0
1 midbrain 10.5-days  H3K9me3 0.33900035619636754 0.0
1 midbrain 16.5-days  H3K36me3 0.7601814747175523 0.0
2 forebrain 10.5-days  H3K27ac 0.6993835507290372 0.0
2 forebrain 10.5-days  H3K27me3 0.20659462514255594 1.6867100459448565e-195
2 forebrain 10.5-days  H3K36me3 0.4361433127132986 0.0
2 forebrain 10.5-days  H3K4me1 0.5565358671909726 0.0
2 forebrain 10.5-days  H3K4me3 0.7238777304800306 0.0
2 forebrain 10.5-days  H3K9me3 -0.19534750429567438 1.3140794381319174e-174
2 forebrain 13.5-days  H3K36me3 0.3859214672795727 0.0
2 forebrain 16.5-days  H3K9ac 0.6951814583436485 0.0
2 hindbrain 10.5-days  H3K27ac 0.7005059927719833 0.0
2 hindbrain 10.5-days  H3K27me3 0.18556958021502687 1.8188890397226624e-157
2 hindbrain 10.5-days  H3K36me3 0.4300915377001582 0.0
2 hindbrain 10.5-days  H3K4me1 0.5656442933199602 0.0
2 hindbrain 10.5-days  H3K4me3 0.7504783215717985 0.0
2 hindbrain 10.5-days  H3K9me3 -0.19492769494427906 7.481605744627867e-174
2 hindbrain 16.5-days  H3K36me3 0.2815763124778623 0.0
2 midbrain 10.5-days  H3K27ac 0.6979357744327495 0.0
2 midbrain 10.5-days  H3K27me3 0.20427806575300653 4.301548566325711e-191
2 midbrain 10.5-days  H3K36me3 0.4248950750456577 0.0
2 midbrain 10.5-days  H3K4me1 0.5879522185404434 0.0
2 midbrain 10.5-days  H3K4me3 0.7517804636786536 0.0
2 midbrain 10.5-days  H3K9me3 -0.19917129159908467 1.4378527990397066e-181
2 midbrain 16.5-days  H3K36me3 0.28502688680921845 0.0
3 forebrain 10.5-days  H3K27ac 0.2890663800190854 0.0
3 forebrain 10.5-days  H3K27me3 0.5593864732716971 0.0
3 forebrain 10.5-days  H3K36me3 0.23123764557098217 9.403954879613397e-246
3 forebrain 10.5-days  H3K4me1 0.6216729280325257 0.0
3 forebrain 10.5-days  H3K4me3 0.23789807422300102 2.0641959594635796e-260
3 forebrain 10.5-days  H3K9me3 0.02547228690467184 0.0002741905345491083
3 forebrain 13.5-days  H3K36me3 0.2637598624109295 0.0
3 forebrain 16.5-days  H3K9ac 0.3161420021696647 0.0
3 hindbrain 10.5-days  H3K27ac 0.3047330189007952 0.0
3 hindbrain 10.5-days  H3K27me3 0.5625858889736766 0.0
3 hindbrain 10.5-days  H3K36me3 0.24594051955316365 9.835681231685088e-279
3 hindbrain 10.5-days  H3K4me1 0.6282999983783705 0.0
3 hindbrain 10.5-days  H3K4me3 0.22727626909374676 2.967658174597913e-237
3 hindbrain 10.5-days  H3K9me3 0.028915715334947162 3.618164152092494e-05
3 hindbrain 16.5-days  H3K36me3 0.2913522951162253 0.0
3 midbrain 10.5-days  H3K27ac 0.2910320635114295 0.0
3 midbrain 10.5-days  H3K27me3 0.5638753340469604 0.0
3 midbrain 10.5-days  H3K36me3 0.23914964254076615 3.223652555211334e-263
3 midbrain 10.5-days  H3K4me1 0.6065451621570106 0.0
3 midbrain 10.5-days  H3K4me3 0.22789701574539437 1.4169728260263464e-238
3 midbrain 10.5-days  H3K9me3 0.021854889024951515 0.0017981980661522779
3 midbrain 16.5-days  H3K36me3 0.2853159387224747 0.0
<Figure size 216x216 with 0 Axes>
../_images/examples_histone_example_9_2.png
[16]:
Node 1 Node 1 padj Node 2 Node 2 padj Node 3 Node 3 padj labels
0 0.445820 0.0 0.699384 0.0 0.289066 0.000 forebrain 10.5-days H3K27ac
1 -0.368229 0.0 0.206595 0.0 0.559386 0.000 forebrain 10.5-days H3K27me3
2 0.718211 0.0 0.436143 0.0 0.231238 0.000 forebrain 10.5-days H3K36me3
3 0.084587 0.0 0.556536 0.0 0.621673 0.000 forebrain 10.5-days H3K4me1
4 0.441074 0.0 0.723878 0.0 0.237898 0.000 forebrain 10.5-days H3K4me3
5 0.332230 0.0 -0.195348 0.0 0.025472 0.000 forebrain 10.5-days H3K9me3
6 0.756912 0.0 0.385921 0.0 0.263760 0.000 forebrain 13.5-days H3K36me3
7 0.398017 0.0 0.695181 0.0 0.316142 0.000 forebrain 16.5-days H3K9ac
8 0.442529 0.0 0.700506 0.0 0.304733 0.000 hindbrain 10.5-days H3K27ac
9 -0.377153 0.0 0.185570 0.0 0.562586 0.000 hindbrain 10.5-days H3K27me3
10 0.732171 0.0 0.430092 0.0 0.245941 0.000 hindbrain 10.5-days H3K36me3
11 0.069732 0.0 0.565644 0.0 0.628300 0.000 hindbrain 10.5-days H3K4me1
12 0.451260 0.0 0.750478 0.0 0.227276 0.000 hindbrain 10.5-days H3K4me3
13 0.326482 0.0 -0.194928 0.0 0.028916 0.000 hindbrain 10.5-days H3K9me3
14 0.771463 0.0 0.281576 0.0 0.291352 0.000 hindbrain 16.5-days H3K36me3
15 0.450332 0.0 0.697936 0.0 0.291032 0.000 midbrain 10.5-days H3K27ac
16 -0.366634 0.0 0.204278 0.0 0.563875 0.000 midbrain 10.5-days H3K27me3
17 0.730558 0.0 0.424895 0.0 0.239150 0.000 midbrain 10.5-days H3K36me3
18 0.084020 0.0 0.587952 0.0 0.606545 0.000 midbrain 10.5-days H3K4me1
19 0.447552 0.0 0.751780 0.0 0.227897 0.000 midbrain 10.5-days H3K4me3
20 0.339000 0.0 -0.199171 0.0 0.021855 0.002 midbrain 10.5-days H3K9me3
21 0.760181 0.0 0.285027 0.0 0.285316 0.000 midbrain 16.5-days H3K36me3
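
A table like the one above can be approximated by correlating each latent node with each input feature across genes. The block below is a minimal sketch on synthetic data. It assumes Pearson correlation, and the notebook may use a different statistic. The p-values and their multiple-testing adjustment are left out, and the names `latent`, `features` and `corr_df` are hypothetical.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n_genes = 200

# Synthetic stand-in for the encoded data: one row per gene, one column per latent node.
latent = pd.DataFrame(rng.normal(size=(n_genes, 3)),
                      columns=["Node 1", "Node 2", "Node 3"])

# Two synthetic input features: one built from Node 1, one that is pure noise.
features = pd.DataFrame({
    "forebrain 10.5-days H3K36me3": latent["Node 1"] + 0.1 * rng.normal(size=n_genes),
    "forebrain 10.5-days H3K9me3": rng.normal(size=n_genes),
})

rows = []
for label in features.columns:
    row = {"labels": label}
    for node in latent.columns:
        # Pearson correlation between this node and this feature across all genes.
        row[node] = np.corrcoef(latent[node], features[label])[0, 1]
    rows.append(row)

# One row per feature, one correlation column per node, labels last.
corr_df = pd.DataFrame(rows)[["Node 1", "Node 2", "Node 3", "labels"]]
print(corr_df.round(3))
```

In this toy data, the feature built from Node 1 correlates strongly with Node 1, which is the same kind of pattern the real table shows for H3K36me3.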

Inspecting the latent space

The correlations above suggest that the latent nodes have captured structure in the histone marks, so let's look at where all our genes fall in the latent space.

[19]:
vis.plot_feature_scatters(vae_df, 'external_gene_name', columns=new_cols, show_plt=True, fig_type="png",
                          save_fig=False,
                          title="latent space")

<Figure size 144x144 with 0 Axes>
../_images/examples_histone_example_11_1.png
../_images/examples_histone_example_11_2.png
../_images/examples_histone_example_11_3.png
../_images/examples_histone_example_11_4.png
../_images/examples_histone_example_11_5.png
../_images/examples_histone_example_11_6.png
../_images/examples_histone_example_11_7.png
../_images/examples_histone_example_11_8.png
../_images/examples_histone_example_11_9.png
../_images/examples_histone_example_11_10.png
../_images/examples_histone_example_11_11.png
../_images/examples_histone_example_11_12.png
../_images/examples_histone_example_11_13.png
../_images/examples_histone_example_11_14.png
../_images/examples_histone_example_11_15.png
../_images/examples_histone_example_11_16.png
../_images/examples_histone_example_11_17.png
../_images/examples_histone_example_11_18.png
../_images/examples_histone_example_11_19.png
../_images/examples_histone_example_11_20.png
../_images/examples_histone_example_11_21.png
../_images/examples_histone_example_11_22.png

Plot specific genes

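Next, let's see where a few gene sets of interest fall in the latent space. Each set is given its own label in the plot: 'Forebrain', 'Spinal cord' (Hox genes) and 'Pro. Prolif.' (cell cycle genes).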

[17]:
# Gene sets to highlight; each inner list is paired with the label at the same position below
cool_genes = [
    # Forebrain
    ['Emx1', 'Eomes', 'Tbr1', 'Foxg1', 'Lhx6', 'Arx', 'Dlx1', 'Dlx2', 'Dlx5', 'Nr2e2', 'Otx2'],
    # Spinal cord (Hox genes)
    ['Hoxd8', 'Hoxd9', 'Hoxd10', 'Hoxd11', 'Hoxd12', 'Hoxd13', 'Hoxa7', 'Hoxa9', 'Hoxa10', 'Hoxa11',
     'Hoxa13', 'Hoxb9', 'Hoxb13', 'Hoxc8', 'Hoxc9', 'Hoxc10', 'Hoxc11', 'Hoxc12', 'Hoxc13'],
    # Pro. Prolif. (cell cycle genes)
    ['Ccna1', 'Ccna2', 'Ccnd1', 'Ccnd2', 'Ccnd3', 'Ccne1', 'Ccne2', 'Cdc25a',
     'Cdc25b', 'Cdc25c', 'E2f1', 'E2f2', 'E2f3', 'Mcm10', 'Mcm5', 'Mcm3', 'Mcm2', 'Cip2a'],
]

vis.plot_values_on_scatters(vae_df, "external_gene_name", ['Forebrain', 'Spinal cord', 'Pro. Prolif.'],
                            cool_genes, show_plt=True, fig_type=".png",
                            save_fig=False)
<Figure size 144x144 with 0 Axes>
../_images/examples_histone_example_13_1.png
[17]:
<Axes3DSubplot:xlabel='Node 1', ylabel='Node 2'>
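
As a numeric complement to the scatter plot, the position of each gene set can be summarised by its mean latent coordinates. The block below is a sketch on synthetic data, not part of the scivae API. The toy `vae_df` only mimics the assumed layout of the real one (an `external_gene_name` column plus `Node 1` to `Node 3`), and `gene_groups` and `group_df` are hypothetical names.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
node_cols = ["Node 1", "Node 2", "Node 3"]

# Toy latent table with random coordinates (hypothetical stand-in for vae_df).
genes = ["Emx1", "Foxg1", "Hoxd8", "Hoxa9", "Ccna1", "E2f1", "Xkr4", "Scarb2"]
vae_df = pd.DataFrame(rng.normal(size=(len(genes), 3)), columns=node_cols)
vae_df.insert(0, "external_gene_name", genes)

# Shortened versions of the gene sets; 'Mcm10' is deliberately absent from the
# toy table to show that missing genes are simply not counted.
gene_groups = {
    "Forebrain": ["Emx1", "Foxg1"],
    "Spinal cord": ["Hoxd8", "Hoxa9"],
    "Pro. Prolif.": ["Ccna1", "E2f1", "Mcm10"],
}

summaries = []
for name, members in gene_groups.items():
    # Keep only the rows whose gene name belongs to this set.
    hits = vae_df[vae_df["external_gene_name"].isin(members)]
    centroid = hits[node_cols].mean()
    summaries.append({"group": name, "n_found": len(hits), **centroid.to_dict()})

# One row per gene set: how many genes were found and their mean position per node.
group_df = pd.DataFrame(summaries).set_index("group")
print(group_df.round(3))
```

Checking `n_found` against the length of each gene list is a quick way to spot gene names that are misspelled or were filtered out before encoding.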