Spaces: Build error
Ashley Goluoglu committed · Commit 96283ff · 1 Parent(s): 3a085d0
add files from pantelis/IDNN
Browse files
- Figure_3.png +0 -0
- LICENSE +14 -0
- Pipfile +23 -0
- Pipfile.lock +0 -0
- README 2.md +49 -0
- compare_percent_mnist_5_AND_85_PERCENT_old.JPG +0 -0
- data/MNIST_data/t10k-images-idx3-ubyte.gz +3 -0
- data/MNIST_data/t10k-labels-idx1-ubyte.gz +3 -0
- data/MNIST_data/train-images-idx3-ubyte.gz +3 -0
- data/MNIST_data/train-labels-idx1-ubyte.gz +3 -0
- data/g1.mat +0 -0
- data/g2.mat +0 -0
- data/var_u.mat +0 -0
- docker-compose.yml +0 -0
- docker/Dockerfile.pytorch +76 -0
- docker/Dockerfile.tensorflow +67 -0
- docker/docker-font.conf +13 -0
- idnns/__init__.py +0 -0
- idnns/information/__init__.py +0 -0
- idnns/information/entropy_estimators.py +283 -0
- idnns/information/information_process.py +201 -0
- idnns/information/information_utilities.py +54 -0
- idnns/information/mutual_info_estimation.py +178 -0
- idnns/information/mutual_information_calculation.py +50 -0
- idnns/networks/__init__.py +0 -0
- idnns/networks/information_network.py +166 -0
- idnns/networks/model.py +212 -0
- idnns/networks/models.py +125 -0
- idnns/networks/network.py +173 -0
- idnns/networks/network_paramters.py +136 -0
- idnns/networks/ops.py +72 -0
- idnns/networks/utils.py +154 -0
- idnns/plots/__init__.py +0 -0
- idnns/plots/ops.py +20 -0
- idnns/plots/plot_figures.py +678 -0
- idnns/plots/plot_gradients.py +223 -0
- idnns/plots/utils.py +243 -0
- main.py +24 -0
- test.py +14 -0
Figure_3.png
ADDED
LICENSE
ADDED
@@ -0,0 +1,14 @@
+LICENSE CONDITIONS
+
+Copyright (2016) Ravid Shwartz-Ziv
+All rights reserved.
+
+For details, see the paper:
+Ravid Shwartz-Ziv, Naftali Tishby,
+Opening the Black Box of Deep Neural Networks via Information
+Arxiv, 2017
+Permission to use, copy, modify, and distribute this software and its documentation for educational, research, and non-commercial purposes, without fee and without a signed licensing agreement, is hereby granted, provided that the above copyright notice and this paragraph appear in all copies, modifications, and distributions.
+
+Any commercial use or any redistribution of this software requires a license. For further details, contact Ravid Shwartz-Ziv ([email protected]).
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
Pipfile
ADDED
@@ -0,0 +1,23 @@
+[[source]]
+url = "https://pypi.org/simple"
+verify_ssl = false
+name = "pip_conf_index_global"
+
+[packages]
+ipywidgets = "*"
+optuna = "*"
+pydantic = "*"
+filterpy = "==1.4.5"
+transformers = "==4.35.1"
+datasets = "==2.14.6"
+accelerate = "==0.24.1"
+nvidia-ml-py3 = "==7.352.0"
+pygobject = "==3.40.1"
+pytest = "==6.2.5"
+prml = {git = "git+https://github.com/pantelis-classes/PRML.git#egg=prml"}
+seaborn = "*"
+
+[dev-packages]
+
+[requires]
+python_version = "3.10"
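A brief usage note on this Pipfile: it pins the Python 3.10 environment that the Docker images later in this commit install. Assuming a standard pipenv workflow (not spelled out in the commit itself), the environment can be reproduced system-wide from the lockfile with `pipenv install --system --deploy --ignore-pipfile`, which is exactly the command both Dockerfiles run after copying Pipfile and Pipfile.lock into /tmp; a plain `pipenv install` in a checkout would create a local virtualenv instead.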
Pipfile.lock
ADDED
The diff for this file is too large to render.
See raw diff
README 2.md
ADDED
@@ -0,0 +1,49 @@
+# IDNNs
+## Description
+IDNNs is a Python library that implements training and calculation of information in deep neural networks
+[\[Shwartz-Ziv & Tishby, 2017\]](#IDNNs) in TensorFlow. The library allows you to investigate how networks look on the
+information plane and how it changes during learning.
+<img src="https://github.com/ravidziv/IDNNs/blob/master/compare_percent_mnist_5_AND_85_PERCENT_old.JPG" width="1000px"/>
+
+## Prerequisites
+- tensorflow r1.0 or higher
+- numpy 1.11.0
+- matplotlib 2.0.2
+- multiprocessing
+- joblib
+
+## Usage
+All the code is under the `idnns/` directory.
+To train a network and calculate its MI and gradients, run the example in [main.py](main.py).
+Of course, you can also run only specific methods, e.g. only the training procedure or only the MI calculation.
+This file takes the following command-line arguments:
+- `start_samples` - The index of the first sample for which to calculate the information
+- `batch_size` - The size of the batch
+- `learning_rate` - The learning rate of the network
+- `num_repeat` - The number of times to run the network
+- `num_epochs` - The maximum number of epochs for training
+- `net_arch` - The architecture of the networks
+- `per_data` - The percentage of the training data to use
+- `name` - The name under which the results are saved
+- `data_name` - The dataset name
+- `num_samples` - The maximum number of indexes for which to calculate the information
+- `save_ws` - True if we want to save the outputs of the network
+- `calc_information` - 1 if we want to calculate the MI of the network
+- `save_grads` - True if we want to save the gradients of the network
+- `run_in_parallel` - True if we want to run all the networks in parallel
+- `num_of_bins` - The number of bins into which the neurons' outputs are divided
+- `activation_function` - The activation function of the model: 0 for tanh, 1 for ReLU
+- `interval_accuracy_display` - The interval at which accuracy is displayed
+- `interval_information_display` - The interval at which the information calculation is displayed
+- `cov_net` - True if we want a convolutional network
+- `rand_labels` - True if we want to set random labels
+- `data_dir` - The directory in which to find the data
+The results are saved under the folder jobs. Each run creates a directory whose name contains the run properties. This directory holds the data.pickle file with the run's data and a Python file that is a copy of the file that created the run.
+The data is under the data directory.
+
+For plotting the results we have the file [plot_figures.py](idnns/plots/plot_figures.py).
+This file contains methods for plotting different aspects of the data (the information plane, the gradients, the norms, etc.).
+
+## References
+
+1. <a name="IDNNs"></a> Ravid Shwartz-Ziv, Naftali Tishby, [Opening the Black Box of Deep Neural Networks via Information](https://arxiv.org/abs/1703.00810), 2017, Arxiv.
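Putting the arguments above together, a typical invocation might look like `python main.py -num_epochs 1000 -batch_size 512 -num_of_bins 30 -calc_information 1`. This is an illustrative sketch only: the flag spellings follow this README, but the authoritative names and defaults live in idnns/networks/network_paramters.py, which this commit also adds.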
compare_percent_mnist_5_AND_85_PERCENT_old.JPG
ADDED
data/MNIST_data/t10k-images-idx3-ubyte.gz
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:8d422c7b0a1c1c79245a5bcf07fe86e33eeafee792b84584aec276f5a2dbc4e6
+size 1648877
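Note for review: this file and the three MNIST archives below are Git LFS pointer files, not the .gz archives themselves; each records only the LFS spec version, the sha256 of the real object, and its size in bytes. Assuming the standard Git LFS workflow (not shown in this commit), `git lfs install` followed by `git lfs pull` after cloning replaces the pointers with the actual archives.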
data/MNIST_data/t10k-labels-idx1-ubyte.gz
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:f7ae60f92e00ec6debd23a6088c31dbd2371eca3ffa0defaefb259924204aec6
+size 4542
data/MNIST_data/train-images-idx3-ubyte.gz
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:440fcabf73cc546fa21475e81ea370265605f56be210a4024d2ca8f203523609
+size 9912422
data/MNIST_data/train-labels-idx1-ubyte.gz
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:3552534a0a558bbed6aed32b30c495cca23d567ec52cac8be1a0730e8010255c
+size 28881
data/g1.mat
ADDED
Binary file (54 kB). View file
data/g2.mat
ADDED
Binary file (54 kB). View file
data/var_u.mat
ADDED
Binary file (1 kB). View file
docker-compose.yml
ADDED
File without changes
docker/Dockerfile.pytorch
ADDED
@@ -0,0 +1,76 @@
+FROM nvcr.io/nvidia/pytorch:23.05-py3
+
+# specify vscode as the user name in the docker
+# This user name should match that of the VS Code .devcontainer to allow seamless development inside the docker container via vscode
+ARG USERNAME=vscode
+ARG USER_UID=1001
+ARG USER_GID=$USER_UID
+
+# Create a non-root user
+RUN groupadd --gid $USER_GID $USERNAME \
+    && useradd -l -s /bin/bash --uid $USER_UID --gid $USER_GID -m $USERNAME \
+    # [Optional] Add sudo support for the non-root user - this is ok for development dockers only
+    && apt-get update \
+    && apt-get install -y sudo \
+    && echo $USERNAME ALL=\(root\) NOPASSWD:ALL > /etc/sudoers.d/$USERNAME \
+    && chmod 0440 /etc/sudoers.d/$USERNAME \
+    # Cleanup
+    && rm -rf /var/lib/apt/lists/* \
+    # Set up git completion.
+    && echo "source /usr/share/bash-completion/completions/git" >> /home/$USERNAME/.bashrc
+
+# Packages installation (eg git-lfs)
+RUN apt-get update && export DEBIAN_FRONTEND=noninteractive \
+    && apt-get -y install --no-install-recommends curl git-lfs ffmpeg libsm6 libxext6 graphviz libgraphviz-dev libsndfile1-dev xdg-utils swig gringo gobject-introspection libcairo2-dev libgirepository1.0-dev pkg-config python3-dev python3-gi
+
+COPY docker/docker-font.conf /etc/fonts/local.conf
+#ENV FREETYPE_PROPERTIES="truetype:interpreter-version=35"
+RUN echo "ttf-mscorefonts-installer msttcorefonts/accepted-mscorefonts-eula select true" | debconf-set-selections
+RUN apt-get update \
+    && apt-get install -y --no-install-recommends fontconfig ttf-mscorefonts-installer
+
+
+USER vscode
+
+# ACT for executing locally Github workflows
+RUN curl -s https://raw.githubusercontent.com/nektos/act/master/install.sh | sudo bash
+
+# NVM for managing npm versions
+RUN curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.39.3/install.sh | sudo bash
+
+# Git LFS repo configuration
+RUN curl -s https://packagecloud.io/install/repositories/github/git-lfs/script.deb.sh | sudo bash
+
+# Inkscape installation for managing svg files
+RUN sudo apt-get update && export DEBIAN_FRONTEND=noninteractive \
+    && sudo apt-get -y install --no-install-recommends inkscape
+
+SHELL ["/bin/bash", "-c"]
+ENV PATH="/home/vscode/.local/bin:$PATH"
+ENV GI_TYPELIB_PATH="/usr/lib/x86_64-linux-gnu/girepository-1.0"
+
+# RUN python -m pip install pip-tools
+# https://github.com/jazzband/pip-tools/issues/1596
+#RUN --mount=type=cache,target=/root/.cache/pip python3 -m piptools compile -v
+
+
+# Install dependencies:
+# COPY requirements.txt /tmp/requirements.txt
+# RUN pip install -r /tmp/requirements.txt
+
+RUN pip install --upgrade pip
+# && \
+#     pip install pipenv && \
+#     pipenv lock && \
+#     pipenv install --dev --system && \
+#     pipenv --clear
+RUN pip install pipenv
+COPY Pipfile.lock Pipfile /tmp/
+WORKDIR /tmp
+RUN pipenv install --system --deploy --ignore-pipfile
+
+WORKDIR /workspaces/artificial_intelligence
+
+COPY . .
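Since docker-compose.yml is an empty file in this commit, there is no recorded compose or devcontainer workflow yet. Assuming a plain Docker workflow (illustrative commands, not part of the commit), the image would be built and entered with `docker build -f docker/Dockerfile.pytorch -t idnns-dev .` followed by `docker run --gpus all -it idnns-dev bash`; the Dockerfile already copies the repository into /workspaces/artificial_intelligence and installs the locked Pipfile environment system-wide, so no further setup should be needed inside the container.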
docker/Dockerfile.tensorflow
ADDED
@@ -0,0 +1,67 @@
+#FROM nvcr.io/nvidia/tensorflow:23.07-tf2-py3
+FROM tensorflow/tensorflow:2.15.0
+# use the cpu image that the nvidia image was based on
+
+# specify vscode as the user name in the docker
+# This user name should match that of the VS Code .devcontainer to allow seamless development inside the docker container via vscode
+ARG USERNAME=vscode
+ARG USER_UID=1001
+ARG USER_GID=$USER_UID
+
+# Create a non-root user
+RUN groupadd --gid $USER_GID $USERNAME \
+    && useradd -s /bin/bash --uid $USER_UID --gid $USER_GID -m $USERNAME \
+    # [Optional] Add sudo support for the non-root user - this is ok for development dockers only
+    && apt-get update \
+    && apt-get install -y sudo \
+    && echo $USERNAME ALL=\(root\) NOPASSWD:ALL > /etc/sudoers.d/$USERNAME \
+    && chmod 0440 /etc/sudoers.d/$USERNAME \
+    # Cleanup
+    && rm -rf /var/lib/apt/lists/* \
+    # Set up git completion.
+    && echo "source /usr/share/bash-completion/completions/git" >> /home/$USERNAME/.bashrc
+
+# Packages installation (eg git-lfs)
+RUN apt-get update && export DEBIAN_FRONTEND=noninteractive \
+    && apt-get -y install --no-install-recommends curl git-lfs ffmpeg libsm6 libxext6 graphviz libsndfile1-dev libgraphviz-dev xdg-utils swig gringo gobject-introspection libcairo2-dev libgirepository1.0-dev pkg-config python3-dev python3-gi python3-tk
+
+#COPY docker/docker-font.conf /etc/fonts/local.conf
+#ENV FREETYPE_PROPERTIES="truetype:interpreter-version=35"
+#RUN echo "ttf-mscorefonts-installer msttcorefonts/accepted-mscorefonts-eula select true" | debconf-set-selections
+#RUN apt-get update \
+#&& apt-get install -y --no-install-recommends fontconfig ttf-mscorefonts-installer
+
+
+USER vscode
+
+# ACT for executing locally Github workflows
+RUN curl -s https://raw.githubusercontent.com/nektos/act/master/install.sh | sudo bash
+
+# NVM for managing npm versions
+RUN curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.39.3/install.sh | sudo bash
+
+# Git LFS repo configuration
+RUN curl -s https://packagecloud.io/install/repositories/github/git-lfs/script.deb.sh | sudo bash
+
+# Inkscape installation for managing svg files
+RUN sudo apt-get update && export DEBIAN_FRONTEND=noninteractive \
+    && sudo apt-get -y install --no-install-recommends inkscape
+
+
+SHELL ["/bin/bash", "-c"]
+ENV PATH="/home/vscode/.local/bin:$PATH"
+ENV GI_TYPELIB_PATH="/usr/lib/x86_64-linux-gnu/girepository-1.0"
+
+# Install dependencies:
+# COPY . .
+# RUN pip install -r requirements.txt
+
+RUN pip install --upgrade pip
+RUN pip install pipenv
+COPY Pipfile.lock Pipfile /tmp/
+WORKDIR /tmp
+RUN pipenv install --system --deploy --ignore-pipfile
+
+WORKDIR /workspaces/artificial_intelligence
+
+COPY . .
docker/docker-font.conf
ADDED
@@ -0,0 +1,13 @@
+<?xml version="1.0"?>
+<!DOCTYPE fontconfig SYSTEM "fonts.dtd">
+<fontconfig>
+  <match target="font">
+    <edit name="antialias" mode="assign"><bool>true</bool></edit>
+  </match>
+  <match target="font">
+    <edit name="hintstyle" mode="assign"><const>hintnone</const></edit>
+  </match>
+  <match target="font">
+    <edit mode="assign" name="hinting"><bool>false</bool></edit>
+  </match>
+</fontconfig>
idnns/__init__.py
ADDED
File without changes
idnns/information/__init__.py
ADDED
File without changes
idnns/information/entropy_estimators.py
ADDED
@@ -0,0 +1,283 @@
+#!/usr/bin/env python
+# Written by Greg Ver Steeg
+# See readme.pdf for documentation
+# Or go to http://www.isi.edu/~gregv/npeet.html
+
+import scipy.spatial as ss
+from scipy.special import digamma
+from math import log
+import numpy.random as nr
+import numpy as np
+import random
+
+# CONTINUOUS ESTIMATORS
+
+def entropy(x, k=3, base=2):
+    """ The classic K-L k-nearest neighbor continuous entropy estimator
+        x should be a list of vectors, e.g. x = [[1.3], [3.7], [5.1], [2.4]]
+        if x is a one-dimensional scalar and we have four samples
+    """
+    assert k <= len(x) - 1, "Set k smaller than num. samples - 1"
+    d = len(x[0])
+    N = len(x)
+    intens = 1e-10  # small noise to break degeneracy, see doc.
+    x = [list(p + intens * nr.rand(len(x[0]))) for p in x]
+    tree = ss.cKDTree(x)
+    nn = [tree.query(point, k + 1, p=float('inf'))[0][k] for point in x]
+    const = digamma(N) - digamma(k) + d * log(2)
+    return (const + d * np.mean(list(map(log, nn)))) / log(base)  # list() so np.mean works under Python 3
+
+def centropy(x, y, k=3, base=2):
+    """ The classic K-L k-nearest neighbor continuous entropy estimator for the
+        entropy of X conditioned on Y.
+    """
+    hxy = entropy([xi + yi for (xi, yi) in zip(x, y)], k, base)
+    hy = entropy(y, k, base)
+    return hxy - hy
+
+def column(xs, i):
+    return [[x[i]] for x in xs]
+
+def tc(xs, k=3, base=2):
+    xis = [entropy(column(xs, i), k, base) for i in range(0, len(xs[0]))]
+    return np.sum(xis) - entropy(xs, k, base)
+
+def ctc(xs, y, k=3, base=2):
+    xis = [centropy(column(xs, i), y, k, base) for i in range(0, len(xs[0]))]
+    return np.sum(xis) - centropy(xs, y, k, base)
+
+def corex(xs, ys, k=3, base=2):
+    cxis = [mi(column(xs, i), ys, k, base) for i in range(0, len(xs[0]))]
+    return np.sum(cxis) - mi(xs, ys, k, base)
+
+def mi(x, y, k=3, base=2):
+    """ Mutual information of x and y
+        x, y should be a list of vectors, e.g. x = [[1.3], [3.7], [5.1], [2.4]]
+        if x is a one-dimensional scalar and we have four samples
+    """
+    assert len(x) == len(y), "Lists should have same length"
+    assert k <= len(x) - 1, "Set k smaller than num. samples - 1"
+    intens = 1e-10  # small noise to break degeneracy, see doc.
+    x = [list(p + intens * nr.rand(len(x[0]))) for p in x]
+    y = [list(p + intens * nr.rand(len(y[0]))) for p in y]
+    points = zip2(x, y)
+    # Find nearest neighbors in joint space, p=inf means max-norm
+    tree = ss.cKDTree(points)
+    dvec = [tree.query(point, k + 1, p=float('inf'))[0][k] for point in points]
+    a, b, c, d = avgdigamma(x, dvec), avgdigamma(y, dvec), digamma(k), digamma(len(x))
+    return (-a - b + c + d) / log(base)
+
+
+def cmi(x, y, z, k=3, base=2):
+    """ Mutual information of x and y, conditioned on z
+        x, y, z should be a list of vectors, e.g. x = [[1.3], [3.7], [5.1], [2.4]]
+        if x is a one-dimensional scalar and we have four samples
+    """
+    assert len(x) == len(y), "Lists should have same length"
+    assert k <= len(x) - 1, "Set k smaller than num. samples - 1"
+    intens = 1e-10  # small noise to break degeneracy, see doc.
+    x = [list(p + intens * nr.rand(len(x[0]))) for p in x]
+    y = [list(p + intens * nr.rand(len(y[0]))) for p in y]
+    z = [list(p + intens * nr.rand(len(z[0]))) for p in z]
+    points = zip2(x, y, z)
+    # Find nearest neighbors in joint space, p=inf means max-norm
+    tree = ss.cKDTree(points)
+    dvec = [tree.query(point, k + 1, p=float('inf'))[0][k] for point in points]
+    a, b, c, d = avgdigamma(zip2(x, z), dvec), avgdigamma(zip2(y, z), dvec), avgdigamma(z, dvec), digamma(k)
+    return (-a - b + c + d) / log(base)
+
+
+def kldiv(x, xp, k=3, base=2):
+    """ KL Divergence between p and q for x~p(x), xp~q(x)
+        x, xp should be a list of vectors, e.g. x = [[1.3], [3.7], [5.1], [2.4]]
+        if x is a one-dimensional scalar and we have four samples
+    """
+    assert k <= len(x) - 1, "Set k smaller than num. samples - 1"
+    assert k <= len(xp) - 1, "Set k smaller than num. samples - 1"
+    assert len(x[0]) == len(xp[0]), "Two distributions must have same dim."
+    d = len(x[0])
+    n = len(x)
+    m = len(xp)
+    const = log(m) - log(n - 1)
+    tree = ss.cKDTree(x)
+    treep = ss.cKDTree(xp)
+    nn = [tree.query(point, k + 1, p=float('inf'))[0][k] for point in x]
+    nnp = [treep.query(point, k, p=float('inf'))[0][k - 1] for point in x]
+    return (const + d * np.mean(list(map(log, nnp))) - d * np.mean(list(map(log, nn)))) / log(base)
+
+
+# DISCRETE ESTIMATORS
+def entropyd(sx, base=2):
+    """ Discrete entropy estimator
+        Given a list of samples which can be any hashable object
+    """
+    return entropyfromprobs(hist(sx), base=base)
+
+
+def midd(x, y, base=2):
+    """ Discrete mutual information estimator
+        Given a list of samples which can be any hashable object
+    """
+    return -entropyd(zip(x, y), base) + entropyd(x, base) + entropyd(y, base)
+
+def cmidd(x, y, z):
+    """ Discrete mutual information estimator
+        Given a list of samples which can be any hashable object
+    """
+    return entropyd(zip(y, z)) + entropyd(zip(x, z)) - entropyd(zip(x, y, z)) - entropyd(z)
+
+def centropyd(x, y, base=2):
+    """ The classic K-L k-nearest neighbor continuous entropy estimator for the
+        entropy of X conditioned on Y.
+    """
+    return entropyd(zip(x, y), base) - entropyd(y, base)
+
+def tcd(xs, base=2):
+    xis = [entropyd(column(xs, i), base) for i in range(0, len(xs[0]))]
+    hx = entropyd(xs, base)
+    return np.sum(xis) - hx
+
+def ctcd(xs, y, base=2):
+    xis = [centropyd(column(xs, i), y, base) for i in range(0, len(xs[0]))]
+    return np.sum(xis) - centropyd(xs, y, base)
+
+def corexd(xs, ys, base=2):
+    cxis = [midd(column(xs, i), ys, base) for i in range(0, len(xs[0]))]
+    return np.sum(cxis) - midd(xs, ys, base)
+
+def hist(sx):
+    sx = discretize(sx)
+    # Histogram from list of samples
+    d = dict()
+    for s in sx:
+        if type(s) == list:
+            s = tuple(s)
+        d[s] = d.get(s, 0) + 1
+    return map(lambda z: float(z) / len(sx), d.values())
+
+
+def entropyfromprobs(probs, base=2):
+    # Turn a normalized list of probabilities of discrete outcomes into entropy (base 2)
+    return -sum(map(elog, probs)) / log(base)
+
+
+def elog(x):
+    # for entropy, 0 log 0 = 0. but we get an error for putting log 0
+    if x <= 0. or x >= 1.:
+        return 0
+    else:
+        return x * log(x)
+
+
+# MIXED ESTIMATORS
+def micd(x, y, k=3, base=2, warning=True):
+    """ If x is continuous and y is discrete, compute mutual information
+    """
+    overallentropy = entropy(x, k, base)
+
+    n = len(y)
+    word_dict = dict()
+    for i in range(len(y)):
+        if type(y[i]) == list:
+            y[i] = tuple(y[i])
+    for sample in y:
+        word_dict[sample] = word_dict.get(sample, 0) + 1. / n
+    yvals = list(set(word_dict.keys()))
+
+    mi = overallentropy
+    for yval in yvals:
+        xgiveny = [x[i] for i in range(n) if y[i] == yval]
+        if k <= len(xgiveny) - 1:
+            mi -= word_dict[yval] * entropy(xgiveny, k, base)
+        else:
+            if warning:
+                print("Warning, after conditioning, on y=", yval, " insufficient data. Assuming maximal entropy in this case.")
+            mi -= word_dict[yval] * overallentropy
+    return np.abs(mi)  # units already applied
+
+def midc(x, y, k=3, base=2, warning=True):
+    return micd(y, x, k, base, warning)
+
+def centropydc(x, y, k=3, base=2, warning=True):
+    return entropyd(x, base) - midc(x, y, k, base, warning)
+
+def centropycd(x, y, k=3, base=2, warning=True):
+    return entropy(x, k, base) - micd(x, y, k, base, warning)
+
+def ctcdc(xs, y, k=3, base=2, warning=True):
+    xis = [centropydc(column(xs, i), y, k, base, warning) for i in range(0, len(xs[0]))]
+    return np.sum(xis) - centropydc(xs, y, k, base, warning)
+
+def ctccd(xs, y, k=3, base=2, warning=True):
+    xis = [centropycd(column(xs, i), y, k, base, warning) for i in range(0, len(xs[0]))]
+    return np.sum(xis) - centropycd(xs, y, k, base, warning)
+
+def corexcd(xs, ys, k=3, base=2, warning=True):
+    cxis = [micd(column(xs, i), ys, k, base, warning) for i in range(0, len(xs[0]))]
+    return np.sum(cxis) - micd(xs, ys, k, base, warning)
+
+def corexdc(xs, ys, k=3, base=2, warning=True):
+    # cxis = [midc(column(xs, i), ys, k, base, warning) for i in range(0, len(xs[0]))]
+    # joint = midc(xs, ys, k, base, warning)
+    # return np.sum(cxis) - joint
+    return tcd(xs, base) - ctcdc(xs, ys, k, base, warning)
+
+# UTILITY FUNCTIONS
+def vectorize(scalarlist):
+    """ Turn a list of scalars into a list of one-d vectors
+    """
+    return [[x] for x in scalarlist]
+
+
+def shuffle_test(measure, x, y, z=False, ns=200, ci=0.95, **kwargs):
+    """ Shuffle test
+        Repeatedly shuffle the x-values and then estimate measure(x, y, [z]).
+        Returns the mean and conf. interval ('ci=0.95' default) over 'ns' runs.
+        'measure' could be mi, cmi, e.g. Keyword arguments can be passed.
+        Mutual information and CMI should have a mean near zero.
+    """
+    xp = x[:]  # A copy that we can shuffle
+    outputs = []
+    for i in range(ns):
+        random.shuffle(xp)
+        if z:
+            outputs.append(measure(xp, y, z, **kwargs))
+        else:
+            outputs.append(measure(xp, y, **kwargs))
+    outputs.sort()
+    return np.mean(outputs), (outputs[int((1. - ci) / 2 * ns)], outputs[int((1. + ci) / 2 * ns)])
+
+
+# INTERNAL FUNCTIONS
+
+def avgdigamma(points, dvec):
+    # This part finds number of neighbors in some radius in the marginal space
+    # returns expectation value of <psi(nx)>
+    N = len(points)
+    tree = ss.cKDTree(points)
+    avg = 0.
+    for i in range(N):
+        dist = dvec[i]
+        # subtlety, we don't include the boundary point,
+        # but we are implicitly adding 1 to the Kraskov definition because the center point is included
+        num_points = len(tree.query_ball_point(points[i], dist - 1e-15, p=float('inf')))
+        avg += digamma(num_points) / N
+    return avg
+
+
+def zip2(*args):
+    # zip2(x, y) takes the lists of vectors and makes it a list of vectors in a joint space
+    # E.g. zip2([[1], [2], [3]], [[4], [5], [6]]) = [[1, 4], [2, 5], [3, 6]]
+    return [sum(sublist, []) for sublist in zip(*args)]
+
+def discretize(xs):
+    def discretize_one(x):
+        if len(x) > 1:
+            return tuple(x)
+        else:
+            return x[0]
+    # discretize(xs) takes a list of vectors and makes it a list of tuples or scalars
+    return [discretize_one(x) for x in xs]
+
+if __name__ == "__main__":
+    print("NPEET: Non-parametric entropy estimation toolbox. See readme.pdf for details on usage.")
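As a quick sanity check of the NPEET estimators just added (which come from Greg Ver Steeg's toolbox and still carry some Python 2 heritage; note the list() wrappers around map() above), the snippet below runs the k-NN entropy and MI estimators on toy Gaussian data. It is illustrative only and not part of the commit; the expected values are approximate.

import numpy as np
import idnns.information.entropy_estimators as ee

rng = np.random.RandomState(1)
x = [[v] for v in rng.randn(1000)]             # 1000 draws of a 1-d N(0, 1)
y = [[xi[0] + 0.5 * rng.randn()] for xi in x]  # a noisy copy of x

print(ee.entropy(x, k=3, base=2))              # ~2.05 bits, the differential entropy of N(0, 1)
print(ee.mi(x, y, k=3, base=2))                # clearly positive: y depends on x
print(ee.mi(x, [[v] for v in rng.randn(1000)]))  # ~0 for independent samples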
idnns/information/information_process.py
ADDED
@@ -0,0 +1,201 @@
+'''
+Calculate the information in the network.
+Can be done by the full distribution rule (for a small network) or by different approximation methods.
+'''
+import multiprocessing
+import warnings
+import numpy as np
+import tensorflow as tf
+import idnns.information.information_utilities as inf_ut
+from idnns.networks import model as mo
+from idnns.information.mutual_info_estimation import calc_varitional_information
+warnings.filterwarnings("ignore")
+from joblib import Parallel, delayed
+NUM_CORES = multiprocessing.cpu_count()
+from idnns.information.mutual_information_calculation import *
+import numpy as np
+
+
+def calc_information_for_layer(data, bins, unique_inverse_x, unique_inverse_y, pxs, pys1):
+    bins = bins.astype(np.float32)
+    digitized = bins[np.digitize(np.squeeze(data.reshape(1, -1)), bins) - 1].reshape(len(data), -1)
+    b2 = np.ascontiguousarray(digitized).view(
+        np.dtype((np.void, digitized.dtype.itemsize * digitized.shape[1])))
+    unique_array, unique_inverse_t, unique_counts = \
+        np.unique(b2, return_index=False, return_inverse=True, return_counts=True)
+    p_ts = unique_counts / float(sum(unique_counts))
+    PXs, PYs = np.asarray(pxs).T, np.asarray(pys1).T
+    local_IXT, local_ITY = calc_information_from_mat(PXs, PYs, p_ts, digitized, unique_inverse_x, unique_inverse_y,
+                                                     unique_array)
+    return local_IXT, local_ITY
+
+
+def calc_information_sampling(data, bins, pys1, pxs, label, b, b1, len_unique_a, p_YgX, unique_inverse_x,
+                              unique_inverse_y, calc_DKL=False):
+    bins = bins.astype(np.float32)
+    num_of_bins = bins.shape[0]
+    # bins = stats.mstats.mquantiles(np.squeeze(data.reshape(1, -1)), np.linspace(0,1, num=num_of_bins))
+    # hist, bin_edges = np.histogram(np.squeeze(data.reshape(1, -1)), normed=True)
+    digitized = bins[np.digitize(np.squeeze(data.reshape(1, -1)), bins) - 1].reshape(len(data), -1)
+    b2 = np.ascontiguousarray(digitized).view(
+        np.dtype((np.void, digitized.dtype.itemsize * digitized.shape[1])))
+    unique_array, unique_inverse_t, unique_counts = \
+        np.unique(b2, return_index=False, return_inverse=True, return_counts=True)
+    p_ts = unique_counts / float(sum(unique_counts))
+    PXs, PYs = np.asarray(pxs).T, np.asarray(pys1).T
+    if calc_DKL:
+        pxy_given_T = np.array(
+            [calc_probs(i, unique_inverse_t, label, b, b1, len_unique_a) for i in range(0, len(unique_array))]
+        )
+        p_XgT = np.vstack(pxy_given_T[:, 0])
+        p_YgT = pxy_given_T[:, 1]
+        p_YgT = np.vstack(p_YgT).T
+        DKL_YgX_YgT = np.sum([inf_ut.KL(c_p_YgX, p_YgT.T) for c_p_YgX in p_YgX.T], axis=0)
+        H_Xgt = np.nansum(p_XgT * np.log2(p_XgT), axis=1)
+    local_IXT, local_ITY = calc_information_from_mat(PXs, PYs, p_ts, digitized, unique_inverse_x, unique_inverse_y,
+                                                     unique_array)
+    return local_IXT, local_ITY
+
+
+def calc_information_for_layer_with_other(data, bins, unique_inverse_x, unique_inverse_y, label,
+                                          b, b1, len_unique_a, pxs, p_YgX, pys1,
+                                          percent_of_sampling=50):
+    local_IXT, local_ITY = calc_information_sampling(data, bins, pys1, pxs, label, b, b1,
+                                                     len_unique_a, p_YgX, unique_inverse_x,
+                                                     unique_inverse_y)
+    number_of_indexs = int(data.shape[1] * (1. / 100 * percent_of_sampling))
+    indexs_of_sampls = np.random.choice(data.shape[1], number_of_indexs, replace=False)
+    if percent_of_sampling != 100:
+        sampled_data = data[:, indexs_of_sampls]
+        sampled_local_IXT, sampled_local_ITY = calc_information_sampling(
+            sampled_data, bins, pys1, pxs, label, b, b1, len_unique_a, p_YgX, unique_inverse_x, unique_inverse_y)
+
+    params = {}
+    params['local_IXT'] = local_IXT
+    params['local_ITY'] = local_ITY
+    return params
+
+
+def calc_by_sampling_neurons(ws_iter_index, num_of_samples, label, sigma, bins, pxs):
+    iter_infomration = []
+    for j in range(len(ws_iter_index)):
+        data = ws_iter_index[j]
+        new_data = np.zeros((num_of_samples * data.shape[0], data.shape[1]))
+        labels = np.zeros((num_of_samples * label.shape[0], label.shape[1]))
+        x = np.zeros((num_of_samples * data.shape[0], 2))
+        for i in range(data.shape[0]):
+            cov_matrix = np.eye(data[i, :].shape[0]) * sigma
+            t_i = np.random.multivariate_normal(data[i, :], cov_matrix, num_of_samples)
+            new_data[num_of_samples * i:(num_of_samples * (i + 1)), :] = t_i
+            labels[num_of_samples * i:(num_of_samples * (i + 1)), :] = label[i, :]
+            x[num_of_samples * i:(num_of_samples * (i + 1)), 0] = i
+        b = np.ascontiguousarray(x).view(np.dtype((np.void, x.dtype.itemsize * x.shape[1])))
+        unique_array, unique_indices, unique_inverse_x, unique_counts = \
+            np.unique(b, return_index=True, return_inverse=True, return_counts=True)
+        b_y = np.ascontiguousarray(labels).view(np.dtype((np.void, labels.dtype.itemsize * labels.shape[1])))
+        unique_array_y, unique_indices_y, unique_inverse_y, unique_counts_y = \
+            np.unique(b_y, return_index=True, return_inverse=True, return_counts=True)
+        pys1 = unique_counts_y / float(np.sum(unique_counts_y))
+        iter_infomration.append(
+            calc_information_for_layer(data=new_data, bins=bins, unique_inverse_x=unique_inverse_x,
+                                       unique_inverse_y=unique_inverse_y, pxs=pxs, pys1=pys1))
+    params = np.array(iter_infomration)
+    return params
+
+
+def calc_information_for_epoch(iter_index, interval_information_display, ws_iter_index, bins, unique_inverse_x,
+                               unique_inverse_y, label, b, b1,
+                               len_unique_a, pys, pxs, py_x, pys1, model_path, input_size, layerSize,
+                               calc_vartional_information=False, calc_information_by_sampling=False,
+                               calc_full_and_vartional=False, calc_regular_information=True, num_of_samples=100,
+                               sigma=0.5, ss=[], ks=[]):
+    """Calculate the information for all the layers for a specific epoch"""
+    np.random.seed(None)
+    if calc_full_and_vartional:
+        # Variational information
+        params_vartional = [
+            calc_varitional_information(ws_iter_index[i], label, model_path, i, len(ws_iter_index) - 1, iter_index,
+                                        input_size, layerSize, ss[i], pys, ks[i], search_sigma=False) for i in
+            range(len(ws_iter_index))]
+        # Full plug-in information
+        params_original = np.array(
+            [calc_information_for_layer_with_other(data=ws_iter_index[i], bins=bins, unique_inverse_x=unique_inverse_x,
+                                                   unique_inverse_y=unique_inverse_y, label=label,
+                                                   b=b, b1=b1, len_unique_a=len_unique_a, pxs=pxs,
+                                                   p_YgX=py_x, pys1=pys1)
+             for i in range(len(ws_iter_index))])
+        # Combine them
+        params = []
+        for i in range(len(ws_iter_index)):
+            current_params = params_original[i]
+            current_params_vartional = params_vartional[i]
+            current_params['IXT_vartional'] = current_params_vartional['local_IXT']
+            current_params['ITY_vartional'] = current_params_vartional['local_ITY']
+            params.append(current_params)
+    elif calc_vartional_information:
+        params = [
+            calc_varitional_information(ws_iter_index[i], label, model_path, i, len(ws_iter_index) - 1, iter_index,
+                                        input_size, layerSize, ss[i], pys, ks[i], search_sigma=True) for i in
+            range(len(ws_iter_index))]
+    # Calc information of only a subset of the neurons
+    elif calc_information_by_sampling:
+        params = calc_by_sampling_neurons(ws_iter_index=ws_iter_index, num_of_samples=num_of_samples, label=label,
+                                          sigma=sigma, bins=bins, pxs=pxs)
+
+    elif calc_regular_information:
+        params = np.array(
+            [calc_information_for_layer_with_other(data=ws_iter_index[i], bins=bins, unique_inverse_x=unique_inverse_x,
+                                                   unique_inverse_y=unique_inverse_y, label=label,
+                                                   b=b, b1=b1, len_unique_a=len_unique_a, pxs=pxs,
+                                                   p_YgX=py_x, pys1=pys1)
+             for i in range(len(ws_iter_index))])
+
+    if np.mod(iter_index, interval_information_display) == 0:
+        print('Calculated the information of epoch number - {0}'.format(iter_index))
+    return params
+
+
+def extract_probs(label, x):
+    """Calculate the probabilities of the given data and labels: p(x), p(y) and p(y|x)"""
+    pys = np.sum(label, axis=0) / float(label.shape[0])
+    b = np.ascontiguousarray(x).view(np.dtype((np.void, x.dtype.itemsize * x.shape[1])))
+    unique_array, unique_indices, unique_inverse_x, unique_counts = \
+        np.unique(b, return_index=True, return_inverse=True, return_counts=True)
+    unique_a = x[unique_indices]
+    b1 = np.ascontiguousarray(unique_a).view(np.dtype((np.void, unique_a.dtype.itemsize * unique_a.shape[1])))
+    pxs = unique_counts / float(np.sum(unique_counts))
+    p_y_given_x = []
+    for i in range(0, len(unique_array)):
+        indexs = unique_inverse_x == i
+        py_x_current = np.mean(label[indexs, :], axis=0)
+        p_y_given_x.append(py_x_current)
+    p_y_given_x = np.array(p_y_given_x).T
+    b_y = np.ascontiguousarray(label).view(np.dtype((np.void, label.dtype.itemsize * label.shape[1])))
+    unique_array_y, unique_indices_y, unique_inverse_y, unique_counts_y = \
+        np.unique(b_y, return_index=True, return_inverse=True, return_counts=True)
+    pys1 = unique_counts_y / float(np.sum(unique_counts_y))
+    return pys, pys1, p_y_given_x, b1, b, unique_a, unique_inverse_x, unique_inverse_y, pxs
+
+
+def get_information(ws, x, label, num_of_bins, interval_information_display, model, layerSize,
+                    calc_parallel=True, py_hats=0):
+    """Calculate the information for the network for all the epochs and all the layers"""
+    print('Start calculating the information...')
+    bins = np.linspace(-1, 1, num_of_bins)
+    label = np.array(label).astype(float)
+    pys, pys1, p_y_given_x, b1, b, unique_a, unique_inverse_x, unique_inverse_y, pxs = extract_probs(label, x)
+    if calc_parallel:
+        params = np.array(Parallel(n_jobs=NUM_CORES
+                                   )(delayed(calc_information_for_epoch)
+                                     (i, interval_information_display, ws[i], bins, unique_inverse_x, unique_inverse_y,
+                                      label,
+                                      b, b1, len(unique_a), pys,
+                                      pxs, p_y_given_x, pys1, model.save_file, x.shape[1], layerSize)
+                                     for i in range(len(ws))))
+    else:
+        params = np.array([calc_information_for_epoch
+                           (i, interval_information_display, ws[i], bins, unique_inverse_x, unique_inverse_y,
+                            label, b, b1, len(unique_a), pys,
+                            pxs, p_y_given_x, pys1, model.save_file, x.shape[1], layerSize)
+                           for i in range(len(ws))])
+    return params
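Two NumPy idioms recur throughout this file and are worth unpacking for review: np.digitize snaps each activation onto the left edge of its bin, and viewing each row as a single np.void scalar lets np.unique count distinct discretized rows (this trick predates np.unique(..., axis=0)). A standalone sketch with assumed toy values:

import numpy as np

bins = np.linspace(-1, 1, 5).astype(np.float32)  # bin edges on [-1, 1]
data = np.array([[0.1, -0.9], [0.12, -0.88], [0.9, 0.9]], dtype=np.float32)
# Snap each value to its bin's left edge; rows 0 and 1 collapse to the same pattern.
digitized = bins[np.digitize(np.squeeze(data.reshape(1, -1)), bins) - 1].reshape(len(data), -1)
# View each row as one opaque void scalar so np.unique can count distinct rows.
b2 = np.ascontiguousarray(digitized).view(
    np.dtype((np.void, digitized.dtype.itemsize * digitized.shape[1])))
_, unique_inverse_t, unique_counts = np.unique(b2, return_inverse=True, return_counts=True)
p_ts = unique_counts / float(unique_counts.sum())  # p(T) over patterns: [2/3, 1/3]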
idnns/information/information_utilities.py
ADDED
@@ -0,0 +1,54 @@
+import numpy as np
+num = 1
+def KL(a, b):
+    """Calculate the Kullback-Leibler divergence between a and b"""
+    D_KL = np.nansum(np.multiply(a, np.log(np.divide(a, b + np.spacing(1)))), axis=1)
+    return D_KL
+
+
+def calc_information(probTgivenXs, PYgivenTs, PXs, PYs):
+    """Calculate the MI - I(X;T) and I(Y;T)"""
+    PTs = np.nansum(probTgivenXs * PXs, axis=1)
+    Ht = np.nansum(-np.dot(PTs, np.log2(PTs)))
+    Htx = - np.nansum((np.dot(np.multiply(probTgivenXs, np.log2(probTgivenXs)), PXs)))
+    Hyt = - np.nansum(np.dot(PYgivenTs * np.log2(PYgivenTs + np.spacing(1)), PTs))
+    Hy = np.nansum(-PYs * np.log2(PYs + np.spacing(1)))
+    IYT = Hy - Hyt
+    ITX = Ht - Htx
+    return ITX, IYT
+
+def calc_information_1(probTgivenXs, PYgivenTs, PXs, PYs, PTs):
+    """Calculate the MI - I(X;T) and I(Y;T)"""
+    # PTs = np.nansum(probTgivenXs*PXs, axis=1)
+    Ht = np.nansum(-np.dot(PTs, np.log2(PTs + np.spacing(1))))
+    Htx = - np.nansum((np.dot(np.multiply(probTgivenXs, np.log2(probTgivenXs + np.spacing(1))), PXs)))
+    Hyt = - np.nansum(np.dot(PYgivenTs * np.log2(PYgivenTs + np.spacing(1)), PTs))
+    Hy = np.nansum(-PYs * np.log2(PYs + np.spacing(1)))
+    IYT = Hy - Hyt
+    ITX = Ht - Htx
+    return ITX, IYT
+
+# NOTE: this second definition shadows the 4-argument calc_information above.
+def calc_information(probTgivenXs, PYgivenTs, PXs, PYs, PTs):
+    """Calculate the MI - I(X;T) and I(Y;T)"""
+    # PTs = np.nansum(probTgivenXs*PXs, axis=1)
+    t_indeces = np.nonzero(PTs)
+    Ht = np.nansum(-np.dot(PTs, np.log2(PTs + np.spacing(1))))
+    Htx = - np.nansum((np.dot(np.multiply(probTgivenXs, np.log2(probTgivenXs)), PXs)))
+    Hyt = - np.nansum(np.dot(PYgivenTs * np.log2(PYgivenTs + np.spacing(1)), PTs))
+    Hy = np.nansum(-PYs * np.log2(PYs + np.spacing(1)))
+
+    IYT = Hy - Hyt
+    ITX = Ht - Htx
+
+    return ITX, IYT
+
+
+def t_calc_information(p_x_given_t, PYgivenTs, PXs, PYs):
+    """Calculate the MI - I(X;T) and I(Y;T)"""
+    # NOTE: PTs is not defined in this function's scope in the original source; callers must supply it globally.
+    Hx = np.nansum(-np.dot(PXs, np.log2(PXs)))
+    Hxt = - np.nansum((np.dot(np.multiply(p_x_given_t, np.log2(p_x_given_t)), PXs)))
+    Hyt = - np.nansum(np.dot(PYgivenTs * np.log2(PYgivenTs + np.spacing(1)), PTs))
+    Hy = np.nansum(-PYs * np.log2(PYs + np.spacing(1)))
+    IYT = Hy - Hyt
+    ITX = Hx - Hxt
+    return ITX, IYT
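A toy check of the second calc_information above (the later definition shadows the first): for a deterministic encoder T = X over two equiprobable inputs with Y = T, both informations should come out as 1 bit. Illustrative only, not part of the commit:

import numpy as np
from idnns.information.information_utilities import calc_information

probTgivenXs = np.eye(2)        # T copies X: p(t|x) is the identity
PYgivenTs = np.eye(2)           # Y copies T
PXs = np.array([0.5, 0.5])
PYs = np.array([0.5, 0.5])
PTs = np.array([0.5, 0.5])
print(calc_information(probTgivenXs, PYgivenTs, PXs, PYs, PTs))  # ~ (1.0, 1.0)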
idnns/information/mutual_info_estimation.py
ADDED
@@ -0,0 +1,178 @@
+import numpy as np
+from scipy.optimize import minimize
+import sys
+import tensorflow as tf
+from idnns.networks import model as mo
+import contextlib
+import idnns.information.entropy_estimators as ee
+
+@contextlib.contextmanager
+def printoptions(*args, **kwargs):
+    original = np.get_printoptions()
+    np.set_printoptions(*args, **kwargs)
+    try:
+        yield
+    finally:
+        np.set_printoptions(**original)
+
+
+def optimiaze_func(s, diff_mat, d, N):
+    diff_mat1 = (1. / (np.sqrt(2. * np.pi) * (s ** 2) ** (d / 2.))) * np.exp(-diff_mat / (2. * s ** 2))
+    np.fill_diagonal(diff_mat1, 0)
+    diff_mat2 = (1. / (N - 1)) * np.sum(diff_mat1, axis=0)
+    diff_mat3 = np.sum(np.log2(diff_mat2), axis=0)
+    return -diff_mat3
+
+
+def calc_all_sigams(data, sigmas):
+    batchs = 128
+    num_of_bins = 8
+    # bins = np.linspace(-1, 1, num_of_bins).astype(np.float32)
+    # bins = stats.mstats.mquantiles(np.squeeze(data.reshape(1, -1)), np.linspace(0,1, num=num_of_bins))
+    # data = bins[np.digitize(np.squeeze(data.reshape(1, -1)), bins) - 1].reshape(len(data), -1)
+
+    batch_points = np.rint(np.arange(0, data.shape[0] + 1, batchs)).astype(dtype=np.int32)
+    I_XT = []
+    num_of_rand = min(800, data.shape[1])
+    for sigma in sigmas:
+        # print sigma
+        I_XT_temp = 0
+        for i in range(0, len(batch_points) - 1):
+            new_data = data[batch_points[i]:batch_points[i + 1], :]
+            rand_indexs = np.random.randint(0, new_data.shape[1], num_of_rand)
+            new_data = new_data[:, :]
+            N = new_data.shape[0]
+            d = new_data.shape[1]
+            diff_mat = np.linalg.norm(((new_data[:, np.newaxis, :] - new_data)), axis=2)
+            # print diff_mat.shape, new_data.shape
+            s0 = 0.2
+            # TODO - add leave-one-out validation
+            res = minimize(optimiaze_func, s0, args=(diff_mat, d, N), method='nelder-mead',
+                           options={'xtol': 1e-8, 'disp': False, 'maxiter': 6})
+            eta = res.x
+            diff_mat0 = - 0.5 * (diff_mat / (sigma ** 2 + eta ** 2))
+            diff_mat1 = np.sum(np.exp(diff_mat0), axis=0)
+            diff_mat2 = -(1.0 / N) * np.sum(np.log2((1.0 / N) * diff_mat1))
+            I_XT_temp += diff_mat2 - d * np.log2((sigma ** 2) / (eta ** 2 + sigma ** 2))
+            # print diff_mat2 - d*np.log2((sigma**2)/(eta**2+sigma**2))
+        I_XT_temp /= len(batch_points)
+        I_XT.append(I_XT_temp)
+    sys.stdout.flush()
+    return I_XT
+
+
+def estimate_IY_by_network(data, labels, from_layer=0):
+    if len(data.shape) > 2:
+        input_size = data.shape[1:]
+    else:
+        input_size = data.shape[1]
+    p_y_given_t_i = data
+    acc_all = [0]
+    if from_layer < 5:
+        acc_all = []
+        g1 = tf.Graph()  # This is one graph
+        with g1.as_default():
+            # For each epoch and for each layer we calculate the best decoder - we train a 2-layer network
+            cov_net = 4
+            model = mo.Model(input_size, [400, 100, 50], labels.shape[1], 0.0001, '', cov_net=cov_net,
+                             from_layer=from_layer)
+            if from_layer < 5:
+                optimizer = model.optimize
+            init = tf.global_variables_initializer()
+            num_of_ephocs = 50
+            batch_size = 51
+            batch_points = np.rint(np.arange(0, data.shape[0] + 1, batch_size)).astype(dtype=np.int32)
+            if data.shape[0] not in batch_points:
+                batch_points = np.append(batch_points, [data.shape[0]])
+        with tf.Session(graph=g1) as sess:
+            sess.run(init)
+            if from_layer < 5:
+                for j in range(0, num_of_ephocs):
+                    for i in range(0, len(batch_points) - 1):
+                        batch_xs = data[batch_points[i]:batch_points[i + 1], :]
+                        batch_ys = labels[batch_points[i]:batch_points[i + 1], :]
+                        feed_dict = {model.x: batch_xs, model.labels: batch_ys}
+                        if cov_net == 1:
+                            feed_dict[model.drouput] = 0.5
+                        optimizer.run(feed_dict)
+            p_y_given_t_i = []
+            batch_size = 256
+            batch_points = np.rint(np.arange(0, data.shape[0] + 1, batch_size)).astype(dtype=np.int32)
+            if data.shape[0] not in batch_points:
+                batch_points = np.append(batch_points, [data.shape[0]])
+            for i in range(0, len(batch_points) - 1):
+                batch_xs = data[batch_points[i]:batch_points[i + 1], :]
+                batch_ys = labels[batch_points[i]:batch_points[i + 1], :]
+                feed_dict = {model.x: batch_xs, model.labels: batch_ys}
+                if cov_net == 1:
+                    feed_dict[model.drouput] = 1
+                p_y_given_t_i_local, acc = sess.run([model.prediction, model.accuracy],
+                                                    feed_dict=feed_dict)
+                acc_all.append(acc)
+                if i == 0:
+                    p_y_given_t_i = np.array(p_y_given_t_i_local)
+                else:
+                    p_y_given_t_i = np.concatenate((p_y_given_t_i, np.array(p_y_given_t_i_local)), axis=0)
+    # print ("The accuracy of layer number - {} - {}".format(from_layer, np.mean(acc_all)))
+    max_indx = len(p_y_given_t_i)
+    labels_cut = labels[:max_indx, :]
+    true_label_index = np.argmax(labels_cut, 1)
+    s = np.log2(p_y_given_t_i[np.arange(len(p_y_given_t_i)), true_label_index])
+    I_TY = np.mean(s[np.isfinite(s)])
+    PYs = np.sum(labels_cut, axis=0) / labels_cut.shape[0]
+    Hy = np.nansum(-PYs * np.log2(PYs + np.spacing(1)))
+    I_TY = Hy + I_TY
+    I_TY = I_TY if I_TY >= 0 else 0
+    acc = np.mean(acc_all)
+    sys.stdout.flush()
+    return I_TY, acc
+
+
+def calc_varitional_information(data, labels, model_path, layer_numer, num_of_layers, epoch_index, input_size,
+                                layerSize, sigma, pys, ks,
+                                search_sigma=False, estimate_y_by_network=False):
+    """Calculate an estimation of the information using the variational IB"""
+    # Assumptions
+    estimate_y_by_network = True
+    # search_sigma = False
+    data_x = data.reshape(data.shape[0], -1)
+
+    if search_sigma:
+        sigmas = np.linspace(0.2, 10, 20)
+        sigmas = [0.2]
+    else:
+        sigmas = [sigma]
+    if False:
+        I_XT = calc_all_sigams(data_x, sigmas)
+    else:
+        I_XT = 0
+    if estimate_y_by_network:
+        I_TY, acc = estimate_IY_by_network(data, labels, from_layer=layer_numer)
+    else:
+        I_TY = 0
+    with printoptions(precision=3, suppress=True, formatter={'float': '{: 0.3f}'.format}):
+        print('[{0}:{1}] - I(X;T) - {2}, I(X;Y) - {3}, accuracy - {4}'.format(epoch_index, layer_numer,
+                                                                              np.array(I_XT).flatten(), I_TY, acc))
+        sys.stdout.flush()
+
+    # I_est = mutual_information((data, labels[:, 0][:, None]), PYs, k=ks)
+    # I_est, I_XT = 0, 0
+    params = {}
+    # params['DKL_YgX_YgT'] = DKL_YgX_YgT
+    # params['pts'] = p_ts
+    # params['H_Xgt'] = H_Xgt
+    params['local_IXT'] = I_XT
+    params['local_ITY'] = I_TY
+    return params
+
+def estimate_Information(Xs, Ys, Ts):
+    """Estimation of the MI from missing data based on k-means clustering"""
+    estimate_IXT = ee.mi(Xs, Ts)
+    estimate_IYT = ee.mi(Ys, Ts)
+    # estimate_IXT1 = ee.mi(Xs, Ts)
+    # estimate_IYT1 = ee.mi(Ys, Ts)
+    return estimate_IXT, estimate_IYT
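estimate_Information above simply defers to the NPEET k-NN estimator, so a toy sanity check is cheap. A hedged sketch (illustrative names and values, not from this commit; importing this module also pulls in TensorFlow via idnns.networks.model, although estimate_Information itself never uses it):

import numpy as np
from idnns.information.mutual_info_estimation import estimate_Information

rng = np.random.RandomState(0)
Xs = [[v] for v in rng.randn(300)]
Ts = [[x[0] + 0.1 * rng.randn()] for x in Xs]  # T is a noisy copy of X
Ys = [[v] for v in rng.randn(300)]             # Y independent of everything
I_XT, I_YT = estimate_Information(Xs, Ys, Ts)  # expect I_XT >> I_YT ~ 0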
idnns/information/mutual_information_calculation.py
ADDED
@@ -0,0 +1,50 @@
+'Calculation of the full plug-in distribution'
+
+import numpy as np
+import multiprocessing
+from joblib import Parallel, delayed
+
+NUM_CORES = multiprocessing.cpu_count()
+
+
+def calc_entropy_for_specipic_t(current_ts, px_i):
+    """Calc entropy for a specific t"""
+    b2 = np.ascontiguousarray(current_ts).view(
+        np.dtype((np.void, current_ts.dtype.itemsize * current_ts.shape[1])))
+    unique_array, unique_inverse_t, unique_counts = \
+        np.unique(b2, return_index=False, return_inverse=True, return_counts=True)
+    p_current_ts = unique_counts / float(sum(unique_counts))
+    p_current_ts = np.asarray(p_current_ts, dtype=np.float64).T
+    H2X = px_i * (-np.sum(p_current_ts * np.log2(p_current_ts)))
+    return H2X
+
+
+def calc_condtion_entropy(px, t_data, unique_inverse_x):
+    # Conditional entropy of t given x
+    H2X_array = np.array(
+        Parallel(n_jobs=NUM_CORES)(delayed(calc_entropy_for_specipic_t)(t_data[unique_inverse_x == i, :], px[i])
+                                   for i in range(px.shape[0])))
+    H2X = np.sum(H2X_array)
+    return H2X
+
+
+def calc_information_from_mat(px, py, ps2, data, unique_inverse_x, unique_inverse_y, unique_array):
+    """Calculate the MI based on binning of the data"""
+    H2 = -np.sum(ps2 * np.log2(ps2))
+    H2X = calc_condtion_entropy(px, data, unique_inverse_x)
+    H2Y = calc_condtion_entropy(py.T, data, unique_inverse_y)
+    IY = H2 - H2Y
+    IX = H2 - H2X
+    return IX, IY
+
+
+def calc_probs(t_index, unique_inverse, label, b, b1, len_unique_a):
+    """Calculate the p(x|T) and p(y|T)"""
+    indexs = unique_inverse == t_index
+    p_y_ts = np.sum(label[indexs], axis=0) / label[indexs].shape[0]
+    unique_array_internal, unique_counts_internal = \
+        np.unique(b[indexs], return_index=False, return_inverse=False, return_counts=True)
+    indexes_x = np.where(np.in1d(b1, b[indexs]))
+    p_x_ts = np.zeros(len_unique_a)
+    p_x_ts[indexes_x] = unique_counts_internal / float(sum(unique_counts_internal))
+    return p_x_ts, p_y_ts
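calc_information_from_mat computes the plug-in MI as I = H(T) - H(T|·). For intuition, the same computation can be rendered in a compact self-contained form over a 2-d histogram; the helper below is an assumed illustration, not part of this diff:

import numpy as np

def binned_mi(x, y, bins=8):
    # Plug-in MI from a joint histogram: sum p(x,y) log2 p(x,y) / (p(x)p(y)).
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()
    px, py = pxy.sum(axis=1), pxy.sum(axis=0)
    nz = pxy > 0
    return np.sum(pxy[nz] * np.log2(pxy[nz] / (px[:, None] * py[None, :])[nz]))

rng = np.random.RandomState(0)
x = rng.randn(5000)
print(binned_mi(x, x))               # equals the entropy of binned x (at most log2(8) = 3 bits)
print(binned_mi(x, rng.randn(5000))) # ~0 bits for independent data (small positive bias)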
idnns/networks/__init__.py
ADDED
File without changes
idnns/networks/information_network.py
ADDED
@@ -0,0 +1,166 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
import _pickle as cPickle
import multiprocessing
import os
import sys
import numpy as np
from joblib import Parallel, delayed
import idnns.networks.network as nn
from idnns.information import information_process as inn
from idnns.plots import plot_figures as plt_fig
from idnns.networks import network_paramters as netp
from idnns.networks.utils import load_data

# from idnns.network import utils
# import idnns.plots.plot_gradients as plt_grads
NUM_CORES = multiprocessing.cpu_count()


class informationNetwork():
    """A class that stores the networks, trains them and calculates their information
    (a single object can hold several networks)"""

    def __init__(self, rand_int=0, num_of_samples=None, args=None):
        if args is None:
            args = netp.get_default_parser(num_of_samples)
        self.args = args
        self.cov_net = args.cov_net
        # NB: this attribute shadows the calc_information method defined below
        self.calc_information = args.calc_information
        self.run_in_parallel = args.run_in_parallel
        self.num_ephocs = args.num_ephocs
        self.learning_rate = args.learning_rate
        self.batch_size = args.batch_size
        self.activation_function = args.activation_function
        self.interval_accuracy_display = args.interval_accuracy_display
        self.save_grads = args.save_grads
        self.num_of_repeats = args.num_of_repeats
        self.calc_information_last = args.calc_information_last
        self.num_of_bins = args.num_of_bins
        self.interval_information_display = args.interval_information_display
        self.save_ws = args.save_ws
        self.name = args.data_dir + args.data_name
        # The architectures of the networks
        self.layers_sizes = netp.select_network_arch(args.net_type)
        # The percentages of the training data to use
        self.train_samples = np.linspace(1, 100, 199)[[[x * 2 - 2 for x in index] for index in args.inds]]
        # The epoch indexes at which we calculate the information, spaced on a log scale
        self.epochs_indexes = np.unique(
            np.logspace(np.log2(args.start_samples), np.log2(args.num_ephocs), args.num_of_samples, dtype=int,
                        base=2)) - 1
        max_size = np.max([len(layers_size) for layers_size in self.layers_sizes])
        # Load the data
        self.data_sets = load_data(self.name, args.random_labels)
        # Create arrays for saving the results
        self.ws, self.grads, self.information, self.models, self.names, self.networks, self.weights = [
            [[[[None] for k in range(len(self.train_samples))] for j in range(len(self.layers_sizes))]
             for i in range(self.num_of_repeats)] for _ in range(7)]

        self.loss_train, self.loss_test, self.test_error, self.train_error, self.l1_norms, self.l2_norms = \
            [np.zeros((self.num_of_repeats, len(self.layers_sizes), len(self.train_samples), len(self.epochs_indexes)))
             for _ in range(6)]

        params = {'sampleLen': len(self.train_samples),
                  'nDistSmpls': args.nDistSmpls,
                  'layerSizes': ",".join(str(i) for i in self.layers_sizes[0]), 'nEpoch': args.num_ephocs,
                  'batch': args.batch_size,
                  'nRepeats': args.num_of_repeats, 'nEpochInds': len(self.epochs_indexes),
                  'LastEpochsInds': self.epochs_indexes[-1], 'DataName': args.data_name,
                  'lr': args.learning_rate}

        self.name_to_save = args.name + "_" + "_".join([str(i) + '=' + str(params[i]) for i in params])
        params['train_samples'], params['CPUs'], params[
            'directory'], params['epochsInds'] = self.train_samples, NUM_CORES, self.name_to_save, self.epochs_indexes
        self.params = params
        self.rand_int = rand_int
        # Whether we have already trained the network
        self.traind_network = False

    def save_data(self, parent_dir='jobs/', file_to_save='data.pickle'):
        """Save the data to the file"""
        directory = '{0}/{1}{2}/'.format(os.getcwd(), parent_dir, self.params['directory'])

        data = {'information': self.information,
                'test_error': self.test_error, 'train_error': self.train_error, 'var_grad_val': self.grads,
                'loss_test': self.loss_test, 'loss_train': self.loss_train, 'params': self.params,
                'l1_norms': self.l1_norms, 'weights': self.weights, 'ws': self.ws}

        if not os.path.exists(directory):
            os.makedirs(directory)
        self.dir_saved = directory
        with open(self.dir_saved + file_to_save, 'wb') as f:
            cPickle.dump(data, f, protocol=2)

    def run_network(self):
        """Train the networks and calculate their information"""
        if self.run_in_parallel:
            results = Parallel(n_jobs=NUM_CORES)(delayed(nn.train_network)
                                                 (self.layers_sizes[j],
                                                  self.num_ephocs, self.learning_rate, self.batch_size,
                                                  self.epochs_indexes, self.save_grads, self.data_sets,
                                                  self.activation_function,
                                                  self.train_samples, self.interval_accuracy_display,
                                                  self.calc_information,
                                                  self.calc_information_last, self.num_of_bins,
                                                  self.interval_information_display, self.save_ws, self.rand_int,
                                                  self.cov_net)
                                                 for i in range(len(self.train_samples)) for j in
                                                 range(len(self.layers_sizes)) for k in range(self.num_of_repeats))

        else:
            results = [nn.train_and_calc_inf_network(i, j, k,
                                                     self.layers_sizes[j],
                                                     self.num_ephocs, self.learning_rate, self.batch_size,
                                                     self.epochs_indexes, self.save_grads, self.data_sets,
                                                     self.activation_function,
                                                     self.train_samples, self.interval_accuracy_display,
                                                     self.calc_information,
                                                     self.calc_information_last, self.num_of_bins,
                                                     self.interval_information_display,
                                                     self.save_ws, self.rand_int, self.cov_net)
                       for i in range(len(self.train_samples)) for j in range(len(self.layers_sizes)) for k in
                       range(self.num_of_repeats)]

        # Extract all the measures and organize them
        for i in range(len(self.train_samples)):
            for j in range(len(self.layers_sizes)):
                for k in range(self.num_of_repeats):
                    index = i * len(self.layers_sizes) * self.num_of_repeats + j * self.num_of_repeats + k
                    current_network = results[index]
                    self.networks[k][j][i] = current_network
                    self.ws[k][j][i] = current_network['ws']
                    self.weights[k][j][i] = current_network['weights']
                    self.information[k][j][i] = current_network['information']
                    self.grads[k][j][i] = current_network['gradients']
                    self.test_error[k, j, i, :] = current_network['test_prediction']
                    self.train_error[k, j, i, :] = current_network['train_prediction']
                    self.loss_test[k, j, i, :] = current_network['loss_test']
                    self.loss_train[k, j, i, :] = current_network['loss_train']
        self.traind_network = True

    def print_information(self):
        """Print the network's parameters"""
        for val in self.params:
            if val != 'epochsInds':
                print(val, self.params[val])

    def calc_information(self):
        """Calculate the information of the network for all the epochs - valid only if we
        saved the activation values and trained the network"""
        if self.traind_network and self.save_ws:
            self.information = np.array(
                [inn.get_information(self.ws[k][j][i], self.data_sets.data, self.data_sets.labels,
                                     self.args.num_of_bins, self.args.interval_information_display,
                                     self.epochs_indexes)
                 for i in range(len(self.train_samples)) for j in
                 range(len(self.layers_sizes)) for k in range(self.args.num_of_repeats)])
        else:
            print('Cannot calculate the information of the networks!')

    def calc_information_last(self):
        """Calculate the information of the last epoch"""
        if self.traind_network and self.save_ws:
            return np.array([inn.get_information([self.ws[k][j][i][-1]], self.data_sets.data, self.data_sets.labels,
                                                 self.args.num_of_bins, self.args.interval_information_display,
                                                 self.epochs_indexes)
                             for i in range(len(self.train_samples)) for j in
                             range(len(self.layers_sizes)) for k in range(self.args.num_of_repeats)])

    def plot_network(self):
        str_names = [[self.dir_saved]]
        mode = 2
        save_name = 'figure'
        plt_fig.plot_figures(str_names, mode, save_name)
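The class above is the top-level entry point of the repository. A minimal driver in the spirit of the bundled main.py (whose exact contents are not shown in this part of the diff) would look like this sketch:

from idnns.networks.information_network import informationNetwork

if __name__ == '__main__':
    net = informationNetwork()  # parses the default command-line arguments
    net.print_information()     # echo the run parameters
    net.run_network()           # train and (optionally) compute the information
    net.save_data()             # pickle the results under jobs/
    net.plot_network()          # draw the information-plane figure
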
idnns/networks/model.py
ADDED
@@ -0,0 +1,212 @@
import functools
import numpy as np
import tensorflow.compat.v1 as tf
from idnns.networks.utils import _convert_string_dtype
from idnns.networks.models import multi_layer_perceptron
from idnns.networks.models import deepnn
from idnns.networks.ops import *
tf.disable_v2_behavior()


def lazy_property(function):
    """Cache a property's result so the corresponding graph nodes are built only once"""
    attribute = '_cache_' + function.__name__

    @property
    @functools.wraps(function)
    def decorator(self):
        if not hasattr(self, attribute):
            setattr(self, attribute, function(self))
        return getattr(self, attribute)

    return decorator


class Model:
    """A class that represents the network model"""

    def __init__(self, input_size, layerSize, num_of_classes, learning_rate_local=0.001, save_file='',
                 activation_function=0, cov_net=False):
        self.covnet = cov_net
        self.input_size = input_size
        self.layerSize = layerSize
        self.all_layer_sizes = np.copy(layerSize)
        self.all_layer_sizes = np.insert(self.all_layer_sizes, 0, input_size)
        self.num_of_classes = num_of_classes
        self._num_of_layers = len(layerSize) + 1
        self.learning_rate_local = learning_rate_local
        self._save_file = save_file
        self.hidden = None
        self.savers = []
        if activation_function == 1:
            self.activation_function = tf.nn.relu
        elif activation_function == 2:
            self.activation_function = None
        else:
            self.activation_function = tf.nn.tanh
        # Touch the lazy properties so the graph is built at construction time
        self.prediction
        self.optimize
        self.accuracy

    def initilizae_layer(self, name_scope, row_size, col_size, activation_function, last_hidden):
        # Build a layer of the network with its weights and biases
        weights = get_scope_variable(name_scope=name_scope, var="weights",
                                     shape=[row_size, col_size],
                                     initializer=tf.truncated_normal_initializer(mean=0.0, stddev=1.0 / np.sqrt(
                                         float(row_size))))
        biases = get_scope_variable(name_scope=name_scope, var='biases', shape=[col_size],
                                    initializer=tf.constant_initializer(0.0))

        self.weights_all.append(weights)
        self.biases_all.append(biases)
        variable_summaries(weights)
        variable_summaries(biases)
        with tf.variable_scope(name_scope) as scope:
            input = tf.matmul(last_hidden, weights) + biases
            if activation_function is None:
                output = input
            else:
                output = activation_function(input, name='output')
        self.inputs.append(input)
        self.hidden.append(output)
        return output

    @property
    def num_of_layers(self):
        return self._num_of_layers

    @property
    def hidden_layers(self):
        """The hidden layers of the network"""
        if self.hidden is None:
            self.hidden, self.inputs, self.weights_all, self.biases_all = [], [], [], []
            last_hidden = self.x
            if self.covnet == 1:
                y_conv, self._drouput, self.hidden, self.inputs = deepnn(self.x)
            elif self.covnet == 2:
                y_c, self.hidden, self.inputs = multi_layer_perceptron(self.x, self.input_size, self.num_of_classes,
                                                                       self.layerSize[0], self.layerSize[1])
            else:
                self._drouput = 'dr'
                # self.hidden.append(self.x)
                for i in range(1, len(self.all_layer_sizes)):
                    name_scope = 'hidden' + str(i - 1)
                    row_size, col_size = self.all_layer_sizes[i - 1], self.all_layer_sizes[i]
                    activation_function = self.activation_function
                    last_hidden = self.initilizae_layer(name_scope, row_size, col_size, activation_function,
                                                        last_hidden)
                name_scope = 'final_layer'
                row_size, col_size = self.layerSize[-1], self.num_of_classes
                activation_function = tf.nn.softmax
                last_hidden = self.initilizae_layer(name_scope, row_size, col_size, activation_function, last_hidden)
        return self.hidden

    @lazy_property
    def prediction(self):
        logits = self.hidden_layers[-1]
        return logits

    @lazy_property
    def drouput(self):
        return self._drouput

    @property
    def optimize(self):
        optimizer = tf.train.AdamOptimizer(learning_rate=self.learning_rate_local).minimize(self.cross_entropy)
        return optimizer

    @lazy_property
    def x(self):
        return tf.placeholder(tf.float32, shape=[None, self.input_size], name='x')

    @lazy_property
    def labels(self):
        return tf.placeholder(tf.float32, shape=[None, self.num_of_classes], name='y_true')

    @lazy_property
    def accuracy(self):
        correct_prediction = tf.equal(tf.argmax(self.prediction, 1), tf.argmax(self.labels, 1))
        accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
        tf.summary.scalar('accuracy', accuracy)
        return accuracy

    @lazy_property
    def cross_entropy(self):
        cross_entropy = tf.reduce_mean(
            -tf.reduce_sum(self.labels * tf.log(tf.clip_by_value(self.prediction, 1e-50, 1.0)), reduction_indices=[1]))
        tf.summary.scalar('cross_entropy', cross_entropy)
        return cross_entropy

    @property
    def save_file(self):
        return self._save_file

    def inference(self, data):
        """Return the prediction of the network for the given data"""
        with tf.Session() as sess:
            self.saver.restore(sess, './' + self.save_file)
            feed_dict = {self.x: data}
            pred = sess.run(self.prediction, feed_dict=feed_dict)
        return pred

    def inference_default(self, data):
        session = tf.get_default_session()
        feed_dict = {self.x: data}
        pred = session.run(self.prediction, feed_dict=feed_dict)
        return pred

    def get_layer_with_inference(self, data, layer_index, epoch_index):
        """Return the layer's activation values together with the network's predictions"""
        with tf.Session() as sess:
            self.savers[epoch_index].restore(sess, './' + self.save_file + str(epoch_index))
            feed_dict = {self.hidden_layers[layer_index]: data[:, 0:self.hidden_layers[layer_index]._shape[1]]}
            pred, layer_values = sess.run([self.prediction, self.hidden_layers[layer_index]], feed_dict=feed_dict)
        return pred, layer_values

    def calc_layer_values(self, X, layer_index):
        """Return the layer's values"""
        with tf.Session() as sess:
            self.savers[-1].restore(sess, './' + self.save_file)
            feed_dict = {self.x: X}
            layer_values = sess.run(self.hidden_layers[layer_index], feed_dict=feed_dict)
        return layer_values

    def update_weights_and_calc_values_temp(self, d_w_i_j, layer_to_perturbe, i, j, X):
        """Perturb one weight of the given layer, calculate the output, then restore the original values"""
        if layer_to_perturbe + 1 >= len(self.hidden_layers):
            scope_name = 'softmax_linear'
        else:
            scope_name = "hidden" + str(layer_to_perturbe)
        weights = get_scope_variable(scope_name, "weights", shape=None, initializer=None)
        session = tf.get_default_session()
        weights_values = weights.eval(session=session)
        # Copy before perturbing, so the original values can be restored below
        weights_values_pert = np.copy(weights_values)
        weights_values_pert[i, j] += d_w_i_j
        set_value(weights, weights_values_pert)
        feed_dict = {self.x: X}
        layer_values = session.run(self.hidden_layers[layer_to_perturbe], feed_dict=feed_dict)
        set_value(weights, weights_values)
        return layer_values

    def update_weights(self, d_w0, layer_to_perturbe):
        """Update the weights' values of the given layer"""
        weights = get_scope_variable("hidden" + str(layer_to_perturbe), "weights", shape=None, initializer=None)
        session = tf.get_default_session()
        weights_values = weights.eval(session=session)
        set_value(weights, weights_values + d_w0)

    def get_wights_size(self, layer_to_perturbe):
        """Return the size of the given layer"""
        weights = get_scope_variable("hidden" + str(layer_to_perturbe), "weights", shape=None, initializer=None)
        return weights._initial_value.shape[1].value, weights._initial_value.shape[0].value

    def get_layer_input(self, layer_to_perturbe, X):
        """Return the input of the given layer for the given data"""
        session = tf.get_default_session()
        inputs = self.inputs[layer_to_perturbe]
        feed_dict = {self.x: X}
        layer_values = session.run(inputs, feed_dict=feed_dict)
        return layer_values
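For orientation, this is how the Model class is typically instantiated. A sketch only: the layer sizes follow net_type '1' from network_paramters.py, while input_size=12 and num_of_classes=2 are assumptions about the bundled var_u dataset.

import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()
from idnns.networks.model import Model

tf.reset_default_graph()
model = Model(input_size=12, layerSize=[10, 7, 5, 4, 3], num_of_classes=2,
              learning_rate_local=0.0004, activation_function=0)  # 0 -> tanh
# prediction, optimize and accuracy were already built in __init__;
# the lazy_property decorator caches each tensor on first access.
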
idnns/networks/models.py
ADDED
@@ -0,0 +1,125 @@
import tensorflow.compat.v1 as tf
from idnns.networks.ops import *
tf.disable_v2_behavior()


def multi_layer_perceptron(x, n_input, n_classes, n_hidden_1, n_hidden_2):
    hidden = []
    input = []
    hidden.append(x)
    # Network Parameters
    # n_input = x.shape[0]  # MNIST data input (img shape: 28*28)
    # n_classes = 10  # MNIST total classes (0-9 digits)

    weights = {
        'h1': tf.Variable(tf.random_normal([n_input, n_hidden_1])),
        'h2': tf.Variable(tf.random_normal([n_hidden_1, n_hidden_2])),
        'out': tf.Variable(tf.random_normal([n_hidden_2, n_classes]))
    }
    biases = {
        'b1': tf.Variable(tf.random_normal([n_hidden_1])),
        'b2': tf.Variable(tf.random_normal([n_hidden_2])),
        'out': tf.Variable(tf.random_normal([n_classes]))
    }

    # Hidden layer with ReLU activation
    layer_1_input = tf.add(tf.matmul(x, weights['h1']), biases['b1'])
    layer_1 = tf.nn.relu(layer_1_input)
    input.append(layer_1_input)  # pre-activation, matching deepnn below
    hidden.append(layer_1)
    # Hidden layer with ReLU activation
    layer_2_input = tf.add(tf.matmul(layer_1, weights['h2']), biases['b2'])
    layer_2 = tf.nn.relu(layer_2_input)
    input.append(layer_2_input)
    hidden.append(layer_2)
    # Output layer: linear activation followed by softmax
    input_y = tf.matmul(layer_2, weights['out']) + biases['out']
    y_output = tf.nn.softmax(input_y)
    input.append(input_y)
    hidden.append(y_output)
    return y_output, hidden, input


def deepnn(x):
    """deepnn builds the graph for a deep net for classifying digits.
    Args:
        x: an input tensor with the dimensions (N_examples, 784), where 784 is the
        number of pixels in a standard MNIST image.
    Returns:
        A tuple (y, keep_prob, hidden, input). y is a tensor of shape (N_examples, 10),
        with values equal to the logits of classifying the digit into one of 10 classes
        (the digits 0-9). keep_prob is a scalar placeholder for the probability of
        dropout.
    """
    hidden = []
    input = []
    x_image = tf.reshape(x, [-1, 28, 28, 1])
    hidden.append(x)
    # First convolutional layer - maps one grayscale image to 32 feature maps.
    with tf.name_scope('conv1'):
        with tf.name_scope('weights'):
            W_conv1 = weight_variable([5, 5, 1, 32])
            variable_summaries(W_conv1)
        with tf.name_scope('biases'):
            b_conv1 = bias_variable([32])
            variable_summaries(b_conv1)
        with tf.name_scope('activation'):
            input_con1 = conv2d(x_image, W_conv1) + b_conv1
            h_conv1 = tf.nn.relu(input_con1)
            tf.summary.histogram('activations', h_conv1)
        with tf.name_scope('max_pool'):
            # Pooling layer - downsamples by 2X.
            h_pool1 = max_pool_2x2(h_conv1)
        input.append(input_con1)
        hidden.append(h_pool1)
    with tf.name_scope('conv2'):
        # Second convolutional layer -- maps 32 feature maps to 64.
        with tf.name_scope('weights'):
            W_conv2 = weight_variable([5, 5, 32, 64])
            variable_summaries(W_conv2)
        with tf.name_scope('biases'):
            b_conv2 = bias_variable([64])
            variable_summaries(b_conv2)
        with tf.name_scope('activation'):
            input_con2 = conv2d(h_pool1, W_conv2) + b_conv2
            h_conv2 = tf.nn.relu(input_con2)
            tf.summary.histogram('activations', h_conv2)
        with tf.name_scope('max_pool'):
            # Second pooling layer.
            h_pool2 = max_pool_2x2(h_conv2)
        input.append(input_con2)
        hidden.append(h_pool2)
    # Fully connected layer 1 -- after 2 rounds of downsampling, our 28x28 image
    # is down to 7x7x64 feature maps -- map this to 1024 features.
    with tf.name_scope('FC1'):
        with tf.name_scope('weights'):
            W_fc1 = weight_variable([7 * 7 * 64, 1024])
            variable_summaries(W_fc1)
        with tf.name_scope('biases'):
            b_fc1 = bias_variable([1024])
            variable_summaries(b_fc1)
        h_pool2_flat = tf.reshape(h_pool2, [-1, 7 * 7 * 64])
        with tf.name_scope('activation'):
            input_fc1 = tf.matmul(h_pool2_flat, W_fc1) + b_fc1
            h_fc1 = tf.nn.relu(input_fc1)
            tf.summary.histogram('activations', h_fc1)

    with tf.name_scope('dropout'):
        keep_prob = tf.placeholder(tf.float32)
        tf.summary.scalar('dropout_keep_probability', keep_prob)
        h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)
    input.append(input_fc1)
    hidden.append(h_fc1_drop)
    # Map the 1024 features to 10 classes, one for each digit
    with tf.name_scope('FC2'):
        with tf.name_scope('weights'):
            W_fc2 = weight_variable([1024, 10])
            variable_summaries(W_fc2)
        with tf.name_scope('biases'):
            b_fc2 = bias_variable([10])
            variable_summaries(b_fc2)

        input_y_conv = tf.matmul(h_fc1_drop, W_fc2) + b_fc2
        y_conv = tf.nn.softmax(input_y_conv)
    input.append(input_y_conv)
    hidden.append(y_conv)
    return y_conv, keep_prob, hidden, input
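A short sketch of wiring deepnn into a fresh graph (the MNIST-shaped 784-dimensional input is assumed):

import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()
from idnns.networks.models import deepnn

tf.reset_default_graph()
x = tf.placeholder(tf.float32, [None, 784])
y_conv, keep_prob, hidden, inputs = deepnn(x)
print(len(hidden))  # 5: input, pool1, pool2, fc1 (after dropout), softmax output
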
idnns/networks/network.py
ADDED
@@ -0,0 +1,173 @@
"""Train the network and calculate its information"""
import multiprocessing
import os
import sys
import warnings
import numpy as np
import tensorflow.compat.v1 as tf
from idnns.information import information_process as inn
from idnns.networks.utils import data_shuffle
from idnns.networks import model as mo
tf.disable_v2_behavior()

warnings.filterwarnings("ignore")
summaries_dir = 'summaries'
NUM_CORES = multiprocessing.cpu_count()


def build_model(activation_function, layerSize, input_size, num_of_classes, learning_rate_local, save_file, covn_net):
    """Build a specific model of the network and return it"""
    model = mo.Model(input_size, layerSize, num_of_classes, learning_rate_local, save_file, int(activation_function),
                     cov_net=covn_net)
    return model


def train_and_calc_inf_network(i, j, k, layerSize, num_of_ephocs, learning_rate_local, batch_size, indexes, save_grads,
                               data_sets_org,
                               model_type, percent_of_train, interval_accuracy_display, calc_information,
                               calc_information_last, num_of_bins,
                               interval_information_display, save_ws, rand_int, cov_net):
    """Train the network and calculate its information"""
    network_name = '{0}_{1}_{2}_{3}'.format(i, j, k, rand_int)
    print('Training network - {0}'.format(network_name))
    network = train_network(layerSize, num_of_ephocs, learning_rate_local, batch_size, indexes, save_grads,
                            data_sets_org, model_type, percent_of_train, interval_accuracy_display, network_name,
                            cov_net)
    network['information'] = []
    if calc_information:
        print('Calculating the information')
        infomration = np.array([inn.get_information(network['ws'], data_sets_org.data, data_sets_org.labels,
                                                    num_of_bins, interval_information_display, network['model'],
                                                    layerSize)])
        network['information'] = infomration
    elif calc_information_last:
        print('Calculating the information for the last epoch')
        infomration = np.array([inn.get_information([network['ws'][-1]], data_sets_org.data, data_sets_org.labels,
                                                    num_of_bins, interval_information_display,
                                                    network['model'], layerSize)])
        network['information'] = infomration
    # If we don't want to save the layers' output
    if not save_ws:
        network['weights'] = 0
    return network


def exctract_activity(sess, batch_points_all, model, data_sets_org):
    """Get the activation values of the layers for the input"""
    w_temp = []
    for i in range(0, len(batch_points_all) - 1):
        batch_xs = data_sets_org.data[batch_points_all[i]:batch_points_all[i + 1]]
        batch_ys = data_sets_org.labels[batch_points_all[i]:batch_points_all[i + 1]]
        feed_dict_temp = {model.x: batch_xs, model.labels: batch_ys}
        w_temp_local = sess.run([model.hidden_layers],
                                feed_dict=feed_dict_temp)
        for s in range(len(w_temp_local[0])):
            if i == 0:
                w_temp.append(w_temp_local[0][s])
            else:
                w_temp[s] = np.concatenate((w_temp[s], w_temp_local[0][s]), axis=0)
    """
    infomration[k] = inn.calc_information_for_epoch(k, interval_information_display, ws_t, params['bins'],
                                                    params['unique_inverse_x'],
                                                    params['unique_inverse_y'],
                                                    params['label'], estimted_labels,
                                                    params['b'], params['b1'], params['len_unique_a'],
                                                    params['pys'], py_hats_temp, params['pxs'], params['py_x'],
                                                    params['pys1'])
    """
    return w_temp


def print_accuracy(batch_points_test, data_sets, model, sess, j, acc_train_array):
    """Calculate the test accuracy and print the train and test accuracy"""
    acc_array = []
    for i in range(0, len(batch_points_test) - 1):
        batch_xs = data_sets.test.data[batch_points_test[i]:batch_points_test[i + 1]]
        batch_ys = data_sets.test.labels[batch_points_test[i]:batch_points_test[i + 1]]
        feed_dict_temp = {model.x: batch_xs, model.labels: batch_ys}
        acc = sess.run([model.accuracy],
                       feed_dict=feed_dict_temp)
        acc_array.append(acc)
    print('Epoch {0} - Test Accuracy: {1:.3f} Train Accuracy: {2:.3f}'.format(j, np.mean(np.array(acc_array)),
                                                                              np.mean(np.array(acc_train_array))))


def train_network(layerSize, num_of_ephocs, learning_rate_local, batch_size, indexes, save_grads,
                  data_sets_org, model_type, percent_of_train, interval_accuracy_display,
                  name, covn_net):
    """Train the network"""
    tf.reset_default_graph()
    data_sets = data_shuffle(data_sets_org, percent_of_train)
    ws, estimted_label, gradients, infomration, models, weights = [[None] * len(indexes) for _ in range(6)]
    # NB: the loss/prediction arrays below are returned as the zeros they are initialized to;
    # only ws, gradients, weights and model are filled in by this version.
    loss_func_test, loss_func_train, test_prediction, train_prediction = [np.zeros((len(indexes))) for _ in range(4)]
    input_size = data_sets_org.data.shape[1]
    num_of_classes = data_sets_org.labels.shape[1]
    batch_size = np.min([batch_size, data_sets.train.data.shape[0]])
    batch_points = np.rint(np.arange(0, data_sets.train.data.shape[0] + 1, batch_size)).astype(dtype=np.int32)
    batch_points_test = np.rint(np.arange(0, data_sets.test.data.shape[0] + 1, batch_size)).astype(dtype=np.int32)
    batch_points_all = np.rint(np.arange(0, data_sets_org.data.shape[0] + 1, batch_size)).astype(dtype=np.int32)
    if data_sets_org.data.shape[0] not in batch_points_all:
        batch_points_all = np.append(batch_points_all, [data_sets_org.data.shape[0]])
    if data_sets.train.data.shape[0] not in batch_points:
        batch_points = np.append(batch_points, [data_sets.train.data.shape[0]])
    if data_sets.test.data.shape[0] not in batch_points_test:
        batch_points_test = np.append(batch_points_test, [data_sets.test.data.shape[0]])
    # Build the network
    model = build_model(model_type, layerSize, input_size, num_of_classes, learning_rate_local, name, covn_net)
    optimizer = model.optimize
    saver = tf.train.Saver(max_to_keep=0)
    init = tf.global_variables_initializer()
    grads = tf.gradients(model.cross_entropy, tf.trainable_variables())
    # Train the network
    with tf.Session() as sess:
        sess.run(init)
        # Go over the epochs
        k = 0
        acc_train_array = []
        for j in range(0, num_of_ephocs):
            epochs_grads = []
            if j in indexes:
                ws[k] = exctract_activity(sess, batch_points_all, model, data_sets_org)
            # Print the accuracy
            if np.mod(j, interval_accuracy_display) == 1 or interval_accuracy_display == 1:
                print_accuracy(batch_points_test, data_sets, model, sess, j, acc_train_array)
            # Go over the batches
            acc_train_array = []
            current_weights = [[] for _ in range(len(model.weights_all))]
            for i in range(0, len(batch_points) - 1):
                batch_xs = data_sets.train.data[batch_points[i]:batch_points[i + 1]]
                batch_ys = data_sets.train.labels[batch_points[i]:batch_points[i + 1]]
                feed_dict = {model.x: batch_xs, model.labels: batch_ys}
                _, tr_err = sess.run([optimizer, model.accuracy], feed_dict=feed_dict)
                acc_train_array.append(tr_err)
                if j in indexes:
                    epochs_grads_temp, loss_tr, weights_local = sess.run(
                        [grads, model.cross_entropy, model.weights_all],
                        feed_dict=feed_dict)
                    epochs_grads.append(epochs_grads_temp)
                    for ii in range(len(current_weights)):
                        current_weights[ii].append(weights_local[ii])
            if j in indexes:
                if save_grads:
                    gradients[k] = epochs_grads
                current_weights_mean = []
                for ii in range(len(current_weights)):
                    current_weights_mean.append(np.mean(np.array(current_weights[ii]), axis=0))
                weights[k] = current_weights_mean
                # Save the model
                write_meta = True if k == 0 else False
                # saver.save(sess, model.save_file, global_step=k, write_meta_graph=write_meta)
                k += 1
    network = {}
    network['ws'] = ws
    network['weights'] = weights
    network['test_prediction'] = test_prediction
    network['train_prediction'] = train_prediction
    network['loss_test'] = loss_func_test
    network['loss_train'] = loss_func_train
    network['gradients'] = gradients
    network['model'] = model
    return network
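A single run outside the informationNetwork wrapper can be driven directly, as in this sketch; the argument values are taken from the defaults in network_paramters.py, percent_of_train is passed as a scalar (which is what data_shuffle in utils.py expects), and the 'data/var_u' path assumes the script runs from the repository root.

import numpy as np
from idnns.networks import network as nn
from idnns.networks.utils import load_data

data_sets = load_data('data/var_u', random_labels=False)
epochs_indexes = np.unique(np.logspace(np.log2(1), np.log2(8000), 400, dtype=int, base=2)) - 1
net = nn.train_and_calc_inf_network(
    0, 0, 0, [10, 7, 5, 4, 3], 8000, 0.0004, 512, epochs_indexes,
    save_grads=False, data_sets_org=data_sets, model_type=0,
    percent_of_train=80, interval_accuracy_display=499,
    calc_information=True, calc_information_last=False, num_of_bins=30,
    interval_information_display=30, save_ws=False, rand_int=0, cov_net=0)
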
idnns/networks/network_paramters.py
ADDED
@@ -0,0 +1,136 @@
import argparse
import re


def str2bool(v):
    if isinstance(v, bool):
        return v
    if v.lower() in ('yes', 'true', 't', 'y', '1'):
        return True
    elif v.lower() in ('no', 'false', 'f', 'n', '0'):
        return False
    else:
        raise argparse.ArgumentTypeError('Boolean value expected.')


def get_default_parser(num_of_samples=None):
    parser = argparse.ArgumentParser()
    parser.add_argument('-start_samples',
                        '-ss', dest="start_samples", default=1,
                        type=int, help='The first sample from which we calculate the information')

    parser.add_argument('-batch_size',
                        '-b', dest="batch_size", default=512,
                        type=int, help='The size of the batch')

    parser.add_argument('-learning_rate',
                        '-l', dest="learning_rate", default=0.0004,
                        type=float,
                        help='The learning rate of the network')

    parser.add_argument('-num_repeat',
                        '-r', dest="num_of_repeats", default=1,
                        type=int, help='The number of times to run the network')

    parser.add_argument('-num_epochs',
                        '-e', dest="num_ephocs", default=8000,
                        type=int, help='The max number of epochs')

    parser.add_argument('-net',
                        '-n', dest="net_type", default='1',
                        help='The architecture of the networks')

    parser.add_argument('-inds',
                        '-i', dest="inds", default='[80]',
                        help='The percent of the training data')

    parser.add_argument('-name',
                        '-na', dest="name", default='net',
                        help='The name under which to save the results')

    parser.add_argument('-d_name',
                        '-dna', dest="data_name", default='var_u',
                        help='The dataset that we want to run')

    parser.add_argument('-num_samples',
                        '-ns', dest="num_of_samples", default=400,
                        type=int,
                        help='The max number of indexes for which to calculate the information')

    parser.add_argument('-nDistSmpls',
                        '-nds', dest="nDistSmpls", default=1,
                        type=int, help='The number of distribution samples')

    parser.add_argument('-save_ws',
                        '-sws', dest="save_ws", type=str2bool, nargs='?', const=False, default=False,
                        help='True if we want to save the output of the layers')

    parser.add_argument('-calc_information',
                        '-cinf', dest="calc_information", type=str2bool, nargs='?', const=True, default=True,
                        help='True if we want to calculate the MI in the network for all the epochs')

    parser.add_argument('-calc_information_last',
                        '-cinfl', dest="calc_information_last", type=str2bool, nargs='?', const=False, default=False,
                        help='True if we want to calculate the MI in the network only for the last epoch')

    parser.add_argument('-save_grads',
                        '-sgrad', dest="save_grads", type=str2bool, nargs='?', const=False, default=False,
                        help='True if we want to save the gradients of the network')

    parser.add_argument('-run_in_parallel',
                        '-par', dest="run_in_parallel", type=str2bool, nargs='?', const=False, default=False,
                        help='True if we want to run all the networks in parallel')

    parser.add_argument('-num_of_bins',
                        '-nbins', dest="num_of_bins", default=30, type=int,
                        help='The number of bins into which we divide the output of the neurons')

    parser.add_argument('-activation_function',
                        '-af', dest="activation_function", default=0, type=int,
                        help='The activation function of the model - 0 for tanh, 1 for ReLU')

    parser.add_argument('-iad', dest="interval_accuracy_display", default=499, type=int,
                        help='The interval for displaying the accuracy')

    parser.add_argument('-interval_information_display',
                        '-iid', dest="interval_information_display", default=30, type=int,
                        help='The interval for displaying the information calculation')

    parser.add_argument('-cov_net',
                        '-cov', dest="cov_net", type=int, default=0,
                        help='True if we want a convnet')

    parser.add_argument('-rl',
                        '-rand_labels', dest="random_labels", type=str2bool, nargs='?', const=False, default=False,
                        help='True if we want to set random labels')
    parser.add_argument('-data_dir',
                        '-dd', dest="data_dir", default='data/',
                        help='The directory in which to find the data')
    args = parser.parse_args()
    args.inds = [list(map(int, inner.split(','))) for inner in re.findall(r"\[(.*?)\]", args.inds)]
    if num_of_samples is not None:
        args.inds = [[num_of_samples]]
    return args


def select_network_arch(type_net):
    """Select the architectures of the networks according to their type.
    A custom network can also be given, e.g. type_net='[size_1,size_2,size_3]'"""
    if type_net == '1':
        layers_sizes = [[10, 7, 5, 4, 3]]
    elif type_net == '1-2-3':
        layers_sizes = [[10, 9, 7, 7, 3], [10, 9, 7, 5, 3], [10, 9, 7, 3, 3]]
    elif type_net == '11':
        layers_sizes = [[10, 7, 7, 4, 3]]
    elif type_net == '2':
        layers_sizes = [[10, 7, 5, 4]]
    elif type_net == '3':
        layers_sizes = [[10, 7, 5]]
    elif type_net == '4':
        layers_sizes = [[10, 7]]
    elif type_net == '5':
        layers_sizes = [[10]]
    elif type_net == '6':
        layers_sizes = [[1, 1, 1, 1]]
    else:
        # Custom network
        layers_sizes = [list(map(int, inner.split(','))) for inner in re.findall(r"\[(.*?)\]", type_net)]
    return layers_sizes
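The parser can also be exercised programmatically; the sketch below assumes it is run with no extra command-line arguments, so only the defaults above apply.

from idnns.networks import network_paramters as netp

args = netp.get_default_parser(num_of_samples=None)
print(args.num_ephocs, args.batch_size, args.learning_rate)  # 8000 512 0.0004
print(netp.select_network_arch(args.net_type))               # [[10, 7, 5, 4, 3]]
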
idnns/networks/ops.py
ADDED
@@ -0,0 +1,72 @@
import numpy as np
import tensorflow.compat.v1 as tf
from idnns.networks.utils import _convert_string_dtype
tf.disable_v2_behavior()


def conv2d(x, W):
    """conv2d returns a 2d convolution layer with full stride."""
    return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')


def max_pool_2x2(x):
    """max_pool_2x2 downsamples a feature map by 2X."""
    return tf.nn.max_pool(x, ksize=[1, 2, 2, 1],
                          strides=[1, 2, 2, 1], padding='SAME')


def weight_variable(shape):
    """weight_variable generates a weight variable of a given shape."""
    initial = tf.truncated_normal(shape, stddev=0.1)
    return tf.Variable(initial)


def bias_variable(shape):
    """bias_variable generates a bias variable of a given shape."""
    initial = tf.constant(0.1, shape=shape)
    return tf.Variable(initial)


def set_value(x, value):
    """Sets the value of a variable from a Numpy array.
    # Arguments
        x: Tensor to set to a new value.
        value: Value to set the tensor to, as a Numpy array
            (of the same shape).
    """
    value = np.asarray(value)
    tf_dtype = _convert_string_dtype(x.dtype.name.split('_')[0])
    if hasattr(x, '_assign_placeholder'):
        assign_placeholder = x._assign_placeholder
        assign_op = x._assign_op
    else:
        assign_placeholder = tf.placeholder(tf_dtype, shape=value.shape)
        assign_op = x.assign(assign_placeholder)
        x._assign_placeholder = assign_placeholder
        x._assign_op = assign_op
    session = tf.get_default_session()
    session.run(assign_op, feed_dict={assign_placeholder: value})


def variable_summaries(var):
    """Attach a lot of summaries to a Tensor (for TensorBoard visualization)."""
    with tf.name_scope('summaries'):
        mean = tf.reduce_mean(var)
        tf.summary.scalar('mean', mean)
        with tf.name_scope('stddev'):
            stddev = tf.sqrt(tf.reduce_mean(tf.square(var - mean)))
        tf.summary.scalar('stddev', stddev)
        tf.summary.scalar('max', tf.reduce_max(var))
        tf.summary.scalar('min', tf.reduce_min(var))
        tf.summary.histogram('histogram', var)


def get_scope_variable(name_scope, var, shape=None, initializer=None):
    """Create the variable on first use, then reuse it on later calls."""
    with tf.variable_scope(name_scope) as scope:
        try:
            v = tf.get_variable(var, shape, initializer=initializer)
        except ValueError:
            scope.reuse_variables()
            v = tf.get_variable(var)
    return v
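set_value is the piece that lets the perturbation helpers in model.py overwrite weights inside a live session. A minimal sketch of its use (the variable name demo_w is made up for the example):

import numpy as np
import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()
from idnns.networks.ops import set_value

w = tf.get_variable('demo_w', shape=[2, 2], initializer=tf.zeros_initializer())
with tf.Session() as sess:  # the with-block makes sess the default session
    sess.run(w.initializer)
    set_value(w, np.ones((2, 2), dtype=np.float32))
    print(w.eval(session=sess))  # [[1. 1.] [1. 1.]]
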
idnns/networks/utils.py
ADDED
@@ -0,0 +1,154 @@
import numpy as np
import scipy.io as sio
import os
import sys
import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()

# TF2 doesn't include the mnist module in tensorflow.examples.tutorials -
# use tf.keras.datasets.mnist instead
def load_data(name, random_labels=False):
    """Load the data
    name - the name of the dataset
    random_labels - True if we want to return random labels for the dataset
    return an object with data and labels"""
    print('Loading Data...')
    C = type('type_C', (object,), {})
    data_sets = C()
    if name.split('/')[-1] == 'MNIST':
        (x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
        data_sets.data = np.concatenate((x_train, x_test), axis=0).reshape((-1, 28 * 28)) / 255.0  # Normalize
        # One-hot encode with NumPy (tf.one_hot would return a symbolic tensor here)
        y_all = np.concatenate((y_train, y_test), axis=0)
        data_sets.labels = np.eye(10)[y_all]
    else:
        d = sio.loadmat(os.path.join(os.path.dirname(sys.argv[0]), name + '.mat'))
        F = d['F']
        y = d['y']
        data_sets = C()
        data_sets.data = F
        data_sets.labels = np.squeeze(np.concatenate((y[None, :], 1 - y[None, :]), axis=0).T)
    # If we want to assign random labels to the data
    if random_labels:
        labels = np.zeros(data_sets.labels.shape)
        labels_index = np.random.randint(low=0, high=labels.shape[1], size=labels.shape[0])
        labels[np.arange(len(labels)), labels_index] = 1
        data_sets.labels = labels
    return data_sets


def shuffle_in_unison_inplace(a, b):
    """Shuffle the arrays randomly"""
    assert len(a) == len(b)
    p = np.random.permutation(len(a))
    return a[p], b[p]


def data_shuffle(data_sets_org, percent_of_train, min_test_data=80, shuffle_data=False):
    """Divide the data into train and test sets and shuffle them"""
    perc = lambda i, t: np.rint((i * t) / 100).astype(np.int32)
    C = type('type_C', (object,), {})
    data_sets = C()
    stop_train_index = int(perc(percent_of_train, data_sets_org.data.shape[0]))
    start_test_index = int(stop_train_index)
    if percent_of_train > min_test_data:
        start_test_index = int(perc(min_test_data, data_sets_org.data.shape[0]))
    data_sets.train = C()
    data_sets.test = C()
    if shuffle_data:
        shuffled_data, shuffled_labels = shuffle_in_unison_inplace(data_sets_org.data, data_sets_org.labels)
    else:
        shuffled_data, shuffled_labels = data_sets_org.data, data_sets_org.labels
    data_sets.train.data = shuffled_data[:stop_train_index, :]
    data_sets.train.labels = shuffled_labels[:stop_train_index, :]
    data_sets.test.data = shuffled_data[start_test_index:, :]
    data_sets.test.labels = shuffled_labels[start_test_index:, :]
    return data_sets

# The original TF1 implementation is kept below for reference:
#
# import numpy as np
# from tensorflow.examples.tutorials.mnist import input_data
# import scipy.io as sio
# import os
# import sys
# import tensorflow as tf
#
#
# def load_data(name, random_labels=False):
#     """Load the data
#     name - the name of the dataset
#     random_labels - True if we want to return random labels to the dataset
#     return object with data and labels"""
#     print('Loading Data...')
#     C = type('type_C', (object,), {})
#     data_sets = C()
#     if name.split('/')[-1] == 'MNIST':
#         data_sets_temp = input_data.read_data_sets(os.path.dirname(sys.argv[0]) + "/data/MNIST_data/", one_hot=True)
#         data_sets.data = np.concatenate((data_sets_temp.train.images, data_sets_temp.test.images), axis=0)
#         data_sets.labels = np.concatenate((data_sets_temp.train.labels, data_sets_temp.test.labels), axis=0)
#     else:
#         d = sio.loadmat(os.path.join(os.path.dirname(sys.argv[0]), name + '.mat'))
#         F = d['F']
#         y = d['y']
#         C = type('type_C', (object,), {})
#         data_sets = C()
#         data_sets.data = F
#         data_sets.labels = np.squeeze(np.concatenate((y[None, :], 1 - y[None, :]), axis=0).T)
#     # If we want to assign random labels to the data
#     if random_labels:
#         labels = np.zeros(data_sets.labels.shape)
#         labels_index = np.random.randint(low=0, high=labels.shape[1], size=labels.shape[0])
#         labels[np.arange(len(labels)), labels_index] = 1
#         data_sets.labels = labels
#     return data_sets
#
#
# def shuffle_in_unison_inplace(a, b):
#     """Shuffle the arrays randomly"""
#     assert len(a) == len(b)
#     p = np.random.permutation(len(a))
#     return a[p], b[p]
#
#
# def data_shuffle(data_sets_org, percent_of_train, min_test_data=80, shuffle_data=False):
#     """Divided the data to train and test and shuffle it"""
#     perc = lambda i, t: np.rint((i * t) / 100).astype(np.int32)
#     C = type('type_C', (object,), {})
#     data_sets = C()
#     stop_train_index = perc(percent_of_train[0], data_sets_org.data.shape[0])
#     start_test_index = stop_train_index
#     if percent_of_train > min_test_data:
#         start_test_index = perc(min_test_data, data_sets_org.data.shape[0])
#     data_sets.train = C()
#     data_sets.test = C()
#     if shuffle_data:
#         shuffled_data, shuffled_labels = shuffle_in_unison_inplace(data_sets_org.data, data_sets_org.labels)
#     else:
#         shuffled_data, shuffled_labels = data_sets_org.data, data_sets_org.labels
#     data_sets.train.data = shuffled_data[:stop_train_index, :]
#     data_sets.train.labels = shuffled_labels[:stop_train_index, :]
#     data_sets.test.data = shuffled_data[start_test_index:, :]
#     data_sets.test.labels = shuffled_labels[start_test_index:, :]
#     return data_sets


# This function was used for dtype conversion, which might not be necessary in
# the simplified context; if needed, TF2 supports these types directly.
def _convert_string_dtype(dtype):
    if dtype == 'float16':
        return tf.float16
    if dtype == 'float32':
        return tf.float32
    elif dtype == 'float64':
        return tf.float64
    elif dtype == 'int16':
        return tf.int16
    elif dtype == 'int32':
        return tf.int32
    elif dtype == 'int64':
        return tf.int64
    elif dtype == 'uint8':
        return tf.uint8
    elif dtype == 'uint16':
        return tf.uint16
    else:
        raise ValueError('Unsupported dtype:', dtype)
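Loading and splitting the bundled synthetic dataset then looks like this sketch (the 'data/var_u' path matches the var_u.mat file in this commit's data/ directory, and the script is assumed to run from the repository root):

from idnns.networks.utils import load_data, data_shuffle

data_sets_org = load_data('data/var_u', random_labels=False)
data_sets = data_shuffle(data_sets_org, percent_of_train=80)  # 80% train split
print(data_sets.train.data.shape, data_sets.test.data.shape)
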
idnns/plots/__init__.py
ADDED
File without changes
idnns/plots/ops.py
ADDED
@@ -0,0 +1,20 @@
import math


def sampleStandardDeviation(x):
    """Calculate the sample standard deviation"""
    mean_x = sum(x) / float(len(x))
    sumv = 0.0
    for i in x:
        # Deviations are taken about the mean, per the usual definition
        sumv += (i - mean_x) ** 2
    return math.sqrt(sumv / (len(x) - 1))


def pearson(x, y):
    """Calculate the Pearson correlation coefficient (PCC)"""
    mean_x = sum(x) / float(len(x))
    mean_y = sum(y) / float(len(y))
    scorex, scorey = [], []
    for i in x:
        scorex.append((i - mean_x) / sampleStandardDeviation(x))
    for j in y:
        scorey.append((j - mean_y) / sampleStandardDeviation(y))
    # Multiplies both lists together element-wise (hence zip) and sums the whole list
    return (sum([i * j for i, j in zip(scorex, scorey)])) / (len(x) - 1)
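A quick sanity check of the two helpers above (a sketch; perfectly linear data should give a coefficient of 1):

from idnns.plots.ops import pearson

print(round(pearson([1, 2, 3, 4], [2, 4, 6, 8]), 6))  # 1.0
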
idnns/plots/plot_figures.py
ADDED
@@ -0,0 +1,678 @@
"""Plot the networks in the information plane"""
import matplotlib

matplotlib.use("TkAgg")
import numpy as np
import _pickle as cPickle
from scipy.interpolate import interp1d
import matplotlib.pyplot as plt
from matplotlib.collections import LineCollection
from matplotlib.colors import ListedColormap, BoundaryNorm
import scipy.io as sio
import scipy.stats as sis
import os
import matplotlib.animation as animation
import math
import os.path
import idnns.plots.utils as utils
import tkinter as tk
from numpy import linalg as LA

from tkinter import filedialog
LAYERS_COLORS = ['red', 'blue', 'green', 'yellow', 'pink', 'orange']


def plot_all_epochs(gen_data, I_XT_array, I_TY_array, axes, epochsInds, f, index_i, index_j, size_ind,
                    font_size, y_ticks, x_ticks, colorbar_axis, title_str, axis_font, bar_font, save_name,
                    plot_error=True, index_to_emphasis=1000):
    """Plot the information plane with the epochs in different colors"""
    # If we want to plot the train and test errors
    if plot_error:
        fig_strs = ['train_error', 'test_error', 'loss_train', 'loss_test']
        fig_data = [np.squeeze(gen_data[fig_str]) for fig_str in fig_strs]
        f1 = plt.figure(figsize=(12, 8))
        ax1 = f1.add_subplot(111)
        mean_sample = len(fig_data[0].shape) > 1
        if mean_sample:
            fig_data = [np.mean(fig_data_s, axis=0) for fig_data_s in fig_data]
        for i in range(len(fig_data)):
            ax1.plot(epochsInds, fig_data[i], ':', linewidth=3, label=fig_strs[i])
        ax1.legend(loc='best')
    f = plt.figure(figsize=(12, 8))
    axes = f.add_subplot(111)
    axes = np.array([[axes]])

    I_XT_array = np.squeeze(I_XT_array)
    I_TY_array = np.squeeze(I_TY_array)
    if len(I_TY_array[0].shape) > 1:
        I_XT_array = np.mean(I_XT_array, axis=0)
        I_TY_array = np.mean(I_TY_array, axis=0)
    max_index = size_ind if size_ind != -1 else I_XT_array.shape[0]

    cmap = plt.get_cmap('gnuplot')
    # Each epoch gets a different color
    colors = [cmap(i) for i in np.linspace(0, 1, epochsInds[max_index - 1] + 1)]
    # Change this if we have more than one network architecture
    nums_arc = -1
    # Go over all the epochs and plot them with the right color
    for index_in_range in range(0, max_index):
        XT = I_XT_array[index_in_range, :]
        TY = I_TY_array[index_in_range, :]
        # If this is the index that we want to emphasize
        if epochsInds[index_in_range] == index_to_emphasis:
            axes[index_i, index_j].plot(XT, TY, marker='o', linestyle=None, markersize=19, markeredgewidth=0.04,
                                        linewidth=2.1, color='g', zorder=10)
        else:
            axes[index_i, index_j].plot(XT[:], TY[:], marker='o', linestyle='-', markersize=12,
                                        markeredgewidth=0.01, linewidth=0.2,
                                        color=colors[int(epochsInds[index_in_range])])
    utils.adjustAxes(axes[index_i, index_j], axis_font=axis_font, title_str=title_str, x_ticks=x_ticks,
                     y_ticks=y_ticks, x_lim=[0, 25.1], y_lim=None,
                     set_xlabel=index_i == axes.shape[0] - 1, set_ylabel=index_j == 0, x_label='$I(X;T)$',
                     y_label='$I(T;Y)$', set_xlim=False,
                     set_ylim=False, set_ticks=True, label_size=font_size)
    # Save the figure and add the color bar
    if index_i == axes.shape[0] - 1 and index_j == axes.shape[1] - 1:
        utils.create_color_bar(f, cmap, colorbar_axis, bar_font, epochsInds, title='Epochs')
        f.savefig(save_name + '.jpg', dpi=500, format='jpg')

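# Illustration (editor's sketch, not part of the original module): the
# epoch-to-color mapping used by plot_all_epochs above, in isolation. The
# trajectory here is synthetic - hypothetical I(X;T)/I(T;Y) values - only the
# colormap technique matches the function.
def _demo_epoch_colored_plane(num_epochs=100, num_layers=5):
    cmap = plt.get_cmap('gnuplot')
    colors = cmap(np.linspace(0, 1, num_epochs))
    # Fake trajectory: I(X;T) compresses while I(T;Y) grows over training
    ixt = np.linspace(12, 2, num_epochs)[:, None] * np.linspace(1, 0.5, num_layers)[None, :]
    ity = np.linspace(0.1, 1.0, num_epochs)[:, None] * np.ones(num_layers)[None, :]
    fig, ax = plt.subplots()
    for epoch in range(num_epochs):
        ax.plot(ixt[epoch], ity[epoch], marker='o', linewidth=0.2, color=colors[epoch])
    ax.set_xlabel('$I(X;T)$')
    ax.set_ylabel('$I(T;Y)$')
    plt.show()
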
def plot_by_training_samples(I_XT_array, I_TY_array, axes, epochsInds, f, index_i, index_j, size_ind, font_size,
                             y_ticks, x_ticks, colorbar_axis, title_str, axis_font, bar_font, save_name,
                             samples_labels=None):
    """Plot the final epoch for each of the different training-sample sizes"""
    # samples_labels is unused; it is optional so that callers may omit it
    max_index = size_ind if size_ind != -1 else I_XT_array.shape[2] - 1
    cmap = plt.get_cmap('gnuplot')
    colors = [cmap(i) for i in np.linspace(0, 1, max_index + 1)]
    # Plot the final epoch
    nums_epoch = -1
    # Go over all the sample sizes and plot them with the right color
    for index_in_range in range(0, max_index):
        XT, TY = [], []
        for layer_index in range(0, I_XT_array.shape[4]):
            XT.append(np.mean(I_XT_array[:, -1, index_in_range, nums_epoch, layer_index], axis=0))
            TY.append(np.mean(I_TY_array[:, -1, index_in_range, nums_epoch, layer_index], axis=0))
        axes[index_i, index_j].plot(XT, TY, marker='o', linestyle='-', markersize=12, markeredgewidth=0.2,
                                    linewidth=0.5, color=colors[index_in_range])
    utils.adjustAxes(axes[index_i, index_j], axis_font=axis_font, title_str=title_str, x_ticks=x_ticks,
                     y_ticks=y_ticks, x_lim=None, y_lim=None,
                     set_xlabel=index_i == axes.shape[0] - 1, set_ylabel=index_j == 0, x_label='$I(X;T)$',
                     y_label='$I(T;Y)$', set_xlim=True,
                     set_ylim=True, set_ticks=True, label_size=font_size)
    # Create the color bar and save the figure
    if index_i == axes.shape[0] - 1 and index_j == axes.shape[1] - 1:
        utils.create_color_bar(f, cmap, colorbar_axis, bar_font, epochsInds, title='Training Data')
        f.savefig(save_name + '.jpg', dpi=150, format='jpg')


def calc_velocity(data, epochs):
    """Calculate the velocity (in both X and Y) of each layer in the information plane"""
    vXs, vYs = [], []
    for layer_index in range(data.shape[5]):
        current_vXs = []
        current_vYs = []
        for epoch_index in range(len(epochs) - 1):
            # Finite difference of the mean information, divided by the epoch gap
            vx = np.mean(data[0, :, -1, -1, epoch_index + 1, layer_index], axis=0) - np.mean(data[0, :, -1, -1, epoch_index, layer_index], axis=0)
            vx /= (epochs[epoch_index + 1] - epochs[epoch_index])
            vy = np.mean(data[1, :, -1, -1, epoch_index + 1, layer_index], axis=0) - np.mean(data[1, :, -1, -1, epoch_index, layer_index], axis=0)
            vy /= (epochs[epoch_index + 1] - epochs[epoch_index])
            current_vYs.append(vy)
            current_vXs.append(vx)
        vXs.append(current_vXs)
        vYs.append(current_vYs)
    return vXs, vYs

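# Illustration (editor's sketch): calc_velocity above takes the per-epoch
# finite differences in an explicit loop; the same quantity can be computed
# vectorized. A minimal version, assuming mean_info already has the
# hypothetical shape (num_epochs, num_layers):
def _velocity_vectorized(mean_info, epochs):
    d_info = np.diff(mean_info, axis=0)
    d_t = np.diff(np.asarray(epochs, dtype=float))[:, None]
    return d_info / d_t  # shape: (num_epochs - 1, num_layers)
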
def update_line_specific_points(nums, data, axes, to_do, font_size, axis_font):
    """Update the lines in the axes for a snapshot of the whole process"""
    colors = LAYERS_COLORS
    x_ticks = [0, 2, 4, 6, 8, 10]
    # Go over all the snapshots
    for i in range(len(nums)):
        num = nums[i]
        # Plot the right layer
        for layer_num in range(data.shape[3]):
            axes[i].scatter(data[0, :, num, layer_num], data[1, :, num, layer_num], color=colors[layer_num],
                            s=105, edgecolors='black', alpha=0.85)
        utils.adjustAxes(axes[i], axis_font=axis_font, title_str='', x_ticks=x_ticks, y_ticks=[], x_lim=None,
                         y_lim=None,
                         set_xlabel=to_do[i][0], set_ylabel=to_do[i][1], x_label='$I(X;T)$', y_label='$I(T;Y)$',
                         set_xlim=True, set_ylim=True,
                         set_ticks=True, label_size=font_size)


def update_line_each_neuron(num, print_loss, Ix, axes, Iy, train_data, accuracy_test, epochs_bins, loss_train_data,
                            loss_test_data, colors, epochsInds,
                            font_size=18, axis_font=16, x_lim=[0, 12.2], y_lim=[0, 1.08], x_ticks=[], y_ticks=[]):
    """Update the figure of the information plane for the movie"""
    # Clear the axes before redrawing the frame
    axes[0].clear()
    if len(axes) > 1:
        axes[1].clear()
    # Plot the points
    for layer_num in range(Ix.shape[2]):
        for net_ind in range(Ix.shape[0]):
            axes[0].scatter(Ix[net_ind, num, layer_num], Iy[net_ind, num, layer_num], color=colors[layer_num],
                            s=35, edgecolors='black', alpha=0.85)
    title_str = 'Information Plane - Epoch number - ' + str(epochsInds[num])
    utils.adjustAxes(axes[0], axis_font, title_str, x_ticks, y_ticks, x_lim, y_lim, set_xlabel=True, set_ylabel=True,
                     x_label='$I(X;T)$', y_label='$I(T;Y)$')
    # Plot the loss function and the error
    if len(axes) > 1:
        axes[1].plot(epochsInds[:num], 1 - np.mean(accuracy_test[:, :num], axis=0), color='g')
        if print_loss:
            axes[1].plot(epochsInds[:num], np.mean(loss_test_data[:, :num], axis=0), color='y')
        nearest_val = np.searchsorted(epochs_bins, epochsInds[num], side='right')
        axes[1].set_xlim([0, epochs_bins[nearest_val]])
        axes[1].legend(('Accuracy', 'Loss Function'), loc='best')


def update_line(num, print_loss, data, axes, epochsInds, test_error, test_data, epochs_bins, loss_train_data,
                loss_test_data, colors,
                font_size=18, axis_font=16, x_lim=[0, 12.2], y_lim=[0, 1.08], x_ticks=[], y_ticks=[]):
    """Update the figure of the information plane for the movie"""
    # Build the line segments between the layer points
    cmap = ListedColormap(LAYERS_COLORS)
    segs = []
    for i in range(0, data.shape[1]):
        x = data[0, i, num, :]
        y = data[1, i, num, :]
        points = np.array([x, y]).T.reshape(-1, 1, 2)
        segs.append(np.concatenate([points[:-1], points[1:]], axis=1))
    segs = np.array(segs).reshape(-1, 2, 2)
    axes[0].clear()
    if len(axes) > 1:
        axes[1].clear()
    lc = LineCollection(segs, cmap=cmap, linestyles='solid', linewidths=0.3, alpha=0.6)
    lc.set_array(np.arange(0, 5))
    # Plot the points
    for layer_num in range(data.shape[3]):
        axes[0].scatter(data[0, :, num, layer_num], data[1, :, num, layer_num], color=colors[layer_num], s=35,
                        edgecolors='black', alpha=0.85)
    axes[1].plot(epochsInds[:num], 1 - np.mean(test_error[:, :num], axis=0), color='r')

    title_str = 'Information Plane - Epoch number - ' + str(epochsInds[num])
    utils.adjustAxes(axes[0], axis_font, title_str, x_ticks, y_ticks, x_lim, y_lim, set_xlabel=True, set_ylabel=True,
                     x_label='$I(X;T)$', y_label='$I(T;Y)$')
    title_str = 'Precision as function of the epochs'
    utils.adjustAxes(axes[1], axis_font, title_str, x_ticks, y_ticks, x_lim, y_lim, set_xlabel=True, set_ylabel=True,
                     x_label='# Epochs', y_label='Precision')

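# Illustration (editor's sketch): update_line's segment construction in
# isolation. Each network contributes (num_layers - 1) segments, all drawn by
# one LineCollection artist. The points are random; the shapes are
# hypothetical.
def _demo_segments():
    xy = np.random.rand(10, 5, 2)  # 10 networks, 5 layers, an (x, y) point per layer
    segs = []
    for net in xy:
        pts = net.reshape(-1, 1, 2)
        segs.append(np.concatenate([pts[:-1], pts[1:]], axis=1))  # 4 segments per network
    segs = np.concatenate(segs)  # (40, 2, 2): start/end point of every segment
    fig, ax = plt.subplots()
    ax.add_collection(LineCollection(segs, linewidths=0.5, alpha=0.6))
    ax.autoscale()
    plt.show()
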
def plot_animation(name_s, save_name):
    """Plot the movie for all the networks in the information plane"""
    # If we want to plot the loss function as well
    print_loss = False
    # The bins by which we extend the x axis of the accuracy plot each time
    epochs_bins = [0, 500, 1500, 3000, 6000, 10000, 20000]

    data_array = utils.get_data(name_s[0][0])
    data = data_array['information']
    epochsInds = data_array['epochsInds']
    loss_train_data = data_array['loss_train']
    loss_test_data = data_array['loss_test_data']
    # NOTE: the train/test error curves are assumed to be stored in the results
    # file under these keys (as in plot_all_epochs); adjust them to your data.
    train_data = np.squeeze(data_array['train_error'])
    test_data = np.squeeze(data_array['test_error'])
    f, (axes) = plt.subplots(2, 1)
    f.subplots_adjust(left=0.14, bottom=0.1, right=.928, top=0.94, wspace=0.13, hspace=0.55)
    colors = LAYERS_COLORS
    # new/old version
    if False:
        Ix = np.squeeze(data[0, :, -1, -1, :, :])
        Iy = np.squeeze(data[1, :, -1, -1, :, :])
    else:
        Ix = np.squeeze(data[0, :, -1, -1, :, :])[np.newaxis, :, :]
        Iy = np.squeeze(data[1, :, -1, -1, :, :])[np.newaxis, :, :]
    # Interpolate the samplings (because we don't calculate the information at every epoch)
    interp_data_x = interp1d(epochsInds, Ix, axis=1)
    interp_data_y = interp1d(epochsInds, Iy, axis=1)
    new_x = np.arange(0, epochsInds[-1])
    new_data = np.array([interp_data_x(new_x), interp_data_y(new_x)])
    train_data = interp1d(epochsInds, np.squeeze(train_data), axis=1)(new_x)
    test_data = interp1d(epochsInds, np.squeeze(test_data), axis=1)(new_x)
    if print_loss:
        loss_train_data = interp1d(epochsInds, np.squeeze(loss_train_data), axis=1)(new_x)
        loss_test_data = interp1d(epochsInds, np.squeeze(loss_test_data), axis=1)(new_x)
    line_ani = animation.FuncAnimation(f, update_line, len(new_x), repeat=False,
                                       interval=1, blit=False,
                                       fargs=(print_loss, new_data, axes, new_x, train_data, test_data,
                                              epochs_bins, loss_train_data, loss_test_data, colors))
    Writer = animation.writers['ffmpeg']
    writer = Writer(fps=100)
    # Save the movie
    line_ani.save(save_name + '_movie2.mp4', writer=writer, dpi=250)
    plt.show()

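# Illustration (editor's sketch): the FuncAnimation/ffmpeg pattern used by
# plot_animation, reduced to its skeleton. Requires an ffmpeg binary on the
# PATH; the sine data is made up.
def _demo_movie(save_name='demo'):
    fig, ax = plt.subplots()
    x = np.linspace(0, 2 * np.pi, 200)

    def update(frame):
        ax.clear()  # redraw the whole frame, as update_line does
        ax.plot(x, np.sin(x + 0.1 * frame))
        ax.set_ylim(-1.1, 1.1)

    ani = animation.FuncAnimation(fig, update, frames=100, repeat=False, blit=False)
    writer = animation.writers['ffmpeg'](fps=25)
    ani.save(save_name + '.mp4', writer=writer, dpi=100)
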
def plot_animation_each_neuron(name_s, save_name, print_loss=False):
    """Plot the movie for all the networks in the information plane"""
    # The bins by which we extend the x axis of the accuracy plot each time
    epochs_bins = [0, 500, 1500, 3000, 6000, 10000, 20000]
    data_array = utils.get_data(name_s[0][0])
    data = np.squeeze(data_array['information'])
    epochsInds = data_array['epochsInds']
    # NOTE: the error/loss curves are only needed when a second axes is drawn;
    # with a single axes (as below) placeholders are enough.
    train_data, test_data, loss_train_data, loss_test_data = None, None, None, None

    f, (axes) = plt.subplots(1, 1)
    axes = [axes]
    f.subplots_adjust(left=0.14, bottom=0.1, right=.928, top=0.94, wspace=0.13, hspace=0.55)
    colors = LAYERS_COLORS
    # new/old version
    Ix = np.squeeze(data[0, :, :, :])
    Iy = np.squeeze(data[1, :, :, :])
    # Interpolation of the samplings (because we don't calculate the information at every epoch)
    # interp_data_x = interp1d(epochsInds, Ix, axis=1)
    # interp_data_y = interp1d(epochsInds, Iy, axis=1)
    # new_x = np.arange(0, epochsInds[-1])
    # new_data = np.array([interp_data_x(new_x), interp_data_y(new_x)])
    """
    train_data = interp1d(epochsInds, np.squeeze(train_data), axis=1)(new_x)
    test_data = interp1d(epochsInds, np.squeeze(test_data), axis=1)(new_x)

    if print_loss:
        loss_train_data = interp1d(epochsInds, np.squeeze(loss_train_data), axis=1)(new_x)
        loss_test_data = interp1d(epochsInds, np.squeeze(loss_test_data), axis=1)(new_x)
    """
    line_ani = animation.FuncAnimation(f, update_line_each_neuron, Ix.shape[1], repeat=False,
                                       interval=1, blit=False,
                                       fargs=(print_loss, Ix, axes, Iy, train_data, test_data, epochs_bins,
                                              loss_train_data, loss_test_data, colors, epochsInds))
    Writer = animation.writers['ffmpeg']
    writer = Writer(fps=100)
    # Save the movie
    line_ani.save(save_name + '_movie.mp4', writer=writer, dpi=250)
    plt.show()


def plot_snapshots(name_s, save_name, i, time_steps=[13, 180, 963], font_size=36, axis_font=28, fig_size=(14, 6)):
    """Plot snapshots of the given network"""
    f, (axes) = plt.subplots(1, len(time_steps), sharey=True, figsize=fig_size)
    f.subplots_adjust(left=0.095, bottom=0.18, right=.99, top=0.97, wspace=0.03, hspace=0.03)
    # Which axes get x/y labels
    to_do = [[True, True], [True, False], [True, False]]
    data_array = utils.get_data(name_s)
    data = np.squeeze(data_array['information'])
    update_line_specific_points(time_steps, data, axes, to_do, font_size, axis_font)
    f.savefig(save_name + '.jpg', dpi=200, format='jpg')


def load_figures(mode, str_names=None):
    """Create a new figure according to the given mode.
    This function is really messy and needs a rewrite."""
    if mode == 0:
        font_size = 34
        axis_font = 28
        bar_font = 28
        fig_size = (14, 6.5)
        title_strs = [['', '']]
        f, (axes) = plt.subplots(1, 2, sharey=True, figsize=fig_size)
        sizes = [[-1, -1]]
        colorbar_axis = [0.935, 0.14, 0.027, 0.75]
        axes = np.vstack(axes).T
        f.subplots_adjust(left=0.09, bottom=0.15, right=.928, top=0.94, wspace=0.03, hspace=0.04)
        yticks = [0, 1, 2, 3]
        xticks = [3, 6, 9, 12, 15]
    # For 3 rows with 2 columns
    if mode == 1:
        font_size = 34
        axis_font = 26
        bar_font = 28
        fig_size = (14, 19)
        xticks = [1, 3, 5, 7, 9, 11]
        yticks = [0, 0.2, 0.4, 0.6, 0.8, 1]
        title_strs = [['One hidden layer', 'Two hidden layers'], ['Three hidden layers', 'Four hidden layers'],
                      ['Five hidden layers', 'Six hidden layers']]
        # The line below overrides the layer-count titles with bin-count titles
        title_strs = [['5 bins', '12 bins'], ['18 bins', '25 bins'],
                      ['35 bins', '50 bins']]
        f, (axes) = plt.subplots(3, 2, sharex=True, sharey=True, figsize=fig_size)
        f.subplots_adjust(left=0.09, bottom=0.08, right=.92, top=0.94, wspace=0.03, hspace=0.15)
        colorbar_axis = [0.93, 0.1, 0.035, 0.76]
        sizes = [[1010, 1010], [1017, 1020], [1700, 920]]

    # For 2 rows with 3 columns
    if mode == 11:
        font_size = 34
        axis_font = 26
        bar_font = 28
        fig_size = (14, 9)
        xticks = [1, 3, 5, 7, 9, 11]
        yticks = [0, 0.2, 0.4, 0.6, 0.8, 1]
        title_strs = [['One hidden layer', 'Two hidden layers', 'Three hidden layers'],
                      ['Four hidden layers', 'Five hidden layers', 'Six hidden layers']]
        # The line below overrides the layer-count titles with bin-count titles
        title_strs = [['5 Bins', '10 Bins', '15 Bins'], ['20 Bins', '25 Bins', '35 Bins']]
        f, (axes) = plt.subplots(2, 3, sharex=True, sharey=True, figsize=fig_size)
        f.subplots_adjust(left=0.09, bottom=0.1, right=.92, top=0.94, wspace=0.03, hspace=0.15)
        colorbar_axis = [0.93, 0.1, 0.035, 0.76]
        sizes = [[1010, 1010, 1017], [1020, 1700, 920]]
    # One figure
    if mode == 2 or mode == 6:
        axis_font = 28
        bar_font = 28
        fig_size = (14, 10)
        font_size = 34
        f, (axes) = plt.subplots(1, len(str_names), sharey=True, figsize=fig_size)
        if len(str_names) == 1:
            axes = np.vstack(np.array([axes]))
        f.subplots_adjust(left=0.097, bottom=0.12, right=.87, top=0.99, wspace=0.03, hspace=0.03)
        colorbar_axis = [0.905, 0.12, 0.03, 0.82]
        xticks = [1, 3, 5, 7, 9, 11]
        yticks = [0, 0.2, 0.4, 0.6, 0.8, 1]

        # yticks = [0, 1, 2, 3, 3.5]
        # xticks = [2, 5, 8, 11, 14, 17]
        sizes = [[-1]]
        title_strs = [['', '']]
    # One figure with error bars
    if mode == 3:
        fig_size = (14, 10)
        font_size = 36
        axis_font = 28
        bar_font = 25
        title_strs = [['']]
        f, (axes) = plt.subplots(1, len(str_names), sharey=True, figsize=fig_size)
        if len(str_names) == 1:
            axes = np.vstack(np.array([axes]))
        f.subplots_adjust(left=0.097, bottom=0.12, right=.90, top=0.99, wspace=0.03, hspace=0.03)
        sizes = [[-1]]
        colorbar_axis = [0.933, 0.125, 0.03, 0.83]
        xticks = [0, 2, 4, 6, 8, 10, 12]
        yticks = [0.3, 0.4, 0.6, 0.8, 1]
    # Two figures, second variant
    if mode == 4:
        font_size = 27
        axis_font = 18
        bar_font = 23
        fig_size = (14, 6.5)
        title_strs = [['', '']]
        f, (axes) = plt.subplots(1, 2, figsize=fig_size)
        sizes = [[-1, -1]]
        colorbar_axis = [0.948, 0.08, 0.025, 0.81]
        axes = np.vstack(axes).T
        f.subplots_adjust(left=0.07, bottom=0.15, right=.933, top=0.94, wspace=0.12, hspace=0.04)
        # yticks = [0, 0.2, 0.4, 0.6, 0.8, 1]
        # xticks = [1, 3, 5, 7, 9, 11]

        yticks = [0, 1, 2, 3, 3.5]
        xticks = [2, 5, 8, 11, 14, 17]
    return font_size, axis_font, bar_font, colorbar_axis, sizes, yticks, xticks, title_strs, f, axes


def plot_figures(str_names, mode, save_name):
    """Plot the data in the given names with the given mode"""
    [font_size, axis_font, bar_font, colorbar_axis, sizes, yticks, xticks, title_strs, f, axes] = load_figures(mode, str_names)
    # Go over all the files
    for i in range(len(str_names)):
        for j in range(len(str_names[i])):
            name_s = str_names[i][j]
            # Load the data of the given file
            data_array = utils.get_data(name_s)
            data = np.squeeze(np.array(data_array['information']))
            I_XT_array = np.array(extract_array(data, 'local_IXT'))
            I_TY_array = np.array(extract_array(data, 'local_ITY'))
            # I_XT_array = np.array(extract_array(data, 'IXT_vartional'))
            # I_TY_array = np.array(extract_array(data, 'ITY_vartional'))
            epochsInds = data_array['params']['epochsInds']
            # I_XT_array = np.squeeze(np.array(data))[:, :, 0]
            # I_TY_array = np.squeeze(np.array(data))[:, :, 1]
            # Plot it
            if mode == 3:
                plot_by_training_samples(I_XT_array, I_TY_array, axes, epochsInds, f, i, j, sizes[i][j], font_size,
                                         yticks, xticks, colorbar_axis, title_strs[i][j], axis_font, bar_font,
                                         save_name)
            elif mode == 6:
                plot_norms(axes, epochsInds, data_array['norms1'], data_array['norms2'])
            else:
                plot_all_epochs(data_array, I_XT_array, I_TY_array, axes, epochsInds, f, i, j, sizes[i][j],
                                font_size, yticks, xticks, colorbar_axis, title_strs[i][j], axis_font, bar_font,
                                save_name)
    plt.show()


def plot_norms(axes, epochsInds, norms1, norms2):
    """Plot the L1 and L2 norms of the given network's weights"""
    axes.plot(epochsInds, np.mean(norms1[:, 0, 0, :], axis=0), color='g')
    axes.plot(epochsInds, np.mean(norms2[:, 0, 0, :], axis=0), color='b')
    axes.legend(('L1 norm', 'L2 norm'))
    axes.set_xlabel('Epochs')


def plot_pearson(name):
    """Plot the Pearson coefficients of the neurons in each layer"""
    data_array = utils.get_data(name)
    ws = data_array['weights']
    f = plt.figure(figsize=(12, 8))
    axes = f.add_subplot(111)
    # The number of neurons in each layer -
    # TODO: this should be derived automatically
    sizes = [10, 7, 5, 4, 3, 2]
    # The mean of the Pearson coefficients over all the layers
    pearson_mean = []
    # Go over all the layers
    for layer in range(len(sizes)):
        inner_pearson_mean = []
        # Go over all the weights in the layer
        for k in range(len(ws)):
            ws_current = np.squeeze(ws[k][0][0][-1])
            # Go over the neurons
            for neuron in range(len(ws_current[layer])):
                pearson_t = []
                # Go over the rest of the neurons
                for neuron_second in range(neuron + 1, len(ws_current[layer])):
                    pearson_c, p_val = sis.pearsonr(ws_current[layer][neuron], ws_current[layer][neuron_second])
                    pearson_t.append(pearson_c)
                inner_pearson_mean.append(np.mean(pearson_t))
        pearson_mean.append(np.mean(inner_pearson_mean))
    # Plot the coefficients
    axes.bar(np.arange(1, 7), np.abs(np.array(pearson_mean)) * np.sqrt(sizes), align='center')
    axes.set_xlabel('Layer')
    axes.set_ylabel('Abs(Pearson)*sqrt(N_i)')
    rects = axes.patches
    # Now make some labels
    labels = ["L%d (%d neurons)" % (i, j) for i, j in zip(range(len(rects)), sizes)]
    plt.xticks(np.arange(1, 7), labels)

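# Illustration (editor's sketch): the pairwise-Pearson averaging from
# plot_pearson on a fake weight matrix (rows = neurons, columns = incoming
# weights). Values are random; only the looping scheme matches the function.
def _demo_pairwise_pearson(num_neurons=10, num_weights=12):
    layer_w = np.random.randn(num_neurons, num_weights)
    coeffs = []
    for a in range(num_neurons):
        for b in range(a + 1, num_neurons):
            r, p_val = sis.pearsonr(layer_w[a], layer_w[b])
            coeffs.append(r)
    return np.abs(np.mean(coeffs))  # the per-layer statistic before the sqrt(N_i) scaling
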
def update_axes(axes, xlabel, ylabel, xlim, ylim, title, xscale, yscale, x_ticks, y_ticks, p_0, p_1,
                font_size=30, axis_font=25, legend_font=16):
    """Adjust the axes to the right scale/ticks and labels"""
    categories = 6 * ['']
    labels = ['$10^{-5}$', '$10^{-4}$', '$10^{-3}$', '$10^{-2}$', '$10^{-1}$', '$10^0$', '$10^1$']
    # The legends of the mean and the std
    leg1 = plt.legend(p_0, categories, title=r'$\|Mean\left(\nabla{W_i}\right)\|$', loc='best',
                      fontsize=legend_font, markerfirst=False, handlelength=5)
    leg2 = plt.legend(p_1, categories, title=r'$STD\left(\nabla{W_i}\right)$', loc='best',
                      fontsize=legend_font, markerfirst=False, handlelength=5)
    leg1.get_title().set_fontsize('21')  # legend 'Title' fontsize
    leg2.get_title().set_fontsize('21')  # legend 'Title' fontsize
    plt.gca().add_artist(leg1)
    plt.gca().add_artist(leg2)
    utils.adjustAxes(axes, axis_font=20, title_str='', x_ticks=x_ticks, y_ticks=y_ticks, x_lim=xlim, y_lim=ylim,
                     set_xlabel=True, set_ylabel=True, x_label=xlabel, y_label=ylabel, set_xlim=True, set_ylim=True,
                     set_ticks=True, label_size=font_size, set_yscale=True,
                     set_xscale=True, yscale=yscale, xscale=xscale, ytick_labels=labels, genreal_scaling=True)


def extract_array(data, name):
    results = [[data[j, k][name] for k in range(data.shape[1])] for j in range(data.shape[0])]
    return results


def update_bars_num_of_ts(num, p_ts, H_Xgt, DKL_YgX_YgT, axes, ind_array):
    axes[1].clear()
    axes[2].clear()
    axes[0].clear()
    current_pts = p_ts[num]
    current_H_Xgt = H_Xgt[num]
    current_DKL_YgX_YgT = DKL_YgX_YgT[num]
    num_of_t = [c_pts.shape[0] for c_pts in current_pts]
    x = range(len(num_of_t))
    axes[0].bar(x, num_of_t)
    axes[0].set_title('Number of Ts in every layer - Epoch number - {0}'.format(ind_array[num]))
    axes[0].set_xlabel('Layer Number')
    axes[0].set_ylabel('# of Ts')
    axes[0].set_ylim([0, 800])
    h_list, dkl_list = [], []
    for i in range(len(current_pts)):
        h_list.append(-np.dot(current_H_Xgt[i], current_pts[i]))
        dkl_list.append(np.dot(current_DKL_YgX_YgT[i].T, current_pts[i]))
    axes[1].bar(x, h_list)
    axes[2].bar(x, dkl_list)

    axes[1].set_title('H(X|T)', fontsize=16)
    axes[1].set_xlabel('Layer Number')
    axes[1].set_ylabel('H(X|T)')

    axes[2].set_title('DKL[p(y|x)||p(y|t)]', fontsize=16)
    axes[2].set_xlabel('Layer Number')
    axes[2].set_ylabel('DKL[p(y|x)||p(y|t)]', fontsize=16)


def update_bars_entropy(num, H_Xgt, DKL_YgX_YgT, axes, ind_array):
    axes[0].clear()
    current_H_Xgt = np.mean(H_Xgt[num], axis=0)
    x = range(len(current_H_Xgt))
    axes[0].bar(x, current_H_Xgt)
    axes[0].set_title('H(X|T) in every layer - Epoch number - {0}'.format(ind_array[num]))
    axes[0].set_xlabel('Layer Number')
    axes[0].set_ylabel('H(X|T)')


def plot_hist(str_name, save_name='dist'):
    data_array = utils.get_data(str_name)
    params = np.squeeze(np.array(data_array['information']))
    ind_array = data_array['params']['epochsInds']
    DKL_YgX_YgT = extract_array(params, 'DKL_YgX_YgT')
    p_ts = extract_array(params, 'pts')
    H_Xgt = extract_array(params, 'H_Xgt')

    f, (axes) = plt.subplots(3, 1)
    # axes = [axes]
    f.subplots_adjust(left=0.14, bottom=0.1, right=.928, top=0.94, wspace=0.13, hspace=0.55)
    colors = LAYERS_COLORS
    line_ani = animation.FuncAnimation(f, update_bars_num_of_ts, len(p_ts), repeat=False,
                                       interval=1, blit=False, fargs=[p_ts, H_Xgt, DKL_YgX_YgT, axes, ind_array])
    Writer = animation.writers['ffmpeg']
    writer = Writer(fps=50)
    # Save the movie
    line_ani.save(save_name + '_movie.mp4', writer=writer, dpi=250)
    plt.show()


def plot_alphas(str_name, save_name='dist'):
    data_array = utils.get_data(str_name)
    params = np.squeeze(np.array(data_array['information']))
    I_XT_array = np.squeeze(np.array(extract_array(params, 'local_IXT')))
    """
    for i in range(I_XT_array.shape[2]):
        f1, axes1 = plt.subplots(1, 1)
        axes1.plot(I_XT_array[:,:,i])
        plt.show()
        return
    """
    I_XT_array_var = np.squeeze(np.array(extract_array(params, 'IXT_vartional')))
    I_TY_array_var = np.squeeze(np.array(extract_array(params, 'ITY_vartional')))

    I_TY_array = np.squeeze(np.array(extract_array(params, 'local_ITY')))
    """
    f1, axes1 = plt.subplots(1, 1)
    #axes1.plot(I_XT_array, I_TY_array)
    f1, axes2 = plt.subplots(1, 1)

    axes1.plot(I_XT_array, I_TY_array_var)
    axes2.plot(I_XT_array, I_TY_array)
    f1, axes1 = plt.subplots(1, 1)
    axes1.plot(I_TY_array, I_TY_array_var)
    axes1.plot([0, 1.1], [0, 1.1], transform=axes1.transAxes)
    #axes1.set_title('Sigma=' + str(sigmas[i]))
    axes1.set_ylim([0, 1.1])
    axes1.set_xlim([0, 1.1])
    plt.show()
    return
    """
    # for i in range()
    sigmas = np.linspace(0, 0.3, 20)

    for i in range(0, 20):
        print(i, sigmas[i])
        f1, axes1 = plt.subplots(1, 1)
        axes1.plot(I_XT_array, I_XT_array_var[:, :, i], linewidth=5)
        axes1.plot([0, 15.1], [0, 15.1], transform=axes1.transAxes)
        axes1.set_title('Sigma=' + str(sigmas[i]))
        axes1.set_ylim([0, 15.1])
        axes1.set_xlim([0, 15.1])
    plt.show()
    return
    # NOTE: everything below is unreachable because of the return above;
    # kept as exploratory leftovers.
    epochs_s = data_array['params']['epochsInds']
    f, axes = plt.subplots(1, 1)
    # epochs_s = []
    colors = LAYERS_COLORS
    linestyles = ['--', '-.', '-', '', ' ', ':', '']
    epochs_s = [0, -1]
    for j in epochs_s:
        for i in range(0, I_XT_array.shape[1]):
            axes.plot(sigmas, I_XT_array_var[j, i, :], color=colors[i], linestyle=linestyles[j],
                      label='Layer-' + str(i) + ' Epoch - ' + str(epochs_s[j]))
    title_str = r'I(X;T) for different layers as a function of $\sigma$ (the width of the Gaussian)'
    x_label = r'$\sigma$'
    y_label = '$I(X;T)$'
    x_lim = [0, 3]
    utils.adjustAxes(axes, axis_font=20, title_str=title_str, x_ticks=[], y_ticks=[], x_lim=x_lim, y_lim=None,
                     set_xlabel=True, set_ylabel=True, x_label=x_label, y_label=y_label, set_xlim=True,
                     set_ylim=False, set_ticks=False,
                     label_size=20, set_yscale=False,
                     set_xscale=False, yscale=None, xscale=None, ytick_labels='', genreal_scaling=False)
    axes.legend()
    plt.show()


if __name__ == '__main__':
    # The action that you want to plot:
    # plot snapshots of all the networks
    TIME_STAMPS = 'time-stamp'
    # create a movie of the networks
    MOVIE = 'movie'
    # plot networks with different numbers of layers
    ALL_LAYERS = 'all_layers'
    # plot networks with 5% of the data and with 80%
    COMPARED_PERCENT = 'compare_percent'
    # plot the information curves of the networks with different percentages of the data
    ALL_SAMPLES = 'all_samples'
    # Choose which figure to plot
    action = COMPARED_PERCENT
    prex = 'jobsFiles/'
    sofix = '.pickle'
    prex2 = '/Users/ravidziv/PycharmProjects/IDNNs/jobs/'
    # Plot the above action, the norms, the gradients and the Pearson coefficients
    do_plot_action, do_plot_norms, do_plot_pearson = True, False, False
    do_plot_eig = False
    plot_movie = False
    do_plot_time_steps = False
    # str_names = [[prex2+'fo_layerSizes=10,7,5,4,3_LastEpochsInds=9998_nRepeats=1_batch=3563_DataName=reg_1_nEpoch=10000_lr=0.0004_nEpochInds=964_samples=1_nDistSmpls=1/']]
    if action == TIME_STAMPS or action == MOVIE:
        index = 1
        name_s = prex2 + 'g_layerSizes=10,7,5,4,3_LastEpochsInds=9998_nRepeats=40_batch=3563_DataName=var_u_nEpoch=10000_lr=0.0002_nEpochInds=964_samples=1_nDistSmpls=1/'
        name_s = prex2 + 'r_DataName=MNIST_sampleLen=1_layerSizes=400,200,150,60,50,40,30_lr=0.0002_nEpochInds=677_nRepeats=1_LastEpochsInds=1399_nDistSmpls=1_nEpoch=1400_batch=2544/'
        if action == TIME_STAMPS:
            save_name = '3_time_series'
            # plot_snapshots(name_s, save_name, index)
        else:
            save_name = 'general'
            plot_animation(name_s, save_name)
    else:
        if action == ALL_LAYERS:
            mode = 11
            save_name = ALL_LAYERS
            str_names = [[prex + 'ff3_5_198.pickle', prex + 'ff3_4_198.pickle', prex + 'ff3_3_198.pickle'],
                         [prex + 'ff3_2_198.pickle', prex + 'ff3_1_198.pickle', prex + 'ff4_1_10.pickle']]
            str_names[1][2] = prex2 + 'g_layerSizes=10,7,5,4,4,3_LastEpochsInds=9998_nRepeats=20_batch=3563_DataName=var_u_nEpoch=10000_lr=0.0004_nEpochInds=964_samples=1_nDistSmpls=1/'
            str_names = [[prex2 + 'nbins8_DataName=var_u_sampleLen=1_layerSizes=10,7,5,4,3_lr=0.0004_nEpochInds=964_nRepeats=5_LastEpochsInds=9998_nDistSmpls=1_nEpoch=10000_batch=4096/',
                          prex2 + 'nbins12_DataName=var_u_sampleLen=1_layerSizes=10,7,5,4,3_lr=0.0004_nEpochInds=964_nRepeats=5_LastEpochsInds=9998_nDistSmpls=1_nEpoch=10000_batch=4096/',
                          prex2 + 'nbins18_DataName=var_u_sampleLen=1_layerSizes=10,7,5,4,3_lr=0.0004_nEpochInds=964_nRepeats=5_LastEpochsInds=9998_nDistSmpls=1_nEpoch=10000_batch=4096/'],
                         [prex2 + 'nbins25_DataName=var_u_sampleLen=1_layerSizes=10,7,5,4,3_lr=0.0004_nEpochInds=964_nRepeats=5_LastEpochsInds=9998_nDistrSmpls=1_nEpoch=10000_batch=4096/',
                          prex2 + 'nbins35_DataName=var_u_sampleLen=1_layerSizes=10,7,5, 4,3_lr=0.0004_nEpochInds=964_nRepeats=5_LastEpochsInds=9998_nDistSmpls=1_nEpoch=10000_batch=4096/',
                          prex2 + 'nbins50_DataName=var_u_sampleLen=1_layerSizes=10,7,5,4,3_lr=0.0004_nEpochInds=964_nRepeats=5_LastEpochsInds=9998_nDistSmpls=1_nEpoch=10000_batch=4096/']]
        elif action == COMPARED_PERCENT:
            save_name = COMPARED_PERCENT
            # mode = 0
            mode = 2
            str_names = [[prex + 'ff4_1_10.pickle', prex + 'ff3_1_198.pickle']]
        elif action == ALL_SAMPLES:
            save_name = ALL_SAMPLES
            mode = 3
            str_names = [[prex + 't_32_1.pickle']]
        root = tk.Tk()
        root.withdraw()
        file_path = filedialog.askopenfilename()
        str_names = [[('/').join(file_path.split('/')[:-1]) + '/']]
        if do_plot_action:
            plot_figures(str_names, mode, save_name)
        if do_plot_norms:
            plot_norms(str_names)  # NOTE: stale call; plot_norms now takes (axes, epochsInds, norms1, norms2)
        if do_plot_pearson:
            plot_pearson(str_names)
        if plot_movie:
            plot_animation_each_neuron(str_names, save_name)
        if do_plot_eig:
            pass
        if do_plot_time_steps:
            plot_snapshots(str_names[0][0], save_name, 1)

    # plot_eigs_movie(str_names)
    plt.show()
idnns/plots/plot_gradients.py
ADDED
@@ -0,0 +1,223 @@
'Calculate and plot the gradients (the mean and std of the mini-batch gradients) of the trained network'
import matplotlib
matplotlib.use("TkAgg")
import numpy as np
import idnns.plots.utils as plt_ut
import matplotlib.pyplot as plt
import tkinter as tk
from tkinter import filedialog
from numpy import linalg as LA
import os
colors = ['red', 'c', 'blue', 'green', 'orange', 'purple']


def plot_gradients(name_s=None, data_array=None, figures_dir=''):
    """Plot the gradients and the means of the networks over the batches"""
    if data_array is None:
        data_array = plt_ut.get_data(name_s[0][0])
    # plot_loss_figures(data_array, xlim=[0, 7000])
    # The gradients - the dimensions are #epochs x #batches x #layers
    conv_net = False
    if conv_net:
        gradients = data_array['var_grad_val'][0][0][0]
        num_of_epochs = len(gradients)
        num_of_batchs = len(gradients[0])
        num_of_layers = len(gradients[0][0]) // 2
    else:
        gradients = np.squeeze(data_array['var_grad_val'])[:, :, :]
        num_of_epochs, num_of_batchs, num_of_layers = gradients.shape
        num_of_layers = int(num_of_layers / 2)
    # The indexes where we sampled the network
    print(np.squeeze(data_array['var_grad_val'])[0, 0].shape)
    epochsInds = (data_array['params']['epochsInds']).astype(int)
    # The norms of the layers
    # l2_norm = calc_weights_norms(data_array['ws_all'])
    f_log, axes_log, f_norms, axes_norms, f_snr, axes_snr, axes_gaus, f_gaus = create_figs()
    p_1, p_0, sum_y, p_3, p_4 = [], [], [], [], []
    # Go over the layers
    cov_traces_all, means_all = [], []
    all_gradients = np.empty(num_of_layers, dtype=object)
    # print(np.squeeze(data_array['var_grad_val']).shape)
    for layer in range(0, num_of_layers):
        # The traces of the covariance and the means of the gradients for the current layer
        # Go over all the epochs
        cov_traces, means = [], []
        gradients_layer = []
        for epoch_index in range(num_of_epochs):
            # The gradients have dimensions #batches x #output weights, where
            # "output weights" is the number of weights that go out from the layer
            gradients_current_epoch_and_layer = flatten_gradients(gradients, epoch_index, 2 * layer)
            gradients_layer.append(gradients_current_epoch_and_layer)
            num_of_output_weights = gradients_current_epoch_and_layer.shape[1]
            # The average vector over the batches - a vector of size #output weights
            average_vec = np.mean(gradients_current_epoch_and_layer, axis=0)
            # The norm of the batch-averaged gradient (sqrt of AA^T) - a number
            gradients_mean = LA.norm(average_vec)
            # The covariance matrix has size #output weights x #output weights
            sum_covs_mat = np.zeros((average_vec.shape[0], average_vec.shape[0]))
            # Go over all the batch vectors (each of size #output weights), subtract the
            # mean (over the batches) and accumulate the covariance matrix
            for batch_index in range(num_of_batchs):
                # This has the size of #output weights
                current_vec = gradients_current_epoch_and_layer[batch_index, :] - average_vec
                # The outer product of the current gradient of the weights (in this specific
                # batch) with its transpose gives a #output weights x #output weights matrix
                current_cov_mat = np.einsum('i,j', current_vec, current_vec)
                # current_cov_mat = np.dot(current_vec[:,None], current_vec[None,:])
                # Sum the covariance matrices over the batches
                sum_covs_mat += current_cov_mat
            # The mean of the covariance matrices over the batches - #output weights x #output weights
            mean_cov_mat = sum_covs_mat / num_of_batchs
            # The sqrt of the trace of the mean covariance matrix - a number
            trac_cov = np.sqrt(np.trace(mean_cov_mat))
            means.append(gradients_mean)
            cov_traces.append(trac_cov)
            """
            #cov_traces.append(np.mean(grad_norms))
            #means.append(norm_mean)
            c_var, c_mean, total_w = [], [], []

            for neuron in range(len(grad[epoch_number][0][layer])/10):
                gradients_list = np.array([grad[epoch_number][i][layer][neuron] for i in range(len(grad[epoch_number]))])
                total_w.extend(gradients_list.T)
                grad_norms1 = np.std(gradients_list, axis=0)
                mean_la = np.abs(np.mean(np.array(gradients_list), axis=0))
                #mean_la = LA.norm(gradients_list, axis=0)
                c_var.append(np.mean(grad_norms1))
                c_mean.append(np.mean(mean_la))
            #total_w is in size [num_of_total_weights, num of epochs]
            total_w = np.array(total_w)
            #c_var.append(np.sqrt(np.trace(np.cov(np.array(total_w).T)))/np.cov(np.array(total_w).T).shape[0])
            #print np.mean(c_mean).shape
            means.append(np.mean(c_mean))
            cov_traces.append(np.mean(c_var))
            """

        gradients_layer = np.array(gradients_layer)
        all_gradients[layer] = gradients_layer
        cov_traces_all.append(np.array(cov_traces))
        means_all.append(np.array(means))
        # The cov_traces and the means are vectors with the dimension of #epochs
        # y_var = np.array(cov_traces)
        # y_mean = np.array(means)
        y_var = np.sum(cov_traces_all, axis=0)
        y_mean = np.sum(means_all, axis=0)
        snr = y_mean ** 2 / y_var
        # Plot the gradients and the means
        c_p1, = axes_log.plot(epochsInds[:], np.sqrt(y_var), markersize=4, linewidth=4, color=colors[layer],
                              linestyle=':', markeredgewidth=0.2, dashes=[4, 4])
        c_p0, = axes_log.plot(epochsInds[:], y_mean, linewidth=2, color=colors[layer])
        c_p3, = axes_snr.plot(epochsInds[:], snr, linewidth=2, color=colors[layer])
        c_p4, = axes_gaus.plot(epochsInds[:], np.log(1 + snr), linewidth=2, color=colors[layer])
        # For the legend
        p_0.append(c_p0), p_1.append(c_p1), sum_y.append(y_mean), p_3.append(c_p3), p_4.append(c_p4)
    plt_ut.adjust_axes(axes_log, axes_norms, p_0, p_1, f_log, f_norms, axes_snr, f_snr, p_3, axes_gaus, f_gaus, p_4,
                       directory_name=figures_dir)
    plt.show()

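# Illustration (editor's sketch): the per-epoch statistic computed in the loop
# above, in isolation. For a (num_batches, num_weights) gradient matrix the
# "mean" is the norm of the batch-averaged gradient and the "std" is the
# square root of the covariance trace; their ratio of squares gives the SNR.
# Random data, hypothetical shapes.
def _demo_gradient_snr(num_batches=64, num_weights=100):
    g = np.random.randn(num_batches, num_weights)
    mean_vec = np.mean(g, axis=0)
    mean_norm = LA.norm(mean_vec)
    # trace of the covariance == sum of per-weight variances across batches
    std_norm = np.sqrt(np.sum(np.var(g, axis=0)))
    return mean_norm, std_norm, mean_norm ** 2 / std_norm ** 2
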
def calc_mean_var_loss(epochsInds, loss_train):
    # loss_train has dimensions #epochs x #batches
    num_of_epochs = loss_train.shape[0]
    # Average over the batches
    loss_train_mean = np.mean(loss_train, 1)
    # The diff divided by the sampled indexes
    d_mean_loss_to_dt = np.sqrt(np.abs(np.diff(loss_train_mean) / np.diff(epochsInds[:])))
    var_loss = []
    # Go over the epochs
    for epoch_index in range(num_of_epochs):
        # The loss for the specific epoch
        current_loss = loss_train[epoch_index, :]
        # The derivative between the batches
        current_loss_dt = np.diff(current_loss)
        # The mean of its derivative
        average_loss = np.mean(current_loss_dt)
        current_loss_minus_mean = current_loss_dt - average_loss
        # The covariance between the batches
        cov_mat = np.dot(current_loss_minus_mean[:, None], current_loss_minus_mean[None, :])
        # The trace of the covariance matrix
        trac_cov = np.trace(cov_mat)
        var_loss.append(trac_cov)
    return np.array(var_loss), d_mean_loss_to_dt


def plot_loss_figures(data_array, fig_size=(14, 10), xlim=None, y_lim=None):
    epochsInds = (data_array['params']['epochsInds']).astype(int)
    dif_var_loss, diff_mean_loss = calc_mean_var_loss(epochsInds, np.squeeze(data_array['loss_train']))
    f_log1, (axes_log1) = plt.subplots(1, 1, figsize=fig_size)
    axes_log1.set_title('The mean and the variance (between the batches) of the derivative of the train error')
    axes_log1.plot(epochsInds[1:], np.array(diff_mean_loss), color='green', label='Mean of the derivative of the error')
    axes_log1.plot(epochsInds[:], (dif_var_loss), color='blue', label='Variance of the derivative of the error')
    axes_log1.set_xscale('log')
    axes_log1.set_yscale('log')
    axes_log1.set_xlabel('#Epochs')
    axes_log1.legend()

    f_log1, (axes_log1) = plt.subplots(1, 1, figsize=fig_size)
    title = r'The SNR of the error derivatives'

    p_5, = axes_log1.plot(epochsInds[1:], np.array(diff_mean_loss) / np.sqrt(dif_var_loss[1:]), linewidth=3,
                          color='green')
    plt_ut.update_axes(axes_log1, f_log1, '#Epochs', 'SNR', [0, 7000], [0.001, 1], title, 'log', 'log',
                       [1, 10, 100, 1000, 7000], [0.001, 0.01, 0.1, 1])
    # axes_log1.plot(epochsInds[:], (dif_var_loss), color='blue', label='Variance of the derivative of the error')
    axes_log1.legend([r'$\frac{|d Error|}{STD\left(Error)\right)}$'], loc='best', fontsize=21)


def create_figs(fig_size=(14, 10)):
    f_norms, (axes_norms) = plt.subplots(1, 1, figsize=fig_size)
    f_log, (axes_log) = plt.subplots(1, 1, figsize=fig_size)
    f_snr, (axes_snr) = plt.subplots(1, 1, figsize=fig_size)
    f_gaus, (axes_gaus) = plt.subplots(1, 1, figsize=fig_size)
    f_log.subplots_adjust(left=0.097, bottom=0.11, right=.95, top=0.95, wspace=0.03, hspace=0.03)
    return f_log, axes_log, f_norms, axes_norms, f_snr, axes_snr, axes_gaus, f_gaus


def flatten_gradients(gradients, epoch_number, layer):
    gradients_list = []
    # For each batch, go over all the neurons in the current layer and collect their weights
    for i in range(len(gradients[epoch_number])):
        current_list_inner = []
        for neuron in range(len(gradients[epoch_number][0][layer])):
            c_n = gradients[epoch_number][i][layer][neuron]
            current_list_inner.extend(c_n)
        gradients_list.append(current_list_inner)
    gradients_list = np.array(gradients_list)
    gradients_list = np.reshape(gradients_list, (gradients_list.shape[0], -1))

    return gradients_list

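# Illustration (editor's sketch): what flatten_gradients produces. Given
# per-neuron gradient lists it returns one (num_batches, num_output_weights)
# matrix per layer so the statistics above reduce to plain numpy. Toy nested
# lists with hypothetical sizes:
def _demo_flatten(num_batches=3, num_neurons=4, num_weights=5):
    nested = [[list(np.random.randn(num_weights)) for _ in range(num_neurons)]
              for _ in range(num_batches)]
    flat = np.array([np.concatenate(batch) for batch in nested])
    return flat  # shape: (num_batches, num_neurons * num_weights)
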
def calc_weights_norms(ws, num_of_layer=6):
    layer_l2_norm = []
    for i in range(num_of_layer):
        flatted_list = [1]
        """
        if type(ws_in[epoch_number][layer_index]) is list:
            flatted_list = [item for sublist in ws_in[epoch_number][layer_index] for item in sublist]
        else:
            flatted_list = ws_in[epoch_number][layer_index]
        """
        layer_l2_norm.append(LA.norm(np.array(flatted_list)))
    # Plot the norms
    # axes_norms.plot(epochsInds[:], np.array(layer_l2_norm), linewidth=2, color=colors[layer_index])
    return layer_l2_norm


def extract_array(data, name):
    results = [[data[j, k][name] for k in range(data.shape[1])] for j in range(data.shape[0])]
    return results


def load_from_memory(data_array):
    plot_gradients(data_array=data_array)


if __name__ == '__main__':
    directory = './figures/'
    if not os.path.exists(directory):
        os.makedirs(directory)
    root = tk.Tk()
    root.withdraw()
    file_path = filedialog.askopenfilename()
    str_names = [[('/').join(file_path.split('/')[:-1]) + '/']]
    plot_gradients(str_names, figures_dir=directory)
idnns/plots/utils.py
ADDED
@@ -0,0 +1,243 @@
import matplotlib

matplotlib.use("TkAgg")
import scipy.io as sio

import matplotlib.pyplot as plt
import os
import numpy as np
import sys

if sys.version_info >= (3, 0):
    import _pickle as cPickle
else:
    import cPickle


def update_axes(axes, f, xlabel, ylabel, xlim, ylim, title='', xscale=None, yscale=None, x_ticks=None, y_ticks=None,
                p_0=None, p_1=None, p_3=None, p_4=None,
                p_5=None, title_size=22):
    """Adjust the axes to the right scale/ticks and labels"""
    font_size = 30
    axis_font = 25
    legend_font = 16
    categories = 6 * ['']
    labels = ['$10^{-4}$', '$10^{-3}$', '$10^{-2}$', '$10^{-1}$', '$10^0$', '$10^1$']
    # If we want a grey line in the middle:
    # axes.axvline(x=370, color='grey', linestyle=':', linewidth=4)
    # The legends of the mean and the std
    """
    if p_0:
        leg1 = f.legend([p_0[0], p_0[1], p_0[2], p_0[3], p_0[4], p_0[5]], categories, title=r'$\|Mean\left(\nabla{W_i}\right)\|$', bbox_to_anchor=(0.09, 0.95), loc=2, fontsize=legend_font, markerfirst=False, handlelength=5)
        leg1.get_title().set_fontsize('21')  # legend 'Title' fontsize
        axes.add_artist(leg1)

    if p_1:
        leg2 = f.legend([p_1[0], p_1[1], p_1[2], p_1[3], p_1[4], p_1[5]], categories, title=r'$Variance\left(\nabla{W_i}\right)$', loc=2, bbox_to_anchor=(0.25, 0.95), fontsize=legend_font, markerfirst=False, handlelength=5)
        leg2.get_title().set_fontsize('21')  # legend 'Title' fontsize
        axes.add_artist(leg2)
    if p_3:
        leg2 = f.legend([p_3[0], p_3[1], p_3[2], p_3[3], p_3[4], p_3[5]], categories,
                        title=r'$SNR\left(\nabla{W_i}\right)$', loc=3, fontsize=legend_font,
                        markerfirst=False, handlelength=5, bbox_to_anchor=(0.15, 0.1))
        leg2.get_title().set_fontsize('21')  # legend 'Title'

    if p_4:
        leg2 = f.legend([p_4[0], p_4[1], p_4[2], p_4[3], p_4[4], p_4[5]], categories,
                        title=r'$\log\left(1+ SNR\left(\nabla{W_i}\right)\right)$', loc=3, fontsize=legend_font,
                        markerfirst=False, handlelength=5, bbox_to_anchor=(0.15, 0.1))
        leg2.get_title().set_fontsize('21')  # legend 'Title'
    if p_5:
        pass
        #leg2 = axes.legend(handles=[r'$\frac{|d Error|}{STD\left(Error)\right)}$'], loc=3, fontsize=legend_font,
        #                   bbox_to_anchor=(0.15, 0.1))
        #leg2.get_title().set_fontsize('21')
    """
    # plt.gca().add_artist(leg2)
    axes.set_xscale(xscale)
    axes.set_yscale(yscale)
    axes.set_xlabel(xlabel, fontsize=font_size)
    axes.set_ylabel(ylabel, fontsize=font_size)
    axes.xaxis.set_major_formatter(matplotlib.ticker.ScalarFormatter())
    axes.yaxis.set_major_formatter(matplotlib.ticker.ScalarFormatter())
    if y_ticks:
        axes.set_xticks(x_ticks)
        axes.set_yticks(y_ticks)
    axes.tick_params(axis='x', labelsize=axis_font)
    axes.tick_params(axis='y', labelsize=axis_font)

    axes.xaxis.major.formatter._useMathText = True
    axes.set_yticklabels(labels, fontsize=font_size)
    axes.set_title(title, fontsize=title_size)
    axes.xaxis.set_major_formatter(matplotlib.ticker.ScalarFormatter(useMathText=True))
    axes.set_xlim(xlim)
    if ylim:
        axes.set_ylim(ylim)


def update_axes_norms(axes, xlabel, ylabel):
    """Adjust the axes of the norms figure with labels/ticks"""
    font_size = 30
    axis_font = 25
    legend_font = 16
    # The legends
    categories = [r'$\|W_1\|$', r'$\|W_2\|$', r'$\|W_3\|$', r'$\|W_4\|$', r'$\|W_5\|$', r'$\|W_6\|$']
    # Grey line in the middle
    axes.axvline(x=370, color='grey', linestyle=':', linewidth=4)
    axes.legend(categories, loc='best', fontsize=legend_font)
    axes.set_xlabel(xlabel, fontsize=font_size)
    axes.set_ylabel(ylabel, fontsize=font_size)
    axes.tick_params(axis='x', labelsize=axis_font)
    axes.tick_params(axis='y', labelsize=axis_font)


def update_axes_snr(axes, xlabel, ylabel):
    """Adjust the axes of the SNR figure with labels/ticks"""
    font_size = 30
    axis_font = 25
    legend_font = 16
    # The legends
    categories = [r'$W_1$', r'$W_2$', r'$W_3$', r'$W_4$', r'$W_5$', r'$W_6$']
    axes.set_title('The SNR ($norm^2/variance$)')
    # axes.axvline(x=370, color='grey', linestyle=':', linewidth=4)
    axes.legend(categories, loc='best', fontsize=legend_font)
    axes.set_xlabel(xlabel, fontsize=font_size)
    axes.set_ylabel(ylabel, fontsize=font_size)
    axes.tick_params(axis='x', labelsize=axis_font)
    axes.tick_params(axis='y', labelsize=axis_font)


def adjust_axes(axes_log, axes_norms, p_0, p_1, f_log, f_norms, axes_snr=None, f_snr=None, p_3=None, axes_gaus=None,
                f_gau=None, p_4=None, directory_name=''):
    # Adjust the figure with the specific labels, scaling and legends.
    # Change 'log'/'log' to 'linear' if you want linear scaling.
    # update_axes(reg_axes, '# Epochs', 'Normalized Mean and STD', [0, 10000], [0.000001, 10], '', 'log', 'log', [1, 10, 100, 1000, 10000], [0.00001, 0.0001, 0.001, 0.01, 0.1, 1, 10], p_0, p_1)
    title = 'The Mean and std of the gradients of each layer'
    update_axes(axes_log, f_log, '# Epochs', 'Mean and STD', [0, 7000], [0.001, 10], title, 'log', 'log',
                [1, 10, 100, 1000, 7000], [0.001, 0.01, 0.1, 1, 10], p_0, p_1)
    update_axes_norms(axes_norms, '# Epochs', '$L_2$')
    if p_3:
        title = r'SNR of the gradients ($\frac{norm^2}{variance}$)'
        update_axes(axes_snr, f_snr, '# Epochs', 'SNR', [0, 7000], [0.0001, 10], title, 'log', 'log',
                    [1, 10, 100, 1000, 7000], [0.0001, 0.001, 0.01, 0.1, 1, 10], p_3=p_3)
    if p_4:
        title = r'Gaussian Channel bounds of the gradients ($\log\left(1+SNR\right)$)'

        update_axes(axes_gaus, f_gau, '# Epochs', 'log(SNR+1)', [0, 7000], [0.0001, 10], title, 'log', 'log',
                    [1, 10, 100, 1000, 7000], [0.0001, 0.001, 0.01, 0.1, 1, 10], p_4=p_4)
    # axes_log.plot(epochsInds[1:], np.abs(np.diff(np.squeeze(data_array['loss_train']))) / np.diff(epochsInds[:]), color='black', linewidth=3)

    # axes_log.plot(epochsInds[0:], np.sum(np.array(sum_y), axis=0), color='c', linewidth=3)
    # axes_log.plot(epochsInds[1:], diff_mean_loss, color='red', linewidth=3)
    # f_log1, (axes_log1) = plt.subplots(1, 1, figsize=fig_size)

    # axes_log1.plot(epochsInds[1:], np.sum(np.array(sum_y), axis=0)[1:] / diff_mean_loss, color='c', linewidth=3)

    # axes_log.set_xscale('log')
    f_log.savefig(directory_name + 'log_gradient.svg', dpi=200, format='svg')
    f_norms.savefig(directory_name + 'norms.jpg', dpi=200, format='jpg')


def adjustAxes(axes, axis_font=20, title_str='', x_ticks=[], y_ticks=[], x_lim=None, y_lim=None,
               set_xlabel=True, set_ylabel=True, x_label='', y_label='', set_xlim=True, set_ylim=True, set_ticks=True,
               label_size=20, set_yscale=False,
               set_xscale=False, yscale=None, xscale=None, ytick_labels='', genreal_scaling=False):
    """Organize the axes of the given figure"""
    if set_xscale:
        axes.set_xscale(xscale)
    if set_yscale:
        axes.set_yscale(yscale)
    if genreal_scaling:
        axes.xaxis.set_major_formatter(matplotlib.ticker.ScalarFormatter())
        axes.yaxis.set_major_formatter(matplotlib.ticker.ScalarFormatter())
        axes.xaxis.major.formatter._useMathText = True
        axes.set_yticklabels(ytick_labels)
        axes.xaxis.set_major_formatter(matplotlib.ticker.ScalarFormatter(useMathText=True))
    if set_xlim:
        axes.set_xlim(x_lim)
    if set_ylim:
        axes.set_ylim(y_lim)
    axes.set_title(title_str, fontsize=axis_font + 2)
    axes.tick_params(axis='y', labelsize=axis_font)
    axes.tick_params(axis='x', labelsize=axis_font)
    if set_ticks:
        axes.set_xticks(x_ticks)
        axes.set_yticks(y_ticks)
    if set_xlabel:
        axes.set_xlabel(x_label, fontsize=label_size)
    if set_ylabel:
        axes.set_ylabel(y_label, fontsize=label_size)


def create_color_bar(f, cmap, colorbar_axis, bar_font, epochsInds, title):
    sm = plt.cm.ScalarMappable(cmap=cmap, norm=plt.Normalize(vmin=0, vmax=1))
    sm._A = []
    cbar_ax = f.add_axes(colorbar_axis)
    cbar = f.colorbar(sm, ticks=[], cax=cbar_ax)
    cbar.ax.tick_params(labelsize=bar_font)
    cbar.set_label(title, size=bar_font)
    cbar.ax.text(0.5, -0.01, epochsInds[0], transform=cbar.ax.transAxes,
                 va='top', ha='center', size=bar_font)
    cbar.ax.text(0.5, 1.0, str(epochsInds[-1]), transform=cbar.ax.transAxes,
                 va='bottom', ha='center', size=bar_font)

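# Illustration (editor's sketch): the ScalarMappable trick used by
# create_color_bar, standalone. A colorbar normally needs a mappable artist;
# an empty ScalarMappable stands in for one. The epoch range is synthetic.
def _demo_color_bar():
    fig, ax = plt.subplots()
    cmap = plt.get_cmap('gnuplot')
    sm = plt.cm.ScalarMappable(cmap=cmap, norm=plt.Normalize(vmin=0, vmax=1))
    sm._A = []  # no data array; the colorbar only shows the colormap
    cbar = fig.colorbar(sm, ax=ax, ticks=[])
    cbar.set_label('Epochs')
    plt.show()
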
186 |
+
def get_data(name):
    """Load data from the given name"""
    gen_data = {}
    # New version
    if os.path.isfile(name + 'data.pickle'):
        curent_f = open(name + 'data.pickle', 'rb')
        d2 = cPickle.load(curent_f)
    # Old version
    else:
        curent_f = open(name, 'rb')
        d1 = cPickle.load(curent_f)
        data1 = d1[0]
        data = np.array([data1[:, :, :, :, :, 0], data1[:, :, :, :, :, 1]])
        # Convert log e to log2
        normalization_factor = 1 / np.log2(2.718281)
        epochsInds = np.arange(0, data.shape[4])
        d2 = {}
        d2['epochsInds'] = epochsInds
        d2['information'] = data / normalization_factor
    return d2
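The `normalization_factor` in `get_data` converts information values from nats to bits: dividing by 1/log2(e) is the same as multiplying by log2(e). A quick self-contained sanity check:

# Sanity check of the nats-to-bits conversion used in get_data:
# x nats equals x * log2(e) bits, i.e. x / (1 / log2(e)).
import numpy as np

nats = np.array([np.log(2.0), 1.0])        # ln(2) nats is exactly 1 bit
normalization_factor = 1 / np.log2(np.e)   # 2.718281 above approximates e
bits = nats / normalization_factor
assert np.allclose(bits, nats / np.log(2))
print(bits)                                # [1.0, 1.4426...]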
def load_reverese_annealing_data(name, max_beta=300, min_beta=0.8, dt=0.1):
    """Load mat file of the reverse annealing data with the given params"""
    with open(name + '.mat', 'rb') as handle:
        d = sio.loadmat(name + '.mat')
        F = d['F']
        ys = d['y']
        PXs = np.ones(len(F)) / len(F)
        f_PYs = np.mean(ys)
    PYs = np.array([f_PYs, 1 - f_PYs])
    PYX = np.concatenate((np.array(ys)[None, :], 1 - np.array(ys)[None, :]))
    mybetaS = 2 ** np.arange(np.log2(min_beta), np.log2(max_beta), dt)
    mybetaS = mybetaS[::-1]
    PTX0 = np.eye(PXs.shape[0])
    return mybetaS, np.squeeze(PTX0), np.squeeze(PXs), np.squeeze(PYX), np.squeeze(PYs)
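The beta schedule in `load_reverese_annealing_data` is geometric: exponents advance linearly in log2-space by `dt`, and the sequence is reversed so annealing runs from high beta to low. In isolation, with the defaults above:

# Standalone view of the reverse-annealing beta schedule.
import numpy as np

min_beta, max_beta, dt = 0.8, 300, 0.1
mybetaS = (2 ** np.arange(np.log2(min_beta), np.log2(max_beta), dt))[::-1]
print(len(mybetaS), mybetaS[0], mybetaS[-1])  # 86 betas, from ~290 down to 0.8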
# Note: this second definition of get_data is a verbatim duplicate of the one
# above; since Python binds names at execution time, it shadows the earlier one.
def get_data(name):
    """Load data from the given name"""
    gen_data = {}
    # New version
    if os.path.isfile(name + 'data.pickle'):
        curent_f = open(name + 'data.pickle', 'rb')
        d2 = cPickle.load(curent_f)
    # Old version
    else:
        curent_f = open(name, 'rb')
        d1 = cPickle.load(curent_f)
        data1 = d1[0]
        data = np.array([data1[:, :, :, :, :, 0], data1[:, :, :, :, :, 1]])
        # Convert log e to log2
        normalization_factor = 1 / np.log2(2.718281)
        epochsInds = np.arange(0, data.shape[4])
        d2 = {}
        d2['epochsInds'] = epochsInds
        d2['information'] = data / normalization_factor
    return d2
main.py
ADDED
@@ -0,0 +1,24 @@
"""
Train and plot networks in the information plane
"""
import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()

from idnns.networks import information_network as inet


def main():
    # Build the network
    print('Building the network')
    net = inet.informationNetwork()
    net.print_information()
    print('Start running the network')
    net.run_network()
    print('Saving data')
    net.save_data()
    print('Plotting figures')
    # Plot the network
    net.plot_network()


if __name__ == '__main__':
    main()
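The two `tensorflow.compat.v1` lines at the top of main.py are what let this TF1-era graph code run under a TensorFlow 2 install. A minimal self-contained check of that shim, assuming TensorFlow 2 is available:

# Minimal check that disable_v2_behavior() restores graph-mode semantics:
# v1 placeholders and Sessions only work with the shim active.
import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()

x = tf.placeholder(tf.float32, shape=[None])
y = 2.0 * x
with tf.Session() as sess:
    print(sess.run(y, feed_dict={x: [1.0, 2.0]}))  # -> [2. 4.]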
test.py
ADDED
@@ -0,0 +1,14 @@
import idnns.plots.plot_figures as plt_fig

if __name__ == '__main__':
    # Successive reassignments: only the last uncommented str_name takes effect.
    str_name = [['jobs/usa22_DataName=MNIST_sampleLen=1_layerSizes=400,200,100_lr=0.002_nEpochInds=22_nRepeats=1_LastEpochsInds=299_nDistSmpls=1_nEpoch=300_batch=2560/']]
    str_name = [['jobs/trails1_DataName=var_u_sampleLen=1_layerSizes=10,7,5,4,3_lr=0.0004_nEpochInds=84_nRepeats=1_LastEpochsInds=9998_nDistSmpls=1_nEpoch=10000_batch=4016/']]
    str_name = [['jobs/trails1_DataName=g2_sampleLen=1_layerSizes=10,7,5,4,3_lr=1e-05_nEpochInds=75_nRepeats=1_LastEpochsInds=999_nDistSmpls=1_nEpoch=1000_batch=4016/']]
    # str_name = [['jobs/trails2_DataName=var_u_sampleLen=1_layerSizes=10,7,5,4,3_lr=0.0004_nEpochInds=84_nRepeats=1_LastEpochsInds=9998_nDistSmpls=1_nEpoch=10000_batch=4016/']]
    # plt_fig.plot_figures(str_name, 2, 'd')
    plt_fig.plot_alphas(str_name[0][0])
    mode = 2
    save_name = 'figure'
    # plt_fig.plot_figures(str_name, mode, save_name)

    # plt_fig.plot_hist(str_name[0][0])
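Since only the last uncommented `str_name` assignment is ever used, a less error-prone pattern is to resolve job directories at run time. A sketch assuming the same `jobs/` layout; the helper name is hypothetical:

# Hypothetical helper: locate job directories by name prefix instead of
# hard-coding full run signatures.
import glob

def find_job_dirs(prefix, root='jobs'):
    """Return trailing-slash paths of job dirs under `root` starting with `prefix`."""
    return sorted(d.rstrip('/') + '/' for d in glob.glob('{}/{}*'.format(root, prefix)))

# e.g. str_name = [[find_job_dirs('trails1_DataName=g2')[0]]]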