Building TensorFlow from Source on Your MacBook

This guide is inspired by the tremendous walkthrough on TechPolyMath. It walks through building and installing the TensorFlow binary from source on a macOS system. I am publishing this mostly as a reference for myself, but others might find it useful.

The following assumptions were made, using a 2016 MacBook Pro:

  • macOS High Sierra
  • Intel Core i5 Processor
  • TensorFlow 1.4.1
  • Python 3.6.2
  • JDK 8u151
  • Bazel 0.5.4

Note that the resulting binary we build will be optimized for CPU support only.

Installing Prerequisites on macOS

Install Java Development Kit

At the time of publication, Bazel required the Java Development Kit 8.0. If it is not already installed, follow these steps.

  • Download the DMG image that contains the JDK installer from Oracle’s Java SE Development Kit 8 Downloads page.
  • The file should be named something like jdk-8u151-macosx-x64.dmg
  • Mount the DMG image, double-click the installer, and follow the guided prompts.

Install Dependencies with Anaconda

It is strongly recommended that you use a virtualenv or Conda environment to isolate the packages and installation of TensorFlow from source. This enables you to maintain multiple combinations of package versions for various projects or experiments.

Since I use Anaconda, I created a new environment by executing the command conda create -n tensorflow python=3.6

Install Bazel

Instead of using the popular Homebrew system, I used Anaconda. Basically, I wanted to get by with as few tools as possible, and since I am a MacPorts user I tried to avoid Homebrew. You can refer to the Bazel documentation for alternatives.

  • From a Terminal, change into your newly created environment: source activate tensorflow
  • Use anaconda search bazel to find the right version from conda-forge
  • Install using conda install --channel https://conda.anaconda.org/c3i_test2 bazel
  • Confirm Bazel is installed by verifying the output of bazel version

I ran into a weird bug where TensorFlow wouldn’t compile with the conda-forge version of Bazel 0.8.1, so I had to use the next best version available.

Install Python Dependencies

TensorFlow requires the following Python packages:

  • six
  • NumPy
  • wheel

If these are not present on your machine or current environment, execute the following to install them:

pip3 install six numpy wheel

Prepare the TensorFlow Installation

Clone the TensorFlow Repository

Now, clone the latest TensorFlow repository to some place on your computer by issuing the following command:

git clone https://github.com/tensorflow/tensorflow

This will take a short while depending on your Internet connection bandwidth due to the repository’s size (~250 MB).

Most of the time you want to install a release version rather than trunk. You can display the available releases and check out one using the following commands (the example is TensorFlow release 1.4, since it is the newest release):

git branch -l --remote
git checkout r1.4

Configure and Build the TensorFlow pip Package

Frank Hinek from TechPolyMath posted a script that is based on the work of Sasha Nikiforov and a Stack Overflow post, with a few minor modifications to support Anaconda environments.

Download the build_tf.sh script gist to the same directory containing the TensorFlow source:

curl -O https://gist.githubusercontent.com/frankhinek/20f8086d70886c56405fe4431f998ac4/raw/98c2ea1570d35bf20c0761390eb6d423ea387b02/build_tf.sh

Now, make the script executable:

chmod u+x build_tf.sh

Execute the script to configure and build the TensorFlow pip package using all of the CPU optimizations available for your CPU:

./build_tf.sh

During the TensorFlow configuration process, accept all the defaults by pressing the Enter key repeatedly until the configuration is complete, unless you have a specific need. This build process will take some time to complete. In my experience it averages around 30 minutes, but a system with more than 16 GB of RAM will compile faster.

The script will build a .whl file and store it in the /tmp/tensorflow_pkg directory.

Installing TensorFlow

The final step is to install the custom-built TensorFlow binary using the wheel file. The specific name of the wheel file will vary depending on the version of TensorFlow and Python in your environment. In my case, the command to install is:

pip install /tmp/tensorflow_pkg/tensorflow-1.4.1-cp36-cp36m-macosx_10_7_x86_64.whl

Verify Installation

The last step is to confirm that TensorFlow is now installed and working as expected. Before proceeding, be sure to change to a directory other than the one that contains the TensorFlow source tree; we want to import the pip-installed package, not the one in the source directory.

Launch the Python interactive shell and execute the following commands:
import tensorflow as tf
hello = tf.constant('Hello, TensorFlow!')
sess = tf.Session()
print(sess.run(hello))

If the system outputs the following, then TensorFlow is installed and working:
b'Hello, TensorFlow!'

Uninstall Everything You Don’t Need Anymore

JDK8

Take a look at this nice page documenting how to get rid of all traces of Oracle’s Java: Uninstall Java, Uninstall Java JDK

Conda Dependencies

conda uninstall openjdk
conda uninstall bazel
conda uninstall libcxx
conda uninstall libcxxabi

TensorFlow git Repository

sudo rm -rf tensorflow

And finally you’re done. Have fun with your newly compiled and optimized TensorFlow on your MacBook. As you will probably notice, it’s somewhat faster now. Personally, I wonder if MPI support would make any difference on the MacBook, so maybe I’ll try that next.

Spaghetti and Moonrockets

This is a translated version of a book chapter I wrote for German high schoolers interested in meteorology. It’s an easy introduction to the Kalman filter and, due to the addressed audience, has as little math in it as possible. For more details and further reading I suggest looking at the sources or at my post coming soon….

It’s always the same old story: You plan your birthday BBQ a week ahead of time or you want to go swimming with your friends, and even though the forecaster promised good weather for that day, it’s raining cats and dogs. Why can’t we just get a reliable weather forecast for a larger timespan, and what does the moon landing have to do with it?

Cloudy with a Chance of Spaghetti

The problems of weather forecasting are manifold; however, in my opinion there are mainly three things.

First of all, you have to think about why forecasting the weather works at all: We need a mathematical model of the weather. The model consists of functions (imagine  y=x^2 ) that describe how air moves, in order to calculate temperature, pressure, humidity, wind speed, etc. for every timestep. Finding and perfecting these functions, a system of mathematical equations, is a herculean problem in itself, which many meteorologists, physicists and mathematicians work on tirelessly. Phenomena like turbulence, aerosols and cloud formation are not thoroughly understood, but at least one has a sense of how large the corresponding errors in the equations are.

Second, there is the fundamental problem of measurement bias. By measuring, we try to estimate the state of the system (pressure, temperature, etc.) at this very moment with high accuracy. These measurements are then fed into the mathematical model. But how much do we trust our own measurements? This is a big problem, since small deviations in the beginning can have large consequences in the future. Just imagine trying to measure the temperature of a room with 50 people and a thermometer. Probably each of the 50 people will give a different answer in the second or third digit after the decimal point. Apart from this, we don’t even know how well the thermometer itself measures temperature. Figure 1 shows a time series taken from the so-called Lorenz model of 1963.

This specific model is described by a system of three “functions” inspired by rising and sinking air masses, and it behaves similarly to our atmosphere. In figure 1 you can see 50 different lines that lie very close together in the beginning. In order to get values for tomorrow or the day after tomorrow, we need to integrate the functions of our model. If you feed the 50 slightly different values as initial conditions into the model, you can see that after a short amount of time the forecasts drift apart completely. Meteorologists call this figure a spaghetti plot.
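To make this concrete, here is a minimal Python sketch of such an ensemble experiment (not from the original chapter). It integrates the Lorenz (1963) equations from 50 initial conditions that differ only by tiny perturbations, standing in for our measurement uncertainty; the step size, perturbation size and lead times are arbitrary illustration choices.

import numpy as np

# Lorenz (1963) system with the classic chaotic parameter values.
SIGMA, RHO, BETA = 10.0, 28.0, 8.0 / 3.0

def lorenz_rhs(state):
    """Right-hand side of the three Lorenz equations."""
    x, y, z = state
    return np.array([SIGMA * (y - x),
                     x * (RHO - z) - y,
                     x * y - BETA * z])

def integrate_x(state, dt=0.01, steps=2000):
    """Fixed-step RK4 integration; returns the x-component over time."""
    xs = np.empty(steps)
    for i in range(steps):
        k1 = lorenz_rhs(state)
        k2 = lorenz_rhs(state + 0.5 * dt * k1)
        k3 = lorenz_rhs(state + 0.5 * dt * k2)
        k4 = lorenz_rhs(state + dt * k3)
        state = state + (dt / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
        xs[i] = state[0]
    return xs

rng = np.random.default_rng(0)
truth = np.array([1.0, 1.0, 1.0])

# 50 ensemble members: the same initial state plus tiny perturbations,
# standing in for our uncertainty about the measured initial conditions.
ensemble = np.array([integrate_x(truth + 1e-6 * rng.standard_normal(3))
                     for _ in range(50)])

# The ensemble spread grows with lead time -- the spaghetti drift apart.
spread = ensemble.std(axis=0)
print("spread after  5 time units:", spread[499])
print("spread after 20 time units:", spread[1999])

# A probability forecast in the "40 of 50 spaghetti" spirit: the fraction
# of members predicting x > 0 at the final lead time.
print("fraction of members with x > 0:", np.mean(ensemble[:, 1999] > 0.0))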

The nice thing about spaghetti plots is that you can still use them regardless of the differing results. If, for example, 40 of 50 “spaghetti” point to warmer temperatures, you can assume with approximately  80\% certainty  \left( \frac{40}{50} \right) that it’s going to be warmer. With the spaghetti drifting apart, we get a sense of the trustworthiness of our forecasts. This whole thing is closely related to something we call the butterfly effect and chaos theory, and is a fascinating story on its own. The more mathematically inclined reader might want to check out my piece on chaos and the butterfly effect.

By the way, the uncertainty about the initial state is the reason for the false weather forecast for our birthday BBQ, and more specifically is the reason why one shouldn’t trust weather forecasts more than 3 days into the future. But there have been a lot of improvements in the past few decades, namely:

Third, there is the so-called data assimilation. This term describes how we incorporate all the measurement data we have into one big picture.

Data assimilation can be performed in space and in time. First, let’s take a look at space: Peeking at figure 2, one can see a map of the world with all known weather stations as little black dots. Looking closer, you realize that these dots are not equally distributed between ocean and land, and only the northern hemisphere has a dense station network. In order to start our weather model, we need data about the initial state at every point in space. To get these values even at locations without a station, we use a few mathematical tricks. The easiest possibilities for getting data at locations without a measurement are drawn in figure 3: One could simply draw a line (green) between data points, or one could fit a normal distribution around each data point (blue). This way we get values for every location in space, as the sketch below illustrates.
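As a toy illustration of these two tricks, here is a short one-dimensional Python sketch. The station positions, temperatures and the length scale of the Gaussian weighting are all made-up numbers; real assimilation schemes such as optimal interpolation are considerably more sophisticated.

import numpy as np

# Hypothetical 1-D example: four station positions (km) and their
# temperature readings (degrees C). All numbers are made up.
stations = np.array([0.0, 30.0, 80.0, 100.0])
temps = np.array([14.2, 15.1, 12.8, 13.5])

# Locations where the model needs an initial value but no station exists.
grid = np.linspace(0.0, 100.0, 11)

# Trick 1: "draw a line" -- piecewise-linear interpolation between stations.
linear = np.interp(grid, stations, temps)

# Trick 2: a normal distribution around each station -- a distance-weighted
# average with Gaussian weights (a crude stand-in for real schemes).
def gaussian_estimate(x, length_scale=20.0):
    weights = np.exp(-0.5 * ((x - stations) / length_scale) ** 2)
    return np.sum(weights * temps) / np.sum(weights)

gaussian = np.array([gaussian_estimate(x) for x in grid])

for x, lin, gau in zip(grid, linear, gaussian):
    print(f"x = {x:5.1f} km   linear: {lin:5.2f} C   gaussian: {gau:5.2f} C")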

Second, let’s move on to data assimilation in time. From time to time we get new information from our weather stations, and more information is always good. Wouldn’t it be great if we could grab the spaghetti, i.e. the predictions for temperature etc. from our figure 1 that have diverged too much from “truth”, and “bump” them back to the right values? That’s what the Kalman filter does.

Kálmán, We Have a Problem

In fact, this problem I just mentioned was already tackled in the 1960s. A Hungarian-American mathematician named Rudolf Kálmán developed a method to keep the Apollo capsule on its course to the moon.


Figure 1: Following the model, the spacecraft would follow the ideal green line; in reality the course is distorted by small inaccuracies, as are the measurements (red dots)

Kálmán called his technique a “filter”, and it works in two steps. First, one uses Newtonian mechanics to predict the state of the system (in this case the position of the spacecraft in space) and the magnitude of the uncertainty of this position due to effects that were not considered.

Second, one uses a weighted average to incorporate the latest measured data (of course those are biased, too). For Kálmán those were on-board measurements of the Apollo capsule; for weather prediction purposes those would be new temperature measurements.

In order to understand this properly, let’s create a small example with a spacecraft moving on a number line between 0 and 1. Let’s say that the position predicted by our model (Newtonian mechanics) is at  \frac{1}{4} , while the measured position is at  \frac{3}{4} .


Figure 2: Simple Example on the number line. At a given moment in time the spacecraft is located between 0 and 1.

The Kalman gain  K is a number that expresses the confidence we have in our measurement relative to the prediction of the model.  K = 1 means “we are absolutely sure that our measurements are correct”, whereas  K = 0 means “we should place all our bets on the prediction, the measurement is rubbish”. Using the Kalman gain  K we can calculate a weighted average:

 Estimate_x = \frac{3}{4} K + \frac{1}{4}(1-K)

If one believes equally in the model and the measurement ( K = \frac{1}{2} ), one would take the average of both values:

 Estimate_x = \frac{\frac{3}{4} + \frac{1}{4}}{2} = \frac{1}{2} = 0.5

This seems to be a perfectly reasonable choice; however, the Kalman filter can do more. It considers not only the uncertainty about the prediction, but also the error we make when measuring, and calculates the optimal value for the relative confidence  K . If you trust the prediction more than the measurements, it gives more weight to the prediction. And if the measurements are more plausible, it gives those priority.

An example of a weighted average where the measurement is considered to be more reliable would be  K = \frac{2}{3} . In that case the filter calculates an estimate that is closer to the observed position:

 Estimate_x = \frac{2}{3} \cdot \frac{3}{4} + \frac{1}{3} \cdot \frac{1}{4} \simeq 0.58
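For readers who like code, here is a minimal sketch of this update step in Python. The variances are invented illustration values, chosen so that the prediction is twice as uncertain as the measurement, which reproduces the gain  K = \frac{2}{3} from above; a full Kalman filter would also include the prediction step.

# A minimal one-dimensional Kalman update for the number-line example.
# The variances are invented for illustration: the prediction is assumed
# twice as uncertain as the measurement, which yields K = 2/3.

def kalman_update(x_pred, p_pred, z, r):
    """Combine a prediction and a measurement into a weighted estimate.

    x_pred: predicted position, p_pred: variance of the prediction,
    z: measured position,       r: variance of the measurement.
    """
    k = p_pred / (p_pred + r)          # Kalman gain: relative confidence
    x_est = x_pred + k * (z - x_pred)  # the weighted average from above
    p_est = (1.0 - k) * p_pred         # combined estimate is more certain
    return x_est, p_est, k

x_est, p_est, k = kalman_update(x_pred=0.25, p_pred=2.0, z=0.75, r=1.0)
print(f"gain K = {k:.3f}, estimate = {x_est:.3f}")  # K = 0.667, estimate = 0.583

Note that the updated variance is smaller than both input variances: combining two imperfect pieces of information leaves us more certain than either one alone.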


Figure 3: The Kalman filter compares the prediction of the model (Newtonian mechanics, blue) to on-board measurements (red) in order to get a better estimate of the true position (green, unknown)

Not only does the filter provide a good estimate, but the calculation thereof is simple enough to be performed in real time on a calculator. Everything that is needed to generate the estimate is the prediction and the latest measurement. Every computation for the moon landing had to be performed on an on-board computer less powerful than your average school calculator.

With this very same technique that helped land on the moon, we can bump our “spaghetti” towards the true value, and this has improved weather forecasting tremendously.

A Filter for Every Eventuality

Different implementations of this Kalman filter are used in modern weather forecasting today and have proved to be very useful. Additionally, these filters are also used to keep your car navigation system on course if the connection to one of the satellites drops, or to “filter” the noise from an audio file. What started with the journey of three men to the moon is now part of our daily life.

Sources and Literature

  • Wikipedia, keyword “Rudolf Kálmán”, as of January 16, 2017, 15:22, https://en.wikipedia.org/wiki/Rudolf_E._Kálmán
  • Wikipedia, keyword “Kalman filter”, as of January 27, 2017, 22:06, https://en.wikipedia.org/wiki/Kalman_filter
  • Evensen, G. (2003). The ensemble Kalman filter: Theoretical formulation and practical implementation. Ocean Dynamics 53, 343–367.
  • Jones, D. (2014). University of Toronto, PHY2506: Data Assimilation (lecture material).
  • Kucharski, A., “Understanding the unseen”, +plus magazine, as of March 31, 2016, 14:25, available at https://plus.maths.org/content/understanding-unseen
  • Lorenz, E. N. (1963). Deterministic nonperiodic flow. Journal of the Atmospheric Sciences 20, 130–141.
  • Vose, R. S., et al. (1992). The Global Historical Climatology Network: Long-term monthly temperature, precipitation, and pressure data. No. CONF-930133-2. Oak Ridge National Laboratory, TN (United States).