Thursday, March 15, 2018

TensorFlow 1.6.0 on Mac OS X 10.11 (El Capitan) with MacPorts

We should start with Xcode. If you do any coding on a Mac you ultimately need this really horrible software. Get it from the App Store, or wherever bad software goes to rest. My version is:


$xcodebuild -version
Xcode 8.0

We continue with macports.


MacPorts is an easy-to-use package manager for installing open source software on Mac OS. It makes downloading and installing easy, and I like that. It works from the command line and keeps everything it installs under one well-contained local prefix, which is great because it does not mess with other builds on the machine. (Apple's own package manager/repository could have served this purpose, but perhaps they didn't see any value in software they cannot sell, who knows...)


OK, I am on Mac OS X El Capitan (10.11), so the first thing is downloading the right package for this system. Follow the instructions here: https://guide.macports.org/chunked/installing.macports.html

My MacPorts package is: MacPorts-2.4.2-10.11-ElCapitan.pkg

I double click and install without touching any default locations. An important thing to remember is that MacPorts puts all its installations under "/opt/local" by default. This is not a user-specific location; it is shared by all users, so in order to make modifications there you need to be a superuser. (If you want to see all default locations and the hierarchy: https://guide.macports.org/chunked/porthier.html )


The main MacPorts command is "port":

$port version # tells you which version of macports you have
$port search python # for example, allows you search the macport repository (aka port tree) for software that is related to python. 


MacPorts holds 'Portfiles', which are basically descriptions of software: download locations, dependency lists and other installation instructions. Before you can do the search above, you need to update the MacPorts ports tree. To do this, MacPorts tries to connect to an rsync server, which is not allowed on my network, so I need to tell MacPorts to use the daily tarball snapshot of the ports tree instead, by changing the source location:

$sudo vim /opt/local/etc/macports/sources.conf 


At the end of the file, comment out the rsync address:
#rsync://rsync.macports.org/macports/release/tarballs/ports.tar [default]
and add the https tarball address instead:
https://distfiles.macports.org/ports.tar.gz [default]

Apparently this is a common issue for many people so they made a FAQ entry for it here: https://trac.macports.org/wiki/howto/PortTreeTarball
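
If you would rather not edit the file by hand, the same change can be sketched in a few lines of Python. This is only a rough illustration, not anything MacPorts ships: the function name switch_to_tarball is made up for this post, it works on the text of the file rather than on /opt/local/etc/macports/sources.conf itself, and writing the result back would still need sudo.

```python
# Sketch: comment out active rsync sources and add the https tarball source.
# switch_to_tarball is a hypothetical helper name, not part of MacPorts.

TARBALL_SOURCE = "https://distfiles.macports.org/ports.tar.gz [default]"


def switch_to_tarball(conf_text):
    """Return sources.conf text with rsync lines commented out."""
    out = []
    for line in conf_text.splitlines():
        if line.strip().startswith("rsync://"):
            out.append("#" + line)  # disable the rsync source
        else:
            out.append(line)        # keep comments and other lines as they are
    if TARBALL_SOURCE not in out:
        out.append(TARBALL_SOURCE)  # add the tarball source only once
    return "\n".join(out) + "\n"


sample = (
    "# MacPorts system wide sources configuration file\n"
    "rsync://rsync.macports.org/macports/release/tarballs/ports.tar [default]\n"
)
print(switch_to_tarball(sample))
```

Running the function a second time on its own output changes nothing, so it is safe to apply more than once.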


Now that macports has the right address, we can update its port tree:



$sudo port -d sync
--->  Updating the ports tree
Synchronizing local ports tree from https://distfiles.macports.org/ports.tar.gz
...

Next, I get what I need for my tensorflow build: python3.6, virtualenv, and tf itself. Why the first two:
Py3.6: My collaborators use py3.6, so that is why I am sticking with it.
Virtualenv: Mac already comes with some python version, as apparently many of its apps use python scripts behind the scenes. But I wanted a build environment where I feel confident breaking things and rebuilding them, without worrying about messing up the system. So I would like to work in the well-contained world of a virtual environment. Also, I might want to have different tensorflow versions with different python version dependencies in the future, so I do not want a system-wide tensorflow yet. One that is contained in a virtual environment is enough for me.
OK, so here I continue:

$port search python36
$sudo port -v install python36
$sudo port select --set python3 python36
So I install Python 3.6 through MacPorts and select it, so that python36 is what gets called when something asks for 'python3'. However, note that this only takes effect within the MacPorts space; the plain 'python' command still belongs to Apple. For example, I have py26 and py27 as system Python by default, thanks to Apple:

$which python
/usr/bin/python 
$python --version

Python 2.7.10

And macports knows about these versions I have:

$port select --list python
Available versions for python:
        none (active)
        python26-apple
        python27-apple
        python36


But as you see, it does not know which one I prefer within the MacPorts environment, for example when it needs to execute some installation instruction for a port. Which is OK with me, though. I am interested in the python3 command, which I set earlier via port select --set:

$ port select --list python3
Available versions for python3:
        none
        python36 (active)

$ which python3
/opt/local/bin/python3
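
Why does 'which python3' land in /opt/local/bin? Because the MacPorts installer puts /opt/local/bin at the front of PATH, and the shell takes the first match. Here is a minimal sketch of that lookup rule, assuming a POSIX system. The two directories are throwaway stand-ins created in a temp folder (not the real /opt/local/bin and /usr/bin), and first_on_path is a helper name I made up.

```python
import os
import shutil
import stat
import tempfile


def first_on_path(command, directories):
    """Return the first executable called `command` in the given directories."""
    search_path = os.pathsep.join(directories)
    return shutil.which(command, path=search_path)


with tempfile.TemporaryDirectory() as root:
    # Stand-ins for /opt/local/bin and /usr/bin, each holding a fake python3.
    macports_bin = os.path.join(root, "opt_local_bin")
    system_bin = os.path.join(root, "usr_bin")
    for directory in (macports_bin, system_bin):
        os.makedirs(directory)
        fake = os.path.join(directory, "python3")
        with open(fake, "w") as handle:
            handle.write("#!/bin/sh\n")
        os.chmod(fake, os.stat(fake).st_mode | stat.S_IXUSR)

    # The directory listed first wins, just like the first entry in PATH.
    found = first_on_path("python3", [macports_bin, system_bin])
    print(found)
```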

So, all good. python3 is who it should be. Moving on: virtualenv

$port search virtualenv
$sudo port -v install py36-virtualenv

$sudo port select --set virtualenv virtualenv36
$which virtualenv
/opt/local/bin/virtualenv

$virtualenv --version

15.1.0

All good. Now virtualenv is also installed, and available to all users. 

So far I have been making installations in '/opt' space, hence with sudo. Now I will install tensorflow inside a virtual environment, under my home. 

$cd 
$virtualenv --system-site-packages -p python3 myTFEnv_36

This creates a directory with that name under my home. The --system-site-packages option lets the environment also use the packages already installed in the interpreter's site-packages, and -p python3 tells it to use the system's python3, which in my case is python36.

$cd myTFEnv_36
$source ./bin/activate # this is actually when the virtual environment becomes active. You will see the terminal prompt change when you do that:
(myTFEnv_36)

The beauty of this is I can have many TF versions with different python versions and use them with as little effort as changing directories. (Remember to source the activate files to activate the virtual environment)
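
If you ever lose track of whether the environment is active, you can ask python itself. A small sketch, assuming the classic virtualenv behaviour of setting sys.real_prefix (newer virtualenv releases and the stdlib venv module use sys.base_prefix instead, so both are checked); in_virtual_environment is just a name I picked.

```python
import sys


def in_virtual_environment():
    """Best-effort check for an active virtualenv or venv."""
    # Classic virtualenv sets sys.real_prefix inside the environment.
    if hasattr(sys, "real_prefix"):
        return True
    # venv and newer virtualenv make sys.base_prefix differ from sys.prefix.
    return getattr(sys, "base_prefix", sys.prefix) != sys.prefix


print(sys.executable)            # which interpreter is actually running
print(in_virtual_environment())  # True when run from an activated environment
```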

OK, cool, but we still don't have tensorflow. Hurry:
We have a python package manager (pip) in the virtual environment. Let's see its version, 'coz the TF people say it should be 8.1 or later.

$ pip -V 
pip 9.0.1 from /Users/epsilon/myTFEnv_36/lib/python3.6/site-packages

$which pip
/Users/epsilon/myTFEnv_36/bin/pip
This also shows that the pip inside the virtualenv belongs to the python3.6 we asked it to use.
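
A side note on comparing versions: do it number by number, not as text, otherwise "10.0" looks smaller than "8.1". A minimal sketch with made-up helper names (version_tuple, pip_is_recent_enough), taking the 8.1 minimum from the TF guideline mentioned above:

```python
def version_tuple(text):
    """Turn '9.0.1' into (9, 0, 1); stops at the first non-numeric piece."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if ch.isdigit():
                digits += ch
            else:
                break
        if digits == "":
            break
        parts.append(int(digits))
    return tuple(parts)


def pip_is_recent_enough(pip_version, minimum="8.1"):
    """Compare versions numerically instead of as strings."""
    return version_tuple(pip_version) >= version_tuple(minimum)


# The version is the second word of the `pip -V` output shown above.
pip_output = "pip 9.0.1 from /Users/epsilon/myTFEnv_36/lib/python3.6/site-packages"
print(pip_is_recent_enough(pip_output.split()[1]))
```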

Good, we have the right version of pip. Moving on to tf:

$pip install tensorflow
..
Downloading tensorflow-1.6.0-cp36-cp36m-macosx_10_11_x86_64.whl
..
Installing collected packages: termcolor, astor, werkzeug, six, html5lib, bleach, markdown, numpy, protobuf, tensorboard, absl-py, gast, grpcio, tensorflow
..
..
..
Successfully installed absl-py-0.1.11 astor-0.6.2 bleach-1.5.0 gast-0.2.0 grpcio-1.10.0 html5lib-0.9999999 markdown-2.6.11 numpy-1.14.1 protobuf-3.5.2 six-1.11.0 tensorboard-1.6.0 tensorflow-1.6.0 termcolor-1.1.0 werkzeug-0.14.1
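
By the way, the wheel file name in the pip output tells you which python and which platform the binary was built for. The standard wheel naming is name-version-pythontag-abitag-platformtag.whl (with an optional build tag after the version). Here is a small sketch that splits it up; parse_wheel_filename and WheelName are my own illustrative names, not a pip API.

```python
from collections import namedtuple

WheelName = namedtuple(
    "WheelName", "distribution version python_tag abi_tag platform_tag"
)


def parse_wheel_filename(filename):
    """Split a wheel file name into its dash-separated tags."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel file name: " + filename)
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 6:
        del parts[2]  # drop the optional build tag
    if len(parts) != 5:
        raise ValueError("unexpected wheel file name: " + filename)
    return WheelName(*parts)


wheel = parse_wheel_filename("tensorflow-1.6.0-cp36-cp36m-macosx_10_11_x86_64.whl")
print(wheel.python_tag)    # the CPython version the wheel targets
print(wheel.platform_tag)  # the OS release and architecture it was built for
```

So cp36 here means CPython 3.6, which is a quick way to check that pip picked a build matching the interpreter in the virtualenv.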

Voila, we have tf. Validate it following the TF guideline:
$ python
Python 3.6.4 (default, Dec 21 2017, 20:32:22)
>>> import tensorflow as tf
>>> hello = tf.constant('Hello, TensorFlow!')
>>> sess = tf.Session()
>>> print(sess.run(hello))
b'Hello, TensorFlow!'

For some other tf related work, you will also need python's data analysis toolkit thingy, pandas, so get it now as well:

$pip install pandas
Collecting pandas
Downloading pandas-0.22.0-cp36-cp36m-macosx_10_6_intel.macosx_10_9_intel.macosx_10_9_x86_64.macosx_10_10_intel.macosx_10_10_x86_64.whl (14.9MB)
..


Let's say you want to run the getting started examples from the TensorFlow guide. Then carry on with:
$git clone https://github.com/tensorflow/models
$cd models/samples/core/get_started/
$python premade_estimator.py

You have run your first tf script on your mac :) 

===
Troubleshooting:

1-At first I tried this whole thing with py3.5, but I quickly noticed that the tensorflow packaged as the py3.5 version on MacPorts was actually a py36 build, just named wrongly. And I couldn't find a real py3.5 source on MacPorts either. Maybe the tf compiled for py3.5 has a bug? I don't really know. I just used the tf for py36 to get around this.

2-"tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA"
When you run tf.Session() in the validation example above, you might get this message. It tells you that your CPU could run tf faster, but the binary was not compiled to use those instructions. I didn't need this optimization, so I decided not to worry about it right now. But as soon as I upgrade to Sierra I will update the tf version to take advantage of such speedups, thanks to the kindly provided precompiled builds here: https://github.com/lakshayg/tensorflow-build (see also the related http://www.andrewclegg.org/tech/TensorFlowLaptopCPU.html and https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/platform/cpu_feature_guard.cc )
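
If the message just bothers you, it can be silenced with the TF_CPP_MIN_LOG_LEVEL environment variable. A minimal sketch, assuming the message is logged at INFO or WARNING level in your build (level "2" hides both); the important part is that the variable is set before tensorflow gets imported.

```python
import os

# Must happen before `import tensorflow`; the C++ logging reads it at load time.
# "0" shows everything, "1" hides INFO, "2" hides INFO and WARNING.
os.environ["TF_CPP_MIN_LOG_LEVEL"] = "2"

# Inside the virtualenv you would now continue with:
# import tensorflow as tf
# sess = tf.Session()
print(os.environ["TF_CPP_MIN_LOG_LEVEL"])
```

Note that this only hides the message. It does not make tf any faster; for that you still need a binary compiled with those instructions.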

3-This is not tensorflow related but still: At some point I needed to make some plots to see the tensorflow results and realized I was lacking a few packages for it. My go-to visualizer for this would have been xmgrace or gnuplot, but I decided to give python plotting a chance. I'll just share in case someone finds it useful:
I did "pip install matplotlib" inside the virtualenv but kept getting an import error: "ImportError: no module named .." when I tried to import matplotlib. Then I installed it via MacPorts and opened a new clean virtualenv, but that alone did not solve the issue either. This time the module was found, though; the problem was the backend. The default macOS backend kept not working for some reason. So I decided to use another backend, "TkAgg", whose only dependency was the tk framework:
$sudo port install py36-tkinter
did the job to a great extent:
$python
Python 3.6.4 (default, Dec 21 2017, 20:32:22) 
>>> import matplotlib as mpl
>>> mpl.use('TkAgg')   # this is where I change the backend

>>> import matplotlib.pyplot as plt

So that was successfully imported and I thought all was done, but then I tried to make a scatter plot to see my results:

>>> import pandas as pd
>>> import matplotlib
>>> matplotlib.use('TkAgg')
>>> import matplotlib.pyplot as plt
>>> df=pd.read_csv(r'results_1.csv',header=None)
>>> df.columns=['x','y']

>>> plt.scatter(df['x'],df['y'])
_tkinter.TclError: no display name and no $DISPLAY environment variable

Ouch! Just when I thought it was all done I realized I don't even have an X server! So python was simply not able to open a new window to draw the plot in. I installed XQuartz, the X11 server for macOS, from
https://www.xquartz.org/
I restarted the machine, because relaunching from the command line did not work:
$launchctl load -w /Library/LaunchAgents/org.macosforge.xquartz.startx.plist

And tried it all out again:
>>> plt.scatter(df['x'],df['y'])
returned:
<matplotlib.collections.PathCollection object at 0x10f846dd8>
which got me worried for a second thinking it was an error, but it is just python echoing the returned object, including its address in memory. So I moved on:
>>> plt.show()
and I could finally see the results. 
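
If you cannot (or do not want to) install an X server, another way out is matplotlib's non-interactive "Agg" backend, which draws into a file instead of a window. Below is a minimal sketch of picking the backend from the environment, assuming, as in the error above, that the Tk backend needs $DISPLAY to be set. choose_backend is a hypothetical helper of mine and the output file name in the comments is just an example.

```python
import os


def choose_backend(environ=None):
    """Pick 'TkAgg' when an X display is available, otherwise 'Agg'."""
    env = os.environ if environ is None else environ
    return "TkAgg" if env.get("DISPLAY") else "Agg"


print(choose_backend())

# Inside the virtualenv, with matplotlib installed, it would be used like this:
# import matplotlib
# matplotlib.use(choose_backend())       # must come before importing pyplot
# import matplotlib.pyplot as plt
# plt.scatter([1, 2, 3], [2, 4, 6])
# plt.savefig("results_1.png")           # with Agg, save to a file instead of plt.show()
```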

===
Edit:
I have been testing some algorithms with tf on this Mac laptop for a few days now.
I like it so far: it works well enough for local tests, and I have no performance complaints yet. I like tensorflow also. But to be very honest, I dearly miss my Linux workstation, and in general coding in Fortran, where it was all much simpler and more transparent somehow.
