Jazoon 2017 "AI meets Developers" Conference Review

Hi, I had the opportunity to participate in this conference in Zurich on 27 October 2017 and attend the following sessions:

  • Build Your Intelligent Enterprise with SAP Machine Learning
  • Applied AI: Real-World Use Cases for Microsoft’s Azure Cognitive Services
  • Run Deep Learning models in the browser with JavaScript and ConvNetJS
  • Using messaging and AI to build novel user interfaces for work
  • JVM based DeepLearning on IoT data with Apache Spark
  • Apache Spark for Machine Learning on Large Data Sets
  • Anatomy of an open source voice assistant
  • Building products with TensorFlow

Most of the sessions were recorded and are available here:


The first session was more of a sales/pre-recorded demo presentation of SAP's AI capabilities, mainly in their cloud:


But it included some interesting ideas, like the Brand Impact video analyzer that computes how much airtime is filled by specific brands inside a video:


Another good use case is automatic defective-product recognition using the image similarity distance API:


The second session was about the new AI capabilities offered by Microsoft and was divided into two parts:

Capabilities for data scientists who want to build their own Python models:

  • Azure Machine Learning Workbench, an Electron-based desktop app that mainly accelerates data preparation tasks using a "learn by example" engine that generates data preparation code on the fly.


  • Azure Notebooks, a free but limited cloud-based Jupyter Notebook environment to share and re-use models/notebooks.


  • Azure Data Science Virtual Machine, a pre-built VM with the most common data science packages (TensorFlow, Caffe, R, Python, etc.).


Capabilities (e.g. face/age/sentiment/OCR/handwriting detection) for developers who want to consume Microsoft's pre-trained models by calling the Cognitive Services APIs directly.
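For example, here is a minimal C# sketch of calling one of these pre-trained models, the Text Analytics sentiment endpoint, with a plain HTTP request (the region, key and sample text below are placeholders, not something shown at the conference):

using System;
using System.Net.Http;
using System.Text;
using System.Threading.Tasks;

class SentimentDemo
{
    // Placeholder endpoint and key: use the region and key of your own Cognitive Services resource.
    const string Endpoint = "https://westeurope.api.cognitive.microsoft.com/text/analytics/v2.0/sentiment";
    const string SubscriptionKey = "<your-text-analytics-key>";

    static async Task Main()
    {
        using (var client = new HttpClient())
        {
            client.DefaultRequestHeaders.Add("Ocp-Apim-Subscription-Key", SubscriptionKey);

            // The API scores a batch of documents; here we send a single one.
            var body = "{\"documents\":[{\"id\":\"1\",\"language\":\"en\",\"text\":\"The conference was great!\"}]}";
            var response = await client.PostAsync(Endpoint, new StringContent(body, Encoding.UTF8, "application/json"));

            // The response contains a sentiment score between 0 (negative) and 1 (positive) per document.
            Console.WriteLine(await response.Content.ReadAsStringAsync());
        }
    }
}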



The third session was more of an "educational presentation" about deep learning and how, at a high level, a deep learning system works; however, the talk covered some interesting topics:

  • The existence of several pre-trained models that can be used as-is, especially for featurization and/or transfer learning


  • How to visualize neural networks with web sites like http://playground.tensorflow.org
  • A significant number of demos showcasing DNN applications that run directly in the browser

The fourth session was also interesting, because the speaker clearly explained the possibilities and limits of the current application development landscape, and in particular of enterprise bots.


Key takeaway: bots are far from being smart, and people don't want to type text.

Suggested approach: bots are new apps that reach their "customers" in the channels they already use (Slack, for example), and these new apps, by using the context and the channel's functionality, have to extend and at the same time simplify the IT landscape.


Example: a bot in a Slack channel notifies a manager of an approval request, and the manager can approve/deny directly in Slack without leaving the app.
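A minimal sketch of that kind of reply using the Bot Framework v3 C# SDK (the request text and button values are made up for illustration):

using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using Microsoft.Bot.Connector;

public static class ApprovalReply
{
    // Reply to an incoming activity with a card carrying Approve/Deny buttons,
    // so the manager can answer without leaving the channel (Slack, Teams, ...).
    public static async Task SendApprovalCardAsync(Activity incoming)
    {
        var connector = new ConnectorClient(new Uri(incoming.ServiceUrl));
        var reply = incoming.CreateReply("Expense report #1234 needs your approval.");

        var card = new HeroCard
        {
            Title = "Approval request",
            Buttons = new List<CardAction>
            {
                new CardAction(ActionTypes.ImBack, "Approve", value: "approve 1234"),
                new CardAction(ActionTypes.ImBack, "Deny", value: "deny 1234")
            }
        };

        reply.Attachments = new List<Attachment> { card.ToAttachment() };
        await connector.Conversations.ReplyToActivityAsync(reply);
    }
}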

The fifth and the sixth talks were rather technical/educational, covering specific frameworks (IBM SystemML for Spark) and model portability (PMML), with some good points around hyperparameter tuning using a Spark cluster in iterative mode and DNN autoencoders.



The seventh talk was about the open source voice assistant MyCroft and the related open source device schematics.

The session consisted mainly of live demos showcasing several open source libraries that can be used to create a device with Alexa-like capabilities:

  • PocketSphinx for speech recognition
  • Padatious for NLP intent detection
  • Mimic for text-to-speech
  • Adapt for intent parsing


The last session was about TensorFlow, but also more generally about experiences with AI at Google, like how ML is used today:


And how machine learning is fundamental today, with quotes like these:

  • Remember in 2010, when the hype was mobile-first? Hype was right. Machine Learning is similarly hyped now. Don’t get left behind
  • You must consider the user journey, the entire system. If users touch multiple components to solve a problem, transition must be seamless

Other pieces of advice were about finding talent and maintaining/growing/spreading ML inside your organization:

How to hire ML experts:

  1. don’t ask a Quant to figure out your business model
  2. design autonomy
  3. $$$ for compute & data acquisition
  4. Never done!

How to Grow ML practice:

  1. Find ML Ninja (SWE + PM)
  2. Do Project incubation
  3. Do ML office hours / consulting

How to spread the knowledge:

  1. Build ML guidelines
  2. Perform internal training
  3. Do open sourcing

And on ML algorithms project prioritization and execution:

  1. Pick algorithms based on the success metrics & data you can get
  2. Pick a simple one and invest 50% of time into building quality evaluation of the model
  3. Build an experiment framework for eval & release process
  4. Feedback loop

Overall the quality was good, even if I was really disappointed to discover in the morning that one of the most interesting sessions (with the legendary George Hotz!) had been cancelled.


Helping Troy Hunt for fun and profit

Hi everyone, I'm a huge fan of the security expert Troy Hunt and of his incredible (free!) service haveibeenpwned.com (if you don't know it, please use it now to test whether your email accounts have been compromised!).


Now Troy has created a contest where you can actually win a shiny Lenovo laptop if you create something "new" that helps people become more aware of the security risks related to pwned accounts.

I decided to participate, and my idea is the following: help all the people who have Gmail (and Hotmail/Outlook/Office 365, in alpha version!) accounts to verify whether their friends, colleagues and family members have compromised email accounts.

I uploaded the code and executables here, and I strongly suggest you read the readme instructions ENTIRELY to understand how the tool works, what results to expect and what you can do.
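To give an idea of what happens behind the scenes, here is a minimal C# sketch of the kind of check the tool performs against the Have I Been Pwned v2 API (the user agent string is just a placeholder, and remember the service rate-limits requests, so pause between calls when checking many contacts):

using System;
using System.Net;
using System.Net.Http;
using System.Threading.Tasks;

static class PwnageCheck
{
    static readonly HttpClient Client = CreateClient();

    static HttpClient CreateClient()
    {
        var client = new HttpClient();
        // HIBP requires a descriptive User-Agent header.
        client.DefaultRequestHeaders.UserAgent.ParseAdd("pwnage-checker-demo");
        return client;
    }

    // Returns true if the address appears in at least one known breach.
    public static async Task<bool> IsPwnedAsync(string email)
    {
        var url = "https://haveibeenpwned.com/api/v2/breachedaccount/" + Uri.EscapeDataString(email);
        var response = await Client.GetAsync(url);

        if (response.StatusCode == HttpStatusCode.NotFound)
            return false; // not found in any known breach

        response.EnsureSuccessStatusCode();
        return true; // the body contains the list of breaches as JSON
    }
}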

Regardless of whether I win the laptop or not, I have already won, because thanks to this tool I was able to alert my wife and some of my friends of the danger and had the right "push" to convince them to set up two-factor authentication.

If you want to donate for this effort, please donate directly to Troy here; he deserves a good beer!


A massive adventure ends…

Hi everyone, my journey at Microsoft is ending, and I want to use this opportunity to revisit many of the good moments I spent in this great company.

In my first days it was like entering a new solar system where you have to learn all the names of the planets, moons and orbits at warp speed,

but at the same time I had all the support of my manager, my teammates and my mentor!

Very quickly it was time to get into action with my team and start working with clients, partners, developers and colleagues from different continents, teams and specializations.

We worked on labs and workshops helping customers and partners to discover and ramp up on Microsoft Advanced Analytics capabilities

and with customer engagements helping our clients to exploit all the possibilities that the Azure Cloud can provide to help them in reaching their goals

I certainly cannot forget the incredible Microsoft TechReady event in Seattle:

and all the talented colleagues I had the opportunity to work with.

Finally, I want to say a massive thank you to my v-team; I will miss you!!

As one adventure ends, I am really excited to bring my passion and curiosity into a new one, joining a customer.

I can’t wait to learn more, improve and be part of the massive digital transformation journey that is in front of us.




AI is progressing at incredible speed!

Several people tend to think that all the new AI technologies like convolutional neural networks, recurrent neural networks, generative adversarial networks, etc. are used mainly by tech giants like Google, Microsoft, etc.; in reality, many enterprises such as Zalando, Instacart and many others are already using deep learning in production. Well-known deep learning frameworks like Keras, TensorFlow, CNTK, Caffe2, etc. are now finally reaching a larger audience.

Big data engines like Spark are finally able to drive deep learning workloads as well, and the first steps to make large deep neural network models fit inside small-CPU/low-memory, occasionally connected IoT devices are coming.

Finally, new hardware has been built specifically for deep learning:

  1. https://www.microsoft.com/en-us/research/blog/microsoft-unveils-project-brainwave/
  2. https://cloud.google.com/blog/big-data/2017/05/an-in-depth-look-at-googles-first-tensor-processing-unit-tpu
  3. https://www.nvidia.com/en-us/data-center/volta-gpu-architecture/

But the AI space never sleeps, and we are already seeing new solutions/frameworks and architectures arriving that are currently under development:

Ray: a new distributed execution framework that aims to replace the well-known Spark!

PyTorch and the fast.ai framework, which want to compete with and beat all the existing deep learning frameworks.

To overcome one of the biggest problems in deep learning, the amount of training data required, a new framework called Snorkel has been designed to create brand new training data with little human interaction.

Finally, to improve the integration and performance of deep learning models with the applications that want to consume them, a new prediction serving system called Clipper has been designed.

The speed of AI evolution is incredible; be prepared to see much more than this in the near future!

How to create the perfect Matchmaker with Bot Framework and Cognitive Services

Hi everyone, this time I wanted to showcase some of the many capabilities of Microsoft Cognitive Services using a "cupido" bot built with the Microsoft Bot Framework.

So what is the plan? Here are some ideas:

  • Leverage only Facebook as the channel! Why? Because with Facebook you have people already "logged in", and you can use the Messenger profile API to automatically retrieve the user's details and, more importantly, their Facebook photo!
  • Since the Facebook photo is usually an image with a face, we can use it with the Vision and Face APIs to understand gender, age and a bunch of other interesting info without any user interaction!
  • With a Custom Vision model that we trained on some publicly available images, we can score whether a person looks like a super model or not 😉
  • Using all this info (age, gender, makeup, sunglasses, super model or not, hair color, etc.) collected with those calls, we can decide which candidates in our database are the right ones for our user and display the ones that fit according to our demo rules.

Of course, at the beginning our database of profiles will be empty, but with the help of friends/colleagues we can quickly fill it and have fun during the demo.

So in practice, what does it look like?

Here is the first interaction: after we say hello, the bot immediately personalizes the experience with our Facebook data (photo, first name, last name, etc.) and asks if we want to participate in the experiment:

After we accept, it uses the described APIs to understand the image and calculate age, hair, super model score, etc.
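Behind that step is essentially a single Face API call; here is a minimal C# sketch (the region and key are placeholders, and the attribute list mirrors the info the bot uses):

using System.Net.Http;
using System.Text;
using System.Threading.Tasks;

static class FaceLookup
{
    // Placeholder endpoint/key: use the region and key of your own Face API resource.
    const string Endpoint =
        "https://westeurope.api.cognitive.microsoft.com/face/v1.0/detect" +
        "?returnFaceAttributes=age,gender,hair,glasses,makeup";
    const string Key = "<your-face-api-key>";

    public static async Task<string> DescribeAsync(string photoUrl)
    {
        using (var client = new HttpClient())
        {
            client.DefaultRequestHeaders.Add("Ocp-Apim-Subscription-Key", Key);

            // The Face API accepts either binary image data or a JSON body with a public image URL.
            var body = "{\"url\":\"" + photoUrl + "\"}";
            var response = await client.PostAsync(Endpoint, new StringContent(body, Encoding.UTF8, "application/json"));
            response.EnsureSuccessStatusCode();

            // Returns a JSON array with one entry per detected face, including the requested attributes.
            return await response.Content.ReadAsStringAsync();
        }
    }
}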

Yeah, I know my super model score is not really good, but let’s see if there are any matches for me anyway….

Of course the bot is smart enough to display my wife's profile, otherwise I would have been in really big trouble :-).

Now I guess many of you have this question: how is the super model score calculated?

Well, I trained the Microsoft Custom Vision service with 30+ photos of real models and 30+ photos of "normal people", and after 4 iterations I already had 90% accuracy in detecting super models in photos 😉
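Scoring a new photo against that trained model is then just another REST call; here is a minimal C# sketch, assuming you copy the prediction URL and key that the Custom Vision portal shows for your own project (both values below are placeholders):

using System.Net.Http;
using System.Text;
using System.Threading.Tasks;

static class SuperModelScore
{
    const string PredictionUrl = "<prediction endpoint copied from the Custom Vision portal>";
    const string PredictionKey = "<your-prediction-key>";

    public static async Task<string> ScoreAsync(string photoUrl)
    {
        using (var client = new HttpClient())
        {
            client.DefaultRequestHeaders.Add("Prediction-Key", PredictionKey);

            // Send the public URL of the image to classify; the response contains
            // a probability per tag (e.g. "super model" vs "normal").
            var body = "{\"Url\":\"" + photoUrl + "\"}";
            var response = await client.PostAsync(PredictionUrl, new StringContent(body, Encoding.UTF8, "application/json"));
            response.EnsureSuccessStatusCode();

            return await response.Content.ReadAsStringAsync();
        }
    }
}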

Of course there are several things to consider here:

  1. The subject should be the main focus of the picture
  2. Have sufficiently diverse images, angles, lighting and backgrounds
  3. Train with images that are similar (in quality) to the images that will be used for scoring

And for sure the super model pics have higher resolution, better lighting and better exposure than the photos of "normal" people like you and me, but for the purposes of this demo the results were very good.

Another consideration is that you don't always have to use natural language processing in your bots (in our case, in fact, we skipped LUIS), because, especially if you are not developing a Q&A/support bot, users prefer buttons and having a minimal amount of info to provide.

Imagine a bot that handles your Netflix subscription: you just want buttons like activate/deactivate membership (if you go on vacation) and "recommendations for tonight".

Another important thing to consider is bot analytics and understanding how your bot is performing; I use this great tool, which under the covers uses Azure Application Insights:

If instead you are in love with statistics, you can try this Jupyter notebook with the following template to analyze the Azure Application Insights metrics and events with your custom code.

If you want to try the bot with all the telemetry setup already done, you can grab, compile and try the demo code (do not use this code in any production environment) that is available here, and if this is your first bot, start from this tutorial to understand a bit of the various pieces needed.

How to generate Terabytes of IoT data with Azure Data Lake Analytics

Hi everyone, during one of my projects I’ve been asked the following question:

I'm currently storing my IoT sensor data in Azure Data Lake for analysis and feature engineering, but I still have very few devices, so not a big amount of data, and I'm not able to understand how fast my queries and transformations will be when, with many more devices and months/years of sensor data, my data lake grows to several terabytes.

Well, in that case let's quickly generate those terabytes of data using U-SQL capabilities!

Let’s assume that our data resembles the following:

deviceId, timestamp, sensorValue, …….

So for each IoT device we have a unique identifier called deviceId (let's assume it is a combination of numbers and letters), a timestamp with millisecond precision indicating when the IoT event was generated, and finally the values of the sensors at that moment (temperature, speed, etc.).

The idea is the following: given a real deviceId, generate N "synthetic deviceIds" that all have the same data as the original device. So if we have, for example, 5 real deviceIds each with 100,000,000 records (500,000,000 records in total), and we generate 1,000 synthetic deviceIds for each real deviceId, we will have 1,000 x 5 x 100,000,000 additional records, i.e. 500,000,000,000 records.

But we can expand the amount of synthetic data even more by playing with time: for example, if our real data has events only for 2017, we can duplicate the entire dataset for each year from 2006 to 2016 and get 11 more copies of the data.

Here is some sample C# code that generates the synthetic deviceIds:
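A minimal sketch of what such a function could look like, assuming the synthetic ids are built by appending a numeric suffix to the real deviceId and returned as a U-SQL SqlArray:

using System.Collections.Generic;
using Microsoft.Analytics.Types.Sql;

namespace Microsoft.DataGenUtils
{
    public static class SyntheticData
    {
        // Given a real deviceId, return 'count' synthetic ids that will later
        // receive a copy of the real device's data (e.g. "dev42" -> "dev42_syn0001").
        public static SqlArray<string> GetArrayOfSynteticDevices(string deviceId, int count)
        {
            var synthetic = new List<string>(count);
            for (int i = 1; i <= count; i++)
            {
                synthetic.Add(string.Format("{0}_syn{1:D4}", deviceId, i));
            }
            return new SqlArray<string>(synthetic);
        }
    }
}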

Note the GetArrayOfSynteticDevices function, which will be executed inside the U-SQL script.

Before using it, we have to register the assembly in our Data Lake Analytics account and database (in my case the master one):

DROP ASSEMBLY IF EXISTS master.[Microsoft.DataGenUtils];
CREATE ASSEMBLY master.[Microsoft.DataGenUtils] FROM @"location of dll";

Now we can read the original IoT data and create the additional data:

REFERENCE ASSEMBLY master.[Microsoft.DataGenUtils];

@t0 =
    EXTRACT deviceid string,
            timeofevent DateTime,
            sensorvalue float
    FROM "2017/IOTRealData.csv"
    USING Extractors.Csv();

//Let's have the distinct list of all the real DeviceIds
@t1 =
    SELECT DISTINCT deviceid
    FROM @t0;

//Let's calculate for each deviceId an array of 1000 synthetic devices
@t2 =
    SELECT deviceid,
           Microsoft.DataGenUtils.SyntheticData.GetArrayOfSynteticDevices(deviceid, 1000) AS SyntheticDevices
    FROM @t1;

//Let's assign to each array of synthetic devices the same data of the corresponding real device
@t3 =
    SELECT a.SyntheticDevices,
           de.timeofevent,
           de.sensorvalue
    FROM @t0 AS de
    INNER JOIN @t2 AS a ON de.deviceid == a.deviceid;

//Let's use the EXPLODE function to expand each array into records
@t1Exploded =
    SELECT emp AS deviceid,
           de.timeofevent,
           de.sensorvalue
    FROM @t3 AS de
    CROSS APPLY EXPLODE(de.SyntheticDevices) AS dp(emp);

//Now we can write the expanded data
OUTPUT @t1Exploded
TO "SyntethicData/2017/expanded_{*}.csv"
USING Outputters.Csv();

Once you have the expanded data for the entire 2017, you can just use C# DateTime functions that add years, months or days to a specific date, apply them to the timeofevent column, and write the new data to a new folder (for example SyntethicData/2016, SyntethicData/2015, etc.).
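For example, a minimal U-SQL sketch (reusing the @t1Exploded rowset from above) that shifts all the 2017 events back by one year and writes them to the 2016 folder:

@t2016 =
    SELECT deviceid,
           timeofevent.AddYears(-1) AS timeofevent,
           sensorvalue
    FROM @t1Exploded;

OUTPUT @t2016
TO "SyntethicData/2016/expanded_{*}.csv"
USING Outputters.Csv();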


UniFi – Install a UniFi Cloud Controller on Azure

Hi everyone, this time I want to share one of my weekend projects, inspired by this article by Troy Hunt. Like Troy, I also experienced the pain of a single router/gateway/access point device and decided to switch to UniFi devices.

On the net you can find dozens of tutorials on how to assemble the various bits; here instead I want to explain how to set up the UniFi Controller software on Azure in a few very simple steps.

Step 1: Go to the following website https://azure.microsoft.com/ and register. You will receive $200 of Azure credits that can be used in the first month. Alternatively, you can register for the Visual Studio Essentials program and get $25 each month for 1 year.

Step 2: Once you have your subscription up and running, you can provision a Linux VM by clicking on the big "+" button and searching for Linux Ubuntu Server:

Step 3: After you have selected the virtual machine, just give it a name, username and password, choose HDD as the VM disk type, pick a resource group name that you like and a data center near where you live.

Step 4: Set the size of the VM (I used A2 Standard, but you can also try A1 or even A0; also remember you can schedule the VM to start/stop only when you need it and save a lot of money):

Step 5: In the additional settings leave everything at the default values and finally hit the purchase button!

Step 6: After a few minutes or even less your VM is ready and you should see a screen like this:

Write down the Public IP Address because you will need it shortly.

Step 7: Set up the open ports needed for the controller to work correctly.

First go to the network interfaces of the VM.

Select the first one (there should be only one):

Now select the network security group, first clicking on link no. 1 and then on link no. 2:

Finally, add the necessary inbound rules exactly as described here:

Step 8: Connect to the IP address noted earlier using PuTTY on Windows or the standard macOS shell, and install the UniFi Controller software with these commands:

echo "deb http://www.ubnt.com/downloads/unifi/debian unifi5 ubiquiti" | sudo tee -a /etc/apt/sources.list

sudo apt-key adv --keyserver keyserver.ubuntu.com --recv 06E85760C0A52C50

sudo apt-get update

sudo apt-get install unifi

Step 9: Connect to the controller web interface at https://IP_Address:8443/ and complete the UniFi setup wizard:

Finally, you may now proceed to adopt your UniFi devices using Layer 3 adoption!