Automatic Machine Learning with DataRobot

Hi, this time I want to share my experiments with DataRobot, an automated machine learning product that promises to put machine learning techniques to work with a few mouse clicks.

Let’s see it in action with a very simple dataset, the one from the Kaggle competition “Titanic: Machine Learning from Disaster” (extract from the competition description):

“The sinking of the RMS Titanic is one of the most infamous shipwrecks in history.  On April 15, 1912, during her maiden voyage, the Titanic sank after colliding with an iceberg, killing 1502 out of 2224 passengers and crew. This sensational tragedy shocked the international community and led to better safety regulations for ships.

In this challenge, we ask you to complete the analysis of what sorts of people were likely to survive. In particular, we ask you to apply the tools of machine learning to predict which passengers survived the tragedy.”

The data looks like this:

[Screenshot: sample rows of the Titanic data]

For each passenger we have some data (Name, Sex, Age, Port of Embarkation, Ticket Class) and, in the Survived column, whether the passenger survived the sinking of the Titanic. Our objective is to discover, from all the information we have about the passengers, whether there are key influencers of survival and what they are.

The DataRobot interface is very simple at the start: it asks us to upload the data we have.

[Screenshot: DataRobot data upload page]

Once the upload is done, DataRobot asks us which column we want to “predict”:

[Screenshot: target column selection]

Once we select our target column we can press the big “Start” button.

DataRobot then analyzes our file, works out which models are suitable for this data and automatically starts “training” all of them in parallel, according to the processing power we select (the “workers” setting).

[Screenshots: DataRobot training the models]

Once all the models are trained, the leaderboard shows in first place the best model according to the metric that DataRobot picked for the problem we are trying to solve.

Once we have selected the “best model” we can look at its key findings, like these:

[Screenshots: key findings of the selected model]

In other words, women in first class had a high chance of survival, while men in third class were really at risk of not surviving.
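
A finding like this can be cross-checked directly on the raw data. The sketch below is not something DataRobot produces; it is a minimal standard-library example that computes the share of survivors per (Sex, Pclass) group, assuming rows shaped like the Kaggle train.csv columns Sex, Pclass and Survived. The toy rows are made up and only show the shape of the input and output.

```python
import csv
from collections import defaultdict


def survival_rates(rows):
    """Return {(sex, pclass): share of survivors} for Kaggle-style rows."""
    totals = defaultdict(int)
    survived = defaultdict(int)
    for row in rows:
        key = (row["Sex"], int(row["Pclass"]))
        totals[key] += 1
        survived[key] += int(row["Survived"])
    return {key: survived[key] / totals[key] for key in totals}


# Made-up toy rows, NOT real Titanic records.
toy_rows = [
    {"Sex": "female", "Pclass": "1", "Survived": "1"},
    {"Sex": "female", "Pclass": "1", "Survived": "1"},
    {"Sex": "male", "Pclass": "3", "Survived": "0"},
    {"Sex": "male", "Pclass": "3", "Survived": "1"},
]
print(survival_rates(toy_rows))

# With the real file (the path is an assumption):
# with open("train.csv", newline="") as f:
#     print(survival_rates(csv.DictReader(f)))
```

Running it on the real training file should show the same pattern the model highlighted, which is a cheap sanity check before trusting any automated insight.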

Using the predict feature of DataRobot, we scored the external Kaggle “test” file with this model and uploaded the resulting predictions to Kaggle. Here is the result:

[Screenshot: Kaggle submission result]

This is not bad at all: given that 9408 data scientists participated in this competition, it puts me in the top 18% globally!

The pros: I did not touch the data (no added features, no normalization, no removal of columns like IDs, etc.); DataRobot analyzed the data as is. I used all the default settings without touching any “advanced option”.

The cons: DataRobot lacks data preparation functionality (you can try products like Alteryx or Trifacta in combination with DataRobot). This means we need at least two products to manage end to end a data science experiment, which typically involves operations like joins of multiple tables and files, complex aggregations, subqueries, etc.

Finally, while we have to admit that 80% of the data science experimentation job is about data collection, source access, data preparation, cleaning, etc., DataRobot can unlock several quick wins and opportunities in all the IT departments where analysts and developers are real experts in those activities and have the business knowledge of the data, but lack data science skills.


Bot hand off to agent with Salesforce Live Chat Part 2

Hi, in our previous article we introduced the API calls to send and receive messages to a live agent on Salesforce. Now it’s time to add the bot component and combine bot and live agent to implement the hand off.

For the bot I used one of the frameworks I know best, the Microsoft Bot Framework, but some of the concepts also apply to other bot solutions.

We start from the Bot Intermediator Sample provided here, which already has some functionality built in. In particular it uses the bot routing engine, which has been built with the idea of routing conversations between user, bot and agent, creating when needed direct conversations between the user and the agent that are actually routed by the bot through this engine.

Let’s see a way to combine this with the Salesforce Live Agent API. We will take some shortcuts and this solution is not meant to be used in a production environment, but hopefully it can give you an idea of how to design a fully fledged solution.

  1. When the word “human” is mentioned in the conversation, the intermediator sample triggers a request for an agent’s intervention and parks it in the routing engine’s database of pending requests. Our addition was to define a ConcurrentDictionary as in-memory storage that holds the request and its conversation, and to which we later add other properties of interest to us.
  2. Using the Quartz scheduling engine, a recurring job monitors the pending requests of the routing engine and dequeues them. For each one it starts (again with Quartz) an on-demand job that opens a connection with Live Chat, waits for the agent to take the call and binds to the request the sessionId and the other properties of the Live Chat session just opened. This thread can finish here, but before it does we start another on-demand thread that watches for any message coming from the Live Chat session for this request and routes it to the conversation opened at step 1.
  3. In the message controller of the bot, in addition to the default routing rules, we add another rule that checks whether the current conversation is “attached” to a Live Chat session and, if so, sends all the chat messages written by the user to that session.
  4. When the thread watching the Live Chat session receives no more messages and times out, or receives a disconnect/end chat event, it removes the conversation and its Live Chat session from the dictionary. From this moment, if the user writes again, they write to the bot; to speak with an agent again they have to trigger the “human” keyword again.
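
The routing rule behind steps 1, 3 and 4 can be sketched in a few lines. This is not the C# code of the intermediator sample: it is a minimal Python illustration, and every name in it (HandoffRegistry, LiveChatSession, route_user_message and the callbacks) is hypothetical. The lock-protected dictionary plays the role of the ConcurrentDictionary, since the watcher thread and the message controller touch it concurrently.

```python
import threading
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class LiveChatSession:
    session_id: str
    session_key: str
    affinity_token: str


class HandoffRegistry:
    """In-memory map: bot conversation id -> attached live chat session."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._sessions: Dict[str, LiveChatSession] = {}

    def attach(self, conversation_id: str, session: LiveChatSession) -> None:
        with self._lock:
            self._sessions[conversation_id] = session

    def detach(self, conversation_id: str) -> None:
        with self._lock:
            self._sessions.pop(conversation_id, None)

    def lookup(self, conversation_id: str) -> Optional[LiveChatSession]:
        with self._lock:
            return self._sessions.get(conversation_id)


def route_user_message(registry, conversation_id, text,
                       send_to_agent, send_to_bot, request_agent) -> str:
    """Decide where a user message goes and return the chosen route."""
    session = registry.lookup(conversation_id)
    if session is not None:            # step 3: conversation is attached
        send_to_agent(session, text)
        return "agent"
    if "human" in text.lower():        # step 1: keyword triggers the hand off
        request_agent(conversation_id)
        return "pending"
    send_to_bot(conversation_id, text)
    return "bot"


# Tiny walk-through with fake callbacks that only record what happened.
registry = HandoffRegistry()
log = []
to_agent = lambda session, text: log.append(("agent", text))
to_bot = lambda conv, text: log.append(("bot", text))
ask_agent = lambda conv: log.append(("request", conv))

r1 = route_user_message(registry, "conv-1", "hello", to_agent, to_bot, ask_agent)
r2 = route_user_message(registry, "conv-1", "I want a human", to_agent, to_bot, ask_agent)
registry.attach("conv-1", LiveChatSession("sid", "key", "token"))  # agent accepted
r3 = route_user_message(registry, "conv-1", "are you real?", to_agent, to_bot, ask_agent)
registry.detach("conv-1")                                          # step 4: chat ended
r4 = route_user_message(registry, "conv-1", "thanks", to_agent, to_bot, ask_agent)
print(r1, r2, r3, r4)
```

Keeping the routing decision in one pure function makes it easy to test without a bot or a Salesforce org; the scheduling part (step 2) only has to call attach and detach at the right moments.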

Here are some screenshots:

The chat begins with the bot, which simply repeats the sentences we write

[Screenshot]

The live agent is ready to handle new calls

[Screenshot]

Let’s ask for help

[Screenshot]

And here the request arrives on Live Chat

[Screenshot]

Once it is accepted we can start the hand off, starting a case in Salesforce

[Screenshot]

And here we can check whether we are talking to a human 🙂

[Screenshots]

In the third and final part we will look at some code snippets that showcase this functionality, and we will describe what a good design of the solution could be if we want to industrialize it.


Bot hand off to agent with Salesforce Live Chat Part 1

Hi everyone, one of the most requested features in modern bot implementations is a smooth transition from the automated response system (our lovely bot) to a human.

Our objective is usually the following:

  1. Handle the customer request by first qualifying it (collect data, ask for additional information).
  2. If the request can be handled with a simple and repetitive solution, the bot should cover exactly this scenario.
  3. If the request is so complex that only a call center operator can handle it, we make good use of the operator’s time, because they are involved only where they can bring distinctive value.

One of the most used call center modules for human assistance on a case is Salesforce Live Chat, so it makes sense to understand how we can make a transition from any bot implementation to Live Chat without asking the customer to change UI, move to another web page or, more importantly, re-type all the information written at the qualification stage (assuming the triage has been done in the bot application, we want to bring the entire conversation state from the bot to the live agent’s attention).


Let’s start with the basics and see the “how to” from the beginning:

First you need a Salesforce developer sandbox for your testing; you can request one for free here.

Once you have your sandbox you have to enable the Live Agent functionality, following the steps described here. Please pay attention to each step; your last step should be this one.

You can check that everything works by creating a sample HTML page with the JavaScript generated by the buttons functionality and by the deployment one (remember to put the deployment JavaScript at the end of the page, before the closing body tag!).

If you want an unofficial guide to help you further, check also this blog or this other blog.

At this point you should have your live chat working nicely, and we can now study the Salesforce Live Agent REST API, which allows us to use the live chat functionality programmatically.

If you look a bit at how the API works, you will soon notice that it has been designed to be consumed mainly by end clients directly (web pages or mobile apps), while it lacks some server-to-server functionality like webhooks. In a nutshell, it is very helpful if you want to build a branded web page or iOS/Android app for call center support, but it is a bit less helpful for transitioning a conversation from a server application (our bot).

In order to use the API we need some information: your Salesforce organization ID, your Live Agent deploymentId, the Live Agent buttonId and finally the Live Agent API endpoint.

You can find this info here and in this guide.

OK, now we can finally start with some coding 🙂. I will use C# (running from a Mac), so I guess it can run on any platform.

First we make our first REST call to retrieve the session ID for the new session, the session key, the affinity token (which is passed in the header of all future requests) and finally the clientPollTimeout, the number of seconds within which you must make the next Messages request before your Messages long-polling loop times out and is terminated (we will understand this better later):

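    // using System.Net.Http;
    // using System.Net.Http.Headers;
    // using System.Threading.Tasks;
    // using Newtonsoft.Json.Linq;

    private static async Task<ChatObj> createSession()
    {
        string sessionEndpoint = liveAgentEndPoint + liveAgentSessionRelativePath;
        HttpClient client = new HttpClient();
        client.DefaultRequestHeaders.Accept.Add(new MediaTypeWithQualityHeaderValue("application/json"));
        client.DefaultRequestHeaders.Add("X-LIVEAGENT-API-VERSION", liveAgentApiVersion);
        // No session exists yet, so the affinity header carries the literal string "null".
        client.DefaultRequestHeaders.Add("X-LIVEAGENT-AFFINITY", "null");
        HttpResponseMessage response = await client.GetAsync(sessionEndpoint);
        JObject jObj = new JObject();
        if (response.IsSuccessStatusCode)
        {
            string resp = await response.Content.ReadAsStringAsync();
            jObj = JObject.Parse(resp);
        }

        // The step that copies the session ID, session key, affinity token and
        // clientPollTimeout from jObj into chatObj is not shown in this listing.
        ChatObj chatObj = new ChatObj();
        return chatObj;
    }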
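
The listing above does not show how the four values are read from the response. As a language-neutral illustration, here is a minimal Python sketch of that mapping. It assumes the response body is a JSON object with the keys id, key, affinityToken and clientPollTimeout; that is my reading of the API and should be checked against the Live Agent REST reference. The ChatSession class and the sample values are hypothetical.

```python
import json
from dataclasses import dataclass


@dataclass
class ChatSession:
    session_id: str
    session_key: str
    affinity_token: str
    client_poll_timeout: int  # seconds


def parse_session_response(body: str) -> ChatSession:
    """Map the session response JSON onto a small typed object."""
    data = json.loads(body)
    return ChatSession(
        session_id=data["id"],
        session_key=data["key"],
        affinity_token=data["affinityToken"],
        # int() accepts both a JSON number and a numeric string.
        client_poll_timeout=int(data["clientPollTimeout"]),
    )


# Made-up sample body, only to show the expected shape.
sample = '{"id": "abc", "key": "abc!123", "affinityToken": "73061fa0", "clientPollTimeout": "40"}'
session = parse_session_response(sample)
print(session)
```

The affinity token and session key parsed here are the values that every later call sends back in the X-LIVEAGENT-AFFINITY and X-LIVEAGENT-SESSION-KEY headers.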

Now that we have this information we can tell the live agent that we would like to start a chat session with him (!). This requires another API call to request a chat visitor session, which will actually be opened only when the live agent accepts the request in the Salesforce console.

So first we do the request:

    private static async Task createChatRequest(ChatBag chatObj)
    {
        HttpClient client = new HttpClient();
        client.DefaultRequestHeaders.Accept.Add(new MediaTypeWithQualityHeaderValue("application/json"));
        client.DefaultRequestHeaders.Add("X-LIVEAGENT-API-VERSION", liveAgentApiVersion);
        // Session values obtained from the session call above.
        client.DefaultRequestHeaders.Add("X-LIVEAGENT-AFFINITY", chatObj.getAffinityToken());
        client.DefaultRequestHeaders.Add("X-LIVEAGENT-SESSION-KEY", chatObj.getSessionKey());
        client.DefaultRequestHeaders.Add("X-LIVEAGENT-SEQUENCE", "1");
        JObject body = new JObject();
        body.Add(new JProperty("organizationId", liveAgentOrgId));
        body.Add(new JProperty("deploymentId", liveAgentDeploymentId));
        body.Add(new JProperty("buttonId", liveAgentButtonId));
        body.Add(new JProperty("sessionId", chatObj.getSessionId()));
        body.Add(new JProperty("trackingId", ""));
        body.Add(new JProperty("userAgent", "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.102 Safari/537.36"));
        body.Add(new JProperty("language", "en-US"));
        body.Add(new JProperty("screenResolution", "1440x900"));
        body.Add(new JProperty("visitorName", "ConsoleTest"));
        // Left empty here; these two fields can carry the bot conversation data.
        body.Add(new JProperty("prechatDetails", new List<String>()));
        body.Add(new JProperty("receiveQueueUpdates", true));
        body.Add(new JProperty("prechatEntities", new List<String>()));
        body.Add(new JProperty("isPost", true));
        StringContent cnt = new StringContent(body.ToString(), Encoding.UTF8, "application/json");
        HttpResponseMessage response = await client.PostAsync(liveAgentEndPoint + liveAgentChasitorRelativePath, cnt);
        if (response.IsSuccessStatusCode)
        {
            // On success the body is expected to be "OK".
            string responseText = await response.Content.ReadAsStringAsync();
        }
    }


If everything went right we should receive an “OK” as response while we wait for the operator to actually accept the visitor session request.

An important thing to notice is that the API supports prechatDetails and prechatEntities objects, which we can use to carry over the conversation data that the customer had with the bot, so the live agent can look at this info and immediately help the customer with the right context, without re-asking the same questions.
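
As a rough illustration of that idea, the sketch below turns the answers collected by the bot into a list that could be sent as prechatDetails. The field names label, value, transcriptFields and displayToAgent are an assumption based on my understanding of the API’s custom detail object; verify them against the Live Agent REST reference before relying on them. The function name and the sample data are hypothetical.

```python
def to_prechat_details(answers):
    """Turn the bot's collected answers into prechat detail entries."""
    return [
        {
            "label": label,
            "value": str(value),
            "transcriptFields": [],   # assumption: no transcript field mapping
            "displayToAgent": True,   # assumption: show every answer to the agent
        }
        for label, value in answers.items()
    ]


# Hypothetical triage data gathered by the bot before the hand off.
collected = {"Product": "Printer X", "Issue": "Paper jam"}
details = to_prechat_details(collected)
print(details)
```

The resulting list would take the place of the empty prechatDetails list in the chat request body.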

Since the approval to start the chat is not automatic and we have to wait for the live agent to accept, at this stage we just poll the Messages API and wait for the confirmation, using a thread that calls the API in this way:

    private static async Task<ChatMessageResponse> receiveMessages(ChatBag chatObj)
    {
        ChatMessageResponse jObj = new ChatMessageResponse();
        HttpClient client = new HttpClient();
        client.DefaultRequestHeaders.Accept.Add(new MediaTypeWithQualityHeaderValue("application/json"));
        client.DefaultRequestHeaders.Add("X-LIVEAGENT-API-VERSION", liveAgentApiVersion);
        client.DefaultRequestHeaders.Add("X-LIVEAGENT-AFFINITY", chatObj.getAffinityToken());
        client.DefaultRequestHeaders.Add("X-LIVEAGENT-SESSION-KEY", chatObj.getSessionKey());

        // Long-polling call: it returns when messages are available or the poll times out.
        HttpResponseMessage response = await client.GetAsync(liveAgentEndPoint + liveAgentMessagesRelativePath);
        if (response.IsSuccessStatusCode)
        {
            string respText = await response.Content.ReadAsStringAsync();
            // An empty body deserializes to null, which the check below handles.
            jObj = JsonConvert.DeserializeObject<ChatMessageResponse>(respText);

            if (jObj != null)
            {
                // Keep only the messages confirming that the request was routed.
                var msgs = from x in jObj.messages
                           where x.type == "ChatRequestSuccess"
                           select x;
                foreach (Messages activity in msgs)
                {
                    Console.WriteLine("VisitorId: " + activity.message.visitorId);
                }
            }
        }
        return jObj;
    }

OK, so when we receive a message of type ChatRequestSuccess, it means that the chat request was successful and was routed to the available agents.

To be completely sure that an agent really accepted our conversation, we have to wait for the message of type ChatEstablished, where we can also read the name and the ID of the agent answering us.
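
The waiting logic can be summarized as a small loop. This is a minimal Python sketch, not the article’s C# code: fetch_messages is a hypothetical callable that performs one Messages request and returns the parsed JSON (or None when there is nothing new), and the "name" key of the ChatEstablished payload is an assumption. ChatRequestFail and ChatEnded are treated here as reasons to stop waiting.

```python
from typing import Callable, Optional


def wait_for_agent(fetch_messages: Callable[[], Optional[dict]],
                   max_polls: int = 10) -> Optional[dict]:
    """Poll until an agent accepts; return the ChatEstablished payload or None."""
    for _ in range(max_polls):
        response = fetch_messages()
        if not response:                       # nothing new in this polling window
            continue
        for item in response.get("messages", []):
            if item["type"] == "ChatEstablished":
                return item.get("message", {})
            if item["type"] in ("ChatRequestFail", "ChatEnded"):
                return None                    # no agent will pick this up
    return None


# Fake sequence of poll results, only to exercise the loop.
fake_polls = iter([
    None,
    {"messages": [{"type": "ChatRequestSuccess", "message": {"visitorId": "v-1"}}]},
    {"messages": [{"type": "ChatEstablished", "message": {"name": "Agent Smith"}}]},
])
agent = wait_for_agent(lambda: next(fake_polls))
print(agent)
```

Injecting the fetch function keeps the loop testable offline; in a real client it would wrap the HTTP call and respect the clientPollTimeout returned with the session.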

OK, now we can finally send a “Hello Mr Agent!” text to our live agent with this API:

    private static async Task sendTxtMessage(ChatBag chatObj, string textToSend)
    {
        HttpClient client = new HttpClient();
        client.DefaultRequestHeaders.Accept.Add(new MediaTypeWithQualityHeaderValue("application/json"));
        client.DefaultRequestHeaders.Add("X-LIVEAGENT-API-VERSION", liveAgentApiVersion);
        client.DefaultRequestHeaders.Add("X-LIVEAGENT-AFFINITY", chatObj.getAffinityToken());
        client.DefaultRequestHeaders.Add("X-LIVEAGENT-SESSION-KEY", chatObj.getSessionKey());
        // The body only carries the text typed by the visitor.
        JObject bodyT = new JObject();
        bodyT.Add(new JProperty("text", textToSend));
        StringContent cnt = new StringContent(bodyT.ToString(), Encoding.UTF8, "application/json");
        HttpResponseMessage response = await client.PostAsync(liveAgentEndPoint + liveAgentChasitorChatRelativePath, cnt);
        if (response.IsSuccessStatusCode)
        {
            string respText = await response.Content.ReadAsStringAsync();
        }
    }

We can receive the agent’s replies with the same polling technique used above, this time looking for messages of type ChatMessage.

In the next part of the article I will go through the integration with a bot and see how we can implement the hand off!

Text Analytics with Facebook FastText

I recently had to work on a very interesting challenge: imagine a very large collection of phrases/statements from which you want to derive, for each of them, the key ask/term.

An example can be a large collection of emails: scan the subjects and understand what the key topic of each email was. For an email with the subject “Dear dad the printer is no more working….”, we can say that the key is “printer”.


If you apply generic key phrase algorithms, they can work pretty well in a very general context, but if your context is specific they will miss several key terms that are part of the vocabulary of your context.


I successfully used Facebook fastText for this supervised classification task, and here is what you need to make it work:

  1. An Ubuntu Linux 16.04 virtual machine (free on a MacBook with Parallels Lite)
  2. fastText, downloaded and compiled as described here
  3. A training set with statements and corresponding labels
  4. A validation set with statements and corresponding labels

So yes, you need to manually label a good amount of statements to make fastText “understand” your dataset well.

Of course you can speed up the process by transforming each statement into an array of words and assigning labels in bulk where it makes sense (Python pandas or just plain old SQL can help you here).

Let’s see how to build the training set:

Create a file called train.csv and insert one statement per line, in the following format: __label__labelname followed by your statement.

Let’s make an example with the email subject used before:

__label__printer Dear dad the printer is no more working

You can also have multiple labels for the same statement; let’s see this example:

__label__printer __label__wifi Dear dad the printer and the wifi are dead
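
Bulk labeling by keyword can produce lines in exactly this format. The sketch below is one possible way to do it with the standard library only; the keyword-to-label mapping and the function names are hypothetical, and in practice the mapping would come from your own domain vocabulary.

```python
import re

# Hypothetical mapping from words found in a statement to label names.
KEYWORD_LABELS = {"printer": "printer", "wifi": "wifi", "wi-fi": "wifi"}


def tokenize(statement):
    """Lowercase the statement and split it into simple word tokens."""
    return re.findall(r"[a-z0-9\-]+", statement.lower())


def label_statement(statement, keyword_labels=KEYWORD_LABELS):
    """Return the labels whose keywords appear in the statement, without duplicates."""
    labels = []
    for word in tokenize(statement):
        label = keyword_labels.get(word)
        if label and label not in labels:
            labels.append(label)
    return labels


def to_fasttext_line(statement, labels):
    """Prefix the statement with one __label__ token per label."""
    prefix = " ".join("__label__" + name for name in labels)
    return (prefix + " " + statement).strip()


statement = "Dear dad the printer and the wifi are dead"
line = to_fasttext_line(statement, label_statement(statement))
print(line)
```

Statements that match no keyword come back without any label, so they are easy to filter out and label by hand.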

The validation set can be another file called validation.csv, filled in exactly the same way. Of course you have to follow the usual good practices to split your labeled dataset correctly into the training set and the validation set.
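
One simple way to do the split is a seeded shuffle followed by a cut. This is a minimal sketch under my own assumptions (an 80/20 split, a fixed seed for reproducibility, file paths matching the names used in this post); it does not stratify by label, which you may want for rare labels.

```python
import random


def split_lines(lines, validation_share=0.2, seed=42):
    """Shuffle labeled lines reproducibly and return (train, validation)."""
    shuffled = list(lines)
    random.Random(seed).shuffle(shuffled)
    n_valid = max(1, int(len(shuffled) * validation_share))
    return shuffled[n_valid:], shuffled[:n_valid]


def write_lines(path, lines):
    """Write one labeled statement per line."""
    with open(path, "w", encoding="utf-8") as f:
        for line in lines:
            f.write(line + "\n")


# Placeholder lines, only to show the mechanics.
labeled = ["__label__printer statement %d" % i for i in range(10)]
train, valid = split_lines(labeled)
print(len(train), len(valid))

# write_lines("data/train.csv", train)
# write_lines("data/validation.csv", valid)
```

Shuffling before cutting matters when the labeled file is sorted by label or by date, otherwise the validation set would not look like the training set.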

To start the training with fastText, type the following command:

./fasttext supervised -input data/train.csv -output result/model_1 -lr 1.0 -epoch 25 -wordNgrams 2

This assumes that your terminal is inside the fastText folder and that the training file is inside a subfolder called data, while the resulting model will be saved in a result folder.

I also added some optional arguments to improve the precision in my specific case; you can look at those options here.

Once the training is done (here you will understand why it is called fastText!), you can check the precision of the model in this way:

./fasttext test result/model_1.bin data/validation.csv

In my case I obtained a good 83% 🙂


If you want to test your model manually (typing sentences and obtaining the corresponding labels), you can try the following command:

./fasttext predict result/model_1.bin -

fastText also has Python wrappers, like this one that I used, and you can leverage the wrapper to perform massive scoring like I did here:

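from fastText import load_model
import pandas as pd

fastm = load_model("result/model_1.bin")  # model trained with the command above
df = pd.read_csv("data/statements.csv")   # hypothetical file with a short_statement column
k = 1                                     # number of labels to return per statement
for index, row in df.iterrows():
    labels, probabilities = fastm.predict(str(row["short_statement"]), k)
    for w, f in zip(labels, probabilities):
        print(index, w, f)                # label and its probability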

You can improve the entire process in many different ways. For example, you can use unsupervised training to obtain word vectors for your dataset, use this “dictionary” as the base for your list of labels, and use nearest neighbor queries to find similar words that can be grouped into single labels when doing the supervised training.
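
To make the nearest neighbor idea concrete, here is a minimal pure-Python sketch of a cosine-similarity lookup over word vectors. It does not use fastText itself, and the three-dimensional vectors are made up for illustration; real vectors would come from the unsupervised training and have many more dimensions.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def nearest(word, vectors, k=2):
    """Return the k words most similar to `word`, best first."""
    scored = [(other, cosine(vectors[word], vec))
              for other, vec in vectors.items() if other != word]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


# Made-up toy vectors, NOT the output of a trained model.
toy_vectors = {
    "printer": [0.9, 0.1, 0.0],
    "toner": [0.8, 0.2, 0.1],
    "wifi": [0.1, 0.9, 0.0],
    "router": [0.2, 0.8, 0.1],
}
neighbours = nearest("printer", toy_vectors, k=1)
print(neighbours)
```

Words that end up close together (here “printer” and “toner”) are candidates for sharing one label, which keeps the label list short and the classes better populated.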


Let’s dig in our email!

Like many of you, even if we are almost in 2018, I still work A LOT with email, and recently I asked myself the following question: what if I could leverage analytics and machine learning to get a better understanding of my emails?


Here is a quick way to understand who is inspiring you more


and who are instead the ones spreading a bit more negativity in your daily job 🙂


You will need (if you want to process ALL your emails in one shot!) :

  1. Windows 7/8/10
  2. Outlook 2013 or 2016
  3. Access 2013 or 2016
  4. An Azure Subscription
  5. A data lake store and analytics account
  6. PowerBI Desktop or any other Visualization Tool you like (Tableau or simply Excel)

Step 1: Link MS Access tables to your Outlook folders as explained here

Step 2: Export your emails from Access to CSV files.

Step 3: Upload those files to your data lake store.

Step 4: Process the fields containing text data with the U-SQL cognitive extensions and derive the sentiment and key phrases of each email

Step 5: With Power BI Desktop you can access the output data sitting in the data lake store as described here

Step 6: Find the senders with the highest average sentiment and the ones with the lowest 🙂.
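
If you prefer to do Step 6 outside Power BI, the aggregation is only a few lines. This is a minimal sketch that assumes the job output has been loaded as rows with a sender column and a numeric sentiment column; both column names and the sample values are hypothetical.

```python
from collections import defaultdict


def average_sentiment_by_sender(rows, min_emails=1):
    """Return [(sender, average sentiment)] sorted from highest to lowest."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for row in rows:
        totals[row["sender"]] += float(row["sentiment"])
        counts[row["sender"]] += 1
    averages = {s: totals[s] / counts[s] for s in totals if counts[s] >= min_emails}
    return sorted(averages.items(), key=lambda pair: pair[1], reverse=True)


# Made-up rows, only to show the mechanics.
rows = [
    {"sender": "alice", "sentiment": "1.0"},
    {"sender": "alice", "sentiment": "0.5"},
    {"sender": "bob", "sentiment": "0.25"},
    {"sender": "bob", "sentiment": "0.0"},
]
ranking = average_sentiment_by_sender(rows)
print(ranking[0], ranking[-1])
```

The min_emails parameter is there because an average over one or two emails says very little; raising it avoids ranking people on a single message.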


If you are worried about leaving your emails in the cloud, after obtaining the sentiment and key phrases you can download this output, remove all the data from the data lake store and use the (local) file as input for Power BI Desktop.

In addition to this I would also suggest performing a one-way hash of the sender email address and uploading the emails to the data lake store account with this hashed field instead of the real sender.


Once you have the data lake analytics job results, you can download them and join them locally in Access to associate each email with its original sender again.
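
Here is a minimal sketch of the hashing and of the local lookup used to map results back to real senders. It is my own illustration, not part of the original workflow: I use a keyed HMAC-SHA256 rather than a plain hash, so that someone who sees the uploaded data cannot simply hash a list of known addresses and compare. The secret value and function names are hypothetical, and the secret must stay on your machine.

```python
import hashlib
import hmac


def hash_sender(email, secret):
    """Return a keyed one-way hash of a normalized email address."""
    normalized = email.strip().lower().encode("utf-8")
    return hmac.new(secret, normalized, hashlib.sha256).hexdigest()


def build_lookup(emails, secret):
    """Local-only table that maps each hash back to the real address."""
    return {hash_sender(e, secret): e.strip().lower() for e in emails}


secret = b"keep-this-key-on-your-laptop"   # hypothetical; never upload it
lookup = build_lookup(["Alice@Example.com", "bob@example.com"], secret)

# Later, a hashed sender coming back from the cloud results is resolved locally.
token = hash_sender(" alice@example.com", secret)
original = lookup[token]
print(original)
```

Normalizing case and whitespace before hashing ensures that the same person always gets the same token, which is what makes the per-sender averages meaningful.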


From Windows 10 to Mac OS Sierra without admin privileges

Hi everyone, lately, thanks to my manager and my new employer, I was able to switch from a Windows 10 laptop to a shiny MacBook Pro, and I want to share some tips and tricks for issues you will probably encounter if you do the same.

First let’s start with the basics: why did I choose to switch? Well, since 2009 I have had only Apple devices at home and I have always loved their consistency and “stability”, but I never had the opportunity to actually “work” with a Mac, so this is also a learning experience for me.

If you have never used a Mac, the first obstacles will be shortcuts like CTRL+C and CTRL+V, the mouse clicks (actually the right click on the trackpad), scrolling with two fingers on the trackpad and now the shiny and mysterious Touch Bar. Once past this first shock, you will quickly get used to the magic search experience of Spotlight, the backup for dummies of Time Machine and the well-known experience of the App Store.

Now let’s focus on the work-related stuff: you can finally have Office 2016 on a Mac too, but it is miles and miles away from the functionality and ease of use of Office 2016 on Windows. The differences are not super evident, but if you use Office professionally you will quickly find the missing pieces.

Solution? Go to the App Store, purchase Parallels Lite and enjoy Linux and Windows virtual machines. You can run VMs without being admin because Parallels Lite uses the native hypervisor available on the Mac since Yosemite.

Thanks to this I was also able to get back several “life saving” applications that I use daily, like Power BI Desktop, SQL Server Management Studio and Visual Studio 2017. To be honest they have their counterparts in the Mac world, but too many functionalities are missing in those versions to live only with them.

So I ended up with a Windows 10 VM full of software. So why not use Windows directly? Well, with the Windows VM I can use Windows just for the apps that run great on that platform, and if the system starts to become unstable I can keep working normally on my Mac without losing my work while Windows does its own “things” 🙂.

When needed I use an Ubuntu VM with Docker and VS Code, following the same segregation of duties principle (main OS fast and stable, guest OS with rich and dedicated software).

I now often work this way: SQL Server hosted on Linux, import/export of external data done easily with SQL Server Management Studio from Windows, PySpark notebooks running on Docker and accessing the same data, and finally visualizations with Power BI Desktop on Windows.

In case, like me, you have strict policies around admin accounts, I want to share this with you: do you remember the concept of portable apps on Windows? Well, on the Mac you can do the same with some (not all) of the applications that are outside the App Store (you can install almost all the apps in the App Store without admin privileges).

The technique to make a Mac application “portable” is simply the double extraction of the pkg files and of the Payload files into a folder that you can access (like your desktop). You can check the details here and here; after that you can run those applications from the location you like.

The exceptions will be :

  1. Applications not signed by a recognized and well-known developer or software house
  2. Applications that on start up will ask you to install additional services
  3. Applications that before being launched require the registration of specific libraries/frameworks

There are cases (like Azure Machine Learning Workbench) where the installer actually writes everything into your user account folders, but the last step is the copy of the UI app to the Applications folder, and this fails if you are not admin. The solution is to look a bit inside the installer folders and find, inside the JSON files, the location of the downloaded packages. Once you find the URL of the missing one (use the installer error message to find the package it was not able to copy), download it locally and run the app from any location; it should work without problems.


Jazoon 2017 AI meet Developers Conference Review

Hi, I had the opportunity to attend this conference in Zurich on 27 October 2017 and to follow these sessions:

  • Build Your Intelligent Enterprise with SAP Machine Learning
  • Applied AI: Real-World Use Cases for Microsoft’s Azure Cognitive Services
  • Run Deep Learning models in the browser with JavaScript and ConvNetJS
  • Using messaging and AI to build novel user interfaces for work
  • JVM based DeepLearning on IoT data with Apache Spark
  • Apache Spark for Machine Learning on Large Data Sets
  • Anatomy of an open source voice assistant
  • Building products with TensorFlow

Most of the sessions have been recorded and they are available here:

The first session was more of a sales presentation, with pre-recorded demos, of SAP’s AI capabilities, mainly in their cloud:


But it had some interesting ideas, like the Brand Impact video analyzer, which computes how much airtime specific brands fill inside a video:


Another good use case is the automatic recognition of defective products using an image similarity distance API:


The second session was about the new AI capabilities offered by Microsoft, divided into two parts:

Capabilities for data scientists who want to build their own Python models

  • Azure Machine Learning Workbench, an Electron-based desktop app that mainly accelerates data preparation tasks using a “learn by example” engine that creates data preparation code on the fly.


  • Azure Notebooks, a free but limited cloud-based Jupyter Notebook environment to share and re-use models/notebooks


  • Azure Data Science Virtual Machine, a pre-built VM with all the most common DS packages (TensorFlow, Caffe, R, Python, etc.)


Capabilities (e.g. face/age/sentiment/OCR/handwriting detection) for developers who want to consume Microsoft’s pre-trained models by calling the Microsoft Cognitive API directly



The third session was more of an “educational presentation” on deep learning and on how a deep learning system works at a high level. However, the talk touched on some interesting topics:

  • The existence of several pre-trained models that can be used as is especially for featurization purposes and/or for transfer learning


  • How to visualize neural networks with web sites like
  • A significant number of demos showcasing DNN applications that run directly in the browser

The fourth session was also interesting, because the speaker clearly explained the possibilities and limits of the current application development landscape, and in particular of enterprise bots.


Key takeaway: bots are far from being smart and people don’t want to type text.

Suggested approach: bots are new apps that reach their “customers” in the channels they already use (Slack for example). Using the context and the channel functionalities, these new apps have to extend and at the same time simplify the IT landscape.


Example: a bot in a Slack channel notifies a manager of an approval request, and the manager can approve/deny directly in Slack without leaving the app.

The fifth and sixth talks were rather technical/educational, on specific frameworks (IBM SystemML for Spark) and on model portability (PMML), with some good points on hyperparameter tuning using a Spark cluster in iterative mode and on DNN autoencoders.



The seventh talk was about the open source voice assistant Mycroft and the related open source device schemas.

The session consisted mainly of live demos showcasing several open source libraries that can be used to create a device with Alexa-like capabilities:

  • Pocketsphinx for speech recognition
  • Padatious for NLP intent detection
  • Mimic for text to speech
  • Adapt Intent parser


The last session was on TensorFlow, but also more generally on AI experiences coming from Google, like how ML is used today:


And how fundamental machine learning is today, with quotes like these:

  • Remember in 2010, when the hype was mobile-first? Hype was right. Machine Learning is similarly hyped now. Don’t get left behind
  • You must consider the user journey, the entire system. If users touch multiple components to solve a problem, transition must be seamless

Other pieces of advice were about hiring talent and about maintaining, growing and spreading ML inside your organization:

How to hire ML experts:

  1. don’t ask a Quant to figure out your business model
  2. design autonomy
  3. $$$ for compute & data acquisition
  4. Never done!

How to Grow ML practice:

  1. Find ML Ninja (SWE + PM)
  2. Do Project incubation
  3. Do ML office hours / consulting

How to spread the knowledge:

  1. Build ML guidelines
  2. Perform internal training
  3. Do open sourcing

And on the prioritization and execution of ML projects:

  1. Pick algorithms based on the success metrics & data you can get
  2. Pick a simple one and invest 50% of time into building quality evaluation of the model
  3. Build an experiment framework for eval & release process
  4. Feedback loop

Overall the quality was good, even if I was really disappointed to discover in the morning that one of the most interesting sessions (with the legendary George Hotz!) had been cancelled.