Indexing SQL Data Warehouse with Azure Search and consume it with Salesforce

SQL Data Warehouse is one of the brand new PAAS services that Microsoft offers as part of their Cortana Analytics suite. If we want to find a way to describe it quickly we can say that is the Azure equivalent of Amazon Redshift with some differences of course, but in essence is a cloud MPP database that can be scaled quickly on demand , can ingest TB of data and leveraging the multi node architecture can process/query this data in seconds. One of the limitations of these systems (MPP) is that they cannot handle an high number of concurrent queries , usually the maximum value is around 20/30 and if the that number is reached new queries will be parked in queues (of course the entire query queue processing logic is more complicated than this but for in essence is that). Now this is fine is you plan to expose your data warehouse only to few analysts but if you want to make this data available also to other endpoints/consumers this became a problem. An example can be this : imagine in fact that you store in your DWH all your customer data/interactions/orders collected across multiple systems and you want your call center agents to access this info in order to correctly support your customers. So you want a super quick response time service called by several clients for searching costumers data and while this is a task that a classic relational database can perform with no problems , as said this represents a problem for an MPP database.

Luckily azure PAAS offering has multiple ways to solve this problem, one way can be to use Azure Search to index SQL Data Warehouse costumer data and offer to call center systems the Azure Search API as the service that will provide the search capabilities needed.

Azure Search has also the Indexer functionality that with pure configuration can automatically index a portion of a database, however looking at Azure Search documentation it seems that SQL Data Warehouse is not supported , but trying to use the Azure SQL Indexer I had no problems in performing the configuration task. I was able ,following the mentioned documentation, to schedule the indexing process and using the High Water Mark Change Detection policy (I had a field with timestamp) I was able to process the data progressively as it was updated/added to the DWH.

If you need to launch on demand indexing you can leverage the Azure Search Rest API and call Run Indexer operation , this combined with an Azure Logic App can let you update the index every time you need according to your workflow design (event based, depending on another scheduling  etc ..).

Consuming this service , from a call center software like Salesforce service cloud can be done using callouts consuming Azure Search Rest Api, a good example can be this , just change the json generation part so that instead of making queries on Salesforce accounts you actually call the Azure Search Rest Api. Actually you have to make two calls : one for the autocomplete (call the suggestions api ) that will provide you suggestions (so even if you make a typo the fuzzy search will help you to find the right customer) sand one that is the search api that will give you the customer detail data coming from the index.

One important thing to note: Salesforce callouts have concurrency limits but these limits can be bypassed using some tricks as described here .