Technologies
Azure Data Factory or ADF is the Microsoft cloud orchestration tool for managing ETL operations amongst other things. Using it, can sometimes feel a little frustrating , a bit like traveling along a road your Sat Nav says is fine and is a major road when…. You get the picture.
With most visual pipeline tools, the pipelines perform sets of tasks. In previous Microsoft technologies such as the Microsoft SSIS and other data flow types of platforms the ability to control execution flow by business conditional logic is an important element of that technology.
The problem…
In ADF V2 Microsoft allows 4 types of exit flow from a normal pipeline task box ; success, failure, completion and skipped. For these specific exit conditions , see details here .
This is fine if you are just looking at a simple flow from one task to another and trapping errors, however you may want to check a conditional operation and output branch according to the result of the conditional check. How can you do that ? A good example would be: does an input file exist and have one branch execute if the file does exist and another if it doesn’t.
Unfortunately ADF V2 as yet does not have anything straight out of the box that can do this at the moment. Ideally it should have a true and false activity paths. I have raised an enhancement feedback request to Microsoft, and if you fancied using one of your precious votes , you can find it here.
The Solution
This is a crude sticking plaster way of achieving this with the IF condition task. You write a SQL stored procedure that when called will generate a know exception. This will force the activity flow along the exception path. You then check either with another IF Condition or Switch what the exception was that was raised. If it is the one that you generated then you can process accordingly, otherwise it is a genuine exception and you process it as such. So to break this down into stages:
- Write a generic SQL Stored Procedure that takes 2 parameters, MessageId and Message
- This Stored Procedure should generate an exception with the Messageid and Message passed in
- Within the Initial IF Condition call this Stored Procedure to implement the branching. In the example above. It could be Messageid =50001 , Message =’No Input file found’
- On the failure path of the first IF Condition, have a second IF Condition that checks for the MessageId , say 50001as in the example.
- If found then all the processing for ‘File not found can be achieved here, within the relavant IF activity (true of false path).
- If it is not 50001 then it is a genuine exception and should be handled accordingly on the other part of the IF Condition.
So this does work though is not ideal. All the logic for the second path has to be contained in either the True or False branch of the second If Condition, with the exception handling with the same constraints.
Personally one of the less positive things about ADF V2 is the way logic and flow boxes are hidden within For Each Loops and Conditional tasks . It makes the visual code harder to follow.
But any way , hope this helps and Stay Safe.