Complex JSON Transformation with Jolt - Hortonworks
Today we don't have a good way to perform JSON manipulation in NiFi. The kind when the output is, again, a JSON. Typical transforms:
- Field renaming
- Enrichment, default fields for sparse incoming JSON
- Transposing, map->list, list->map, etc.
- Obfuscating sensitive fields
Some functionality can be achieved with a ReplaceText processor, but there are major issues:
- It operates on a text string, not structured
- Replace can backfire when there is a regex match in an unexpected location
Proposed Solution
Create a dedicated JSON Transform processor. While doing my research I locked in on Jolt: http://bazaarvoice.github.io/jolt/
- Java-based implementation. There are myriads JSON transform libraries, but most of them are JavaScript or even browser-focused only
- Alternatives like a JSON serializer for JDK's XSLT parsers might work, but are usually way too much trouble than they are worth. XSLT files aren't the most user friendly bits either
- Jolt transform spec is, in turn, a JSON
- Any complex transformation logic which can't be expressed in standard terms can be plugged in via a Java extension class with Jolt
- There is an online interactive design tool, which helps with the 'no UI' aspect: http://jolt-demo.appspot.com/
Examples
Below is an example transformation I needed in one of the flows (would like to substitute a ReplaceText with this new transformer eventually). The use case - rename one of the fields in the incoming JSON to bring it to a common data format which streams into a central location. Much more complicated transformations are, of course, possible, and are listed in the Jolt online demo app (link above).
Read full article from Complex JSON Transformation with Jolt - Hortonworks
AWS big data consultant should understand the need of Data, and they should work to build more appropriate services to meet the requirements of their clients.
ReplyDelete