CSV to CSV data transformation using Interoperability
Hi folks!
Have a question for those who are masters of interoperability.
I have a basic task: one CSV file with some data. I need to transform one column of the initial dataset and produce a new CSV with the same structure.
What's the best approach with Interoperability?
Should I use a record map?
Should I use streams or objects?
What is the best practice?
I would not create a record map unless you want to make sure the data is in the correct format. If you already have record maps defined, I see no issue using them.
I don't plan to use record maps at all. The idea is to use a DTL for every record.
I created a BPL that reads one line at a time. I tried updating the data line in code without a DTL, but I get an error instead of the output file. Please take a look at the Git repo here (just start the production and look for messages from the File Passthrough Service):
https://github.com/oliverwilms/interoperability-update-datafile
If you need to process the entire file with no line filtering, I would use a pass-through file service (EnsLib.File.PassthroughService) to send an instance of a stream container (Ens.StreamContainer) to either a message router (EnsLib.MsgRouter.RoutingEngine) or a custom process (BPL or code). That process uses a transform (a class extending Ens.DataTransform) to turn the source stream container into a target stream container and sends it to the file pass-through operation (EnsLib.File.PassthroughOperation) for output.
I would use a custom process rather than a message router if the transform needs data sources other than the input file (e.g. a response from another process or operation). The transform can pick a suitable target stream class (extending %Stream.Object) to hold in the Ens.StreamContainer, depending on where you want to store the data (database vs. file system, …).
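A minimal sketch of such a stream-to-stream transform, under these assumptions: the class name Demo.CSV.StreamColumnTransform is made up, the "transformation" is simply upper-casing column 2, and the CSV has no quoted or embedded commas. Adapt the column logic to your real requirement.

```
Class Demo.CSV.StreamColumnTransform Extends Ens.DataTransform
{

/// Read the source CSV stream line by line, upper-case the second
/// comma-separated column (placeholder for the real column change),
/// and return a new Ens.StreamContainer wrapping the result.
ClassMethod Transform(source As Ens.StreamContainer, ByRef target As Ens.StreamContainer, ByRef aux) As %Status
{
    Set tSC = $$$OK
    Try {
        Set tOut = ##class(%Stream.GlobalCharacter).%New()
        Do source.Stream.Rewind()
        While 'source.Stream.AtEnd {
            Set tLine = source.Stream.ReadLine()
            Continue:tLine=""
            // naive CSV handling: assumes no quoted/embedded commas
            Set $Piece(tLine, ",", 2) = $ZConvert($Piece(tLine, ",", 2), "U")
            Do tOut.WriteLine(tLine)
        }
        // wrap the output stream in a new container
        Set target = ##class(Ens.StreamContainer).%New(tOut)
        // carry the original file name through, so '%f' can be used
        // in the pass-through operation's Filename setting
        Set target.OriginalFilename = source.OriginalFilename
    }
    Catch ex {
        Set tSC = ex.AsStatus()
    }
    Quit tSC
}

}
```

In the production, the pass-through service targets the router or custom process, which invokes a transform like this and forwards the resulting container to EnsLib.File.PassthroughOperation.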
HTH
@Robert Barbiaux, this is very cool!
In fact, the purpose of what I plan to do is to present the idea of data transformation to newcomers in the simplest possible manner.
I want every line to be a message containing data that will be transformed via a rule.
I understand that in real-life interoperability cases one message would usually correspond to one file/stream, but the purpose here is to explain how the engine works.
For a simple message transformation flow example, I would go with a record map:
That way you can focus on the DTL, and the whole flow can be built from the Management Portal, look ma, no code ;-)
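For reference, the DTL class that the portal's DTL editor generates behind the scenes for such a record-to-record transform looks roughly like the sketch below. The names are assumptions: Demo.CSV.Record.Record stands in for whatever record class your record map generates, and Name/City for its fields; the only real "transformation" here is upper-casing City. Note the %Source copy, which keeps the original file name on the record (this matters later in the thread).

```
Class Demo.CSV.RecordDTL Extends Ens.DataTransformDTL [ DependsOn = Demo.CSV.Record.Record ]
{

Parameter IGNOREMISSINGSOURCE = 1;

XData DTL [ XMLNamespace = "http://www.intersystems.com/dtl" ]
{
<transform sourceClass='Demo.CSV.Record.Record' targetClass='Demo.CSV.Record.Record' create='new' language='objectscript' >
<!-- copy the untouched column(s) -->
<assign value='source.Name' property='target.Name' action='set' />
<!-- the actual column transformation: here, upper-casing City -->
<assign value='$ZCVT(source.City,"U")' property='target.City' action='set' />
<!-- copy %Source so the record keeps the original file name (used by '%f') -->
<assign value='source.%Source' property='target.%Source' action='set' />
</transform>
}

}
```

Typically this sits between a record map file service (EnsLib.RecordMap.Service.FileService) and a record map file operation (EnsLib.RecordMap.Operation.FileOperation), invoked from a routing rule or a simple process.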
Indeed, it works! Thank you, Robert!
One issue though: as a result of the operation I get one new file per message/record from the source file. Is there any way to ask the production to put the same number of records into the output file as there were in the initial file?
If you set "Filename" to '%f', the output file name will be the same as the input file name and records from one input file will be appended to an output file with the same name.
Thank you, Robert!
This could work, but for some reason '%f' doesn't work with the record mapper:
I'm getting a <NOTOPEN> error if the setting is only '%f',
and if I use the file operation's default setting of '%f_%Q%!+(_a)', I get a file name that starts with an '_' character and looks like:
_2023-01-22_13.10.49.784
Maybe there is a way to update this setting on the fly somehow, e.g. with a callback?
Never mind.
It turned out I didn't copy %Source to %Source in the transformation, so there was no file name in the result file.
The only question left: how do I manage the header line in such a production, if that is possible?
For simple headers (and footers), you can use the 'batch class' feature of the record map batch file service (EnsLib.RecordMap.Service.BatchFileService) and operation (EnsLib.RecordMap.Operation.BatchFileOperation), together with a class such as EnsLib.RecordMap.SimpleBatch to specify a header string.
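A rough sketch of such a batch class, assuming the header is a fixed string. The parameter names below (BatchHeader / BatchTrailer) are my assumption about how EnsLib.RecordMap.SimpleBatch exposes them, so please verify the exact names in the class reference for your version before relying on this.

```
Class Demo.CSV.Batch Extends EnsLib.RecordMap.SimpleBatch
{

// Assumption: SimpleBatch exposes the header/trailer text as class
// parameters; check the class reference for the exact parameter names.
Parameter BatchHeader = "Name,City";

Parameter BatchTrailer = "";

}
```

You would then point the Batch Class setting of the batch file service and operation at this class, and the record map itself at your record class, so the header line is consumed on input and written back on output.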