CSV to CSV data transformation using Interoperability
Hi folks!
Have a question for those who are masters of interoperability.
I have a basic task: one CSV file with some data. I need to transform one column of the initial dataset and produce a new CSV with the same structure.
What's the best approach with Interoperability?
Should I use a record map?
Should I use streams or objects?
What is the best practice?
I would not create a record map unless you want to make sure the data is in the correct format. If you already have record maps defined, I see no issue using them.
I don't plan to use record maps at all. The idea is to use a DTL for every record.
I created a BPL that reads one line at a time. I tried updating the data line in code without a DTL, but I get an error instead of the output file. Please take a look at the Git repo here (just start the production and look for messages from the File Passthrough Service):
https://github.com/oliverwilms/interoperability-update-datafile
If you need to process the entire file with no line filtering, I would use a pass-through file service (EnsLib.File.PassthroughService) to send an instance of a stream container (Ens.StreamContainer) to either a message router (EnsLib.MsgRouter.RoutingEngine) or a custom process (BPL or code). That process uses a transform (a class extending Ens.DataTransform) to turn the source stream container into a target stream container and sends it to the file pass-through operation (EnsLib.File.PassthroughOperation) for output.
I would use a custom process rather than a message router if the transform needs data sources other than the input file (e.g. a response from another process or operation). The transform can pick a suitable target stream class (extending %Stream.Object) to hold in the Ens.StreamContainer, depending on where you want to store the data (database vs. file system, …).
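A minimal sketch of such a stream-to-stream transform, under these assumptions: the class name Demo.CSV.StreamColumnTransform is made up, the "transformation" is simply upper-casing column 2, and the CSV has no quoted or embedded commas. Adapt the column logic to your real requirement.

```
Class Demo.CSV.StreamColumnTransform Extends Ens.DataTransform
{

/// Read the source CSV stream line by line, upper-case the second
/// comma-separated column (placeholder for the real column change),
/// and return a new Ens.StreamContainer wrapping the result.
ClassMethod Transform(source As Ens.StreamContainer, ByRef target As Ens.StreamContainer, ByRef aux) As %Status
{
    Set tSC = $$$OK
    Try {
        Set tOut = ##class(%Stream.GlobalCharacter).%New()
        Do source.Stream.Rewind()
        While 'source.Stream.AtEnd {
            Set tLine = source.Stream.ReadLine()
            Continue:tLine=""
            // naive CSV handling: assumes no quoted/embedded commas
            Set $Piece(tLine, ",", 2) = $ZConvert($Piece(tLine, ",", 2), "U")
            Do tOut.WriteLine(tLine)
        }
        // wrap the output stream in a new container
        Set target = ##class(Ens.StreamContainer).%New(tOut)
        // carry the original file name through, so '%f' can be used
        // in the pass-through operation's Filename setting
        Set target.OriginalFilename = source.OriginalFilename
    }
    Catch ex {
        Set tSC = ex.AsStatus()
    }
    Quit tSC
}

}
```

In the production, the pass-through service targets the router or custom process, which invokes a transform like this and forwards the resulting container to EnsLib.File.PassthroughOperation.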
HTH
@Robert Barbiaux, this is very cool!
In fact, the purpose of what I plan to do is to present the idea of data transformation to newcomers in the simplest possible manner.
I want every line to be a message containing data that will be transformed via a rule.
I understand that in real-life interoperability cases one message would usually correspond to one file/stream, but the purpose here is to explain how the engine works.
For a simple message transformation flow example, I would go with a record map:
That way you can focus on the DTL, and the whole flow can be built from the Management Portal, look ma, no code ;-)
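For reference, the DTL class that the portal's DTL editor generates behind the scenes for such a record-to-record transform looks roughly like the sketch below. The names are assumptions: Demo.CSV.Record.Record stands in for whatever record class your record map generates, and Name/City for its fields; the only real "transformation" here is upper-casing City. Note the %Source copy, which keeps the original file name on the record (this matters later in the thread).

```
Class Demo.CSV.RecordDTL Extends Ens.DataTransformDTL [ DependsOn = Demo.CSV.Record.Record ]
{

Parameter IGNOREMISSINGSOURCE = 1;

XData DTL [ XMLNamespace = "http://www.intersystems.com/dtl" ]
{
<transform sourceClass='Demo.CSV.Record.Record' targetClass='Demo.CSV.Record.Record' create='new' language='objectscript' >
<!-- copy the untouched column(s) -->
<assign value='source.Name' property='target.Name' action='set' />
<!-- the actual column transformation: here, upper-casing City -->
<assign value='$ZCVT(source.City,"U")' property='target.City' action='set' />
<!-- copy %Source so the record keeps the original file name (used by '%f') -->
<assign value='source.%Source' property='target.%Source' action='set' />
</transform>
}

}
```

Typically this sits between a record map file service (EnsLib.RecordMap.Service.FileService) and a record map file operation (EnsLib.RecordMap.Operation.FileOperation), invoked from a routing rule or a simple process.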
Indeed, it works! Thank you, Robert!
One issue though: as a result of the operation I get one new file per message/record from the source file. Is there any way to ask the production to put the same number of records into the output file as there were in the initial file?
If you set "Filename" to '%f', the output file name will be the same as the input file name and records from one input file will be appended to an output file with the same name.
Thank you, Robert!
This could work, but for some reason '%f' doesn't work with the record mapper:
I'm getting a <NOTOPEN> error if the setting is only '%f',
and if I use the file operation's default setting of '%f_%Q%!+(_a)', I get a file name that starts with an '_' character and looks like:
_2023-01-22_13.10.49.784
Maybe there is a way to update this setting on the fly somehow, e.g. with a callback?
Never mind.
It turned out I didn't copy %Source to %Source in the transformation, so there was no file name in the result file.
The only question left: how do I manage the header line in such a production, if that is possible?
For simple headers (and footers), you can use the 'batch class' feature of the record map batch file service (EnsLib.RecordMap.Service.BatchFileService) and operation (EnsLib.RecordMap.Operation.BatchFileOperation), together with a class such as EnsLib.RecordMap.SimpleBatch to specify a header string.
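A rough sketch of such a batch class, assuming the header is a fixed string. The parameter names below (BatchHeader / BatchTrailer) are my assumption about how EnsLib.RecordMap.SimpleBatch exposes them, so please verify the exact names in the class reference for your version before relying on this.

```
Class Demo.CSV.Batch Extends EnsLib.RecordMap.SimpleBatch
{

// Assumption: SimpleBatch exposes the header/trailer text as class
// parameters; check the class reference for the exact parameter names.
Parameter BatchHeader = "Name,City";

Parameter BatchTrailer = "";

}
```

You would then point the Batch Class setting of the batch file service and operation at this class, and the record map itself at your record class, so the header line is consumed on input and written back on output.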