Export Mania 2017 - DataPort

Someone asked a vague question over on the forum about exporting electronic documents.  Since I haven't done a post about exporting yet, I thought this would be a good topic to cover.  The OP wasn't specific about how to go about exporting, so we'll cover all of the approaches over the coming posts!

Let's define a few requirements:

  1. There will be a file exported that includes meta-data for the given record(s)
  2. There will be a folder created to contain the electronic documents for the given record(s)
  3. The names of the electronic documents shall be the record numbers and not the record titles

Let's see how to accomplish this with DataPort...


Out-of-the-box DataPort

I can use the out-of-the-box Tab Delimited formatter to craft a DataPort project using the settings below.

2017-10-15_18-40-31.png

In the settings section I can specify where the documents are to be exported.  Then I scroll down to pick my saved search.

2017-10-15_18-42-40.png

Lastly, I pick the fields I want.  I must include the "DOS file" to get the electronic attachments.  So for now I'll include the title and DOS file.  

2017-10-15_18-50-32.png

If I save and execute this project I get a meta-data file like this one...

2017-10-15_18-57-31.png

And a set of electronic documents like these...

2017-10-15_18-55-36.png

The forum post asked how these file names could be in record number format.  The existing format is a combination of the dataset ID, the URI, an underscore, and the suggested file name.  For example, a document with URI 9 in dataset "CM" and a suggested file name of "Report.docx" would come out as something like "CM9_Report.docx".  If this is as far as I go then I cannot meet the requirement.

So purely out-of-the-box is insufficient! :(


Out-of-the-box DataPort with Powershell

I think a quick solution could be to directly rename the exported files.  Unfortunately, my current export lacks enough meta-data to accomplish the task.  The file name doesn't include the record number, and neither does the meta-data file.

But if I add "Expanded Number" as a field, I can then directly manipulate the results.

2017-10-15_19-02-09.png

Now my meta-data file looks like this...

2017-10-15_19-04-06.png

Now that I have the meta-data I need, I can write a powershell script to accomplish my tasks.  The script doesn't actually need to connect into CM at all (though it's probably best if I looked up the captions for the relevant fields).

The script does need to do the following though:

  1. Open the meta-data file
  2. Iterate each record in the file
  3. Rename the existing file on disk
  4. Update the resulting row of meta-data to reflect the new name
  5. Save the meta-data file back.

A quick and dirty conversion of these requirements into code yields the following:

# Load the tab-delimited meta-data file that DataPort produced
$metadataFile = "C:\DataPort\Export\Out-of-the-box Electronic Export\records.txt"
$subFolder = "C:\DataPort\Export\Out-of-the-box Electronic Export\Documents\"
$metaData = Get-Content -Path $metadataFile | ConvertFrom-Csv -Delimiter "`t"
for ( $i = 0; $i -lt $metaData.Length; $i++ ) 
{
    $recordNumber = $metaData[$i]."Expanded Number"
    $existingFileName = $metaData[$i]."DOS file"
    $existingFilePath = [System.IO.Path]::Combine($subFolder, $existingFileName)
    # New name is the record number plus the original extension
    $newFileName = ($recordNumber + [System.IO.Path]::GetExtension($existingFileName))
    $newFilePath = [System.IO.Path]::Combine($subFolder, $newFileName)
    if ( ![String]::IsNullOrWhiteSpace($existingFileName) -and (Test-Path $existingFilePath) ) 
    {
        # Overwrite any file already using the new name, then rename
        if ( (Test-Path $newFilePath) ) 
        {
            Remove-Item -Path $newFilePath
        }
        Move-Item -Path $existingFilePath -Destination $newFilePath
        # Update the meta-data row so it references the renamed file
        $metaData[$i].'DOS file' = $newFileName
    }
}
# Write the corrected meta-data back over the original file
$metaData | ConvertTo-Csv -Delimiter "`t" -NoTypeInformation | Out-File -FilePath $metadataFile

After I run that script on my DataPort results, I get this CSV file...

2017-10-15_19-24-23.png

And my electronic documents are named correctly...

2017-10-15_19-28-25.png

So with a little creative powershell I can achieve the result, but it's a two-step process.  I must remember to execute the script after I execute my export.  Granted, I could actually call the export process at the top of the powershell script and then not mess with DataPort directly.

This still seems a bit of a hack to get my results.  Maybe creating a new DataPort Export Formatter is easy?


Custom DataPort Export Formatter

The out-of-the-box Tab delimited DataPort formatter works well.  I don't even need to re-create it!  I just need to be creative.

So my first step is to create a new Visual Studio class library that contains one class based on the IExportDataFormatter interface.

namespace CMRamble.DataPort.Export
{
    public class NumberedFileName : IExportDataFormatter
    {
     
    }
}

If I use the Quick Action to implement the interface, it gives me all my required members and methods.  The code below shows what it gives me...

public string KwikSelectCaption => throw new NotImplementedException();
public OriginType OriginType => throw new NotImplementedException();
 
public string Browse(Form parentForm, string searchPrefix, Point suggestedBrowseUILocation, Dictionary<AdditionalDataKeys, DescriptiveData> additionalData)
{
    throw new NotImplementedException();
}
public void Dispose()
{
    throw new NotImplementedException();
}
public void EndExport(Dictionary<AdditionalDataKeys, DescriptiveData> additionalData)
{
    throw new NotImplementedException();
}
public void ExportCompleted(ProcessStatistics stats, Dictionary<AdditionalDataKeys, DescriptiveData> additionalData)
{
    throw new NotImplementedException();
}
public void ExportNextItem(List<ExportItem> items, Dictionary<AdditionalDataKeys, DescriptiveData> additionalData)
{
    throw new NotImplementedException();
}
public string GetFormatterInfo(Dictionary<AdditionalDataKeys, DescriptiveData> additionalData)
{
    throw new NotImplementedException();
}
public void StartExport(string exportPath, bool overWriteIfExists, DataPortConfig.SupportedBaseObjectTypes objectType, string TRIMVersionInfo, string[] headerCaptions)
{
    throw new NotImplementedException();
}
public string Validate(Form parentForm, string connectionStringToValidate, Dictionary<AdditionalDataKeys, DescriptiveData> additionalData)
{
    throw new NotImplementedException();
}

Now I don't know how the out-of-the-box tab formatter works, and to be honest I don't care how it does what it does.  I just want to leverage it!  So I'm going to create a static, readonly variable to hold an instance of the out-of-the-box formatter.  Then I make all of these members and methods delegate to the out-of-the-box formatter.  That makes my previous code look like this...

private static readonly ExportDataFormatterTab tab = new ExportDataFormatterTab();
public string KwikSelectCaption => tab.KwikSelectCaption;
 
public OriginType OriginType => tab.OriginType;
 
public string Browse(Form parentForm, string searchPrefix, Point suggestedBrowseUILocation, Dictionary<AdditionalDataKeys, DescriptiveData> additionalData)
{
    return tab.Browse(parentForm, searchPrefix, suggestedBrowseUILocation, additionalData);
}
 
public void Dispose()
{
    tab.Dispose();
}
 
public void EndExport(Dictionary<AdditionalDataKeys, DescriptiveData> additionalData)
{
    tab.EndExport(additionalData);
}
 
public void ExportCompleted(ProcessStatistics stats, Dictionary<AdditionalDataKeys, DescriptiveData> additionalData)
{
    tab.ExportCompleted(stats, additionalData);
}
 
public void ExportNextItem(List<ExportItem> items, Dictionary<AdditionalDataKeys, DescriptiveData> additionalData)
{
    tab.ExportNextItem(items, additionalData);
}
 
public string GetFormatterInfo(Dictionary<AdditionalDataKeys, DescriptiveData> additionalData)
{
    return tab.GetFormatterInfo(additionalData);
}
 
public void StartExport(string exportPath, bool overWriteIfExists, DataPortConfig.SupportedBaseObjectTypes objectType, string TRIMVersionInfo, string[] headerCaptions)
{
    tab.StartExport(exportPath, overWriteIfExists, objectType, TRIMVersionInfo, headerCaptions);
}
 
public string Validate(Form parentForm, string connectionStringToValidate, Dictionary<AdditionalDataKeys, DescriptiveData> additionalData)
{
    return tab.Validate(parentForm, connectionStringToValidate, additionalData);
}

If I compile this and register it within DataPort, I have a new Export DataFormatter that behaves just like the out-of-the-box formatter (trust me, I tested it).  Now what I need to do is to add the logic that renames my files and the corresponding meta-data.

First things first: I need to store some of the information provided in the StartExport method.

private bool correctExportedFileName = false;
private string exportPath;
public void StartExport(string exportPath, bool overWriteIfExists, DataPortConfig.SupportedBaseObjectTypes objectType, string TRIMVersionInfo, string[] headerCaptions)
{
    // DataPort drops electronic documents into a "Documents" subfolder beside the meta-data file
    this.exportPath = $"{Path.GetDirectoryName(exportPath)}\\Documents";
    // Only attempt the rename when both the expanded number and DOS file columns are being exported
    var captions = headerCaptions.ToList();
    var numberField = captions.FirstOrDefault(x => x.Equals(new EnumItem(AllEnumerations.PropertyIds, (int)PropertyIds.AgendaItemExpandedNumber).Caption));
    var fileField = captions.FirstOrDefault(x => x.Equals(new EnumItem(AllEnumerations.PropertyIds, (int)PropertyIds.RecordFilePath).Caption));
    if ( numberField != null && fileField != null )
    {
        correctExportedFileName = true;
    }
    tab.StartExport(exportPath, overWriteIfExists, objectType, TRIMVersionInfo, headerCaptions);
}

Note that in the code above I've had to do a couple of seemingly dodgy things:

  1. I had to hard-code the name of the subfolder because it's not available to me (so weird that I can't access the project details)
  2. I used the AgendaItemExpandedNumber property Id because that's what maps to the record's expanded number (weird, I know; the quick check below confirms the mapping)
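
If you want to convince yourself of that second point, a throwaway check using the same SDK call as the formatter will print the column caption DataPort produces for that property.  This is just a sketch; depending on how the SDK is hosted you may need to initialize TrimApplication before the caption resolves.

// Throwaway check: print the column caption for AgendaItemExpandedNumber.
// It should come out as "Expanded Number", matching the meta-data header
// seen in the powershell approach above.
var numberCaption = new EnumItem(AllEnumerations.PropertyIds,
                                 (int)PropertyIds.AgendaItemExpandedNumber).Caption;
Console.WriteLine(numberCaption);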

All that's left to do is to fix the file and meta-data!  I do that in the ExportNextItem method.  That method is invoked each time an object has been extracted.  So that's when I need to do the rename and meta-data correction.  

My method becomes:

public void ExportNextItem(List<ExportItem> items, Dictionary<AdditionalDataKeys, DescriptiveData> additionalData)
{
    if ( correctExportedFileName )
    {
        // Find the expanded number and DOS file values for this item
        var numberField = items.FirstOrDefault(x => x.ItemCaption.Equals(new EnumItem(AllEnumerations.PropertyIds, (int)PropertyIds.AgendaItemExpandedNumber).Caption));
        var fileField = items.FirstOrDefault(x => x.ItemCaption.Equals(new EnumItem(AllEnumerations.PropertyIds, (int)PropertyIds.RecordFilePath).Caption));
        if ( numberField != null && fileField != null )
        {
            var originalFileName = Path.Combine(exportPath, fileField.ItemValue);
            if ( File.Exists(originalFileName) )
            {
                // New name is the record number plus the original extension
                var newFileName = $"{numberField.ItemValue}{Path.GetExtension(fileField.ItemValue)}";
                var newFilePath = Path.Combine(exportPath, newFileName);
                if ( File.Exists(newFilePath) )
                {
                    File.Delete(newFilePath);
                }
                File.Move(originalFileName, newFilePath);
                // Update the meta-data so it references the renamed file
                fileField.ItemValue = newFileName;
            }
        }
    }
    tab.ExportNextItem(items, additionalData);
}

Now if I compile and export, my meta-data file looks exactly the same as it did with the previous method (out-of-the-box with powershell)!

Feel free to try out my DataPort Formatter here.

Adding Geolocation to my Json Import

My initial import just mapped the facility ID into the expanded record number property and the name into the title.  I can see from the Json response that there is a "latlng" property I can import.  It has two real numbers separated by a comma.  If I map that to the GPS Location field within Content Manager, I get this error message:

Details: Setting property failed. Item Index: 295, Property Caption: 'GPS Location', Value: 27.769625, -82.767725    Exception message: You have not entered a correct geographic location. Try POINT(-122.15437906 37.443134073) or use the map interface to mark a point.

Funny how DataPort is so demanding with regards to import formats.  Even funnier that it gives you no capability to transform data during the "port".  I'll need to add some features to my Json import data formatter: a value prefix, a value suffix, and a geolocation converter for each property.

I'll use the prefix on record numbers moving forward (everyone does that).  I'll use suffixes possibly on record numbers, but more likely on record titles (inserting a name, facility, region, etc., into the title).  I'll use a dodgy static mapping for the geolocation converter (whatever gets my current data into the POINT structure), as sketched below.
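
To be clear about what that converter has to do, here's a minimal sketch of the conversion, assuming the source value is always "latitude, longitude".  The helper name is mine for illustration and isn't part of the actual formatter.  The key detail is that POINT wants longitude first, then latitude, which is exactly why the raw latlng value was rejected.

using System.Globalization;
 
// Hypothetical helper: convert the Json "latlng" value ("latitude, longitude")
// into the WKT POINT string Content Manager expects.
public static string ToWktPoint(string latLng)
{
    var parts = latLng.Split(',');
    if (parts.Length != 2
        || !double.TryParse(parts[0].Trim(), NumberStyles.Float, CultureInfo.InvariantCulture, out var latitude)
        || !double.TryParse(parts[1].Trim(), NumberStyles.Float, CultureInfo.InvariantCulture, out var longitude))
    {
        return latLng;  // leave anything unrecognised untouched
    }
    // POINT takes longitude first, then latitude
    return string.Format(CultureInfo.InvariantCulture, "POINT({0} {1})", longitude, latitude);
}

With the value from the error message above, ToWktPoint("27.769625, -82.767725") yields "POINT(-82.767725 27.769625)", which is the shape Content Manager is asking for.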

2017-10-13_19-49-48.png

Now when I import from this json source I'll have additional record number protections and an importable geolocation.  Also notice that I'm exposing two properties to Content Manager: name and title.  Both of these point to the "name1" property of the original source.  Since DataPort only allows you to match one column to one source, you cannot re-use an import source property.  In my example I want to push a copy of the original value into a second additional field.  Having this flexibility gives me just what I need.

Using Xsd to Dynamically change DataPort during install

This post will lead to code which can manipulate the list of available source formats in Content Manager's DataPort.  The code is executed as the last action within an installer project for a new data provider.  It loads an Xml document representing the user's list of sources and then either adds to the list or updates existing entries.  During uninstall it removes only the newly installed source.

What I needed to accomplish this:

  • Microsoft Visual Studio 2017
  • .Net Framework 4.5.2
  • WiX Toolset v3
  • HPE Content Manager 9.1

The user's list of available DataPort sources is located off the roaming user profile.  Within it exists one node typed "ArrayOfDataFormatterDefinition".  That, in turn, contains one or more DataFormatterDefinition children.

2017-09-24_6-30-50.png

The goals are to add, update, or remove items from this configuration file.


First I launched the Developer Command Prompt for Visual Studio 2017 so that I could use xsd.exe to mirror someone else's model within my own project.

2017-09-24_4-51-58.png

I navigated into the roaming application data directory for DataPort preferences.  Executing XSD from within that directory will make it easier to organize my results.

2017-09-24_4-56-36.png

We can only use XSD on files ending with an ".xml" extension, which the developers have curiously not used.  Since I also don't want to somehow mess up my own copy, I might as well go ahead and copy what I've got to a file name XSD will accept.

I did this by executing

copy ImportDataFormatters ImportDataFormatters.xml

Then I execute

xsd ImportDataFormatters.xml

Command Prompt after generating schema definition

Next I execute

xsd ImportDataFormatters.xsd /c

Command Prompt after generating class definition

Next I flipped over to Visual Studio and imported the class file.

2017-09-24_5-44-17.png
2017-09-24_6-13-08.png

The file name doesn't match the generated class names, but that doesn't matter.  What I really want are the properties of the second generated class.  These are the things I want to change for the user.

2017-09-24_6-15-42.png

In my custom action for this installer I can now serialize and deserialize using the code below.

private static void SaveImportFormattersPreferenceFile(string preferenceFile, XmlSerializer serializer, ArrayOfDataFormatterDefinition importFormatters)
{
    using (TextWriter writer = new StreamWriter(preferenceFile))
    {
        serializer.Serialize(writer, importFormatters);
        writer.Close();
    }
}
 
private static ArrayOfDataFormatterDefinition LoadImportFormattersPreferenceFile(string preferenceFile, XmlSerializer serializer)
{
    ArrayOfDataFormatterDefinition importFormatters;
    using (StreamReader reader = new StreamReader(preferenceFile))
    {
        importFormatters = (ArrayOfDataFormatterDefinition)serializer.Deserialize(reader);
        reader.Close();
    }
 
    return importFormatters;
}

Next I need the logic to find entries in the list or to create a new one.

XmlSerializer serializer = new XmlSerializer(typeof(ArrayOfDataFormatterDefinition));
ArrayOfDataFormatterDefinition importFormatters = LoadImportFormattersPreferenceFile(preferenceFile, serializer);
// Work against a list copy so a new definition can be appended
List<ArrayOfDataFormatterDefinitionDataFormatterDefinition> items = importFormatters.Items.ToList();
// Find an existing entry for this formatter; create one if it isn't there yet
var item = importFormatters.Items.FirstOrDefault(x => x.ClassName.Equals("CMRamble.DataPort.Acme"));
if (item == null)
{
    item = new ArrayOfDataFormatterDefinitionDataFormatterDefinition();
    items.Add(item);
}
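
The manipulation itself is just a matter of setting the definition's properties before saving.  A minimal sketch is below; only ClassName is confirmed by the code in this post, so the other property names are hypothetical stand-ins for whatever the xsd.exe generated class actually exposes.

// Populate the definition for the new source.  ClassName is the key used to
// find the entry above and to remove it during uninstall.
item.ClassName = "CMRamble.DataPort.Acme";
// Hypothetical examples of other values a formatter definition typically
// carries; check the generated class for the real property names.
// item.Assembly = @"C:\Program Files\CMRamble\CMRamble.DataPort.Acme.dll";
// item.Caption  = "Acme Json Import";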

After I'm done manipulating the item I have in memory, I need to save the changes to disk.

importFormatters.Items = items.ToArray();
SaveImportFormattersPreferenceFile(preferenceFile, serializer, importFormatters);

During my uninstall action I need to basically repeat the process, but this time just remove anything matching my class name.

XmlSerializer serializer = new XmlSerializer(typeof(ArrayOfDataFormatterDefinition));
ArrayOfDataFormatterDefinition importFormatters = LoadImportFormattersPreferenceFile(preferenceFile, serializer);
List<ArrayOfDataFormatterDefinitionDataFormatterDefinition> items = importFormatters.Items.ToList();
importFormatters.Items = items.Where(x => !x.ClassName.Equals("CMRamble.DataPort.Acme")).ToArray();
SaveImportFormattersPreferenceFile(preferenceFile, serializer, importFormatters);

And that's it!  The above code can be wired into any WiX installer as a custom action.
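
For instance, wrapping the add/remove logic as a managed custom action with the WiX DTF library (Microsoft.Deployment.WindowsInstaller) might look roughly like the sketch below.  The method name and the preference file path are assumptions for illustration; point the path at wherever your ImportDataFormatters file actually lives.

using System;
using System.IO;
using Microsoft.Deployment.WindowsInstaller;   // WiX DTF custom action support
 
public class CustomActions
{
    [CustomAction]
    public static ActionResult RegisterAcmeDataPortFormatter(Session session)
    {
        // Hypothetical location of the roaming DataPort preference file;
        // substitute the real path for your Content Manager version.
        var preferenceFile = Path.Combine(
            Environment.GetFolderPath(Environment.SpecialFolder.ApplicationData),
            @"Hewlett Packard Enterprise\Content Manager\DataPort\ImportDataFormatters");
 
        session.Log("Updating DataPort sources in " + preferenceFile);
        // ...run the load / add / save logic shown above...
        return ActionResult.Success;
    }
}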