Export Mania 2017 - DataPort

Someone asked a vague question over on the forum about exporting electronic documents.  Since I haven't done a post about exporting yet, I thought this would be a good topic to cover.  The OP wasn't specific about how to go about exporting, so we'll cover all the approaches over the coming posts!

Let's define a few requirements:

  1. There will be a file exported that includes meta-data for the given record(s)
  2. There will be a folder created to contain the electronic documents for the given record(s)
  3. The names of the electronic documents shall be the record numbers and not the record titles

Let's see how to accomplish this with DataPort...


Out-of-the-box DataPort

I can use the out-of-the-box Tab Delimited formatter to craft a DataPort project using the settings below.

2017-10-15_18-40-31.png

In the settings section I can specify where the documents are to be exported.  Then I scroll down to pick my saved search.

2017-10-15_18-42-40.png

Lastly, I pick the fields I want.  I must include the "DOS file" to get the electronic attachments.  So for now I'll include the title and DOS file.  

2017-10-15_18-50-32.png

If I save and execute this project I get a meta-data file like this one...

2017-10-15_18-57-31.png

And a set of electronic documents like these...

2017-10-15_18-55-36.png

The forum post asked how these file names could be in record number format.  The existing format is a combination of the dataset ID, the URI, an underscore, and the suggested file name.  If this is as far as I go then I cannot meet the requirement.  

So purely out-of-the-box is insufficient! :(


Out-of-the-box DataPort with PowerShell

A quick solution could be to rename the exported files directly.  Unfortunately, my current export doesn't include enough meta-data to accomplish the task: neither the file names nor the meta-data file contains the record number.

But if I add "Expanded number" as a field, I can then directly manipulate the results.

2017-10-15_19-02-09.png

Now my meta-data file looks like this...

2017-10-15_19-04-06.png

Now that I have the meta-data I need, I can write a PowerShell script to accomplish my tasks.  The script doesn't actually need to connect to CM at all (though it would probably be best to look up the captions for the relevant fields).

The script does need to do the following though:

  1. Open the meta-data file
  2. Iterate each record in the file
  3. Rename the existing file on disk
  4. Update the resulting row of meta-data to reflect the new name
  5. Save the meta-data file back.

A quick and dirty conversion of these requirements into code yields the following:

$metadataFile = "C:\DataPort\Export\Out-of-the-box Electronic Export\records.txt"
$subFolder = "C:\DataPort\Export\Out-of-the-box Electronic Export\Documents\"
# Load the tab-delimited meta-data file DataPort produced
$metaData = Get-Content -Path $metadataFile | ConvertFrom-Csv -Delimiter "`t"
for ( $i = 0; $i -lt $metaData.Count; $i++ )
{
    $recordNumber = $metaData[$i]."Expanded Number"
    $existingFileName = $metaData[$i]."DOS file"
    $existingFilePath = [System.IO.Path]::Combine($subFolder, $existingFileName)
    # The new name is the record number plus the original file extension
    $newFileName = ($recordNumber + [System.IO.Path]::GetExtension($existingFileName))
    $newFilePath = [System.IO.Path]::Combine($subFolder, $newFileName)
    # Only rename when the source file exists and the target name is free
    if ( ![String]::IsNullOrWhiteSpace($existingFileName) -and (Test-Path $existingFilePath) -and !(Test-Path $newFilePath) )
    {
        Move-Item -Path $existingFilePath -Destination $newFilePath
        $metaData[$i].'DOS file' = $newFileName
    }
}
# Write the corrected meta-data back over the original file
$metaData | ConvertTo-Csv -Delimiter "`t" -NoTypeInformation | Out-File -FilePath $metadataFile

After I run that script on my DataPort results, I get this updated meta-data file...

2017-10-15_19-24-23.png

And my electronic documents are named correctly...

2017-10-15_19-28-25.png

So with a little creative PowerShell I can achieve the result, but it's a two-step process.  I must remember to execute this script after I execute my export.  Granted, I could actually call the export process at the top of the PowerShell script and then not mess with DataPort directly.

This still seems a bit of a hack to get my results.  Maybe creating a new DataPort Export Formatter is easy?


Custom DataPort Export Formatter

The out-of-the-box Tab delimited DataPort formatter works well.  I don't even need to re-create it!  I just need to be creative.

So my first step is to create a new Visual Studio class library that contains one class implementing the IExportDataFormatter interface.

namespace CMRamble.DataPort.Export
{
    public class NumberedFileName : IExportDataFormatter
    {
     
    }
}

If I use the Quick Action to implement the interface, it stubs out all the required members and methods.  The code below shows what it gives me...

public string KwikSelectCaption => throw new NotImplementedException();
public OriginType OriginType => throw new NotImplementedException();

public string Browse(Form parentForm, string searchPrefix, Point suggestedBrowseUILocation, Dictionary<AdditionalDataKeys, DescriptiveData> additionalData)
{
    throw new NotImplementedException();
}
public void Dispose()
{
    throw new NotImplementedException();
}
public void EndExport(Dictionary<AdditionalDataKeys, DescriptiveData> additionalData)
{
    throw new NotImplementedException();
}
public void ExportCompleted(ProcessStatistics stats, Dictionary<AdditionalDataKeys, DescriptiveData> additionalData)
{
    throw new NotImplementedException();
}
public void ExportNextItem(List<ExportItem> items, Dictionary<AdditionalDataKeys, DescriptiveData> additionalData)
{
    throw new NotImplementedException();
}
public string GetFormatterInfo(Dictionary<AdditionalDataKeys, DescriptiveData> additionalData)
{
    throw new NotImplementedException();
}
public void StartExport(string exportPath, bool overWriteIfExists, DataPortConfig.SupportedBaseObjectTypes objectType, string TRIMVersionInfo, string[] headerCaptions)
{
    throw new NotImplementedException();
}
public string Validate(Form parentForm, string connectionStringToValidate, Dictionary<AdditionalDataKeys, DescriptiveData> additionalData)
{
    throw new NotImplementedException();
}

Now I don't know how the out-of-the-box tab formatter works, and to be honest I don't care how it does what it does.  I just want to leverage it!  So I'm going to create a static, read-only variable holding an instance of the out-of-the-box formatter, then delegate all of these members and methods to it.  My previous code now looks like this...

private static readonly ExportDataFormatterTab tab = new ExportDataFormatterTab();
public string KwikSelectCaption => tab.KwikSelectCaption;

public OriginType OriginType => tab.OriginType;

public string Browse(Form parentForm, string searchPrefix, Point suggestedBrowseUILocation, Dictionary<AdditionalDataKeys, DescriptiveData> additionalData)
{
    return tab.Browse(parentForm, searchPrefix, suggestedBrowseUILocation, additionalData);
}

public void Dispose()
{
    tab.Dispose();
}

public void EndExport(Dictionary<AdditionalDataKeys, DescriptiveData> additionalData)
{
    tab.EndExport(additionalData);
}

public void ExportCompleted(ProcessStatistics stats, Dictionary<AdditionalDataKeys, DescriptiveData> additionalData)
{
    tab.ExportCompleted(stats, additionalData);
}

public void ExportNextItem(List<ExportItem> items, Dictionary<AdditionalDataKeys, DescriptiveData> additionalData)
{
    tab.ExportNextItem(items, additionalData);
}

public string GetFormatterInfo(Dictionary<AdditionalDataKeys, DescriptiveData> additionalData)
{
    return tab.GetFormatterInfo(additionalData);
}

public void StartExport(string exportPath, bool overWriteIfExists, DataPortConfig.SupportedBaseObjectTypes objectType, string TRIMVersionInfo, string[] headerCaptions)
{
    tab.StartExport(exportPath, overWriteIfExists, objectType, TRIMVersionInfo, headerCaptions);
}

public string Validate(Form parentForm, string connectionStringToValidate, Dictionary<AdditionalDataKeys, DescriptiveData> additionalData)
{
    return tab.Validate(parentForm, connectionStringToValidate, additionalData);
}

If I compile this and register it within DataPort, I have a new Export DataFormatter that behaves just like the out-of-the-box formatter (trust me, I tested it).  Now what I need to do is to add the logic that renames my files and the corresponding meta-data.
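Getting the compiled DLL somewhere DataPort can browse to is scriptable too.  Here's a minimal sketch, assuming a default Content Manager install path (both paths below are assumptions; adjust for your environment):

# Copy the compiled formatter next to DataPort so it can be browsed to
# during registration.  Paths are assumptions, not gospel.
$dll = ".\bin\Release\CMRamble.DataPort.Export.dll"
$dataPortDir = "C:\Program Files\Hewlett Packard Enterprise\Content Manager"
Copy-Item -Path $dll -Destination $dataPortDir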

First things first: I need to store some of the information provided in the StartExport method.

private bool correctExportedFileName = false;
private string exportPath;
public void StartExport(string exportPath, bool overWriteIfExists, DataPortConfig.SupportedBaseObjectTypes objectType, string TRIMVersionInfo, string[] headerCaptions)
{
    // DataPort places the attachments in a "Documents" folder beneath the export file
    this.exportPath = $"{Path.GetDirectoryName(exportPath)}\\Documents";
    var captions = headerCaptions.ToList();
    // Only enable renaming when both required fields are part of the export
    var numberField = captions.FirstOrDefault(x => x.Equals(new EnumItem(AllEnumerations.PropertyIds, (int)PropertyIds.AgendaItemExpandedNumber).Caption));
    var fileField = captions.FirstOrDefault(x => x.Equals(new EnumItem(AllEnumerations.PropertyIds, (int)PropertyIds.RecordFilePath).Caption));
    if ( numberField != null && fileField != null )
    {
        correctExportedFileName = true;
    }
    tab.StartExport(exportPath, overWriteIfExists, objectType, TRIMVersionInfo, headerCaptions);
}

Note that in the code above I've had to do a couple of seemingly dodgy things:

  1. I had to hard-code the name of the subfolder because it's not available to me (so weird that I can't access the project details)
  2. I used the AgendaItemExpandedNumber property Id because that's what maps to the record's expanded number (weird, I know)

All that's left to do is to fix the file and meta-data!  I do that in the ExportNextItem method.  That method is invoked each time an object has been extracted.  So that's when I need to do the rename and meta-data correction.  

My method becomes:

public void ExportNextItem(List<ExportItem> items, Dictionary<AdditionalDataKeys, DescriptiveData> additionalData)
{
    if ( correctExportedFileName )
    {
        var numberField = items.FirstOrDefault(x => x.ItemCaption.Equals(new EnumItem(AllEnumerations.PropertyIds, (int)PropertyIds.AgendaItemExpandedNumber).Caption));
        var fileField = items.FirstOrDefault(x => x.ItemCaption.Equals(new EnumItem(AllEnumerations.PropertyIds, (int)PropertyIds.RecordFilePath).Caption));
        if ( numberField != null && fileField != null )
        {
            var originalFileName = Path.Combine(exportPath, fileField.ItemValue);
            if ( File.Exists(originalFileName) )
            {
                // New name is the expanded record number plus the original extension
                var newFileName = $"{numberField.ItemValue}{Path.GetExtension(fileField.ItemValue)}";
                var newFilePath = Path.Combine(exportPath, newFileName);
                // Overwrite any file already occupying the target name
                if ( File.Exists(newFilePath) )
                {
                    File.Delete(newFilePath);
                }
                File.Move(originalFileName, newFilePath);
                // Update the meta-data so it reflects the new file name
                fileField.ItemValue = newFileName;
            }
        }
    }
    tab.ExportNextItem(items, additionalData);
}

Now if I compile and export, my meta-data file looks exactly the same as it did with the previous method (out-of-the-box with PowerShell)!

Feel free to try out my DataPort Formatter here.

Adding Geolocation to my Json Import

My initial import just mapped the facility ID into the expanded record number property and the name into the title.  I can see from the Json response that there is a "latlng" property I can import.  It holds the latitude and longitude as two real numbers separated by a comma.  If I map that to the GPS Location field within Content Manager, I get this error message:

Details: Setting property failed. Item Index: 295, Property Caption: 'GPS Location', Value: 27.769625, -82.767725    Exception message: You have not entered a correct geographic location. Try POINT(-122.15437906 37.443134073) or use the map interface to mark a point.

Funny how DataPort is so demanding with regard to import formats.  Even funnier that it gives you no capability to transform data during "port".  I'll need to add some features to my Json import data formatter: a value prefix, a value suffix, and a geolocation converter for each property.

I'll use the prefix on record numbers moving forward (everyone does that).  I'll possibly use suffixes on record numbers, but more likely on record titles (inserting a name, facility, region, etc., into the title).  I'll use a dodgy static mapping for the geolocation converter (whatever gets my current data into the POINT structure), as sketched below.
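To make the converter concrete, here's a minimal sketch of the transformation in PowerShell (the function name is mine; the real logic lives inside my Json data formatter):

# Convert a "lat, lng" pair into the WKT POINT(lng lat) format the
# GPS Location field expects.  Note that WKT puts longitude first.
function Convert-LatLngToPoint {
    param([string]$latLng)    # e.g. "27.769625, -82.767725"
    $parts = $latLng -split ',' | ForEach-Object { $_.Trim() }
    return "POINT($($parts[1]) $($parts[0]))"
}

Convert-LatLngToPoint "27.769625, -82.767725"    # POINT(-82.767725 27.769625)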

2017-10-13_19-49-48.png

Now when I import from this Json source I'll have additional record number protections and an importable geolocation.  Also notice that I'm exposing two properties to Content Manager: name and title.  Both of these point to the "name1" property of the original source.  Since DataPort only allows you to match one column to one source property, you cannot otherwise re-use an import source property.  In my example I want to push a copy of the original value into a second field, and exposing the property twice gives me exactly the flexibility I need.
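The shape of that idea, in miniature and outside of DataPort (facilities.json and the column names here are purely illustrative):

# Read the Json source and project the same source property ("name1")
# under two different column names, so each can be mapped independently.
$facilities = Get-Content -Path "facilities.json" -Raw | ConvertFrom-Json
$facilities | Select-Object @{ Name = 'Name';  Expression = { $_.name1 } },
                            @{ Name = 'Title'; Expression = { $_.name1 } }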

Getting hyped over Hyper-V

I vividly remember going to a VMware in-person training session in the Melbourne CBD about a decade ago.  Ever since then I've been using their products to manage my personal and company environments and infrastructure.  Just about every customer site uses their products as well.  But it costs money to maintain and support, which I don't like.

I recently stepped through an MCSA study guide, and there was a whole section devoted to Hyper-V (Microsoft's alternative to VMware).  It was entirely new to me.  I'm impressed by how easy it is to set up, use, and manage.  Since it's an integral component of Windows itself, making use of it becomes more compelling every day.

For me the best part is that it's included as a feature of Windows 10 and Windows Server.  That means I can create a VM on my desktop and then migrate it to a development, testing, or production environment.  I don't have to pay for a third-party product to do this.  Considering I rebuild my workstation every 3 months, it's becoming a real blessing.

Combine that with the PowerShell commands I get along with the management console, and I can now script EVERYTHING!

Maybe a script to create a new workgroup server virtual machine?

$envName = 'CMRamble'           #Environment Name
$serverType = 'Workgroup'       #Server Type
$vmRoot = "D:\HyperV"           #Where to put the VMs
$vmName = "$($envName) $($serverType)"   #Name of this VM as appears in HyperV
$vmPath =  "$($vmRoot)\$($envName)"    #Storage Location of VMs in Root Path
$vhdPath = Join-Path $vmPath "$($vmName)\Virtual Hard Disks\$($serverType).vhdx"    #Storage Location of HD for VM
$vSwitch = "$($envName)"        #Virtual Switch Name
$osPath = "D:\Software\en_windows_server_2016_x64_dvd_9718492.iso"   #Storage Location of Operating System
#Find existing or create new VM
$vm = Get-VM -Name $vmName -ErrorAction SilentlyContinue
if ( $null -eq $vm ) {
    New-VM -Name $vmName -Path $vmPath -MemoryStartupBytes 512MB -SwitchName $vSwitch -Generation 2
}
#Create new virtual hard disk if needed
if ( (Test-Path $vhdPath) -eq $false) {
    New-VHD -Path $vhdPath -SizeBytes 60GB -Dynamic 
}
#Attach vhd
Add-VMHardDiskDrive -VMName $vmName -Path $vhdPath
#Get or create DVD drive
$dvd = Get-VMDvdDrive -VMName $vmName
if ( $null -eq $dvd ) {
    $dvd = Add-VMDvdDrive -VMName $vmName
}
#Set DVD to boot first
Set-VMFirmware -VMName $vmName -FirstBootDevice $dvd
#Point the DVD to the OS
if ( (Test-Path $osPath) -eq $true ) {
    Set-VMDvdDrive -VMName $vmName -Path $osPath -ToControllerNumber 0 -ToControllerLocation 1
}
#Dodgy firmware fix that seemed to fix something: drop 'File' entries from the boot order
$firmware = Get-VMFirmware -VMName $vmName
Set-VMFirmware -VMName $vmName -BootOrder ($firmware.BootOrder | Where-Object { $_.BootType -ne 'File' })

Then more scripts that start it, configure the OS, install Content Manager, and copy the software configuration files to it. After running all these scripts the VM is ready to be added within the Enterprise Studio!  Pretty cool to be able to have a new environment fully ready in just a few minutes.
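As a taste of those follow-on scripts, here's a minimal sketch that starts the new VM and waits for the guest heartbeat (assuming integration services are running in the guest):

# Start the VM and wait until the guest heartbeat reports healthy
Start-VM -Name $vmName
while ( (Get-VM -Name $vmName).Heartbeat -notlike 'Ok*' ) {
    Start-Sleep -Seconds 5
}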

Moving the VM to another server is super easy.  I right-click on the VM and select Move, then specify the name of the destination server and where on that server the VM should be placed.  I'm placing this VM on the "D:\" partition, which I happen to know is a massive SSD (that makes a big difference with Hyper-V).
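For what it's worth, the same move can be scripted with the Hyper-V module (the server name and destination path below are just examples):

# Move the VM and its storage to another host's D: drive
Move-VM -Name $vmName -DestinationHost 'HV02' -IncludeStorage `
    -DestinationStoragePath 'D:\HyperV\CMRamble Workgroup'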

I didn't time it with a stopwatch, but it seems to have taken 10 minutes or so for this 20 GB VM.  The best part is that my Content Manager virtual machine had no down-time.

 
Moving a VM from one server to another is quite easy
You may get the error message below when you attempt to kick off a move.  If you do, you need to initiate the move on the source server directly (and not by connecting the Hyper-V console to another server).

2017-10-13_10-31-07.png

Adding Hyper-V to your workstation

First I right-click on the Start menu and select Apps and Features

2017-10-13_10-56-43.png

Then I click the Programs and Features link in the Related settings section of the right-hand pane.

 
2017-10-13_10-57-42.png
 

Then I click Turn Windows features on or off

2017-10-13_10-59-39.png

Then I find and enable both Hyper-V features

2017-10-13_10-53-39.png

I let it finish and restart my system

2017-10-13_10-56-15.png
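If clicking around isn't your thing, an elevated PowerShell prompt gets the same result in one line:

# Enable all the Hyper-V features (requires elevation and a restart)
Enable-WindowsOptionalFeature -Online -FeatureName Microsoft-Hyper-V-All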