Transcribing Audio Records with the Cloud Speech API

In this post I'll show how one might approach enriching their public archives with audio transcriptions provided by Google's Cloud Speech API.

First we can start with this collection of records:

Collection of Hearing Records

Simple meta-data for each file

For each of these audio files I'll have to download it, convert it, stage it, pass it to the Speech API, and capture the results. I'll craft this in PowerShell for the moment and later implement it as a cloud function. The results can then be added to the record's notes or attached as an OCR text rendition (either directly or after review).

2018-06-08_20-46-04.png

The speech API will give me chunks of text with an associated confidence level, as shown below:

2018-06-08_20-54-10.png

Which can all be mashed together for a pretty accurate transcription:

2018-06-08_21-17-59.png
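
As a minimal sketch of that mash-up step, the snippet below parses a small hand-written sample in the shape the Speech API returns (a list of results, each with alternatives carrying a transcript and a confidence), joins the first alternative of each chunk, and sets aside chunks under a confidence threshold for review. The sample text, the 0.8 threshold, and the function name Join-SpeechTranscript are illustrative assumptions, not output from the hearing records.

```powershell
# Hand-written sample shaped like a Speech API response (not real output)
$sampleJson = @'
{
  "results": [
    { "alternatives": [ { "transcript": "good morning everyone", "confidence": 0.93 } ] },
    { "alternatives": [ { "transcript": "this hearing is now in session", "confidence": 0.71 } ] }
  ]
}
'@

# Hypothetical helper: joins the top alternative of every chunk
function Join-SpeechTranscript {
    Param($speechResponse, [double]$reviewThreshold = 0.8)
    $chunks = @()
    $needsReview = @()
    foreach ( $result in $speechResponse.results ) {
        # the first alternative is the most likely one
        $best = $result.alternatives[0]
        $chunks += $best.transcript.Trim()
        # remember low-confidence chunks so a person can check them
        if ( $best.confidence -lt $reviewThreshold ) {
            $needsReview += $best.transcript.Trim()
        }
    }
    return [PSCustomObject]@{
        Text        = ($chunks -join ' ')
        NeedsReview = $needsReview
    }
}

$transcript = Join-SpeechTranscript -speechResponse ($sampleJson | ConvertFrom-Json)
$transcript.Text
```

Keeping the low-confidence chunks separate is one way to support the "reviewed and then added" option mentioned earlier; the script later in this post simply writes every alternative to disk.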

Step by Step

My first step is to enable the Speech API within GCP:

 
2018-06-02_1-33-25.png
 

Then create a storage bucket to house the files.  I could skip this and upload it directly within a request, but staging them in a bucket makes it easier for me to later work with cloud functions.  

To convert the audio from mp3 to wav format with a single audio channel I used ffmpeg:

 
ffmpeg -hide_banner -loglevel panic -y -i %inputfile% -ac 1 %outputfile%

I like ffmpeg because it's easy to use on Windows servers & workstations. There's also a good fluent-ffmpeg module available for Node.js, which allows this to be built as a cloud function. For now here's my PowerShell function to convert the audio file...

function ConvertTo-Wav {
	Param([string]$inputFile)
	#few variables for local pathing
	$newFileName = [System.IO.Path]::getfilenamewithoutextension($inputFile) + ".wav"
	$fileDir = [System.IO.Path]::getdirectoryname($inputFile)
	$newFilePath = [System.IO.Path]::combine($fileDir, $newFileName)
	#once is enough
	Write-Debug("ConvertTo-Wav Target File: " + $newFilePath)
	if ( (Test-Path $newFilePath) -eq $false ) {
		#convert using open source ffmpeg
		$convertCmd = "ffmpeg -hide_banner -loglevel panic -y -i `"$inputFile`" -ac 1 `"$newFilePath`""
		Write-Debug ("ConvertTo-Wav Command: " + $convertCmd)
		Invoke-Expression $convertCmd
	}
	return $newFilePath
}
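
A possible variation, sketched here under my own assumptions: instead of building one command string for Invoke-Expression, the ffmpeg arguments can be kept in an array and splatted onto the executable, which sidesteps quoting problems when a path contains spaces or special characters. The function name Get-FfmpegArguments and the sample path are made up for illustration; the ffmpeg flags are the same ones used above.

```powershell
# Hypothetical helper: returns the ffmpeg arguments as an array instead of one string
function Get-FfmpegArguments {
    Param([string]$inputFile)
    # same directory and base name, with a .wav extension
    $outputFile = [System.IO.Path]::ChangeExtension($inputFile, ".wav")
    return @('-hide_banner', '-loglevel', 'panic', '-y', '-i', $inputFile, '-ac', '1', $outputFile)
}

# Made-up path with a space in it, to show that no extra quoting is needed
$ffmpegArgs = Get-FfmpegArguments -inputFile 'C:\temp\speechapi\hearing 01.mp3'
$ffmpegArgs -join ' '

# On a machine with ffmpeg on the PATH, the conversion itself would be:
# & ffmpeg @ffmpegArgs
```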

This function queries Content Manager for the audio records and caches the result to disk (simply to speed up development of the script)...

function Get-CMAudioFilesJson {
	$localResponseFile = [System.IO.Path]::Combine($localTempPath,"audioRecords.json")
	$audioFiles = @()
	if ( (Test-Path $localResponseFile) -eq $false ) {
		#fetch if results not cached to disk
		$searchUri = ($baseCMUri + "/Record?q=container:11532546&format=json&pageSize=100000&properties=Uri,RecordNumber,Url,RecordExtension,RecordDocumentSize")
		Write-Debug "Searching for audio records: $searchUri"
		$response = Invoke-RestMethod -Uri $searchUri -Method Get -ContentType $ApplicationJson
		#flush to disk as raw json
		$response | ConvertTo-Json -Depth 6 | Set-Content $localResponseFile
	} else {
		#load and convert from json
		Write-Debug ("Loading Audio Records from local file: " + $localResponseFile)
		$response = Get-Content -Path $localResponseFile | ConvertFrom-Json
	}
	#if no results, report it and return the empty list (the caller checks for this)
	if ( $null -ne $response.Results ) {
		Write-Debug ("Processing $($response.Results.length) audio records")
		$audioFiles = $response.Results
	} else {
		Write-Error "No audio records returned by the search"
	}
	return $audioFiles
}

Next is a function that submits the record's audio file to the Speech API (and captures the results):

function Get-AudioText 
{
    Param($audioRecord)
    #formulate a valid path for the local file system
    $localFileName = (""+$audioRecord.Uri+"."+$audioRecord.RecordExtension.Value)
    $localPath = ($localTempPath + "\" + $localFileName)
    $sourceAudioFileUri = ($audioRecord.Url.Value + "/File/Document")
    $speechApiResultPath = ($localTempPath + "\" + $audioRecord.Uri + ".txt")  
    $speechTextPath = ($localTempPath + "\" + $audioRecord.Uri + "_text.txt")
    #download the audio file if not already done so
    if ( (Test-Path $localPath) -eq $false ) {  
        Invoke-WebRequest -Uri $sourceAudioFileUri -OutFile $localPath
    }
    #convert file if necessary
    if ( ($audioRecord.RecordExtension.Value.ToLower()) -ne "wav" ) {
        $localPath = ConvertTo-Wav $localPath
        $localFileName = [System.IO.Path]::GetfileName($localPath)
        if ( (Test-Path $localPath) -eq $false ) {
            Write-Error "Error Converting $($localPath)"
            return
        }
    }
 
    #transcribe, if not already done so
    Write-Debug ("Checking Speech API Text: "+$speechApiResultPath)
    if ( (Test-Path $speechApiResultPath) -eq $false ) {
        try {
            $bucketFilePath = "$bucketPath/$localFileName"
            Put-BucketFile -bucketFilePath $bucketFilePath -bucketPath $bucketPath -localPath $localPath
            #invoke speech api
            $speechCmd = "gcloud ml speech recognize-long-running $bucketFilePath --language-code=en-US"
            Write-Debug ("Speech API Command: "+$speechCmd)
            #-OutVariable takes the variable name without the leading $
            Invoke-Expression $speechCmd -OutVariable speechResult | Tee-Object -FilePath $speechApiResultPath
            Write-Debug ("Speech API Result: " + $speechResult)
        } catch {
            Write-Error $_
        }
    }
 
    #process transcription result
    if ( (Test-Path $speechApiResultPath) -eq $true ) {
        Write-Debug ("Reading Speech Results File: " + $speechApiResultPath)
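        #load the raw json captured from the speech api
        $content = Get-Content -Path $speechApiResultPath -Raw | ConvertFrom-Json
        #remove previous consolidated transcription file
        if ( (Test-Path $speechTextPath) -eq $true ) { Remove-Item $speechTextPath -Force }
        #flush each transcript result to disk
        $content.results | ForEach-Object { $_.alternatives | ForEach-Object { Add-Content $speechTextPath ($_.transcript+' ')  }  }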
    } else {
        Write-Debug ("No Speech API Results: " + $speechTextPath)
    }
}

And then some logic to parse the search results and invoke the Speech API for each record:

#fetch the search results
$audioFiles = Get-CMAudioFilesJson
if ( $audioFiles -eq $null ) {
    Write-Error "No audio files found"
    exit
}
#process each
Write-Debug "Found $($audioFiles.Length) audio files"
foreach ( $audioFile in $audioFiles ) {
    Write-Host "Transcribing $($audioFile.RecordNumber.Value)"
    Get-AudioText  -audioRecord $audioFile
}

Here's the complete script:

Clear-Host
$DebugPreference = "Continue"
 
#variables and such
$AllProtocols = [System.Net.SecurityProtocolType]'Ssl3,Tls,Tls11,Tls12'
[System.Net.ServicePointManager]::SecurityProtocol = $AllProtocols
$ApplicationJson = "application/json"
$baseCMUri = "http://efiles.portlandoregon.gov"
$localTempPath = "C:\temp\speechapi"
$bucketPath = "gs://speech-api-cm-dev"
 
#create local staging area
if ( (Test-Path $localTempPath) -eq $false ) {
    New-Item $localTempPath -Type Directory
}
 
function Get-CMAudioFilesJson {
	$localResponseFile = [System.IO.Path]::Combine($localTempPath,"audioRecords.json")
	$audioFiles = @()
	if ( (Test-Path $localResponseFile) -eq $false ) {
		#fetch if results not cached to disk
		$searchUri = ($baseCMUri + "/Record?q=container:11532546&format=json&pageSize=100000&properties=Uri,RecordNumber,Url,RecordExtension,RecordDocumentSize")
		Write-Debug "Searching for audio records: $searchUri"
		$response = Invoke-RestMethod -Uri $searchUri -Method Get -ContentType $ApplicationJson
		#flush to disk as raw json
		$response | ConvertTo-Json -Depth 6 | Set-Content $localResponseFile
	} else {
		#load and convert from json
		Write-Debug ("Loading Audio Records from local file: " + $localResponseFile)
		$response = Get-Content -Path $localResponseFile | ConvertFrom-Json
	}
	#if no results, report it and return the empty list (the caller checks for this)
	if ( $null -ne $response.Results ) {
		Write-Debug ("Processing $($response.Results.length) audio records")
		$audioFiles = $response.Results
	} else {
		Write-Error "No audio records returned by the search"
	}
	return $audioFiles
}
 
function ConvertTo-Wav {
	Param([string]$inputFile)
	#few variables for local pathing
	$newFileName = [System.IO.Path]::getfilenamewithoutextension($inputFile) + ".wav"
	$fileDir = [System.IO.Path]::getdirectoryname($inputFile)
	$newFilePath = [System.IO.Path]::combine($fileDir, $newFileName)
	#once is enough
	Write-Debug("ConvertTo-Wav Target File: " + $newFilePath)
	if ( (Test-Path $newFilePath) -eq $false ) {
		#convert using open source ffmpeg
		$convertCmd = "ffmpeg -hide_banner -loglevel panic -y -i `"$inputFile`" -ac 1 `"$newFilePath`""
		Write-Debug ("ConvertTo-Wav Command: " + $convertCmd)
		Invoke-Expression $convertCmd
	}
	return $newFilePath
}
 
function Put-BucketFile {
	Param($bucketFilePath,$bucketPath,$localPath)
	#upload to bucket
    $checkCommand = "gsutil -q stat $bucketFilePath"
    $checkCommand += ';$?'
    Write-Debug ("GCS file check: " + $checkCommand)
    $fileCheck = Invoke-Expression $checkCommand
    #fileCheck is true if it exists, false otherwise
    if (-not $fileCheck ) {
        Write-Debug ("Uploading to bucket: gsutil cp " + $localPath + " " + $bucketPath)
        gsutil cp $localPath $bucketPath
    }
}
 
function Get-AudioText 
{
    Param($audioRecord)
    #formulate a valid path for the local file system
    $localFileName = (""+$audioRecord.Uri+"."+$audioRecord.RecordExtension.Value)
    $localPath = ($localTempPath + "\" + $localFileName)
    $sourceAudioFileUri = ($audioRecord.Url.Value + "/File/Document")
    $speechApiResultPath = ($localTempPath + "\" + $audioRecord.Uri + ".txt")  
    $speechTextPath = ($localTempPath + "\" + $audioRecord.Uri + "_text.txt")
    #download the audio file if not already done so
    if ( (Test-Path $localPath) -eq $false ) {  
        Invoke-WebRequest -Uri $sourceAudioFileUri -OutFile $localPath
    }
    #convert file if necessary
    if ( ($audioRecord.RecordExtension.Value.ToLower()) -ne "wav" ) {
        $localPath = ConvertTo-Wav $localPath
        $localFileName = [System.IO.Path]::GetfileName($localPath)
        if ( (Test-Path $localPath) -eq $false ) {
            Write-Error "Error Converting $($localPath)"
            return
        }
    }
 
    #transcribe, if not already done so
    Write-Debug ("Checking Speech API Text: "+$speechApiResultPath)
    if ( (Test-Path $speechApiResultPath) -eq $false ) {
        try {
            $bucketFilePath = "$bucketPath/$localFileName"
            Put-BucketFile -bucketFilePath $bucketFilePath -bucketPath $bucketPath -localPath $localPath
            #invoke speech api
            $speechCmd = "gcloud ml speech recognize-long-running $bucketFilePath --language-code=en-US"
            Write-Debug ("Speech API Command: "+$speechCmd)
            #-OutVariable takes the variable name without the leading $
            Invoke-Expression $speechCmd -OutVariable speechResult | Tee-Object -FilePath $speechApiResultPath
            Write-Debug ("Speech API Result: " + $speechResult)
        } catch {
            Write-Error $_
        }
    }
 
    #process transcription result
    if ( (Test-Path $speechApiResultPath) -eq $true ) {
        Write-Debug ("Reading Speech Results File: " + $speechApiResultPath)
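        #load the raw json captured from the speech api
        $content = Get-Content -Path $speechApiResultPath -Raw | ConvertFrom-Json
        #remove previous consolidated transcription file
        if ( (Test-Path $speechTextPath) -eq $true ) { Remove-Item $speechTextPath -Force }
        #flush each transcript result to disk
        $content.results | ForEach-Object { $_.alternatives | ForEach-Object { Add-Content $speechTextPath ($_.transcript+' ')  }  }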
    } else {
        Write-Debug ("No Speech API Results: " + $speechTextPath)
    }
}
 
#fetch the search results
$audioFiles = Get-CMAudioFilesJson
if ( $audioFiles -eq $null ) {
    Write-Error "No audio files found"
    exit
}
#process each
Write-Debug "Found $($audioFiles.Length) audio files"
foreach ( $audioFile in $audioFiles ) {
    Write-Host "Transcribing $($audioFile.RecordNumber.Value)"
    Get-AudioText  -audioRecord $audioFile
}

Subscribing a MicroStrategy Report to be delivered to Content Manager

As users work with MicroStrategy they will create records, right?  In this post I'll detail one approach for routinely getting those records into Content Manager with no user effort....

First create a new file transmitter named Content Manager:

 
2018-05-25_21-05-36.png
 
 
 

Then create a new Content Manager device:

 
Select the new Content Manager transmitter type

 
 
Note the File Location is dynamic, based on the User's ID

 

Then update the user so that they have an address on the new device:

 
Note that user addresses can be automatically attached via a command manager script

 

I also need to ensure that the user can subscribe to a file (the Content Manager device is of type File):

 
 

Now the user can subscribe a report to a file:

 
2018-05-25_21-31-07.png
 

This one will be scheduled to run every day at 5am:

 
2018-05-26_0-20-25.png
 

When the schedule runs the PDF will be placed within a sub-folder unique for this user:

The Recipient ID in the folder name doesn't really help me. The script will need to know the login of the user so that the report can be registered within CM on their behalf. I can look up the login by using the Command Manager, as shown below.

If I execute the Command Manager from within PowerShell I won't get a nice table. Instead I'll get some really ugly output:

2018-05-26_8-44-30.png

I don't want the results file to include everyone. I just need those who have addresses pointing to my Content Manager staging devices. I also want to know the name of the report, and I want the output formatted so that it's easier to work with inside PowerShell. I'll have to use a Command Manager procedure....

DisplayPropertyEnum iProperty = DisplayPropertyEnum.EXPRESSION;
ResultSet oUserProperties = executeCapture("LIST ALL PROPERTIES FOR USERS IN GROUP 'EVERYONE';");
ResultSet oPropertySet = null;
oUserProperties.moveFirst();
while (!oUserProperties.isEof()) {
    String sUserLogin = oUserProperties.getFieldValueString(DisplayPropertyEnum.LOGIN);
    String sID = oUserProperties.getFieldValueString(DisplayPropertyEnum.ID);
    String sUserName = oUserProperties.getFieldValueString(DisplayPropertyEnum.FULL_NAME);
    ResultSet oUserAddresses = (ResultSet)oUserProperties.getFieldValue(DisplayPropertyEnum.DS_ADDRESSES_RESULTSET);
    oUserAddresses.moveFirst();
    while (!oUserAddresses.isEof()) {
        String sAddressName = oUserAddresses.getFieldValueString(DisplayPropertyEnum.DS_ADDRESS_NAME);
        if (sAddressName.contains("Content Manager")) {
            ResultSet oProjects = executeCapture("LIST ALL PROJECTS;");
            oProjects.moveFirst();
            while (!oProjects.isEof()) {
                String sProjectName = oProjects.getFieldValueString(DisplayPropertyEnum.NAME);
                ResultSet oSubscriptions = executeCapture("LIST ALL SUBSCRIPTIONS FOR RECIPIENTS USER '" + sUserName + "' FOR PROJECT '" + sProjectName + "';");
                oSubscriptions.moveFirst();
                while (!oSubscriptions.isEof()) {
                    String sContent = oSubscriptions.getFieldValueString(DisplayPropertyEnum.CONTENT);
                    String sSubscriptionType = oSubscriptions.getFieldValueString(DisplayPropertyEnum.SUBSCRIPTION_TYPE);
                    if (sSubscriptionType.contains("File")) {
                        ResultSet oRecipientList = (ResultSet)oSubscriptions.getFieldValue(RECIPIENT_RESULTSET);
                        oRecipientList.moveFirst();
                        while (!oRecipientList.isEof()) {
                            String sRecipientAddress = oRecipientList.getFieldValueString(RECIPIENT_ADDRESS);
                            if (sRecipientAddress.contains(sAddressName)) {
                                printOut("||" + sID + ",\"" + sUserLogin + "\",\"" + sContent+"\"");
                            }
                            oRecipientList.moveNext();
                        }
                    }
                    oSubscriptions.moveNext();
                }
                oProjects.moveNext();
            }
        }
        oUserAddresses.moveNext();
    }
    oUserProperties.moveNext();
}

Executing this yields the following content within the log file:

2018-05-26_8-05-45.png

Note that I included two pipe characters in each line I print, so that I can later find & parse my results in the log file. In PowerShell I'll invoke the procedure via the Command Manager, redirect the output to a file, load the file, find the lines with the double pipes, extract the data, and convert it from CSV.

function Get-MstrCMUserSubscriptions {
    $command = 'EXECUTE PROCEDURE ListContentManagerUserSubscriptions();'
    $outFile = New-TemporaryFile
    $logFile = New-TemporaryFile
    $userSubscriptions = @("ID,Name,Report")
    try {
        Add-Content -Path $outFile $command
        $cmdmgrCommand = "cmdmgr -n `"$psn`" -u `"$psnUser`" -f `"$outFile`" -o `"$logFile`""
        iex $cmdmgrCommand
        #only the lines carrying the double-pipe marker hold results
        $results = Get-Content -Path $logFile | Where-Object { $_ -like "*||*" }
        foreach ( $result in $results ) {
            $userSubscriptions += ($result.split('|')[2])
        }
    } catch {
        Write-Error $_
    }
    if ( (Test-Path $outFile) ) { Remove-Item -Path $outFile -Force }
    if ( (Test-Path $logFile) ) { Remove-Item -Path $logFile -Force }
    return $userSubscriptions | ConvertFrom-CSV
}

That effort yields an array I can work with...

2018-05-26_8-28-54.png
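
To show the parsing step in isolation, here is a minimal sketch that runs against a few made-up log lines rather than a real Command Manager log. The surrounding timestamp lines, IDs, logins, and report names are all placeholders I invented; only the double-pipe convention comes from the procedure above. This version takes everything after the marker with Substring, so a stray pipe character elsewhere in a line (in a report name, say) can't shift the index the way split can.

```powershell
# Made-up log lines; only the lines carrying the || marker matter
$sampleLog = @(
    '5/26/18 8:05:45 AM Executing task(s)...',
    '||A1B2C3,"jsmith","Daily Hearing Summary"',
    '||D4E5F6,"mjones","Weekly Retention Report"',
    '5/26/18 8:05:47 AM Execution Time: 00:00:02'
)

# Start with the CSV header, then append the data portion of each marked line
$csvLines = @('ID,Name,Report')
foreach ( $line in ($sampleLog | Where-Object { $_ -like '*||*' }) ) {
    # keep everything after the first || marker
    $marker = $line.IndexOf('||')
    $csvLines += $line.Substring($marker + 2)
}

# ConvertFrom-Csv treats the first line as the header row
$subscriptions = $csvLines | ConvertFrom-Csv
$subscriptions | Format-Table
```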

Last step, iterate the array and look for reports in the staging area.  As I find one, I submit it to Content Manager via the Service API and remove it from disk.  

$userSubscriptions = Get-MstrCMUserSubscriptions
foreach ( $userSubscription in $userSubscriptions ) {
    $stagingPath = "$stagingRoot\$($userSubscription.ID)"
    $reports = Get-ChildItem $stagingPath -Filter "*$($userSubscription.Report)*"
    foreach ( $report in $reports ) {
        #use the full path so this works regardless of the current directory
        New-CMRecord -UserLogin $userSubscription.Name -Report $userSubscription.Report -File $report.FullName
        Remove-Item $report.FullName -Force
    }
}

My New-CMRecord function includes logic that locates an appropriate folder for the report. Yours could attach a schedule, classification, or other meta-data fetched from MicroStrategy.

Ensuring Records Managers can access MicroStrategy

I've got a MicroStrategy server environment with a Records Management group. I'd like to ensure that certain new CM users always have access to MSTR, so that they have appropriate access to dashboards and reports. To keep the post simple I'll focus just on CM administrators.

Within MicroStrategy I'd like to create a new user and include them in a "Records Management" group, like so:

2018-05-05_8-48-44.png

This implementation of MSTR does not have an instance of the REST API available, so I'm limited to using the Command Manager. My CM instance is in the cloud and I won't be allowed to install the CM client on the MSTR server. To bridge that divide I'll use a PowerShell script that leverages the ServiceAPI and the Invoke-Expression cmdlet.

First I need a function that gets me a list of CM users:

function Get-CMAdministrators {
    Param($baseServiceApiUri)
    $queryUri = ($baseServiceApiUri + "/Location?q=userType:administrator&pageSize=100000&properties=LocationLogsInAs,LocationLoginExpires,LocationEmailAddress,LocationGivenNames,LocationSurname")
    $AllProtocols = [System.Net.SecurityProtocolType]'Ssl3,Tls,Tls11,Tls12'
    [System.Net.ServicePointManager]::SecurityProtocol = $AllProtocols
    $headers = @{ 
        Accept = "application/json"
    }
    $response = Invoke-RestMethod -Uri $queryUri -Method Get -Headers $headers -ContentType "application/json"
    Write-Debug $response
    if ( $response.TotalResults -gt 0 ) {
        return $response.Results
    }
    return $null
}

Second I need a function that creates a user within MSTR:

function New-MstrUser {
    Param($psn, $psnUser, $psnPwd, $userName,$fullName,$password,$group)
    $command = "CREATE USER `"$userName`" FULLNAME `"$fullName`" PASSWORD `"$password`" ALLOWCHANGEPWD TRUE CHANGEPWD TRUE IN GROUP `"$group`";"
    $outFile = ""
    $logFile = ""
    try {
        $outFile = New-TemporaryFile
        Add-Content -Path $outFile $command
        $logFile = New-TemporaryFile
        $cmdmgrCommand = "cmdmgr -n `"$psn`" -u `"$psnUser`" -f `"$outFile`" -o `"$logFile`""
        iex $cmdmgrCommand
    } catch {
        
    }
    if ( (Test-Path $outFile) ) { Remove-Item $outFile -Force }
    if ( (Test-Path $logFile) ) { Remove-Item $logFile -Force }
}

Last step is to tie them together:

$psn = "MicroStrategy Analytics Modules"
$psnUser = "Administrator"
$psnPwd = ""
$recordsManagementGroupName = "Records Management"
$baseUri = "http://10.0.0.1/HPECMServiceAPI"
$administrators = Get-CMAdministrators -baseServiceApiUri $baseUri
if ( $administrators -ne $null ) {
    foreach ( $admin in $administrators ) {
        New-MstrUser -psn $psn -psnUser $psnUser -psnPwd $psnPwd -userName $admin.LocationLogsInAs.Value -fullName ($admin.LocationSurname.Value + ', ' + $admin.LocationGivenNames.Value) -password $admin.LocationLogsInAs.Value -group $recordsManagementGroupName
    } 
}
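
Before pointing this at a live project source, it can help to preview the statements that would be sent. The sketch below is a dry run under my own assumptions: Format-MstrCreateUserCommand is a hypothetical helper that only builds the same CREATE USER text used in New-MstrUser, and the sample administrator is made up to mimic the shape of a ServiceAPI result (each property exposing a Value).

```powershell
# Hypothetical helper: builds the CREATE USER statement without running cmdmgr
function Format-MstrCreateUserCommand {
    Param($userName, $fullName, $password, $group)
    return "CREATE USER `"$userName`" FULLNAME `"$fullName`" PASSWORD `"$password`" ALLOWCHANGEPWD TRUE CHANGEPWD TRUE IN GROUP `"$group`";"
}

# Made-up administrator shaped like a ServiceAPI Location result
$sampleAdmin = [PSCustomObject]@{
    LocationLogsInAs   = [PSCustomObject]@{ Value = 'jsmith' }
    LocationSurname    = [PSCustomObject]@{ Value = 'Smith' }
    LocationGivenNames = [PSCustomObject]@{ Value = 'Jane' }
}

# Same argument mapping as the loop above: login as user name and initial password
$preview = Format-MstrCreateUserCommand `
    -userName $sampleAdmin.LocationLogsInAs.Value `
    -fullName ($sampleAdmin.LocationSurname.Value + ', ' + $sampleAdmin.LocationGivenNames.Value) `
    -password $sampleAdmin.LocationLogsInAs.Value `
    -group 'Records Management'
$preview
```

Because the initial password is the login, the CHANGEPWD TRUE clause matters: it forces each new user to pick their own password at first sign-in.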

After running it, my users are created!

2018-05-05_10-03-31.png