
PowerShell Script Module Boilerplate

One of the things I always look for when getting familiar with a new language or environment is examples of the physical file structure and logical organization of the code. How do you lay out your project files? What is idiomatic?

Unsurprisingly, this isn’t always as simple as finding a few open-source projects on GitHub to reference. Believe it or not, there are a lot of pretty unorganized coders out there. I admit to being a bit OCD with my project structures. I like them clean, organized, and consistent.

In this post I’m going to cover my preferred boilerplate for PowerShell Script Modules.

Script Modules are about as simple as it gets. Typically you have one or more PS1 or PSM1 files that contain your module’s cmdlets. Beyond that you should have a PSD1 manifest.

Fork this!

This entire boilerplate is on GitHub. If you just want a solid starting point, download this repo. If you want to know more, keep reading.

File Structure

+- src/
| +- source_file.ps1
| +- ...
+- tools/
| +- release.ps1
+- LICENSE
+- README.md
  • src/ contains the PS1, PSM1, PS1XML, and any other source files for the module.
  • tools/ is where I put any meta scripts for the project. Usually there is just one script here that builds a release version of my module.
  • LICENSE - if your module is open-source, always specify what license you are releasing it under.
  • README.md - always have a README file, even if it is only a one-sentence description. Markdown is a great format for this.

Explicit Exports

By default, PowerShell exports all of the functions in your module. I recommend being explicit about this and always specifying which functions should be publicly exported. This way it is easy to add private helper functions that are internal to your module without worrying about them accidentally being made public.

You do this by calling Export-ModuleMember at the bottom of your source file. If you have a Show-Calendar.ps1 file that contains a Show-Calendar function that should be public, you would do something like this:

# Show-Calendar.ps1
#
# Show-Calendar will be public.
# Any other functions in this file will be private.

function Show-Calendar {
	# ...
}

Export-ModuleMember -Function Show-Calendar

Full example →
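It is easy to sanity-check what your module actually exposes after importing it. A quick sketch (assuming the manifest is named YourModule.psd1):

```powershell
# Import the module and list its exported commands.
# Private helper functions will not appear in this list.
Import-Module .\YourModule.psd1
Get-Command -Module YourModule
```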

Release Script

Every project should have a release script. You should never be manually building your distributable release. At a minimum my release script will:

  1. Generate the PSD1 manifest for my module.
  2. Save the manifest into a temporary ./dist folder.
  3. Copy all of the module source files into ./dist.
  4. Add all of the module’s source files to a ZIP file ready for me to distribute.

Here is what a simple release.ps1 script might look like.

View on GitHub →

<#
.SYNOPSIS
	Generates a manifest for the module
	and bundles all of the module source files
	and manifest into a distributable ZIP file.
#>

[CmdletBinding()]
param(
    [Parameter(Mandatory = $true)]
    [version]$ModuleVersion
)

$ErrorActionPreference = "Stop"

Write-Host "Building release for v$moduleVersion"

$scriptPath = Split-Path -LiteralPath $(if ($PSVersionTable.PSVersion.Major -ge 3) { $PSCommandPath } else { & { $MyInvocation.ScriptName } })

$src = (Join-Path (Split-Path $scriptPath) 'src')
$dist = (Join-Path (Split-Path $scriptPath) 'dist')
if (Test-Path $dist) {
    Remove-Item $dist -Force -Recurse
}
New-Item $dist -ItemType Directory | Out-Null

Write-Host "Creating module manifest..."

$manifestFileName = Join-Path $dist 'YourModule.psd1'

# TODO: Tweak the manifest to fit your module's needs.
New-ModuleManifest `
    -Path $manifestFileName `
    -ModuleVersion $ModuleVersion `
    -Guid fe524c79-95a6-4d02-8e15-30dddeb8c874 `
    -Author 'Your Name' `
    -CompanyName 'Your Company' `
    -Copyright "(c) $((Get-Date).Year) Your Company. All rights reserved." `
    -Description 'Description of your module.' `
    -PowerShellVersion '3.0' `
    -DotNetFrameworkVersion '4.5' `
    -NestedModules (Get-ChildItem $src -Exclude *.psd1 | % { $_.Name })

Write-Host "Creating release archive..."

# Copy the distributable files to the dist folder.
Copy-Item -Path "$src\*" `
          -Destination $dist `
          -Recurse

# Requires .NET 4.5
[Reflection.Assembly]::LoadWithPartialName("System.IO.Compression.FileSystem") | Out-Null

$zipFileName = Join-Path ([System.IO.Path]::GetDirectoryName($dist)) "$([System.IO.Path]::GetFileNameWithoutExtension($manifestFileName))-$ModuleVersion.zip"

# Overwrite the ZIP if it already exists.
if (Test-Path $zipFileName) {
    Remove-Item $zipFileName -Force
}

$compressionLevel = [System.IO.Compression.CompressionLevel]::Optimal
$includeBaseDirectory = $false
[System.IO.Compression.ZipFile]::CreateFromDirectory($dist, $zipFileName, $compressionLevel, $includeBaseDirectory)

Move-Item $zipFileName $dist -Force

Version Control

Always exclude your ./dist folder from source control. As a rule of thumb, you never want to store the build output of any project in source control.

Depending on how you plan to release your module, you may prefer to exclude *.psd1 manifest files from source control. This just keeps things clean and enforces that you use your release script to build the distributable.
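If you are using Git, a minimal .gitignore sketch covering both of these rules (the *.psd1 line assumes you only generate manifests at release time):

```gitignore
# Build output; always regenerated by the release script.
dist/

# Generated module manifests.
*.psd1
```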

Good Examples

Here are a few open-source PowerShell modules that I’ve found to be good examples to follow.

⦿

PowerShell Batch File Wrapper

Sometimes you want a .cmd wrapper for your PowerShell script. Usually for me this is so people who aren’t familiar with the command line can double-click to execute the script.

This batch file should be saved alongside your PowerShell script, like so.

.\
 |- my_script.ps1
 |- my_script.cmd

my_script.cmd will execute the same named .ps1 file in the same directory, so my_script.ps1 in this case. Any arguments passed to my_script.cmd will pass-through to the PowerShell script.

@ECHO OFF

SET SCRIPTNAME=%~d0%~p0%~n0.ps1
SET ARGS=%*
IF [%ARGS%] NEQ [] GOTO ESCAPE_ARGS

:POWERSHELL
PowerShell.exe -NoProfile -NonInteractive -NoLogo -ExecutionPolicy Unrestricted -Command "& { $ErrorActionPreference = 'Stop'; & '%SCRIPTNAME%' @args; EXIT $LASTEXITCODE }" %ARGS%
EXIT /B %ERRORLEVEL%

:ESCAPE_ARGS
SET ARGS=%ARGS:"=\"%
SET ARGS=%ARGS:`=``%
SET ARGS=%ARGS:'=`'%
SET ARGS=%ARGS:$=`$%
SET ARGS=%ARGS:{=`{%
SET ARGS=%ARGS:}=`}%
SET ARGS=%ARGS:(=`(%
SET ARGS=%ARGS:)=`)%
SET ARGS=%ARGS:,=`,%
SET ARGS=%ARGS:^%=%

GOTO POWERSHELL

What’s going on here?

  • The %SCRIPTNAME% variable holds the name of the PowerShell script to execute. The %~d0%~p0%~n0 magic expands to the full path of the current batch script, minus the file extension. By specifying the full path of the PowerShell script like this we can guarantee that it is always executed from the right place no matter what your working directory is.
  • Escapes special characters in the arguments so that they are passed to PowerShell as you would expect.
  • Runs PowerShell.exe with:
    • -NoProfile to improve startup performance. Scripts you are distributing shouldn’t rely on anything in your profile anyway.
    • -NonInteractive because usually my scripts don’t need input from the user.
    • -ExecutionPolicy Unrestricted to ensure that the PowerShell script can be executed regardless of the machine’s default Execution Policy.
    • -Command syntax for executing the command ensures that PowerShell returns the correct exit code from your script. Using -Command with $ErrorActionPreference = 'Stop' also ensures that errors thrown from your script cause PowerShell.exe to return a failing exit code (1). PowerShell is quite buggy when it comes to bubbling exit codes. This is the safest method I’ve found.

Batch file tips

Special characters in arguments

Remember that certain special characters need to be escaped in arguments passed to the batch file. These characters are: ^ & < > /?. Note that /? is a sequence and is recognized as a help flag when passed to a batch file.

my_script.cmd "I ""am"" quoted" passes a single argument I "am" quoted to PowerShell.

my_script.cmd "^&<>/?" passes ^&<>/?

Environment variable expansion

Environment variables get automatically expanded in batch file arguments.

my_script.cmd %PROGRAMFILES% passes C:\Program Files

⦿

Differential backups using 7-zip and PowerShell

Backups are like changing the oil on your car: changing the oil is the best thing you can do to extend the life of your engine, and backups are the best thing you can do to save yourself when your hard drive crashes, gets stolen, or gets corrupted.

I work remotely and needed a backup strategy for my work laptop. My company encourages you to keep critical client files and such in your personal share on the network. Most of our employees work in the office, so this is perfectly reasonable as the norm. (Though I bet a ton of stuff still ends up on local machines!) As a remote employee, however, this isn’t practical for reasons that are hopefully obvious to everyone.

There are as many backup strategies, scripts, and apps as there are computers. Today though, it was PowerShell to the rescue with the help of trusty 7-zip (command line version) and robocopy.

Requirements

  • I’m not concerned with a full system backup. No need to image my hard drive, I only want to backup a specific set of directories that contain my working files.
  • I need to backup to a local drive.
  • I need to upload backups to my user share on the company network.
  • I need a scheduled task that will perform the backup on a regular basis.

Implementation details

  • Differential backups. Because I’ll be uploading these over a VPN, I want them to be as small as possible. Differential backups will allow me to upload a full backup once (or occasionally) and then upload much smaller backups daily.
  • LZMA compression. I’ll sacrifice some speed during the backup stage but save a ton of time on the upload by compressing my backups using 7-zip’s LZMA (7z) compression format.
  • Local hard drive space is limited so I’ll only keep the most recent backup locally. I’ll keep past backups on the company network and manually prune old ones when needed. This will be a casual safety net if I ever need to roll back to a specific point in time.

Results

Here’s what it looks like.

I’ll store my most recent backups locally in C:\backup. Timestamps in the names keep everything straight. Each backup has an accompanying log file.
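The timestamped names are easy to generate with Get-Date. A sketch (the backup-yyyyMMdd-HHmm naming pattern is an assumption based on the file-matching patterns used later in the script):

```powershell
# Build a timestamped backup file name and a matching log file name.
$timestamp = Get-Date -Format 'yyyyMMdd-HHmm'
$outputFile = Join-Path 'C:\backup' "backup-$timestamp.7z"
$logFile = [System.IO.Path]::ChangeExtension($outputFile, '.log')
```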

Screenshot of backup folder

My backup script is stored in Dropbox.

  • backup.ps1 - The meat of the backup script. The various *.cmd scripts are just wrappers that pass this the appropriate arguments.
  • upload-backup.cmd - Starts uploading all existing backups to the company network.
  • full-backup.cmd - Starts a full backup.
  • diff-backup.cmd - Starts a differential backup.
  • includes.txt - A list of directory and file paths to include in the backup. This is passed to 7-zip.
  • excludes.txt - A list of directory and file paths to exclude from the backup. This is also passed to 7-zip.

Screenshot of script folder

I have a Windows Scheduled Task that runs a differential backup (diff-backup.cmd) hourly. The differential backup is usually pretty fast, so running hourly is no problem.

I have another task that runs the upload (upload-backup.cmd) once a day when I am connected to the VPN. Task Scheduler is supposedly smart enough to only run the task if the VPN connection is available and to retry on a regular basis if it isn’t. I’ve never used this feature before so we’ll see how it goes. I can always fall back to manually running this via a desktop shortcut.
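Registering the hourly task can itself be scripted. A sketch using schtasks.exe (the task name and script path here are hypothetical; adjust to taste):

```powershell
# Create a scheduled task that runs the differential backup every hour.
schtasks /Create /TN "Hourly Differential Backup" `
    /TR "C:\Users\you\Dropbox\backup\diff-backup.cmd" `
    /SC HOURLY
```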

The Script

My script isn’t suited to sharing verbatim but I do want to share the good parts.

Selecting the most recent full backup

Differential backups have to be based on a full backup. The script is smart enough to find the most recent full backup file and use it as the base.

# Find the most recent full backup.
# Depends on Get-ChildItem returning the items sorted in ascending order (oldest backups first).
$fullBackup = Get-ChildItem -File -Path "$backupOutputPath\backup-[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]-[0-9][0-9][0-9][0-9].7z" | select -Last 1 -ExpandProperty FullName
if (-not ($fullBackup) -or -not (Test-Path $fullBackup -PathType Leaf)) {
	throw "No full backup was found. Must have a full backup before performing a differential."
}

7-zip args for a full backup

$7zipArgs = @(
	"a";                          # Create an archive.
	"-t7z";                       # Use the 7z format.
	"-mx=7";                      # Use a level 7 "high" compression.
	"-xr!thumbs.db";              # Exclude thumbs.db files wherever they are found.
	"-xr!*.log";                  # Exclude all *.log files as well.
	"-xr-@`"`"$excludesFile`"`""; # Exclude all paths in my excludes.txt file.
	"-ir-@`"`"$includesFile`"`""; # Include all paths in my includes.txt file.
	"$outputFile";                # Output file path (a *.7z file).
)

Notice that I doubled some quotation marks for the excludes and includes arguments. This escaping is necessary to ensure the path I’m passing in works even if it has spaces in it.

7-zip args for a differential backup

$7zipArgs = @(
	"u";                                    # Update an archive. Slightly confusing since we'll be saving those updates to a new archive file.
	"$fullBackupPath";                      # Path of the full backup we are creating a differential for.
	"-t7z";
	"-mx=7";
	"-xr!thumbs.db";
	"-xr!*.log";
	"-xr-@`"`"$excludesFile`"`"";
	"-ir-@`"`"$includesFile`"`"";
	"-u-";                                  # Don't update the original archive (the full backup).
	"-up0q3r2x2y2z0w2!`"`"$outputFile`"`""; # Flags to specify how the archive should be updated and the output file path (a *.7z file).
)

The last argument there is a doozy. Here’s what those flags mean.

  • -u - Starts the “update options” which follow.
  • Each letter represents a state of a file either in the base archive (our full backup) or on disk (the files we’re backing up).
    • p - File exists in archive, but isn’t in any of the include file paths. The file is no longer included in the backup (includes.txt probably changed).
    • q - File exists in archive, but doesn’t exist on disk. The file has been deleted since the full backup.
    • r - File doesn’t exist in archive, but exists on disk. The file is new since the full backup.
    • x - File in archive is newer than the file on disk. (Odd.)
    • y - File in archive is older than the file on disk. The file has been updated since the full backup.
    • z - File in archive is same as the file on disk.
    • w - It can’t be determined which file is newer (the modified times are the same but the sizes are different).
  • After each of those letters is a number that represents what action to take on files matching that state.
    • 0 - Ignore file. Don’t include it in the backup.
    • 1 - Copy file. Copy it from the old archive to the new. From our full backup to our new differential backup.
    • 2 - Compress. Include the file from disk in our new differential backup.
    • 3 - Create an anti-item. This is a feature only supported by 7z archives and basically puts a marker in the archive saying that this file has been deleted. If you extract this archive into a directory where this file exists then it will be deleted during the extraction.
  • Everything after the ! is specifying the output file path.

You can read more about 7-zip’s options in their documentation.

Watching for exit codes

I want my backup to fail if 7-zip fails.

& $7zip @7zipArgs | Tee-Object -LiteralPath $logFile
if ($LASTEXITCODE -gt 1) # Ignores warnings which use exit code 1.
{
	throw "7zip failed with exit code $LASTEXITCODE"
}

Log 7-zip output to both the console and log file

Did you notice Tee-Object in the above snippet? That’s what handles this. The only downside is that your console output will be laggy due to PowerShell buffering the pipeline.

Deleting old backups

Remember, I’m only keeping the most recent full backup and most recent differential backup files locally. My script automatically deletes old backup files after a new backup is successful. This is the magic used to delete the old differential backups when a new one finishes.

# Clean up old differential backup files.
# Only keep the most recent differential backup.
# Depends on Get-ChildItem returning the items sorted in ascending order (oldest backups first).
$allDiffBackups = Get-ChildItem -File -Path "$backupOutputPath\backup-[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]-[0-9][0-9][0-9][0-9]-diff-[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]-[0-9][0-9][0-9][0-9].7z"
if ($allDiffBackups -is [array]) {
	[Array]::Reverse($allDiffBackups)
	$allDiffBackups | select -Skip 1 | % {
		Write-Host "Deleting old differential backup. File: $($_.FullName)"
		# Remove the matching log file.
		Remove-Item -LiteralPath ([System.IO.Path]::ChangeExtension($_.FullName, ".log")) -ErrorAction SilentlyContinue
		$_
	} | Remove-Item
}

Robocopy args to upload the backups

$robocopyArgs = @(
	"$backupOutputPath"; # Source path
	"$networkDestPath";  # Destination path
	"/Z";                # Use restartable mode when transferring files.
	"/FP";               # Log the full paths to files.
	"/NP";               # Don't log progress percentages.
	"/X";                # Log a list of 'extra' files that exist in the destination but not locally.
	"/UNILOG+$logfile";  # Append to the specified log file.
	"/TEE";              # Send output to the console in addition to the log file.
)

Those arguments will perform a shallow copy of all files in the source folder to the destination.

Note that since robocopy supports the /UNILOG+ and /TEE arguments I don’t have to use Tee-Object to append to the log file.

Once again, I’ll make sure the copy is successful and fail the script if the copy fails.

& $robocopy @robocopyArgs
if ($LASTEXITCODE -ge 8) # exit code when files failed to copy
{
	# robocopy exit codes: http://support.microsoft.com/kb/954404
	throw "robocopy failed with exit code $LASTEXITCODE"
}

Passing args from PowerShell to executables

There are a million ways, of course, but my favorite is to put the arguments into an array and let PowerShell splat them. I really like how this keeps the script readable even when you are passing a ton of arguments around.

$cmdArgs = @(
	"/E";
	"/UNILOG+$logfile";
) 

& some_application.exe @cmdArgs

The only thing to be aware of here is that if one of your $cmdArg items contains a SPACE then PowerShell will automatically wrap the argument in double quotes when passing it to the executable.

If you are having trouble getting your quotes and other special characters escaped properly, I’ve written several EchoArgs apps to help. They just echo back the arguments passed in so you can see how they are being received.
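If you don’t have an EchoArgs handy, a minimal PowerShell version is only a few lines. A sketch (save it as echoargs.ps1 or any name you like):

```powershell
# echoargs.ps1 - echoes each argument wrapped in angle brackets so you
# can see exactly how quotes and special characters arrived.
for ($i = 0; $i -lt $args.Count; $i++) {
    Write-Host "Arg $($i): <$($args[$i])>"
}
```

Running `.\echoargs.ps1 one "two three"` shows that the quoted words arrive as a single argument.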

Epilogue

Hopefully this will help you get a jump start on your own backup strategy. Please, hit me up in the comments if you have any questions.

⦿

Push it out! Push ups every hour.

Push ups. Every hour, on the hour. Can you handle it?

I’ve been motivated by a friend to start this challenge. Every hour, on the hour, I’m going to drop and do a set of push ups.

Remembering is the hardest part. Timers are easy to set but also easy to ignore. My solution is to make myself a massive nag screen that will flash every hour while I’m at work.

Screenshot of nag screen

My strategy is working! I’m 7 hours into my day and haven’t missed a set. It feels great.

Join me!

My nag screen is just a PowerShell script and scheduled task. It really couldn’t be simpler. You don’t even need to be a geek to figure this out.

  1. Download the ZIP.
  2. Extract the files somewhere permanent. Moving them later will require you to manually update the scheduled task.
  3. Run pushitout.cmd to install the scheduled task.
  4. Run it again to see what it looks like.

The default schedule is to nag you every day, every hour, between 8am and 5pm. Feel free to customize this to fit your schedule; editing scheduled tasks is easy.
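If you’d rather script the tweak than click through Task Scheduler, schtasks.exe can change the schedule too. A sketch (the task name here is a guess; use whatever name pushitout.cmd registered):

```powershell
# Shift the task's start time to 7am; the rest of the schedule is kept.
schtasks /Change /TN "Push It Out" /ST 07:00
```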

I’d love to hear how this works out for you, let me know in the comments!

Overachievers will notice that the Show-NagWindow.ps1 script lets you customize the message, font, and colors by passing in different parameters. Edit the obvious parts of pushitout.cmd to customize the nag screen to your liking.

Works on my machine! … which happens to be running Windows 7, .NET 4.5 and PowerShell V3. You’ll probably need admin rights on your machine in order to install the scheduled task.

⦿

User sign up and email confirmation, a UX boilerplate

Sign up, confirm your email address, login. The three steps that almost all web apps share.

It isn’t hard to get this process right but it does take some planning. I’ve had to implement this a lot so I finally decided to document it. This is my boilerplate for a user sign up process - the interaction flow that I use as the starting point for sign-ups in my apps. Along the way I’ll point out some key UX (User eXperience) tips. I hope this saves you some time on your next project.

The UI mockups are just that. Mockups. This post isn’t about the UI of your pages but rather the flow and functionality. The mockups are intended to communicate features, not a specific design or layout.

Login

The login is so obvious that it is hardly worth mentioning.

Login mockup

  • Show friendly messages for invalid credentials.
  • Don’t make them re-enter their username on subsequent attempts.
  • Set the initial focus to the username input. It’s the little things.
  • Provide a link for the user to sign up. They may land on the login page but not have an account, make it easy on them.
  • Provide a link for resetting their password.

You should never provide a password recovery link because you should NEVER be able to recover their password. How should you store passwords?
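For completeness, here is a sketch of the idea: store a salted, slow hash of the password (PBKDF2 in this example) instead of the password itself, so there is nothing to recover. The iteration count and sizes are illustrative, not a recommendation.

```powershell
# Derive a salted PBKDF2 hash of the password. Persist $salt and $hash;
# to verify a login, re-derive with the stored salt and compare hashes.
$password = 'correct horse battery staple'
$salt = New-Object byte[] 16
[System.Security.Cryptography.RandomNumberGenerator]::Create().GetBytes($salt)
$pbkdf2 = New-Object System.Security.Cryptography.Rfc2898DeriveBytes($password, $salt, 100000)
$hash = $pbkdf2.GetBytes(32)
```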

Sign Up

The sign up process is arguably the most important part of your app. This is the gateway through which all users must pass. Every failure means lost users.

I’m going to simplify the sign up process to the absolute bare essentials.

Sign Up mockup

  • Show a friendly message and login link if the username is already linked to an account. If they click that login link, then navigate to the login form and pre-populate the username field.
  • Provide a link for the user to login if they already have an account. Just because they are on the sign up page doesn’t mean they don’t have an account. Make it easy.

If you collect an email address during sign up then consider adding logic to suggest corrections to commonly misspelled domains. For example, if the user enters “user@gnail.com” you can ask, “Did you mean: user@gmail.com?”.

You can read more about this or skip straight to the jQuery plugin, Mailcheck.

Confirmation email

I’m going to assume your app requires a valid email address. Most do. If yours doesn’t then your job just got a lot easier!

After signing up, the next page your users see is:

Confirmation sent mockup

  • Tell them what email address you sent the confirmation to.
  • Tell them what to do with the message they receive.

The confirmation email they receive:

Confirmation email mockup

  • Show the link URL if you are sending an HTML email. This will ensure that users with a plain text mail client can still see the link.
  • Clicking the link should automatically log the user in. You already have their credentials (they just signed up) and the email address has now been verified - there is no reason not to log them in.

Ideally you should send the email with both a plain text and HTML body. This ensures that the user gets the best experience possible for their mail client. Read up on MIME multipart/alternative for more information.
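In .NET, for instance, System.Net.Mail supports this directly via AlternateViews. A sketch (addresses, server, and the confirmation URL are placeholders):

```powershell
# Build a message with both plain text and HTML alternatives; mail
# clients pick the richest alternative they can display.
$msg = New-Object System.Net.Mail.MailMessage
$msg.From = 'noreply@example.com'
$msg.To.Add('user@example.com')
$msg.Subject = 'Confirm your email address'

$plain = [System.Net.Mail.AlternateView]::CreateAlternateViewFromString(
    'Confirm your address: https://example.com/confirm/TOKEN', $null, 'text/plain')
$html = [System.Net.Mail.AlternateView]::CreateAlternateViewFromString(
    '<a href="https://example.com/confirm/TOKEN">Confirm your address</a>', $null, 'text/html')
$msg.AlternateViews.Add($plain)
$msg.AlternateViews.Add($html)

(New-Object System.Net.Mail.SmtpClient 'smtp.example.com').Send($msg)
```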

They didn’t see the email

There is always the chance that the user will not receive the confirmation email or that they will simply ignore it. We need to plan for this.

If they try to continue without confirming their address:

Confirmation needed page mockup

  • Tell them what the problem is, e.g. “confirmation needed”.
  • Remind them what email address the confirmation was sent to.
  • Tell them again what they should do with that message.
  • Remind them to check their junk or spam folders and help them out with the search by telling them who the sender was.
  • Give them an option to resend the message.
  • Give them an option to change their email address.

Resending the confirmation

Maybe the user’s mail server was offline. Maybe they deleted your message without reading it. Maybe it got trapped by an overzealous spam filter. There are any number of ways the message might get lost. The ability to have it re-sent is a great UX booster that might just save you an otherwise lost user.

Confirmation message sent mockup

Changing their email

This is another little detail that can make a big difference. Maybe the user entered the wrong email. Maybe they just can’t get that account to work and want to try another address. At this point there is little reason not to let them simply enter a different address. Let me explain:

  • Their account is already in an unconfirmed state. Since we have no way to know that the current (unconfirmed) email is valid, there isn’t any harm in letting them change it to another email.
  • They already entered their username and password. Combined with the fact that the existing email is unconfirmed this should be reasonable enough verification of the user’s identity.

Change email mockup

  • Show them their current address on the change form.

The confirmation should be sent to the new address and a friendly message shown once again.

All previously sent confirmation links should be disabled at this point for security. You wouldn’t want the wrong address to be confirmed; you only want the new one.

Email changed mockup

That’s all folks

That is my boilerplate. I believe it covers the requirements that nearly all sign up forms share and will keep at least this dude from having to rethink the process in the future.

In the end it’s all pretty simple, but there are a few guiding principles:

  • Be obvious. At every point give the user all the information they need to make a decision or take the required action.
  • Be simple. Keep all of your forms as lean as possible. Collect any extra profile or account information after the initial sign up. This keeps the barrier to entry as low as possible.
  • Be helpful. Give the user the ability to undo their mistakes and troubleshoot. Resending the confirmation and changing their email address are the prime examples.
  • Be easy. Whenever you can save the user a step, do it. For example, log them in automatically when they click the confirmation link.

Feel free to use and tweak the Balsamiq mockups that I created for this post. You can download them here.

⦿

TableauKit: A PowerShell Module for Tableau

If you’ve been following my Tableau via PowerShell series then you already know there are a lot of cool things you can do with Tableau workbooks using PowerShell. Up until now I’ve been sharing helper functions in a stand-alone fashion. No longer.

I’d like to introduce you to TableauKit. A pure, script-based PowerShell module for working with Tableau files.

Say what?

Today TableauKit is shipping with the following commands:

  • Get-TableauObject - Gets a summary object for a workbook. The summary contains things like the list of dashboards, worksheets, data sources, and parameters in the workbook. It also contains metadata like the version of Tableau that saved the workbook.

    You can dig a bit deeper to find the parameter values and which worksheets are included on specific dashboards.

    No XPath required!

  • Get-TableauXml - Opens up a TWB or TWBX file and returns the workbook's raw XML. XPath Heaven.
  • Export-Tableau - Lets you save changes you've made to the workbook's XML back into a new or existing TWB or TWBX file.

You’ll probably recognize Get-TableauXml and Export-Tableau from my previous posts about them. They are much better than when you last met them, though! Several bugs have been fixed and the names are shorter now.

Get-TableauObject is all new. Just because you can dig through the workbook with XPath doesn’t mean it’s fun. This command provides quick access to the most common parts of a workbook. Don’t worry, it also gives you the raw XML so you can go spelunking for the hard to find stuff!

No really, what can it do?

Find workbooks by worksheet name

Can’t remember which workbook contains that elusive “Top 10 Hot Dogs” worksheet?

Get-TableauObject *.twb, *.twbx |
	where {
        ($_.Worksheets | select -Expand DisplayName) -contains "Top 10 Hot Dogs"
	} |
	select FileName
FileName
--------
C:\users\jpoehls\desktop\result.twb
C:\users\jpoehls\desktop\another-result.twbx

Analyze connection types

What types of connections do my workbooks use?

Get-TableauObject *.twb, *.twbx |
	select -Expand DataSources |
	group ConnectionType |
	sort Count -Desc
Count Name
----- ----
   26 excel
   12 sqlserver
   10 csv
    4
    3 dataengine
    3 msaccess
    2 bigquery
    1 firebird

Find workbooks by connection type

Which workbooks use the Big Query connection?

Get-TableauObject *.twb, *.twbx |
	where { ($_.DataSources | select -Expand ConnectionType) -eq "bigquery" } |
	select FileName
FileName
--------
C:\users\jpoehls\desktop\Shakespeare.twb
C:\users\jpoehls\desktop\Shakespeare_Extract.twb

List all dashboard worksheets

Let’s get a list of all dashboards and their worksheets for all of our workbooks.

Oh, and we want that list presented in a GUI that we can use to further search and filter.

Get-TableauObject *.twb, *.twbx |
	foreach { $workbook = $_; $_ } |
	select -Expand Dashboards |
	foreach { $dashboard = $_; $_ } |
	select -Expand Worksheets |
	select @{ Label="Dashboard"; Expression={$dashboard.DisplayName} },
           @{ Label="Worksheet"; Expression={$_.DisplayName} },
           @{ Label="File"; Expression={$workbook.FileName} } |
	Out-GridView

Results Screenshot

Find worksheets with a specific column filter

Looking for all the workbooks and worksheets that have a filter on your “NextPaymentDate” column? Sure, we can do that.

This code is pretty ridiculous so I’m going to break it up with annotations to make it easier to grok.

This kind of query requires advanced knowledge of workbook internals and your mileage may vary if you run this against your own workbooks.

# Name of the data source.
$dataSource = "MonthlySummary (Landboss_com-Accounts)"
# Name of column that has a filter on it.
$column = "NextPaymentDate"

# Read all the workbooks.
Get-ChildItem *.twb, *.twbx |

    # Save a reference to the file name.
    foreach { $file = $_.FullName; $_ } |
    Get-TableauXml |

	# Find all worksheets with the data source we
	# are looking for.
	Select-Xml "/workbook/worksheets/worksheet/table/view/datasources/datasource" |
	Select -Expand Node |
	where {
        $_.caption -eq $dataSource -or $_.name -eq $dataSource
    } |
    
    # Save a reference to the data source's actual name
    # for later parts of the pipeline.
    foreach { $dataSourceName = $_.name; $_	} |
	
	# Find references to the column we are looking for.
	# Note we're still inside the worksheet node.
	Select-Xml "../../datasource-dependencies[@datasource='$dataSourceName']/column-instance[@column='[$column]']" |
	Select -Expand Node |
    
    # Save a reference to the column's actual name
    # for the next part of the pipeline.
    foreach { $filterColumnName = $_.name; $_ } |
	
	# Find any filters on the worksheet that
	# apply to the column we care about.
	Select-Xml "../../filter[@column='[$dataSourceName].$filterColumnName']" |
	select -Expand Node |
	
	# Traverse back to the worksheet node.
	Select-Xml "../../.." |
	select -Expand Node |
	
	# Build the final result display.
	select @{ Label="Worksheet"; Expression={
        	if ($_.caption) { $_.caption }
    		else { $_.name }
			}},
			@{ Label="File"; Expression={$file} } 
Worksheet                 File                                       
---------                 ----                                       
Avg Monthly Income        C:\users\jpoehls\desktop\Landboss Income.twb
Monthly Income by Account C:\users\jpoehls\desktop\Landboss Income.twb 

You can do anything!

Hopefully this has given you a glimpse at the power of querying your workbooks en masse. The queries are pretty gnarly so I definitely recommend wrapping anything you’ll reuse into helper functions.

What are you waiting for? Go get it!

After you download it, check the README for detailed installation and usage instructions.

TableauKit is a product of InterWorks, a fantastic company that you really want to work for!

Roadmap

The future is wide open for this. What kind of functionality would you like to see added? Let me know in the comments!

⦿

Tableau via PowerShell, Part 2: Saving Changes

This is part 2 of my mini-series on exploring Tableau workbooks with PowerShell. If you missed it, you should read Part 1: Opening Workbooks, before continuing.

Welcome back! Last time I showed you how to open a Tableau workbook (TWB or TWBX) in PowerShell so you could start exploring the raw XML. Today I’ll show the next logical thing, saving the changes you make.

If you are only opening TWB files, then saving your changes couldn’t be easier. Since TWBs are just XML files we can simply write it to a file, like so.

$workbook = Get-TableauWorkbookXml 'MyWorkbook.twb'
$workbook.Save('ModifiedWorkbook.twb')

Cake.

Saving changes back to a TWBX file is a bit more work but still totally doable. Let’s write a new function, Export-TableauPackagedWorkbook. We will spice things up a bit by making this function smart. It’ll take a few parameters to allow us to control exactly how we want the export to happen.

A -Force parameter will let us specify whether we want to overwrite the destination file if it already exists. By default we would be prompted. Note that this parameter isn't terribly useful on its own, since it replaces the destination TWBX with a new one that contains only the TWB and none of the original assets.

-Update will let us specify that we want to update the TWB inside the destination TWBX file if the destination file exists. This is the cool parameter because it lets us open an existing TWBX, make changes to the TWB inside, and then save those changes back out into the original TWBX.

We will also support the common -WhatIf and -Confirm parameters that are so useful in PowerShell.

Here is our function:

If you find this helpful then check out TableauKit, a full-on PowerShell module for working with Tableau files. It contains a new and improved version of the function below, and much more.

function Export-TableauPackagedWorkbook {
<#
.SYNOPSIS
    Exports the workbook XML to a packaged workbook (TWBX) file.
 
.PARAMETER Path
    The literal file path to export to.
 
.PARAMETER WorkbookXml
    The workbook XML to export.
 
.PARAMETER Update
    Whether to update the TWB inside the destination TWBX file
    if the destination file exists.
 
.PARAMETER Force
    Whether to overwrite the destination TWBX file if it exists.
    By default, you will be prompted whether to overwrite any
    existing file.
 
.NOTES
    Author: Joshua Poehls ({{site.url}})
#>
    [CmdletBinding(
        SupportsShouldProcess=$true
    )]
    param(
        [Parameter(
            Position=0,
            Mandatory=$true
        )]
        [string]$Path,
 
        [Parameter(
            Position=1,
            Mandatory=$true,
            ValueFromPipeline=$true
        )]
        [xml]$WorkbookXml,
 
        [switch]$Update,
        [switch]$Force
    )
 
    begin {
        $originalCurrentDirectory = [System.Environment]::CurrentDirectory
 
        # System.IO.Compression (ZipArchive) requires at least .NET 4.5
        [System.Reflection.Assembly]::LoadWithPartialName("System.IO.Compression") | Out-Null
    }
 
    process {
        [System.Environment]::CurrentDirectory = (Get-Location).Path
        $entryName = [System.IO.Path]::GetFileNameWithoutExtension($Path) + '.twb'
        $createNewTwbx = $false
       
        if (Test-Path $Path) {
            if ($Update -or $Force -or $PSCmdlet.ShouldContinue('Overwrite existing file?', 'Confirm')) {
                if ($Update) {
                    if ($PSCmdlet.ShouldProcess($Path, 'Update TWB in packaged workbook')) {
 
                        [System.IO.FileStream]$fileStream = $null
                        [System.IO.Compression.ZipArchive]$zip = $null
                        try {
                            $fileStream = New-Object System.IO.FileStream -ArgumentList $Path, ([System.IO.FileMode]::Open), ([System.IO.FileAccess]::ReadWrite), ([System.IO.FileShare]::Read)
                            $zip = New-Object System.IO.Compression.ZipArchive -ArgumentList $fileStream, ([System.IO.Compression.ZipArchiveMode]::Update)
 
                            # Locate the existing TWB entry and remove it.
                            $entry = $zip.Entries |
                                where {
                                    # Look for a .twb file at the root level of the archive.
                                    $_.FullName -eq $_.Name -and ([System.IO.Path]::GetExtension($_.Name)) -eq '.twb'
                                } |
                                select -First 1
                            if ($entry) {
                                $entry.Delete()
                            }
 
                            $entry = $zip.CreateEntry($entryName, ([System.IO.Compression.CompressionLevel]::Optimal))
                            [System.IO.Stream]$entryStream = $null
                            try {
                                $entryStream = $entry.Open()
                                $WorkbookXml.Save($entryStream)
                            }
                            finally {
                                if ($entryStream) {
                                    $entryStream.Dispose()
                                }
                            }
                        }
                        finally {
                            if ($zip) {
                                $zip.Dispose()
                            }
                            if ($fileStream) {
                                $fileStream.Dispose()
                            }
                        }
                    }
                }
                else {
                    if ($PSCmdlet.ShouldProcess($Path, 'Replace existing packaged workbook')) {
                        # delete existing TWBX
                        Remove-Item $Path -ErrorAction Stop #TODO: Figure out how to pass WhatIf and Confirm to this
                        
                        $createNewTwbx = $true
                    }
                }
            }
        }
        else {
            if ($PSCmdlet.ShouldProcess($Path, 'Export packaged workbook')) {
                $createNewTwbx = $true
            }
        }
 
        if ($createNewTwbx) {
            [System.IO.FileStream]$fileStream = $null
            [System.IO.Compression.ZipArchive]$zip = $null
            try {
                $fileStream = New-Object System.IO.FileStream -ArgumentList $Path, ([System.IO.FileMode]::CreateNew), ([System.IO.FileAccess]::ReadWrite), ([System.IO.FileShare]::None)
                $zip = New-Object System.IO.Compression.ZipArchive -ArgumentList $fileStream, ([System.IO.Compression.ZipArchiveMode]::Update)
 
                $entry = $zip.CreateEntry($entryName, ([System.IO.Compression.CompressionLevel]::Optimal))
                [System.IO.Stream]$entryStream = $null
                try {
                    $entryStream = $entry.Open()
                    $WorkbookXml.Save($entryStream)
                }
                finally {
                    if ($entryStream) {
                        $entryStream.Dispose()
                    }
               }
            }
            finally {
                if ($zip) {
                    $zip.Dispose()
                }
                if ($fileStream) {
                    $fileStream.Dispose()
                }
            }
        }
    }
 
    end {
        [System.Environment]::CurrentDirectory = $originalCurrentDirectory
    }
}

Not too bad. Here are some quick usage examples.

Convert a TWB into a TWBX

$workbook = Get-TableauWorkbookXml 'MyWorkbook.twb'
$workbook | Export-TableauPackagedWorkbook 'MyPackagedWorkbook.twbx'

Update an existing TWBX

$workbook = Get-TableauWorkbookXml 'MyPackagedWorkbook.twbx'
# TODO: Make some changes to the workbook XML
$workbook | Export-TableauPackagedWorkbook 'MyPackagedWorkbook.twbx' -Update

Leave me a comment if you have any questions about working with Tableau workbooks in PowerShell. Maybe I’ll cover your topic in the next post!

⦿

Go Beginners: Iota Enumerations

Enum constants in Go have a super power. In C# you only have two options for enums: either you assign each value yourself, or you let the compiler increment each value by one. Go has a third option.

Auto-incremented enum in C#

enum Example
{
     One = 1,
     Two,
     Three
}

Go’s third option is to make use of repeating expressions. This works because of Go’s iota enumerator. The best example can be found in the Effective Go docs.

type ByteSize float64

const (
     _           = iota // ignore first value by assigning to blank identifier
     KB ByteSize = 1 << (10 * iota)
     MB
     GB
     TB
     PB
     EB
     ZB
     YB
) 

iota is only valid inside const declarations, and within each const block it starts out life with a value of zero. It is incremented by one for each line of the block.

In the example above, the first constant is _ = iota. _ is the blank identifier in Go, so we are ignoring the first value of iota (zero) and starting with a value of 1 on the next line.

KB is assigned an expression that uses iota: 1 << (10 * iota). The subsequent constants don’t have any explicit value, so Go repeats the last expression for each of them. Thus, repeating expressions. Because iota increments on each line, we end up with a very useful pattern generator.
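To make the generated values concrete, here is a runnable sketch of a shortened version of that const block (KB, MB, and GB as in the Effective Go example):

```go
package main

import "fmt"

type ByteSize float64

const (
	_           = iota             // iota == 0, discarded
	KB ByteSize = 1 << (10 * iota) // iota == 1, so 1 << 10
	MB                             // expression repeats with iota == 2, so 1 << 20
	GB                             // iota == 3, so 1 << 30
)

func main() {
	fmt.Println(KB == 1<<10, MB == 1<<20, GB == 1<<30) // true true true
}
```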

Flag enumerations are a great use of this. In C# it is a manual process to assign each flag a value that increments in powers of 2.

[Flags]
enum FlagExample : int
{
     One = 1,
     Two = 2,
     Four = 4,
     Eight = 8
}

iota makes this much simpler in Go.

type FlagExample int

const (
     One FlagExample = 1 << iota // 1
     Two                         // 2
     Four                        // 4
     Eight                       // 8
)
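To see such flags in action, here is a minimal, runnable sketch of combining and testing them with bitwise operators (the Read/Write/Execute names are hypothetical, not from the post):

```go
package main

import "fmt"

type Permission uint8

const (
	Read Permission = 1 << iota // 1
	Write                       // 2
	Execute                     // 4
)

func main() {
	// Combine flags with bitwise OR.
	perms := Read | Write

	// Test individual flags with bitwise AND.
	fmt.Println(perms&Read != 0)    // true
	fmt.Println(perms&Execute != 0) // false
}
```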

Of course you don’t have to use iota at all if you don’t want to.

const (
     One  = 1
     Two  = 2
     Four = 4
)

Be sure to read the docs for even more info. Also, here’s a playground example of what we’ve covered.

Go forth and code.

⦿

Tableau via PowerShell, Part 1: Opening Workbooks

In this mini-series I’m going to show you some cookbook style examples for using PowerShell to explore your Tableau workbooks. Follow along, this is going to be fun!

Tableau workbooks (TWB files) are just XML files. Packaged workbooks (TWBX files) are just ZIP files that contain a TWB and various assets, like data extracts for example.

This is wonderful because it means it is very easy to go spelunking through workbook files without a guide.

Opening Workbooks

I mentioned that TWB files are just XML files and that TWBX files are ZIP files that contain a TWB. This means we need different logic for opening a TWB vs a TWBX. Let’s write a simple little PowerShell function to handle that. We’ll call this function Get-TableauWorkbookXml and it will take a file path and return the workbook’s XML. It will abstract away the different handling of TWB and TWBX files for us.

If you find this helpful then check out TableauKit, a full-on PowerShell module for working with Tableau files. It contains a new and improved version of the function below, and much more.

function Get-TableauWorkbookXml {
<#
.SYNOPSIS
    Gets the workbook XML from a specified TWB or TWBX file.

.PARAMETER Path
    The literal file path of the TWB or TWBX file.

.NOTES
    Author: Joshua Poehls ({{site.url}})
#>
    [CmdletBinding()]
    param(
        [Parameter(Mandatory=$true, ValueFromPipeline=$true)]
        [string]$Path
    )

    begin {
        $originalCurrentDirectory = [System.Environment]::CurrentDirectory

        # System.IO.Compression (ZipArchive) requires at least .NET 4.5
        [System.Reflection.Assembly]::LoadWithPartialName("System.IO.Compression") | Out-Null
    }

    process {
        [System.Environment]::CurrentDirectory = (Get-Location).Path
        $extension = [System.IO.Path]::GetExtension($Path)
        if ($extension -eq ".twb") {
            return [xml](Get-Content -LiteralPath $Path)
        }
        elseif ($extension -eq ".twbx") {
            $archiveStream = $null
            $archive = $null
            $reader = $null

            try {
                $archiveStream = New-Object System.IO.FileStream($Path, [System.IO.FileMode]::Open)
                $archive = New-Object System.IO.Compression.ZipArchive($archiveStream)
                $twbEntry = ($archive.Entries | Where-Object { $_.FullName -eq $_.Name -and [System.IO.Path]::GetExtension($_.Name) -eq ".twb" })[0]
                $reader = New-Object System.IO.StreamReader $twbEntry.Open()

                [xml]$xml = $reader.ReadToEnd()
                return $xml
            }
            finally {
                if ($reader -ne $null) {
                    $reader.Dispose()
                }
                if ($archive -ne $null) {
                    $archive.Dispose()
                }
                if ($archiveStream -ne $null) {
                    $archiveStream.Dispose()
                }
            }
        }
        else {
            throw "Unknown file type. Expected a TWB or TWBX file extension."
        }
    }

    end {
        [System.Environment]::CurrentDirectory = $originalCurrentDirectory
    }
}

Let’s try it out from PowerShell.

Get-TableauWorkbookXml .\MyWorkbook.twb

TWBX files are just as easy.

Get-TableauWorkbookXml .\MyPackagedWorkbook.twbx

Cool, right? How about opening a bunch of workbooks at once? This is PowerShell after all.

Get-ChildItem *.twb* | Get-TableauWorkbookXml

Continue the adventure with Part 2: Saving Changes.

⦿

ERRORLEVEL is not %ERRORLEVEL%

DO use IF ERRORLEVEL 1 ECHO error level is 1 or more to check for errors in your batch/command line scripts.

DO NOT use IF %ERRORLEVEL% NEQ 0 ECHO error level is not equal to 0.

Most of the time %ERRORLEVEL% will be what you expect, but it isn’t guaranteed. %ERRORLEVEL% is simply an environment variable like any other. It just happens to fall back to returning the ERRORLEVEL value if an environment variable with that name isn’t defined.
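For illustration, here is a contrived sketch of the failure mode (run in cmd.exe; the set line simulates whatever polluted your environment):

```batch
rem Shadow the dynamic value with a real environment variable.
set ERRORLEVEL=0

rem Run a command that exits with code 1.
cmd /c exit 1

rem Prints 0 -- the environment variable wins over the real exit code.
echo %ERRORLEVEL%

rem Prints "failed" -- IF ERRORLEVEL reads the true exit code.
if errorlevel 1 echo failed
```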

This is how I lost about an hour of my morning. Fortunately Google guided me to this excellent post and saved my sanity.

Remember %ERRORLEVEL% is not the same as ERRORLEVEL.

⦿