Task Type - Sfdc Data Mask

Overview

The Sfdc Data Mask task is a specialized task used to mask data that is in Salesforce. It is typically applicable when you create full or partial copy sandboxes, which copy all or some of the production data into the sandboxes, and you want to mask that data to ensure security, privacy, or legal compliance.

Noteworthy Capabilities

Here are some noteworthy capabilities of this task.

  • Mask one or more objects in serial or parallel mode
  • Incremental or Full mask for each Run
  • Mask all or specified fields
  • Many Mask Types, including a type that generates Fake Data
  • Disable Automations before mask and enable after
  • Capability to mask all data types including Attachments
  • Retry errors to ensure all records are masked
  • Combine with other Realtask-supported tasks to satisfy all masking needs, including deleting records
  • Supports Bulk Api for faster masking
  • Mask in split ranges to be able to mask Objects even with hundreds of millions of records
  • Preview records to ensure correct masking configuration

Manage Tasks

To mask data, add a Task to an existing Job and select the Task type Sfdc Data Mask. To edit an existing task, click on the pencil icon next to the corresponding Task. Once the task dialog is shown, select the Salesforce connection that you want to mask and click on Data Mask Config, which opens another dialog where you can configure the mask details. See the Configuration section below for details.

A Job can have any number of Sfdc Data Mask tasks. Each such task can be configured to mask one or more objects, so one mask task should be sufficient in most cases.

Configuration

Global Settings

Global Settings define the settings that apply to all Objects and to the data masking process as a whole. Note that if there are multiple Sfdc Data Mask tasks in a Job, then each such task has its own Global Settings.

To access the Global Settings, click on the Data Mask Global Settings button in the configuration dialog.

Global Settings - Options

Field | Type | Default | Description
Bulk Api | Picklist | Default | Controls if Bulk Api is enabled or not. If it is Default, then global settings will be applied.
Bulk Api Threshold | Number | 10000 | Indicates the number of records after which the Bulk Api should be used. For example, if it is 10000, then the Bulk Api will be used if the system needs to process 10000 or more records in a batch.
Write Batch Size | Number | 200 | Number of records to include in one Batch Api save request. This is applicable only for the Batch Api.
Split Range Mins | Number | 5040 | See the Split Range Mins section below for more info.
Start Masking From | DateTime | | If you want masking to start from a particular datetime, you can specify the datetime. This setting can be overridden at the Object level.
Mask Dataset Mode | Picklist | Incremental | Indicates how to pick records for masking. See Mask Dataset Mode for more info.
Objects Process Mode | Picklist | Parallel | Indicates how Objects are processed, Parallel or Serial.
Objects Process Threads | Number | 5 | Applicable if Objects Process Mode = Parallel. Indicates how many objects are processed at a time.
Create Status File For | Picklist | Errors | Indicates when to create the status file. See Create Status File For for more info.
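
The way Bulk Api Threshold and Write Batch Size interact can be sketched roughly as follows. This is only an illustration under stated assumptions, not Realtask's implementation: the function name `plan_writes` is hypothetical, the Bulk Api picklist is simplified to an on/off flag, and the sketch assumes the non-bulk path splits records into save requests of at most Write Batch Size records.

```python
import math


def plan_writes(record_count, bulk_api_enabled=True,
                bulk_api_threshold=10000, write_batch_size=200):
    """Return (api, request_count) for a set of records to save.

    Hypothetical helper. Assumes the Bulk Api is chosen once the record
    count reaches the threshold, and that otherwise records are saved in
    requests of at most write_batch_size records.
    """
    if bulk_api_enabled and record_count >= bulk_api_threshold:
        # Assumption: the whole set is handed over as one bulk job.
        return ("bulk", 1)
    return ("batch", math.ceil(record_count / write_batch_size))


print(plan_writes(450))     # below the threshold: Batch Api
print(plan_writes(10000))   # at the threshold: Bulk Api
```

With the defaults shown in the table, 450 records would go out as three Batch Api requests, while 10000 or more records would switch to the Bulk Api.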

Global Settings - Data Masking

This section allows you to configure data masking properties by field type.

Global Settings - Edit Field Type Defaults

Click on the button to open a dialog where you can select a field type and specify its default properties. Click on the + icon to select a field type, then click on that item in the data grid to show its properties. If you select No Mask, then by default fields of that type will not be masked unless those fields are specified in the Fields List.

For the available Mask Types and their properties, see the Mask Types section.

Manage Objects

In the Data Mask Config dialog, Objects List shows the currently selected Objects.

Add Objects

  • Click on the + icon in the data grid toolbar to select a new Object. This opens the Select Objects dialog.
  • Note that the Select Objects dialog shows only Standard Data, Custom Data, and Custom Settings objects.
  • Select one or more objects you want to mask and click Ok.
  • The selected objects are added to the Objects List grid.

Delete Objects

If you don't want to mask one or more objects, select them and click on the Trash icon to delete them. Once deleted, they are removed from the list and will not be masked.

Inactivate Objects

If you want to temporarily exclude one or more objects from masking (for example, as part of testing), you can inactivate them. Click on the checkbox in the "Active" column in the Objects List to inactivate an object. For example, the Account object can be inactive while Contact remains active.

Configure Object

To configure an object, select that object. To select an object, click on the checkbox in the left-most column of that row or click anywhere on the row. If you have selected more than one object, the object and fields configuration is hidden. In that case, unselect the other objects.

Once an object is selected, the right side panel shows the Object Options and the bottom panel shows the Fields List. See those sections for more info.

Object Options

These options apply to the currently selected Object in the Objects List. If no Object or more than one Object is selected, these options are not shown.

Field | Type | Default | Description
Bulk Api | Picklist | Default | Controls if Bulk Api is enabled or not. If it is Default, then global settings will be applied.
Bulk Api Threshold | Number | 10000 | Indicates the number of records after which the Bulk Api should be used. For example, if it is 10000, then the Bulk Api will be used if the system needs to process 10000 or more records in a batch.
Bulk Api Mode | Picklist | Parallel | If the Bulk Api needs to be used, indicates how those records are to be processed by Salesforce.
Write Batch Size | Number | 200 | Number of records to include in one Batch Api save request. This is applicable only for the Batch Api.
Split Range Mins | Number | 5040 | See the Split Range Mins section below for more info.
Start Masking From | DateTime | | See the Start From At section below for more info.
Disable Automations | Picklist | Default | See the Disable Automations section below for more info.

Split Range Mins

Realtask masks records in batches called Split Ranges. It divides the time span of the dataset, based on CreatedDate, into split ranges and processes one range at a time, serially. This helps avoid SOQL query time-outs and helps recovery from errors, because masking restarts from the next split range. This option controls how big those split ranges are, in minutes. It defaults to 5040 minutes (3.5 days). If the object has few records you can increase the range size, and if it has many more records you can decrease it.
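
The split range idea can be sketched as below. This is a minimal illustration, not Realtask's code: the function names `split_ranges` and `soql_filter` are hypothetical, the half-open window boundaries are an assumption, and naive datetimes are assumed to be UTC.

```python
from datetime import datetime, timedelta


def split_ranges(start, end, split_range_mins=5040):
    """Yield (range_start, range_end) windows that cover [start, end)."""
    step = timedelta(minutes=split_range_mins)
    cursor = start
    while cursor < end:
        upper = min(cursor + step, end)
        yield cursor, upper
        cursor = upper


def soql_filter(range_start, range_end):
    """Build a CreatedDate filter for one window (assumed half-open)."""
    fmt = "%Y-%m-%dT%H:%M:%SZ"  # SOQL datetime literal, values assumed UTC
    return (f"CreatedDate >= {range_start.strftime(fmt)} "
            f"AND CreatedDate < {range_end.strftime(fmt)}")


ranges = list(split_ranges(datetime(2024, 1, 1), datetime(2024, 1, 10)))
for lo, hi in ranges:
    print(soql_filter(lo, hi))
```

Each window is queried and masked on its own, so a failure in one window does not require re-querying the windows that were already completed.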

Start From At

If you want to start masking records from a particular datetime (based on CreatedDate), select the date and time. Note that if Mask Dataset Mode is Incremental, then Realtask uses the later of this date and the Masked Till date.

Mask Dataset Mode

Mask Dataset Mode indicates which records are considered as the source for masking. It can be either Incremental or Full.

Incremental: The mask process keeps track of the last CreatedDate up to which masking has been completed, and it continues masking from that date onwards. This works well when processing a whole org (for example, in post-refresh activities), because when you re-run to recover from errors, it continues from where it left off.

Full: The mask process always fetches all records, irrespective of whether some of those records have already been processed.
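
How the start date is chosen under the two modes can be sketched as follows. The function name `effective_start` is hypothetical, and the handling of Full mode (honoring Start Masking From but ignoring Masked Till) is an assumption made for illustration.

```python
from datetime import datetime


def effective_start(mode, start_masking_from=None, masked_till=None):
    """Return the CreatedDate from which masking would begin.

    None means "from the earliest record". Hypothetical helper.
    """
    if mode == "Full":
        # Assumption: Full mode ignores the Masked Till bookmark.
        return start_masking_from
    # Incremental: use the later of Start Masking From and Masked Till.
    candidates = [d for d in (start_masking_from, masked_till) if d is not None]
    return max(candidates) if candidates else None


print(effective_start("Incremental",
                      start_masking_from=datetime(2024, 1, 1),
                      masked_till=datetime(2024, 3, 15)))
```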

Create Status File For

This option indicates when to create status file.

Success/Errors: A status file is created for all records, both success and error records. Keep in mind that if you are masking millions of records, this may not be the best option, as the status file can grow to a few GB.

Errors: A status file is created only for records that fail, along with the error details. This is the default and recommended setting.

None: No status file is created. If there are any errors, you get only the count of errors but not the error details.

Manage Fields

The Fields List panel allows you to configure which fields are queried and which fields are masked for the currently selected object. If no object is selected, the fields panel shows the message "Select an Object to manage its Mask fields". To manage fields, select an object from the Objects List panel.

Fields to Mask

Fields to Mask setting controls which fields are masked for selected object, as follows.

  • Specified: Realtask queries and masks only the fields specified in the Fields List.
  • All: Realtask includes all updatable fields of the current object. For each field, the configuration specified in the Fields List takes precedence. If a field is not specified in the Fields List, the default masking options from the Global Settings are applied.
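
The two settings above can be pictured as a small lookup. This is only a sketch under assumptions: the function name `resolve_field_masks`, the dictionary shapes, and the sample field types and mask choices are all hypothetical, and the sketch assumes every field type has an entry in the Field Type Defaults.

```python
def resolve_field_masks(fields_to_mask, updatable_fields, fields_list,
                        type_defaults):
    """Return {field name: mask type} for one object.

    updatable_fields: {field name: field type} for the object
    fields_list:      {field name: mask type} configured in the Fields List
    type_defaults:    {field type: mask type} from the Global Settings
    """
    if fields_to_mask == "Specified":
        # Only the fields in the Fields List are queried and masked.
        return dict(fields_list)
    if fields_to_mask == "All":
        resolved = {}
        for name, field_type in updatable_fields.items():
            # A Fields List entry wins over the field type default.
            resolved[name] = fields_list.get(name, type_defaults[field_type])
        return resolved
    raise ValueError(f"Unknown Fields to Mask setting: {fields_to_mask}")


updatable = {"Name": "string", "Phone": "phone", "Email": "email"}
configured = {"Email": "Email Skew", "Name": "No Mask"}
defaults = {"string": "Char Transpose", "phone": "Replace First",
            "email": "Fake Data"}
print(resolve_field_masks("All", updatable, configured, defaults))
```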

Fields List

When an object is selected, all specified fields are displayed in the Fields List. Select one of the fields to manage its configuration.

Add Fields

To add new fields,

  • Click on the + icon in the Fields List grid toolbar.
  • This opens a dialog which allows you to select one or more fields.
  • Click on the icon to show some values for the selected field. Note that long-text area fields cannot be filtered to exclude null values, so the system fetches 100 random records and some of them may be null. For other fields, it fetches non-null values.
  • Select one or more Fields and click Ok

Selected fields should be added to Fields List. Now select any one of those fields to configure its properties.

Delete Fields

To delete one or more fields, select them using checkbox in left-most column and click on Trash icon. Once deleted, fields will be removed from the list. If Fields to Mask is set to All then those fields will still be masked but using Field Type default settings.

Activate/Inactivate Fields

If you want to keep fields in the list but exclude them temporarily from the masking process, inactivate them. To inactivate, click on the checkbox in the Active column in the Fields List. To activate, check those checkboxes again.

Configure Field

Select a field to configure its masking properties, which are shown in the right panel under Mask Options.

See Mask Types for the available Mask Types and their properties. Click on Data Mask Playground to open the mask playground, where you can try out the mask options and their properties and see how they mask the data.

Mask Types

Masking can be configured in two places: the Fields List and the Field Type Defaults. In both places, the system allows you to pick a Mask Type and configure its properties. This section documents the Mask Types available in the system and their properties.

Mask Type - Default

The Default mask type means that the applicable default for the field, based on its field type, is used in the following order.

Mask Type - No Mask

If a field is defined as No Mask then, as the name indicates, no masking is applied. This is useful when you want to include a field so that its data can be fetched and used to mask other fields (in case of Binary masking or Formulas) while the field itself is not masked.

If you select No Mask for a field type, then there is no default masking for that field type. This is useful to skip masking all fields of a particular type, for example to leave all number fields unmasked.

Mask Type - Fake Data

This mask type allows you to generate various kinds of realistic-looking data.

Fake Type
Address - Postal Code
Address - City
Address - Street
Address - County
Address - Country Name
Address - Country ISO2 Code
Address - State Name
Address - State Code
Address - Latitude
Address - Longitude
Time Zone
Commerce - Product Name
Commerce - Product Description
Company - Name
Company - Url
Date - Past
Date - Future
Date - Between
Date - Recent
DateTime - Past
DateTime - Future
DateTime - Between
DateTime - Recent
Finance - Account
Finance - Account Name
Finance - Account Number
Finance - Amount
Finance - Transaction Type
Finance - Currency Code
Finance - Currency Name
Finance - Currency Symbol
Finance - Credit Card Number
Finance - Credit Card CVV
Internet - Email
Internet - Url
Internet - Domain Name
Internet - Domain Suffix
Internet - Domain Word
Internet - ip
Internet - ipv6
Lorem - Words
Lorem - Sentences
Lorem - Paragraphs
Person - FirstName
Person - LastName
Person - FindName
Person - JobTitle
Person - Gender
Person - Prefix
Person - Suffix
Person - JobDescriptor
Person - JobArea
Person - JobType
Phone - Number
Phone - NumberFormat
Phone - Formats
Random - Number
Random - Float
Random - uuid
Random - Boolean
Random - Words
Random - Locale
Random - Alpha
Random - AlphaNumeric

Mask Type - Email Skew

Used to change part of email value. This applies only if there is an existing value. If incoming value is blank, then output will be blank too.

Local Part: The local part of an email is the value before the @ char. Available options to change this part are:

  • Retain: It will be left as is
  • Random: Some random value will be generated
  • Char Transpose: Transposes the existing chars

Domain Part: The domain part of an email is the value after the @ char. Available options to change this part are:

  • Retain: It will be left as is
  • Random: Some random value will be generated
  • Fixed: Sets to specified value in Domain Value input field
  • Char Transpose: Transposes the existing chars

Remove Invalid Chars: Sometimes an email value contains invalid chars, which causes problems while saving because Salesforce will return an error. Checking this option removes the invalid chars before updating.
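
The options above can be sketched in a few lines. This is an illustration only, not Realtask's implementation: the function `email_skew`, its parameter names, the shape of the random values, and the set of chars treated as valid are all assumptions.

```python
import random
import re
import string

POOLS = (string.digits, string.ascii_lowercase, string.ascii_uppercase)


def _transpose(text, rng):
    # Swap each ASCII digit or letter for a random one of the same kind.
    out = []
    for ch in text:
        pool = next((p for p in POOLS if ch in p), None)
        out.append(rng.choice(pool) if pool else ch)
    return "".join(out)


def _random_label(rng, length=10):
    return "".join(rng.choice(string.ascii_lowercase) for _ in range(length))


def email_skew(value, local="Retain", domain="Retain", domain_value="",
               remove_invalid_chars=False, rng=None):
    if not value:
        return value  # blank in, blank out
    rng = rng or random.Random()
    local_part, at, domain_part = value.partition("@")
    if local == "Random":
        local_part = _random_label(rng)
    elif local == "Char Transpose":
        local_part = _transpose(local_part, rng)
    if at:
        if domain == "Random":
            domain_part = _random_label(rng) + ".example.com"  # assumed shape
        elif domain == "Fixed":
            domain_part = domain_value
        elif domain == "Char Transpose":
            domain_part = _transpose(domain_part, rng)
    result = local_part + at + domain_part
    if remove_invalid_chars:
        # Assumption: keep only a conservative set of chars.
        result = re.sub(r"[^A-Za-z0-9.@_+\-]", "", result)
    return result


print(email_skew("jane.doe@acme.com", domain="Fixed",
                 domain_value="example.invalid"))
```

Retaining the local part while fixing the domain keeps addresses recognizable for testing while pointing them at a domain you control.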

Mask Type - File Name Skew

Allows you to modify parts of a file name. It is used to change file name fields such as Attachment.Name or ContentVersion.PathOnClient.
This mask type applies only if there is an existing value. If the incoming value is blank, then the output will be blank too.

Name Part: The name part of a file name is the value before the last . char. If there is no . char, then the whole string is considered the name part. Available options to change this part are:

  • Retain: It will be left as is
  • Char Transpose: Transposes the existing chars

Extn Part: The extn (extension) part of a file name is the value after the last . char. If there is no . char in the value, then the extn part is blank. Available options to change this part are:

  • Retain: It will be left as is
  • Fixed: Sets to specified value in Domain Value input field
  • Char Transpose: Transposes the existing chars
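
A rough sketch of how a file name is split and skewed is shown below. The function `file_name_skew` and its parameter names are hypothetical, and the behavior when a fixed extension is applied to a name that had no extension is an assumption.

```python
import random
import string

POOLS = (string.digits, string.ascii_lowercase, string.ascii_uppercase)


def _transpose(text, rng):
    # Swap each ASCII digit or letter for a random one of the same kind.
    out = []
    for ch in text:
        pool = next((p for p in POOLS if ch in p), None)
        out.append(rng.choice(pool) if pool else ch)
    return "".join(out)


def file_name_skew(value, name="Retain", extn="Retain", extn_value="",
                   rng=None):
    if not value:
        return value  # blank in, blank out
    rng = rng or random.Random()
    name_part, dot, extn_part = value.rpartition(".")
    if not dot:
        # No "." at all: the whole string is the name part.
        name_part, extn_part = value, ""
    if name == "Char Transpose":
        name_part = _transpose(name_part, rng)
    if extn == "Fixed":
        extn_part = extn_value
    elif extn == "Char Transpose":
        extn_part = _transpose(extn_part, rng)
    if dot or extn_part:
        return name_part + "." + extn_part
    return name_part


print(file_name_skew("report.final.pdf", extn="Fixed", extn_value="txt"))
```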

Mask Type - Picklist Skew

This mask type is applicable to fields of type Picklist/Multipicklist. When run, it randomly produces one of the valid picklist values of that field.

Mask Type - Replace First

This mask type allows you to replace the first n chars of the input string with particular mask chars and leaves the rest as is. It works well for skewing a phone number by replacing the first 7 chars with x and leaving the last 4 chars as is.

Char Count: Number of chars in the incoming string to replace. If not specified, it defaults to all.

Match Chars: The set of chars that should be considered. A char that is not in this set is left as is. Defaults to alphanumeric chars.

Action: What to do with the matched chars.

  • Char Transpose: Char transpose the matching chars
  • Replace: Replace with specified chars (default)

Replace Chars: If the action is Replace, specify the chars to replace with. Defaults to 'x'.
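
The properties above can be sketched as follows. This is an illustration under assumptions, not the product's implementation: the function `replace_first` is hypothetical, Char Count is assumed to count positions from the start of the string (so non-matching chars inside that window are left as is), and Replace Chars is assumed to be substituted once per matched char.

```python
import random
import string

ALNUM = string.ascii_letters + string.digits
POOLS = (string.digits, string.ascii_lowercase, string.ascii_uppercase)


def _transpose_char(ch, rng):
    # Swap an ASCII digit or letter for a random one of the same kind.
    pool = next((p for p in POOLS if ch in p), None)
    return rng.choice(pool) if pool else ch


def replace_first(value, char_count=None, match_chars=ALNUM,
                  action="Replace", replace_chars="x", rng=None):
    if not value:
        return value
    rng = rng or random.Random()
    # Char Count defaults to the whole string.
    limit = len(value) if char_count is None else min(char_count, len(value))
    out = []
    for i, ch in enumerate(value):
        if i < limit and ch in match_chars:
            if action == "Char Transpose":
                out.append(_transpose_char(ch, rng))
            else:
                out.append(replace_chars)
        else:
            out.append(ch)  # outside the window or not a match char
    return "".join(out)


print(replace_first("415-555-1234", char_count=7))
```

Under these assumptions, the phone number example keeps its separators and its last four digits, which keeps the value readable while hiding most of it.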

Mask Type - Nullify

This mask type blanks out the input.

Mask Type - Char Transpose

This mask type replaces each alphanumeric char with a random char from the same char pool. For example, digits are replaced with random digits, lowercase chars with random lowercase chars, and uppercase chars with random uppercase chars. Non-alphanumeric chars are left as is.
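
A minimal sketch of this behavior is shown below. The function name `char_transpose` is hypothetical, and the sketch assumes only ASCII digits and letters are treated as alphanumeric.

```python
import random
import string

POOLS = (string.digits, string.ascii_lowercase, string.ascii_uppercase)


def char_transpose(value, rng=None):
    """Replace each ASCII digit or letter with a random char of the same pool."""
    rng = rng or random.Random()
    out = []
    for ch in value:
        pool = next((p for p in POOLS if ch in p), None)
        # Chars outside all pools (spaces, punctuation) are kept as is.
        out.append(rng.choice(pool) if pool else ch)
    return "".join(out)


print(char_transpose("Ab-12 cd!"))
```

Because each char stays in its own pool, the masked value keeps the length and format of the original, which helps it pass format validations.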

Mask Type - Values

This mask type takes list of values and produces one value each time it is used either in random or in specified order.

List of Values: Enter the list of values, one value per line. If no values are specified, it produces a blank value.

Match Type: How values should be produced/matched.

  • Specified Order: Values are produced in the specified order and when it reaches end, it starts from beginning again.
  • Random: Random value from list of values is produced
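
The two match types can be sketched as below. The function name `values_mask` and its parameters are hypothetical, and skipping empty lines in the list is an assumption.

```python
import itertools
import random


def values_mask(values_text, match_type="Specified Order", rng=None):
    """Return a function that produces the next mask value on each call."""
    values = [line for line in values_text.splitlines() if line.strip()]
    if not values:
        return lambda: ""  # no values specified: blank output
    if match_type == "Random":
        rng = rng or random.Random()
        return lambda: rng.choice(values)
    # Specified Order: walk the list and wrap around at the end.
    cycle = itertools.cycle(values)
    return lambda: next(cycle)


next_value = values_mask("Red\nGreen\nBlue")
print([next_value() for _ in range(5)])
```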

Mask Type - Formula

This is an advanced mask type which gives you full control over how incoming data is changed. You can use any of the available formula functions and record values to produce a new value. If you want to refer to a field but do not want to mask it, add that field and set its mask type to No Mask, so that the field is queried and can be used in other fields' formulas.

Data Mask Playground

We have built a Data Mask Playground which allows you to experiment with various mask types. It can also be accessed by clicking on the Data Mask Playground link next to Mask Type combobox.

Data Mask Preview

Data Mask Preview allows you to preview how masking works as you are configuring it. Along with Data Mask Playground, preview helps you confidently configure the data masking per your business needs.

To preview,

  • Select a single Object from the Objects List. If you select none or more than one, the Fields List and the Preview button are disabled.
  • Click on the Preview Data button.
  • This opens a dialog showing the preview.
  • The top grid, named Source, shows the records as they exist in the Org.
  • The bottom grid, named Masked, shows what the records look like after they are masked.

A few things to keep in mind:

  • Preview shows a maximum of 100 records
  • Record selection applies the filtering logic you specified in the Object Options

Execution

To execute a Task, execute the Job to which the task belongs. See Job Execution for more info on Job execution.

Mask Runs

To see the Mask progress or completion details, follow these steps.

  • Once the Job starts executing, it creates a Run record. You can see the Run records in the Runs tab on the Job page. If you don't see the Run, click on the Refresh icon.

  • Click on the Run link, which opens the Run page. On the Run page, click on the View Additional Info icon. If you have multiple tasks as part of the Job, make sure you click on the icon associated with the Mask task you are interested in.

  • This opens the Mask Run Details dialog, which shows complete information about the masking. See the next section for more details.

Mask Run Details

The Data Mask Run Details dialog shows the current status of the Mask process.

The data grid shows one row for each Object configured in the Mask config. The columns in the data grid are as follows.

Column | Description
Actions | If there are any files associated with that Object, then the Files icon is shown. Click on the icon to show the files dialog.
Name | Name of the Object being masked
Status | Current status of the mask process for that Object
Masked Till | Realtask masks the records in batches. At the end of each batch, it notes the batch end date, and that date is shown here. If Mask Dataset Mode is Incremental, the next mask run will mask records from this date onwards.
Start At | When masking started for this Object
Duration | How long the mask process took
Total | Total records masked
Success | Number of records successfully masked
Skipped | Number of records skipped
Errors | Number of records that failed. If this column shows any number, click on the Files icon to view the status files, which give details as to why the records failed.
Message | Any message associated with this Object