Data Sensitivity
Introduction
Database-related technologies store data, and some of this data may be sensitive in nature, for example confidential information such as:
- Salary
- Bonus
- First Name
- Last Name
- Contact details
- etc.
When analyzing this type of data, CAST can tag the resulting object with a specific sensitivity level property. This property can then be seen and exploited in CAST Imaging Viewer.
How does it work?
There are various types of sensitive data that CAST can detect during an analysis:
Custom
A list of key words (i.e. names of objects that contain sensitive data), each with its sensitivity level, must be configured in a plain text file with the extension .datasensitive before the analysis is run. This file must be delivered with the source code. When a key word defined in the .datasensitive file matches an object created during an analysis, a property is added to the object that flags it with the defined sensitivity level. This property can then be seen and exploited in CAST Imaging Viewer.
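The matching step described above can be pictured with a minimal Python sketch. It is an illustration only, not CAST's implementation: the function names are hypothetical, and it assumes an exact, case-sensitive comparison between the object name and the key word, which this page does not confirm.

```python
from typing import Dict, Iterable, Optional

# Key word -> sensitivity level, as it might be loaded from a .datasensitive file.
KEYWORDS: Dict[str, str] = {
    "UserDetails": "Highly sensitive",
    "UserContacts": "Very sensitive",
    "UserID": "Sensitive",
}


def sensitivity_for(object_name: str, keywords: Dict[str, str]) -> Optional[str]:
    """Return the configured level when the object name equals a key word.

    Assumption: exact, case-sensitive equality. The real matching rules
    (case handling, partial matches) may differ.
    """
    return keywords.get(object_name)


def tag_objects(object_names: Iterable[str], keywords: Dict[str, str]) -> Dict[str, str]:
    """Map each matched object name to its level; unmatched names are left out."""
    tagged = {}
    for name in object_names:
        level = sensitivity_for(name, keywords)
        if level is not None:
            tagged[name] = level
    return tagged
```

With these sample key words, calling `tag_objects(["UserDetails", "Orders", "UserID"], KEYWORDS)` flags `UserDetails` and `UserID` and leaves `Orders` untagged.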
Built-in for Table Columns
The com.castsoftware.datacolumnaccess extension provides a predefined list of key words to match table column objects that hold sensitive data. The list of key words is documented in the extension itself. The extension also supports custom key words.
GDPR and PCI-DSS specific
GDPR and PCI-DSS sensitive data is detected automatically by CAST Imaging Console ≥ 1.26 for all supported technologies (see below) using predefined lists of key words. The lists of key words provided on each node are as follows:
GDPR key words
#=================================================
# Name: "Sensitive" data
#=================================================
Name=Sensitive
FirstName=Sensitive
First-Name=Sensitive
LastName=Sensitive
Last-Name=Sensitive
#=================================================
# Phone numbers: "Sensitive" data
#=================================================
Phone=Sensitive
PhoneNo=Sensitive
Phone-No=Sensitive
PhoneNb=Sensitive
Phone-Number=Sensitive
#=================================================
# Payment card data: "Very sensitive" data
#=================================================
#---- Primary Account Number
PAN=Very sensitive
#---- Cardholder Name
CardholderName=Very sensitive
Cardholder-Name=Very sensitive
#---- Expiration Date
ExpirationDate=Very sensitive
Expiration-Date=Very sensitive
#---- Service Code
ServiceCode=Very sensitive
Service-Code=Very sensitive
#=================================================
# ID numbers: "Very sensitive" data
#=================================================
IdCard=Very sensitive
Passport=Very sensitive
SSID=Very sensitive
#=================================================
# Location data: "Sensitive" data
#=================================================
Address=Sensitive
#=================================================
# Online identifiers: "Very sensitive" data
#=================================================
Login=Very sensitive
Password=Very sensitive
#=================================================
# Criminal convictions: "Highly sensitive" data
#=================================================
CriminalRecord=Highly sensitive
Criminal-Record=Highly sensitive
Offences=Highly sensitive
#=================================================
# Race, Gender, Birthdate: "Very sensitive" data
#=================================================
Race=Very sensitive
Sex=Very sensitive
Gender=Very sensitive
Birthday=Very sensitive
Birthdate=Very sensitive
#=================================================
# Medical information: "Very sensitive" data
#=================================================
MedicalExamination=Very sensitive
Medical-Examination=Very sensitive
MedicalReport=Very sensitive
Medical-Report=Very sensitive
MedicalIssue=Very sensitive
Medical-Issue=Very sensitive
PCI-DSS key words
#=================================================
# Category 1 - Cardholder Data
#=================================================
# Primary Account Number: "Very sensitive" data
#=================================================
PAN=Very sensitive
#=================================================
# Cardholder Name: "Very sensitive" data
#=================================================
CardholderName=Very sensitive
Cardholder-Name=Very sensitive
#=================================================
# Expiration Date: "Very sensitive" data
#=================================================
ExpirationDate=Very sensitive
Expiration-Date=Very sensitive
#=================================================
# Service Code: "Very sensitive" data
#=================================================
ServiceCode=Very sensitive
Service-Code=Very sensitive
#
#=================================================
# Category 2 - Sensitive Authentication Data
#=================================================
# Track data (magnetic-stripe data or equivalent on a chip): "Sensitive" data
#=================================================
FullTrackData=Sensitive
Full-Track-Data=Sensitive
MagneticData=Sensitive
Magnetic-Data=Sensitive
ChipData=Sensitive
Chip-Data=Sensitive
#=================================================
# CVV numbers: "Sensitive" data
#=================================================
CAV2=Sensitive
CVC2=Sensitive
CVV2=Sensitive
CID=Sensitive
#=================================================
# PIN/PIN blocks: "Sensitive" data
#=================================================
PIN=Sensitive
PIBLOCK=Sensitive
PIN-BLOCK=Sensitive
Which technologies are supported for data sensitivity detection?
Technology | Custom key words | Built-in key words | GDPR/PCI-DSS | Targeted object types | Required extension |
---|---|---|---|---|---|
Mainframe | ✔️ | ❌ | ✔️ | Cobol File Link, JCL Dataset, IMS Segment | com.castsoftware.mainframe.sensitivedata |
NoSQL for Java | ✔️ | ❌ | ✔️ | Collections | com.castsoftware.nosqljava (≥ 1.6.16) |
NoSQL for .NET | ✔️ | ❌ | ✔️ | Collections | com.castsoftware.nosqldotnet (≥ 1.7.0) |
SQL | ✔️ | ✔️ | ✔️ | Table Columns | com.castsoftware.datacolumnaccess - note that this extension provides a default list of key words for data sensitive table columns, but custom key words can also be added. |
SQL | ✔️ | ❌ | ✔️ | Tables | com.castsoftware.sqlanalyzer (≥ 3.6.10) - see also SQL Analyzer - RDBMS Table Sensitive Data. |
Cloud Storage | ✔️ | ❌ | ✔️ | AWS Buckets, Azure Blob Container, GCP Cloud Storage Bucket | com.castsoftware.nodejs (≥ 2.10.2), com.castsoftware.typescript (≥ 1.13.0), com.castsoftware.awsdotnet (≥ 1.0.4-funcrel), com.castsoftware.awsjava (≥ 1.2.5-funcrel) |
Configuration instructions
Custom key words
Define the .datasensitive file
First define the key words that identify the objects you want to flag. To do this, create a text file with the extension .datasensitive (the file name itself can be anything) and fill it with your key word definitions, using the format shown below:
- one key word per line
- three levels of sensitivity, which are case sensitive and must respect the format listed below, otherwise they will be ignored:
keyword=Highly sensitive
keyword=Very sensitive
keyword=Sensitive
For example:
UserDetails=Highly sensitive
UserContacts=Very sensitive
UserID=Sensitive
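Because a misspelt or wrongly cased level is ignored, it can help to check a .datasensitive file before delivering it. The following Python sketch is one possible pre-delivery check, not a CAST tool: the function name is hypothetical, and skipping blank lines and lines starting with `#` is an assumption based on the comment lines seen in the predefined GDPR and PCI-DSS lists.

```python
# The three levels, written exactly as the format above requires.
VALID_LEVELS = ("Highly sensitive", "Very sensitive", "Sensitive")


def parse_datasensitive(text):
    """Return (entries, problems) for the content of a .datasensitive file.

    entries maps key word -> level for every well-formed line;
    problems is a list of (line_number, message) for the others.
    """
    entries = {}
    problems = []
    for number, raw in enumerate(text.splitlines(), start=1):
        # Assumption: surrounding whitespace is not significant.
        line = raw.strip()
        # Assumption: blank lines and '#' comment lines are skipped.
        if not line or line.startswith("#"):
            continue
        keyword, separator, level = line.partition("=")
        if not separator or not keyword:
            problems.append((number, "expected keyword=level"))
        elif level not in VALID_LEVELS:
            # Levels are compared case-sensitively, as the format requires.
            problems.append((number, f"unknown level {level!r}"))
        else:
            entries[keyword] = level
    return entries, problems
```

A line such as `UserContacts=very sensitive` is reported as a problem because the level does not match the expected casing.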
Deliver the .datasensitive file
The .datasensitive file must be delivered with your source code. It should be located as follows:
Extension | Location |
---|---|
com.castsoftware.mainframe.sensitivedata | In a dedicated folder called Database, used specifically for the .datasensitive file. |
com.castsoftware.nosqljava | In the root folder alongside the source code. |
com.castsoftware.nosqldotnet | In the root folder alongside the source code. |
com.castsoftware.datacolumnaccess | In the root folder alongside the source code. |
com.castsoftware.sqlanalyzer | In the root folder alongside the source code. |
com.castsoftware.awsdotnet | In the root folder alongside the source code. |
com.castsoftware.awsjava | In the root folder alongside the source code. |
Note that CAST Imaging Console does not expose the .datasensitive file in the Overview panel.
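Since the file is not visible in the Overview panel, its placement may be easier to verify in the source delivery itself. The Python sketch below shows one way to do that, checking the locations from the table above. The function names and the `mainframe` flag are hypothetical, and for the Mainframe case it assumes the file sits directly inside a folder named Database, which is one reading of the table.

```python
from pathlib import Path


def find_datasensitive_files(source_root):
    """Return the paths, relative to source_root, of all .datasensitive files."""
    root = Path(source_root)
    return sorted(
        p.relative_to(root) for p in root.rglob("*.datasensitive") if p.is_file()
    )


def check_placement(source_root, mainframe=False):
    """Return the .datasensitive files that are not in the expected location.

    mainframe=False: the file is expected in the root folder.
    mainframe=True: the file is expected directly inside a folder named
    'Database' (assumption, see the lead-in).
    """
    misplaced = []
    for relative in find_datasensitive_files(source_root):
        if mainframe:
            ok = relative.parent.name == "Database"
        else:
            ok = relative.parent == Path(".")
        if not ok:
            misplaced.append(relative)
    return misplaced
```

An empty list returned by `check_placement` means every .datasensitive file found is in the expected location for that extension.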
GDPR and PCI-DSS files
No configuration is required for GDPR and PCI-DSS: CAST Imaging Console automatically retrieves the necessary files before the analysis, so you do not need to provide them (they are standard files).
What results can we expect?
Once the analysis/snapshot generation has completed, you can view the results in the usual way (for example via CAST Imaging). Some examples are shown below:
Custom sensitive property
When an object name matches a key word defined in the .datasensitive file delivered with the source code:
Built-in sensitive property
These are provided by the com.castsoftware.datacolumnaccess extension for Table columns. Note that CAST Imaging Viewer does not currently expose Table columns in the view interface.
GDPR sensitive property
When an object name matches a GDPR key word:
PCI-DSS sensitive property
When an object name matches a PCI-DSS key word: