Understanding J+Track Monitoring & Data Privacy

Overview

J+Track is a global tool, so its full feature set is available to all markets. Its core functionality is setting up tracking plans and exporting GTM configurations, but some features, such as Monitoring, can raise privacy concerns.

Data privacy is an important requirement in many markets. Failure to adhere to regulations such as GDPR can have significant legal and financial implications for the business and must be taken incredibly seriously.

The Monitoring aspect of J+Track does raise privacy concerns, and this article details their impact and the considerations around them.

Important Data Details:

Collected data values are uglified/obfuscated
Google services and data are stored and hosted in the EU
- This currently cannot be changed
Any data collected can be deleted upon request
Data collected is automatically deleted after 13 months
- This can also be adjusted upon request
Collection must adhere to any user consent enabled for the website
- This must be done manually by the practitioner in GTM

Potential Impact

J+Track Monitoring is designed to run via Google Tag Manager and can therefore end up deployed on a production website, depending on the requirements and goal of the Monitoring. When it is, it collects data and information from a user's browser session.

The key data Monitoring collects is:

dataLayer implemented on the website (dataLayer)
Google Tag Manager Tags triggered on the website (GTM Tags)
Additional supporting data points (Other)

Data collected in detail:

dataLayer
- Structure only
- Event names
- Keys
- Values are obfuscated
GTM Tags
- Tag Id
- Execution Time
- Execution Status
- Triggering dataLayer event
Other
- Browser URL (obfuscated as required)
- Browser User Agent
- Timestamp
- GTM Container ID
- GTM Container Version
- J+Track Related Plan ID (Internal Ids for J+Track)
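
As a rough illustration of the list above, a single Monitoring record could be modelled as follows. This is a sketch only: every type and field name here is hypothetical and chosen for readability, and the actual J+Track payload format is not documented in this article.

```typescript
// Hypothetical shape of one Monitoring record.
// All names below are assumptions, not the actual J+Track payload.
interface DataLayerCapture {
  event: string;                        // event name, collected as-is
  payload: { [key: string]: unknown };  // keys as-is, values obfuscated
}

interface TagCapture {
  tagId: string;
  executionTime: number;
  executionStatus: string;
  triggeringEvent: string;              // dataLayer event that triggered the Tag
}

interface MonitoringRecord {
  dataLayer: DataLayerCapture;
  tags: TagCapture[];
  url: string;                          // obfuscated as required
  userAgent?: string;                   // optional data point
  timestamp: number;
  containerId: string;
  containerVersion: number;
  planId: string;                       // internal J+Track plan ID
}

// Made-up sample values, for illustration only.
const example: MonitoringRecord = {
  dataLayer: { event: "purchase", payload: { currency: "aaa", value: "00.00" } },
  tags: [
    { tagId: "12", executionTime: 35, executionStatus: "success", triggeringEvent: "purchase" },
  ],
  url: "https://www.example.com/checkout?aaaaa=aaaa",
  userAgent: "Mozilla/5.0",
  timestamp: 1700000000,
  containerId: "GTM-XXXXXXX",
  containerVersion: 101,
  planId: "plan-001",
};
```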

Key Actions if suspected PII is captured:

Pause Monitoring Tag in GTM
- Publish these changes as a priority
Inform the J+Track team of your Client/Project and request a data deletion
- J+Track team will purge BigQuery of all collected monitoring data for that client and project
Document the PII capture
- Detail the sequence of events
- Detail the nature of the collection
- Detail the steps taken to remediate the issue
- Discuss with Client Project Team

Collecting dataLayer

Potential Risk

A dataLayer is, by nature, information the business intends to share to empower analytics and marketing activities. Its implementation does not necessarily imply that it will contain personally identifiable information (PII) or breach any privacy requirements.

A dataLayer exists for that user's session, only containing information provided to it. It does not store data in cookies or persist across new page loads. It will often simply represent the current state of data at the time for that user. On its own it does not transmit the data to any third party service.

From this perspective, it is easy to assume that collecting the dataLayer for audit or analysis purposes would not be problematic. However, this is not the case.

As we are not able to control what a client adds to the dataLayer or how they structure it, the values it contains do have the potential to contain PII or other data that may breach a user's privacy.

For example, a user's name and email could be stored in plain text in the dataLayer. While this is bad practice, on its own it does not necessarily breach the user's privacy. You can consider it the same as showing the user their own email.

However, if this value is taken from the dataLayer and then shared with 3rd party services (including J+Track Monitoring), intentionally or unintentionally, it will likely breach privacy policies.

Addressing the Risk

Given that the values in the dataLayer pose the greatest risk, J+Track Monitoring has been designed to uglify and obfuscate these values and therefore only collect the dataLayer structure and/or design.

This is true for all dataLayer values except for the ‘event’ key as this is essential for the Monitoring to function correctly.

Example Collection Value Obfuscation (network request)
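
The behaviour described in this section can be sketched roughly as below. This is an illustration under assumptions, not the J+Track implementation: the function names are hypothetical, the character rule is assumed to replace ASCII letters with a and digits with 0, and primitives are assumed to be converted to strings before obfuscation.

```typescript
// Illustrative only: not the actual J+Track implementation.
type Json = string | number | boolean | null | Json[] | { [key: string]: Json };

// Assumed character rule: ASCII letters -> "a", digits -> "0", everything else kept.
function uglify(text: string): string {
  return text.replace(/[A-Za-z]/g, "a").replace(/[0-9]/g, "0");
}

// Recursively obfuscate values while keeping keys and structure intact.
function obfuscateValue(value: Json): Json {
  if (value === null) return null;
  if (Array.isArray(value)) return value.map((item) => obfuscateValue(item));
  if (typeof value === "object") {
    const out: { [key: string]: Json } = {};
    for (const key of Object.keys(value)) {
      out[key] = obfuscateValue(value[key] ?? null);
    }
    return out;
  }
  return uglify(String(value)); // primitives are stringified, then uglified
}

// Obfuscate one dataLayer push, leaving the top-level "event" value readable.
function obfuscatePush(push: { [key: string]: Json }): { [key: string]: Json } {
  const out: { [key: string]: Json } = {};
  for (const key of Object.keys(push)) {
    const value = push[key] ?? null;
    out[key] = key === "event" ? value : obfuscateValue(value);
  }
  return out;
}

const collected = obfuscatePush({
  event: "sign_up",
  user: { email: "jane.doe@example.com", age: 34 },
});
// collected: { event: "sign_up", user: { email: "aaaa.aaa@aaaaaaa.aaa", age: "00" } }
```

Note that the keys (user, email, age) and the event name pass through unchanged; only the values are replaced.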

Remaining Risk

The setup is designed to remove the risk posed by the values, which, as mentioned above, are the most likely source of unintentionally captured PII.

However, it is possible, although very unlikely, that either of the following occurs, each of which has the potential to capture PII.

dataLayer events contain PII
- As the event key is excluded from the obfuscation, any PII that a badly designed dataLayer places in its event names will be captured.
- This will likely be problematic beyond Monitoring and will need to be addressed with the client promptly
dataLayer keys contain PII
- dataLayer keys are also excluded from the obfuscation
- Keys could be set to user IDs or other PII values
  - This would be extremely bad dataLayer design

Both of the above examples are considered extremely rare as it would indicate very poor dataLayer design and misuse of the dataLayer. That however does not rule out the potential for such a setup to exist.

Unfortunately, the event name and keys are required for Monitoring to function. Without these data points, the dataLayer cannot be collected and compared to a tracking plan. These two points will have to be accepted as an unlikely risk.
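
Because event names and keys are collected as-is, a practitioner may want to sanity-check a dataLayer before enabling Monitoring. The sketch below is a hypothetical helper, not part of J+Track: the function names and the two patterns (email-like text, long runs of digits) are assumptions and are far from an exhaustive PII detector.

```typescript
// Hypothetical pre-deployment check; not part of J+Track.
// The patterns are examples only and will miss many kinds of PII.
const EMAIL_LIKE = /[^\s@]+@[^\s@]+\.[^\s@]+/;
const LONG_DIGIT_RUN = /[0-9]{6,}/; // e.g. user IDs or phone numbers used as keys

function looksLikePii(name: string): boolean {
  return EMAIL_LIKE.test(name) || LONG_DIGIT_RUN.test(name);
}

// Return the paths of all suspicious keys (at any depth), plus the
// top-level event name if it looks suspicious.
function findSuspiciousNames(push: unknown, path = ""): string[] {
  if (push === null || typeof push !== "object") return [];
  const record = push as { [key: string]: unknown };
  const found: string[] = [];
  for (const key of Object.keys(record)) {
    const here = path === "" ? key : path + "." + key;
    if (looksLikePii(key)) found.push(here);
    const value = record[key];
    if (path === "" && key === "event" && typeof value === "string" && looksLikePii(value)) {
      found.push("event=" + value);
    }
    found.push(...findSuspiciousNames(value, here));
  }
  return found;
}

const flagged = findSuspiciousNames({
  event: "login_jane@example.com",
  users: { "1234567": { plan: "pro" } },
});
// flagged: ["event=login_jane@example.com", "users.1234567"]
```

Any hit from a check like this would point to a dataLayer design problem that needs to be raised with the client regardless of Monitoring.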

Collecting Supporting Data

Potential Risk

The supporting data helps provide context and functionality to the monitoring reports.

Data Point	Potential Risk	Details
Browser URL	Low	The current browser URL of the user.
Browser User Agent	Very Low	The current browser user agent of the user.
Timestamp	None	The current UNIX time the event is collected
GTM Container ID	None	The GTM Unique Identifier. Publicly available on the website
GTM Container Version	None	The current deployed version of GTM as a number e.g. 101
J+Track Related Plan ID	None	Internal Ids used for J+Track plan matching and association

The table above shows two low-level risks: the Browser URL and the Browser User Agent. These risks are not limited to Monitoring, as most 3rd party services also capture these two data points, and an issue in the URL will likely impact all services.

Browser URLs are often a potential entry point for PII. This type of data in the URL is usually the result of bugs/issues within a website's code (such as a GET form failing) or of poor design.

A common data point captured via the URL is a user's email or identifier. In extreme cases, payment and/or form information can make its way into the URL.

PII most commonly occurs in:

URL Query String (most likely)
Page Paths (unlikely)
Anchors (very unlikely)

The captured Browser User Agent contains information about the user's device and browser. This data point alone is often not enough to identify a specific user. In combination with many other data points it may be possible to fingerprint a user for identification, but a User Agent string on its own should not pose a high risk.

Addressing the Risk

J+Track Monitoring has been designed to uglify and obfuscate the Query String of the user's Browser URL. The rest of the URL is kept intact, but all query strings are obfuscated, which removes the greatest risk posed by the URL.
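
A minimal sketch of this query string obfuscation is shown below. It is an illustration under assumptions rather than the J+Track implementation: the function names are hypothetical, both query keys and values are assumed to be uglified (ASCII letters to a, digits to 0), and the anchor is assumed to be left untouched.

```typescript
// Illustrative only: not the actual J+Track implementation.
// Assumed character rule: ASCII letters -> "a", digits -> "0", everything else kept.
function uglify(text: string): string {
  return text.replace(/[A-Za-z]/g, "a").replace(/[0-9]/g, "0");
}

// Obfuscate only the query string; origin, path and anchor are left as they are.
function obfuscateQueryString(url: string): string {
  const hashStart = url.indexOf("#");
  const end = hashStart === -1 ? url.length : hashStart;
  const queryStart = url.indexOf("?");
  // No query string (or the "?" belongs to the anchor): nothing to obfuscate.
  if (queryStart === -1 || queryStart > end) return url;
  return (
    url.slice(0, queryStart + 1) +
    uglify(url.slice(queryStart + 1, end)) +
    url.slice(end)
  );
}

const withQuery = obfuscateQueryString(
  "https://shop.example.com/account/reset?email=jane@example.com&token=ab12#form"
);
// withQuery: "https://shop.example.com/account/reset?aaaaa=aaaa@aaaaaaa.aaa&aaaaa=aa00#form"

const withPathPii = obfuscateQueryString(
  "https://shop.example.com/users/jane@example.com/orders"
);
// withPathPii is returned unchanged, because only the query string is obfuscated
```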

Capturing the Browser User Agent is considered low risk. However, if this is a concern for the client, it can be removed from the Monitoring Tag in GTM by removing the variable reference.

Remaining Risk

If a piece of PII appears in the URL as part of the page path, it is a potential risk for Monitoring, as the Page Path of the URL is excluded from the obfuscation.

This is considered an unlikely risk. Should such data be present it would be a significant concern beyond monitoring and would need to be addressed by the client as soon as possible.

Collecting Tags

Potential Risk

Collecting the Tags triggered in GTM should pose no risk at all to user privacy. Only a minimal set of data points is collected, all of which relate directly to Google Tag Manager configuration and the status of an executing Tag.

None of these data points are connected to user information.

Addressing the Risk

Given the nature of the Tag collection and potential risk there are no steps required to safeguard this collection of data.

Remaining Risk

To the current understanding of the J+Track team, there are no significant risks in capturing this information.

The only data exposure here would be listing the tags/3rd party services a client uses on their website. This information is already publicly available due to the nature of Tagging and usage of GTM.

Uglify/Obfuscation Details

The data is transformed on the client side (that is, on the client's website), so that the modifications needed for GDPR compliance are applied there.

All values are uglified according to the following rules:

Letters are replaced with a
Numbers are replaced with 0
Special characters such as / - _ & are kept so that the practitioner can still validate the data structure

Example Collection Value Obfuscation (output collection):
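
The uglify rules can be illustrated with a short sketch. The function name is hypothetical, and the sketch assumes that only ASCII letters and individual digits are replaced; how the real implementation treats other characters is not specified in this article.

```typescript
// Illustrative only: not the actual J+Track implementation.
// Assumes ASCII letters and individual digits are replaced; all other
// characters (for example / - _ & and spaces) are kept.
function uglify(value: string): string {
  return value
    .replace(/[A-Za-z]/g, "a") // letter -> a
    .replace(/[0-9]/g, "0");   // number -> 0
}

const samples = ["John Smith", "order_2024-0042", "+44 20 7946 0958", "a/b&c=1"];
const uglified = samples.map((sample) => uglify(sample));
// uglified: ["aaaa aaaaa", "aaaaa_0000-0000", "+00 00 0000 0000", "a/a&a=0"]
```

The length and separators of each value survive, which is what lets a practitioner validate the structure without seeing the underlying data.
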

If you need further assistance, please reach out to the J+Track team on the Slack channel #help-jplus-track.