How will the GDPR Affect the Use of Alternative Data?
Investment firms that rely on alternative data to drive their trading strategies will be affected by the GDPR in how they obtain and use that data, to the extent the data contains the personal data of individuals located in the EU.
Data sets that may contain the personal data of individuals in the EU include credit card data, social media data, app usage data, and geolocation data. If a fund purchases this type of data, it may come into possession of personally identifiable information (PII). The GDPR’s own term is “personal data”, which it defines broadly as any information relating to an identified or identifiable individual. The fund’s focus should therefore be not only on whether it receives actual PII on an individual, but also on whether the data can be reverse engineered, or combined with other data sets, to identify an individual.
Tip: Fund managers buying data from an alternative data vendor will push for additional representations from the vendor around GDPR. These representations will include assurances from the vendor that it is not collecting or receiving anything that constitutes personal data under the GDPR or, if it is receiving personal data, that it is GDPR-compliant and has proper legal/compliance oversight in place. Vendors should have the supporting documentation ready.
In addition to data sets purchased from a third party, a fund’s internal web scraping activities may be scrutinized under the GDPR. For example, a fund that scrapes data from a social media platform may come into contact with the personal information of EU residents. Funds should avoid collecting usernames, first and last names, locations, and similar details of users on such platforms. If this data is collected anyway, fund managers need to change how they store it so that the storage meets the GDPR’s requirements for personal data.
What about anonymized/de-identified data? Even with all the constraints that the GDPR imposes, using anonymized data is a way to stay outside the GDPR’s requirements, provided the data is truly anonymized. Data that is merely de-identified or pseudonymized, and that can still be linked back to an individual, remains personal data under the GDPR. For most investment use cases, PII does not add any value to the data. The specifics of who made a purchase for X dollars do not influence the data analysis or enrich it in any way. Additionally, receiving this data is, GDPR aside, a large legal and compliance risk in its own right.
Tip: Although data may be purchased as anonymized, and agreements may contain provisions such as “we do not want any personal information”, data cleansing is unfortunately not perfect. There are many instances where stray email addresses, residential addresses, first/last names, or credit card numbers make their way into a deliverable. The way to address this on the fund end is to have a separate environment where an additional round of data cleansing occurs before the data is delivered to an investment team. On the vendor end, the agreement needs to set out a written process that governs what the vendor will do if any PII inadvertently shows up in a data set.
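As a rough illustration of the additional cleansing step on the fund end, the following is a minimal Python sketch that redacts stray email addresses and card-like numbers from a deliverable before it reaches an investment team. The function names, regular expressions, and redaction markers are assumptions made for this example, not a standard or a vendor API. It covers only two kinds of PII; names and residential addresses need other techniques, so a clean result here is not evidence that a data set is free of personal data.

```python
import re

# Hypothetical patterns for illustration only; a real pipeline needs broader coverage.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
CARD_RE = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")  # 13 to 19 digits, optional separators


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum used by payment cards."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def scrub_value(value: str) -> tuple[str, list[str]]:
    """Redact emails and Luhn-valid card numbers; return the cleaned text and kinds found."""
    found = []

    def redact_email(match):
        found.append("email")
        return "[REDACTED_EMAIL]"

    def redact_card(match):
        digits = re.sub(r"\D", "", match.group())
        if luhn_valid(digits):
            found.append("card_number")
            return "[REDACTED_CARD]"
        return match.group()  # long digit run that is not a plausible card number

    cleaned = EMAIL_RE.sub(redact_email, value)
    cleaned = CARD_RE.sub(redact_card, cleaned)
    return cleaned, found


def scrub_rows(rows):
    """Scrub every string field; return cleaned rows plus a log of (row index, field, kind)."""
    cleaned_rows, log = [], []
    for idx, row in enumerate(rows):
        new_row = {}
        for field, value in row.items():
            if isinstance(value, str):
                value, kinds = scrub_value(value)
                log.extend((idx, field, kind) for kind in kinds)
            new_row[field] = value
        cleaned_rows.append(new_row)
    return cleaned_rows, log
```

The log returned by `scrub_rows` records where PII turned up, which is the kind of evidence a fund could pass back to the vendor under the written process described above.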
How can Funds Ensure GDPR Compliance?
When buying data, funds need to obtain robust representations and warranties from data vendors and, in the service agreement, inform the vendor that the fund does not want to receive personal data, including data that can be used to back into the identity of an EU individual. Before receiving any data from a vendor, the fund needs to conduct thorough diligence of its own. This includes reviewing sample data, the data dictionary, and dummy sets of the data, all to determine whether it is possible to receive PII or to back into identifying information on an EU individual using other available data fields. If it is, then the GDPR could be implicated. Diligence also includes asking GDPR-targeted questions about methodology, privacy policies, the vendor’s internal compliance frameworks, whether external counsel has reviewed the vendor’s approach to GDPR, and insurance coverage/indemnity in the event of a data breach.
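One way to make the sample-data review more concrete is to check how many rows share each combination of potentially identifying fields. The Python sketch below is a minimal, assumed approach in the spirit of a k-anonymity check. The function name, the field names, and the threshold `k` are hypothetical choices for illustration. A combination shared by very few rows suggests that those rows could be singled out when joined with other data. Passing such a check does not by itself establish that data is anonymized under the GDPR.

```python
from collections import Counter


def small_groups(rows, quasi_identifiers, k=5):
    """Return field-value combinations that are shared by fewer than k rows.

    rows: list of dicts (for example, a vendor's sample file loaded into memory).
    quasi_identifiers: field names that could help identify someone when combined.
    """
    counts = Counter(
        tuple(row.get(col) for col in quasi_identifiers) for row in rows
    )
    return {combo: n for combo, n in counts.items() if n < k}


# Hypothetical sample data; the field names are assumptions for this example.
sample = [
    {"postcode": "10115", "birth_year": 1980, "spend": 20.0},
    {"postcode": "10115", "birth_year": 1980, "spend": 35.5},
    {"postcode": "10115", "birth_year": 1980, "spend": 12.0},
    {"postcode": "75001", "birth_year": 1975, "spend": 99.9},
]

risky = small_groups(sample, ["postcode", "birth_year"], k=2)
```

Here `risky` contains the single combination that appears only once in the sample, which is the kind of finding to raise with the vendor during diligence.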