by Katrin Nebermann
Avoid accidental identifiers
With etracker Analytics, the IP address is automatically anonymized in the main memory of the data collection server, i.e. at the earliest possible point in time. In addition, identifiers in page URLs are now anonymized automatically, in line with the privacy-by-design principle.
Please note: As such IDs can take very different forms, the automatic heuristic cannot replace individual checks in reporting.
Example of a URL with ID:
https://shop.demoshop.de/index.php/payment/customer_id/38027161-e6f9-304e-b25c-8a24ea780395/target/payment
Result in reporting with automatic anonymization:
https://shop.demoshop.de/index.php/payment/customer_id/.../target/payment
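How such a heuristic could work in principle is illustrated by the following minimal Python sketch. It is not etracker's actual procedure, which is not documented here. The patterns in `ID_PATTERNS` (UUIDs, long hexadecimal strings, long digit sequences) and the function names `looks_like_id` and `anonymize_path` are assumptions chosen for illustration.

```python
import re
from urllib.parse import urlsplit, urlunsplit

# Assumed patterns for "ID-like" path segments (illustrative only).
ID_PATTERNS = [
    # UUID such as 38027161-e6f9-304e-b25c-8a24ea780395
    re.compile(r"^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$", re.I),
    # Long hexadecimal string, e.g. a hash or session token
    re.compile(r"^[0-9a-f]{16,}$", re.I),
    # Long run of digits, e.g. a customer number
    re.compile(r"^\d{6,}$"),
]


def looks_like_id(segment):
    """Return True if a single path segment matches one of the assumed ID patterns."""
    return any(pattern.match(segment) for pattern in ID_PATTERNS)


def anonymize_path(url):
    """Replace every ID-like path segment with '...' and leave the rest of the URL unchanged."""
    parts = urlsplit(url)
    segments = ["..." if looks_like_id(s) else s for s in parts.path.split("/")]
    return urlunsplit(parts._replace(path="/".join(segments)))


example = (
    "https://shop.demoshop.de/index.php/payment/customer_id/"
    "38027161-e6f9-304e-b25c-8a24ea780395/target/payment"
)
print(anonymize_path(example))
```

Applied to the example URL above, the sketch yields the anonymized form shown in the reporting result. Identifiers that do not follow one of the assumed patterns would slip through, which is why individual checks in reporting remain necessary.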
In URL parameters, on the other hand, IDs are only recorded if the corresponding parameters are explicitly included in the recording. In this case, automatic anonymization applies analogously.
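The behavior for URL parameters can be sketched in the same hedged way: only parameters that were explicitly included are recorded at all, and ID-like values among them are replaced. The function name `recorded_parameters`, the pattern `ID_VALUE` and the parameters `sid` and `lang` are hypothetical; the sketch does not reflect how etracker implements this internally.

```python
import re
from urllib.parse import urlsplit, parse_qsl

# Assumed pattern for ID-like values: a UUID or a run of at least six digits.
ID_VALUE = re.compile(
    r"^(?:[0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12}|\d{6,})$", re.I
)


def recorded_parameters(url, included):
    """Return only the explicitly included parameters, with ID-like values replaced by '...'."""
    result = {}
    for name, value in parse_qsl(urlsplit(url).query, keep_blank_values=True):
        if name not in included:
            continue  # parameter was not included in the recording, so it is dropped
        result[name] = "..." if ID_VALUE.match(value) else value
    return result


print(recorded_parameters(
    "https://shop.demoshop.de/cart?oid=12345678&sid=abc&lang=de",
    included={"oid", "lang"},
))
```

In this example `sid` is dropped because it is not included, `oid` is kept as a parameter but loses its value, and `lang` is recorded unchanged.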
Why are identifiers critical?
Recording session or user IDs causes two problems. First, data protection requirements must be taken into account. Second, the unnecessarily increased cardinality makes the data harder to evaluate and slows down the loading of reports.
In addition to the negative effects on the analyses, identifiers can also relate to a person. In this case, there may be a violation of Art. 5 GDPR, which names data minimization as one of the principles for processing personal data: personal data must be “limited to what is necessary in relation to the purposes for which they are processed”.
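The cardinality effect can be made concrete with a small Python sketch. It assumes 1,000 visits to the same payment page, each with a different customer ID in the path. The generated IDs, the pattern `UUID_RE` and the helper `distinct_pages` are purely illustrative.

```python
import re
import uuid

# Pattern for UUIDs anywhere in a URL (illustrative assumption).
UUID_RE = re.compile(r"[0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12}", re.I)


def distinct_pages(urls):
    """Number of distinct page URLs, i.e. the cardinality of the page dimension."""
    return len(set(urls))


# 1,000 hypothetical visits; uuid5 gives deterministic, distinct IDs for the demo.
raw_urls = [
    "https://shop.demoshop.de/index.php/payment/customer_id/"
    f"{uuid.uuid5(uuid.NAMESPACE_URL, f'customer-{i}')}/target/payment"
    for i in range(1000)
]
anonymized_urls = [UUID_RE.sub("...", url) for url in raw_urls]

print(distinct_pages(raw_urls), distinct_pages(anonymized_urls))
```

Without anonymization, the report contains one row per customer for what is logically a single page. With anonymization, all visits collapse into one row.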
According to the judgment of the General Court of the European Union (EGC) of 26.4.2023 (Ref.: T-557/20), the following conditions must be met for IDs to have a possible personal reference:
- In contrast to anonymous data, the person behind the ID can be re-identified by using additional, separately stored information.
- The data recipient(s) have this information for re-identification or have legal means to access such information.
This means that when using Google Analytics, website operators must also consider, among other things, what possibilities Google itself has for re-identification, as Google is not only a processor, but also a data recipient.
What to do if identifiers are explicitly requested?
Two cases must be distinguished here:
- The actual ID is irrelevant in the evaluation. All that matters is whether an ID is present, for example to distinguish between visits with and without login. As the anonymization replaces the parameter value but not the parameter itself, you can simply filter by the corresponding parameter, e.g. oid=…
- The respective ID is relevant for remarketing purposes or other matching. In this case, a separate segment dimension can be used either at visitor or user level. To do this, go to Settings → Account → Data enrichment → Custom dimensions.
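The first case, in which only the presence of the parameter matters, can be illustrated with a short Python sketch that operates on a list of page URLs, for example from an export. In etracker itself this would be done with a filter in reporting; the functions `has_parameter` and `split_by_parameter` and the sample URLs are hypothetical.

```python
from urllib.parse import urlsplit, parse_qs


def has_parameter(url, name):
    """True if the query string contains the parameter, regardless of its (anonymized) value."""
    return name in parse_qs(urlsplit(url).query, keep_blank_values=True)


def split_by_parameter(urls, name):
    """Split URLs into those with and those without the given parameter."""
    with_param = [u for u in urls if has_parameter(u, name)]
    without_param = [u for u in urls if not has_parameter(u, name)]
    return with_param, without_param


urls = [
    "https://shop.demoshop.de/account?oid=...",  # value already anonymized
    "https://shop.demoshop.de/home",
    "https://shop.demoshop.de/cart?lang=de",
]
logged_in, anonymous = split_by_parameter(urls, "oid")
print(len(logged_in), len(anonymous))
```

Because the parameter name survives anonymization, the split still works even though the ID value itself is no longer available.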
The new automation thus makes etracker Analytics more privacy-friendly and at the same time makes analyses easier and faster.