Effective 2018-03-01, we will change the platform's default log format for managed nginx web servers. Access logs will then contain only truncated IP addresses, which makes it impossible to identify individual users from the logs. This change is motivated by recent developments in data protection regulation.
Managed web servers on our hosting platform ship with a default configuration that currently logs full IP addresses to /var/log/nginx/access.log, e.g.:
184.108.40.206 - - [03/Feb/2018:19:05:47 +0100] "GET /channel/ HTTP/1.1" 200 2053 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/6
2001:470:50f1:1:58e:e9b0:7faf:ae6b - - [03/Feb/2018:19:07:31 +0100] "GET /sounds/door.mp3 HTTP/1.1" 304 276 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_3) AppleWebKit/604.5.6 (KHTML, like Gecko) Version/11.0.3 Safari/604.5.6" "-"
From March 1st on, log records will contain only anonymized IP addresses by default:
184.108.40.0 - - [03/Feb/2018:19:05:47 +0100] "GET /channel/ HTTP/1.1" 200 2053 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/6
2001:470:50f1:: - - [03/Feb/2018:19:07:31 +0100] "GET /sounds/door.mp3 HTTP/1.1" 304 276 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_3) AppleWebKit/604.5.6 (KHTML, like Gecko) Version/11.0.3 Safari/604.5.6" "-"
IPv4 addresses will be truncated to 24 bits and IPv6 addresses to 48 bits.
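To illustrate what this truncation means, here is a minimal Python sketch that zeroes all bits beyond the kept prefix. It is not the platform's actual implementation, which lives in the nginx configuration and is not shown here. The function name truncate_ip and the sample IPv4 address 192.0.2.123 (taken from a documentation range) are assumptions made for this example.

```python
import ipaddress

# Prefix lengths from the announcement: keep 24 bits of IPv4, 48 bits of IPv6.
PREFIX_BITS = {4: 24, 6: 48}


def truncate_ip(address: str) -> str:
    """Return the address with every bit beyond the kept prefix set to zero."""
    ip = ipaddress.ip_address(address)
    bits = PREFIX_BITS[ip.version]
    # strict=False allows host bits to be set; they are masked off.
    network = ipaddress.ip_network(f"{ip}/{bits}", strict=False)
    return str(network.network_address)


if __name__ == "__main__":
    # IPv6 address from the sample log record above.
    print(truncate_ip("2001:470:50f1:1:58e:e9b0:7faf:ae6b"))
    # Hypothetical IPv4 address from the 192.0.2.0/24 documentation range.
    print(truncate_ip("192.0.2.123"))
```

All visitors in the same /24 (IPv4) or /48 (IPv6) network map to the same logged value, which is why a single log record no longer points to one specific address.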
The European General Data Protection Regulation (GDPR) requires “privacy by default” designs. Although there is still some debate, recent court decisions seem to converge on the conclusion that IP addresses should be considered personal data. They are thus subject to data protection regulations. This means that storing the IP addresses of arbitrary visitors for no apparent reason is getting harder to justify.
We think that our hosting platform should come with privacy-friendly defaults that comply with these regulations, so we decided to change the default logging format accordingly.
Q & A
Are you anonymizing addresses in nginx as a whole or only in the logs?
Just in the logs. The full IP address is still available, for example, in nginx variables like $remote_addr. You may pass it to upstream HTTP/FCGI services using proxy headers as usual.
I need access logs with full IP addresses. How do I do that?
The old log format is still available under the name nonanonymized. Include a snippet like the following in your nginx configuration:
access_log /var/log/nginx/fullip.log nonanonymized;
Don’t forget to delete the logs if you don’t need them any more!
Does this change break web analytics?
Not necessarily. The truncated IP addresses are still long enough to identify the network of origin and to get meaningful results from GeoIP lookups. We recommend that customers who need detailed statistics run a private instance of Matomo (formerly Piwik) or similar software. This sort of analytics software can be operated in a way that complies with European data protection regulations. Contact our support for assistance.
There are still some old instances of awstats running. Since they seem not to be used much any more, we will turn them off by March 1st.
Cover image by Hans-Peter Gauster