
Everyone against chatcontrol

The chatcontrol regulation (see here for explanation) has been heavily criticized for various reasons. In the last few weeks, it was possible to submit feedback to the European Commission about the proposal. Many citizens and NGOs used this opportunity to express their objections or approval. I downloaded and analyzed all 414 comments from the feedback page. Here's what I found:

Data analysis

The EU provides a mechanism to submit comments on current proposals. Feedback may consist of up to 4000 characters and a file. With each feedback, the EU also collects and provides information about the user type (e.g. whether the user is an EU citizen, an NGO or a business association) as well as their country.

Other information was gathered by myself: I read through each comment and noted whether a comment was in favor of the proposal or against it. I'll publish the details of this process in a future blogpost.

Here are the results:

Sankey Diagram showing how many comments are in favor or against the proposal

As you can see, the overwhelming majority of comments are against the proposal. I also took some more notes about which criticisms specifically were mentioned, but first: What does the pro side have to say?

Positive feedback

There are only 34/414 comments which are in favor of the regulation. Of these, more than two thirds (24/34) are by NGOs for child protection (including Thorn, the company which lobbied for this legislation in the first place). Most of them criticize that voluntary detection will be prohibited.

One feedback by a company (actually a business association?) for cloud infrastructure providers supports the goals, notes no objections and suggests only minor changes to make sure cloud infrastructure providers are not affected by the regulation.

One NGO (IJM) submitted the same statement three times: once from the international organization, once from the German branch, and once from the Netherlands branch.

Six citizens submitted very short statements in favor of the regulation.

Unclear feedback

There are 16/414 comments which are unclear to me:

One comment has been excluded as it was written by a user who posted twice to bypass the 4000 character limit.

Breakdown by Nationality

The feedback data also includes the country of origin of the feedback. Using this, we can break down the positions by nationality:

Bar graph showing how many people are against/in favor of the proposal. Germany submitted much more feedback than other countries, more than 97% against.

For better visibility, I only included countries which submitted more than one feedback. Not shown in the graph are the countries Belarus, Brazil, Cyprus, Ecuador, India, Latvia, New Zealand, Norway, Russia, Switzerland (one submission against each) as well as the Philippines (one submission in favor by the NGO "International Justice Mission" (IJM)).

As you can see, Germany submitted by far the most feedback, with 156 total, making up 38% of all submissions! Only 2 submissions are clearly in favor: one by Stiftung Digitale Chancen and one by IJM Germany. One is unclear (by Weisser Ring, criticizes proposal) and one was the excluded comment.

The country with the next most submissions is France, also completely opposed to this proposal. The one comment marked as unclear criticizes the "danger of the law" and proceeds to propose a solution so much worse that I'm not sure it isn't satirical.

Next up is Belgium, the country with the most (8) comments in favor of the proposal. The organizations from Belgium which submitted feedback in favor of the proposal are:

Unclear are the positions of BSA The Software Alliance and DIGITALEUROPE which suggests so many changes it'd be a different proposal.

For the other countries, it's similar. The only country where there was more positive than negative feedback was the Philippines, where there was only one submission. Here are the remaining organizations in favor, sorted by number of submissions and country name:

In the Netherlands:

USA:

Great Britain:

Austria:

Finland:

Greece:

Italy:

Philippines:

Poland:

I didn't bother to sort the positive feedback by the 6 citizens in favor. Their feedback is rather boring (basically only saying they're in favor): 1, 2, 3, 4, 5, 6.

Negative feedback

The vast majority (363/414) is against the proposal. Almost 90% (322/363) of these have been written by EU citizens. If you wrote one of these - thank you.
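As a quick sanity check, the counts quoted so far can be added up and turned into percentages. This is a minimal sketch, not the script used for this post; the dictionary keys and the `share` helper are names made up for illustration, while the numbers are the ones stated in the text.

```python
# Counts as stated in this post (labels are illustrative names).
TOTAL = 414
counts = {"in favor": 34, "unclear": 16, "excluded": 1, "against": 363}


def share(part, whole):
    """Return part/whole as a percentage, rounded to one decimal place."""
    return round(100 * part / whole, 1)


# The four groups should account for every downloaded comment.
assert sum(counts.values()) == TOTAL

shares = {label: share(n, TOTAL) for label, n in counts.items()}

# Share of negative comments written by EU citizens (322 of 363).
citizen_share = share(322, counts["against"])

for label, pct in shares.items():
    print(f"{label}: {pct}%")
print(f"EU citizens among negative comments: {citizen_share}%")
```

The four groups add up to exactly 414, so no comment is counted twice or left out.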

I've further classified the negative comments to count how often the most common concerns are named.

The most common criticism, raised by almost 95% of commenters, was that the proposal will cause harm to privacy rights. Many commenters said that the proposal amounted to mass surveillance, others were (slightly) concerned that unauthorized parties may inadvertently see their private content. In the statistic, this is counted as "Privacy concerns".

Bar graph displaying how often each criticism was named

Another common concern (~50% of comments) was that the measures were ineffective and/or disproportionate. Many commenters feared that criminals will simply switch to another platform, thus rendering the scanning useless and therefore disproportionate. Interestingly, there were a few commenters who didn't comment on the privacy aspects but just objected on the grounds that the proposal wouldn't help victims.

Commenters (~36%) also raised concerns that the mass scanning will be abused in the future, possibly by the EU (or one of its countries) itself. Why by the EU itself? Because once in place, this scanning system can, from a technical standpoint, easily be repurposed to match any content. It's possible that this system will be expanded to terrorist content or even used for detection of copyright infringement. (Expansion to terrorist content seems very likely; it appears one MEP's secretary quoted an MEP with plans to do so.) Even if not, commenters feared the scanning might set a precedent which inspires other regimes to establish/enforce their own scanning requirements. References to China and Russia are common in these comments.

A (to me) surprisingly small number of comments (~26%) criticize the accuracy issues the mass scanning will have. False positives will have disastrous results in that they'll both harm innocent people and cause police to waste resources.
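To see why false positives matter so much at this scale, here is a small worked example. Every number in it is a hypothetical assumption chosen for illustration (message volume, share of illicit messages, detection rate, false alarm rate); none of them comes from the proposal, the feedback or any real detection system.

```python
def flag_breakdown(total, illicit, hit_rate_pct, false_alarm_pct):
    """Split the flags raised by a scanner into correct and wrong ones.

    All inputs are hypothetical; percentages are given as whole numbers.
    """
    innocent = total - illicit
    true_positives = illicit * hit_rate_pct // 100
    false_positives = innocent * false_alarm_pct // 100
    return true_positives, false_positives


# Assumed scenario: 1,000,000 messages, 100 of them illicit,
# 90% of illicit messages detected, 1% of innocent ones wrongly flagged.
true_pos, false_pos = flag_breakdown(1_000_000, 100, 90, 1)
correct_share = true_pos / (true_pos + false_pos)

print(f"correct flags: {true_pos}, wrong flags: {false_pos}")
print(f"share of flags that are correct: {correct_share:.1%}")
```

Under these assumed numbers, wrong flags outnumber correct ones by more than 100 to 1, because innocent messages vastly outnumber illicit ones. Each of those wrong flags would still need to be reviewed by someone.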

Multiple reasons could be (and often were) presented in each comment. Hardly anyone (<2%) opposed the regulation without naming any reason at all.
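Because one comment can name several criticisms, the percentages above add up to more than 100%. The following sketch shows how such a multi-label tally can be computed. The four toy records and the label names are hypothetical and are not taken from the real data set.

```python
from collections import Counter

# Hypothetical toy records: each negative comment carries a set of labels.
comments = [
    {"privacy", "ineffective"},
    {"privacy", "abuse"},
    {"privacy", "accuracy", "ineffective"},
    {"ineffective"},
]


def criticism_shares(labelled_comments):
    """Return, per label, the percentage of comments that mention it."""
    counts = Counter(
        label for labels in labelled_comments for label in labels
    )
    total = len(labelled_comments)
    return {label: round(100 * n / total, 1) for label, n in counts.items()}


shares = criticism_shares(comments)
for label, pct in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"{label}: {pct}%")
```

Each share is computed against the number of comments, not the number of labels, which is why the shares do not have to sum to 100%.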

Although it's not their job, 56 commenters (15.4%) presented alternatives that may prove more helpful than this mass surveillance proposal.

Concerning duplicates

Some comments are duplicates - this is the case for both comments in favor and comments opposing the regulation. Making it even more difficult, some comments were clearly copy-pasted and slightly modified, either from older comments or from other sources. However, even if completely copy-pasted, the criticism was still written by a citizen who wanted their voice to be heard. This made it difficult for me to classify which comment to include and which to exclude, so I decided to simply include every published comment in the analysis, as they are likely written by different authors. The only comment I didn't include in the criticism statistic above was the comment posted to bypass the 4000 character limit.
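For readers who want to look for copy-pasted comments themselves, one possible approach is to compare texts pairwise with Python's standard `difflib` module. This is only a sketch and not what was done for this post, where every published comment was included. The sample texts and the 0.8 threshold are assumptions made up for illustration.

```python
from difflib import SequenceMatcher
from itertools import combinations


def near_duplicates(texts, threshold=0.8):
    """Return index pairs of texts whose similarity ratio reaches threshold."""
    pairs = []
    for (i, a), (j, b) in combinations(enumerate(texts), 2):
        # ratio() is 1.0 for identical strings and lower for dissimilar ones.
        if SequenceMatcher(None, a, b).ratio() >= threshold:
            pairs.append((i, j))
    return pairs


# Hypothetical sample comments, not taken from the feedback page.
texts = [
    "I oppose this proposal because it amounts to mass surveillance.",
    "I strongly oppose this proposal because it amounts to mass surveillance!",
    "Please protect children by adopting the regulation quickly.",
]

pairs = near_duplicates(texts)
print(pairs)
```

A pairwise comparison like this is quadratic in the number of comments, which is still cheap for a few hundred texts. Flagged pairs would need a manual look, since a high similarity says nothing about whether two different people stand behind the texts.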

Selected interesting points

By reading through the comments, I saw various points I had missed or not even considered in my previous criticism of chatcontrol.

The proposal puts LGBT children at risk of being outed

Once a commenter mentioned it, it seemed obvious: in case a false positive is reported while two LGBT children are chatting, their relationship will be revealed to the police. Depending on the country, this could have pretty bad consequences.

Will voluntary detection remain prohibited?

Both Ylva Johansson and one NGO in favor of the proposal claim that privacy will be improved, as the current proposal prohibits scanning more than necessary. However, most NGOs in favor of the proposal suggest changing the proposal so that voluntary detection will be allowed.

Extensive mandatory detection is bad enough, but forcing mandatory detection on everyone while removing the few safeguards limiting the extent of surveillance is obviously worse. Let's hope this doesn't happen.

Scanning for unknown abuse material allows backdoors

Scanning for known material can be somewhat controlled by removing hashes causing false positives. However, to detect new/previously unknown child sexual abuse material (CSAM), an AI classifier needs to be trained using real CSAM. Given that companies will likely have access to the classifier and (for obvious reasons) not the data behind it, this would likely be done by the newly planned EU center. However, they could insert backdoors which are impossible to detect to deliberately cause false positives for specific images:

We show how a malicious learner can plant an undetectable backdoor into a classifier. On the surface, such a backdoored classifier behaves normally, but in reality, the learner maintains a mechanism for changing the classification of any input, with only a slight perturbation

Since it's impossible for companies to train their own classifier, they'll likely be forced to use this potentially backdoored classifier.
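The point about known material being "somewhat controllable" can be illustrated with a tiny hash list. This is a simplified sketch under stated assumptions: it uses SHA-256 from Python's standard library as a stand-in, whereas real deployments typically rely on perceptual hashes that also match slightly altered images. The `HashList` class and the byte strings are hypothetical.

```python
import hashlib


def digest(data: bytes) -> str:
    """Hash the content (SHA-256 here, as a stand-in for a perceptual hash)."""
    return hashlib.sha256(data).hexdigest()


class HashList:
    """Hypothetical list of hashes of known material."""

    def __init__(self, known_items):
        self._hashes = {digest(item) for item in known_items}

    def matches(self, data: bytes) -> bool:
        return digest(data) in self._hashes

    def remove(self, data: bytes) -> None:
        # Drop an entry that turned out to cause false positives.
        self._hashes.discard(digest(data))


# Placeholder byte strings standing in for files.
hash_list = HashList([b"known item", b"wrongly listed holiday photo"])

flagged_before = hash_list.matches(b"wrongly listed holiday photo")
hash_list.remove(b"wrongly listed holiday photo")
flagged_after = hash_list.matches(b"wrongly listed holiday photo")

print(flagged_before, flagged_after)
```

Every entry in such a list can be inspected and removed individually. A trained classifier has no comparable per-entry view, which is what makes a planted backdoor so hard to spot.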

Manual reporting is surprisingly effective

One commenter worked on a study analysing which measures different services take to prevent abuse and which of these they find useful. The companies studied collectively serve 2 billion users. One category of abuse examined by the study was called "child sexual exploitation" and included grooming and enticement (but not images!). The commenter writes on page 4:

"Overall, user reporting and automated scanning were deemed equally useful for detecting grooming. Even among the subset of survey respondents that said they currently employ automated scanning to detect abuse (a subset that necessarily excludes end-to-end encrypted service providers), just as many said that either metadata analysis or user reporting is most useful against CSE as said automated scanning is."

The study is linked (footnote 7 in the feedback) and can be read here (pdf link).

The statement (pdf!) by the INHOPE Network (an NGO which operates hotlines for reporting CSAM and organizing Notice and Takedown with hosters) confirms that manual reporting is effective. It goes further, saying that reporting is also more effective for new images.

Public reporting is currently the primary source of 'new or previously unidentified CSAM' thus we believe that public reporting of CSAM should be strongly encouraged and supported. Maintaining a channel of reporting and communication from the public to the professional is essential.

Also interesting is the sentence before, which confirms what experts have been saying:

automated detection activities via hash matching as proposed in the proposed Regulation is only adequate to detect 'known or previously identified' CSAM when deployed by online platforms. These systems are not designed to detect 'new or previously unidentified CSAM'. While Artificial Intelligence technologies are improving and show some potential in the identification of 'new' CSAM, they are not yet a proven technology.

In short, manual reporting is effective for both new CSAM and grooming detection, while AI still isn't, even according to INHOPE, one of the biggest proponents of the proposal.

How to handle self-hosted or decentralised services?

There are many issues with scanning communication services which can be operated by different persons. In case a user hosts their own communication service, would they be legally required to run detection on content they upload? Seems unlikely, as the risk of abuse seems low. But what if this communication server is open to third party users? Or if it is part of a decentralised end-to-end encrypted system? As one commenter wrote:

"I would like to see clarified how decentralised communication providers (e.g. Matrix) or open-source communication frameworks where the user has control over the messaging client would function with such regulation: would users be expected to self-report CSAM related content? Would it be legal for users to have modified clients that did not report such infractions?"

The proposal was heavily criticized internally

It is not the first time that the regulation has been heavily criticized. In fact, the regulation was already criticized before it was even published. A document leaked in February 2022, written by the internal "Regulatory Scrutiny Board", notes that

The report is not sufficiently clear on how the options that include the detection of new child sexual abuse material or grooming would respect the prohibition of general monitoring obligations

Thanks to EDRi for sharing this piece of information, I haven't seen it anywhere before.

Microsoft says its own grooming detection is not good enough

I previously wrote that the detection of grooming isn't working as well as the Commission claimed and mentioned that it only supports English. In their statement, Microsoft writes (pdf, page 5, emphasis by me):

The Commission's Impact Assessment of the Regulation refers to a Microsoft technology that can detect child solicitation with 88% accuracy. [...] We recommend against reliance on this figure when discussing EU policy. This figure relates to a single English-language technique trained on a small data set of known instances of solicitation within historic text-based communications, and in all cases merely serves to flag potential solicitation for human review and decision as part of a wider moderation process.

Age controls will be a huge mess

Mandatory age verification will be very complicated, as the means to verify the age vary from country to country. It also means that providers which respect privacy, such as Wikimedia, will be forced to collect data they would otherwise never have needed, increasing the possibility of data breaches.

Depending on the implementation, online anonymity could suffer. So would App Stores and children, as

unless app stores can prove that abuse material will not be exchanged on a particular app, they would be forced to deny access to such apps to under-18s. This could prevent young people from communicating online.

- Feedback by EDRi

Suggested alternatives

There were 56 comments which suggested alternatives to chatcontrol. They mostly fall into these categories:

Interesting feedback

Many different groups opposing chatcontrol submitted their feedback. Here is a selection of those I find especially interesting:

Interesting feedback by groups

Interesting feedback by individuals

Everyone against Chatcontrol

Here's the full list of everyone who submitted feedback opposing chatcontrol:

Of course, this is not everyone who opposes the regulation - some organizations which oppose the proposal have not sent in their feedback. In Germany alone, over 163,000 people have signed a petition opposing chatcontrol, and at least three child protection associations have expressed their opposition. Multiple people referred to a scientific paper opposing client side scanning written by the who's who of cryptographers, as the authors unfortunately didn't send in feedback themselves.

Bonus: Submissions by date

Comments by date

The last two peaks are the Friday and Monday before the submission deadline, where most NGOs in favor submitted their comments. If you've been waiting for this post - now you know why it took so long.

Final Notes

I'll soon write/post a "making of" blog post which will include my notes, the data as well as the scripts I used to download the data and plot the graphics you've seen here. Follow the RSS feed or my fediverse account to know when it's ready. I'll try to publish it within two weeks - at least I don't have to read through hundreds of comments this time.

Edit 2022-10-04: Try was the right word choice, I'm still nowhere near done. I'll at least publish the code & data soon-ish.

Edit 2022-10-09: Code & data available here.

Written on 2022-09-19
Tags: statistics, politics, mass surveillance, chatcontrol