AI Policy

While most of the tech industry is rushing headlong to embed generative AI in virtually everything, many users actively dislike such a pervasive intrusion into their daily lives.

We see generative AI as problematic in several ways:

CPU and/or GPU requirements are enormous, particularly in comparison with the other computational requirements of a system like Euravox; consequently:

Power requirements are incompatible with the measures necessary to combat global warming.

Cost and space requirements are vastly greater than we would otherwise need, which is neither justifiable nor appropriate given our aim of returning a substantial percentage of income to creators.

Current generative AI is overhyped and underdelivers. Its failure rate is far too high for it to be genuinely useful. The transformer architecture underlying current generative LLMs is unlikely to scale in a way that will fix these problems; new algorithms will be needed, and there are no promising leads at this point. It may be decades before significant improvements are seen.

We therefore do not intend to integrate generative LLMs into our platform: doing so would increase our costs unacceptably, whilst annoying many of our users.

AI in Advertising, Trust and Safety

We need to classify posts by subject. This is an unavoidable requirement, and a fundamental precursor for the following:

Discovery.  Euravox can show you either a chronological feed of posts from people you follow, or a discovery feed of content you're likely to be interested in. Text classification is a prerequisite for building the discovery feed.

Advertisement Targeting.  For many of our users, access to the site is free at the point of use, paid for by advertising. It is therefore necessary to allow advertisers to target their adverts at specific interests. Whilst we can infer some of this from context (e.g., a user who posts in fishing groups is probably interested in fishing), it is also necessary to infer interests from the text itself.

Trust and Safety.  We wish to keep harmful content off our platform. There are only two ways this can be achieved: human moderators, and machine learning-based classifiers. Whilst human moderation is and always will be the gold standard, it will never be practical for it to suffice on its own. Firstly, the sheer volume of posts and comments on the site, together with the need to act nearly instantaneously to block harmful content, makes purely human moderation impractical. Secondly, and perhaps more importantly, some kinds of harmful content may be harmful to the human moderators who must view them. A suitably trained machine learning classifier can handle first-line moderation automatically, without human intervention (a sketch of such a pipeline appears below).

All of these needs can be met with machine learning. Text classification is far older than the current crop of LLMs, and can be based on neural networks, statistical techniques, or some combination of the two. Image classification can be achieved with neural networks; it is also common practice to match images against cryptographic hashes in order to detect known harmful images.
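
As a purely illustrative sketch of both techniques (this is not a description of our production system): a statistical topic classifier can be built with off-the-shelf tools, and known harmful images can be detected with nothing more than a hash lookup. The example posts, topic labels, and hash set below are hypothetical placeholders:

    # A minimal sketch of statistical text classification plus hash-based
    # image matching. All data shown is a hypothetical placeholder.
    import hashlib

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.naive_bayes import MultinomialNB
    from sklearn.pipeline import make_pipeline

    # Train a simple topic classifier on (hypothetical) labelled posts.
    posts = [
        "Caught a lovely trout fly fishing this morning",
        "Fishing tips for landing carp in still water",
        "My sourdough starter finally doubled overnight",
        "Best flour for a crispy baguette crust?",
    ]
    topics = ["fishing", "fishing", "baking", "baking"]

    classifier = make_pipeline(TfidfVectorizer(), MultinomialNB())
    classifier.fit(posts, topics)

    # Should print ['fishing'] for this query.
    print(classifier.predict(["Which rod should I buy for river fishing?"]))

    # Detecting *known* harmful images needs no neural network at all:
    # compare a cryptographic hash of the upload against a blocklist.
    KNOWN_HARMFUL_HASHES = {"0" * 64}  # placeholder hex digest

    def is_known_harmful(image_bytes: bytes) -> bool:
        return hashlib.sha256(image_bytes).hexdigest() in KNOWN_HARMFUL_HASHES

No generation is possible here: the classifier can only assign one of the labels it was trained on, and the hash check can only answer yes or no.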

To be absolutely clear: this is not generative AI, and it cannot be used to generate text or images, or to violate our users' copyright.
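
To illustrate how first-line moderation can work without routinely exposing human moderators to harmful material, here is a minimal sketch; the thresholds, the harmfulness_score function, and the outcomes are hypothetical assumptions, not our production pipeline:

    # A sketch of first-line moderation using classifier confidence
    # thresholds. harmfulness_score, both thresholds, and the outcome
    # labels are hypothetical.

    BLOCK_THRESHOLD = 0.95   # confident enough to block automatically
    REVIEW_THRESHOLD = 0.50  # uncertain: escalate to a human moderator

    def moderate(post_text: str, harmfulness_score) -> str:
        """Return 'blocked', 'review', or 'published' for a new post."""
        score = harmfulness_score(post_text)  # probability the post is harmful
        if score >= BLOCK_THRESHOLD:
            return "blocked"    # acted on instantly; no human ever sees it
        if score >= REVIEW_THRESHOLD:
            return "review"     # queued for a human moderator
        return "published"

    # Example with a trivial stand-in scorer:
    print(moderate("hello world", lambda text: 0.02))  # -> published

The design point is that only the uncertain middle band ever reaches a human, which keeps both moderator workload and moderator exposure to harmful content low.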

Machine Learning Classifiers and the Right to Delete Your Data

It is one of our fundamental privacy principles that all of our users have the right to delete their data. We retrain our models on a rolling basis, so user data that has been deleted will cease to affect the training of our classifiers. Since such training lags behind real time because of the amount of computation needed, we undertake to ensure that all classifiers are retrained before the deletion window expires (worst case 90 days, but in practice far less than that).
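
As a sketch of the scheduling constraint this guarantee implies: if models are retrained every R days and a single training run takes T days, data deleted just after a run begins can influence a live model for at most R + T days, so R + T must stay inside the deletion window. Only the 90-day window below comes from this policy; the cadence and run duration are hypothetical numbers:

    # Worst-case retraining-lag bound. Only the 90-day deletion window
    # comes from the policy; the cadence and run time are illustrative.
    DELETION_WINDOW_DAYS = 90

    retrain_every_days = 14   # hypothetical retraining cadence (R)
    training_run_days = 3     # hypothetical duration of one run (T)

    # Data deleted immediately after a snapshot is taken still influences
    # the live model until the *next* run completes.
    worst_case_lag = retrain_every_days + training_run_days

    assert worst_case_lag <= DELETION_WINDOW_DAYS, \
        "retraining cadence violates the deletion guarantee"
    print(f"worst-case lag: {worst_case_lag} days "
          f"(window: {DELETION_WINDOW_DAYS} days)")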

We do reserve the option to retain, for machine learning training purposes, anonymized copies of data that has been verified as harmful, since this serves the greater good of our user community.