It’s been three years since OpenAI shook the digital world with the launch of ChatGPT, and 200 million people now log on to ChatGPT every week. The numbers climb even higher once you include other generative AI solutions such as Gemini and Claude: large language models from Big Tech companies that can generate text and images, translate, and write code.
But while AI assistants are everywhere and come with big promises of productivity, it’s getting hard to tell which has been discussed more: the opportunities AI will create, or the privacy issues that come with it. Think of data exfiltration, unchecked surveillance, and biased profiling.
Let’s look at why and how Big Tech is collecting your data and what best practices you can implement to protect your data when using AI.
Big Tech and AI: The data grab
Any large language model is only as good as the data it’s trained on. That’s why AI models of Big Tech companies, such as Google, Microsoft, and Amazon, are actively using your conversations with their chatbots to improve their models.
A recent Stanford study compared the privacy policies of six Big Tech AI platforms. Their research revealed that:
All of them use your chat input for training purposes by default (except Amazon’s Nova AI agent, for which the policy is unclear).
In half of the cases, conversations are stored on their servers indefinitely: there is no limit on how long they keep your data.
The opt-out methods can be hard to find. In two out of six cases, the mechanisms to opt out of the chat training were even unclear or unspecified.
What’s more, not all companies clearly state that they de-identify your personal information before using it for their training purposes.
And some of the Big Tech platforms allow humans to review your chat transcripts for their model training goals.
These data-collecting mechanisms provide AI companies with a lot of control.
Cloud centralization makes this control easy to build and hard to see. Running independent AI models is expensive, so most AI platforms are hosted in large, centralized cloud environments: remote servers process your inputs and return an answer. While this is efficient, it also concentrates control. The data leaves the local environment, shifting governance from the user to the provider.
In other words, Big Tech is gaining a lot of power by collecting data while also keeping its data processing out of our sight.
Why “free” AI tools don’t exist
A lot of AI tools have seemingly free or cheap versions, while in reality, users pay by giving up their data, as data is the currency of the AI platform. In other words: If it’s free, you are the product.
The AI privacy concerns that come with Big Tech data collection
You might wonder, what are the issues with these AI data collection and processing mechanisms, as long as they provide you with a smart and efficient business tool?
First of all, there is the issue of unchecked surveillance and bias.
Data collection has been growing for years, and your AI model is only as good as the data that supports it. This can lead to unfair profiling and inequality, especially if the training data is biased.
A couple of real-life AI bias examples include:
An English tutoring company used AI recruiting software that automatically rejected female applicants over 55 and male applicants over 60, leading to a $356,000 settlement for age discrimination.
A Brookings Institution study showed AI credit scoring systems can reproduce racial disparities. That’s because historical credit data correlates strongly with race. When an AI model uses this data for its credit scoring, it can lead to unequal loan approvals and interest rates.
In 2016, Microsoft’s chatbot Tay began posting racist and antisemitic messages within 24 hours of launch after learning toxic behavior from user interactions on Twitter.
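The credit-scoring example above can be reproduced in miniature. The toy script below is a hypothetical illustration (not any real scoring system): the decision function never sees group membership, only a "neighborhood score" proxy, yet because the proxy correlates with the group, approval rates still diverge.

```python
# Toy illustration: a proxy feature can reproduce a group disparity
# even when the protected attribute is never used directly.

# Hypothetical historical data: neighborhood scores correlate
# strongly with group membership (all names and numbers invented).
applicants = [
    # (group, neighborhood_score)
    ("A", 0.90), ("A", 0.80), ("A", 0.85), ("A", 0.70),
    ("B", 0.40), ("B", 0.35), ("B", 0.50), ("B", 0.30),
]

THRESHOLD = 0.6  # the "credit model" only ever looks at the proxy

def approve(neighborhood_score: float) -> bool:
    """Decision rule: group membership is never an input."""
    return neighborhood_score >= THRESHOLD

def approval_rate(group: str) -> float:
    scores = [s for g, s in applicants if g == group]
    return sum(approve(s) for s in scores) / len(scores)

print(f"Group A approval rate: {approval_rate('A'):.0%}")  # 100%
print(f"Group B approval rate: {approval_rate('B'):.0%}")  # 0%
# The outcomes diverge because the proxy encodes the group.
```

The same effect appears in real systems whenever training data carries historical correlations the model has no way to know are unfair.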
Then there is the issue of data theft and leakage. AI systems can become attractive targets for hackers because they process large amounts of sensitive information in one place.
They can also leak data by surfacing parts of previous conversations or other sensitive information in generated responses, or conflate information and produce false claims.
That happened, for example, when a journalist was falsely described as “a 54-year-old child molester” by Microsoft’s AI tool Copilot. Because he had been reporting on criminal court cases about child abuse, the AI bot wrongly turned him into a criminal.
That’s just the tip of the iceberg of privacy concerns. Other issues surrounding AI include the collection of data without consent, training on children’s data, and the use of data without permission.
What about the EU’s AI Act to protect us from Big Tech’s abuses of power?
At this point, you might be wondering: Isn’t regulation supposed to address these issues?
The EU’s AI Act was designed exactly for that purpose: to set guardrails around how AI systems are developed and used. Once fully applicable in August 2026, it will introduce rules aimed at improving transparency, accountability, and risk management for AI providers.
But the tech industry, especially the Big Tech players, didn’t just stand by. They argued that parts of the AI Act were too complex and burdensome for innovation, and they have been actively pushing for change.
(Digital industry lobbying in the EU has grown significantly from about €113 million annual spending in 2023 to roughly €151 million today, an increase of more than 30% in just two years.)
And their efforts paid off, as the EU introduced a new proposal, the “Digital Omnibus”, which aims to simplify compliance and reduce regulatory overhead under the AI Act in certain areas.
The debate on AI now continues.
Supporters say the changes could make it easier for companies to adopt AI while maintaining safeguards. Critics, however, warn that this new proposal could weaken some of the protections originally intended by the AI Act and other digital regulations.
Best practices: Transparency, limited data collection, and control
For most organizations, the question is no longer whether AI is used, but under which conditions. In most cases today, using AI means sending data to external services.
And as this blog post shows, for organizations working with sensitive data, this can become a problem with legal, technical, and ethical consequences.
Does that mean you shouldn’t use AI at all?
Not necessarily. The real issue is that if you don’t provide AI people can trust, they will likely turn to their favorite consumer services like ChatGPT for work, and that means losing control of your sensitive data. A trustworthy alternative should offer:
Minimal data collection, so your organization’s data stays safe
Full control over how your data is being processed
High performance and intuitive workflows, so employees don’t feel the need to fall back on external AI services for their daily work
An AI tool like the Nextcloud Assistant ticks all of these boxes, making sure you can safeguard your online data without having to give up on the efficiency and comfort of an AI solution.
How Nextcloud approaches AI: Ethical AI and other pointers
Nextcloud’s core principle for AI is that it should never be tied to any particular provider. In other words, administrators can choose between different providers, including self-hosted options.
With our ethical AI rating, they can get specific guidance in this choice, based on a four-level, color-coded rating scale.
Apart from that foundation, we have specific tools for both admins and users to make the most of this privacy-first AI solution, ensuring governance, compliance, and data protection.
For admins
• We aim to ensure that every AI function in Nextcloud has at least one fully green-rated option, that is, an option that is fully open source in terms of model, training data, and running code.
• We aim to give granular control over the various AI features so that different models/solutions can be employed for different functions.
• By default, all AI features are disabled. You can run the AI fully on-premises without sharing any data with providers, and our Ethical AI rating helps understand how data sharing works in provider-hosted models.
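In practice, opting in happens through the server’s `occ` command-line tool. A minimal sketch, assuming a standard server install and the Assistant app id `assistant`:

```shell
# AI features in Nextcloud ship disabled by default; an administrator
# opts in explicitly, app by app.
php occ app:list                  # review which apps are currently enabled
php occ app:enable assistant      # opt in to the Assistant
php occ app:disable assistant     # opting out again is just as explicit
```

Which backend the Assistant then talks to (fully on-premises or provider-hosted) is a separate, equally explicit configuration choice.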
For users
• You’ll find clear indicators showing when AI is being used and how data is processed.
• Features are opt-in whenever possible, or easily disabled. AI is also only introduced where it adds real value, avoiding unnecessary automation.
• The AI functionality is available either from the Nextcloud Assistant’s dedicated interface or through deep integrations in the various applications, such as AI-generated subtitles during a video call.
What’s new for the Nextcloud Assistant in Nextcloud Hub 26 Winter?
As AI in Nextcloud is designed to be sovereign, organizations decide where their models run, which models are used, and what happens to their data. By doing so, your organization can benefit from AI-supported collaboration without giving up responsibility for its data and stay in control.
Our latest release of Nextcloud Hub 26 Winter continues to build on this foundation.
AI is still optional and configurable, instead of a mandatory layer imposed on all users or workflows. This allows organizations to adopt AI at their own pace, align it with internal policies, and decide which use cases make sense in their environment.
When AI is enabled, it becomes part of the collaboration environment instead of an external dependency. It integrates into existing workflows without breaking governance or compliance frameworks.
What can the Nextcloud Assistant do for you in practice?
Improve and generate texts, media, and documents
Answer questions based on organizational data
Summarize meetings and conversations in Nextcloud Talk
Provide live transcription and translation for multilingual collaboration
Integrate AI capabilities directly into email, chat, meetings, and file workflows
Nextcloud Hub 26 Winter also makes compliance easier: You can generate images and documents in various apps and automatically label content with watermarks. This ensures your organization is in line with the latest regulations, such as the AI Act in the EU.
The Nextcloud Assistant adds a watermark to an AI-generated image.
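Labeling obligations like these usually boil down to attaching a machine-readable provenance record to generated content. The sketch below is a generic illustration of the idea in Python (not Nextcloud’s actual implementation; the file names and label fields are invented): it writes a sidecar JSON label next to a generated file and ties it to the file’s exact bytes with a hash, so the “AI-generated” claim can be verified later.

```python
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path

def label_generated_file(path: Path, model_name: str) -> Path:
    """Write a sidecar JSON label marking a file as AI-generated.

    The SHA-256 digest binds the label to the exact file contents,
    so the claim is verifiable after the fact.
    """
    label = {
        "ai_generated": True,
        "model": model_name,
        "sha256": hashlib.sha256(path.read_bytes()).hexdigest(),
        "labeled_at": datetime.now(timezone.utc).isoformat(),
    }
    sidecar = path.parent / (path.name + ".ai-label.json")
    sidecar.write_text(json.dumps(label, indent=2))
    return sidecar

# Example: label a freshly "generated" image (stand-in bytes)
out = Path("picture.png")
out.write_bytes(b"\x89PNG...")
print(label_generated_file(out, "example-model"))
```

Visible watermarks, as in the screenshot above, complement this kind of metadata: the label survives automated checks, while the watermark is obvious to human viewers.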
In short: privacy-first AI solutions such as the Nextcloud Assistant give organizations the efficiency and convenience of AI, while keeping governance, compliance, and data ownership exactly where they belong: under their control.
Regain your digital autonomy with Nextcloud Hub 26 Winter
Our latest release of Nextcloud Hub 26 Winter is here! Discover the latest Nextcloud features.