It’s been three years since OpenAI shook the digital world with the launch of ChatGPT, and some 200 million people now log on to ChatGPT every week. The numbers climb even higher if you include other generative AI solutions such as Gemini or Claude: large language models from Big Tech companies that can generate text and images, translate, or write code.
But while AI assistants are everywhere and come with big promises of productivity, it’s getting hard to tell what has been discussed more: the opportunities AI will create, or the privacy issues that come with it. Think of data exfiltration, unchecked surveillance, and biased profiling.
Let’s look at why and how Big Tech is collecting your data and what best practices you can implement to protect your data when using AI.
Big Tech and AI: The data grab
Any large language model is only as good as the data it’s trained on. That’s why Big Tech companies such as Google, Microsoft, and Amazon actively use your conversations with their chatbots to improve their models.
A recent Stanford study compared the privacy policies of six Big Tech AI platforms. Their research revealed that:
All of them use your chat input for training purposes by default (except for Amazon’s Nova AI agent, whose policy is unclear on this point).
In half of the cases, these conversations are stored on their servers indefinitely; there is no limit on how long your data is kept.
The opt-out methods can be hard to find. In two out of six cases, the mechanisms to opt out of the chat training were even unclear or unspecified.
What’s more, not all companies clearly state that they de-identify your personal information before using it for their training purposes.
And some of the Big Tech platforms allow humans to review your chat transcripts for their model training goals.
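To see why stripping names alone is weak “de-identification”, here is a minimal, self-contained sketch with entirely made-up data: quasi-identifiers (ZIP code, birth date, sex) that survive the scrubbing can often be joined against a public dataset, such as a voter roll, to recover identities. This is an illustrative toy, not any vendor’s actual pipeline.

```python
# Toy sketch (made-up data): why removing names is not real de-identification.
# Quasi-identifiers left in the record can be joined against public data.

# "De-identified" records: the names were removed before training.
deidentified = [
    {"zip": "02138", "birth": "1965-07-28", "sex": "F", "chat": "asked about a diagnosis"},
    {"zip": "90210", "birth": "1990-01-01", "sex": "M", "chat": "asked about a mortgage"},
]

# A public, identified dataset with the same quasi-identifier fields.
public_roll = [
    {"name": "Jane Doe", "zip": "02138", "birth": "1965-07-28", "sex": "F"},
    {"name": "John Roe", "zip": "90210", "birth": "1990-01-01", "sex": "M"},
]

def reidentify(record, roll):
    """Link a 'de-identified' record back to a name via quasi-identifiers."""
    matches = [p["name"] for p in roll
               if (p["zip"], p["birth"], p["sex"]) ==
                  (record["zip"], record["birth"], record["sex"])]
    # A unique match means the record is effectively re-identified.
    return matches[0] if len(matches) == 1 else None

for rec in deidentified:
    print(reidentify(rec, public_roll), "->", rec["chat"])
```

With only three quasi-identifiers, both “anonymous” chat records link back to a unique name, which is why de-identification claims deserve scrutiny.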
These data-collecting mechanisms provide AI companies with a lot of control.
Cloud centralization makes this control easy to build and hard to see into. Running independent AI models is expensive, so most AI platforms are hosted in large, centralized cloud environments: remote servers process the inputs and return an answer. While this is efficient, it also concentrates control. The data leaves the local environment, shifting governance from the user to the provider.
In other words, Big Tech is gaining a lot of power by collecting data while also keeping its data processing out of our sight.
Why “free” AI tools don’t exist
A lot of AI tools have seemingly free or cheap tiers, but in reality users pay by giving up their data, because data is the currency of the AI platform. In other words: if it’s free, you are the product.
The AI privacy concerns that come with Big Tech data collection
You might wonder: what’s wrong with these AI data collection and processing mechanisms, as long as they give you a smart and efficient business tool?
First of all, there is the issue of unchecked surveillance and bias.
Data collection has been growing for years, and an AI model is only as good as the data that supports it. If the training data is biased, the model can reproduce that bias, leading to unfair profiling and inequality.
A couple of real-life AI bias examples include:
An English tutoring company used AI recruiting software that automatically rejected female applicants over 55 and male applicants over 60, leading to a $356,000 settlement for age discrimination.
A Brookings Institution study showed AI credit scoring systems can reproduce racial disparities. That’s because historical credit data correlates strongly with race. When an AI model uses this data for its credit scoring, it can lead to unequal loan approvals and interest rates.
In 2016, Microsoft’s chatbot Tay began posting racist and antisemitic messages within 24 hours of launch after learning toxic behavior from user interactions on Twitter.
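The credit-scoring case above can be illustrated with a deliberately simplified sketch (all data synthetic): even when the protected attribute is excluded from the model, a proxy feature such as ZIP code carries the historical disparity straight into the model’s decisions.

```python
# Toy sketch (synthetic data): how a model trained on historically biased
# outcomes reproduces group disparities through a proxy feature, even when
# the protected attribute itself is never given to the model.

# Historical records: (zip_code, repaid). In this synthetic history,
# zip "A" was systematically under-served, so fewer successful
# repayments were ever observed there.
history = [("A", 0)] * 70 + [("A", 1)] * 30 + [("B", 0)] * 20 + [("B", 1)] * 80

def train(records):
    """'Train' a naive scorer: score = observed repayment rate per zip."""
    score = {}
    for zip_code in {z for z, _ in records}:
        outcomes = [repaid for z, repaid in records if z == zip_code]
        score[zip_code] = sum(outcomes) / len(outcomes)
    return score

model = train(history)
# Approve loans where the learned score clears a 0.5 threshold.
approve = {z: s >= 0.5 for z, s in model.items()}
print(approve)  # applicants from zip "A" are rejected wholesale
```

The model never sees race or any other protected attribute, yet the decision boundary cleanly splits along the proxy, which is exactly the dynamic the Brookings study describes.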
Then there is the issue of data theft and leakage. AI systems can become attractive targets for hackers because they process large amounts of sensitive information in one place.
They can also leak data by reproducing parts of previous conversations or other sensitive information in their responses. Or the AI system conflates information and produces false claims about real people.
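How memorization turns into leakage can be shown with a toy, self-contained sketch (synthetic text, no real model): a tiny character-level lookup model trained on a single sentence will happily regurgitate a “secret” from its training data when prompted with a matching prefix. Real LLMs are vastly more complex, but the failure mode is analogous.

```python
# Toy sketch: a tiny character-level model that memorizes its training
# text verbatim, illustrating how language models can leak training
# data when prompted with the right prefix.
from collections import defaultdict

training_text = "the API key is SECRET-12345 and must never be shared"

# Build a next-character table keyed on 8-character contexts.
ORDER = 8
table = defaultdict(list)
for i in range(len(training_text) - ORDER):
    table[training_text[i:i + ORDER]].append(training_text[i + ORDER])

def generate(prompt, length=40):
    """Continue the prompt greedily from the memorized table."""
    out = prompt
    for _ in range(length):
        candidates = table.get(out[-ORDER:])
        if not candidates:
            break
        out += candidates[0]  # deterministic: the memorized continuation
    return out

# An innocuous-looking prompt regurgitates the memorized secret.
print(generate("the API "))
```

Because the model has only ever seen one continuation for each context, any prompt that overlaps the training text reconstructs the rest of it verbatim, secret included.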
That happened, for example, when a journalist was falsely described as “a 54-year-old child molester” by Microsoft’s AI tool Copilot. Because he had reported on criminal court cases involving child abuse, the AI wrongly cast the reporter as the criminal.
That’s just the tip of the iceberg of privacy concerns. Other issues surrounding AI include the collection of data without consent, training on children’s data, and the use of data without permission.
Will the EU’s AI Act protect us from Big Tech’s abuses of power?
At this point, you might be wondering: Isn’t regulation supposed to address these issues?
The EU’s AI Act was designed for exactly that purpose: to set guardrails around how AI systems are developed and used. Once fully applicable in August 2026, it will introduce rules aimed at improving transparency, accountability, and risk management for AI providers.
But the tech industry, especially the Big Tech players, didn’t just stand by and let this happen. They argued that parts of the AI Act were too complex and burdensome for innovation, and they have been actively pushing for change.
(Digital industry lobbying in the EU has grown significantly from about €113 million annual spending in 2023 to roughly €151 million today, an increase of more than 30% in just two years.)
And their efforts paid off: the EU introduced a new proposal, the “Digital Omnibus”, which aims to simplify compliance and reduce regulatory overhead under the AI Act in certain areas.
The debate on AI now continues.
Supporters say the changes could make it easier for companies to adopt AI while maintaining safeguards. Critics, however, warn that this new proposal could weaken some of the protections originally intended by the AI Act and other digital regulations.
Best practices: Transparency, limited data collection, and control
For most organizations, the question is no longer whether AI is used, but under which conditions. In most cases today, using AI means sending data to external services.
And as this blog post shows, for organizations working with sensitive data, this can become a problem with legal, technical, and ethical consequences.
Does that mean you shouldn’t use AI at all?
Not necessarily. But if you don’t provide AI your people can trust, they will likely start using their favorite consumer services, like ChatGPT, for work, and that means losing control of your sensitive data. What you need instead is an AI solution that offers:
• Minimal data collection, so your organization’s data stays safe
• Full control over how your data is processed
• High performance and intuitive workflows, so employees use trusted internal tools instead of turning to external AI services for their daily work
An AI tool like the Nextcloud Assistant ticks all of these boxes, making sure you can safeguard your online data without having to give up on the efficiency and comfort of an AI solution.
How Nextcloud approaches AI: Ethical AI and other pointers
Nextcloud’s core principle for AI is that it should never be locked to any particular provider. In other words, administrators can choose between different providers, including self-hosted options.
With our Ethical AI rating, they get specific guidance in making this choice, based on a four-level, color-coded rating scale.
Apart from that foundation, we have specific tools for both admins and users to make the most of this privacy-first AI solution, ensuring governance, compliance, and data protection.
For admins
• We aim to ensure that for each AI function in Nextcloud, there exists at least one fully green-rated AI option, that is, an option that is fully open source in terms of model, training data & running code.
• We aim to give granular control over the various AI features so that different models/solutions can be employed for different functions.
• By default, all AI features are disabled. You can run the AI fully on-premises without sharing any data with providers, and our Ethical AI rating helps understand how data sharing works in provider-hosted models.
For users
• You’ll find clear indicators showing when AI is being used and how data is processed.
• Features are opt-in whenever possible, or easily disabled. AI is also only introduced where it adds real value, avoiding unnecessary automation.
• The AI functionality is available either from the Nextcloud Assistant’s dedicated interface or through deep integrations in the various applications, such as AI-generated subtitles during a video call.
What’s new for the Nextcloud Assistant in Nextcloud Hub 26 Winter?
As AI in Nextcloud is designed to be sovereign, organizations decide where their models run, which models are used, and what happens to their data. This way, your organization can benefit from AI-supported collaboration while staying in control of, and responsible for, its data.
Our latest release of Nextcloud Hub 26 Winter continues to build on this foundation.
AI is still optional and configurable, instead of a mandatory layer imposed on all users or workflows. This allows organizations to adopt AI at their own pace, align it with internal policies, and decide which use cases make sense in their environment.
When AI is enabled, it becomes part of the collaboration environment instead of an external dependency. It integrates into existing workflows without breaking governance or compliance frameworks.
What can the Nextcloud Assistant do for you in practice?
• Improve and generate texts, media, and documents
• Answer questions based on organizational data
• Summarize meetings and conversations in Nextcloud Talk
• Provide live transcription and translation for multilingual collaboration
• Integrate AI capabilities directly into email, chat, meetings, and file workflows
Nextcloud Hub 26 Winter also makes compliance easier: You can generate images and documents in various apps and automatically label content with watermarks. This ensures your organization is in line with the latest regulations, such as the AI Act in the EU.
The Nextcloud Assistant adds a watermark to an AI-generated image.
In short: privacy-first AI solutions such as the Nextcloud Assistant give organizations the efficiency and convenience of AI, while keeping governance, compliance, and data ownership exactly where they belong: under their control.
Regain your digital autonomy with Nextcloud Hub 26 Winter
Our latest release of Nextcloud Hub 26 Winter is here! Discover the latest Nextcloud features.