After DuckDuckGo was busted over a deal with Microsoft, I started using other privacy-focused search engines, and Qwant was one of my favorites. But I just don’t trust these companies, as they can too easily be taken over through share purchases and board seats… That’s when I started running my own privacy proxy search engines (SearxNG and Whoogle), which strip tracking elements and conceal your IP address from the search engines; with SearxNG you also control which search engines it queries on your behalf. I’m not that keen on the AI angle being employed, as that will just be a fancy way of censoring content and feeding you propaganda. But the paragraph below is interesting for what it says about big tech companies sharing click and query data with competitors.
“The opportunity just has gotten a lot better,” he said. “With the [EU’s] Digital Markets Act, for the first time, ‘click and query data,’ for example, is going to be shared by other search engines — so we have access to that. Also access to platforms is different than it used to be. So we’ve been thinking about this for a long time, but now it’s the right moment to actually do it.”
By Natasha Lomas & Ivan Mehta
Qwant, France’s privacy-focused search engine, and Ecosia, a Berlin-based not-for-profit search engine that uses ad revenue to fund tree planting and other climate-focused initiatives, are joining forces on a joint venture to develop their own European search index.
The pair hopes this move will help drive innovation in their respective search engines — including and especially around generative AI — as well as reducing dependence on search indexes provided by tech giants Microsoft (Bing) and Google. Both currently rely on Bing’s search APIs while Ecosia also uses Google’s search results.
Rising API costs are one clear motivator for the move to shrink this Big Tech dependency, with Microsoft massively hiking prices for Bing’s search APIs last year.
Neither Ecosia nor Qwant will stop using Bing or Google altogether. However, they aim to diversify the core tech supporting their services with their own index. It will lower their operational costs, and serve as a technical base to fuel their own product development as GenAI technologies take up a more central role in many consumer-facing digital services.
Both search engines have already dabbled in integrating GenAI features. Expect more on this front, although they aren’t planning to develop AI models themselves. They say they will continue to rely on API access to major platforms’ large language models (LLMs) to power these additions.
The pair is also open to other European firms joining in with their push for more tech stack sovereignty — at least as fellow customers for the search index, as they plan to license access via an API. Other forms of partnership could be considered too, they told TechCrunch.
“The door is open and we are ready to talk to anyone,” said Qwant CEO Olivier Abecassis. “But we also want to focus and really secure the capacity to invest with our existing shareholders.”
“We know that we will fuel the company for the next years, and we know that our shareholders are ready to support it, and really expect us to move fast,” he added. “We will discuss with investors to speed up the developments and to do more — and with others to join in the partnership. So the plan is really to move as fast as possible.”
AI generating risks and opportunities
AI is driving a dual sense of urgency for both parties, as it rapidly creates a landscape of new opportunities and potential pitfalls.
“With the emergence of AI tools there is a different demand now for a search index,” Ecosia CEO Christian Kroll suggested. “The two providers, Bing and Google, are basically getting more reluctant to make their index accessible. And of course, as a search engine, we need an index. So that’s partially why we want to make sure we have access.”
“But also there is now a unique moment where you can use that type of index to build a very different experience — using generative AI to create a different experience — and we don’t want to be restricted in using that technology.”
Kroll also pointed to a regulatory environment in Europe that is keen to foster homegrown tech innovation in order to bolster the bloc’s strategic autonomy as another reason for making a bet on a homebrew search index now.
“The opportunity just has gotten a lot better,” he said. “With the [EU’s] Digital Markets Act, for the first time, ‘click and query data,’ for example, is going to be shared by other search engines — so we have access to that. Also access to platforms is different than it used to be. So we’ve been thinking about this for a long time, but now it’s the right moment to actually do it.”
“We believe that if we want to get a meaningful GenAI user experience, we need access to LLM models,” added Abecassis. “But we also need access to search tech.”
The combination of GenAI models with up-to-date information pulled in through search queries will be key to advancing search product utility, he argued.
“We believe that the combination of the two will be the next user experience for search,” he said. “Search and GenAI are not exactly the same. We believe that both will take benefit from the other, and the mix will be unique.”
“Google decided to have two strong products, but not mix them. And I can understand when I look at the legacy business model of Google. But in the future, something will happen between [these technologies] and that’s what we want to experience. And for that, any player on the market will need access to a search technology. That’s why we want to propose [this] to the market.”
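The pairing Abecassis describes, a language model fed with up-to-date search results, is commonly done by placing retrieved snippets into the model's prompt at query time. The sketch below only illustrates that general idea and is not a description of what Qwant, Ecosia or EUP actually build. The function name `build_prompt`, the `title` and `snippet` fields and the prompt wording are all hypothetical, and no real search or LLM API is called.

```python
def build_prompt(question, results, max_results=3):
    """Assemble a prompt that puts fresh search results ahead of the question.

    `results` is assumed to be a list of dicts with hypothetical
    'title' and 'snippet' keys, as a search index API might return.
    """
    lines = ["Answer using only the sources below.", ""]
    # Number each source so the model's answer can cite it.
    for i, result in enumerate(results[:max_results], start=1):
        lines.append(f"[{i}] {result['title']}: {result['snippet']}")
    lines += ["", f"Question: {question}"]
    return "\n".join(lines)
```

The resulting string would then be sent to whichever LLM API the search engine licenses. The point of the sketch is that the index supplies the current facts while the model supplies the wording.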
Towards a European perspective
The pair’s new joint venture, which is being called European Search Perspective, is being set up with a 50:50 ownership split. (Note: EUP is their chosen acronym, rather than ESP.)
Ecosia and Qwant are not disclosing how much they’re each investing but said their shareholders are supportive. Plus, as a separate entity, EUP will sit outside the former’s not-for-profit business model — allowing it to raise external capital (assuming investors can be persuaded to get on board).
The index is expected to start serving France-based search engine traffic for Ecosia and Qwant by the first quarter of next year. It will then expand to include a “significant portion” of traffic in Germany by the end of 2025.
English would be the third language they’d look to add, the pair said, adding that more European languages could follow in the future if momentum builds.
On the operational side, Qwant’s engineering team will be moving to EUP, while Abecassis — who took on the CEO role at the search engine just over a year ago — will be CEO of the joint venture, too.
Qwant was acquired by a cloud technology group called Synfonium last year, which is backed by the founders of French cloud computing firm OVHcloud, with a goal of building a “European champion” for cloud services.
Discussing the plan for EUP in a call with TechCrunch, Abecassis explained Qwant had been working on developing its own search index even before it was acquired by Synfonium. Those efforts will now move over to EUP, he confirmed, with both team and IP assets transferring over.
Joining forces with Ecosia bolsters the chance of success, he suggested, as it expands the pool of data available for developing the index, as well as increasing investment in the project and enabling faster development, such as by being able to hire more engineers.
Ecosia has around 20 million monthly users globally, while Qwant has some 6 million users in France.
“If we want to be really efficient, we have to involve more people… and be more ambitious,” said Abecassis, recounting how Qwant approached Ecosia to ask it to consider a partnership on developing the search index.
“For Qwant, it’s a major opportunity to build better tech — because search technologies are good if they are used… So the more tech is used, the more money you can invest, but also the more data you get. One of the reasons why Google is so strong is it’s based on tons of data.”
The two firms share a few characteristics that make a partnership look like a good cultural fit, with both search alternatives being developed in Europe and having business models that seek to do something different compared to Big Tech’s standard surveillance capitalism playbook. EUP, meanwhile, will be headquartered in Paris.
“Building such a technology from scratch is almost impossible,” added Abecassis. “The more user[s] we have and the more data sets we have will make the technology more valuable.”
Kroll said Ecosia is bringing expertise, data and financing to the partnership — noting that as well as developing the search engine there will be other technologies that EUP will need to develop, such as widgets that can be served as part of search results.
The pair expects the partnership to boost the efficiency of the search results they can deliver to their respective users as EUP hones its ranking algorithms, even as each search engine continues to develop its own distinctive user experience.
Search ranking alternatives
Rival search engine Brave, which much like Qwant has a sales pitch that foregrounds privacy, has already built its own search index. It even removed the last API calls for text-based searches to Bing in April last year when it touted its service as a “real alternative to Big Tech search.”
Asked about this, Abecassis suggested Brave’s index cleaves closer to Google and Bing in its technical approach. By contrast, he emphasized that EUP is being built from scratch, claiming it will be “very different” and will deliver more diverse search results.
“We don’t just copy Google or Microsoft and learn from them,” he stressed. “We really index all the documents that are available. We understand the documents, and then we have a team that works to find the best match between a document and the [search] query.”
“So it is true that there are probably some shortcuts to build such a tech by copying the main guys. We decided to go in a different direction and build everything from scratch. It’s harder but, we believe, it’s more sustainable.”
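To make the "index the documents, then match them to the query" idea concrete, here is a toy sketch of an inverted index with a simple TF-IDF score. It is not EUP's method, which the article does not describe in detail. The sample documents, function names and scoring formula are all assumptions chosen for illustration, and a production index would be far more elaborate.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical sample documents, keyed by a made-up document ID.
DOCS = {
    "a": "Tree planting funded by search ads",
    "b": "Privacy focused search engine from France",
    "c": "Cloud computing services in Europe",
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Map each term to {doc_id: term count}, i.e. an inverted index."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for term, count in Counter(tokenize(text)).items():
            index[term][doc_id] = count
    return index


def search(index, num_docs, query):
    """Rank documents by a simple TF-IDF sum over the query terms.

    Only the query and the index are consulted, so the same query
    yields the same ranking regardless of who is asking.
    """
    scores = defaultdict(float)
    for term in set(tokenize(query)):
        postings = index.get(term, {})
        if not postings:
            continue
        # Terms that appear in fewer documents get a higher weight.
        idf = math.log(1 + num_docs / len(postings))
        for doc_id, tf in postings.items():
            scores[doc_id] += tf * idf
    # Highest score first; ties broken by document ID for stability.
    return sorted(scores, key=lambda d: (-scores[d], d))


INDEX = build_index(DOCS)
```

Calling `search(INDEX, len(DOCS), "privacy search")` ranks document `b` ahead of `a`, because `b` matches both query terms while `a` matches only one.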
One big difference compared to Big Tech search is that EUP’s search index will serve up “privacy-first” results. What does that mean in practice? Abecassis said this is a result of tech developed by Qwant that does not personalize search results based on the user (as Google does).
“We’re going to continue to work without any [user] data [personalizing results],” he said. “Then we will improve our algorithm based on the data that are available.”
“I think it’s a big win — big privacy win,” added Kroll of the choice of technical approach. But he also emphasized the strategic value of having search infrastructure that’s made in Europe at a time of increasing geopolitical instability.
“From a European perspective… what does [search infrastructure reliance] mean for the dependency of the European Union? Especially considering [the U.S.] election results… If the U.S. government decided that they would not want to provide search results to Europeans anymore, we in Europe would have to go back to phone books.”
“There is a privacy element, but then there’s also an element of data sovereignty, which I think is very important,” he added. “I of course hope that the U.S. and Europe will always stay strong allies. But I don’t know where the U.S. is heading, and I also don’t know where Europe is heading. So this is a very important element.”
A costly business?
TechCrunch asked Brave about its own decision to build a search index. It told us that prior to switching to its own tech it “always risked Microsoft imposing restrictions on us or simply cutting us off” — so the move was intended to free the business from a risky dependency.
“According to our quality assessment team, which does blinded assessments for quality of results, we are on par with Google and better than Bing in the countries we measure (those in which Brave Search is the default for Brave browser users),” the company also told us, adding that Brave Search is “the fastest growing search engine since Bing” with over 1 billion queries per month.
Discussing the costs of developing the index, Brave described the process as “long and very expensive” — pointing back to its 2021 acquisition of the open source Tailcat search engine, a technology whose development it said dates back to 2014.
“There is a reason why there are only three fully-fledged independent search indexes in the West,” Brave added.
The company licenses its search index via the Brave Search API. The API is being used by “many leading companies in the AI space,” per Brave, which added that it’s quickly becoming a “significant” source of revenue.
TechCrunch also asked search engineer Peter Popov about the costs involved in building a search index. Popov spent 15 years at Russian search giant Yandex, working on core search and ranking, and is now VP of ads at VK.
“Very roughly, a search index, which includes hardware and the cost of writing a search, does not cost much more than $10 million,” Popov told us, couching such an outlay as “not a very large investment.” He suggested that advances in AI have made it easier to produce quality search results without needing vast amounts of users feeding in data, “by using modern LLM models that contain knowledge of search semantics out of the box.”
At the same time, he warned there is a growing challenge over where search bots can, or can’t, freely crawl. This is a problem as a search index needs wide access to information sources in order to usefully serve users’ queries.
“Proprietary platforms are often quite unfriendly to attempts to collect information,” Popov told TechCrunch.
“Creating a search index for the entire Internet is not such a difficult task from a technical point of view. The volume of useful information on the internet grows more slowly than computing power. By the way, one of the problems for scaling AI is precisely the relatively small volume of such information.”
“The useful-for-information-search internet is not that big,” he continued. “There is no internet of sites to search right now. And wait, of these [mainstream web] platforms, only Wikipedia is open to search. So that leaves Wikipedia search.
“After Wikipedia there are not many useful sites like arxiv.org or large online libraries. Information of this kind can be used in two ways — either by providing data during network training, or by feeding the neural network with search results during inference, in which case the search is one of the components of LLM working under the hood.”
In other words, in order for a search index to be useful, it also needs to be able to freely crawl the internet. But with Big Tech more jealously guarding info inside its own platforms these days, as giants compete to monetize user data afresh for training LLMs, this is also complicating the business of trying to get out from under their shadow by indexing the internet for search… From a rock to a hard place, then.