
I looked into the Cliqz browser a few years ago. Back then they claimed to be a privacy-respecting alternative to Chrome. What I found was that they sent every keystroke I typed in the URL bar to their own servers. They outright lied about that in their terms.

For me their mission is pretty clear: Google ate Burda's ad cookies and now they are trying to get their hands into the cookie jar again. Given that today we have widespread TLS adoption the war about the endpoint has begun. Cliqz, alias Burda Media, is just another combatant - the one who controls the browser controls the ads.

Previous thread with more info:

https://news.ycombinator.com/item?id=18626790



They have an autosuggest feature in the browser which shows search results as you type. It goes without saying that this requires a request per keystroke. If each request contained a unique user id, that would be concerning, but that is not the case. That the search query itself goes to their servers is just common sense, but those requests are anonymized. (Disclaimer: I'm a former employee)


Oh, the requests contained much more than the keystrokes and I looked into the anonymization claims as well. I'd have to dig out my Wireshark traces to tell you exactly what was transmitted, but I'm pretty sure it was enough to id the user. And by the way Cliqz never claimed that the anonymization happens at the endpoint, from what I remember they were very proud of their proprietary anonymization technology in the backend.

And just one more thing: In Google Chrome at least I can turn off auto-suggest. This was not possible in the version of the Cliqz browser I tested.


Thanks for the feedback. If you find something fishy in the requests, then that's likely a bug and we will gladly fix it ASAP.

Disclaimer: I work at Cliqz.


For the current Cliqz Browser (1.30.1) when I type "pythux" in the URL bar the request looks like this:

GET /api/v2/results?nrh=1&q=pythux&s=UTlnJzAh4ULkORaiZgLPO6LW9LOWZZoO&n=5&qc=0&lang=en&locale=en-US&platform=0&o=%5B%5B%22custom-search%22%5D%5D&country=de&adult=0&loc_pref=ask&count=10&suggest=0

HTTP/1.1

Host: api.cliqz.com
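
To make it easier to see what the request carries, here is a minimal Python sketch that splits the query string into decoded parameters. It assumes `q` and `s` are separate parameters; the helper name `decode_params` is made up for illustration and is not part of any Cliqz code.

```python
from urllib.parse import urlsplit, parse_qs

# Request target from the capture, with `q` and `s` treated as
# separate parameters (an assumption).
target = (
    "/api/v2/results?nrh=1&q=pythux&s=UTlnJzAh4ULkORaiZgLPO6LW9LOWZZoO"
    "&n=5&qc=0&lang=en&locale=en-US&platform=0"
    "&o=%5B%5B%22custom-search%22%5D%5D&country=de&adult=0"
    "&loc_pref=ask&count=10&suggest=0"
)


def decode_params(request_target: str) -> dict[str, str]:
    """Return the percent-decoded query parameters of a request target."""
    query = urlsplit(request_target).query
    # parse_qs maps each name to a list of values; every name occurs once here.
    return {name: values[0] for name, values in parse_qs(query).items()}


params = decode_params(target)
for name, value in params.items():
    print(f"{name:10} {value}")
```

Besides the typed text and the opaque `s` value, the request also carries language, locale, and country settings.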

1. I saw UTlnJzAh4ULkORaiZgLPO6LW9LOWZZoO in other requests too, but not all of them. Can you explain what this parameter is good for and which information it encodes?

2. Can you tell me if it is possible to disable auto-suggest (or Quick Search, as it seems to be called in your terms)? If it is possible, how do I do it? I couldn't find it in the UI.

3. Can you specify what the exact legal relationship between Cliqz GmbH and FoxyProxy LLC is and if there is any shared ownership or if there are any common parent companies?

4. As you can see above, the requests go directly to api.cliqz.com. While the terms go to great lengths to explain that information is routed via a proxy owned by a third party, the information we are talking about here is exempt from that. I quote the relevant passages here:

> This channel collects signals about WHAT you search and where you land. That is why we do not collect any personal identifier here, which makes it impossible to associate searches with users. Moreover, all query entries and clicks on website suggestions are evaluated only as a single event, disentangling these signals from everything else. Thus, we are neither able to combine data from multiple entries or multiple clicks on website suggestions, nor to link this information with personal information like your email address or an IP address, either.

> Query logging data is used to further improve the Cliqz backend. More specifically:

> To be able to suggest websites in real-time while you are typing into Cliqz’ combined browser-and-search-bar, Cliqz sends your keystrokes to our servers. With every new keystroke, our backend scans our index and predicts the most relevant results for your search query.

> “Relevant” to that regard is (very simplified) defined by the frequency a given website is clicked on for a given query. In other words, Cliqz predicts the most probable site you will navigate to, based on the (partial) query that you type. In order to further improve this mechanism of relevancy, Cliqz logs the clicks in its drop down menu and the respective queries.

I wish the terms were more clear about the fact that crucial information is indeed sent directly to Cliqz and is not sent via the FoxyProxy route.


Hi, Disclaimer: I work for Cliqz.

Thank you for the questions, we are always looking for constructive feedback on and off HN.

1. These random values are used for grouping partial queries together, and they reset when you press enter or start a new query. Source code on how it's generated: https://github.com/cliqz-oss/browser-core/blob/master/module... We actually take one additional precaution of using crypto random and not plain Math.random(), which could potentially be used to link multiple sessions together.

2. There is no feature to disable auto-suggest. But I will pass your feedback to the team.

3. No, there is no shared ownership, and we don't have access to their servers. We also do additional encryption with bucketing on the payload sizes that we route through FoxyProxy, so that the proxy provider cannot learn anything about the content of the message. We will have a blog post explaining this on Wednesday, 4th December. Also, we are looking to add an option where users can choose their own proxy provider.

4. There are two parts: a. You can select the option from the Control Center (Q menu) icon in the toolbar -> search -> search via proxy. (Now the calls should go through FoxyProxy.) b. All calls to api.cliqz.com go through the proxy when in private mode. The only reason it's not the default is latency. What goes through FoxyProxy by default is all Human Web data.

Once again, we appreciate you looking into the details, and please keep digging. We would be happy to answer and to improve our documentation, and if there are bugs, especially ones related to privacy and security, they are our utmost priority.
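
For readers who don't want to dig through the repository, the idea in point 1 can be sketched roughly as follows. This is a Python illustration of the concept only, not the browser's actual code: the class `QuerySession`, its method names, and the 24-byte token length are all assumptions. Python's `secrets` module stands in for a cryptographic random source, in contrast to a general-purpose generator such as `Math.random()`.

```python
import secrets


class QuerySession:
    """Hypothetical holder for the per-query grouping token."""

    def __init__(self) -> None:
        self.token = self._new_token()

    @staticmethod
    def _new_token() -> str:
        # token_urlsafe draws from the OS CSPRNG; 24 bytes encode to 32 characters.
        return secrets.token_urlsafe(24)

    def on_keystroke(self, partial_query: str) -> dict[str, str]:
        # Every partial query in one typing session carries the same token.
        return {"q": partial_query, "s": self.token}

    def on_submit(self) -> None:
        # Pressing enter or starting a new query rotates the token.
        self.token = self._new_token()


session = QuerySession()
first = session.on_keystroke("pyth")
second = session.on_keystroke("pythux")
session.on_submit()
third = session.on_keystroke("weather")
```

Here `first` and `second` share a token, so the backend can group them as one query in progress, while `third` carries a fresh one.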


Man, you really got them here haha.

Thanks for taking the time to dig into things. Comments like these are why I continue to regularly read Hacker News.


IP addresses are generally pretty sticky and make a great unique id for users. Even without uids being sent in the request, keying on IP address should eliminate a ton of anonymity.
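
As a rough illustration of what keying on the address means, here is a small Python sketch. The log entries, addresses, and tokens are invented for the example, and nothing here is a claim about what any server actually stores.

```python
from collections import defaultdict

# Hypothetical server-side log entries: (client_ip, session_token, partial_query).
log = [
    ("203.0.113.7", "tokA", "pyth"),
    ("203.0.113.7", "tokA", "pythux"),
    ("198.51.100.4", "tokB", "weather"),
    ("203.0.113.7", "tokC", "flights to"),
]


def group_by_ip(entries):
    """Group queries by source address, ignoring the per-query token."""
    by_ip = defaultdict(list)
    for ip, _token, query in entries:
        by_ip[ip].append(query)
    return dict(by_ip)


profiles = group_by_ip(log)
```

Even though `tokA` and `tokC` are unrelated, both sets of queries end up in the same bucket once the source address is the key.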


Keystrokes + IP = NSA knows exactly what I'm typing.

This "feature" should be off by default on any software that claims to be a privacy respecting alternative to Chrome.


It's been my impression that privacy from the NSA is, realistically, not something any commercially viable web company can offer you. An exception can probably be made for a site that operates outside of US influence serving customers outside US influence.


Uhh, most products can offer plenty enough defense against opportunistic blanket surveillance (but broadcasting your keystrokes is probably not going to help there). Protection against targeted attacks is a whole different story.


If you are OK with some extra latency, you can enable our "search via proxy" feature, which proxies all queries and hides your IP. Maybe that is what you are looking for?


Can you ID someone positively by their typing patterns and an IP?


You most likely can. Just listening to people type on a keyboard gives a very unique feel with each person. Try it sometime, listen to family members and colleagues type or find some youtube video. I can ID all my family members and some coworkers by the sound of their typing.

Obviously, if you can get the content of what they're typing, it gets much easier still. I think I've seen papers where they ID programmers based on the code they've produced. This applies to other types of writing too.

You can build a model from keystroke timings and figure out people's SSH passwords too. https://www.usenix.org/legacy/events/sec01/full_papers/song/...
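
A toy Python sketch of the basic idea, far simpler than the statistical model in the linked paper: compare the gaps between keystrokes against known profiles. All timestamps and names below are invented, and the distance measure is just one simple choice.

```python
from statistics import mean


def intervals(timestamps_ms):
    """Gaps between consecutive keystroke timestamps, in milliseconds."""
    return [b - a for a, b in zip(timestamps_ms, timestamps_ms[1:])]


def distance(profile, sample):
    """Mean absolute difference between two equal-length interval lists."""
    return mean(abs(p - s) for p, s in zip(profile, sample))


# Hypothetical arrival times for the same six-letter word typed by two people.
alice_ref = intervals([0, 110, 190, 330, 410, 560])
bob_ref = intervals([0, 220, 460, 650, 900, 1120])
unknown = intervals([0, 120, 195, 340, 425, 570])

# Pick the reference profile whose rhythm is closest to the unknown sample.
closest = min(
    (("alice", alice_ref), ("bob", bob_ref)),
    key=lambda named: distance(named[1], unknown),
)[0]
```

With these made-up numbers the unknown sample lands much closer to the first profile; real traffic would need far more data and a more careful model.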


Now do it with random network jitter.

The main problem here is that even if you could do it, why would you? There are over 3.4 billion Internet users in the world. Given that people share IP addresses and even browsers, what's the actual gain in identifying someone through typeahead search keystroke jitter? This would take a lot of effort, and then tell you what that a cookie doesn't?

I can't imagine that it's actually worth the effort.


Yes. Also search for WW2 war stories about radio operators; this technique dates back to then. US Pacific operators always knew which Japanese radio operator was transmitting Morse code by listening to their keying patterns. And if a human was able to do it, imagine the accuracy of computers in this regard.


I was a ham radio operator for several years. It was easy to ID my frequent contacts by their Morse code transmission patterns, and they could ID me. An operator's pattern was called their "fist".


Yes, just like you can recognize someone by their footsteps. Most likely you don't even need the IP.


> I looked into the Cliqz browser a few years ago. Back then they claimed to be a privacy-respecting alternative to Chrome. What I found was that they sent every keystroke I typed in the URL bar to their own servers. They outright lied about that in their terms.

That's called typeahead search, and that's how it has always worked since the invention of asynchronous HTTP requests.

If you know of someone that sends a trie down to the client containing every possible completion a priori, I'd like to know about it.
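
For reference, the client-side alternative mentioned above would look roughly like this minimal Python sketch. The class names and the word list are invented for illustration; a real search index would be far too large to ship to a browser this way, which is the point being made.

```python
class TrieNode:
    def __init__(self):
        self.children = {}
        self.is_end = False


class Trie:
    def __init__(self, words=()):
        self.root = TrieNode()
        for word in words:
            self.insert(word)

    def insert(self, word):
        node = self.root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
        node.is_end = True

    def complete(self, prefix, limit=5):
        """Return up to `limit` stored words starting with `prefix`, alphabetically."""
        node = self.root
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return []
        results = []
        stack = [(node, prefix)]
        while stack and len(results) < limit:
            current, text = stack.pop()
            if current.is_end:
                results.append(text)
            # Push children in reverse-sorted order so they pop alphabetically.
            for ch in sorted(current.children, reverse=True):
                stack.append((current.children[ch], text + ch))
        return results


trie = Trie(["python", "pythux", "pypi", "news"])
suggestions = trie.complete("py")
```

Lookups like `trie.complete("py")` happen entirely on the client, so no keystroke leaves the machine, at the cost of downloading every possible completion up front.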


"they sent every keystroke I type in the URL bar to their own servers"

Well, that's how you implement an autocomplete feature in the first place. If they'd sent every keystroke you typed outside their URL bar, now that's something to be aware of. So, did they?


Sorry you feel that way. In the Cliqz browser you search as you type. To search and serve results we need to know the query. We will have an article on exactly this product in a week or so.

We will also be sharing more details on the process of data collection in the blog posts scheduled for the next 3 days. Of course, you don't need to trust what we say: all the code used for collecting data is open source for transparency and auditing: https://github.com/cliqz-oss/browser-core

Lastly, privacy policies are legally binding documents, and we take the law very seriously. We are located in Germany, where privacy laws are as tough as they get (e.g. GDPR was loosely just a re-wording of the existing data protection directive in Germany).



