
docker hub is surprisingly difficult to replace because of how docker registries work.

Traditional package managers have two distinct concepts: a repository and packages within those repositories.

For example, if ubuntu took down their apt repo server, I could run my own with all the same packages and change a single sources.list entry and all my servers, ansible roles installing packages, etc, would operate the same.

This is possible because the package name+version is the identifier everything else uses; the only thing that cares about the repository is apt itself, and no other tooling needs to know which repository a package came from.
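The decoupling shows in how little it takes to retarget apt: the repository URL lives only in sources.list, so a mirror swap is a one-line edit. A sketch (the mirror host and the path under /tmp are made up for illustration):

```shell
# With apt, only sources.list knows the repo URL; package identifiers
# never change, so swapping mirrors is a single edit.
echo "deb http://archive.ubuntu.com/ubuntu focal main" > /tmp/sources.list
# Point at our own mirror (mirror.example.com is a made-up host):
sed -i 's|archive.ubuntu.com|mirror.example.com|' /tmp/sources.list
cat /tmp/sources.list
```

After this, `apt-get update && apt-get install foo` behaves identically, because nothing else ever referenced the old host.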

Docker conflates those two things. Each client doesn't just record a package name; it records a URL + package name + version (e.g. foo.registryurl.com/image:version). Because every single client has "foo.registryurl.com" baked in, it's difficult to change. There's no single "repository-mapping" file the docker daemon reads that I could quickly update.

Instead, I have to update every single client.
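To make that concrete: moving registries means rewriting every image reference in every manifest, Dockerfile, and compose file. A blunt migration sketch (the hostnames and the fixture manifest are made up):

```shell
# Image references bake the registry host into the name itself, so a
# registry move is a find-and-replace across every consumer.
mkdir -p /tmp/manifests
cat > /tmp/manifests/deploy.yaml <<'EOF'
image: old-registry.example.com/team/app:1.2.3
EOF
# Rewrite every file that references the old host:
grep -rl 'old-registry.example.com/' /tmp/manifests | \
  xargs sed -i 's|old-registry.example.com/|new-registry.example.com/|g'
cat /tmp/manifests/deploy.yaml
```

Contrast with the apt case: there is no single client-side file to edit, so this sed sweep (or an equivalent) has to reach every repo and every host.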

The idea of decoupling those is not new. It was proposed back in 2014 [1], and various implementations that would make migrating off the default registry easier have been proposed and rejected [2].

This doesn't even get into the lack of tooling for chasing down the transitive dependencies my image builds have on various registries through each FROM line.
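You can scrape a rough inventory of those registry dependencies from FROM lines yourself, though nothing resolves them transitively for you. A sketch (the fixture Dockerfiles are made up):

```shell
# Rough audit of which registries image builds pull from via FROM.
mkdir -p /tmp/dockerfiles
printf 'FROM quay.io/foo/bar:1\nRUN true\n' > /tmp/dockerfiles/a.Dockerfile
printf 'FROM alpine:3.9\n' > /tmp/dockerfiles/b.Dockerfile
# References without a host part (like alpine:3.9) implicitly mean the
# default registry, i.e. Docker Hub.
grep -rh '^FROM ' /tmp/dockerfiles | awk '{print $2}' | sort -u
```

Note this only covers one level: each base image was itself built FROM something, and that chain isn't visible from your own Dockerfiles.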

[1]: https://github.com/moby/moby/issues/8329

[2]: https://github.com/moby/moby/pull/5821#issuecomment-49492924



> Instead, I have to update every single client.

To combat this, every single image we use from hub.docker.com is "proxied" into our registry with a one-line Dockerfile:

   FROM image:version
Building the "proxy" image and publishing it in our registry is entirely automated (using the CI and registry of a self-hosted Gitlab). Then we make everything point to our version in our registry. Should hub.docker.com go belly-up, we have 1. a cache of the versions in use (current and past), and 2. full control of the images (possibly rebuilding our own FROM scratch) without having to change a single line in downstream consumers. Initially we did this to guard against hub.docker.com's occasionally intermittent availability, which would delay image pulls on deployments.
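A minimal sketch of such a mirroring job, assuming a made-up upstream image and registry path; in CI the generated Dockerfile would then be built and pushed:

```shell
# Generate the one-line "proxy" Dockerfile for a given upstream image.
# UPSTREAM and MIRROR are illustrative; real values come from CI config.
UPSTREAM=postgres:10.1
MIRROR=gitlab.example.com/docker/library/postgres:10.1
printf 'FROM %s\n' "$UPSTREAM" > /tmp/Dockerfile
# The CI job would then run (not executed here):
#   docker build -t "$MIRROR" -f /tmp/Dockerfile .
#   docker push "$MIRROR"
cat /tmp/Dockerfile
```

Because the mirrored image is built rather than merely re-tagged, the registry ends up holding its own copy of every layer, which is what makes it survive an upstream outage.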


Do you do it per project or as a separate project that houses all the proxy images? How do you version the proxy images? What namespace do you push them into? Is it easy enough to deal with that it doesn’t waste a lot of time?

I’ve been trying to insulate myself from docker too and the FROM proxy strategy seems to break the least stuff. Have you hit any pain points?


This is a single 'gitlab.example.com/docker/library' project.

We use orphan branches, one per image, although other strategies are possible (like using the commit diff and directory name).

Proxy images are versioned using branch names (e.g. postgres vs postgres-9.6), images are pushed to gitlab.example.com/docker/library/postgres, and using version detection we generate docker image tags (e.g. a 'postgres' branch will create postgres:latest, and extracting the version from `postgres --version` also pushes postgres:10 and postgres:10.1 images).
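The tag fan-out might look like this as a sketch; the real pipeline extracts VERSION by running `postgres --version` inside the built image, which is not shown here:

```shell
# Derive the set of tags to push from a detected version string.
# VERSION is hard-coded here for illustration only.
VERSION="10.1"
MAJOR="${VERSION%%.*}"   # "10"
for TAG in latest "$MAJOR" "$VERSION"; do
  echo "push postgres:$TAG"
done
```

Pushing the major tag alongside the full version lets downstream consumers choose how tightly to pin.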

See this .gitlab-ci.yml[0]. Yes, there is one per branch. This could be generalised further (especially with Gitlab's new import system for .gitlab-ci.yml), but it works well enough in practice, it's very low maintenance, and updates are a mere commit+push away.

In fact, we use this not just for proxying images but for all "generalised", "utility", or "dependency" images that aren't the output of a full-blown app project with its own repo (those have their own CI/CD process in their respective repos).

[0]: https://gitlab.com/snippets/1705998


But here we are talking about Docker Hub closing, which is way easier to solve. That is the registry used by default for all non-local images without a URL identifier, like "alpine:latest".

All you need is an option to set the default registry. It's probably already there; I didn't google it.


Google it. It’s a fun topic.


Does it actually send the URL? I think it uses the URL to make the network request, but doesn’t actually send the URL to the registry. If it did, you could use an alternate registry as a pull through cache and have it go upstream for everything. AFAIK that’s not possible.
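For what it's worth, the open-source distribution registry (registry:2) does ship a pull-through cache mode, but it mirrors exactly one configured upstream rather than following whatever host appears in the request, so it can't transparently front arbitrary registries. The relevant config fragment looks like:

```yaml
# distribution registry config: pull-through cache of a single upstream
proxy:
  remoteurl: https://registry-1.docker.io
```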


Docker's registry protocol is surprisingly complicated. It is stateful, it is not trivially cacheable, and it's a right pain in the ass to deal with.



