Working around Docker's COPY command copying as root using GNU M4

I’ve been working with Docker a lot recently, and have been switching to non-root users within the Dockerfile using the USER instruction (not for security reasons, but in order to test software not as root). As a result, I’ve been suffering a little from this issue - essentially, files copied into a Docker image using ADD or COPY are copied with the owner as root, irrespective of the ownership outside the Docker image. According to this comment, this behaviour in Dockerfiles won’t change, at least in the near future.

Irrespective of the merits of this, I needed a solution. There is an (inelegant) way to work around this, which is to switch to root, copy the file in, change its ownership, and switch back to a regular user. I could do this directly inside the Dockerfile, but this requires a lot of extra boilerplate code bulking out the Dockerfile.

As a result, I turned to an ancient templating technology to make my Dockerfile more manageable - m4. I’ve created an m4 macro file which can be included in a Dockerfile. The full version (called docker-copy-as-user) is on github, but a small flavour is:

define(`COPYASUSER',
`USER root
COPY $1 $2
RUN chown testuser $2/$1
USER $3
')

A file that contains this macro can be included in a Dockerfile (which I now like to call Dockerfile.m4) with the incantation include(`docker-copy-as-user'). This can then be used by including the ‘meta-instruction’ COPYASUSER filename target/dir user-to-own-it. For example, Dockerfile.m4 might look like:

FROM ubuntu:15.04

RUN useradd -ms /bin/bash testuser
USER testuser
WORKDIR /home/testuser

COPYASUSER externalfile.zip /tmp testuser
RUN unzip externalfile.zip

This can then be pre-processed before you run docker build with:

m4 -I directory/where/docker-copy-as-user/is Dockerfile.m4 > Dockerfile

Note: I am very much a newbie to both Docker and m4. Please don’t read the above as being the optimal answer. In particular, your final (processed) Dockerfile still ends up pretty bulky, so this will still have performance implications due to the number of layers. I’d be glad to hear any suggested improvements.