
The term “container breakout” describes a situation in which a program running inside a Docker container overcomes the isolation mechanisms and gains additional capabilities or access to confidential information on the host. To make such breakouts harder, containers run with a reduced set of capabilities by default. The defaults are not the whole story, though: the Docker daemon runs as root by default, so it is worth going further by creating a user namespace or removing potentially dangerous container capabilities.

Best practices

Capabilities that the application doesn’t need should be removed.

  • CAP_SYS_ADMIN is especially dangerous, since it permits a large number of superuser-level operations: mounting file systems, entering kernel namespaces, performing privileged ioctl operations, etc.
  • To limit root inside a container to the rights of a regular user on the host, create an isolated user namespace for your containers. If possible, avoid running containers with uid 0.
  • If you really cannot do without a privileged container, make sure that its image comes from a trusted repository.
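
As a minimal sketch of the first practice, the helper below builds a docker run command line that drops every capability and adds back only the ones named. The function name minimal_caps_cmd and the capability list in the usage line are illustrative assumptions; the right list depends on your application. The helper only prints the command so that it can be reviewed before it is run.

```shell
# minimal_caps_cmd IMAGE [CAP...]: print a docker run command that drops
# all capabilities and re-adds only the listed ones (hypothetical helper).
minimal_caps_cmd() {
  image="$1"
  shift
  cmd="docker run --rm --cap-drop=ALL"
  for cap in "$@"; do
    cmd="$cmd --cap-add=$cap"
  done
  printf '%s %s\n' "$cmd" "$image"
}

# Example: a web server that only needs to bind to a low port.
minimal_caps_cmd nginx NET_BIND_SERVICE
```

Starting from an empty set and adding capabilities back one at a time makes every grant explicit, which is usually easier to audit than a list of drops.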

Closely monitor any mounting of potentially dangerous host resources into containers: /var/run/docker.sock, /proc, /dev, etc. These resources are usually needed for operations related to the basic functionality of containers, so make sure you understand why a process needs them and how to limit its access. Sometimes mounting the resource read-only is enough. Never grant write permission without knowing exactly why it is needed. In any case, Docker uses copy-on-write to prevent changes made in a running container from reaching its base image and, through it, other containers created from that image.
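
One way to review mounts before a container starts is to classify each bind-mount specification. The sketch below assumes specs in the host:container[:opts] form accepted by the -v option; the function name check_mount is hypothetical, and the list of risky paths contains only the ones named above, so extend it for your environment.

```shell
# check_mount SPEC: report whether the host side of a bind mount is one of
# the sensitive paths and whether the mount is read-only (hypothetical helper).
check_mount() {
  host=$(printf '%s\n' "$1" | cut -d: -f1)
  opts=$(printf '%s\n' "$1" | cut -d: -f3)
  case "$host" in
    /var/run/docker.sock|/proc|/proc/*|/dev|/dev/*) risky=yes ;;
    *) risky=no ;;
  esac
  # Options are comma-separated; look for "ro" among them.
  case ",$opts," in
    *,ro,*) mode=ro ;;
    *) mode=rw ;;
  esac
  echo "$host risky=$risky mode=$mode"
}

check_mount /var/run/docker.sock:/var/run/docker.sock
check_mount /proc:/host/proc:ro
check_mount /srv/data:/data
```

Any line reporting risky=yes together with mode=rw deserves a second look before the container is started.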

Examples

By default, the root user in a Docker container can create device nodes. You will probably want to disallow this:

# sudo docker run --rm -it --cap-drop=MKNOD alpine sh
/ # mknod /dev/random2 c 1 8
mknod: /dev/random2: Operation not permitted
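
To confirm that a capability is really gone, you can also inspect the effective capability mask (the CapEff field in /proc/self/status) from inside the container. The sketch below tests a single bit of such a mask. The helper name has_cap is hypothetical, bit 27 is CAP_MKNOD in the Linux capability header, and the mask a80425fb is the value commonly seen for a default Docker container, which is an assumption: yours may differ.

```shell
# has_cap MASK BIT: print 1 if bit BIT is set in the hex capability
# mask MASK, otherwise 0 (hypothetical helper).
has_cap() {
  echo $(( (0x$1 >> $2) & 1 ))
}

CAP_MKNOD=27   # bit number of CAP_MKNOD in linux/capability.h

has_cap 00000000a80425fb "$CAP_MKNOD"   # assumed default set: MKNOD present
has_cap 00000000a00425fb "$CAP_MKNOD"   # same set with the MKNOD bit cleared

# Inside a container, read the live mask of the current shell instead:
#   mask=$(awk '/^CapEff:/ {print $2}' /proc/self/status)
#   has_cap "$mask" "$CAP_MKNOD"
```

The same check works for the other capabilities discussed here by changing the bit number, for example 1 for CAP_DAC_OVERRIDE and 13 for CAP_NET_RAW.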

 

Root can also bypass the permissions on any file. This is easy to verify: create a file as a regular user, run chmod 600 on it (so that only the owner can read and write it), switch to root, and check that the file is still accessible.

This can be fixed as well, which matters especially if you have mounted directories containing confidential user data.

# sudo docker run --rm -it --cap-drop=DAC_OVERRIDE alpine sh

 

Create a regular user and switch to that user’s home directory. Then:

~ $ touch supersecretfile
~ $ chmod 600 supersecretfile
~ $ exit
~ # cat /home/user/supersecretfile
cat: can't open '/home/user/supersecretfile': Permission denied

 

Many security scanners and malware build their network packets from scratch using raw sockets. This behavior can be disabled as follows:

# docker run --cap-drop=NET_RAW -it uzyexe/nmap -A localhost

Starting Nmap 7.12 ( https://nmap.org ) at 2017-08-16 10:13 GMT
Couldn't open a raw socket. Error: Operation not permitted (1)

 

If you create a container without a user namespace, the processes running inside it will by default run as the superuser from the host’s point of view.

# docker run -d -P nginx
# ps aux | grep nginx
root     18951  0.2  0.0  32416  4928 ?        Ss   12:31   0:00 nginx: master process nginx -g daemon off;

 

However, we can create a separate user namespace. To do this, add the following key to the /etc/docker/daemon.json file (take care to follow JSON syntax rules):

"userns-remap": "default"
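
Since the key has to sit inside a JSON object, a complete minimal file may help. The sketch below writes one to a scratch path rather than to /etc/docker/daemon.json, on the assumption that you want to review it first and that your real file has no other settings to preserve; if it does, merge the key into the existing object instead.

```shell
# Write a minimal daemon.json to a temporary file for review.
conf=$(mktemp)
cat > "$conf" <<'EOF'
{
  "userns-remap": "default"
}
EOF

cat "$conf"

# Optional syntax check, if python3 happens to be installed:
#   python3 -m json.tool "$conf"
# After review, copy the file into place as root:
#   cp "$conf" /etc/docker/daemon.json
```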

 

Restart Docker. This creates the dockremap user. The new namespace starts out empty, so docker ps shows no containers:

# systemctl restart docker
# docker ps

 

Run the nginx image again:

# docker run -d -P nginx
# ps aux | grep nginx
165536   19906  0.2  0.0  32416  5092 ?        Ss   12:39   0:00 nginx: master process nginx -g daemon off;

 

Now the nginx process runs in a separate user namespace, which improves the isolation of the containers.
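
The UID 165536 in the ps output above is not arbitrary: with remapping enabled, container UIDs are offset by the start of the subordinate ID range assigned to dockremap in /etc/subuid. The sketch below reproduces that arithmetic. The entry dockremap:165536:65536 is an assumption chosen to match the output shown here, and map_uid is a hypothetical helper; check your own /etc/subuid for the real range.

```shell
# map_uid ENTRY CONTAINER_UID: translate a container UID to a host UID
# using one /etc/subuid-style entry "name:start:count" (hypothetical helper).
map_uid() {
  start=$(printf '%s\n' "$1" | cut -d: -f2)
  count=$(printf '%s\n' "$1" | cut -d: -f3)
  # UIDs outside the range have no mapping on the host.
  if [ "$2" -ge "$count" ]; then
    echo "unmapped"
    return 1
  fi
  echo $(( start + $2 ))
}

map_uid dockremap:165536:65536 0     # container root
map_uid dockremap:165536:65536 101   # an unprivileged service user
```

Container root therefore appears on the host as an ordinary unprivileged UID, which is why a process that escapes the container no longer holds root rights there.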