Container Internals

A) Registry and image layers

A container registry is just a server that hosts images (.tar files along with .json files) for multiple remote users. It is similar to Docker Hub (your own private Docker Hub!). To understand image layers, we will manually download each layer and build the filesystem from them, mimicking how an isolated filesystem is created when we run a container :)

1. To start a registry server, run the command below (it automatically pulls the image from the official Docker repo and runs the container):
docker run -d -p 5000:5000 --restart=always --name registry registry:latest

2. Pushing our custom image to our new repo:

a) Changing the tag of alpine-

docker pull alpine:latest
docker tag alpine:latest
docker push

b) Confirming push:


3. Pulling the image:

a) To see what actually happens in the background, we can proxy the podman traffic through Burp Suite-

export HTTP_PROXY=

Note: for those unfamiliar with podman, it is an alternative to docker for managing containers. It does not need a daemon running in the background all the time, so it can run containers with the invoking user's permissions, whereas docker by default runs them through a root daemon.

4. Recreating file system manually:

a) First we find out how many layers the image has-


We found 2 layers. Let’s try to download them one by one.
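As a rough sketch of how the layer count can be read from a manifest, the helper below prints every digest listed under "layers". The function name layer_digests and the sample manifest (with shortened, made-up digests) are assumptions for illustration, and the text matching relies on "layers" appearing after "config"; a JSON tool such as jq would be more robust.

```shell
# Print the digest of every entry in the "layers" array of a manifest read
# from standard input. Assumes "layers" comes after "config" in the document.
layer_digests() {
    tr -d '\n' | sed 's/.*"layers"//' | grep -o 'sha256:[0-9a-f]*'
}

# Made-up manifest with shortened, made-up digests, for illustration only.
layer_digests <<'EOF'
{
  "schemaVersion": 2,
  "config": { "digest": "sha256:cccc0000" },
  "layers": [
    { "size": 100, "digest": "sha256:aaaa1111" },
    { "size": 200, "digest": "sha256:bbbb2222" }
  ]
}
EOF
```

Each printed line corresponds to one layer blob, so counting the lines gives the number of layers.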

b) Download layers one by one-

curl -s --output l1
curl -s --output l2

c) Checking the type of each file (it will be either .json or .tar)-

.json files contain metadata, whereas .tar files contain the actual files and folders.
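If the file utility is not at hand, the type can be guessed from the first bytes of each download. This is a minimal sketch: the function name classify is hypothetical, the JSON check only looks for a leading "{", and the gzip case is an assumption added because registries commonly serve layers as gzip-compressed tarballs.

```shell
# Guess the type of a downloaded blob from its magic bytes.
classify() {
    # First two bytes as hex, for example "1f8b" for gzip.
    magic=$(head -c 2 "$1" | od -An -tx1 | tr -d ' \n')
    # Bytes 257 to 261 hold "ustar" in a POSIX tar archive.
    ustar=$(dd if="$1" bs=1 skip=257 count=5 2>/dev/null)
    case "$magic" in
        1f8b) echo "gzip" ;;
        7b*)  echo "json" ;;   # 0x7b is "{"
        *)    if [ "$ustar" = "ustar" ]; then echo "tar"; else echo "unknown"; fi ;;
    esac
}

# Usage, with the file names from the previous step:
#   classify l1
#   classify l2
```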

d) Now we can just extract the .tar to get the actual filesystem-

tar -xvf l2
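The extraction step can be tried without a registry. The sketch below builds two tiny stand-in layers locally and unpacks them in order into one directory, so a file in the later layer replaces the same file from the earlier one. All paths and file contents are made up for illustration, and deletions between layers (whiteout files) are not handled.

```shell
#!/bin/sh
set -eu

# Hypothetical scratch area; nothing here touches a real image.
work=$(mktemp -d)
mkdir -p "$work/src1/etc" "$work/src2/etc" "$work/rootfs"

# "Layer 1" provides two files, "layer 2" overrides one of them.
echo "from layer 1" > "$work/src1/etc/motd"
echo "base" > "$work/src1/etc/os-name"
echo "from layer 2" > "$work/src2/etc/motd"

tar -cf "$work/l1" -C "$work/src1" .
tar -cf "$work/l2" -C "$work/src2" .

# Apply the layers in order, lowest first, into one directory.
for layer in "$work/l1" "$work/l2"; do
    tar -xf "$layer" -C "$work/rootfs"
done

# Shows the layer 2 version of the file.
cat "$work/rootfs/etc/motd"
```

The resulting rootfs directory holds the merged view: etc/os-name from the first layer and etc/motd from the second.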

B) Cgroups:

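Cgroups (control groups) are a kernel feature for managing the resources that processes use. There are two versions, 1 and 2. The commands below use the version 1 interface; many recent distributions default to version 2, where the file names differ. Resources such as CPU, RAM, and network access are called controllers in cgroup terminology. Controller types include cpu, memory, pids, etc.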

1. Creating a memory cgroup-
sudo cgcreate -g memory:test

2. Setting a 4 MiB (4194304 bytes) maximum for memory usage-

sudo cgset -r memory.limit_in_bytes=4194304 test
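The value 4194304 is simply 4 × 1024 × 1024, since the limit is given in bytes. A small helper, named mib_to_bytes here purely for illustration, makes the conversion explicit:

```shell
# Convert a size in MiB to bytes, the unit the memory limit expects.
mib_to_bytes() {
    echo $(( $1 * 1024 * 1024 ))
}

mib_to_bytes 4   # prints 4194304

# Possible use with the command above:
#   sudo cgset -r memory.limit_in_bytes="$(mib_to_bytes 4)" test
```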

3. Applying the restriction and creating a new process (memeat):

memeat is a memory stress-testing tool that tries to allocate the amount of memory given on the command line. Because we set a 4 MiB limit on the memory cgroup test, processes in this cgroup are not allowed to exceed it, and a process that tries to is killed.

sudo cgexec -g memory:test ~/Downloads/memeat 1M
1 MB works fine, and so does 3 MB:
sudo cgexec -g memory:test ~/Downloads/memeat 3M

But if we try to consume 5 MB, the process is stopped automatically:

sudo cgexec -g memory:test ~/Downloads/memeat 5M

You can get the memeat app from the repo below. You need to compile it first :)

C) Namespaces:

While cgroups control how much of a resource a process can use, namespaces control what a process can see and access. They isolate global system resources between independent processes. The lsns command lists the namespaces visible to the current user. The namespace types are:


  1. PID - isolates process IDs
  2. UTS - isolates hostnames and domain names
  3. Network - isolates network interfaces
  4. IPC - isolates interprocess communication (IPC) resources
  5. Mount - isolates filesystem mount points
  6. User - isolates UID/GID number spaces
  7. Cgroup - isolates the cgroup root directory

The flag that creates each namespace:

Namespace   Flag              Isolates
Cgroup      CLONE_NEWCGROUP   Cgroup root directory
IPC         CLONE_NEWIPC      System V IPC, POSIX message queues
Network     CLONE_NEWNET      Network devices, stacks, ports, etc.
Mount       CLONE_NEWNS       Mount points
PID         CLONE_NEWPID      Process IDs
User        CLONE_NEWUSER     User and group IDs
UTS         CLONE_NEWUTS      Hostname and NIS domain name
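On Linux, every process exposes its namespaces as links under /proc/<pid>/ns, and two processes are in the same namespace when the link targets match. The helpers below are a small sketch of that check; the names ns_id and same_ns are made up for this example, and reading another user's process may require root.

```shell
# Print the namespace identifier of one type (pid, net, mnt, uts, ipc, user,
# cgroup) for a process. Defaults to the calling process.
ns_id() {
    readlink "/proc/${2:-self}/ns/$1"
}

# Succeed when two processes share the namespace of the given type.
same_ns() {
    a=$(ns_id "$1" "$2") && b=$(ns_id "$1" "$3") && [ "$a" = "$b" ]
}

# Identifier of the PID namespace this shell runs in.
ns_id pid

# Comparing the shell with itself always matches; replace the second PID
# with that of a container process to see a difference.
if same_ns net $$ $$; then
    echo "same network namespace"
fi
```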

Let's look at some examples-

unshare is a command that runs a program in new namespaces.

a) PID namespace-

sudo unshare --fork --pid --mount-proc bash

Inside this shell we cannot see the other processes running on the host. Our bash process is in a new, isolated PID namespace.

We can also assign the same namespace to two or more containers! For example, let's put two containers in the same PID namespace.

The option for doing this is --pid=container:<name>-

docker run -it --name alpine1 alpine sh
docker run -it --pid=container:alpine1 --name alpine2 alpine sh

Here we have started two containers, alpine1 and alpine2, in the same PID namespace.

b) Network namespace:

i) Creating new network namespace-

ip netns add myns

ii) Listing namespaces-

ip netns list

iii) Running bash inside myns network namespace:

ip netns exec myns bash

Note: an interface created inside this namespace will not be visible outside it.

c) nsenter command (runs a program inside the namespaces of an existing process)-

nsenter -t <PID_of_container> --all /bin/bash

D) Manually creating a container-like process:

First we need a filesystem to mount at root (/) for the container process.

i) Downloading and extracting the files and folders:

We can either do this manually layer by layer like below-

curl -s --output l2
mkdir ./rootfs
cd ./rootfs
tar -xvf ../l2

ii) Or we can directly copy the filesystem of a running container-

a) Add the lines below to the conf file (/etc/containers/registries.conf) to be able to use the custom registry:


b) Pulling the image and running container-

podman run -it

c) Now we will be able to copy

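d) Run the script below:

cd /root/Desktop/fs
# Create one cgroup under the cpu, cpuacct and memory controllers
cgcreate -g cpu,cpuacct,memory:my_cgroup
# Halve the default CPU weight (1024) and cap memory at about 1 GB
cgset -r cpu.shares=512 my_cgroup
cgset -r memory.limit_in_bytes=1000000000 my_cgroup
# Start a shell in new mount, UTS, IPC, PID and network namespaces, chrooted into the copied filesystem
cgexec -g "cpu,cpuacct,memory:my_cgroup" unshare -f -m -u -i -p -n --mount-proc chroot /root/Desktop/fs /bin/sh -c "/bin/mount -t proc proc /proc && /bin/sh"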

Thanks for reading!



