A) Registry and image layers
A container registry is simply a server that hosts images (.tar files along with a .json file) for multiple remote users. It is similar to Docker Hub (your own private Docker Hub!). To understand image layers, we will manually download each layer and rebuild the file system from them, mimicking how the isolated file system is created when we run a container :)
1. To start a registry server, run the command below (it automatically pulls the image from the official Docker repo and runs the container):
docker run -d -p 5000:5000 --restart=always --name registry registry:latest
2. Pushing our custom image to our new registry:
a) Retagging alpine for our registry-
docker pull alpine:latest
docker tag alpine:latest 127.0.0.1:5000/my-alpine
docker push 127.0.0.1:5000/my-alpine
b) Confirming push:
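One way to confirm the push is to ask the registry which repositories it holds. The sketch below assumes the registry from step 1 is reachable over plain HTTP at 127.0.0.1:5000 and uses the Registry HTTP API catalog endpoint (/v2/_catalog). The function names and the sample response are made up for illustration.

```python
import json
import urllib.request


def parse_catalog(body):
    """Return the repository names from a /v2/_catalog JSON response."""
    return json.loads(body).get("repositories", [])


def list_repositories(registry="http://127.0.0.1:5000"):
    """Fetch the catalog from a live registry (address is an assumption)."""
    with urllib.request.urlopen(f"{registry}/v2/_catalog") as resp:
        return parse_catalog(resp.read().decode("utf-8"))


# Hand-written sample of what the registry might answer after the push.
sample_response = '{"repositories": ["my-alpine"]}'
print("my-alpine" in parse_catalog(sample_response))
```

With the registry running, calling list_repositories() should include my-alpine in the returned list.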
3. Pulling the image:
a) To see what actually happens in the background, we can proxy the podman traffic through Burp Suite-
Note: for those unfamiliar with podman, it is an alternative to docker for managing containers. It does not require a daemon running in the background all the time, so it can run containers with the user's permissions instead of running them as root by default, as docker does.
4. Recreating the file system manually:
a) First we find out how many layers the image has-
We found 2 layers. Let’s try to download them one by one.
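The blobs to download are listed in the image manifest, which the registry serves at /v2/<name>/manifests/<tag>. The sketch below is one way to turn a manifest into blob URLs. The sample manifest is hand-written for illustration and reuses the two digests from the curl commands in this article. Which digest is the config and which is the layer is inferred from the later steps, and the function names are hypothetical.

```python
import json
import urllib.request

MANIFEST_V2 = "application/vnd.docker.distribution.manifest.v2+json"


def fetch_manifest(registry, repo, tag="latest"):
    """GET the manifest; defined but not called here, since it needs a live registry."""
    req = urllib.request.Request(
        f"{registry}/v2/{repo}/manifests/{tag}",
        headers={"Accept": MANIFEST_V2},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def blob_urls(registry, repo, manifest):
    """Return download URLs: the config blob first, then each layer in order."""
    digests = []
    config = manifest.get("config", {}).get("digest")
    if config:
        digests.append(config)
    digests.extend(layer["digest"] for layer in manifest.get("layers", []))
    return [f"{registry}/v2/{repo}/blobs/{d}" for d in digests]


# Illustrative manifest (sizes omitted); not captured from a real registry.
sample_manifest = {
    "schemaVersion": 2,
    "mediaType": MANIFEST_V2,
    "config": {
        "mediaType": "application/vnd.docker.container.image.v1+json",
        "digest": "sha256:0a97eee8041e2b6c0e65abb2700b0705d0da5525ca69060b9e0bde8a3d17afdb",
    },
    "layers": [
        {
            "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
            "digest": "sha256:97518928ae5f3d52d4164b314a7e73654eb686ecd8aafa0b79acd980773a740d",
        }
    ],
}

for url in blob_urls("http://127.0.0.1:5000", "my-alpine", sample_manifest):
    print(url)
```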
b) Download layers one by one-
curl -s http://192.0.0.7:5000/v2/my-alpine/blobs/sha256:0a97eee8041e2b6c0e65abb2700b0705d0da5525ca69060b9e0bde8a3d17afdb --output l1
curl -s http://192.0.0.7:5000/v2/my-alpine/blobs/sha256:97518928ae5f3d52d4164b314a7e73654eb686ecd8aafa0b79acd980773a740d --output l2
c) Checking the type of each file (it will be either JSON or a tar archive)-
The JSON file contains metadata, whereas the tar archive contains the actual files and folders.
d) Now we can just extract the tar archive to get the actual filesystem-
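The same type check can be done by looking at the first bytes of each downloaded blob. This is a minimal sketch, assuming layers arrive either as plain tar or as gzip-compressed tar (registries commonly serve the latter). sniff_blob is a hypothetical helper, and the blobs below are built in memory rather than downloaded.

```python
import gzip
import io
import json
import tarfile


def sniff_blob(data):
    """Classify a blob as 'gzip', 'tar', 'json' or 'unknown' from its content."""
    if data[:2] == b"\x1f\x8b":          # gzip magic number
        return "gzip"
    if data[257:262] == b"ustar":        # tar magic at offset 257 of the header
        return "tar"
    try:
        json.loads(data.decode("utf-8"))
        return "json"
    except ValueError:                   # covers decode and JSON parse errors
        return "unknown"


# Build a tiny tar archive in memory to stand in for a layer.
buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w") as tar:
    payload = b"alpine\n"
    info = tarfile.TarInfo("etc/hostname")
    info.size = len(payload)
    tar.addfile(info, io.BytesIO(payload))
raw_tar = buf.getvalue()

print(sniff_blob(raw_tar))
print(sniff_blob(gzip.compress(raw_tar)))
print(sniff_blob(b'{"architecture": "amd64"}'))
```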
tar -xvf l2
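When an image has more than one layer, the layers are unpacked in order into the same directory, so files from later layers replace files from earlier ones. The sketch below shows that idea with two small in-memory layers. make_layer and extract_layers are hypothetical helpers, and the sketch deliberately ignores whiteout (.wh.) entries, which real images use to delete files from lower layers.

```python
import io
import os
import tarfile
import tempfile


def make_layer(path, files):
    """Write a small gzip-compressed tar to `path` (stand-in for a downloaded layer)."""
    with tarfile.open(path, mode="w:gz") as tar:
        for name, content in files.items():
            info = tarfile.TarInfo(name)
            info.size = len(content)
            tar.addfile(info, io.BytesIO(content))


def extract_layers(layer_paths, rootfs):
    """Unpack layers in order; "r:*" lets tarfile detect compression itself."""
    os.makedirs(rootfs, exist_ok=True)
    for path in layer_paths:
        with tarfile.open(path, mode="r:*") as tar:
            tar.extractall(rootfs)  # only do this with archives you trust


with tempfile.TemporaryDirectory() as tmp:
    l1 = os.path.join(tmp, "l1.tar.gz")
    l2 = os.path.join(tmp, "l2.tar.gz")
    make_layer(l1, {"etc/hostname": b"base\n", "bin/tool": b"v1"})
    make_layer(l2, {"etc/hostname": b"top\n"})
    rootfs = os.path.join(tmp, "fs")
    extract_layers([l1, l2], rootfs)
    with open(os.path.join(rootfs, "etc/hostname"), "rb") as fh:
        hostname = fh.read()
    print(hostname)
```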
B) Cgroups
Cgroups (control groups) are a kernel feature used for resource management for processes. There are two versions, 1 and 2 (version 1 is the default on the system used here, and the commands below use version 1 files such as memory.limit_in_bytes). Resources like CPU, RAM, network access, and so on are called controllers in cgroup terminology.
1) Creating a memory cgroup-
sudo cgcreate -g memory:test
2) Setting a 4 MB maximum for memory usage-
sudo cgset -r memory.limit_in_bytes=4194304 test
3) Applying the restriction and starting a new process (memeat):
memeat is a memory stress-testing tool. It gets killed if it cannot obtain the amount of memory it asks for. Since we set a 4 MB limit on the memory cgroup test, processes under this cgroup are not allowed to exceed that limit.
sudo cgexec -g memory:test ~/Downloads/memeat 1M
sudo cgexec -g memory:test ~/Downloads/memeat 3M
But if we now try to consume 5 MB, the process is stopped automatically.
sudo cgexec -g memory:test ~/Downloads/memeat 5M
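The arithmetic behind the three runs can be sketched as follows. This assumes the M suffix passed to memeat means mebibytes (1024 * 1024 bytes), which matches the 4194304 value given to cgset. parse_size is a hypothetical helper, not part of memeat or the cgroup tools.

```python
LIMIT_BYTES = 4 * 1024 * 1024  # 4194304, the value given to memory.limit_in_bytes


def parse_size(text):
    """Convert strings such as '5M' to bytes (assumes binary K/M/G suffixes)."""
    units = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}
    suffix = text[-1].upper()
    if suffix in units:
        return int(text[:-1]) * units[suffix]
    return int(text)


for request in ("1M", "3M", "5M"):
    fits = parse_size(request) <= LIMIT_BYTES
    print(request, "fits under the limit" if fits else "exceeds the limit")
```

Only the 5M request is larger than the limit, which is why that run is the one that gets stopped.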
You can get memeat from the repo below. You need to compile it first :)
GitHub - cristiklein/memeat: A utility to eat memory under Linux
This utility uses mmap() and mlock() to eat up RAM. This prevents the kernel from using RAM for more useful purposes…
C) Namespaces
While cgroups control how much of a resource a process can use, namespaces control what a process can see and access. They isolate global system resources between independent processes. The lsns command lists the existing namespaces. The namespace types are:
- PID - Isolates process IDs
- UTS - Isolates hostnames and domain names
- Network - Isolates network interfaces
- IPC - Isolates interprocess communication (IPC) resources
- Mount - Isolates filesystem mount points
- User - Isolates UID/GID number spaces
- Cgroup - Isolates the cgroup root directory
Namespace  Flag             Isolates
Cgroup     CLONE_NEWCGROUP  Cgroup root directory
IPC        CLONE_NEWIPC     System V IPC, POSIX message queues
Network    CLONE_NEWNET     Network devices, stacks, ports, etc.
Mount      CLONE_NEWNS      Mount points
PID        CLONE_NEWPID     Process IDs
User       CLONE_NEWUSER    User and group IDs
UTS        CLONE_NEWUTS     Hostname and NIS domain name
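On Linux, every process exposes its namespaces as symbolic links under /proc/<pid>/ns, which is also where lsns gets its information. The sketch below reads those links for the current process. namespaces_of is a hypothetical helper, and it returns an empty dict on systems without /proc.

```python
import os


def namespaces_of(pid="self"):
    """Map namespace name -> link target (e.g. 'pid:[<inode>]') for a process."""
    ns_dir = f"/proc/{pid}/ns"
    result = {}
    if not os.path.isdir(ns_dir):
        return result  # not Linux, or /proc is not mounted
    for name in sorted(os.listdir(ns_dir)):
        try:
            result[name] = os.readlink(os.path.join(ns_dir, name))
        except OSError:
            continue  # entry vanished or is not readable
    return result


for name, target in namespaces_of().items():
    print(f"{name:18} {target}")
```

Two processes are in the same namespace of a given type when these link targets are identical.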
Let's see some examples-
unshare is a command that can be used to create namespaces (https://man7.org/linux/man-pages/man1/unshare.1.html).
a) PID namespace-
sudo unshare --fork --pid --mount-proc bash
As you can see we are unable to see other processes running. Our bash process is in a new isolated PID namespace.
We can also assign the same namespace to two or more containers! For example, let's put two containers in the same PID namespace.
The option for doing this is --pid-
docker run -it --name alpine1 alpine sh
docker run -it --pid=container:alpine1 --name alpine2 alpine sh
Here we have just started two containers named alpine1 and alpine2 in the same PID namespace.
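To check from the host that two processes really share a namespace, their /proc/<pid>/ns links can be compared. This is a minimal sketch: share_namespace is a hypothetical helper, the PIDs of the container processes would have to be looked up separately, and reading another user's /proc entries may need root.

```python
import os


def share_namespace(pid_a, pid_b, kind="pid"):
    """Return True if both processes point at the same namespace of type `kind`."""
    link_a = os.readlink(f"/proc/{pid_a}/ns/{kind}")
    link_b = os.readlink(f"/proc/{pid_b}/ns/{kind}")
    return link_a == link_b


if os.path.exists("/proc/self/ns/pid"):
    # A process trivially shares every namespace with itself.
    me = os.getpid()
    print(share_namespace(me, me))
```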
b) Network namespace:
i) Creating new network namespace-
ip netns add myns
ii) Listing namespaces-
ip netns list
iii) Running bash inside myns network namespace:
ip netns exec myns bash
Note: interfaces inside this namespace are not visible from outside it.
c) nsenter command (runs a program inside the namespaces of another process)-
nsenter -t <PID_of_container> --all /bin/bash
D) Manually creating a container-like process:
First we need a filesystem that we will mount at root (/) for the container process.
i) Downloading and extracting files and folders:
We can either do this manually layer by layer like below-
curl -s http://22.214.171.124:5000/v2/my-alpine/blobs/sha256:97518928ae5f3d52d4164b314a7e73654eb686ecd8aafa0b79acd980773a740d --output l2
tar -xvf ../l2
ii) Or we can directly copy the filesystem of a running container-
a) Add the lines below to the conf file (/etc/containers/registries.conf) to be able to use a custom registry:
b) Pulling the image and running container-
podman run -it 126.96.36.199:5000/my-alpine:latest
c) Now we will be able to copy the container's filesystem-
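d) Run the script below:
# Create one cgroup spanning the cpu, cpuacct and memory controllers
cgcreate -g cpu,cpuacct,memory:apna_cgroup
# Set a relative CPU weight and a memory cap of about 1 GB
cgset -r cpu.shares=512 apna_cgroup
cgset -r memory.limit_in_bytes=1000000000 apna_cgroup
# Start a shell in new mount, UTS, IPC, PID and network namespaces, chrooted into the extracted filesystem
cgexec -g "cpu,cpuacct,memory:apna_cgroup" unshare -f -m -u -i -p -n --mount-proc chroot /root/Desktop/fs /bin/sh -c "/bin/mount -t proc proc /proc && /bin/sh"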
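To connect the unshare options in the script with the CLONE_* flags from the namespace table, the sketch below maps each short option to its flag and combines them. The numeric values are the ones defined in the Linux headers. The dictionary and variable names are made up for illustration, and -f is left out because it only asks unshare to fork and does not create a namespace.

```python
# Flag values as defined in the Linux kernel headers (sched.h).
CLONE_FLAGS = {
    "CLONE_NEWNS": 0x00020000,      # mount
    "CLONE_NEWCGROUP": 0x02000000,  # cgroup
    "CLONE_NEWUTS": 0x04000000,     # hostname
    "CLONE_NEWIPC": 0x08000000,     # IPC
    "CLONE_NEWUSER": 0x10000000,    # user
    "CLONE_NEWPID": 0x20000000,     # PID
    "CLONE_NEWNET": 0x40000000,     # network
}

# Short unshare options used in the script and the flag each one requests.
UNSHARE_OPTIONS = {
    "-m": "CLONE_NEWNS",
    "-u": "CLONE_NEWUTS",
    "-i": "CLONE_NEWIPC",
    "-p": "CLONE_NEWPID",
    "-n": "CLONE_NEWNET",
}

combined = 0
for option, flag in UNSHARE_OPTIONS.items():
    combined |= CLONE_FLAGS[flag]
    print(option, flag, hex(CLONE_FLAGS[flag]))

print("combined flags:", hex(combined))
```

The user and cgroup namespaces are not requested by the script, so the resulting process still shares them with the host.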
Thanks for reading!