Linux Namespaces

November 23rd, 2014

Starting from kernel 2.6.24, there are 6 different types of Linux namespaces. Namespaces are useful in isolating processes from the rest of the system, without needing to use full low level virtualization technology.

  • CLONE_NEWIPC: IPC Namespaces: SystemV IPC and POSIX Message Queues can be isolated.
  • CLONE_NEWPID: PID Namespaces: PIDs are isolated, meaning that a PID inside of the namespace can conflict with a PID outside of the namespace. PIDs inside the namespace will be mapped to other PIDs outside of the namespace. The first PID inside the namespace will be ‘1’ which outside of the namespace is assigned to init
  • CLONE_NEWNET: Network Namespaces: Networking (/proc/net, IPs, interfaces and routes) are isolated. Services can be run on the same ports within namespaces, and “duplicate” virtual interfaces can be created.
  • CLONE_NEWNS: Mount Namespaces. We have the ability to isolate mount points as they appear to processes. Using mount namespaces, we can achieve similar functionality to chroot() however with improved security.
  • CLONE_NEWUTS: UTS Namespaces. This namespaces primary purpose is to isolate the hostname and NIS name.
  • CLONE_NEWUSER: User Namespaces. Here, user and group IDs are different inside and outside of namespaces and can be duplicated.

Let’s look first at the structure of a C program, required to demonstrate process namespaces. The following has been tested on Debian 6 and 7.

First, we need to allocate a page of memory on the stack, and set a pointer to the end of that memory page. We use alloca to allocate stack memory rather than malloc which would allocate memory on the heap.

void *mem = alloca(sysconf(_SC_PAGESIZE)) + sysconf(_SC_PAGESIZE);

Next, we use clone to create a child process, passing the location of our child stack ‘mem’, as well as the required flags to specify a new namespace. We specify ‘callee’ as the function to execute within the child space:

mypid = clone(callee, mem, SIGCHLD | CLONE_NEWIPC | CLONE_NEWPID | CLONE_NEWNS | CLONE_FILES, NULL);

After calling clone we then wait for the child process to finish, before terminating the parent. If not, the parent execution flow will continue and terminate immediately after, clearing up the child with it:

while (waitpid(mypid, &r, 0) < 0 && errno == EINTR)
{
	continue;
}

Lastly, we’ll return to the shell with the exit code of the child:

if (WIFEXITED(r))
{
	return WEXITSTATUS(r);
}
return EXIT_FAILURE;

Now, let’s look at the callee function:

static int callee()
{
	int ret;
	mount("proc", "/proc", "proc", 0, "");
	setgid(u);
	setgroups(0, NULL);
	setuid(u);
	ret = execl("/bin/bash", "/bin/bash", NULL);
	return ret;
}

Here, we mount a /proc filesystem, and then set the uid (User ID) and gid (Group ID) to the value of ‘u’ before spawning the /bin/bash shell.
Read the rest of this entry »

Setting up an LVM filesystem

October 20th, 2009

Setting up an LVM filesystem is quite easy assuming you have the right tools installed and a recent kernel. LVM has a lot of advantages, most notably the ability to take snapshots of the current filesystem – this is why LVM is often used in live database environments.

Assuming a Debian Lenny machine, get the relevant packages. Some may already be installed:  apt-get install lvm2 dmsetup mdadm

In this example, we will assuming that /dev/sda is your boot drive, and that you want to leave it out of your LVM array, but include /dev/sdb and /dev/sdc. Both /dev/sdb and /dev/sdc should be of equal sizes.

Firstly, using fdisk, remove any existing partitions with ‘d’, on /dev/sdb and /dev/sdc, and create one new partition to span the drive. Change the partition type to ‘8e’ which is the LVM type.

Now prepare your physical disk for LVM with the ‘pvcreate’ tool:

pvcreate /dev/sdb1 /dev/sdc1

Note that you can reverse this with pvremove. You can also use pvdisplay now to display information on all physical volumes.

Oh – you do realie that you can use /dev/mdX just as easily to create LVM on your RAID devices?

Now, we need to create a ‘volume group’: vgcreate myvg /dev/sdb1 /dev/sdc1

Read the rest of this entry »

Local and Remote Kernel Upgrades – Failsafe Grub

February 28th, 2009

Grub (and LILO too for that matter) has a useful ‘failsafe’ feature that can be configured. This proves especially useful for remote kernel upgrades, where a failed boot will render your machine offline and unavailable.

Here is my standard grub config. I have just added my new 2.6.28 kernel.
Read the rest of this entry »

Linux PPTP (Poptop) VPN Setup with MPPE and MPPC

February 15th, 2009

Here’s a quick guide that I write as I’m setting up PPTP/MPPE/MPPC on a Linux server. My preferred VPN technology is OpenVPN mainly because it’s so quick and easy to set up and use, however in some cases PPTP is required chiefly when the Client wants to use the inbuilt Windows VPN capabilities rather than having to deploy 3rd party software.

My server is a Debian (of course) etch machine, with 2.6.24 (from source) kernel. My client is Windows XP Pro SP3.
Read the rest of this entry »

APNIC Box – Linux on a Mikrotik 532a, Part 3 – Installing Debian, Prebuilt Disk Image

October 5th, 2008

Follow on from 01 Oct 08 APNIC Box – Linux on a Mikrotik 532a, Part 2

The device runs a 2.4.30 kernel on a debian woody (mipsel) environment. If anyone can contribute anything for 2.6.x and debian etch, that would be great.

Installation instructions:

Read the rest of this entry »

Linux virtualization, vmware, xen, hosting, and squeezing the most out of your resources

September 14th, 2008

I’d guess that 90% of hosting providers ‘oversell’. This essentially means that should they have 1,000GB allocated, they might offer 15 packages of 100Gb to 15 of their customers, banking on the fact that no one will fully use their 100GB allocation – Selling 5 Virtual Machines with 256MB RAM on a 1GB host, assuming that no one will use their full RAM allocation. This is bad, because you’ll generally be able to confirm that you’ve been allocated the resources, but nonetheless benchmark tests will show that you’re just not getting them, and your environment will be sluggish and unresponsive. This is the same as airlines selling 110 seats on a 100 seat plane. When that 101st paying customer does show up to claim his seat, he’s stuck without a flight.

The general consensus is that a VPS is a cheaper and lower-grade option than a dedicated service, however VPSs have a number of indisputable advantages over dedicated servers and I’m going to discuss why almost all the dedicated machines I manage are hosts for a range of VPSs.
Read the rest of this entry »