LXD 2.0: Live migration [9/12]

This is the ninth blog post in this series about LXD 2.0.

LXD logo

Introduction

One of the very exciting feature of LXD 2.0, albeit experimental, is the support for container checkpoint and restore.

Simply put, checkpoint/restore means that the running container state can be serialized down to disk and then restored, either on the same host as a stateful snapshot of the container or on another host which equates to live migration.

Requirements

To have access to container live migration and stateful snapshots, you need the following:

  • A very recent Linux kernel, 4.4 or higher.
  • CRIU 2.0, possibly with some cherry-picked commits depending on your exact kernel configuration.
  • Run LXD directly on the host. It’s not possible to use those features with container nesting.
  • For migration, the target machine must at least implement the instruction set of the source, the target kernel must at least offer the same syscalls as the source and any kernel filesystem which was mounted on the source must also be mountable on the target.

All the needed dependencies are provided by Ubuntu 16.04 LTS, in which case, all you need to do is install CRIU itself:

apt install criu

Using the thing

Stateful snapshots

A normal container snapshot looks like:

stgraber@dakara:~$ lxc snapshot c1 first
stgraber@dakara:~$ lxc info c1 | grep first
 first (taken at 2016/04/25 19:35 UTC) (stateless)

A stateful snapshot instead looks like:

stgraber@dakara:~$ lxc snapshot c1 second --stateful
stgraber@dakara:~$ lxc info c1 | grep second
 second (taken at 2016/04/25 19:36 UTC) (stateful)

This means that all the container runtime state was serialized to disk and included as part of the snapshot. Restoring one such snapshot is done as you would a stateless one:

stgraber@dakara:~$ lxc restore c1 second
stgraber@dakara:~$

Stateful stop/start

Say you want to reboot your server for a kernel update or similar maintenance. Rather than have to wait for all the containers to start from scratch after reboot, you can do:

stgraber@dakara:~$ lxc stop c1 --stateful

The container state will be written to disk and then picked up the next time you start it.

You can even look at what the state looks like:

root@dakara:~# tree /var/lib/lxd/containers/c1/rootfs/state/
/var/lib/lxd/containers/c1/rootfs/state/
├── cgroup.img
├── core-101.img
├── core-102.img
├── core-107.img
├── core-108.img
├── core-109.img
├── core-113.img
├── core-114.img
├── core-122.img
├── core-125.img
├── core-126.img
├── core-127.img
├── core-183.img
├── core-1.img
├── core-245.img
├── core-246.img
├── core-50.img
├── core-52.img
├── core-95.img
├── core-96.img
├── core-97.img
├── core-98.img
├── dump.log
├── eventfd.img
├── eventpoll.img
├── fdinfo-10.img
├── fdinfo-11.img
├── fdinfo-12.img
├── fdinfo-13.img
├── fdinfo-14.img
├── fdinfo-2.img
├── fdinfo-3.img
├── fdinfo-4.img
├── fdinfo-5.img
├── fdinfo-6.img
├── fdinfo-7.img
├── fdinfo-8.img
├── fdinfo-9.img
├── fifo-data.img
├── fifo.img
├── filelocks.img
├── fs-101.img
├── fs-113.img
├── fs-122.img
├── fs-183.img
├── fs-1.img
├── fs-245.img
├── fs-246.img
├── fs-50.img
├── fs-52.img
├── fs-95.img
├── fs-96.img
├── fs-97.img
├── fs-98.img
├── ids-101.img
├── ids-113.img
├── ids-122.img
├── ids-183.img
├── ids-1.img
├── ids-245.img
├── ids-246.img
├── ids-50.img
├── ids-52.img
├── ids-95.img
├── ids-96.img
├── ids-97.img
├── ids-98.img
├── ifaddr-9.img
├── inetsk.img
├── inotify.img
├── inventory.img
├── ip6tables-9.img
├── ipcns-var-10.img
├── iptables-9.img
├── mm-101.img
├── mm-113.img
├── mm-122.img
├── mm-183.img
├── mm-1.img
├── mm-245.img
├── mm-246.img
├── mm-50.img
├── mm-52.img
├── mm-95.img
├── mm-96.img
├── mm-97.img
├── mm-98.img
├── mountpoints-12.img
├── netdev-9.img
├── netlinksk.img
├── netns-9.img
├── netns-ct-9.img
├── netns-exp-9.img
├── packetsk.img
├── pagemap-101.img
├── pagemap-113.img
├── pagemap-122.img
├── pagemap-183.img
├── pagemap-1.img
├── pagemap-245.img
├── pagemap-246.img
├── pagemap-50.img
├── pagemap-52.img
├── pagemap-95.img
├── pagemap-96.img
├── pagemap-97.img
├── pagemap-98.img
├── pages-10.img
├── pages-11.img
├── pages-12.img
├── pages-13.img
├── pages-1.img
├── pages-2.img
├── pages-3.img
├── pages-4.img
├── pages-5.img
├── pages-6.img
├── pages-7.img
├── pages-8.img
├── pages-9.img
├── pipes-data.img
├── pipes.img
├── pstree.img
├── reg-files.img
├── remap-fpath.img
├── route6-9.img
├── route-9.img
├── rule-9.img
├── seccomp.img
├── sigacts-101.img
├── sigacts-113.img
├── sigacts-122.img
├── sigacts-183.img
├── sigacts-1.img
├── sigacts-245.img
├── sigacts-246.img
├── sigacts-50.img
├── sigacts-52.img
├── sigacts-95.img
├── sigacts-96.img
├── sigacts-97.img
├── sigacts-98.img
├── signalfd.img
├── stats-dump
├── timerfd.img
├── tmpfs-dev-104.tar.gz.img
├── tmpfs-dev-109.tar.gz.img
├── tmpfs-dev-110.tar.gz.img
├── tmpfs-dev-112.tar.gz.img
├── tmpfs-dev-114.tar.gz.img
├── tty.info
├── unixsk.img
├── userns-13.img
└── utsns-11.img

0 directories, 154 files

Restoring the container can be done with a simple:

stgraber@dakara:~$ lxc start c1

Live migration

Live migration is basically the same as the stateful stop/start above, except that the container directory and configuration happens to be moved to another machine too.

stgraber@dakara:~$ lxc list c1
+------+---------+-----------------------+----------------------------------------------+------------+-----------+
| NAME |  STATE  |          IPV4         |                     IPV6                     |    TYPE    | SNAPSHOTS |
+------+---------+-----------------------+----------------------------------------------+------------+-----------+
| c1   | RUNNING | 10.178.150.197 (eth0) | 2001:470:b368:4242:216:3eff:fe19:27b0 (eth0) | PERSISTENT | 2         |
+------+---------+-----------------------+----------------------------------------------+------------+-----------+

stgraber@dakara:~$ lxc list s-tollana:
+------+-------+------+------+------+-----------+
| NAME | STATE | IPV4 | IPV6 | TYPE | SNAPSHOTS |
+------+-------+------+------+------+-----------+

stgraber@dakara:~$ lxc move c1 s-tollana:

stgraber@dakara:~$ lxc list c1
+------+-------+------+------+------+-----------+
| NAME | STATE | IPV4 | IPV6 | TYPE | SNAPSHOTS |
+------+-------+------+------+------+-----------+

stgraber@dakara:~$ lxc list s-tollana:
+------+---------+-----------------------+----------------------------------------------+------------+-----------+
| NAME |  STATE  |          IPV4         |                     IPV6                     |    TYPE    | SNAPSHOTS |
+------+---------+-----------------------+----------------------------------------------+------------+-----------+
| c1   | RUNNING | 10.178.150.197 (eth0) | 2001:470:b368:4242:216:3eff:fe19:27b0 (eth0) | PERSISTENT | 2         |
+------+---------+-----------------------+----------------------------------------------+------------+-----------+

Limitations

As I said before, checkpoint/restore of containers is still pretty new and we’re still very much working on this feature, fixing issues as we are made aware of them. We do need more people trying this feature and sending us feedback, I would however not recommend using this in production just yet.

The current list of issues we’re tracking is available on Launchpad.

We expect a basic Ubuntu container with a few services to work properly with CRIU in Ubuntu 16.04. However more complex containers, using device passthrough, complex network services or special storage configurations are likely to fail.

Whenever possible, CRIU will fail at dump time, rather than at restore time. In such cases, the source container will keep running, the snapshot or migration will simply fail and a log file will be generated for debugging.

In rare cases, CRIU fails to restore the container, in which case the source container will still be around but will be stopped and will have to be manually restarted.

Sending bug reports

We’re tracking bugs related to checkpoint/restore against the CRIU Ubuntu package on Launchpad. Most of the work to fix those bugs will then happen upstream either on CRIU itself or the Linux kernel, but it’s easier for us to track things this way.

To file a new bug report, head here.

Please make sure to include:

  • The command you ran and the error message as displayed to you
  • Output of “lxc info” (*)
  • Output of “lxc info <container name>”
  • Output of “lxc config show –expanded <container name>”
  • Output of “dmesg” (*)
  • Output of “/proc/self/mountinfo” (*)
  • Output of “lxc exec <container name> — cat /proc/self/mountinfo”
  • Output of “uname -a” (*)
  • The content of /var/log/lxd.log (*)
  • The content of /etc/default/lxd-bridge (*)
  • A tarball of /var/log/lxd/<container name>/ (*)

If reporting a migration bug as opposed to a stateful snapshot or stateful stop bug, please include the data for both the source and target for any of the above which has been marked with a (*).

Extra information

The CRIU website can be found at: https://criu.org

The main LXD website is at: https://linuxcontainers.org/lxd
Development happens on Github at: https://github.com/lxc/lxd
Mailing-list support happens on: https://lists.linuxcontainers.org
IRC support happens in: #lxcontainers on irc.freenode.net
Try LXD online: https://linuxcontainers.org/lxd/try-it

Posted in Canonical voices, LXD, Planet Ubuntu | Tagged | 2 Comments

Directly interacting with the LXD API

The next post in the LXD series is currently blocked on a pending kernel fix, so I figured I’d do an out of series post on how to use the LXD API directly.

LXD logo

Setting up the LXD daemon

The LXD REST API can be accessed over either a local Unix socket or over HTTPs. The protocol in both case is identical, the only difference being that the Unix socket is plain text, relying on the filesystem for authentication.

To enable remote connections to you LXD daemon, run:

lxc config set core.https_address "[::]:8443"

This will have it bind all addresses on port 8443.

To setup a trust relationship with a new client, a password is required, you can set one with:

lxc config set core.trust_password <some random password>

Local or remote

curl over unix socket

As mentioned above, the Unix socket doesn’t need authentication, so with a recent version of curl, you can just do:

stgraber@castiana:~$ curl --unix-socket /var/lib/lxd/unix.socket s/
{"type":"sync","status":"Success","status_code":200,"metadata":["/1.0"]}

Not the most readable output. You can make it a lot more readable by using jq:

stgraber@castiana:~$ curl -s --unix-socket /var/lib/lxd/unix.socket s/ | jq .
{
 "type": "sync",
 "status": "Success",
 "status_code": 200,
 "metadata": [
  "/1.0"
 ]
}

curl over the network (and client authentication)

The REST API is authenticated by the use of client certificates. LXD generates one when you first use the command line client, so we’ll be using that one, but you could generate your own with openssl if you wanted to.

First, lets confirm that this particular certificate isn’t trusted:

curl -s -k --cert ~/.config/lxc/client.crt --key ~/.config/lxc/client.key https://127.0.0.1:8443/1.0 | jq .metadata.auth
"untrusted"

Now, lets tell the server to add it by giving it the password that we set earlier:

stgraber@castiana:~$ curl -s -k --cert ~/.config/lxc/client.crt --key ~/.config/lxc/client.key https://127.0.0.1:8443/1.0/certificates -X POST -d '{"type": "client", "password": "some-password"}' | jq .
{
 "type": "sync",
 "status": "Success",
 "status_code": 200,
 "metadata": {}
}

And now confirm that we are properly authenticated:

stgraber@castiana:~$ curl -s -k --cert ~/.config/lxc/client.crt --key ~/.config/lxc/client.key https://127.0.0.1:8443/1.0 | jq .metadata.auth
"trusted"

And confirm that things look the same as over the Unix socket:

stgraber@castiana:~$ curl -s -k --cert ~/.config/lxc/client.crt --key ~/.config/lxc/client.key https://127.0.0.1:8443/ | jq .
{
 "type": "sync",
 "status": "Success",
 "status_code": 200,
 "metadata": [
  "/1.0"
 ]
}

Walking through the API

To keep the commands short, all my examples will be using the local Unix socket, you can add the arguments shown above to get this to work over the HTTPs connection.

Note that in an untrusted environment (so anything but localhost), you should also pass LXD the expected server certificate so that you can confirm that you’re talking to the right machine and aren’t the target of a man in the middle attack.

Server information

You can get server runtime information with:

stgraber@castiana:~$ curl -s --unix-socket /var/lib/lxd/unix.socket a/1.0 | jq .
 {
 "type": "sync",
 "status": "Success",
 "status_code": 200,
 "metadata": {
  "api_extensions": [],
  "api_status": "stable",
  "api_version": "1.0",
  "auth": "trusted",
  "config": {
   "core.https_address": "[::]:8443",
   "core.trust_password": true,
   "storage.zfs_pool_name": "encrypted/lxd"
  },
  "environment": {
   "addresses": [
    "192.168.54.140:8443",
    "10.212.54.1:8443",
    "[2001:470:b368:4242::1]:8443"
   ],
   "architectures": [
    "x86_64",
    "i686"
   ],
   "certificate": "BIG PEM BLOB",
   "driver": "lxc",
   "driver_version": "2.0.0",
   "kernel": "Linux",
   "kernel_architecture": "x86_64",
   "kernel_version": "4.4.0-18-generic",
   "server": "lxd",
   "server_pid": 26227,
   "server_version": "2.0.0",
   "storage": "zfs",
   "storage_version": "5"
  },
  "public": false
 }
}

Everything except the config section is read-only and so doesn’t need to be sent back when updating, so say we want to unset that trust password and have LXD stop listening over https, we can do that with:

stgraber@castiana:~$ curl -s --unix-socket /var/lib/lxd/unix.socket -X PUT -d '{"config": {"storage.zfs_pool_name": "encrypted/lxd"}}' a/1.0 | jq .
{
 "type": "sync",
 "status": "Success",
 "status_code": 200,
 "metadata": {}
}

Operations

For anything that could take more than a second, LXD will use a background operation. That’s to make it easier for the client to do multiple requests in parallel and to limit the number of connections to the server.

You can list all current operations with:

stgraber@castiana:~$ curl -s --unix-socket /var/lib/lxd/unix.socket a/1.0/operations | jq .
{
 "type": "sync",
 "status": "Success",
 "status_code": 200,
 "metadata": {
  "running": [
   "/1.0/operations/008bc02e-21a0-4070-a28c-633b79a46517"
  ]
 }
}

And get more details on it with:

stgraber@castiana:~$ curl -s --unix-socket /var/lib/lxd/unix.socket a/1.0/operations/008bc02e-21a0-4070-a28c-633b79a46517 | jq .
{
 "type": "sync",
 "status": "Success",
 "status_code": 200,
 "metadata": {
  "id": "008bc02e-21a0-4070-a28c-633b79a46517",
  "class": "task",
  "created_at": "2016-04-18T22:24:54.469437937+01:00",
  "updated_at": "2016-04-18T22:25:22.42813972+01:00",
  "status": "Running",
  "status_code": 103,
  "resources": {
   "containers": [
    "/1.0/containers/blah"
   ]
  },
  "metadata": {
   "download_progress": "48%"
  },
  "may_cancel": false,
  "err": ""
 }
}

In this case, it was me creating a new container called “blah” and the image is tracking the needed download, in this case of the Ubuntu 14.04 image.

You can subscribe to all operation notifications by using the /1.0/events websocket, or if your client isn’t that smart, you can just block on the operation with:

curl -s --unix-socket /var/lib/lxd/unix.socket a/1.0/operations/b1f57056-c79b-4d3c-94bf-50b5c47a85ad/wait | jq .

Which will print a copy of the operation status (same as above) once the operation reaches a terminal state (success, failure or canceled).

The other endpoints

The REST API currently has the following endpoints:

  • /
    • /1.0
      • /1.0/certificates
        • /1.0/certificates/<fingerprint>
      • /1.0/containers
        • /1.0/containers/<name>
          • /1.0/containers/<name>/exec
        • /1.0/containers/<name>/files
        • /1.0/containers/<name>/snapshots
        • /1.0/containers/<name>/snapshots/<name>
        • /1.0/containers/<name>/state
        • /1.0/containers/<name>/logs
        • /1.0/containers/<name>/logs/<logfile>
      • /1.0/events
      • /1.0/images
        • /1.0/images/<fingerprint>
          • /1.0/images/<fingerprint>/export
      • /1.0/images/aliases
        • /1.0/images/aliases/<name>
      • /1.0/networks
        • /1.0/networks/<name>
      • /1.0/operations
        • /1.0/operations/<uuid>
          • /1.0/operations/<uuid>/wait
        • /1.0/operations/<uuid>/websocket
      • /1.0/profiles
        • /1.0/profiles/<name>

Detailed documentation on the various actions for each of them can be found here.

Basic container life-cycle

Going through absolutely everything above would make this blog post enormous, so lets just focus on the most basic things, creating a container, starting it, dealing with files a bit, creating a snapshot and deleting the whole thing.

Create

To create a container named “xenial” from an Ubuntu 16.04 image coming from https://cloud-images.ubuntu.com/daily (also known as ubuntu-daily:16.04 in the client), you need to run:

stgraber@castiana:~$ curl -s --unix-socket /var/lib/lxd/unix.socket -X POST -d '{"name": "xenial", "source": {"type": "image", "protocol": "simplestreams", "server": "https://cloud-images.ubuntu.com/daily", "alias": "16.04"}}' a/1.0/containers | jq .
{
 "type": "async",
 "status": "Operation created",
 "status_code": 100,
 "metadata": {
  "id": "e2714931-470e-452a-807c-c1be19cdff0d",
  "class": "task",
  "created_at": "2016-04-18T22:36:20.935649438+01:00",
  "updated_at": "2016-04-18T22:36:20.935649438+01:00",
  "status": "Running",
  "status_code": 103,
  "resources": {
   "containers": [
    "/1.0/containers/xenial"
   ]
  },
  "metadata": null,
  "may_cancel": false,
  "err": ""
 },
 "operation": "/1.0/operations/e2714931-470e-452a-807c-c1be19cdff0d"
}

This confirms that the container creation was received. We can check for progress with:

stgraber@castiana:~$ curl -s --unix-socket /var/lib/lxd/unix.socket a/1.0/operations/e2714931-470e-452a-807c-c1be19cdff0d | jq .
{
 "type": "sync",
 "status": "Success",
 "status_code": 200,
 "metadata": {
  "id": "e2714931-470e-452a-807c-c1be19cdff0d",
  "class": "task",
  "created_at": "2016-04-18T22:36:20.935649438+01:00",
  "updated_at": "2016-04-18T22:36:31.135038483+01:00",
  "status": "Running",
  "status_code": 103,
  "resources": {
   "containers": [
    "/1.0/containers/xenial"
   ]
  },
  "metadata": {
  "download_progress": "19%"
 },
 "may_cancel": false,
 "err": ""
 }
}

And finally wait until it’s done with:

stgraber@castiana:~$ curl -s --unix-socket /var/lib/lxd/unix.socket a/1.0/operations/e2714931-470e-452a-807c-c1be19cdff0d/wait | jq .
{
 "type": "sync",
 "status": "Success",
 "status_code": 200,
 "metadata": {
  "id": "e2714931-470e-452a-807c-c1be19cdff0d",
  "class": "task",
  "created_at": "2016-04-18T22:36:20.935649438+01:00",
  "updated_at": "2016-04-18T22:38:01.385511623+01:00",
  "status": "Success",
  "status_code": 200,
  "resources": {
   "containers": [
    "/1.0/containers/xenial"
   ]
  },
  "metadata": {
   "download_progress": "100%"
  },
  "may_cancel": false,
  "err": ""
 }
}

Start

Starting the container is done my modifying its running state:

stgraber@castiana:~$ curl -s --unix-socket /var/lib/lxd/unix.socket -X PUT -d '{"action": "start"}' a/1.0/containers/xenial/state | jq .
{
 "type": "async",
 "status": "Operation created",
 "status_code": 100,
 "metadata": {
  "id": "614ac9f0-f0fc-4351-9e6f-14710fd93542",
  "class": "task",
  "created_at": "2016-04-18T22:39:42.766830946+01:00",
  "updated_at": "2016-04-18T22:39:42.766830946+01:00",
  "status": "Running",
  "status_code": 103,
  "resources": {
   "containers": [
    "/1.0/containers/xenial"
   ]
  },
  "metadata": null,
  "may_cancel": false,
  "err": ""
 },
 "operation": "/1.0/operations/614ac9f0-f0fc-4351-9e6f-14710fd93542"
}

If you’re doing this by hand as I am right now, there’s no way you can actually access that operation and wait for it to finish as it’s very very quick and data about past operations disappears 5 seconds after they’re done.

You can however check the container state:

stgraber@castiana:~$ curl -s --unix-socket /var/lib/lxd/unix.socket -X GET a/1.0/containers/xenial/state | jq .metadata.status
"Running"

Or even get its IP address(es) with:

stgraber@castiana:~$ curl -s --unix-socket /var/lib/lxd/unix.socket -X GET a/1.0/containers/xenial/state | jq .metadata.network.eth0.addresses
[
 {
  "family": "inet",
  "address": "10.212.54.43",
  "netmask": "24",
  "scope": "global"
 },
 {
  "family": "inet6",
  "address": "2001:470:b368:4242:216:3eff:fe17:331c",
  "netmask": "64",
  "scope": "global"
 },
 {
  "family": "inet6",
  "address": "fe80::216:3eff:fe17:331c",
  "netmask": "64",
  "scope": "link"
 }
]

Read a file

Reading a file from the container is ridiculously easy:

stgraber@castiana:~$ curl -s --unix-socket /var/lib/lxd/unix.socket -X GET a/1.0/containers/xenial/files?path=/etc/hosts
127.0.0.1 localhost

# The following lines are desirable for IPv6 capable hosts
::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
ff02::3 ip6-allhosts

Push a file

Pushing a file is only more difficult because you need to set the Content-Type to octet-stream:

stgraber@castiana:~$ curl -s --unix-socket /var/lib/lxd/unix.socket -X POST -H "Content-Type: application/octet-stream" -d 'abc' a/1.0/containers/xenial/files?path=/tmp/a
{"type":"sync","status":"Success","status_code":200,"metadata":{}}

We can then confirm it worked with:

stgraber@castiana:~$ curl -s --unix-socket /var/lib/lxd/unix.socket -X GET a/1.0/containers/xenial/files?path=/tmp/a
abc

Snapshot

To make a snapshot, just run:

stgraber@castiana:~$ curl -s --unix-socket /var/lib/lxd/unix.socket -X POST -d '{"name": "my-snapshot"}' a/1.0/containers/xenial/snapshots | jq .
{
 "type": "async",
 "status": "Operation created",
 "status_code": 100,
 "metadata": {
  "id": "d68141de-0c13-419c-a21c-13e30de29154",
  "class": "task",
  "created_at": "2016-04-18T22:54:04.148986484+01:00",
  "updated_at": "2016-04-18T22:54:04.148986484+01:00",
  "status": "Running",
  "status_code": 103,
  "resources": {
   "containers": [
    "/1.0/containers/xenial"
   ]
  },
  "metadata": null,
  "may_cancel": false,
  "err": ""
 },
 "operation": "/1.0/operations/d68141de-0c13-419c-a21c-13e30de29154"
}

And you can then get all the details about it:

stgraber@castiana:~$ curl -s --unix-socket /var/lib/lxd/unix.socket -X GET a/1.0/containers/xenial/snapshots/my-snapshot | jq .
{
 "type": "sync",
 "status": "Success",
 "status_code": 200,
 "metadata": {
  "architecture": "x86_64",
  "config": {
   "volatile.base_image": "0b06c2858e2efde5464906c93eb9593a29bf46d069cf8d007ada81e5ab80613c",
   "volatile.eth0.hwaddr": "00:16:3e:17:33:1c",
   "volatile.last_state.idmap": "[{"Isuid":true,"Isgid":false,"Hostid":100000,"Nsid":0,"Maprange":65536},{"Isuid":false,"Isgid":true,"Hostid":100000,"Nsid":0,"Maprange":65536}]"
  },
  "created_at": "2016-04-18T21:54:04Z",
  "devices": {
   "root": {
    "path": "/",
    "type": "disk"
   }
  },
  "ephemeral": false,
  "expanded_config": {
   "volatile.base_image": "0b06c2858e2efde5464906c93eb9593a29bf46d069cf8d007ada81e5ab80613c",
   "volatile.eth0.hwaddr": "00:16:3e:17:33:1c",
   "volatile.last_state.idmap": "[{"Isuid":true,"Isgid":false,"Hostid":100000,"Nsid":0,"Maprange":65536},{"Isuid":false,"Isgid":true,"Hostid":100000,"Nsid":0,"Maprange":65536}]"
  },
  "expanded_devices": {
   "eth0": {
    "name": "eth0",
    "nictype": "bridged",
    "parent": "lxdbr0",
    "type": "nic"
   },
   "root": {
    "path": "/",
    "type": "disk"
   }
  },
  "name": "xenial/my-snapshot",
  "profiles": [
   "default"
  ],
  "stateful": false
 }
}

Delete

You can’t delete a running container, so first you must stop it with:

stgraber@castiana:~$ curl -s --unix-socket /var/lib/lxd/unix.socket -X PUT -d '{"action": "stop", "force": true}' a/1.0/containers/xenial/state | jq .
{
 "type": "async",
 "status": "Operation created",
 "status_code": 100,
 "metadata": {
  "id": "97945ec9-f9b0-4fa8-aaba-06e41a9bc2a9",
  "class": "task",
  "created_at": "2016-04-18T22:56:18.28952729+01:00",
  "updated_at": "2016-04-18T22:56:18.28952729+01:00",
  "status": "Running",
  "status_code": 103,
  "resources": {
   "containers": [
    "/1.0/containers/xenial"
   ]
  },
  "metadata": null,
  "may_cancel": false,
  "err": ""
 },
 "operation": "/1.0/operations/97945ec9-f9b0-4fa8-aaba-06e41a9bc2a9"
}

Then you can delete it with:

stgraber@castiana:~$ curl -s --unix-socket /var/lib/lxd/unix.socket -X DELETE a/1.0/containers/xenial | jq .
{
 "type": "async",
 "status": "Operation created",
 "status_code": 100,
 "metadata": {
  "id": "439bf4a1-e056-4b76-86ad-bff06169fce1",
  "class": "task",
  "created_at": "2016-04-18T22:56:22.590239576+01:00",
  "updated_at": "2016-04-18T22:56:22.590239576+01:00",
  "status": "Running",
  "status_code": 103,
  "resources": {
   "containers": [
    "/1.0/containers/xenial"
   ]
  },
  "metadata": null,
  "may_cancel": false,
  "err": ""
 },
 "operation": "/1.0/operations/439bf4a1-e056-4b76-86ad-bff06169fce1"
}

And confirm it’s gone:

stgraber@castiana:~$ curl -s --unix-socket /var/lib/lxd/unix.socket a/1.0/containers/xenial | jq .
{
 "error": "not found",
 "error_code": 404,
 "type": "error"
}

Conclusion

The LXD API has been designed to be simple yet powerful, it can easily be used through even the most simple client but also supports advanced features to allow more complex clients to be very efficient.

Our REST API is stable which means that any change we make to it will be fully backward compatible with the API as it was in LXD 2.0. We will only be doing additions to it, no removal or change of behavior for the existing end points.

Support for new features can be detected by the client by looking at the “api_extensions” list from GET /1.0. We currently do not advertise any but will no doubt make use of this very soon.

Extra information

The main LXD website is at: https://linuxcontainers.org/lxd
Development happens on Github at: https://github.com/lxc/lxd
Mailing-list support happens on: https://lists.linuxcontainers.org
IRC support happens in: #lxcontainers on irc.freenode.net
Try LXD online: https://linuxcontainers.org/lxd/try-it

Posted in Canonical voices, LXD, Planet Ubuntu | Tagged | 3 Comments

LXD 2.0: LXD in LXD [8/12]

This is the eighth blog post in this series about LXD 2.0.

LXD logo

Introduction

In the previous post I covered how to run Docker inside LXD which is a good way to get access to the portfolio of application provided by Docker while running in the safety of the LXD environment.

One use case I mentioned was offering a LXD container to your users and then have them use their container to run Docker. Well, what if they themselves want to run other Linux distributions inside their container using LXD, or even allow another group of people to have access to a Linux system by running a container for them?

Turns out, LXD makes it very simple to allow your users to run nested containers.

Nesting LXD

The most simple case can be shown by using an Ubuntu 16.04 image. Ubuntu 16.04 cloud images come with LXD pre-installed. The daemon itself isn’t running as it’s socket-activated so it doesn’t use any resources until you actually talk to it.

So lets start an Ubuntu 16.04 container with nesting enabled:

lxc launch ubuntu-daily:16.04 c1 -c security.nesting=true

You can also set the security.nesting key on an existing container with:

lxc config set <container name> security.nesting true

Or for all containers using a particular profile with:

lxc profile set <profile name> security.nesting true

With that container started, you can now get a shell inside it, configure LXD and spawn a container:

stgraber@dakara:~$ lxc launch ubuntu-daily:16.04 c1 -c security.nesting=true
Creating c1
Starting c1

stgraber@dakara:~$ lxc exec c1 bash
root@c1:~# lxd init
Name of the storage backend to use (dir or zfs): dir

We detected that you are running inside an unprivileged container.
This means that unless you manually configured your host otherwise,
you will not have enough uid and gid to allocate to your containers.

LXD can re-use your container's own allocation to avoid the problem.
Doing so makes your nested containers slightly less safe as they could
in theory attack their parent container and gain more privileges than
they otherwise would.

Would you like to have your containers share their parent's allocation (yes/no)? yes
Would you like LXD to be available over the network (yes/no)? no
Do you want to configure the LXD bridge (yes/no)? yes
Warning: Stopping lxd.service, but it can still be activated by:
 lxd.socket
LXD has been successfully configured.

root@c1:~# lxc launch ubuntu:14.04 trusty
Generating a client certificate. This may take a minute...
If this is your first time using LXD, you should also run: sudo lxd init

Creating trusty
Retrieving image: 100%
Starting trusty

root@c1:~# lxc list
+--------+---------+-----------------------+----------------------------------------------+------------+-----------+
|  NAME  |  STATE  |         IPV4          |                     IPV6                     |    TYPE    | SNAPSHOTS |
+--------+---------+-----------------------+----------------------------------------------+------------+-----------+
| trusty | RUNNING | 10.153.141.124 (eth0) | fd7:f15d:d1d6:da14:216:3eff:fef1:4002 (eth0) | PERSISTENT | 0         |
+--------+---------+-----------------------+----------------------------------------------+------------+-----------+
root@c1:~#

It really is that simple!

The online demo server

As this post is pretty short, I figured I would spend a bit of time to talk about the demo server we’re running. We also just reached the 10000 sessions mark earlier today!

That server is basically just a normal LXD running inside a pretty beefy virtual machine with a tiny daemon implementing the REST API used by our website.

When you accept the terms of service, a new LXD container is created for you with security.nesting enabled as we saw above. You are then attached to that container as you would when using “lxc exec” except that we’re doing it using websockets and javascript.

The containers you then create inside this environment are all nested LXD containers.
You can then nest even further in there if you want to.

We are using the whole range of LXD resource limitations to prevent one user’s actions from impacting the others and pretty closely monitor the server for any sign of abuse.

If you want to run your own similar server, you can grab the code for our website and the daemon with:

git clone https://github.com/lxc/linuxcontainers.org
git clone https://github.com/lxc/lxd-demo-server

Extra information

The main LXD website is at: https://linuxcontainers.org/lxd
Development happens on Github at: https://github.com/lxc/lxd
Mailing-list support happens on: https://lists.linuxcontainers.org
IRC support happens in: #lxcontainers on irc.freenode.net
Try LXD online: https://linuxcontainers.org/lxd/try-it

Posted in Canonical voices, LXD, Planet Ubuntu | Tagged | 9 Comments

LXD 2.0: Docker in LXD [7/12]

This is the seventh blog post in this series about LXD 2.0.

LXD logo

Why run Docker inside LXD

As I briefly covered in the first post of this series, LXD’s focus is system containers. That is, we run a full unmodified Linux distribution inside our containers. LXD for all intent and purposes doesn’t care about the workload running in the container. It just sets up the container namespaces and security policies, then spawns /sbin/init and waits for the container to stop.

Application containers such as those implemented by Docker or Rkt are pretty different in that they are used to distribute applications, will typically run a single main process inside them and be much more ephemeral than a LXD container.

Those two container types aren’t mutually exclusive and we certainly see the value of using Docker containers to distribute applications. That’s why we’ve been working hard over the past year to make it possible to run Docker inside LXD.

This means that with Ubuntu 16.04 and LXD 2.0, you can create containers for your users who will then be able to connect into them just like a normal Ubuntu system and then run Docker to install the services and applications they want.

Requirements

There are a lot of moving pieces to make all of this working and we got it all included in Ubuntu 16.04:

  • A kernel with CGroup namespace support (4.4 Ubuntu or 4.6 mainline)
  • LXD 2.0 using LXC 2.0 and LXCFS 2.0
  • A custom version of Docker (or one built with all the patches that we submitted)
  • A Docker image which behaves when confined by user namespaces, or alternatively make the parent LXD container a privileged container (security.privileged=true)

Running a basic Docker workload

Enough talking, lets run some Docker containers!

First of all, you need an Ubuntu 16.04 container which you can get with:

lxc launch ubuntu-daily:16.04 docker -p default -p docker

The “-p default -p docker” instructs LXD to apply both the “default” and “docker” profiles to the container. The default profile contains the basic network configuration while the docker profile tells LXD to load a few required kernel modules and set up some mounts for the container. The docker profile also enables container nesting.

Now lets make sure the container is up to date and install docker:

lxc exec docker -- apt update
lxc exec docker -- apt dist-upgrade -y
lxc exec docker -- apt install docker.io -y

And that’s it! You’ve got Docker installed and running in your container.
Now lets start a basic web service made of two Docker containers:

stgraber@dakara:~$ lxc exec docker -- docker run --detach --name app carinamarina/hello-world-app
Unable to find image 'carinamarina/hello-world-app:latest' locally
latest: Pulling from carinamarina/hello-world-app
efd26ecc9548: Pull complete 
a3ed95caeb02: Pull complete 
d1784d73276e: Pull complete 
72e581645fc3: Pull complete 
9709ddcc4d24: Pull complete 
2d600f0ec235: Pull complete 
c4cf94f61cbd: Pull complete 
c40f2ab60404: Pull complete 
e87185df6de7: Pull complete 
62a11c66eb65: Pull complete 
4c5eea9f676d: Pull complete 
498df6a0d074: Pull complete 
Digest: sha256:6a159db50cb9c0fbe127fb038ed5a33bb5a443fcdd925ec74bf578142718f516
Status: Downloaded newer image for carinamarina/hello-world-app:latest
c8318f0401fb1e119e6c5bb23d1e706e8ca080f8e44b42613856ccd0bf8bfb0d

stgraber@dakara:~$ lxc exec docker -- docker run --detach --name web --link app:helloapp -p 80:5000 carinamarina/hello-world-web
Unable to find image 'carinamarina/hello-world-web:latest' locally
latest: Pulling from carinamarina/hello-world-web
efd26ecc9548: Already exists 
a3ed95caeb02: Already exists 
d1784d73276e: Already exists 
72e581645fc3: Already exists 
9709ddcc4d24: Already exists 
2d600f0ec235: Already exists 
c4cf94f61cbd: Already exists 
c40f2ab60404: Already exists 
e87185df6de7: Already exists 
f2d249ff479b: Pull complete 
97cb83fe7a9a: Pull complete 
d7ce7c58a919: Pull complete 
Digest: sha256:c31cf04b1ab6a0dac40d0c5e3e64864f4f2e0527a8ba602971dab5a977a74f20
Status: Downloaded newer image for carinamarina/hello-world-web:latest
d7b8963401482337329faf487d5274465536eebe76f5b33c89622b92477a670f

With those two Docker containers now running, we can then get the IP address of our LXD container and access the service!

stgraber@dakara:~$ lxc list
+--------+---------+----------------------+----------------------------------------------+------------+-----------+
|  NAME  |  STATE  |         IPV4         |                      IPV6                    |    TYPE    | SNAPSHOTS |
+--------+---------+----------------------+----------------------------------------------+------------+-----------+
| docker | RUNNING | 172.17.0.1 (docker0) | 2001:470:b368:4242:216:3eff:fe55:45f4 (eth0) | PERSISTENT | 0         |
|        |         | 10.178.150.73 (eth0) |                                              |            |           |
+--------+---------+----------------------+----------------------------------------------+------------+-----------+

stgraber@dakara:~$ curl http://10.178.150.73
The linked container said... "Hello World!"

Conclusion

That’s it! It’s really that simple to run Docker containers inside a LXD container.

Now as I mentioned earlier, not all Docker images will behave as well as my example, that’s typically because of the extra confinement that comes with LXD, specifically the user namespace.

Only the overlayfs storage driver of Docker works in this mode. That storage driver may come with its own set of limitation which may further limit how many images will work in this environment.

If your workload doesn’t work properly and you trust the user inside the LXD container, you can try:

lxc config set docker security.privileged true
lxc restart docker

That will de-activate the user namespace and will run the container in privileged mode.
Note however that in this mode, root inside the container is the same uid as root on the host. There are a number of known ways for users to escape such containers and gain root privileges on the host, so you should only ever do that if you’d trust the user inside your LXD container with root privileges on the host.

Extra information

The main LXD website is at: https://linuxcontainers.org/lxd
Development happens on Github at: https://github.com/lxc/lxd
Mailing-list support happens on: https://lists.linuxcontainers.org
IRC support happens in: #lxcontainers on irc.freenode.net
Try LXD online: https://linuxcontainers.org/lxd/try-it

Posted in Canonical voices, LXD, Planet Ubuntu | Tagged | 11 Comments

LXD 2.0: Remote hosts and container migration [6/12]

This is the sixth blog post in this series about LXD 2.0.

LXD logo

Remote protocols

LXD 2.0 supports two protocols:

  • LXD 1.0 API: That’s the REST API used between the clients and a LXD daemon as well as between LXD daemons when copying/moving images and containers.
  • Simplestreams: The Simplestreams protocol is a read-only, image-only protocol used by both the LXD client and daemon to get image information and import images from some public image servers (like the Ubuntu images).

Everything below will be using the first of those two.

Security

Authentication for the LXD API is done through client certificate authentication over TLS 1.2 using recent ciphers. When two LXD daemons must exchange information directly, a temporary token is generated by the source daemon and transferred through the client to the target daemon. This token may only be used to access a particular stream and is immediately revoked so cannot be re-used.

To avoid Man In The Middle attacks, the client tool also sends the certificate of the source server to the target. That means that for a particular download operation, the target server is provided with the source server URL, a one-time access token for the resource it needs and the certificate that the server is supposed to be using. This prevents MITM attacks and only give temporary access to the object of the transfer.

Network requirements

LXD 2.0 uses a model where the target of an operation (the receiving end) is connecting directly to the source to fetch the data.

This means that you must ensure that the target server can connect to the source directly, updating any needed firewall along the way.

We have a plan to allow this to be reversed and also to allow proxying through the client itself for those rare cases where draconian firewalls are preventing any communication between the two hosts.

Interacting with remote hosts

Rather than having our users have to always provide hostname or IP addresses and then validating certificate information whenever they want to interact with a remote host, LXD is using the concept of “remotes”.

By default, the only real LXD remote configured is “local:” which also happens to be the default remote (so you don’t have to type its name). The local remote uses the LXD REST API to talk to the local daemon over a unix socket.

Adding a remote

Say you have two machines with LXD installed, your local machine and a remote host that we’ll call “foo”.

First you need to make sure that “foo” is listening to the network and has a password set, so get a remote shell on it and run:

lxc config set core.https_address [::]:8443
lxc config set core.trust_password something-secure

Now on your local LXD, we just need to make it visible to the network so we can transfer containers and images from it:

lxc config set core.https_address [::]:8443

Now that the daemon configuration is done on both ends, you can add “foo” to your local client with:

lxc remote add foo 1.2.3.4

(replacing 1.2.3.4 by your IP address or FQDN)

You’ll see something like this:

stgraber@dakara:~$ lxc remote add foo 2607:f2c0:f00f:2770:216:3eff:fee1:bd67
Certificate fingerprint: fdb06d909b77a5311d7437cabb6c203374462b907f3923cefc91dd5fce8d7b60
ok (y/n)? y
Admin password for foo: 
Client certificate stored at server: foo

You can then list your remotes and you’ll see “foo” listed there:

stgraber@dakara:~$ lxc remote list
+-----------------+-------------------------------------------------------+---------------+--------+--------+
|      NAME       |                         URL                           |   PROTOCOL    | PUBLIC | STATIC |
+-----------------+-------------------------------------------------------+---------------+--------+--------+
| foo             | https://[2607:f2c0:f00f:2770:216:3eff:fee1:bd67]:8443 | lxd           | NO     | NO     |
+-----------------+-------------------------------------------------------+---------------+--------+--------+
| images          | https://images.linuxcontainers.org:8443               | lxd           | YES    | NO     |
+-----------------+-------------------------------------------------------+---------------+--------+--------+
| local (default) | unix://                                               | lxd           | NO     | YES    |
+-----------------+-------------------------------------------------------+---------------+--------+--------+
| ubuntu          | https://cloud-images.ubuntu.com/releases              | simplestreams | YES    | YES    |
+-----------------+-------------------------------------------------------+---------------+--------+--------+
| ubuntu-daily    | https://cloud-images.ubuntu.com/daily                 | simplestreams | YES    | YES    |
+-----------------+-------------------------------------------------------+---------------+--------+--------+

Interacting with it

Ok, so we have a remote server defined, what can we do with it now?

Well, just about everything you saw in the posts until now, the only difference being that you must tell LXD what host to run against.

For example:

lxc launch ubuntu:14.04 c1

Will run on the default remote (“lxc remote get-default”) which is your local host.

lxc launch ubuntu:14.04 foo:c1

Will instead run on foo.

Listing running containers on a remote host can be done with:

stgraber@dakara:~$ lxc list foo:
+------+---------+---------------------+-----------------------------------------------+------------+-----------+
| NAME |  STATE  |         IPV4        |                     IPV6                      |    TYPE    | SNAPSHOTS |
+------+---------+---------------------+-----------------------------------------------+------------+-----------+
| c1   | RUNNING | 10.245.81.95 (eth0) | 2607:f2c0:f00f:2770:216:3eff:fe43:7994 (eth0) | PERSISTENT | 0         |
+------+---------+---------------------+-----------------------------------------------+------------+-----------+

One thing to keep in mind is that you have to specify the remote host for both images and containers. So if you have a local image called “my-image” on “foo” and want to create a container called “c2” from it, you have to run:

lxc launch foo:my-image foo:c2

Finally, getting a shell into a remote container works just as you would expect:

lxc exec foo:c1 bash

Copying containers

Copying containers between hosts is as easy as it sounds:

lxc copy foo:c1 c2

And you’ll have a new local container called “c2” created from a copy of the remote “c1” container. This requires “c1” to be stopped first, but you could just copy a snapshot instead and do it while the source container is running:

lxc snapshot foo:c1 current
lxc copy foo:c1/current c3

Moving containers

Unless you’re doing live migration (which will be covered in a later post), you have to stop the source container prior to moving it, after which everything works as you’d expect.

lxc stop foo:c1
lxc move foo:c1 local:

This example is functionally identical to:

lxc stop foo:c1
lxc move foo:c1 c1

How this all works

Interactions with remote containers work as you would expect, rather than using the REST API over a local Unix socket, LXD just uses the exact same API over a remote HTTPS transport.

Where it gets a bit trickier is when interaction between two daemons must occur, as is the case for copy and move.

In those cases the following happens:

  1. The user runs “lxc move foo:c1 c1”.
  2. The client contacts the local: remote to check for an existing “c1” container.
  3. The client fetches container information from “foo”.
  4. The client requests a migration token from the source “foo” daemon.
  5. The client sends that migration token as well as the source URL and “foo”‘s certificate to the local LXD daemon alongside the container configuration and devices.
  6. The local LXD daemon then connects directly to “foo” using the provided token
    1. It connects to a first control websocket
    2. It negotiates the filesystem transfer protocol (zfs send/receive, btrfs send/receive or plain rsync)
    3. If available locally, it unpacks the image which was used to create the source container. This is to avoid needless data transfer.
    4. It then transfers the container and any of its snapshots as a delta.
  7. If succesful, the client then instructs “foo” to delete the source container.

Try all this online

Don’t have two machines to try remote interactions and moving/copying containers?

That’s okay, you can test it all online using our demo service.
The included step-by-step walkthrough even covers it!

Extra information

The main LXD website is at: https://linuxcontainers.org/lxd
Development happens on Github at: https://github.com/lxc/lxd
Mailing-list support happens on: https://lists.linuxcontainers.org
IRC support happens in: #lxcontainers on irc.freenode.net

Posted in Canonical voices, LXD, Planet Ubuntu | Tagged | 5 Comments