All posts by Theuni

Introducing IO limits to achieve more uniform virtual disk performance

With the upcoming release of our Gentoo platform we will start to regulate the disk performance of all virtual machines. The new rules will help to achieve more uniform performance in our cluster and reduce the impact that load peaks from individual VMs have on others.

In the past we have provided the entire performance capacity of our cluster based on demand without any regulation. Load peaks requiring many IOPS (input/output operations per second) could thus be processed quickly. In a mixed environment this usually evens out. However, we are seeing more and more periods when too many VMs have load peaks at the same time. Instead of evening out, those periods result in performance penalties for the remaining virtual machines.

We are therefore introducing a limit on the number of operations per second to achieve satisfactory performance for all customers even during periods of increased load.

Ceph performance learnings (long read)

We have been using Ceph since version 0.7x back in 2013. We started when we were fed up with the open source iSCSI implementations and wanted to provide our customers with a more elastic, manageable, and scalable solution. Ceph has generally fulfilled its promises from the perspective of functionality. However, if you have been following this blog or searched for Ceph troubles on Google you will likely have seen our previous posts.

Aside from early software stability issues, we had to invest a good amount of manpower (and nerves) into learning how to make Ceph perform acceptably and how all the pieces (hard drives, SSDs, RAID controllers, 1 and 10 Gbit networks, CPU and RAM consumption, Ceph configuration, Qemu drivers, …) fit together.

Today, I’d like to present our learnings from both a technical and a methodical point of view. The methodical aspects in particular are informed by the experience of running a production cluster for a comparatively long time, through version upgrades, hardware changes, and so on. Even if you won’t be bitten by the specific issues of the 0.7x series, the methods may prove useful in the future to avoid navigating into troublesome waters. No promises, though. 🙂

batou – recent improvements and roadmap update

batou is our open source web application deployment utility. We use it to perform simple and complex application deployments on top of the Flying Circus platform as well as into Vagrant VMs or local developer instances.

At the recent Plone Alpine City Sprint we took the time to improve batou’s documentation a lot. You can read it at https://batou.readthedocs.org. The basic concepts (modelling your application and fitting it into an environment) should be easier to understand, and we have written a full guide to all the important features if you want to get started. The API reference and the CLI commands are now covered almost completely as well.

Support during Christmas and New Year’s 2015/2016

2015 has been a blast and we’re looking forward to a few quiet days. (Although we’ll be working on some more storage improvements next week.)

We hope that you and your loved ones will find some quiet time during the holidays.

To ensure that all your applications in the Flying Circus are running smoothly, we will monitor all regular and emergency support as usual. We won’t be performing non-critical work during this time and will catch up with any backlog early in January 2016.

Here’s an overview of the next days and our support availability. The highlighted days are national or local holidays and are only covered for SLA customers:

  • 2015-12-21 (Monday): regular support
  • 2015-12-22 (Tuesday): regular support
  • 2015-12-23 (Wednesday): regular support
  • 2015-12-24 (Thursday): SLA-covered emergency support only
  • 2015-12-25 (Friday): SLA-covered emergency support only
  • 2015-12-26 (Saturday): SLA-covered emergency support only
  • 2015-12-27 (Sunday): SLA-covered emergency support only
  • 2015-12-28 (Monday): emergency support
  • 2015-12-29 (Tuesday): emergency support
  • 2015-12-30 (Wednesday): emergency support
  • 2015-12-31 (Thursday): SLA-covered emergency support only
  • 2016-01-01 (Friday): SLA-covered emergency support only
  • 2016-01-02 (Saturday): SLA-covered emergency support only
  • 2016-01-03 (Sunday): SLA-covered emergency support only
  • 2016-01-04 (Monday): regular support
  • 2016-01-05 (Tuesday): regular support
  • 2016-01-06 (Wednesday): SLA-covered emergency support only
  • 2016-01-07 (Thursday): regular office hours resuming

Happy holidays and see you in 2016!