ApacheCon Berlin, 22-24 October 2019

Apache Cassandra
Popular
Elasticsearch
Apache Kafka
Technical

December 02, 2019
By Paul Brebner

ApacheCon Europe, October 22-24, 2019, Kulturbrauerei Berlin #ACEU19 https://aceu19.apachecon.com/

What’s better than one ApacheCon? Another ApacheCon! This year there were two Apache Conferences, one in Las Vegas and then again in Berlin.

They were similar but different. What were some differences between ApacheCon Berlin and Las Vegas? The location. In contrast to the hyper-real gambling oasis in a bone dry desert of Las Vegas, Berlin is a real capital city steeped in history. The venue was a historic brewery (once the largest in the world, but unfortunately no longer brewing so also “dry”, the Kulturbrauerei):

ApacheCon Berlin 2019 - Kulturbrauerei (Source: Paul Brebner)

It has multiple night clubs, and the concrete-bunker like main auditorium of the Kesselhaus (boiler house) is the perfect venue for a heavy metal concert (and the conference keynotes). You also can’t escape the history, as next door to the conference there was a museum of everyday life in East Germany before the wall came down 30 years ago. The East Germans were often creative to cope with the restrictions and shortages of life under Communism, and came up with innovations such as this “camping car”!

(Source: Paul Brebner)

The Berlin ApacheCon was also smaller, but in a more compact location and with less tracks (General, Community, Machine Learning, IoT, Big Data), so on average the talks had more buzz than Las Vegas, with an environment more conducive to catching up with people more than once for ongoing conservations afterwards. There was also an Open Source Design workshop focusing on Usability). I’d met some of the people involved in this at the speakers’ reception and had a lively dinner conversation (perhaps because I was the only non-designer at the table so asked lots of silly questions). It’s good to see Open Source UX design getting the attention it deserves, as once upon a time “Open Source” was synonymous with “Badly Designed’!

This “State of the feather” talk by David Nalley (Executive Vice President, ASF), espoused the Apache Way resulting in vendor neutrality, independence, trust, and safety for contributors and users (and the photo (Source: Paul Brebner) reveals something of the industrial ambience of the boiler house):

Instaclustr was proud to be one of the ApacheCon EU sponsors:

I had the privilege of kicking off the Machine Learning track held in the (appropriately named) “Maschinenhaus” with my talk “Kafka, Cassandra, and Kubernetes at Scale—Real-time Anomaly detection on 19 billion events a day”.

(Source: Paul Brebner)

My second talk of the day was in the more intimate venue, the “Frannz Salon”, in the IoT track, on “Kongo: Building a Scalable Streaming IoT Application using Apache Kafka”.

I managed to attend some talks by other speakers at ApacheCon Berlin. These are some of the highlights.

This IoT talk intersected with mine in terms of the problem domain (real-time RFID asset tracking), but provided a good explanation of Apache Flink, including Flink pattern matching which is a powerful CEP library: “7 Reasons to use Apache Flink for your IoT Project – How We Built a Real-time Asset Tracking System”.

In the Big Data track, there was a talk on the hot topic of how to use Kubernetes to run Apache software: Patterns and Anti-Patterns of running Apache bigdata projects in Kubernetes.

And finally a talk about an impressive Use Case for Apache Kafka, monitoring the largest machine in the world, the Large Hadron Collider at CERN: “Open Source Big Data Tools accelerating physics research at CERN”. Monitoring data had to be collected from 100s of accelerators and detector controllers, experiment data catalogues, data centres and system logs. Kafka is used as the buffered transport to collect and direct monitoring data, Spark is used to perform intermediate processing including data enrichment and aggregation, storage is provided by HDFS, Elasticsearch etc., and data analysis by Kibana, Grafana, and Zeppelin. This pipeline handles a peak of 500GB of monitoring data a day (with capacity for more). Another innovation is a system called SWAN, to provide ephemeral Spark clusters on Kubernetes, for user-managed Spark resources (provision, control, use, and dispose). A similar scalable pipeline enabled by fully managed Kafka and Elasticsearch is available from Instaclustr.

(Source: Shutterstock)

Berlin was a thought-provoking historical location for ApacheCon Europe. For approaching 30 years (1961-1989) the wall had divided East and West Germany, with incompatible political, economic, and social systems pitted in a stand-off with only meters separating them, and only a few tightly controlled points of access bridging them. However, 30 years ago the wall came down and Berlin was reunified resulting in rapid social and economic changes. In a similar way, over the last 20 years, Apache has shifted the software world from Licences and Lock-in to Free and Open Source, and created a vibrant ecosystem of projects, people, and software. I look forward to repeating the experience next year.

After ApacheCon I had a chance to explore a different sort of history. The Berlin technology museum had an impressive display of old computers, including a reconstruction of the 1st programmable (using punched movie film) digital (but mechanical) computer built in 1938 by Konrad Zuse, the Z1. Perhaps more significantly, in the 1940’s Zuse also designed the first high-level programming language, Plankalkül, evidently conceptually decades ahead of other high-level languages (he included array and parallel processing, and goal-directed execution), and it probably influenced Algol.

Z1 computer in the Berlin Technology Museum (Photograph by Mike Peel)

Finally, the friendly Instaclustr team at our stand at ApacheCon Berlin. If you didn’t have a chance to talk to us in person at ApacheCon, contact us at [email protected].

(Source: Paul Brebner)

ApacheCon Europe, October 22-24, 2019, Kulturbrauerei Berlin #ACEU19 https://aceu19.apachecon.com/

Instaclustr was proud to be one of the ApacheCon EU sponsors:

IoT Overdrive Part 2: Where Can We Improve?

Instaclustr for Apache ZooKeeper ® 3.7.2 and 3.8.4 are Generally Available

Running the Apache Camel™ HTTP Kafka Source Connector on Instaclustr Managed Apache Kafka®