Bio: This talk will be presented by Mickey Perre, Solutions Engineer at Splunk. Mickey sees himself as a traditional hacker who learns by doing and uses a bit of elbow grease to get code working :). Before Splunk, Mickey worked in various roles across government, legal, gambling and healthcare. When Mickey isn’t coding, he uses his time to experiment with home brewing. So you could say he likes to crush code, malt or both.
Second Talk – Massively Scalable Real-time Geospatial Data Processing with Apache Kafka and Cassandra
Abstract: Geospatial data makes it possible to leverage location, location, location! Geospatial data is taking off as companies realize that just about everyone needs the benefits of geospatially aware applications. As a result, there is no shortage of unique but demanding use cases in which enterprises leverage large-scale, fast geospatial data processing. The data must be processed in large quantities – and quickly – to reveal hidden spatiotemporal insights vital to businesses and their end users. In the rush to tap into geospatial data, many enterprises will find that representing, indexing and querying geospatially enriched data is more complex than they anticipated – and may force tradeoffs between accuracy, latency, and throughput.
This presentation will explore how we added location data to a scalable real-time anomaly detection application built around Apache Kafka and Cassandra. Kafka and Cassandra are designed for time-series data; however, it is not obvious how they can process geospatial data. To find location-specific anomalies, we need a way to represent, index, and query locations. We explore alternative geospatial representations, including latitude/longitude points, bounding boxes, and geohashes, and then go vertical with 3D representations, including 3D geohashes. To conclude, we measure and compare the query throughput of some of the solutions, and summarise the results in terms of accuracy vs. performance to answer the question “Which geospatial data representation and Cassandra implementation is best?”
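For readers new to geohashes, here is a minimal sketch of the standard 2D geohash encoding, which turns a latitude/longitude point into a short string that can be indexed like any other key. It is not the implementation used in the talk; the function name `geohash_encode` and its `precision` parameter are hypothetical, chosen only for illustration.

```python
# Standard geohash base-32 alphabet (digits plus letters, without a, i, l, o).
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash_encode(lat, lon, precision=8):
    """Encode a latitude/longitude point as a geohash string.

    Hypothetical helper for illustration: it repeatedly halves the
    longitude and latitude ranges, emitting one bit per halving
    (longitude first), and packs every 5 bits into one base-32 character.
    """
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    bits = 0
    bit_count = 0
    use_lon = True  # bits alternate between longitude and latitude

    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                bits = (bits << 1) | 1
                lon_lo = mid
            else:
                bits <<= 1
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits = (bits << 1) | 1
                lat_lo = mid
            else:
                bits <<= 1
                lat_hi = mid
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:
            chars.append(BASE32[bits])
            bits = 0
            bit_count = 0

    return "".join(chars)


if __name__ == "__main__":
    print(geohash_encode(42.6, -5.6, precision=5))  # ezs42
```

A shorter geohash is always a prefix of a longer one for the same point, so truncating the string widens the cell it covers. That is what makes the accuracy vs. performance tradeoff mentioned above tunable: a coarser prefix matches more neighbouring points at the cost of precision.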
Bio: This talk will be presented by Paul Brebner. Paul is the Technology Evangelist at Instaclustr. He’s been learning new scalable big data technologies, solving realistic problems and building applications, and blogging about Apache Cassandra, Spark, Zeppelin, and Kafka. Paul has extensive R&D and industry experience in distributed systems, technology innovation, software architecture and engineering, software performance and scalability, grid and cloud computing, and data analytics and machine learning.
5:30 – Welcome, food, drinks, networking
6:00 – First talk
6:45 – Second talk
Food, drinks, and giveaways will be provided!