Canberra Innovation Network, 1 Moore Street #5, Canberra, ACT 2601
Wednesday 19th February 2020

Meetup – Graph Visualisation and Real-time Geospatial Data Processing

Join us for a great lineup of talks. Learn about Cytoscape JS, an open source visualisation framework with graph functions such as path finding and multi-layout rendering, and hear about the promises and pains of using a graph engine in JavaScript. Also learn about geospatial data and how we added location data to a scalable real-time anomaly detection application built around Apache Kafka and Cassandra.

First Talk – Implementing Graph Visualization in JavaScript

Abstract: Graph databases and visualisations have seen an increase in popularity over the last 10 years. We are currently spoilt for choice when it comes to which graph database to choose. But what happens if you already have a data lake stood up? Do you really want to store your data in two separate locations? In this talk Mickey will introduce Cytoscape JS, an open source visualisation framework with graph functions such as path finding and multi-layout rendering. He will also discuss some of the promises and pains of using a graph engine in JavaScript, and share his code and tips on how to integrate Cytoscape JS into your own open source project.

Bio: This talk will be presented by Mickey Perre, Solutions Engineer at Splunk. Mickey sees himself as a traditional hacker who learns by doing, and by using a bit of elbow grease to get code working :). Before Splunk, Mickey worked in various roles across government, legal, gambling and healthcare. When Mickey isn’t coding he uses his time to experiment with home brewing. So you could say he likes to crush code, malt or both.

Second Talk – Massively Scalable Real-time Geospatial Data Processing with Apache Kafka and Cassandra

Abstract: Geospatial data makes it possible to leverage location, location, location! Geospatial data is taking off, as companies realize that just about everyone needs the benefits of geospatially aware applications. As a result, there is no shortage of unique but demanding use cases in which enterprises leverage large-scale, fast geospatial big data processing. The data must be processed in large quantities, and quickly, to reveal hidden spatiotemporal insights vital to businesses and their end users. In the rush to tap into geospatial data, many enterprises will find that representing, indexing and querying geospatially enriched data is more complex than they anticipated, and may involve tradeoffs between accuracy, latency, and throughput.

This presentation will explore how we added location data to a scalable real-time anomaly detection application built around Apache Kafka and Cassandra. Kafka and Cassandra are designed for time-series data; however, it’s not so obvious how they can process geospatial data. In order to find location-specific anomalies, we need a way to represent, index, and query locations. We explore alternative geospatial representations, including latitude/longitude points, bounding boxes, and geohashes, and go vertical with 3D representations, including 3D geohashes. To conclude, we measure and compare the query throughput of some of the solutions, and summarise the results in terms of accuracy vs. performance to answer the question “Which geospatial data representation and Cassandra implementation is best?”
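
For readers new to the representations named above, here is a minimal Python sketch of two of them: a bounding-box test and a 2D geohash encoder. It follows the widely published geohash scheme (alternating longitude and latitude bits, packed five at a time into base-32 characters). It is not code from the talk, and the names `in_bounding_box` and `geohash_encode` are hypothetical ones chosen for illustration.

```python
# Illustrative sketch only: function names are assumptions, not from the talk.
_BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # standard geohash alphabet


def in_bounding_box(lat, lon, south, west, north, east):
    """Return True if (lat, lon) lies inside the box (no antimeridian wrap)."""
    return south <= lat <= north and west <= lon <= east


def geohash_encode(lat, lon, precision=8):
    """Encode a latitude/longitude point as a geohash of `precision` characters."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    bits = 0          # accumulates 5 bits per output character
    bit_count = 0
    use_lon = True    # bits alternate, starting with longitude
    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                bits = (bits << 1) | 1
                lon_lo = mid
            else:
                bits <<= 1
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits = (bits << 1) | 1
                lat_lo = mid
            else:
                bits <<= 1
                lat_hi = mid
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:
            chars.append(_BASE32[bits])
            bits = 0
            bit_count = 0
    return "".join(chars)


if __name__ == "__main__":
    print(geohash_encode(57.64911, 10.40744, precision=5))
    print(in_bounding_box(57.64911, 10.40744, 57.0, 10.0, 58.0, 11.0))
```

A shorter geohash is always a prefix of a longer one for the same point, so a coarse geohash can act as a single-column lookup key for "everything in this cell", which is one reason the representation is attractive for key-based stores. Whether that is the right tradeoff for a given workload is the kind of question the talk sets out to measure.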

Bio: This talk will be presented by Paul Brebner. Paul is the Technology Evangelist at Instaclustr. He’s been learning new scalable big data technologies, solving realistic problems and building applications, and blogging about Apache Cassandra, Spark, Zeppelin, and Kafka. Paul has extensive R&D and industry experience in distributed systems, technology innovation, software architecture and engineering, software performance and scalability, grid and cloud computing, and data analytics and machine learning.

Program Agenda

5:30 – Welcome, food, drinks, networking

6:00 – First talk

6:45 – Second talk

Food, drinks, and giveaways will be provided!