Instaclustr Blog Archive
2020
- Elasticsearch
- Technical
What Is Elasticsearch™?
Elasticsearch is an open source scalable search and analytics engine. It takes the open source search engine Apache Lucene and enables it to scale across many machines in a cluster to handle large volumes of data at high speed. Much like the index at the back of a book, the core data structure of Lucene...
Instaclustr, December 21, 2020
- Apache Kafka
- Kafka Connect
- Technical
Getting to Know Apache Camel Kafka Connectors (Pipeline Series Part 3)
In Part 1 and Part 2 of this blog series we started a journey building a real-time pipeline to acquire, ingest, graph, and map public tidal data using Apache Kafka, Kafka Connect, Elasticsearch, and Kibana. In this blog, we resume that journey and take an Apache “Camel” (Kafka Connector) through the desert (or the Australian Outback) to see if it is more or less robust than the connectors we previously tried.
Paul Brebner, December 17, 2020
- Apache Cassandra
- Technical
The Instaclustr LDAP Plugin for Apache Cassandra® 2.0, 3.0, and 4.0
LDAP (Lightweight Directory Access Protocol) is a common, vendor-neutral, lightweight protocol for organizing authentication of network services. Integrating with LDAP lets a company unify its security policies, so that a single user or entity can log in and authenticate against a variety of services.
Stefan Miklosovic, December 09, 2020
- Apache Kafka
- Elasticsearch
- Kafka Connect
- Technical
Building a Real-Time Tide Data Processing Pipeline: Using Apache Kafka®, Kafka Connect, Elasticsearch™, and Kibana™—Part 2
In Part 1 of this blog, we built a simple real-time data processing pipeline to take streaming tidal data from NOAA stations using Kafka connectors, and graph them in Elasticsearch and Kibana. We also tried viewing the data on a Kibana map but ran into a problem! In Part 2 we add the missing geo_points...
Paul Brebner, November 11, 2020
- Apache Kafka
- Elasticsearch
- Kafka Connect
- Technical
Building a Real-Time Tide Data Processing Pipeline: Using Apache Kafka®, Kafka Connect, Elasticsearch™, and Kibana™—Part 1
ApacheCon@Home is over for 2020 and was a resounding success, with close to 6,000 attendees from every continent. As a Platinum Sponsor, Instaclustr ran an ApacheCon Booth and this blog was originally presented on 30 September 2020 as a booth talk. I was one of the 1% attending from Australasia: This, unfortunately, meant I was...
Paul Brebner, November 05, 2020
- Apache Kafka
- Feature Releases
- Kafka Connect
- Technical
General Availability of Apache Kafka® and Kafka Connect 2.5.1
Instaclustr is pleased to announce the release of Apache Kafka version 2.5.1 on the Instaclustr Managed Platform. The new version is also available paired with version 2.5.1 of Kafka Connect. Apache Kafka is the leading distributed event streaming platform and is open source under the Apache 2.0 License. Instaclustr provides our customers managed and supported...
Paul Aubrey, October 22, 2020
- Apache Kafka
- Elasticsearch
- Technical
ELK Stack to EKK Stack (Elastic, Kibana, and Apache Kafka®): COVID-19 Data Analysis
A lot of people new to data science don’t know what to do after they write their first Python or R script. That web scraper may have run well on your laptop once, but you need to think about a streaming architecture that can handle multiple datasets. You need to not only store the results...
Instaclustr, October 15, 2020
- Apache Kafka
- Feature Releases
- Technical
Dedicated ZooKeeper for Apache Kafka®
Instaclustr is pleased to announce the release of dedicated Apache ZooKeeper nodes as an additional optional feature of our Managed Apache Kafka offering. Apache ZooKeeper is used for the management and coordination of nodes in Kafka.
Paul Brebner, October 12, 2020