Education Hub
We Are Committed to Open Source
Developed by large communities, open source software delivers benefits such as reduced costs, flexibility, transparency, security, and technology freedom.
10 tips for a successful data architecture strategy
A data architecture strategy is a framework that outlines how an organization manages its data assets to meet business requirements and achieve goals.
6 data architecture principles and how to implement them
Data architecture includes the design and organization of data assets, enabling the management, storage, and use of data within an enterprise.
7 pillars of Apache Spark performance tuning
Apache Spark performance tuning involves optimizing system configurations and application settings to improve the efficiency and performance of Spark jobs.
8 amazing Apache Spark use cases with code examples
Apache Spark is an open-source, distributed computing system for big data processing and analytics.
Apache Cassandra®
The database of choice for scalable, highly available, reliable, and high-performance applications.
Apache Cassandra on AWS: The basics and how to manage
Apache Cassandra is a highly scalable, open-source NoSQL database designed to handle large amounts of data across many commodity servers.
Apache Kafka®
Build your application on a fast, scalable, and distributed streaming platform.
Apache Kafka on AWS: Features, pricing, tutorial and best practices
Amazon Managed Streaming for Apache Kafka (Amazon MSK) is a fully managed service that simplifies running Apache Kafka on AWS.
Apache Kafka tutorial: Get started with Kafka in 5 simple steps
Apache Kafka is an open-source stream-processing software platform developed by LinkedIn and donated to the Apache Software Foundation.