Introduction
JSONJet is a document-based, real-time stream processing engine. It is best explained with a simple example:
create stream sensor_data;
create stream alerts;
create flow data_processor as
  sensor_data
  | where temperature > 70
  | select {
      sensor_id: sensor_id,
      message: "temperature too high",
      temperature: temperature,
      timestamp: timestamp,
      level: iff(temperature >= 100, 'danger', 'warning')
    }
  | insert_into(alerts);
The key concepts here are:
Stream
A stream is a named data pipe that routes documents through the system in real time. Streams are the primary data routing mechanism in JSONJet and can be created, populated, and subscribed to. Think of a stream as a pure data pipe: it receives data from external sources and immediately routes it to all active subscribers. If no subscribers are listening, the data is lost. This is expected behavior for a stream, which is a routing mechanism and not a database.
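The delivery behavior described above can be illustrated with a small Python model. This is only a sketch of the semantics, not JSONJet code: the `Stream` class and its `subscribe` and `insert` methods are hypothetical names chosen for the illustration, and JSONJet's actual API may look quite different.

```python
from typing import Any, Callable, Dict, List

Document = Dict[str, Any]


class Stream:
    """Toy model of a stream (hypothetical, not the JSONJet API).

    It has no buffer and no storage: a document reaches only the
    subscribers that are registered at the moment it is inserted.
    """

    def __init__(self, name: str) -> None:
        self.name = name
        self.subscribers: List[Callable[[Document], None]] = []

    def subscribe(self, callback: Callable[[Document], None]) -> None:
        self.subscribers.append(callback)

    def insert(self, doc: Document) -> int:
        """Deliver doc to every current subscriber; return how many received it."""
        for callback in list(self.subscribers):
            callback(doc)
        return len(self.subscribers)


sensor_data = Stream("sensor_data")

# Nobody is listening yet, so this document is simply dropped.
delivered_early = sensor_data.insert({"sensor_id": "s1", "temperature": 65})

# Once a subscriber is attached, later documents are delivered immediately.
received: List[Document] = []
sensor_data.subscribe(received.append)
sensor_data.insert({"sensor_id": "s1", "temperature": 72})
```

In this model the first document never appears in `received`, because the stream keeps nothing once delivery is finished. A subscriber that needs earlier data has to be attached before that data arrives.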
Flow
A flow is a real-time data processing pipeline that continuously processes documents from one input stream and writes to one or more output streams. Flows are the core processing mechanism in JSONJet, transforming raw data into meaningful insights as it arrives. Each flow operates independently and can filter, transform, aggregate, or route data based on your business logic.
A flow is composed of one or more operators chained together with the pipe operator (|). The query language is inspired by Kusto Query Language (KQL) and by the pipe syntax for SQL proposed by Google, providing an intuitive way to express complex data transformations as a series of simple operations.
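To make the operator-chaining idea concrete, the data_processor flow from the introduction can be mimicked in Python. This is a rough sketch under stated assumptions: `where`, `select`, `run_flow`, and `to_alert` are hypothetical helpers written for this illustration, the input is a finite list standing in for a continuous stream, and the sample documents are invented.

```python
from typing import Any, Callable, Dict, Iterable, Iterator, List

Document = Dict[str, Any]
Operator = Callable[[Iterable[Document]], Iterator[Document]]


def where(predicate: Callable[[Document], bool]) -> Operator:
    """Build an operator that keeps only documents matching the predicate."""
    def op(docs: Iterable[Document]) -> Iterator[Document]:
        return (d for d in docs if predicate(d))
    return op


def select(projection: Callable[[Document], Document]) -> Operator:
    """Build an operator that reshapes each document."""
    def op(docs: Iterable[Document]) -> Iterator[Document]:
        return (projection(d) for d in docs)
    return op


def run_flow(source: Iterable[Document], operators: List[Operator]) -> List[Document]:
    """Feed the source through each operator in order, like a chain of pipes."""
    docs: Iterable[Document] = source
    for op in operators:
        docs = op(docs)
    return list(docs)


def to_alert(d: Document) -> Document:
    # Mirrors the select { ... } projection in the example flow.
    return {
        "sensor_id": d["sensor_id"],
        "message": "temperature too high",
        "temperature": d["temperature"],
        "timestamp": d["timestamp"],
        "level": "danger" if d["temperature"] >= 100 else "warning",
    }


# Invented sample documents standing in for the sensor_data stream.
sensor_data = [
    {"sensor_id": "s1", "temperature": 65, "timestamp": 1},
    {"sensor_id": "s2", "temperature": 85, "timestamp": 2},
    {"sensor_id": "s3", "temperature": 105, "timestamp": 3},
]

alerts = run_flow(
    sensor_data,
    [where(lambda d: d["temperature"] > 70), select(to_alert)],
)
```

Each operator only sees the output of the one before it, which is what keeps the individual steps simple. With the sample data, the first document is filtered out, the second becomes a 'warning' alert, and the third becomes a 'danger' alert.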