- Add Ingest V2 (#5600, #5566, #5463, #5375, #5350, #5252, #5202)
- Add SQS source (#5374, #5335, #5148)
- Disable control plane check for searcher (#5599, #5360)
- Partially implement `_elastic/_cluster/health` (#5595)
- Make Jaeger span attribute-to-tag conversion exhaustive (#5574)
- Use `content_length_limit` for ES bulk limit (#5573)
- Limit and monitor warmup memory usage (#5568)
- Add eviction metrics to caches (#5523)
- Record object storage request latencies (#5521)
- Add throttling to the janitor to prevent it from overloading (#5510)
- Prevent single split searches from different `leaf_search` from interleaving (#5509)
- Retry on S3 internal error (#5504)
- Allow specifying OTEL index ID in header (#5503)
- Add a metric to count storage errors and their error code (#5497)
- Add support for concatenated fields (#4773, #5369, #5331)
- Add number of splits per root/leaf search histograms (#5472)
- Introduce a searcher config option to timeout get requests (#5467)
- Add fingerprint to task in cluster state (#5464)
- Enrich root/leaf search spans with number of docs and splits (#5450)
- Add some additional search metrics (#5447)
- Improve GC resilience and add metrics (#5420)
- Enable force shutdown with 2nd Ctrl+C (#5414)
- Add request_timeout_secs config to searcher config (#5402)
- Memoize S3 client (#5377)
- Add more env var config for Postgres (#5365)
- Enable str fast field range queries (#5324)
- Allow querying non-existing fields (#5308)
- Support updating doc mapper through api (#5253)
- Add optional special handling for hex in code tokenizer (#5200)
- Add a circuit breaker layer (#5134)
- Various performance optimizations in Tantivy (https://github.com/quickwit-oss/tantivy/blob/main/CHANGELOG.md)
- Parse datetimes and timestamps with leading and/or trailing whitespace (#5544)
- Restrict maturity period to retention (#5543)
- Wait for merge at end of local ingest (#5542)
- Log PostgreSQL metastore error (#5530)
- Update azure multipart policy (#5553)
- Stop relying on our own version of pulsar-rs (#5487)
- Handle nested OTLP values in attributes and log bodies (#5485)
- Improve merge pipeline finalization (#5475)
- Allow failed splits in root search (#5440)
- Batch delete from GC (#5404, #5380)
- Make some S3 errors retryable (#5384)
- Change default timestamps in OTEL logs (#5366)
- Only return root spans for Jaeger HTTP API (#5358)
- Share aggregation limit on node (#5357)
- Fix existence queries for nested fields (#5581)
- Fix lenient option with wildcard queries (#5575)
- Fix incompatible ES Java date format (#5462)
- Fix bulk api response order (#5434)
- Fix pulsar finalize (#5471)
- Fix pulsar URI scheme (#5470)
- Fix grafana searchers dashboard (#5455)
- Fix jaeger http endpoint (#5378)
- Fix file re-ingestion after EOF (#5330)
- Fix source path in Lambda distrib (#5327)
- Fix configuration interpolation (#5403)
- Fix jaeger duration parse error (#5518)
- Fix unit conversion in jaeger http search endpoint (#5519)
- Remove support for 2-digit years in java datetime parser (#5596)
- Remove DocMapper trait (#5508)
- Fix bug in the chitchat digest message serialization (chitchat#144)
- Remove some noisy logs (#4447)
- Add `/{index}/_stats` and `/_stats` ES API (#4442)
- Use `search_after` in ES scroll API (#4280)
- Add support for wildcard exclusion in index patterns (#4458)
- Add `.` support in DSL identifiers (#3989)
- Add cat indices ES API (#4465)
- Limit concurrent merges (#4473)
- Add Index Template API and auto create index (#4456) (only available with ingest V2)
- Add support for compressed ES `_bulk` requests (#4506)
- Add support for slash `/` character in field names (#4510)
- Handle SIGTERM shutdown signal (#4539)
- Add `start_timestamp` and `end_timestamp` filter to ES `_field_caps` API (#4547)
- Limit the number of merge pipelines that can be spawned concurrently (#4574)
- Add support for `_source_excludes` and `_source_includes` query parameters in ES API (#4572)
- Add gRPC metrics layer to clients and servers (#4591)
- Add additional cluster metrics (#4597)
- Add index patterns query param on GET `/indexes` endpoint (#4600)
- Add support for GCS file backed metastore (#4604)
- Add default search fields for OTEL traces index (#4602)
- Add support for delete index in ES API (#4606)
- Add a handler to dynamically change the log level (#4662)
- Add REST endpoint to parse a query into a query AST (#4652)
- Add PostgreSQL index and use `IN` instead of many `OR` (#4670)
- Add support for `_source_excludes`, `_source_includes`, `extra_filters` in `_msearch` ES API (#4696)
- Handle `track_total_size` on request ES body (#4710)
- Add a metric for the number of indexes (#4711)
- Add various performance optimizations in Quickwit and Tantivy
More details in tantivy's changelog.
- Fix aggregation result on empty index (#4449)
- Fix Gzip file source (#4457)
- Rate limit noisy logs (#4483)
- Prevent the exponential backoff from overflowing after 64 attempts (#4501)
- Remove field presence in ES `_field_caps` API (#4492)
- Remove `source` in ES parameter, remove unsupported field `fields` in response (#4590)
- Fix aggregation `split_size` parameter, add docs and test (#4627)
- Various fixes in chitchat (gossip): more details in chitchat commit history
- Various fixes in mrecordlog (WAL): more details in mrecordlog commit history
- (Breaking) Add ZSTD compression to chitchat's Deltas
To deploy Quickwit 0.8.0, you must either:
- shut down your cluster entirely before deploying, or
- restart all the nodes of your cluster after deploying.
Because we made some breaking changes in the gossip protocol (chitchat), nodes running different versions of Quickwit cannot communicate with each other and crash upon receiving messages that do not match their release version. The new protocol is now versioned, and future updates of the gossip protocol will be backward compatible.
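One of the fixes listed for this release prevents the exponential backoff from overflowing after 64 attempts (#4501). The sketch below illustrates the general technique of computing a backoff delay with checked arithmetic so that a large attempt count saturates at a cap instead of overflowing. It is not Quickwit's actual implementation: the function name `backoff_delay` and its parameters are hypothetical.

```rust
use std::time::Duration;

/// Hypothetical helper (not taken from Quickwit): returns the delay to wait
/// before retry number `attempt` (0-based), doubling `base` on each attempt
/// and never exceeding `max`.
fn backoff_delay(base: Duration, max: Duration, attempt: u32) -> Duration {
    // 2^attempt computed with overflow detection; once the power no longer
    // fits in a u32, fall back to the largest representable factor.
    let factor = 2u32.checked_pow(attempt).unwrap_or(u32::MAX);
    // `Duration::checked_mul` returns `None` on overflow, in which case the
    // delay is simply capped at `max`.
    base.checked_mul(factor).unwrap_or(max).min(max)
}
```

With a 100 ms base and a 30 s cap, early attempts double as expected, while any attempt count large enough to overflow a naive `base * 2^attempt` computation returns the cap.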
- Add ES `_count` API (#4410)
- Add _elastic/_field_caps API (#4350)
- Make gRPC message size configurable (#4388)
- Add API endpoint to get some control plane internal info (#4339)
- Add Google Cloud Storage implementation available for storage paths starting with `gs://` (#4344)
- Return 404 on index not found in ES Bulk API (#4425)
- Allow $ and @ characters in field names (#4413)
- Assign all sources/shards, even if this requires exceeding the indexer (#4363)
- Fix traces doc mapping (service name set as fast) and update default otel logs index ID to `otel-logs-v0_7` (#4401)
- Fix parsing multi-line queries (#4409)
- Fix range query for optional fast field panics with Index out of bounds (#4362)
Quickwit 0.7.1 will create the new index `otel-logs-v0_7`, which is now used by default when ingesting data with the OTEL gRPC and HTTP API.
In the traces index `otel-traces-v0_7`, the `service_name` field is now fast. No migration is done if `otel-traces-v0_7` already exists. If you want the `service_name` field to be fast, you must first delete the existing `otel-traces-v0_7` index or create your own index.
- Elasticsearch-compatible API
  - Added scroll and `search_after` APIs and support for multi-index search queries
  - Added exists, multi-match, match phrase prefix, match bool prefix, bool queries
  - Added `_field_caps` API
- Added support for OTLP over HTTP API (Protobuf only) (#4335)
- Added Jaeger REST endpoints for Grafana tracing support (#4197)
- Added support for injecting custom HTTP headers and moved REST config parameters into REST config section (#4198)
- Added support for OTLP trace data in arbitrary sources
- Commit Kafka offsets on suggest truncate (#3638)
- Honor `auto.offset.reset` parameter in Kafka source (#4095)
- Added exact count optimization (#4019)
- Added stream splits gRPC (#4109)
- Added a split cache in searchers (#3857)
- Added `coerce` and `output_format` options for numeric fields (#3704)
- Added `PhraseMatchQuery` and `MultiMatchQuery` (#3727)
- Added Elasticsearch's `TermsQuery` (#3747)
- Added GCP PubSub source (#3720)
- Parse timestamp strings (#3639)
- Added Digital Ocean storage flavor (#3632)
- Added new tokenizers: `source_code_default`, `source_code`, `multilang` (#3647, #3655, #3608)
- Fixed dates in UI (#4277)
- Fixed duplicate splits planned on pipeline crash-respawn (#3854)
- Fixed sorting (#3799)
More details in tantivy's changelog.
- Improve OTEL traces index config (#4311)
  - OTEL endpoints now use the indexes `otel-logs-v0_7` and `otel-traces-v0_7` by default instead of `otel-logs-v0_6` and `otel-traces-v0_6`
  - OTEL indexes have more fields stored as "fast" and have Trace and Span ID bytes field in hex format
- Increased the gRPC payload limits from 10MiB to 20MiB (#4227)
- Reject malformed Elasticsearch API requests (#4175)
- Better logging when doc processing fails (#4323)
- Search performance improvements
- Indexing performance improvements
The format of the index and internal objects stored in the metastore of 0.7 is backward compatible with 0.6.
If you are using the OTEL indexes and ingesting data into the `otel-logs-v0_6` and `otel-traces-v0_6` indexes, you must stop indexing before upgrading.
Indeed, the first time you start Quickwit 0.7, it will update the doc mapping fields of Trace ID and Span ID of those two indexes by changing their input/output formats from base64 to hex. This is automatic: you don't have to perform any manual operation.
Quickwit 0.7 will create new indexes `otel-logs-v0_7` and `otel-traces-v0_7`, which are now used by default when ingesting data with the OTEL gRPC and HTTP API. The Jaeger gRPC and HTTP APIs will query both `otel-traces-v0_6` and `otel-traces-v0_7` by default.
It's possible to define the index ID you want to use for OTEL gRPC endpoints and Jaeger gRPC API by setting the request header `qw-otel-logs-index` or `qw-otel-traces-index` to the index ID you want to target.
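As a minimal sketch of the header-based index selection described above, the helper below builds the header pairs a client could attach to its requests. Only the two header names come from these notes. The function name `otel_index_headers`, its parameters, and the index IDs used in the usage note are hypothetical, and how the pairs are attached depends on the gRPC client library you use.

```rust
/// Hypothetical helper: returns the request headers that select the target
/// index for OTEL logs and/or traces. Passing `None` for a signal leaves the
/// default index in use for it.
fn otel_index_headers(
    logs_index: Option<&str>,
    traces_index: Option<&str>,
) -> Vec<(&'static str, String)> {
    let mut headers = Vec::new();
    if let Some(index_id) = logs_index {
        // Header name taken from the release notes.
        headers.push(("qw-otel-logs-index", index_id.to_string()));
    }
    if let Some(index_id) = traces_index {
        // Header name taken from the release notes.
        headers.push(("qw-otel-traces-index", index_id.to_string()));
    }
    headers
}
```

For example, `otel_index_headers(None, Some("my-traces"))` yields a single `qw-otel-traces-index` pair and leaves the logs index at its default.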
- Support of phrase prefix queries in the query language.
- Fix timestamp field which was not allowed when defined in an object mapping.
- Fix querying of integers on a JSON field (no documents were returned).
- Elasticsearch/Opensearch compatible API.
- New columnar format:
- Fast fields can now have any cardinality (Optional, Multivalued, restricted). In fact cardinality is now only used to format the output.
- Dynamic Fields are now fast fields.
- String fast fields now can be normalized.
- Various parameters of object storages can now be configured.
- The ingest API makes it possible to force a commit, or wait for a scheduled commit to occur.
- Ability to parse non-JSON data using VRL to extract some structure from documents.
- Object storage can now use the `virtual-hosted–style`.
- `date_histogram` aggregation.
- `percentiles` aggregation.
- Added support for Prefix Phrase query.
- Added support for range queries.
- The query language now supports different date formats.
- Added support for base16 input/output configuration for bytes field. You can search for bytes fields using base16 encoded values.
- Autotagging: fields used in the partition key are automatically added to tags.
- Added arm64 docker image.
- Added CORS configuration for the REST API.
- Fixed a major bug that required restarting Quickwit when deleting and recreating an index with the same name.
- The number of concurrent GET requests to object stores is now limited. This fixes a bug observed when requesting a lot of documents from MinIO.
- Quickwit now searches into resource attributes when receiving a Jaeger request carrying tags
- Object storage can be configured to:
  - Avoid the Bulk delete API (workaround for Google Cloud Storage).
  - Use virtual-host style addresses (workaround for Alibaba Object Storage Service).
- Fix aggregation min doc_count empty merge bug.
- Fix: Sort order for term aggregations.
- Switch to ms in histogram for date type (aligning with ES).
- Search performance improvement.
- Aggregation performance improvement.
- Aggregation memory improvement.
More details in tantivy's changelog.
- Datetimes now have up to nanosecond precision.
- Quickwit now uses the node's hostname as the default node ID.
- By default, Quickwit is in dynamic mode and all dynamic fields are marked as fast fields.
- JSON fields use the raw tokenizer by default and are set as fast fields.
- Various performance/compression improvements.
- OTEL indexes Trace ID and Span ID are now bytes fields.
- OTEL indexes store timestamps with nanosecond precision.
- Span status is now indexed in the OTEL trace index.
- Default and raw tokenizers filter tokens longer than 255 bytes instead of 40 bytes.
- gRPC OpenTelemetry Protocol support for traces
- gRPC OpenTelemetry Protocol support for logs
- Control plane (indexing tasks scheduling)
- Ingest API rate limiter
- Pulsar source
- VRL transform for data sources
- REST API enhanced to fully manage indexes, sources, and splits
- OpenAPI specification and Swagger UI for all available REST endpoints
- Large responses from REST API can be compressed
- Add bulk stage splits method to metastore
- MacOS M1 binary
- Doc mapping field names starting with `_` are now valid
- Fix UI index completion on search page
- Fix CLI index describe command to show stats on published splits
- Fix REST API to always return on error a body formatted as `{"message": "error message"}`
- Fixed REST status code when deleting a nonexistent index or source, and when fetching splits on a nonexistent index
- Source config schema (breaking or not? use serde rename to be not breaking?)
- RocksDB replaced by mrecordlog to store ingest API queues records
- (Breaking) Indexing partition key new DSL
- (Breaking) Helm chart updated with the new CLI
- (Breaking) CLI indexes, sources, and splits commands use the REST API
- (Breaking) Index new format: you need to reindex all your data
- Boolean, datetime, and IP address fields
- Chinese tokenizer
- Distributed indexing (Kafka only)
- gRPC metastore server
- Index partitioning
- Kubernetes
- Node config templating
- Prometheus metrics
- Retention policies
- REST API for CRUD operations on indexes/sources
- Support for Azure Blob Storage
- Support for BM25 document scoring
- Support for deletions
- Support for slop in phrase queries
- Support for snippeting
- Fixed cache misses during search fetch docs phase
- Fixed credentials leak in metastore URI
- Fixed GC scalability issues
- Fixed support for multi-source
- Changed default docstore block size to 1 MiB and compression algorithm to ZSTD
- Quickwit now relies on sqlx rather than Diesel for PostgreSQL interactions. Migrating from 0.3 should work as expected. Migrating from earlier versions, however, is not supported.
- Removed support for i64 as timestamp field
- Removed support for sorting index by field
- Forbid access to paths with `..` at storage level
- Add support for Google Cloud Storage
- Sort hits by timestamp desc by default in search UI
- Add `description` attribute to field mappings
- Display split state in output of `quickwit split list` command
- Clean up local split cache after index deletion
- Fix API URLs displayed for copy and paste in UI
- Fix custom S3 endpoint with trailing `/`
- Fix `quickwit index create` command with `--overwrite` option
- Embedded UI for displaying search hits and cluster state
- Schemaless indexing with JSON field
- Ingest API (Elasticsearch-compatible)
- Aggregation queries
- Support for Amazon Kinesis
- Switched cluster membership algorithm from S.W.I.M. to Chitchat
- u64 as date field
- Query validation against index schema before dispatch to leaf nodes (#1109, @linxGnu)
- Support for custom S3 endpoint (#1108)
- Warm up terms and fast fields concurrently (#1147)
- Minor bug in leaf search stream (#1110)
- Default index root URI and metastore URI correctly default to data dir (#1140, @ddelemeny)
- QW_ENV environment variable
- Compiled binaries with Rust 1.58.1, which fixes CVE-2022-21658