# Axiom Processing Language (APL)

> Use APL to query logs, traces, and events with a text-based query language.

Axiom Processing Language (APL) is a text-based query language for logs, traces, and events stored in [EventDB](/platform-overview/architecture). It provides the flexibility to filter, manipulate, and summarize your data exactly the way you need it.

<Info>
  You can't use APL to query metrics. To query metrics, use [MPL](/mpl/introduction).
</Info>

## Prerequisites

* [Create an Axiom account](https://app.axiom.co/register).
* [Create a dataset in Axiom](/reference/datasets#create-dataset) where you send your data.

## Build an APL query

APL queries consist of the following:

* **Data source:** The most common data source is one of your Axiom datasets.
* **Operators:** Operators filter, manipulate, and summarize your data.

Delimit operators with the pipe character (`|`).

A typical APL query has the following structure:

```kusto theme={null}
DatasetName
| Operator ...
| Operator ...
```

* `DatasetName` is the name of the dataset you want to query.
* `Operator` is an operation you apply to the data.
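
As a minimal sketch of this structure, the following query uses the `sample-http-logs` dataset and the `method` field that appear in later examples on this page. The filter value `'GET'` is an assumption chosen for illustration.

```kusto
// Data source: the dataset to query
['sample-http-logs']
// First operator: keep only events where method is GET
| where method == 'GET'
// Second operator: return at most 10 of the matching events
| take 10
```

Each operator receives the output of the line before it, so the order of operators matters.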

<Note>
  Apart from Axiom datasets, you can use other data sources:

  * External data sources using the [externaldata](/apl/tabular-operators/externaldata-operator) operator.
  * Data tables that you define in the APL query itself using the `let` statement.
</Note>

## Example query

```kusto theme={null}
['github-issue-comment-event']
| extend isBot = actor contains '-bot' or actor contains '[bot]'
| where isBot == true
| summarize count() by bin_auto(_time), actor
```

[Run in Playground](https://play.axiom.co/axiom-play-qf1k/query?initForm=%7B%22apl%22%3A%22%5B'github-issue-comment-event'%5D%20%7C%20extend%20isBot%20%3D%20actor%20contains%20'-bot'%20or%20actor%20contains%20'%5Bbot%5D'%20%7C%20where%20isBot%20%3D%3D%20true%20%7C%20summarize%20count\(\)%20by%20bin_auto\(_time\)%2C%20actor%22%7D)

The query above uses a dataset called `github-issue-comment-event` as its data source. It uses the following operators:

* [extend](/apl/tabular-operators/extend-operator) adds a new field `isBot` to the query results. It sets `isBot` to true for events where the `actor` field contains `-bot` or `[bot]`.
* [where](/apl/tabular-operators/where-operator) filters on the `isBot` field. It returns only the rows where the value is true.
* [summarize](/apl/tabular-operators/summarize-operator) counts the remaining events for each `actor` in time intervals that `bin_auto(_time)` sizes automatically, and produces a chart.

Each operator is separated using the pipe character (`|`).

## Example result

As a result, the query returns a chart and a table. The table counts the different values of the `actor` field where `isBot` is true, and the chart displays the distribution of these counts over time.

| actor                | count\_ |
| -------------------- | ------- |
| github-actions\[bot] | 487     |
| sonarqubecloud\[bot] | 208     |
| dependabot\[bot]     | 148     |
| vercel\[bot]         | 91      |
| codecov\[bot]        | 63      |
| openshift-ci\[bot]   | 52      |
| coderabbitai\[bot]   | 43      |
| netlify\[bot]        | 37      |

<Note>
  The query results are a representation of your data based on your request. The query doesn’t change the original dataset.
</Note>

## Quote dataset and field names

If the name of a dataset or field contains at least one of the following special characters, quote the name in your APL query:

* Space (` `)
* Dot (`.`)
* Dash (`-`)

To quote the dataset or field in your APL query, enclose its name with quotation marks (`'` or `"`) and square brackets (`[]`). For example, `['my-field']`.

For more information on rules about naming and quoting entities, see [Entity names](/apl/entities/entity-names).
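
For illustration, the sketch below quotes a dataset name that contains dashes and a field name that contains a dot. It assumes the `sample-http-logs` dataset and its `geo.city` field, both of which appear in other examples on this page.

```kusto
// The dataset name contains dashes, so it's quoted
['sample-http-logs']
// The field name contains a dot, so it's quoted
| where isnotempty(['geo.city'])
// Names such as _time and method contain no special characters and stay unquoted
| project _time, method, ['geo.city']
| take 10
```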

## Common patterns

### Handle nested JSON

Fields often contain JSON objects stored as text. Use `parse_json` to convert such a field into a structured value so that you can access the nested data.

```kusto theme={null}
['sample-http-logs']
| extend parsed_headers = parse_json(req_duration_ms)
| where isnotempty(method)
| project _time, method, status, ['geo.city']
```

[Run in Playground](https://play.axiom.co/axiom-play-qf1k/query?initForm=%7B%22apl%22%3A%22%5B'sample-http-logs'%5D%5Cn%7C%20extend%20parsed_headers%20%3D%20parse_json\(req_duration_ms\)%5Cn%7C%20where%20isnotempty\(method\)%5Cn%7C%20project%20_time%2C%20method%2C%20status%2C%20%5B'geo.city'%5D%22%2C%22queryOptions%22%3A%7B%22quickRange%22%3A%2230d%22%7D%7D)

* `extend parsed_headers = parse_json(...)` converts JSON text into a structured field whose properties you can access with dot notation.
* `project _time, method, status, ['geo.city']` selects only the fields you need. The field name `geo.city` contains a dot, so the query quotes it.
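
To show what dot notation on a parsed value can look like, here is a hedged sketch that parses a JSON string literal instead of a real field. The JSON content and the names `parsed` and `user_plan` are hypothetical and exist only for illustration. The sketch assumes that dot notation works on the parsed value as described above.

```kusto
['sample-http-logs']
// Parse a JSON string literal (hypothetical content) into a structured value
| extend parsed = parse_json('{"user": {"id": 42, "plan": "pro"}}')
// Read a nested property with dot notation
| extend user_plan = parsed.user.plan
| project _time, method, user_plan
| take 5
```

In practice, you would pass the field that holds your JSON text to `parse_json` instead of a literal.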

### Filter and project early

A well-written query runs faster, consumes fewer resources, and gets you answers more efficiently. The two most important principles:

1. **Filter early.** Reduce the amount of data as soon as possible.
2. **Project only what you need.** Avoid selecting unnecessary fields.

Some datasets are wide, containing hundreds or thousands of fields. When you query these datasets with APL, use `project` to select only the fields you need. Without `project`, Axiom retrieves all fields for each event, which slows down queries.

**Sub-optimal:**

```kusto theme={null}
['sample-http-logs']
| sort by _time desc
| take 10
```

This retrieves all fields for each of the 10 events.

**Optimized:**

```kusto theme={null}
['sample-http-logs']
| project _time, method, status, uri, resp_body_size_bytes
| sort by _time desc
| take 10
```

[Run in Playground](https://play.axiom.co/axiom-play-qf1k/query?initForm=%7B%22apl%22%3A%22%5B'sample-http-logs'%5D%5Cn%7C%20project%20_time%2C%20method%2C%20status%2C%20uri%2C%20resp_body_size_bytes%5Cn%7C%20sort%20by%20_time%20desc%5Cn%7C%20take%2010%22%2C%22queryOptions%22%3A%7B%22quickRange%22%3A%2230d%22%7D%7D)

By adding `project`, the query ignores all other fields, minimizing I/O and reducing data sent over the network.

<Tip>
  * Always use `project` or `project-away` after your `where` filters to reduce data volume. `project` keeps the specified fields, whereas `project-away` removes them.
  * Place your most restrictive `where` filters as early as possible in the query.
</Tip>
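
Putting both principles together, a query that filters first and then trims fields could look like the sketch below. It assumes the `sample-http-logs` dataset from the examples above. The `'GET'` filter value is an illustrative assumption.

```kusto
['sample-http-logs']
// Filter first so that later operators process fewer events
| where method == 'GET'
// Keep only the fields needed for the result
| project _time, method, status, uri, resp_body_size_bytes
| sort by _time desc
| take 10
```

If you want most fields and only need to drop a few, `project-away` works the other way around:

```kusto
['sample-http-logs']
| where method == 'GET'
// Drop a field you don't need and keep everything else
| project-away resp_body_size_bytes
| take 10
```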

### Virtual fields

[Virtual fields](/query-data/virtual-fields) let you define new fields based on an APL expression. Instead of pre-processing data before sending it to Axiom, you create these fields on the fly during a query. This provides flexibility for analysis without altering the raw data.

**Example: Simple conversion**

This example converts a response body size from bytes to kilobytes:

* **Name:** `resp_size_kb`
* **Expression:** `resp_body_size_bytes / 1024`

**Example: Categorization**

This example uses conditional logic to segment data. Define a virtual field to categorize HTTP responses:

* **Name:** `response_category`
* **Expression:** `case(status >= 500, "Server Error", status >= 400, "Client Error", status >= 300, "Redirect", "Success")`

Now you can run queries like `... | summarize count() by response_category` to compare behavior across these groups.
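
To try these expressions before saving them as virtual fields, one option is to define the same fields inline with `extend`. This is a sketch that assumes the `sample-http-logs` dataset. It wraps `status` in `toint()` as a precaution in case the dataset stores status codes as strings.

```kusto
['sample-http-logs']
// Same logic as the resp_size_kb virtual field
| extend resp_size_kb = resp_body_size_bytes / 1024
// Same logic as the response_category virtual field, with status cast to a number
| extend response_category = case(
    toint(status) >= 500, "Server Error",
    toint(status) >= 400, "Client Error",
    toint(status) >= 300, "Redirect",
    "Success")
| summarize count(), avg(resp_size_kb) by response_category
```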

<Tip>
  * Use virtual fields to avoid sending redundant data. If you can derive a value, you don't need to add it to your raw logs.
  * Use virtual fields to normalize data from different sources. If one service logs `request_time` and another logs `duration`, create a virtual field using `coalesce(request_time, duration)` to standardize them.
  * For expensive transformations that you query frequently, consider performing them at ingest time instead.
</Tip>
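
The normalization tip could be sketched as the following query. The dataset name `my-service-logs` and the field name `duration_normalized` are hypothetical, and `request_time` and `duration` are the fields named in the tip. The sketch assumes both fields hold durations in the same unit.

```kusto
['my-service-logs']
// Use request_time when it has a value, otherwise fall back to duration
| extend duration_normalized = coalesce(request_time, duration)
| summarize avg(duration_normalized) by bin_auto(_time)
```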

### Factors impacting query performance

* **Catch-all queries:** Queries that don't specify fields with `project` select all fields. Avoid this on high-dimensionality datasets.
* **High-cardinality `summarize` operations:** When the `by` field has a very large number of unique values (like a `traceId`), the query may produce an enormous number of groups. Axiom has built-in limits to protect against this.
* **Mixed data types:** If a field has mixed types (for example, a status code is sometimes a number `200` and sometimes a string `"200"`), queries can produce unexpected results.
  For best performance, aim for consistent typing. If you can't avoid mixed types, normalize the data at query time using typecasting functions like `tostring()` or `toint()`. For example, `| where tostring(status) startswith "2"` works reliably on a field with mixed types.
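
As a sketch of query-time normalization, the query below casts `status` to a string before filtering, so that the number `200` and the string `"200"` are treated alike. The dataset name `my-dataset` is hypothetical.

```kusto
['my-dataset']
// Cast to string so that numeric and string status codes compare the same way
| where tostring(status) startswith "2"
| summarize count() by bin_auto(_time)
```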

For more information, see [Performance](/reference/performance).

### Platform limits

* **Fields per dataset:** Each dataset has a limit on the number of fields it can contain. Although the limit is high, ingesting logs with thousands of fields can cause issues.
* **Data retention:** Datasets have a configurable retention period. Data older than this period is automatically deleted.
* **Query rate limits:** Axiom imposes rate limits on queries to ensure service stability.

For more information, see [Limits](/reference/limits).

## What’s next

Check out the [list of example queries](/apl/tutorial) or explore the supported operators and functions:

* [Scalar functions](/apl/scalar-functions/)
* [Aggregation functions](/apl/aggregation-function/)
* [Tabular operators](/apl/tabular-operators/)
* [Scalar operators](/apl/scalar-operators/)
