Merged
73 changes: 72 additions & 1 deletion README.md
@@ -55,14 +55,16 @@ Send JSON log events from a file or `STDIN`.
Example:

```
seqcli ingest -i events.clef --filter="@Level <> 'Debug'" -p Environment=Test
seqcli ingest -i events.clef --json --filter="@Level <> 'Debug'" -p Environment=Test
```

| Option | Description |
| ------ | ----------- |
| `-i`, `--input=VALUE` | CLEF file to ingest; if not specified, `STDIN` will be used |
| `--invalid-data=VALUE` | Specify how invalid data is handled: fail (default) or ignore |
| `-p`, `--property=VALUE1=VALUE2` | Specify event properties, e.g. `-p Customer=C123 -p Environment=Production` |
| `-x`, `--extract=VALUE` | An extraction pattern to apply to plain-text logs (ignored when `--json` is specified) |
| `--json` | Read the events as JSON (the default assumes plain text) |
| `-f`, `--filter=VALUE` | Filter expression to select a subset of events |
| `-s`, `--server=VALUE` | The URL of the Seq server; by default the `connection.serverUrl` value will be used |
| `-a`, `--apikey=VALUE` | The API key to use when connecting to the server; by default the `config.apiKey` value will be used |
@@ -147,3 +149,72 @@ Stream log events matching a filter.
### `version`

Print the current executable version.

## Extraction Patterns

The `seqcli ingest` command can be used for parsing plain text logs into structured log events.

```shell
seqcli ingest -x "{@t:timestamp} [{@l:ident}] {@m:*}{:n}{@x:*}"
```

The `-x` argument above is an _extraction pattern_ that will parse events like:

```
2018-02-21 13:29:00.123 +10:00 [ERR] The operation failed
System.DivideByZeroException: Attempt to divide by zero
at SomeClass.SomeMethod()
```

### Syntax

Extraction patterns have a simple high-level syntax:

* Text that appears in the pattern is matched literally - so a pattern like `Hello, world!` will match logging statements that are made up of this greeting only,
* Text between `{curly braces}` is a _match expression_ that identifies a part of the event to be extracted, and
* Literal curly braces are escaped by doubling, so `{{` will match the literal text `{`, and `}}` matches `}`.

Match expressions have the form:

```
{name:matcher}
```

Both the name and the matcher are optional, but at least one of them must be specified. Hence `{@t:timestamp}` specifies a name of `@t` and a matcher of `timestamp`, `{IPAddress}` specifies a name only, and `{:n}` specifies a matcher only (in this case the built-in newline matcher).
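
The name/matcher split can be illustrated with a small sketch. This is not seqcli's parser: `SplitMatchExpression` is a hypothetical helper, written as a C# 9 top-level program, that takes the text between the outer braces and assumes the first colon separates the name from the matcher.

```csharp
using System;

// Hypothetical helper (not part of seqcli): splits the text between the outer braces
// of a match expression into its optional name and optional matcher.
static (string Name, string Matcher) SplitMatchExpression(string body)
{
    // Only the first colon is treated as the separator; a compound matcher such as
    // "={:nat}-{:nat}" may contain further colons of its own.
    var colon = body.IndexOf(':');
    var name = colon < 0 ? body : body.Substring(0, colon);
    var matcher = colon < 0 ? "" : body.Substring(colon + 1);

    if (name.Length == 0 && matcher.Length == 0)
        throw new FormatException("Either a name or a matcher must be specified.");

    // A missing part is reported as null.
    return (name.Length == 0 ? null : name, matcher.Length == 0 ? null : matcher);
}

Console.WriteLine(SplitMatchExpression("@t:timestamp")); // name and matcher
Console.WriteLine(SplitMatchExpression("IPAddress"));    // name only
Console.WriteLine(SplitMatchExpression(":n"));           // matcher only
```

Splitting on the first colon only keeps compound matchers intact, since their sub-expressions can contain colons too.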

The _name_ is the property name to be extracted; there are four built-in property names that get special handling:

* `@t` - the event's timestamp
* `@m` - the textual message associated with the event
* `@l` - the event's level
* `@x` - the exception or backtrace associated with the event

Other property names are attached to the event payload, so `{Elapsed:dec}` will extract a property called `Elapsed`, using the `dec` decimal matcher.

Match expressions with no name are consumed from the input, but are not added to the event payload.

### Matchers

Matchers identify chunks of the input event.

Different matchers are needed so that a piece of text like `200OK` can be split into separate properties, e.g. with `{StatusCode:nat}{Status:alpha}`. Here, the `nat` (natural number) matcher also coerces the result into a numeric value, so that it is attached to the event payload as the number `200` instead of as the text `"200"`.
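
As a rough illustration of the `200OK` case, the sketch below mimics `{StatusCode:nat}{Status:alpha}` with plain string scanning. It does not use seqcli's matchers: `TakeWhile` and `ExtractStatus` are hypothetical names, and the character classes (ASCII digits for `nat`, letters for `alpha`) and the `ulong` result type are assumptions.

```csharp
using System;
using System.Collections.Generic;

// Consumes characters from `start` while `accept` holds; returns the index just past the run.
static int TakeWhile(string input, int start, Func<char, bool> accept)
{
    var end = start;
    while (end < input.Length && accept(input[end])) end++;
    return end;
}

// Mimics {StatusCode:nat}{Status:alpha} (assumed character classes, see above).
static Dictionary<string, object> ExtractStatus(string input)
{
    var result = new Dictionary<string, object>();

    var digitsEnd = TakeWhile(input, 0, c => c >= '0' && c <= '9');
    if (digitsEnd == 0) return result; // the nat stand-in did not match

    // The numeric matcher coerces the matched text into a number rather than a string.
    result["StatusCode"] = ulong.Parse(input.Substring(0, digitsEnd));

    var lettersEnd = TakeWhile(input, digitsEnd, char.IsLetter);
    if (lettersEnd > digitsEnd)
        result["Status"] = input.Substring(digitsEnd, lettersEnd - digitsEnd);

    return result;
}

var payload = ExtractStatus("200OK");
Console.WriteLine($"{payload["StatusCode"]} ({payload["StatusCode"].GetType().Name}), {payload["Status"]}");
```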

There are three kinds of matchers:

* Matchers like `alpha` and `nat` are built-in _named_ matchers.
* The special matchers `*`, `**` and so on are _non-greedy content_ matchers; these will match any text up until the next pattern element matches (`*`), the next two elements match (`**`), and so on. We saw this in action with the `{@m:*}{:n}` elements in the example: the message is all of the text up until the next newline.
* More complex _compound_ matchers are described using a sub-expression. These are prefixed with an equals sign `=`, like `{Phone:={:nat}-{:nat}-{:nat}}`. This will extract chunks of text like `123-456-7890` into the `Phone` property.
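
The non-greedy behaviour can be sketched independently of Superpower. In the following C# 9 top-level program, a matcher is modelled as a hypothetical `Func<string, int, int>` that returns the index just past a match, or `-1` on failure; `Literal` and `NonGreedy` are illustrative names, not seqcli APIs.

```csharp
using System;

// Stand-in matcher for literal text: succeeds only if `text` occurs exactly at `start`.
static Func<string, int, int> Literal(string text) =>
    (input, start) =>
        start + text.Length <= input.Length
        && string.CompareOrdinal(input, start, text, 0, text.Length) == 0
            ? start + text.Length
            : -1;

// Non-greedy content: capture as little text as possible such that every matcher in
// `following` then matches, one after another, directly after the captured text.
// One following matcher corresponds to `*`, two to `**`, and so on.
static string NonGreedy(string input, int start, params Func<string, int, int>[] following)
{
    // With nothing to look ahead to, all remaining text is content.
    if (following.Length == 0) return input.Substring(start);

    for (var end = start; end <= input.Length; end++)
    {
        var position = end;
        foreach (var matcher in following)
        {
            position = matcher(input, position);
            if (position < 0) break;
        }

        if (position >= 0) return input.Substring(start, end - start);
    }

    return null; // the lookahead never matched
}

// Like {@m:*}{:n}: the message is everything up to the next newline.
Console.WriteLine(NonGreedy("The operation failed\nSystem.DivideByZeroException", 0, Literal("\n")));
```

Trying each candidate end position in turn is quadratic in the worst case, which is acceptable for a sketch over single log events.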

### Processing

Extraction patterns are processed from left to right. When the first non-matching pattern element is encountered, extraction stops; any remaining text that couldn't be matched is attached to the resulting event in an `@unmatched` property.
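
A minimal sketch of this left-to-right loop follows, loosely modelled on `NameValueExtractor.ExtractValues` in this change but using only plain strings. The element shape (a name plus a `Func<string, int, int>` returning the index past the match, or `-1`), and the `Extract`, `Digits` and `Dash` helpers, are assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;

// Applies elements in order; stops at the first one that fails and returns the leftover text.
static (Dictionary<string, object> Properties, string Unmatched) Extract(
    string input, (string Name, Func<string, int, int> Match)[] elements)
{
    var properties = new Dictionary<string, object>();
    var position = 0;

    foreach (var (name, match) in elements)
    {
        var end = match(input, position);
        if (end < 0)
        {
            // Leftover text is reported unless it is empty or only whitespace.
            var rest = input.Substring(position);
            return (properties, string.IsNullOrWhiteSpace(rest) ? null : rest);
        }

        // Unnamed elements consume input but add nothing to the payload.
        if (name != null) properties[name] = input.Substring(position, end - position);
        position = end;
    }

    return (properties, null);
}

// Stand-in matchers: a run of digits, and a single dash.
static int Digits(string s, int i)
{
    var e = i;
    while (e < s.Length && char.IsDigit(s[e])) e++;
    return e > i ? e : -1;
}

static int Dash(string s, int i) => i < s.Length && s[i] == '-' ? i + 1 : -1;

var elements = new (string Name, Func<string, int, int> Match)[]
{
    ("Area", Digits),
    (null, Dash),
    ("Number", Digits),
};

var (properties, unmatched) = Extract("123-abc", elements);
Console.WriteLine($"Area={properties["Area"]}, unmatched={unmatched}");
```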

Multi-line events are handled by looking for lines that start with the first element of the extraction pattern to be used. This works well if the first line of each event begins with something unambiguous like an `iso8601dt` timestamp; if the lines begin with less specific syntax, the first few elements of the extraction pattern might be grouped to identify the start of events more accurately:

```
{:=[{@t} {@l}]} {@m:*}
```

Here the literal text `[`, a timestamp token, adjacent space ` `, level and closing `]` are all grouped so that they constitute a single logical pattern element to identify the start of events.
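
The framing rule can be sketched as follows. This is an illustration only: `FrameEvents` is a hypothetical helper, and the regular expression is a loose stand-in for the grouped `[{@t} {@l}]` start marker, not what seqcli actually runs.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Groups lines into events: a line that satisfies `isStart` begins a new event, and any
// other line is appended to the event in progress. Lines seen before the first start
// marker form an event of their own.
static List<string> FrameEvents(IEnumerable<string> lines, Func<string, bool> isStart)
{
    var events = new List<string>();
    foreach (var line in lines)
    {
        if (events.Count == 0 || isStart(line))
            events.Add(line);
        else
            events[events.Count - 1] += "\n" + line;
    }

    return events;
}

// Assumed stand-in for the start marker: "[", a token, a space, a level, "]".
var startMarker = new Regex(@"^\[\S+ \w+\]");

var framed = FrameEvents(
    new[]
    {
        "[2018-02-21T13:29:00Z ERR] The operation failed",
        "System.DivideByZeroException: Attempt to divide by zero",
        "[2018-02-21T13:29:01Z INF] Retrying",
    },
    line => startMarker.IsMatch(line));

Console.WriteLine(framed.Count);
```

Grouping the bracket, timestamp and level into one start marker matters here because a bare `[` could also begin a continuation line.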

When logs are streamed into `seqcli ingest` in real time, a 10 ms deadline is applied, within which any trailing lines that make up the event must be received.
64 changes: 32 additions & 32 deletions src/SeqCli/PlainText/Extraction/ExtractionPatternInterpreter.cs
@@ -9,10 +9,10 @@ static class ExtractionPatternInterpreter
{
public static NameValueExtractor MultilineMessageExtractor { get; } = new NameValueExtractor(new[]
{
new PatternElement(Matchers.MultiLineMessage, ReifiedProperties.Message)
new SimplePatternElement(Matchers.MultiLineMessage, ReifiedProperties.Message)
});

public static NameValueExtractor CreateNameValueExtractor(ExtractionPattern pattern)
static PatternElement[] CreatePatternElements(ExtractionPattern pattern)
{
if (pattern == null) throw new ArgumentNullException(nameof(pattern));

@@ -22,39 +22,39 @@ public static NameValueExtractor CreateNameValueExtractor(ExtractionPattern patt
var element = pattern.Elements[i];
switch (element)
{
case LiteralTextPatternExpression text:
patternElements[i] = new PatternElement(Matchers.LiteralText(text.Text));
break;
case CapturePatternExpression capture
when capture.Content is NonGreedyContentExpression ngc:
patternElements[i] = new PatternElement(
Matchers.NonGreedyContent(patternElements.Skip(i + 1).Take(ngc.Lookahead).ToArray()),
capture.Name);
break;
case CapturePatternExpression capture
when capture.Content is MatchTypeContentExpression mtc:
patternElements[i] = new PatternElement(
mtc.Type == null ? Matchers.Token : Matchers.GetByType(mtc.Type),
capture.Name);
break;
default:
throw new InvalidOperationException($"Element `{element}` not recognized.");
case LiteralTextPatternExpression text:
patternElements[i] = new SimplePatternElement(Matchers.LiteralText(text.Text));
break;
case CapturePatternExpression capture
when capture.Content is NonGreedyContentExpression ngc:
patternElements[i] = new SimplePatternElement(
Matchers.NonGreedyContent(patternElements.Skip(i + 1).Take(ngc.Lookahead).ToArray()),
capture.Name);
break;
case CapturePatternExpression capture
when capture.Content is MatchTypeContentExpression mtc:
patternElements[i] = new SimplePatternElement(
mtc.Type == null ? Matchers.Token : Matchers.GetByType(mtc.Type),
capture.Name);
break;
case CapturePatternExpression capture
when capture.Content is GroupedContentExpression gc:
patternElements[i] = new GroupedPatternElement(
CreatePatternElements(gc.ExtractionPattern),
capture.Name);
break;
default:
throw new InvalidOperationException($"Element `{element}` not recognized.");
}
}
return new NameValueExtractor(patternElements);

return patternElements;
}

// What we need to do here is:
// - for each parsed token
// - if it's literal text, map it an anonymous PatternElement with
// BuiltInPatterns.LiteralText()
// - otherwise, if it specifies no format, it's a named element with
// the BuiltInPatterns.Token parser
// - if it does specify a format, look up the parser based on the name, except
// - if the format is `$` it is BuiltInPatterns.SingleLineContent
// - if the format is `$$` it is BuiltInPatterns.MultiLineContent
// - if it's `*`, it's BuiltInPatterns.NonGreedyContent() passing the
// parser that follows it
public static NameValueExtractor CreateNameValueExtractor(ExtractionPattern pattern)
{
var patternElements = CreatePatternElements(pattern);
return new NameValueExtractor(patternElements);
}
}
}
51 changes: 51 additions & 0 deletions src/SeqCli/PlainText/Extraction/GroupedPatternElement.cs
@@ -0,0 +1,51 @@
using System;
using System.Collections.Generic;
using System.Linq;
using Superpower;
using Superpower.Model;

namespace SeqCli.PlainText.Extraction
{
class GroupedPatternElement : PatternElement
{
readonly PatternElement[] _content;

public GroupedPatternElement(IEnumerable<PatternElement> content, string name = null)
: base(name)
{
_content = content?.ToArray() ?? throw new ArgumentNullException(nameof(content));
if (_content.Length == 0) throw new ArgumentException("A grouped pattern must include at least one element.");

Match = _content.Select(c => c.Match).Aggregate((a, b) => a.IgnoreThen(b));
}

public override TextParser<Unit> Match { get; }

public override bool TryExtract(
TextSpan input,
Dictionary<string, object> result,
out TextSpan remainder)
{
var temp = new Dictionary<string, object>();

var rem = input;
foreach (var element in _content)
{
if (!element.TryExtract(rem, temp, out rem))
{
remainder = input;
return false;
}
}

foreach (var pair in temp)
result.Add(pair.Key, pair.Value);

var value = input.Until(rem);
remainder = rem;
CollectResult(result, value);

return true;
}
}
}
4 changes: 2 additions & 2 deletions src/SeqCli/PlainText/Extraction/Matchers.cs
@@ -112,10 +112,10 @@ public static TextParser<object> NonGreedyContent(params PatternElement[] follow
return SpanEx.MatchedBy(Character.AnyChar.Many())
.Select(span => span.Length > 0 ? (object) span : null);

var rest = following[0].Parser;
var rest = following[0].Match;
for (var i = 1; i < following.Length; ++i)
{
rest = rest.IgnoreThen(following[i].Parser);
rest = rest.IgnoreThen(following[i].Match);
}

return i =>
12 changes: 2 additions & 10 deletions src/SeqCli/PlainText/Extraction/NameValueExtractor.cs
@@ -18,7 +18,7 @@ public NameValueExtractor(IEnumerable<PatternElement> elements)
throw new ArgumentException("An extraction pattern must contain at least one element.");
}

public TextParser<object> StartMarker => _elements[0].Parser;
public TextParser<Unit> StartMarker => _elements[0].Match;

public (IDictionary<string, object>, string) ExtractValues(string plainText)
{
@@ -28,21 +28,13 @@ public NameValueExtractor(IEnumerable<PatternElement> elements)
var remainder = input;
foreach (var element in _elements)
{
var match = element.Parser(remainder);
if (!match.HasValue)
if (!element.TryExtract(remainder, result, out remainder))
{
if (remainder.IsAtEnd || Span.WhiteSpace.IsMatch(remainder))
return (result, null);

return (result, remainder.ToStringValue());
}

remainder = match.Remainder;

if (!element.IsIgnored)
{
result.Add(element.Name, match.Value);
}
}

return (result, null);
31 changes: 22 additions & 9 deletions src/SeqCli/PlainText/Extraction/PatternElement.cs
@@ -1,18 +1,31 @@
using System;
using System.Collections.Generic;
using Superpower;
using Superpower.Model;

namespace SeqCli.PlainText.Extraction
{
class PatternElement
abstract class PatternElement
{
public PatternElement(TextParser<object> parser, string name = null)
readonly string _name;

bool IsIgnored => _name == null;

protected PatternElement(string name)
{
Parser = parser ?? throw new ArgumentNullException(nameof(parser));
Name = name;
_name = name;
}

public TextParser<object> Parser { get; }
public string Name { get; }
public bool IsIgnored => Name == null;
public abstract TextParser<Unit> Match { get; }

public abstract bool TryExtract(
TextSpan input,
Dictionary<string, object> result,
out TextSpan remainder);

protected void CollectResult(Dictionary<string, object> result, object value)
{
if (!IsIgnored)
result.Add(_name, value);
}
}
}
}
39 changes: 39 additions & 0 deletions src/SeqCli/PlainText/Extraction/SimplePatternElement.cs
@@ -0,0 +1,39 @@
using System;
using System.Collections.Generic;
using Superpower;
using Superpower.Model;

namespace SeqCli.PlainText.Extraction
{
class SimplePatternElement : PatternElement
{
readonly TextParser<object> _parser;

public override TextParser<Unit> Match { get; }

public SimplePatternElement(TextParser<object> parser, string name = null)
: base(name)
{
_parser = parser ?? throw new ArgumentNullException(nameof(parser));
Match = _parser.Select(s => Unit.Value);
}

public override bool TryExtract(
TextSpan input,
Dictionary<string, object> result,
out TextSpan remainder)
{
var match = _parser(input);
if (!match.HasValue)
{
remainder = input;
return false;
}

CollectResult(result, match.Value);
remainder = match.Remainder;

return true;
}
}
}
26 changes: 21 additions & 5 deletions src/SeqCli/PlainText/Patterns/ExtractionPatternParser.cs
@@ -20,11 +20,24 @@ static class ExtractionPatternParser
.IgnoreThen(Character.LetterOrDigit.Or(Character.EqualTo('_')).Many()))
.Select(s => s.ToStringValue());

static readonly TextParser<CaptureContentExpression> NonGreedyContent =
Character.EqualTo('*').AtLeastOnce()
.Select(chs => (CaptureContentExpression) new NonGreedyContentExpression(chs.Length));

static readonly TextParser<CaptureContentExpression> MatchTypeContent =
SpanEx.MatchedBy(Character.Letter.Or(Character.EqualTo('_'))
.IgnoreThen(Character.LetterOrDigit.Or(Character.EqualTo('_')).Many()))
.Select(s => (CaptureContentExpression) new MatchTypeContentExpression(s.ToStringValue()));

static readonly TextParser<CaptureContentExpression> GroupedContent =
Span.EqualTo("=")
.IgnoreThen(Superpower.Parse.Ref(() => Elements))
.Select(els => (CaptureContentExpression) new GroupedContentExpression(new ExtractionPattern(els)));

static readonly TextParser<CaptureContentExpression> CaptureContent =
Character.EqualTo('*').AtLeastOnce().Select(chs => (CaptureContentExpression)new NonGreedyContentExpression(chs.Length))
.Or(SpanEx.MatchedBy(Character.Letter.Or(Character.EqualTo('_'))
.IgnoreThen(Character.LetterOrDigit.Or(Character.EqualTo('_')).Many()))
.Select(s => (CaptureContentExpression)new MatchTypeContentExpression(s.ToStringValue())));
NonGreedyContent
.Or(MatchTypeContent)
.Or(GroupedContent);

static readonly TextParser<CapturePatternExpression> Capture =
from _ in Character.EqualTo('{')
@@ -40,8 +53,11 @@ from __ in Character.EqualTo('}')
LiteralText.Cast<LiteralTextPatternExpression, ExtractionPatternExpression>()
.Or(Capture.Cast<CapturePatternExpression, ExtractionPatternExpression>());

static readonly TextParser<ExtractionPatternExpression[]> Elements =
Element.AtLeastOnce();

static readonly TextParser<ExtractionPattern> Pattern =
Element.AtLeastOnce().AtEnd().Select(e => new ExtractionPattern(e));
Elements.AtEnd().Select(e => new ExtractionPattern(e));

public static ExtractionPattern Parse(string extractionPattern)
{