STIX Prototyping Language

The STIX prototyping language is intended to be a simple, readable way to express STIX object graphs. This library can automatically create STIX content from the language. The language and library can be useful for creating content for testing and experimentation.

Basic Syntax

The language is composed of a sequence of statements. Each statement is terminated by a period, like an English sentence. STIX domain objects and relationships are referenced by name. Domain object names must begin with a capital letter and contain only letters and underscores; relationship names must begin with a lowercase letter. In keeping with the STIX specification, relationship names may contain only lowercase alphanumerics and hyphens.
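
As a rough illustration of these naming rules, the following Python sketch classifies a name using regular expressions transcribed from the prose above. The patterns and the classify_name helper are assumptions made for this example; the library's actual grammar may be stricter or more permissive.

```python
import re

# Patterns transcribed from the prose rules above (an assumption, not the
# library's real grammar).
DOMAIN_OBJECT = re.compile(r"[A-Z][A-Za-z_]*")
RELATIONSHIP = re.compile(r"[a-z][a-z0-9-]*")


def classify_name(name):
    """Return 'domain object', 'relationship', or 'invalid' for a name."""
    if DOMAIN_OBJECT.fullmatch(name):
        return "domain object"
    if RELATIONSHIP.fullmatch(name):
        return "relationship"
    return "invalid"


for candidate in ("Attack_Pattern", "targets", "related-to", "Related-To"):
    print(candidate, "->", classify_name(candidate))
```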

The simplest statement names a single SDO:

Identity.

To relate two objects together:

Malware targets Identity.

When SDOs are named this way, they have no reusable identity within the language. That means each use indicates a different object:

Attack_Pattern uses Malware.
Malware targets Identity.

Here, the two Malware objects are different. Object reuse is possible through other syntax, such as chaining and variables, described below.

Multiplicity

Lists of objects are expressed with parentheses:

(Identity Location).

This silly example means the same as if the two objects were in separate statements. But lists can be used as sources and targets of relationships:

Attack_Pattern uses (Malware Tool).

This relates an attack pattern to both a malware object and a tool object, via two separate relationships. The meaning differs from using two statements: here both relationships share the same source object, whereas two statements would produce two different sources. The situation is analogous when the list is in the source position.

If a list is in both the source and target position, then all objects in the source are related to all objects in the target. This is similar to a set-theoretic Cartesian product, or a relational join. If there are N objects in the source and M objects in the target, N*M relationships are created.
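
The N*M behavior can be sketched in a few lines of Python. This is only an illustration of the Cartesian product, not the library's code: relate_all is a hypothetical helper, and the input mimics a statement shaped like (Attack_Pattern Campaign) uses (Malware Tool Infrastructure).

```python
from itertools import product


def relate_all(sources, relationship, targets):
    """Pair every source with every target as (source, relationship, target)."""
    return [(s, relationship, t) for s, t in product(sources, targets)]


edges = relate_all(
    ["Attack_Pattern", "Campaign"],
    "uses",
    ["Malware", "Tool", "Infrastructure"],
)
print(len(edges))  # 2 sources * 3 targets = 6
```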

Counts

An integer count prefix can be given, which means the same as a homogeneous list:

2 Identity.

This means the same as (Identity Identity). A count prefix may occur in most places where an object type name is allowed. This makes it usable in contexts where a list is not allowed, e.g. inside another list:

(Malware 2 Identity).

This means one malware object and two identity objects.
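
One way to picture count prefixes is as a simple expansion step. The sketch below assumes a list has already been parsed into (count, type name) pairs; both that representation and the expand_counts helper are hypothetical, chosen only for illustration.

```python
def expand_counts(items):
    """Expand (count, type_name) pairs into a flat list of type names."""
    expanded = []
    for count, type_name in items:
        expanded.extend([type_name] * count)
    return expanded


# (Malware 2 Identity), with an implicit count of 1 on Malware
print(expand_counts([(1, "Malware"), (2, "Identity")]))
# ['Malware', 'Identity', 'Identity']
```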

Lastly, counts are allowed on relationships, which has the effect of creating multiple parallel edges in the graph:

Attack_Pattern 2 uses (Malware Tool).

This relates a single attack-pattern to a malware and a tool, with two relationships created for each target, for a total of four relationships.
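
The arithmetic behind that total can be sketched as follows. The parallel_edges helper is hypothetical and only models the counting: each source/target pair receives as many edges as the relationship count.

```python
from itertools import product


def parallel_edges(sources, relationship, targets, count=1):
    """Create `count` (source, relationship, target) edges per source/target pair."""
    return [
        (source, relationship, target)
        for source, target in product(sources, targets)
        for _ in range(count)
    ]


edges = parallel_edges(["Attack_Pattern"], "uses", ["Malware", "Tool"], count=2)
print(len(edges))  # 1 source * 2 targets * 2 parallel edges = 4
```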

Chaining

A relationship between a source and target can be chained to another target:

Attack_Pattern delivers Malware targets Identity.

This represents two relationships, where the malware delivered by the attack-pattern is the same one which targets the identity. This is another way of reusing an object. These chains can be arbitrarily long.
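
To make the sharing explicit, here is a minimal Python sketch that splits a chain into its objects and edges. It assumes the chain has already been tokenized into alternating object and relationship names; expand_chain is a hypothetical helper, not part of the library. Edges refer to objects by position, so a repeated index means the same object is reused.

```python
def expand_chain(tokens):
    """Split [obj, rel, obj, rel, obj, ...] into objects and index-based edges."""
    if len(tokens) % 2 == 0:
        raise ValueError("a chain must alternate object, relationship, ..., object")
    objects = tokens[0::2]
    relationships = tokens[1::2]
    # Edge i connects object i to object i + 1, so neighbors share an object.
    edges = [(i, rel, i + 1) for i, rel in enumerate(relationships)]
    return objects, edges


objects, edges = expand_chain(
    ["Attack_Pattern", "delivers", "Malware", "targets", "Identity"]
)
print(objects)  # ['Attack_Pattern', 'Malware', 'Identity']
print(edges)    # [(0, 'delivers', 1), (1, 'targets', 2)]
```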

Property Blocks

Property blocks are primarily used to represent embedded relationships, i.e. those which are realized in STIX via an object property, not an SRO. They use a JSON object-like syntax with curly braces, positioned after the object type name:

Report {
    object_refs: (Malware)
}.

Note that a length-1 list is used because the STIX property is list-valued.

The property name must not be quoted, and the property value may be any STIX prototyping language graph statement, including relationships and nested property blocks. When a more complex graph is used as a property value, it is the top-level source objects which are assigned to the property. In keeping with STIX spec requirements on property names, these names may consist of lowercase alphanumerics and underscores only. They must begin with a letter.
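
For orientation, the embedded relationship in the Report example above might end up looking roughly like the plain Python dicts below. This is a sketch under assumptions: new_object is a hypothetical helper, and the dicts omit other properties that a valid STIX object would need.

```python
import uuid


def new_object(stix_type, **properties):
    """Build a minimal dict with a STIX-style identifier (not a complete object)."""
    return {"type": stix_type, "id": f"{stix_type}--{uuid.uuid4()}", **properties}


malware = new_object("malware")
# The embedded relationship is just a list-valued property holding the ID.
report = new_object("report", object_refs=[malware["id"]])
print(report["object_refs"] == [malware["id"]])  # True
```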

String Literals

String literals are the only primitive literal type supported in the prototyping language, and are only supported in property blocks. The primary purpose of string literals is to assign simple names to things, to assist people in matching up generated STIX objects to components of language statements. When usage is more complex and/or generates numerous objects, it can otherwise be difficult to understand what was generated. Graphical visualization tools sometimes use certain properties to create graph labels. For example, some objects have a “name” property, and “labels” is a common property.

String literals are enclosed in double quotes. Lists of literals can be expressed with square brackets:

Malware {name: "Downloader"} downloads Malware {name: "Backdoor"}.

and

Indicator {
    labels: ["label1", "label2"]
}.

Special Relationship Syntax

To make the STIX prototyping language more English-like, two relationship names are treated specially: on and of. These special relationships may not have counts.

object_refs and on

on is a shorthand used to set the object_refs property of an object, and may be used instead of a property block. The statement looks like those which represent SRO relationships, but it does not create an SRO. If you use this special syntax on a source object, you can’t also relate it to a target via a normal SRO relationship. You may still use a property block on the source object to populate other properties. For example:

Report on (Malware Campaign).

Sightings and of

Sightings are a special relationship type which breaks the mold of all other SROs. They are ternary (relate up to three things), and don’t have the usual SRO property names. So they don’t fit with the normal infix notation of other relationships. A sighting statement begins with Sighting and may be followed by of to represent the required sighting_of_ref property:

Sighting of Malware.

The other related objects must be represented in a property block:

Sighting {
    observed_data_refs: (Observed_Data),
    where_sighted_refs: 2 Location
} of Malware.

If desired, sighting_of_ref can also be given in a property block, and the trailing of clause omitted:

Sighting {
    observed_data_refs: (Observed_Data),
    where_sighted_refs: 2 Location,
    sighting_of_ref: Malware
}.

Note that Sighting must not have a count prefix, or it will be interpreted as a “normal” graph statement, not this special syntax.
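
To show why sightings do not fit the source/relationship/target pattern, here is a rough Python sketch of the reference properties a sighting carries. The new_id helper and the random identifiers are illustrative assumptions, and the dict is not a complete STIX object.

```python
import uuid


def new_id(stix_type):
    """Return a STIX-style identifier of the form type--UUID."""
    return f"{stix_type}--{uuid.uuid4()}"


# Three kinds of references instead of one source and one target.
sighting = {
    "type": "sighting",
    "id": new_id("sighting"),
    "sighting_of_ref": new_id("malware"),
    "observed_data_refs": [new_id("observed-data")],
    "where_sighted_refs": [new_id("location"), new_id("location")],
}
print(sorted(key for key in sighting if key.endswith(("_ref", "_refs"))))
```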

Variables

If other methods of object reuse won’t work or are undesirable, the language supports variables. A variable declaration statement looks like:

var_a, var_b: Identity.

Variable names must be all lowercase, begin with a letter, and consist of alphanumerics, hyphens, and underscores only. Variables may only hold domain objects; they may not hold relationships.

Where used, a variable may not have either a count or a property block. Where declared, it may have both:

malware_a {name: "bad malware"}: Malware.
2 victims {name: "a victim"}: Identity.

malware_a targets victims.

The count on a variable is given before the variable name, similar to how it is done with domain objects and relationships in normal graph statements. This allows variables to hold multiple values. The above represents a malware targeting two identity objects, the “victims”.

Property blocks on variables may use other variables. This creates dependencies among them. Declaration order is unimportant; the tool figures out an appropriate initialization order automatically:

note {object_refs: (loc id)}: Note.
loc: Location.
id: Identity.

Report on note.

A dependency cycle will cause an error.
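
Ordering of this kind is commonly done with a topological sort. The sketch below uses Python's standard graphlib module (Python 3.9+) to illustrate the idea; it is not necessarily how this library implements it, and initialization_order is a hypothetical helper.

```python
from graphlib import CycleError, TopologicalSorter


def initialization_order(dependencies):
    """Order variables so each comes after the variables its property block uses.

    `dependencies` maps a variable name to the set of variables it depends on.
    """
    return list(TopologicalSorter(dependencies).static_order())


# note uses loc and id, so it must be initialized last.
order = initialization_order({"note": {"loc", "id"}, "loc": set(), "id": set()})
print(order[-1])  # note

try:
    initialization_order({"a": {"b"}, "b": {"a"}})
except CycleError:
    print("dependency cycle detected")
```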

Implementation Notes

An obvious question is which STIX object types the library currently supports, and what names to use for them in the language. The answer may be counterintuitive, and requires some understanding of the library architecture.

The library is composed of two components:

1. A language “processor”
2. An object generator

The first component is what understands the language and connects the objects together. The second component is a delegate of the first, and is responsible for generating the objects.

So the counterintuitive answer to the question is that the language processor has no hard-coded lists of STIX object names or properties. Anything goes; you just need to follow the lexical rules described above, e.g. that domain object names start with a capital letter and consist of letters and underscores, that everything else starts with a lowercase letter, and so on. STIX domain object names are passed to the object generator, and if the generator doesn’t know how to generate an object of that type, it will produce an error. But that issue is unrelated to the language itself. You can also use any lexically legal relationship name you want; the language processor will happily create an SRO with that relationship type. It knows little of the STIX specification.
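
The division of labor can be illustrated with a toy Python sketch. Every name here (toy_generator, process, UnknownTypeError) is hypothetical and does not reflect the library's actual classes or functions; the point is only that the processor delegates without consulting a type list of its own.

```python
class UnknownTypeError(Exception):
    """Raised by the toy generator for type names it has no recipe for."""


def toy_generator(type_name):
    """Stand-in object generator that knows only a couple of type names."""
    known = {"Identity": "identity", "Malware": "malware"}
    if type_name not in known:
        raise UnknownTypeError(type_name)
    return {"type": known[type_name]}


def process(type_names, generate):
    """Stand-in language processor: no type list of its own, it only delegates."""
    return [generate(name) for name in type_names]


print(process(["Identity", "Malware"], toy_generator))
# [{'type': 'identity'}, {'type': 'malware'}]
```

Passing a lexically valid but unknown name such as Widget makes the generator, not the processor, raise the error.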

Another important architectural point is that all objects generated by the object generator, and by the language processor internally, are plain “parsed JSON”, i.e. simple Python values like dicts and lists. It is not until the very last step that those values are passed to the stix2 library, which creates the final objects that are returned. So the stix2 library is a dependency of this one. It has its own STIX support, and performs certain compliance checks which the components of this library do not necessarily perform.

The built-in object generator operates based on “specifications” contained in a JSON data file; it doesn’t have any STIX rules built into the programming. The advantage of all of this is that custom objects can potentially be supported without reprogramming anything in this library at all! (The stix2 library is a different story though.)

So the final answer as to current STIX object support boils down to what object types the object generator and the stix2 library recognize. The latter library has its own documentation. The built-in object generator in this library recognizes the following types:

Attack_Pattern
Campaign
Course_of_Action
Grouping
Identity
Indicator
Infrastructure
Intrusion_Set
Location
Malware
Malware_Analysis
Note
Observed_Data
Opinion
Report
Threat_Actor
Tool
Vulnerability
Artifact
Autonomous_System
Directory
Domain_Name
Email_Address
Email_Message
File
IPv4_Address
IPv6_Address
MAC_Address
Mutex
Network_Traffic
Process
Software
URL
User_Account
Windows_Registry_Key
X509_Certificate