Biml Basics

Scott Currie

published 08.14.15

last updated 10.05.15


Part of lesson Biml Basics.

Introduction

Recall from the previous lesson that Biml files are just XML files that conform to the Biml XML schema. Since you already know XML, to learn Biml you just need to learn the rules and conventions of the Biml XML schema. Let's do that now.

Your First Biml File

To start with, we need to know what the root element of the Biml document is called. Unsurprisingly, it's Biml, which results in the following code:

<Biml></Biml>

But we don't have a valid Biml file yet, because we haven't declared that this document adheres to the Biml XML schema. To do that, we need to add our default XML namespace declaration attribute to the root element. While this is a string value that you just have to know, thankfully most Biml tools will produce empty documents and/or provide intellisense that fills it out for you - so you don't have to worry about remembering it:

<Biml xmlns="http://schemas.varigence.com/biml.xsd"></Biml>

Your First SSIS Package in Biml

Great! We've just written our first Biml file! If we try to build it, nothing will happen though. It's literally an empty Biml file. We'd like to add some useful content, which brings us to our first Biml language convention:

"If you have a collection of objects in your Biml file, the elements of that collection will always be children of a wrapper collection element."

That's a mouthful, but it's simpler than it sounds. If I have a collection of tables, I will first create a <Tables> collection element and then create each <Table> item as a child of that collection element. The same goes for <Connections>, <Databases>, <Schemas>, <Packages>, <Cubes>, and anything else in the entire Biml object model. This makes the code more readable, improves parsing by third party tools, and has a variety of other benefits. So let's add a package to our Biml file. Even though we are only adding a single package, we still need to use the collection element - because we have the option of later adding more packages to our root collection.

<Biml xmlns="http://schemas.varigence.com/biml.xsd">
    <Packages>
        <Package></Package>
    </Packages>
</Biml>

If we try to build this, we'll get an error telling us that the required property "Name" has not been supplied. That makes sense. Packages must have names, so let's add one:

<Biml xmlns="http://schemas.varigence.com/biml.xsd">
    <Packages>
        <Package Name="MyFirstBimlPackage"></Package>
    </Packages>
</Biml>

If you now try to build this script, a single empty package named "MyFirstBimlPackage.dtsx" will appear in your project.
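The wrapper convention applies in the same way to other root-level collections. As a minimal sketch, the file below places a connection collection next to a package collection that holds two packages. The names StagingDb and MySecondBimlPackage and the connection string are hypothetical and are not part of this lesson's project.

```xml
<Biml xmlns="http://schemas.varigence.com/biml.xsd">
    <!-- Each collection gets its own wrapper element, even when it holds a single item -->
    <Connections>
        <!-- Hypothetical connection: replace the connection string with your own -->
        <OleDbConnection Name="StagingDb" ConnectionString="Provider=SQLNCLI11;Data Source=localhost;Initial Catalog=Staging;Integrated Security=SSPI;" />
    </Connections>
    <Packages>
        <Package Name="MyFirstBimlPackage"></Package>
        <!-- Hypothetical second package, added to the same wrapper -->
        <Package Name="MySecondBimlPackage"></Package>
    </Packages>
</Biml>
```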

Let's make the package a bit more interesting by adding two new tasks (one container and one data flow) to our package. Note that the collection element does not always match the name of its child elements, particularly when a collection is capable of storing children of many different types. Consequently the collection is <Tasks> and its children can be any of the tasks supported by SSIS:

<Biml xmlns="http://schemas.varigence.com/biml.xsd">
    <Packages>
        <Package Name="MyFirstBimlPackage">
            <Tasks>
                <Dataflow Name="Dataflow1" />
                <Container Name="Container1" />
            </Tasks>
        </Package>
    </Packages>
</Biml>

Go ahead and build that code to generate an updated package. So far so good. It should look like this:

[Image: First Biml Package]

Referencing Objects by Name

Suppose we'd like to enhance this package by changing the simple sequence Container to a ForEachFromVariable looping container. In case you're not familiar with it, this new type of container will run one time for each item in a collection stored inside an SSIS variable. Consequently, we need to add both the container and a variable for that container to iterate over.

Before we change the container type, let's start by adding the variable, which works just as you would expect:

<Biml xmlns="http://schemas.varigence.com/biml.xsd">
    <Packages>
        <Package Name="MyFirstBimlPackage">
            <Tasks>
                <Dataflow Name="Dataflow1" />
                <Container Name="Container1">
                    <Variables>
                        <Variable Name="LoopVariable" DataType="String">abc</Variable>
                    </Variables>
                </Container>
            </Tasks>
        </Package>
    </Packages>
</Biml>

Now we can change the Container to a ForEachFromVariableLoop. But how do we set the ForEachFromVariableLoop container to iterate over our new variable? Biml enables you to reference objects by their name, so we can just write:

<Biml xmlns="http://schemas.varigence.com/biml.xsd">
    <Packages>
        <Package Name="MyFirstBimlPackage">
            <Tasks>
                <Dataflow Name="Dataflow1" />
                <ForEachFromVariableLoop Name="Container1" VariableName="User.LoopVariable">
                    <Variables>
                        <Variable Name="LoopVariable" DataType="String">abc</Variable>
                    </Variables>
                </ForEachFromVariableLoop>
            </Tasks>
        </Package>
    </Packages>
</Biml>

This will result in the following package:

[Image: Biml Simple Loop Container]

You might have noticed that the variable was referenced as "User.LoopVariable". As in BIDS/SSDT, SSIS variables are referenced using their namespace and their name. Also as in BIDS/SSDT, the default namespace for all SSIS variables is "User". You can override this by using the "Namespace" attribute on the variable definition.
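As a rough sketch of the override, the variable below is placed in a hypothetical "Custom" namespace. The example assumes the reference keeps the same Namespace.Name pattern, so VariableName changes along with the namespace.

```xml
<Biml xmlns="http://schemas.varigence.com/biml.xsd">
    <Packages>
        <Package Name="MyFirstBimlPackage">
            <Tasks>
                <Dataflow Name="Dataflow1" />
                <!-- Assumption: the reference follows the Namespace.Name pattern used for "User" -->
                <ForEachFromVariableLoop Name="Container1" VariableName="Custom.LoopVariable">
                    <Variables>
                        <!-- "Custom" is a hypothetical namespace that replaces the default "User" -->
                        <Variable Name="LoopVariable" Namespace="Custom" DataType="String">abc</Variable>
                    </Variables>
                </ForEachFromVariableLoop>
            </Tasks>
        </Package>
    </Packages>
</Biml>
```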

You may have noticed that the attribute we used (VariableName) ends with "Name". This is actually a pattern in Biml and brings us to our second Biml language convention:

"Attribute names will end with 'Name' if and only if those attributes are referencing some other object by name."

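The same convention shows up outside of SSIS packages. The sketch below chains relational objects together by name: ConnectionName, DatabaseName, and SchemaName each point at another object. All object names and the connection string are hypothetical.

```xml
<Biml xmlns="http://schemas.varigence.com/biml.xsd">
    <Connections>
        <!-- Hypothetical connection: replace the connection string with your own -->
        <OleDbConnection Name="StagingDb" ConnectionString="Provider=SQLNCLI11;Data Source=localhost;Initial Catalog=Staging;Integrated Security=SSPI;" />
    </Connections>
    <Databases>
        <!-- ConnectionName references the connection defined above -->
        <Database Name="Staging" ConnectionName="StagingDb" />
    </Databases>
    <Schemas>
        <!-- DatabaseName references the database defined above -->
        <Schema Name="dbo" DatabaseName="Staging" />
    </Schemas>
    <Tables>
        <!-- SchemaName references the schema, written here together with its database -->
        <Table Name="Customer" SchemaName="Staging.dbo">
            <Columns>
                <Column Name="CustomerId" DataType="Int32" />
            </Columns>
        </Table>
    </Tables>
</Biml>
```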
References with ScopedName

Not all name references can be so simple, though. Remember that it is possible to have multiple variables with the same name, provided that those variables are defined on different objects, such as a package and one of its tasks:

<Biml xmlns="http://schemas.varigence.com/biml.xsd">
    <Packages>
        <Package Name="MyFirstBimlPackage">
            <Variables>
                <Variable Name="LoopVariable" DataType="String">abc</Variable>
            </Variables>
            <Tasks>
                <Dataflow Name="Dataflow1" />
                <ForEachFromVariableLoop Name="Container1" VariableName="User.LoopVariable">
                    <Variables>
                        <Variable Name="LoopVariable" DataType="String">abc</Variable>
                    </Variables>
                </ForEachFromVariableLoop>
            </Tasks>
        </Package>
    </Packages>
</Biml>

In the example above, which instance of "LoopVariable" should Biml use as the variable to iterate over? This brings us to the notion of scope and scoped names. When you refer to an object by name, the Biml engine will automatically find the object with a matching name that is closest to the reference. In this case, since Container1's instance of LoopVariable is on the same object as the reference, while MyFirstBimlPackage's instance is on the parent object, the Biml compiler will use the Container1 instance. It's closer.

But what if I actually want to use the variable defined in MyFirstBimlPackage? To do so, we remove the ambiguity by using the ScopedName of the target variable. Think of ScopedName as being like multi-part names in SQL Server (i.e. where you can refer to a table using ServerName.DatabaseName.SchemaName.TableName in order to resolve the same type of ambiguity). Let's take a look at this in action:

<Biml xmlns="http://schemas.varigence.com/biml.xsd">
    <Packages>
        <Package Name="MyFirstBimlPackage">
            <Variables>
                <Variable Name="LoopVariable" DataType="String">abc</Variable>
            </Variables>
            <Tasks>
                <Dataflow Name="Dataflow1" />
                <ForEachFromVariableLoop Name="Container1" VariableName="MyFirstBimlPackage.User.LoopVariable">
                    <Variables>
                        <Variable Name="LoopVariable" DataType="String">abc</Variable>
                    </Variables>
                </ForEachFromVariableLoop>
            </Tasks>
        </Package>
    </Packages>
</Biml>

Just for reference, the full scoped name of the Container1 variable is MyFirstBimlPackage.Container1.User.LoopVariable. Note that scoped names can be arbitrarily long based on how deep in the tree an object is defined.

You might be wondering why we don't just require the full ScopedName for every reference. There are a couple of reasons for this. First, it means less typing when we can use a partial name, which from a developer perspective is usually a good thing. Second, it is more convenient in scenarios such as copy/paste. If you use a partial name, there is nothing to change when copying and pasting our original Container1 to a different package or perhaps into a new container task. If we had used the full ScopedName, we would have had to update every reference in Container1 to use the new package name.
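To make the copy/paste case concrete, here is a sketch in which Container1 has been pasted into a hypothetical second package named MySecondBimlPackage. Because the reference is a partial name, it should resolve to the variable on the pasted container without any edits.

```xml
<Biml xmlns="http://schemas.varigence.com/biml.xsd">
    <Packages>
        <!-- Hypothetical second package that received a pasted copy of Container1 -->
        <Package Name="MySecondBimlPackage">
            <Tasks>
                <!-- Pasted unchanged: the partial name finds the variable defined on this container -->
                <ForEachFromVariableLoop Name="Container1" VariableName="User.LoopVariable">
                    <Variables>
                        <Variable Name="LoopVariable" DataType="String">abc</Variable>
                    </Variables>
                </ForEachFromVariableLoop>
            </Tasks>
        </Package>
    </Packages>
</Biml>
```

Had the reference been written with a full scoped name that begins with MyFirstBimlPackage, it would have needed to be rewritten to begin with MySecondBimlPackage after pasting.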

One last note on ScopedName, and it's actually our third Biml language convention:

"Objects of the same type can never have the same ScopedName."

At first, that may seem strange. Why can objects of different types share a ScopedName? Why would I need that? And wouldn't it cause problems with ambiguous references? To answer the first question, just think of connections and packages. Since both are defined at the root level of the Biml file, their Name and their ScopedName are the same. If objects of different types were not permitted to share a ScopedName, then we could never have connections and packages (or any other root object) with the same name. Of course, we want to be able to do that.

But what about the second question? Doesn't this cause problems? As it turns out, no. All references in Biml are restricted by the type of object you are referencing. For example, VariableName must reference a variable. Consequently, there aren't any cases where having objects of differing type with the same ScopedName can actually cause conflicts.
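As an illustration, the sketch below gives a connection and a package the same name. The name Finance, the connection string, and the query are hypothetical. Because ConnectionName can only reference a connection, the reference inside the package is not ambiguous.

```xml
<Biml xmlns="http://schemas.varigence.com/biml.xsd">
    <Connections>
        <!-- A connection named "Finance" (hypothetical connection string) -->
        <OleDbConnection Name="Finance" ConnectionString="Provider=SQLNCLI11;Data Source=localhost;Initial Catalog=Finance;Integrated Security=SSPI;" />
    </Connections>
    <Packages>
        <!-- A package that shares the name "Finance" with the connection -->
        <Package Name="Finance">
            <Tasks>
                <!-- ConnectionName must reference a connection, so it cannot pick up the package -->
                <ExecuteSQL Name="CheckConnection" ConnectionName="Finance">
                    <DirectInput>SELECT 1</DirectInput>
                </ExecuteSQL>
            </Tasks>
        </Package>
    </Packages>
</Biml>
```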

Annotations

As your Biml programs become more complex, it is often useful to store additional metadata on some of your Biml objects. This metadata could be used for documentation or perhaps to store information that will be consumed by BimlScript code nuggets (which you'll learn more about later). We already know how to write XML comments from the "Intro to XML" lesson, and XML comments certainly work as you would expect with Biml. While XML comments are easy to read and write, they have distinct limitations. They are difficult to access programmatically, and don't provide any kind of categorization or indexing mechanism for cases where you might need to annotate a single object with multiple independent metadata items.

For these reasons, we designed the Biml language to provide a general purpose annotation feature. Any element in the entire Biml language, including even the root node, can be supplemented with a collection of annotation objects. These annotations can each have a type of Description or Tag. Description annotations are automatically used in Biml intelliprompt within Mist and are also emitted into the documentation that Mist automatically creates for your solution. Tag annotations can be used to quickly and easily associate a key/value pair with your Biml object. Let's take a look at an example of a package with both types of annotations applied:

<Biml xmlns="http://schemas.varigence.com/biml.xsd">
    <Packages>
        <Package Name="MyFirstBimlPackage">
            <Annotations>
                <Annotation AnnotationType="Description">This text will be used in documentation and Mist intellisense for MyFirstBimlPackage.</Annotation>
                <Annotation AnnotationType="Tag" Tag="SubjectMatter">Finance</Annotation>
                <Annotation AnnotationType="Tag" Tag="IsRestartable">True</Annotation>
            </Annotations>
        </Package>
    </Packages>
</Biml>

Note that for the Tag annotations, we use the Tag attribute to create a key that will later be used to look up the associated value. In the above example, we tagged the package with its subject matter and a boolean value indicating whether or not to use the "restartable" pattern. Other packages in the same solution would have the same tag keys but potentially with different values.
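A sketch of what that might look like with two packages follows. MySecondBimlPackage and its tag values are hypothetical. Both packages use the same tag keys, so later code can look up the same keys on either package and get a package-specific value.

```xml
<Biml xmlns="http://schemas.varigence.com/biml.xsd">
    <Packages>
        <Package Name="MyFirstBimlPackage">
            <Annotations>
                <Annotation AnnotationType="Tag" Tag="SubjectMatter">Finance</Annotation>
                <Annotation AnnotationType="Tag" Tag="IsRestartable">True</Annotation>
            </Annotations>
        </Package>
        <!-- Hypothetical second package: same tag keys, different values -->
        <Package Name="MySecondBimlPackage">
            <Annotations>
                <Annotation AnnotationType="Tag" Tag="SubjectMatter">Sales</Annotation>
                <Annotation AnnotationType="Tag" Tag="IsRestartable">False</Annotation>
            </Annotations>
        </Package>
    </Packages>
</Biml>
```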

In our lesson about Biml Utility Methods, we'll learn about how to easily retrieve our tag annotations from within BimlScript code.

Miscellanea

We've already covered most of the core concepts you need to know to start using Biml, but there are just a few more things that it will help to cover up front.

Required Properties

In Biml, the large majority of all properties are optional and will use their default value if not specified. When a required property is not specified, you will get an error in Mist or BIDSHelper.

Attributes vs. Children

Sometimes when you are looking for a particular bit of configuration in Biml, it's not obvious whether to look for it in an attribute or a child element. Over time, you'll get a feel for where everything lives, but in the meantime, here are a few guidelines:

  1. If a property value stands alone and does not require other properties to be valid, an attribute is used.
  2. If multiple properties must cooperate, a child is used.
  3. Key exceptions:
    • Data type modifiers (e.g. length, precision, scale, codepage)
    • Lesser-used SSIS task properties are mirrored for simplicity (e.g. ForceExecutionValue and ForceExecutionValueType)
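One way to read the first two guidelines is sketched below with an Execute SQL task. The connection, task name, and query are hypothetical, and the comments give an interpretation of the guidelines, not an official rule. The connection reference is a single stand-alone value, so it is an attribute. The query is configured through a child element.

```xml
<Biml xmlns="http://schemas.varigence.com/biml.xsd">
    <Connections>
        <!-- Hypothetical connection: replace the connection string with your own -->
        <OleDbConnection Name="StagingDb" ConnectionString="Provider=SQLNCLI11;Data Source=localhost;Initial Catalog=Staging;Integrated Security=SSPI;" />
    </Connections>
    <Packages>
        <Package Name="AttributesVsChildren">
            <Tasks>
                <!-- Stand-alone values (Name, ConnectionName) are attributes -->
                <ExecuteSQL Name="CountCustomers" ConnectionName="StagingDb">
                    <!-- The query is supplied through a child element -->
                    <DirectInput>SELECT COUNT(*) FROM dbo.Customer</DirectInput>
                </ExecuteSQL>
            </Tasks>
        </Package>
    </Packages>
</Biml>
```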

DataType

In many places in Biml, you need to specify a data type - e.g. table columns, SSIS variables, and SSAS measures. Each of the underlying Microsoft SQL Server services uses a different and sometimes incompatible type system. In Biml we have unified all of these to use System.Data.DbType, which is built into the Microsoft .NET Framework. A major benefit of this is that any code you write using common .NET Data Access APIs will work seamlessly with your Biml code. A table of common type mappings can be found on Cathrine Wilhelmsen's blog: http://www.cathrinewilhelmsen.net/2014/05/27/sql-server-ssis-and-biml-data-types/.
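As a short sketch, the table below declares columns using System.Data.DbType names. The table, its columns, and the connection string are hypothetical. The comments show the SQL Server types these values commonly correspond to. The length, precision, and scale modifiers appear as attributes next to DataType.

```xml
<Biml xmlns="http://schemas.varigence.com/biml.xsd">
    <Connections>
        <!-- Hypothetical connection: replace the connection string with your own -->
        <OleDbConnection Name="StagingDb" ConnectionString="Provider=SQLNCLI11;Data Source=localhost;Initial Catalog=Staging;Integrated Security=SSPI;" />
    </Connections>
    <Databases>
        <Database Name="Staging" ConnectionName="StagingDb" />
    </Databases>
    <Schemas>
        <Schema Name="dbo" DatabaseName="Staging" />
    </Schemas>
    <Tables>
        <Table Name="Invoice" SchemaName="Staging.dbo">
            <Columns>
                <!-- Int32 commonly corresponds to int -->
                <Column Name="InvoiceId" DataType="Int32" />
                <!-- String commonly corresponds to nvarchar -->
                <Column Name="CustomerName" DataType="String" Length="100" />
                <!-- AnsiStringFixedLength commonly corresponds to char -->
                <Column Name="CountryCode" DataType="AnsiStringFixedLength" Length="2" />
                <!-- Decimal commonly corresponds to decimal -->
                <Column Name="Amount" DataType="Decimal" Precision="18" Scale="2" />
                <!-- DateTime commonly corresponds to datetime -->
                <Column Name="InvoiceDate" DataType="DateTime" />
            </Columns>
        </Table>
    </Tables>
</Biml>
```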

Exploring Biml Further

As you write increasingly complex Biml programs, you will inevitably encounter questions about how to write a specific bit of Biml or where to find some configuration value. The following sections offer suggestions for how to most efficiently resolve these questions.

Use Your Tools

Code editing tools for Biml actually provide a tremendous resource for learning the language.

If you are using Mist, you can use the Mist visual designers (which are much like BIDS/SSDT) to make the desired changes to your Biml program. As you make changes in the visual designers, the Biml code is modified for you automatically.

If you have a table or SSIS package for which you would like to see the equivalent Biml code, you can use the Mist import functionality to do so.

Finally, using either BIDSHelper or Mist, you can use the intellisense-provided completion lists and quick info tooltips to quickly learn about the elements you are already using and browse the available options.

Use the Documentation

Biml is fully documented on the varigence.com website at https://varigence.com/Documentation/Language/Index. Starting at the root node, the Biml documentation describes every language element, including its purpose, its attributes, child elements, required vs. optional configuration, and much more.

Conclusion

Now that you've learned the basics of Biml syntax, in the upcoming lessons we will explore Biml syntax for relational modeling and SSIS development in more detail.

Comments

KiranA

2:31pm 01.06.16

Hi Scott Currie,

I am using Biml to create an SSIS package. I am using a Flat File Source for CSV. In the connection string it's mandatory to fill in FileFormat. What is the FileFormat for a CSV Flat File Source connection string?

Joseph88

12:23pm 04.13.16

I believe it's delimited

Rune

10:28am 06.28.16

Hi Scott

In the walkthrough you refer to Variable1 ("Just for reference, the full scoped name of the Container1 variable is MyFirstBimlPackage.Container1.User.Variable1") but in the example code name the variable "LoopVariable" (abc). Is this a typo or am I missing something!?

Br, Rune Bay

reallife

9:36am 04.19.19