
Introduction

If you regularly deal with code that needs extensive internationalization capabilities, chances are that you’ve used functionality from one of the ICU libraries before. Developed by the Unicode Consortium, ICU provides reliable, mature, and extensive implementations of all kinds of tools for internationalization and Unicode text operations. Traditionally, there have been two implementations of ICU: ICU4C, implemented in C, and ICU4J, implemented in Java. These libraries have been the gold standard in correct Unicode text handling and i18n for many years. But for some years now, the Unicode Consortium has been developing ICU4X, a relatively new implementation in Rust.

The focus of ICU4X is on availability on many platforms and in many programming languages. While older implementations like ICU4C and ICU4J are very mature and currently provide more functionality than ICU4X, these libraries have a very large code size and runtime memory footprint, making them infeasible to use in resource-constrained environments like web browsers or mobile and embedded devices. ICU4X takes care to reduce library code size and provides additional facilities to optimize the code size of both the library itself and the Unicode data shipped with an application.

In this article, I will provide an overview of what ICU4X can do and how to do it. If you’ve worked with other ICU implementations before, much of this will probably feel familiar. If, on the other hand, you’ve never come into contact with ICU, this article should give you a good introduction to performing various Unicode text operations using ICU4X.

Prerequisites

I will be showing a lot of code examples of how to use ICU4X in Rust. While it should not be strictly necessary to understand Rust to follow the basics of what’s going on, some familiarity with the language will definitely help with the finer details. If you’re unfamiliar with Rust and want to learn more, I recommend The Rust Book as an introduction.

Throughout the examples I’ll be referring to various functions and types from ICU4X without showing their types in full detail. Feel free to open the API documentation alongside this article to look up any types for the functions mentioned.

Test setup

If you want to run the examples for yourself, I recommend setting up a cargo project with the appropriate dependency:

$ cargo new --bin icu4x-blog
$ cd icu4x-blog
$ cargo add icu

This initializes a basic Cargo.toml and src/main.rs. Now you can paste any example code into the generated main function inside main.rs and run your examples using cargo run:

$ cargo run
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.02s
     Running `target/debug/icu4x-blog`
Hello, world!

For now this only outputs the default “Hello, world!” message generated by cargo. So let’s go on to add our own examples.

Locales

The behavior of some of ICU4X’s operations depends on linguistic or cultural context. When it does, we need to specify what linguistic or cultural background we want. We do this in the form of so-called Locales. At its core, a locale is identified by a short string denoting a language and region. Locales usually look something like “en-US” for an American English locale, or “de-AT” for a German language locale as spoken in Austria.

Locales don’t do anything exciting on their own. They only tell other operations how to behave, so construction is basically the only thing we do with Locales. There are two main ways to construct a locale. We can use the locale! macro to construct and validate a static Locale like this:

let en_us = icu::locid::locale!("en-US");
println!("{en_us}");

Or we can try to parse a locale from a string at runtime:

let de_at = "de-AT".parse::<icu::locid::Locale>().unwrap();
println!("{de_at}");

Note that parsing a locale can fail on invalid inputs. This is encoded by the parse function returning a Result<Locale, ParserError>. In the example above we use unwrap to ignore the possibility of an error, which will panic on actual invalid inputs:

let invalid_locale = "Invalid!".parse::<icu::locid::Locale>().unwrap();
println!("{invalid_locale}");

Taken together, these examples will produce the following output:

$ cargo run
[...]
en-US
de-AT

thread 'main' panicked at src/main.rs:8:67:
called `Result::unwrap()` on an `Err` value: InvalidLanguage
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

In practical scenarios, you will probably want to detect locales dynamically at runtime via some standard mechanism provided by your operating system or some other execution platform. Unfortunately, there is currently no standardized way to detect ICU4X locales from external sources, but progress toward implementing such a solution is tracked in this issue.

Now that we’ve looked at how to construct Locales, let’s look at some operations that need locales to function.

Collation

The first operation we’re going to look at is collation. You’re probably familiar with the concept of comparing strings lexicographically. In Rust, the str type already implements the Ord trait, allowing easy lexicographic comparison and sorting of strings. However, not all languages and cultures agree on the order letters should be sorted in. As an example, in Germany the letter Ä is usually sorted right after A, while in Sweden the letter Ä is usually sorted after Z. Rust’s standard method of comparing strings does not take these regional differences into account. ICU4X provides us with collation functionality to compare and sort strings with these cultural differences in mind.

Construction

The first step for doing so is to create a Collator like this:

let de_de = icu::locid::locale!("de-DE");
let collator_de = icu::collator::Collator::try_new(&de_de.into(), Default::default()).unwrap();

The first parameter is the locale we want collation for. Or technically, it’s a DataLocale, because that’s what try_new wants. The difference doesn’t need to concern us too much right now. Just know that we can convert a Locale to a DataLocale using .into(). The second parameter is a CollatorOptions structure, which we could use to specify more specific options for the collation. We won’t look at the specific options here and instead just use the defaults, but check out the API documentation if you’re curious about what options you can specify. Finally, we unwrap the Collator, since creating it can fail in cases where no collation data for the given locale could be found. We’ll talk about this possibility later when discussing data handling.

Now that we have one collator for the de-DE locale, let’s build another for Swedish (sv-SE):

let sv_se = icu::locid::locale!("sv-SE");
let collator_sv = icu::collator::Collator::try_new(&sv_se.into(), Default::default()).unwrap();

Usage

Now that we have built some collators, let’s sort some strings with the standard Rust comparison and different locales to see the different results:

let mut strings = ["Abc", "Äbc", "ZYX"];
strings.sort_by(Ord::cmp);
println!("Rust default sorted: {strings:?}");
strings.sort_by(|a, b| collator_de.compare(a, b));
println!("Collated de-DE: {strings:?}");
strings.sort_by(|a, b| collator_sv.compare(a, b));
println!("Collated sv-SE: {strings:?}");

This produces the following output:

$ cargo run
[...]
Rust default sorted: ["Abc", "ZYX", "Äbc"]
Collated de-DE: ["Abc", "Äbc", "ZYX"]
Collated sv-SE: ["Abc", "ZYX", "Äbc"]

As predicted, the German collation sorted the strings differently than the Swedish collation. Incidentally, the default Rust order sorted these specific strings the same as the Swedish collation, though in practice you shouldn’t rely on coincidences like this and should always use the correct collation when sorting strings for display purposes.

Calendars

Sometimes it’s easy to forget, but not all cultures use the same calendar system. And even different cultures sharing the same calendar system might use different formats to represent dates. ICU4X provides support for converting between representations in different calendars and formatting them according to local custom. Besides the Gregorian calendar, popular in most regions of the world, and the ISO calendar often used for technical purposes, many other calendars, such as the Japanese, Ethiopian, or Indian calendars, are supported. However, no functionality for retrieving the current time is provided at the moment, so in real applications you will have to convert from some other representation first.

Construction

In the next example, we have a known date given as an ISO date that we want to display in some locale:

let iso_date = icu::calendar::Date::try_new_iso_date(1978, 3, 8).unwrap();

Next we create a DateFormatter:

let local_formatter = icu::datetime::DateFormatter::try_new_with_length(
    &icu::locid::locale!("th").into(),
    icu::datetime::options::length::Date::Medium,
)
.unwrap();

We will use the formatter to format dates into a locale-specific textual representation. During creation we get to pick a locale (or rather a DataLocale again). We’re picking the Thai locale, because unlike most of the world it uses the Buddhist calendar instead of the Gregorian calendar. We also get to pick a format length that gives us some control over the length of the date format. We use the medium length, which uses abbreviated month names if these are available in the locale, or numeric months otherwise.

Formatting

Now we just format the date and print it:

let local_format = local_formatter.format(&iso_date.to_any()).unwrap();
println!("{local_format}");

Which gives us this output:

$ cargo run
[...]
8 มี.ค. 2521

If, like me, you’re not well versed in the Buddhist calendar and Thai month names, this probably won’t tell you much. But that’s exactly the point of using an i18n library like ICU4X. We can use general operations that do the correct thing for any supported locale, without having to understand the intricacies of every specific locale.

When calling format you have to be careful to pass a date that belongs to a calendar suitable for the locale of the formatter. Even though its parameter type suggests that a date from any calendar can be used, the operation only accepts dates from the ISO calendar or dates from the correct calendar for that locale (i.e. in this example we could have also passed a date that was already represented in the Buddhist calendar). In this case, we used an ISO date, which is always accepted. If you have a date in an entirely different calendar it gets more complicated: you need to convert your date to the correct target calendar explicitly and then pass it to format. For this you obtain the needed calendar using AnyCalendar::new_for_locale and do the conversion using Date::to_calendar.
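Such an explicit conversion could look roughly like this (a minimal sketch reusing iso_date and local_formatter from above, and assuming the same 1.x API as the rest of this article):

// Obtain the calendar preferred by the target locale ("th" uses the Buddhist calendar)...
let calendar = icu::calendar::AnyCalendar::new_for_locale(&icu::locid::locale!("th").into());
// ...convert the ISO date into that calendar explicitly...
let converted = iso_date.to_calendar(calendar);
// ...and format the converted date as before.
println!("{}", local_formatter.format(&converted).unwrap());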

Normalization

Unicode texts are represented as a sequence of numbers called code points. But not every code point has its own atomic meaning. Some sequences of code points combine into groups to represent more complex characters. Due to this complexity, it is possible in many cases that different sequences of code points represent the same sequence of semantic characters. As an example, the letter Ä can be represented as the single code point U+00C4 or as the sequence of code points U+0041 U+0308 (an A followed by combining two dots above). This has implications when we want to compare strings for equality. Naively, we might compare strings by checking whether each of their code points is equal. But then strings containing different code points would compare as different, even though they contain semantically identical characters.

To deal with this situation, ICU4X gives us string normalization. The idea is as follows: before comparing strings to each other, we “normalize” each string. Normalization transforms the string into a normalized representation, thereby ensuring that all strings that are semantically equal also have the same normalized representation. This means that once we have normalized the strings we want to compare, we can simply compare the resulting strings by code point to determine if the original strings were semantically the same.

Normalization forms

Before we can perform this normalization, we need to understand that there are multiple forms of normalization. These forms are differentiated by two properties. On one axis, they can be composing or decomposing. On the other axis, they can be canonical or compatible.

Composed normalization forms ensure that the normalized form has as few code points as possible, e.g. for the letter Ä the single code point form would be used. Decomposed normalization, on the other hand, always chooses the available representation requiring the most code points, e.g. for the letter Ä the two code point form would be used. With composed normalization we need less storage space to store the normalized form. However, composed normalization is also usually slower to perform than decomposed normalization, because internally composed normalization first has to run decomposed normalization and then compress the result. As a rule of thumb, it is usually recommended to use composed normalization when the normalized strings are stored on disk or sent over the network, and decomposed normalization when the normalized form is only used internally within an application.

Canonical normalization only considers different code point representations of the same characters to be equal. Compatible normalization goes a step further and considers characters that convey the same meaning, but differ in representation, to be equal. As an example, under compatible normalization the characters “2”, “²” and “②” are all considered equal, whereas under canonical normalization they are different. Compatible normalization can be appropriate when normalizing identifiers such as usernames to detect close-but-different lookalikes.

Taking all of this together, this gives us four different possible forms of normalization:

NFC: composing and canonical
NFD: decomposing and canonical
NFKC: composing and compatible
NFKD: decomposing and compatible

Performing normalization

Once we have decided on a normalization form to use, actually performing the normalization is easy. Here’s an example using NFD normalization:

let string1 = "\u{00C4}";
let string2 = "\u{0041}\u{0308}";

let rust_equal = string1 == string2;

let normalizer = icu::normalizer::DecomposingNormalizer::new_nfd();
let normalized1 = normalizer.normalize(string1);
let normalized2 = normalizer.normalize(string2);
let normalized_equal = normalized1 == normalized2;

println!(
    "1: {string1}, 2: {string2}, rust equal: {rust_equal}, normalized equal: {normalized_equal}"
)

$ cargo run
[...]
1: Ä, 2: Ä, rust equal: false, normalized equal: true

As we can see, string1 and string2 look the same when printed, but the == operator doesn’t consider them equal. However, after normalizing both strings, the results do compare equal.

NFKD normalization can be used by constructing the normalizer using DecomposingNormalizer::new_nfkd. NFC and NFKC are accessible using ComposingNormalizer::new_nfc and ComposingNormalizer::new_nfkc respectively.
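As a quick sketch of the difference between the compatible and canonical forms, here the “²” example from above is run through both composing normalizers:

let nfkc = icu::normalizer::ComposingNormalizer::new_nfkc();
let nfc = icu::normalizer::ComposingNormalizer::new_nfc();
// Compatible normalization maps the superscript two to the plain digit...
println!("NFKC equals \"2\": {}", nfkc.normalize("\u{00B2}") == "2");
// ...while canonical normalization keeps the two characters distinct.
println!("NFC equals \"2\": {}", nfc.normalize("\u{00B2}") == "2");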

Segmentation

When we look at Unicode texts, we’ll often find that they aren’t only made up of individual code points, but rather of larger constructs consisting of multiple code points, such as words or lines. When processing text, it is often necessary to recognize where the boundaries between these individual pieces are. In ICU4X this process is called segmentation, and ICU4X provides us with four different types of segments to recognize: graphemes, words, sentences, and lines. The process of segmenting is very similar for each one, but each of them also has its own quirks, so we’ll look at each of them in turn.

Graphemes

As previously mentioned, some code points combine with other code points thereby gaining a different meaning than each code point would have individually. If we break strings apart between two combined code points, the code points can no longer combine and thus revert to their individual meaning. Here’s an example of such unintentional changes in meaning happening:

let string1 = "\u{61}\u{308}\u{6f}\u{308}\u{75}\u{308}";
let string2 = "stu";
println!("string1: {string1}, string2: {string2}");

let (split1, split2) = string1.split_at(4);
println!("split1: {split1}, split2: {split2}");
println!("combined: {string2}{split2}");

$ cargo run
[...]
string1: äöü, string2: stu
split1: äo, split2: ̈ü
combined: stüü

First, note that the output of split1 and split2 shows that what was previously an ö has now been split into an o and a loose pair of two dots. Even worse: when we combine string2 and split2 in a single output, the dots at the start of split2 combine with the last character of string2, forming an extra “ü” that was never intended to exist.

Graphemes to the rescue

So how do we know where it is safe to split a string without altering the meaning of its contained characters? For this purpose, Unicode defines the concept of grapheme clusters: a grapheme cluster is a sequence of code points that has a single meaning as a whole, but is unaffected by the meaning of code points around it. As long as we’re careful to split strings only on the boundaries between grapheme clusters, we can be sure not to inadvertently change the semantics of characters contained in the string. Similarly, when we build a user interface for text editing or text selection, we should be careful to present a single grapheme cluster to the user as a single unbreakable unit.

To find out where the boundaries between grapheme clusters are, ICU4X gives us the GraphemeClusterSegmenter. Let’s look at how it would have segmented our string from earlier:

let string = "\u{61}\u{308}\u{6f}\u{308}\u{75}\u{308}";
println!("string: {string}");
let grapheme_boundaries: Vec<usize> = icu::segmenter::GraphemeClusterSegmenter::new()
    .segment_str(string)
    .collect();
println!("grapheme boundaries: {grapheme_boundaries:?}");

$ cargo run
[...]
string: äöü
grapheme boundaries: [0, 3, 6, 9]

As we can see, the segment_str function returns an iterator over indices where boundaries between grapheme clusters are located. Naturally, the first index is always 0 and the last index is always the end of the string. We can also see that index 4, where we split our string in the last example, is not a boundary between grapheme clusters, which is why our split caused the change in meaning we observed. Had we instead split the string at index 3 or 6, we would not have had the same problems.

Words

Sometimes it is helpful to separate a string into its individual words. For this purpose, we get the aptly named WordSegmenter. So let’s get right into it:

let string = "Hello world";
println!("string: {string}");
let word_boundaries: Vec<usize> = icu::segmenter::WordSegmenter::new_auto()
    .segment_str(string)
    .collect();
println!("word boundaries: {word_boundaries:?}");

$ cargo run
[...]
string: Hello world
word boundaries: [0, 5, 6, 11]

So far this is very similar to the GraphemeClusterSegmenter we’ve seen before. But what if we want the words themselves and not only their boundaries? We can just iterate over windows of two boundaries at a time and slice the original string:

let words: Vec<&str> = word_boundaries
    .windows(2)
    .map(|bounds| &string[bounds[0]..bounds[1]])
    .collect();
println!("words: {words:?}");

$ cargo run
[...]
words: ["Hello", " ", "world"]

This looks better. It gives us the two words we expect. It also gives us the white space between the words. If we do not want that, we can ask the WordSegmenter to tell us whether a given boundary comes after a real word or just some white space and filter on that:

let word_boundaries: Vec<(usize, icu::segmenter::WordType)> =
    icu::segmenter::WordSegmenter::new_auto()
        .segment_str(string)
        .iter_with_word_type()
        .collect();
println!("word boundaries: {word_boundaries:?}");
let words: Vec<&str> = word_boundaries
    .windows(2)
    .filter_map(|bounds| {
        let (start, _) = bounds[0];
        let (end, word_type) = bounds[1];
        if word_type.is_word_like() {
            Some(&string[start..end])
        } else {
            None
        }
    })
    .collect();
println!("words: {words:?}");

$ cargo run
[...]
word boundaries: [(0, None), (5, Letter), (6, None), (11, Letter)]
words: ["Hello", "world"]

In case you were wondering why the constructor for WordSegmenter is called new_auto, it’s because there are multiple word segmentation algorithms to choose from. There are also new_dictionary and new_lstm, and not every algorithm works equally well for every writing system. new_auto is a good choice in the general case, as it automatically picks a good implementation based on the actual data encountered in the string.

Sentences

If we want to break strings into sentences, SentenceSegmenter does just that. There’s nothing special about it, so let’s get right into it:

let string = "here is a sentence. This is another sentence.";
println!("string: {string}");
let sentence_boundaries: Vec<usize> = icu::segmenter::SentenceSegmenter::new()
    .segment_str(string)
    .collect();
println!("sentence boundaries: {sentence_boundaries:?}");
let sentences: Vec<&str> = sentence_boundaries
    .windows(2)
    .map(|bounds| &string[bounds[0]..bounds[1]])
    .collect();
println!("sentences: {sentences:?}");

$ cargo run
[...]
string: here is a sentence. This is another sentence.
sentence boundaries: [0, 20, 45]
sentences: ["here is a sentence. ", "This is another sentence."]

No surprises there, so let’s move on.

Lines

The LineSegmenter identifies boundaries at which strings may be split into multiple lines. Let’s see an example:

let string = "The first line.\nThe\u{a0}second line.";
println!("string: {string}");
let line_boundaries: Vec<usize> = icu::segmenter::LineSegmenter::new_auto()
    .segment_str(string)
    .collect();
println!("line boundaries: {line_boundaries:?}");
let lines: Vec<&str> = line_boundaries
    .windows(2)
    .map(|bounds| &string[bounds[0]..bounds[1]])
    .collect();
println!("lines: {lines:?}");

$ cargo run
[...]
string: The first line.
The second line.
line boundaries: [0, 4, 10, 16, 28, 33]
lines: ["The ", "first ", "line.\n", "The\u{a0}second ", "line."]

This gives us more individual “lines” than we might have anticipated. That’s because the LineSegmenter not only reports boundaries at line breaks already contained in the string, but also at places where a soft line break could be inserted. This can be very useful if you want to wrap a long string over multiple lines.

If you want to differentiate whether a given boundary is a hard line break contained in the string or just an opportunity for an optional line break, you can inspect the line break property of the character right before the boundary using icu::properties::maps::line_break.
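Here’s a minimal sketch of that check, assuming the icu::properties API of the same 1.x release line used throughout this article:

let line_break = icu::properties::maps::line_break();
let is_hard_break = |s: &str, boundary: usize| -> bool {
    // Classify a boundary as "hard" if the character right before it
    // carries a mandatory line break property.
    s[..boundary].chars().next_back().map_or(false, |c| {
        let property = line_break.get(c);
        property == icu::properties::LineBreak::MandatoryBreak
            || property == icu::properties::LineBreak::CarriageReturn
            || property == icu::properties::LineBreak::LineFeed
            || property == icu::properties::LineBreak::NextLine
    })
};
let string = "The first line.\nThe\u{a0}second line.";
println!("hard break at 16: {}", is_hard_break(string, 16)); // true: follows '\n'
println!("hard break at 4: {}", is_hard_break(string, 4)); // false: follows ' '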

Case Mapping

When processing Unicode texts, there is sometimes the need to transform letters between lower case and upper case. ICU4X gives us various tools for this, so let’s look at each of them.

UPPERCASE and lowercase

Lowercasing and uppercasing are very simple operations on the surface. They do similar things to Rust’s built-in str::to_lowercase and str::to_uppercase methods. So let’s see why ICU4X has separate support for them:

let string = "AaBbIıİi";
println!("string: {string}");

let locale = icu::locid::locale!("de-DE");

let cm = icu::casemap::CaseMapper::new();
let lower = cm.lowercase_to_string(string, &locale.id);
let upper = cm.uppercase_to_string(string, &locale.id);
println!("lower: {lower}, upper: {upper}");

$ cargo run
[...]
string: AaBbIıİi
lower: aabbiıi̇i, upper: AABBIIİI

So far this looks like the familiar lowercasing and uppercasing operations from most languages’ standard libraries. But note that we had to provide locale.id to run these operations. The twist here is that the rules for lowercasing and uppercasing can vary by language, which is reflected in ICU4X’s variants of these operations. Observe how the result changes if we use the locale tr-TR instead of de-DE:

$ cargo run
[...]
string: AaBbIıİi
lower: aabbııii, upper: AABBIIİİ

With ICU4X we don’t need to know the details of how different lowercase and uppercase letters pair up in different languages. As long as we pass the correct locale, ICU4X will do the correct thing.

Note, however, that the uppercasing and lowercasing operations are only intended for display purposes. If you want to compare strings case-insensitively, you want case folding instead, which we will look at later.

Titlecasing

Titlecasing is the process of uppercasing the first letter of a segment and lowercasing all other characters. So for example, if we wanted to titlecase every word in a string, we would first use a WordSegmenter to extract every word and then use a TitlecaseMapper to perform the titlecasing on every word.

let string = "abc DŽ 'twas words and more wORDS";
println!("string: {string}");

let locale = icu::locid::locale!("de-DE");

let cm = icu::casemap::TitlecaseMapper::new();
let word_segments: Vec<usize> = icu::segmenter::WordSegmenter::new_auto()
    .segment_str(string)
    .collect();

let titlecased: String = word_segments
    .windows(2)
    .map(|bounds| {
        let word = &string[bounds[0]..bounds[1]];
        cm.titlecase_segment_to_string(word, &locale.id, Default::default())
    })
    .collect();
println!("titlecased: {titlecased}");

$ cargo run
[...]
string: abc DŽ 'twas words and more wORDS
titlecased: Abc Dž 'Twas Words And More Words

Again we had to provide &locale.id to specify which language-specific rules to obey during case transformations. Additionally, we can pass other options as a third parameter. Here we’ve used the default options, but feel free to check out the API documentation to see what other options are supported.

Note how DŽ was transformed to Dž, even though it is a single letter whose regular uppercase form is DŽ. This is because each character has separate uppercase and titlecase forms, which just happen to be the same for most Latin characters. Also note that 'twas was transformed to 'Twas. This is because the TitlecaseMapper titlecases the first letter in a word and skips over non-letter characters at the start of the word when doing so.

Case folding

Sometimes we want to tell whether two strings are equal while ignoring differences in casing. Traditionally this has been done by transforming both strings to lower case or upper case to eliminate differences in casing and then comparing the results. With Unicode strings, for some characters simple lowercasing or uppercasing isn’t enough to eliminate all differences in casing. As an example, the German letter ß uppercases to SS, but there’s also an uppercase version of ß: ẞ, which uppercases to itself, but lowercases to a regular ß. To consistently eliminate all casing differences, we need to map SS, ß, and ẞ all to the same output. Luckily for us, ICU4X gives us the case folding operation, which promises to do just that. Let’s see it in action:

let string = "SSßẞ";
println!("string: {string}");

let locale = icu::locid::locale!("de-DE");

let cm = icu::casemap::CaseMapper::new();
let upper = cm.uppercase_to_string(string, &locale.id);
let lower = cm.lowercase_to_string(string, &locale.id);
let folded = cm.fold_string(string);
println!("upper: {upper}, lower: {lower}, folded: {folded}");

$ cargo run
[...]
string: SSßẞ
upper: SSSSẞ, lower: ssßß, folded: ssssss

As we see, in the folded string all the different versions of ß have been consistently turned into ss, which successfully eliminates all casing differences. It also means that a single ß would be considered equal to a lowercase ss, which we might not have considered equal otherwise. This is a kind of ambiguity that is hard to avoid when comparing strings case-insensitively.

Note that we didn’t have to specify any locale or language for the case folding operation. This is because case folding is often used for identifiers that are supposed to behave identically regardless of the lingual context they’re used in. The case folding operation tries to use rules that work best across most languages. However, they don’t work perfectly for Turkic languages. To deal with this, there’s an alternative case folding operation fold_turkic_string just for Turkic languages. In most cases you’ll probably want to use the general folding operation, unless you’re really sure you need the special behavior for Turkic languages.

Case-insensitive comparison

Given the case folding operation, we could implement a function to compare two strings case-insensitively like this:

fn equal_ci(a: &str, b: &str) -> bool {
    let cm = icu::casemap::CaseMapper::new();
    cm.fold_string(a) == cm.fold_string(b)
}
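With the folding behavior we saw above, this function considers, for example, a single ß equal to a double s (a quick usage check):

assert!(equal_ci("Straße", "STRASSE"));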

Data handling

So far we’ve looked at various operations that work correctly across a vast number of locales, over strings made up of a huge number of valid code points. On the surface, these operations were relatively easy to use, and most of the time we only needed to specify our input and a desired locale to get the correct result. However, in the background ICU4X needs a lot of data about different locales and Unicode characters to do the right thing in every situation. But so far, we never had to concern ourselves with this data at all.

So where does ICU4X get all this data from? In the default configuration we’ve been using so far, the data is shipped as part of the library and compiled directly into our application executable. This has the benefit that we don’t need to worry about shipping the data along with the binary and getting access to it at runtime, as the data is always included in the binary. But it comes at the cost of sometimes dramatically increased binary sizes. Since data for a large number of locales is included by default, we’re talking about tens of megabytes of data being included in the binary.

Alternatives to embedded data

Since ICU4X is designed to run even in minimalist environments, such as embedded devices, forcing this increased application binary size on every application would be unacceptable. Instead, ICU4X provides multiple ways to access the relevant data. Besides using the included default set of data, you can also generate your own set of data using icu4x-datagen. This allows you to reduce the included data from the beginning, either by limiting the number of locales or by limiting the functionalities supported by the data. Furthermore, you have the choice between compiling this data directly into your application binary or putting it into separate data files that your application then loads at runtime.
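For example, generating a data blob limited to German and English might look roughly like this (a sketch; the exact flag names vary between datagen versions, so check the icu4x-datagen documentation for your release):

$ cargo install icu4x-datagen
$ icu4x-datagen --keys all --locales de en --format blob --out my_data_blob.postcard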

Reducing the set of available runtime data of course comes with the benefit of reducing the data size you need to ship with your application. On the other hand, it has the drawback of reducing the set of operations you can successfully run at runtime. Each bit of data you remove can have the effect of making some operation fail if no data is available to perform that operation with the requested locale. As with many other things, reducing the data size has obvious benefits, but it is always a tradeoff. In the examples above we usually used unwrap to ignore the possibility of errors, but in a real application you’ll probably want more sophisticated error handling, like falling back to some non-failing behavior or at least reporting the error to the user.

I’ll avoid going through all the available options in detail and instead refer you to ICU4X’s official tutorial on data management. It explains all the supported ways to make the required data available to your application.

Conclusion

I hope this has given you a satisfactory overview of what ICU4X can do. As we have seen, a lot of functionality works well out of the box. In other areas, functionality is still lacking. For example, I mentioned earlier that there’s currently no comfortable way to detect the user’s preferred locale in a standard way from the execution environment. Another area where ICU4X is currently lagging behind its C and Java counterparts is translation support. ICU4C and ICU4J provide capabilities for formatting localized messages using MessageFormat, which ICU4X still lacks. Similarly, ICU4X doesn’t currently seem to have functionality to deal with resource bundles.

Even though ICU4X doesn’t have all the functionality you might expect of it yet, overall it seems like a good choice for those cases where it already provides all the required functionality. Given some more time, we may even see more and more of the missing functionality land in ICU4X.

ProxLB – Version 1.1.0 of the Advanced Loadbalancer for Proxmox Clusters is Ready!

Finally, it’s here, and it’s no April Fool’s joke: the long-awaited version 1.1.0 of ProxLB has been officially released! This new version features a complete code refactoring, making maintenance easier and laying the groundwork for future expansions. Additionally, numerous bugs have been fixed and more features have been implemented. ProxLB is the result of the dedication of our employee Florian Paul Azim Hoberg, better known as gyptazy, who has applied his knowledge and passion to create a powerful open-source solution for Proxmox clusters. We at credativ GmbH believe in the power of open-source software and support him by dedicating business time to this project.

Closing the gap

ProxLB fills the gap left by the absence of a Dynamic Resource Scheduler (DRS) in Proxmox. As a powerful load balancer, it intelligently migrates workloads or virtual machines (VMs) across all nodes in the cluster, ensuring optimal resource utilization. ProxLB takes CPU, memory, and disk usage into account to prevent over-provisioning and maximize performance.

Automatic maintenance mode handling

One of the standout features of ProxLB is its maintenance mode. When one or more nodes are placed in maintenance mode, all VMs and containers running on them are automatically moved to other nodes, ensuring the best possible resource utilization across the cluster. This allows for seamless updates, reboots, or hardware maintenance without disrupting ongoing operations.

Custom affinity rules

Furthermore, ProxLB offers extensive customization options through affinity and anti-affinity rules. Administrators can specify whether certain VMs should run together on the same node or be deliberately separated. This is particularly useful for high-availability applications or specialized workloads. Another practical feature is the ability to identify the optimal node for new guests. This function can be easily integrated into CI/CD pipelines using tools like Ansible or Terraform to automate deployments and further enhance cluster efficiency. You can see how this works with ProxLB and Terraform in this example.

ProxLB also stands out with its deep integration into the Proxmox API. It fully supports the Access Control List (ACL), eliminating the need for additional SSH access. This not only enhances security but also simplifies management.

Whether used as a one-time operation or in daemon mode, ProxLB is a flexible, transparent, and efficient cluster management solution. Thanks to its open-source license, users can customize the software to meet their specific needs and contribute to its further development.

Download

ProxLB can be installed in many different ways and can run inside a dedicated VM (even inside the Proxmox cluster), on bare metal, on a Proxmox node itself, or in containers like LXC or Docker. The project also provides ready-to-use container images that can be used directly with Docker or Podman. The project’s docs provide a more detailed overview of the different ways to install and use ProxLB and can be found right here. While the resources below allow for a quick start, you should consider switching to the project’s Debian-based repository for long-term use.

Type      Download
Debian Package       proxlb_1.1.0_all.deb
Container Image       cr.gyptazy.com/proxlb/proxlb:latest

Conclusion

With version 1.1.0, ProxLB lives up to its reputation as an indispensable tool for Proxmox administrators, especially for those transitioning from VMware. Try out the new version and experience how easy and efficient load balancing can be in your cluster! We are also happy to support you with the integration and operation of ProxLB in your cluster, as well as with all other Proxmox-related topics, including planning a migration from other hypervisor technologies to Proxmox!

From March 1st, 2025, the Mönchengladbach-based open source specialist credativ IT Services GmbH will once again operate as an independent company on the market. credativ GmbH was acquired by NetApp in May 2022 and integrated into NetApp Deutschland GmbH on February 1st, 2023. This step enabled the company to draw on extensive experience and a broader resource base. However, after intensive collaboration within the storage and cloud group, it has become clear that credativ can offer the best conditions for addressing the needs of its customers in an even more targeted manner thanks to its regained independence. The transfer of operations will be supported by all 46 employees.

“We have decided to take this step to focus on our core business areas and create the best possible conditions for further growth. This means maximum flexibility for our customers. We would like to thank the NetApp management for this extraordinary opportunity,” said David Brauner, Managing Director of credativ IT Services GmbH.

“The change is a testament to the confidence we have in the credativ team and their ability to lead the business towards a prosperous future,” explained Begoña Jara, Vice President of NetApp Deutschland GmbH in Germany.

What does this change mean for credativ customers?

As a medium-sized company, the open source service provider can rely on even closer collaboration and more direct communication with its customers. An agile structure should enable faster and more customised decisions, and thus a more flexible response to requests and requirements. Naturally, the collaboration with the various NetApp teams and their partner organisations will also continue as before.

credativ has been a service provider in the open source sector since 1999, with a strong focus on IT infrastructure, virtualisation, and cloud technologies. credativ also has a strong team that focuses on open source-based databases, such as PostgreSQL, and related technologies. In the next few weeks, the new company will be renamed credativ GmbH.

Introduction

Proxmox Virtual Environment (VE) is a powerful open-source platform for enterprise virtualization. It supports advanced Dynamic Memory Management features, including Kernel Samepage Merging (KSM) and Memory Ballooning, which can optimize memory usage and improve performance. This blog post evaluates the effectiveness of KSM and Memory Ballooning features in Proxmox VE using Linux virtual machines (VMs). We will set up a VM with Proxmox VE for a test environment, perform tests, and analyze the results to understand how these features can benefit virtualized environments. Additionally, we will have a look at the security concerns of enabling KSM and the risks associated with using ballooning, especially in database environments.

What’s KSM?

Kernel Samepage Merging (KSM) is a memory deduplication feature in the Linux kernel that scans for identical memory pages across different processes and merges them into a single page to reduce memory usage. It is particularly useful in virtualized environments where multiple VMs may have similar or identical data in memory, such as when running the same operating system or applications.

KSM was introduced long ago, in Linux kernel version 2.6.32 in 2009. That has not stopped developers from adding new KSM features, as the 6.x kernels show. You can find the changes here: Breakdown of changes to Kernel Samepage Merging (KSM) by Kernel Version. As you can see, the kernel developers are constantly adding new features for KSM to further improve its functionality.

The Linux kernel currently used in Proxmox VE is 6.8.x, for example. It supports the newly added “Smart Scan” feature, which we are going to test in this blog post.

What’s Memory Ballooning?

Memory Ballooning is a technique used in virtualized environments to dynamically adjust the memory allocation of VMs based on their current needs. A “balloon driver” within the guest VM allocates unused memory into a pool of memory (the “balloon”), allowing the hypervisor to reallocate memory resources to other VMs as needed. This helps optimize memory usage across the host system, ensuring that memory is efficiently utilized and not wasted on idle VMs.

Test Setup

To evaluate the KSM and ballooning features in Proxmox VE, we set up a test cluster consisting of a single node, which we operate within a VM with 16GB of RAM. That sample cluster then runs multiple Linux guest VMs on top of it to demonstrate the KSM and memory ballooning features.

The following picture shows an overview of our test VM setup, including the Proxmox VE host, the Linux guest VM template, and the Linux guest VMs.

Perform tests

We perform two sets of tests. First, we just evaluate KSM. Then, we perform another set of tests evaluating memory ballooning without KSM.

Guest VMs Setup for KSM Tests:

  1. We cloned 8 VMs out of our VM template, as you can see in the picture below. Each VM is configured with 2GB RAM and ballooning disabled.
  2. Next, we boot those 8 VMs up and start them with LXQt desktop auto-login, without triggering KSM yet. Here, we want to check how much memory each of those VMs consumes before applying any kind of reduction mechanism.

  3. As you can see, all 8 VMs consume 13154.1MB in total. The screenshot above was captured on our Proxmox VE host.

  4. Enable KSM Smart Scan by running the following command on the host:
    # echo "scan-time" > /sys/kernel/mm/ksm/advisor_mode
  5. Enable KSM:
    # echo 1 > /sys/kernel/mm/ksm/run
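To watch KSM’s progress, you can also read the counters the kernel exposes under /sys/kernel/mm/ksm on the host, for example:

# grep . /sys/kernel/mm/ksm/pages_shared /sys/kernel/mm/ksm/pages_sharing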

Observations on KSM Smart Scan

The KSM Smart Scan feature appears to be more efficient than the classic ksmtuned method, as it comes with optimizations for page scanning that skip pages if de-duplication was not successful in previous attempts. This significantly reduces the CPU time required for scanning pages, which is especially helpful once the system has reached a “steady state”. During our tests, we did not observe ksmd occupying significant system resources, showing that KSM Smart Scan can optimize memory usage with minimal overhead.

Test Results

  1. After a while, as KSM scans and merges pages, the used memory drops to 6770.1 MiB.
  2. We can also see the KSM sharing status in the Proxmox VE WebUI.

A significant reduction in memory usage was observed. Although there was a slight increase in CPU usage by ksmd during KSM operation, there was no significant degradation in VM performance. This indicates that KSM operates efficiently without imposing a heavy load on the system. The merging of identical pages resulted in better memory utilization, allowing more VMs to run on the same host without additional hardware.

Kernel Samepage Merging (KSM) in Windows VMs

KSM is a native feature in the Linux kernel that works at the hypervisor level, scanning memory pages across all VMs and merging identical pages into a single shared page. This process reduces the overall memory footprint of the VMs.

For Windows VMs, the hypervisor treats their memory similarly to Linux VMs, identifying and merging identical pages. This means that the benefits of KSM also extend to Windows VMs running on Proxmox VE, due to the fact that Proxmox itself runs Linux and therefore utilizes the KSM kernel feature no matter what OS the guest VMs on top of Proxmox VE are running.

Guest VMs Setup for Ballooning Tests:

Next, let’s have a look at memory ballooning in another test. To evaluate the ballooning features in Proxmox VE, we will repurpose the Proxmox VE environment used for the KSM tests with the following adjustments:

  1. Retain three VMs and remove the others.
  2. Enable Ballooning in each VM.
  3. Set the minimum memory to 2048MB and the maximum memory to 5120MB in each VM.
  4. Disable KSM.

To disable KSM manually, execute the following command:

# echo 2 > /sys/kernel/mm/ksm/run
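For reference, the memory limits from step 3 can also be set from the Proxmox CLI instead of the web interface (a sketch; 101 stands for the actual VM ID):

# qm set 101 --memory 5120 --balloon 2048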

The following picture shows an overview of our Ballooning test VMs setup:

Due to memory ballooning, we should now have more memory available for each VM. Let’s test this by using stress-ng to allocate 4GB of memory on each guest VM and hold the allocated memory for a number of seconds you may specify:

$ stress-ng --vm-bytes 4G -m 1 --vm-hang <seconds>

The --vm-hang <seconds> option specifies how many seconds the stressor hangs before unmapping memory.

OOM-Killer!

We observed the OOM-killer being triggered on the Proxmox VE host.

Having the OOM-killer triggered on the host is problematic. Allocating 5GB of memory to each VM resulted in excessive overcommitment, causing the OOM-killer to activate due to insufficient memory to handle the host’s workload.

A triggered OOM-killer is always problematic, but on the host it is even worse than within a guest VM, since you never know which VM gets terminated and killed, or at least it’s really hard to forecast.

One of the basic purposes of memory ballooning is to avoid triggering the OOM-killer on the host system, since that can cause more damage than an OOM-killer triggered within a specific VM.

Reduce Maximum Memory Configuration in VMs for Ballooning Tests

To address the overcommitment issue, let’s reduce the maximum memory configuration in each VM to 4GB.

  1. Adjust the maximum memory setting for each VM to 4GB.
  2. Boot up three VMs.

Next, we’ll use stress-ng to allocate 3GB of memory on each guest VM and then hang for a specified duration without CPU usage:

$ stress-ng --vm-bytes 3G -m 1 --vm-hang <seconds>

This is the top command output in the guest VM.

Check Memory Usage on the Host

After running the stress-ng test, we check the memory usage on the host:

The free memory on the host is now low. The third VM, which is trying to allocate memory, experiences very high CPU usage due to the limited resources available on the host.

After a while, we can observe the ballooning driver starting to reclaim memory from the guest VMs on the host. Each VM’s RES (occupied physical memory) was reduced:

The ballooning driver is now reclaiming memory from each guest VM to increase the available free memory on the host. This action helps to maintain the host’s workload but causes all other guest VMs to slow down due to reduced memory allocation.

Impact of Ballooning on Guest VMs

The slowed-down VMs eventually do not have enough available free memory to maintain their workloads. As a result, the OOM-killer is triggered inside the guest VMs:

All the VMs hang for a while, and then the OOM-killer triggers to terminate the stress-ng process. After this, the VMs return to their normal state, and there is sufficient available free memory on the host:

When Does Memory Stealing Get Triggered?

To determine when memory stealing gets triggered, let’s conduct another test. We will use the same stress-ng command to allocate 3GB of memory on two VMs.

Next, we will gradually allocate memory on the third VM, starting with 512MB and then incrementally adding another 512MB until we observe memory reclaiming being triggered.

As we gradually increase the memory allocation on the third VM, we monitor the host’s memory usage:

We observe that memory stealing is not yet triggered when the available free memory on the host reaches 2978.1MB (approximately 18.5% of the total memory).

Let’s allocate a bit more memory on the third VM to further reduce the available free memory on the host. We found that when the available free memory on the host reaches around 15% of the total memory, the ballooning driver starts stealing memory from the guest VMs:

At this point, we can see the memory allocated to the VMs being reduced and the CPU usage increasing significantly.

The memory stealing process continues until the available free memory on the host reaches 20% of the total memory again. After releasing the allocated memory from the third VM, we observe that the reclaiming process stops when the available free memory on the host reaches 20% of the total memory.

Visualizing the Ballooning Tests Results

The following picture illustrates the observations from our tests:

In this picture, you can see the following key points:

  1. More than 20% free available memory on the host: the initial memory allocation to the VMs, where each VM is configured to be able to allocate a maximum of 4GB of memory.
  2. Free available memory reaches 18.6% on the host: the first and second VMs have allocated their maximum of 4GB of memory. The incremental allocation of memory to the third VM begins, starting with 512MB and increasing in 512MB increments.
  3. Triggering memory stealing: the point at which the available free memory on the host drops to around 15% of the total memory, triggering the ballooning driver to reclaim memory from the guest VMs. The red color in the guest VMs indicates increased CPU usage as the ballooning driver steals memory, affecting the performance of the guest VMs.

Memory Ballooning in Windows VMs

Memory ballooning also works with Windows VMs in Proxmox VE via the Windows VirtIO drivers. You can find the driver ISO in the Proxmox wiki or download it directly from the upstream VirtIO drivers ISO.

Compared to Linux VMs

Memory hot plug is supported in Linux VMs, allowing the total amount of memory to change dynamically when the ballooning driver is active. This means that in Linux VMs, you can see the total memory allocation adjust in real-time as the ballooning driver works. Windows does not support memory hot plug in the same way. As a result, you won’t see the total amount of memory adjusted in a Windows VM. Instead, you will observe an increase in the amount of used memory. Despite this difference, the end result is the same: the available free memory is reduced as the ballooning driver reclaims memory.

As this screenshot shows, you will observe the used memory increase when ballooning actively steals memory inside a Windows VM.

Results

Memory ballooning in Proxmox VE is a powerful feature for dynamically managing memory allocation among VMs, optimizing the host’s overall memory usage. However, it’s crucial to understand the thresholds that trigger memory reclaiming to avoid performance degradation. It is recommended to set an appropriate minimum memory limit to ensure that no more memory can be stolen once this minimum threshold is reached; this keeps the guest VM stable and prevents the OOM-killer from terminating processes inside the guest VM. By appropriately setting, carefully monitoring, and adjusting memory allocations, you can ensure a stable and efficient virtual environment.

Security Concerns

Implications of Enabling KSM

The Kernel Samepage Merging (KSM) document in the Proxmox VE wiki mentions the security implications of KSM. Researchers have already documented that memory deduplication can be a threat to the guest OS (“Memory Deduplication as a Threat to the Guest OS”), that it is possible to perform remote memory-deduplication attacks (“Remote Memory-Deduplication Attacks”), and that Linux VMs can be compromised (“New FFS Rowhammer Attack Hijacks Linux VMs”).

Given these concerns, you should only enable KSM when you have full control of all the VMs. If you are using Proxmox VE to provide hosting services, you should consider disabling KSM to protect your users. Furthermore, you should check your country’s regulations, as disabling KSM may be a legal requirement.

Risks When Using Databases with Ballooning

Memory ballooning dynamically adjusts the memory allocation of VMs based on demand. While this feature is beneficial for optimizing memory usage, it poses certain risks when used with databases like PostgreSQL, which rely heavily on available memory for performance. If the balloon driver reclaims too much memory, the resulting overcommitment of memory pages can trigger the OOM-killer, which kills the processes with the highest score until the memory stress situation is over. Since that score is largely driven by memory consumption, the process with the highest score is very likely the database itself.

Given this risk, you should run database servers in VMs without memory ballooning enabled, or set a strict no-overcommit policy in the Linux kernel inside the guest VM if you don’t have control over the ballooning configuration.
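Inside the guest, such a strict overcommit policy can be set like this (a sketch; the overcommit ratio should be tuned to the VM’s memory size and workload):

# sysctl -w vm.overcommit_memory=2
# sysctl -w vm.overcommit_ratio=80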

Conclusion

Our tests demonstrate that KSM and memory ballooning are effective features in Proxmox VE for optimizing memory usage in virtualized environments. KSM can significantly reduce memory usage by merging identical pages across VMs, while memory ballooning allows dynamic adjustment of memory allocation based on demand.

Memory ballooning in Proxmox VE is a powerful feature for dynamically managing memory allocation among VMs, optimizing the host’s overall memory usage. However, it’s crucial to understand the thresholds that trigger memory reclaiming to avoid performance degradation. By carefully monitoring and adjusting memory allocations, you can ensure a stable and efficient virtual environment.

Together, these features can enhance the efficiency and performance of virtualized workloads, making Proxmox VE a robust solution for enterprise virtualization.

By leveraging KSM and memory ballooning, organizations can achieve better resource utilization and potentially reduce hardware costs. If you have full control of the host and all the VMs, consider enabling these features in Proxmox VE to explore these potential benefits.

Introduction

In our previous article, we introduced NetApp Storage and NVMe-oF for Breakthrough Performance in Proxmox Virtualization Environments. That article introduced LVM with NVMe-oF via TCP on NetApp storage with Proxmox VE, highlighting its potential to deliver a high-performance storage solution suitable for latency-sensitive applications like virtualized data servers. It works over an Ethernet network without other specialized hardware such as Fibre Channel or InfiniBand, which can be cost-prohibitive for many enterprises.

While NVMe-oF offers significant performance benefits, it is primarily supported on newer and higher-end NetApp ONTAP storage systems, like the AFF series. For organizations with older or hybrid storage systems, iSCSI remains a viable and cost-effective alternative that leverages existing Ethernet infrastructure and provides reliable performance for virtualization environments.

In this blog post, we will delve into using iSCSI (Internet Small Computer Systems Interface) in NetApp Storage with Proxmox VE.

Setup

Hardware and Software Used in This Example

What is iSCSI

iSCSI (Internet Small Computer Systems Interface) is a network protocol that allows for the transport of block-level storage data over IP (Internet Protocol) networks. The protocol allows clients (called initiators) to send SCSI commands over TCP/IP to storage devices (targets) on remote servers. It allows for the connection of storage devices over a standard network infrastructure without requiring specialized hardware and cabling.

It offers a flexible, cost-effective, and scalable storage solution that integrates well with virtualization environments, providing the necessary features and performance to support modern virtualized workloads.

How iSCSI Works

Configuration

On NetApp Storage

This guide presumes that users have already established the foundational storage setup, including the configuration of Storage Virtual Machines (SVMs). The administration of these systems is relatively straightforward, thanks to the intuitive web interface, ONTAP System Manager, provided by NetApp storage systems. Users can expect a user-friendly experience when managing their storage solutions, as the web interface is designed to simplify complex tasks. This also includes the whole setup for iSCSI storage, which requires enabling iSCSI in general on the SVM, setting up the SAN initiator group, and mapping it to LUNs.

Note: All changes can of course also be performed in an automated way by orchestrating the ONTAP API.

Enable iSCSI target on SVM

Enabling iSCSI at the SVM level on a NetApp storage system can typically be done by following these summarized steps in the system’s web interface, ONTAP System Manager.

Navigate to the Storage menu. Then, navigate to Storage VMs. Specify the SVM name you wish to configure:

  1. Configure iSCSI Protocol: Within the SVM settings, look for a section or tab related to protocols. Locate the iSCSI option and enable it. This might involve checking a box or switching a toggle to the ‘on’ position.
  2. Save and Apply Changes: After enabling iSCSI, make sure to save the changes. There might be additional prompts or steps to confirm the changes, depending on the specific NetApp system and its version.

Remember to check for any prerequisites or additional configuration settings that might be required for iSCSI operation, such as network settings, licensing, or compatible hardware checks. The exact steps may vary slightly depending on the version of ONTAP or the specific NetApp model you are using. Always refer to the latest official NetApp documentation or support resources for the most accurate guidance.

Create SAN initiator group

Navigate to the HOSTS menu. Then, navigate to SAN initiator groups. Select the specific initiator group you wish to configure:

  1. Configure Host Initiators: On each Proxmox VE node, look up the /etc/iscsi/initiatorname.iscsi file and collect the InitiatorName (the iSCSI host IQN), as shown below. Add each InitiatorName as a host initiator to the initiator group in ONTAP System Manager.
  2. Save and Apply Changes: After adding all the initiators, make sure to save the changes. There might be additional prompts or steps to confirm the changes, depending on the specific NetApp system and its version.
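For example, on a Proxmox VE node the IQN can be read directly from that file (comment lines omitted; the actual value will differ on every node):

$ cat /etc/iscsi/initiatorname.iscsi
InitiatorName=iqn.1993-08.org.debian:01:abcdef012345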

Create LUNs with initiator group

Navigate to the Storage menu. Then, navigate to LUNs. Select the specific LUN you wish to configure:

  1. Add LUNs: Specify the number of LUNs you want to configure. Set the Host Operating System to Linux. Select the Initiator Group created in the previous step.
  2. Save and Apply Changes: After adding the LUNs, make sure to save the changes. There might be additional prompts or steps to confirm the changes, depending on the specific NetApp system and its version.

Configuring Proxmox Node

General

After configuring the NetApp storage appliance with the iSCSI target and LUNs, we can now configure the Proxmox VE cluster to use and access the iSCSI storage. This can easily be done via the Proxmox web interface. In general, the process consists of connecting the cluster to the iSCSI target and creating an LVM volume group on top of the attached block storage.

The next steps in this blog post will cover the process in detail and guide you through the necessary steps on Proxmox VE, all of which can be performed in the Proxmox web interface.

Connecting With the iSCSI Block Storage

To use the iSCSI block storage on the Proxmox VE cluster, log in to the web frontend of the Proxmox VE cluster and add the storage at the datacenter level:

Navigate to the Storage Configuration: Go to Datacenter -> Storage -> Add -> iSCSI.

Define the New iSCSI Storage Details:
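Once the storage has been added, Proxmox VE records it in /etc/pve/storage.cfg. As a rough sketch (the storage ID, portal address, and target IQN below are placeholder values), the resulting entry looks like this:

iscsi: netapp-iscsi
        portal 192.0.2.10
        target iqn.1992-08.com.netapp:sn.0123456789abcdef:vs.3
        content none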

Create LVM on the iSCSI Block Storage

To use this LVM Volume Group over iSCSI block storage on all Proxmox VE nodes within the cluster, the Volume Group must be added and integrated at the datacenter level. Follow these steps to configure it through the Proxmox VE web interface:

Navigate to the Storage Configuration: Go to Datacenter -> Storage -> Add -> LVM.


Define the New LVM Storage Details:

Press Add to attach the new volume to the selected nodes. The LVM storage will then be available for use.
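The corresponding entry in /etc/pve/storage.cfg might look like the following sketch, where the storage IDs, volume group name, and base volume are again placeholder values:

lvm: netapp-lvm
        vgname vg_netapp
        base netapp-iscsi:0.0.1.scsi-36a98765432100000000000000000000
        shared 1
        content rootdir,images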

Conclusion

The utilization of iSCSI via TCP in conjunction with Proxmox VE in a virtualization environment presents a compelling solution for organizations looking for cost-effective shared storage architectures. This approach leverages the widespread availability and compatibility of Ethernet-based networks, avoiding the need for specialized hardware such as Fibre Channel, which can be cost-prohibitive for many enterprises.

However, block-level storage with SAN protocols (FC/iSCSI/NVMe-oF) is typically restricted to the VM Disk and Container Image content types supported by Proxmox VE. Additionally, guest VM snapshots and thin provisioning are currently not supported when using LVM/iSCSI storage in Proxmox VE.

NetApp storage also offers alternatives that can meet the broader needs of Proxmox VE. NAS protocols such as NFS support all content types of Proxmox VE and are typically configured once at the datacenter level. Guest VMs can use disks of type raw, qcow2, or VMDK on NAS protocol storage. Furthermore, guest VM snapshots and thin provisioning are supported with the qcow2 format.

At your convenience, we are available to provide more insights into NetApp storage systems, covering both hardware and software aspects. Our expertise also extends to open-source products, especially in establishing virtualization environments using technologies like Proxmox and OpenShift or in maintaining them with configuration management. We invite you to reach out for any assistance you require.

You might also be interested in learning how to migrate VMs from VMware ESXi to Proxmox VE or how to include the Proxmox Backup Server and NetApp Storage and NVMe-oF for Breakthrough Performance in Proxmox Virtualization Environments into your infrastructure.

ONTAP snapshots for Proxmox VE

In modern IT infrastructure, virtualization is the key to efficient resource management. With virtualization, memory, CPU, network, and storage resources can easily be assigned to and shared between virtual machines (VMs). Virtualization also comes with the advantage of being able to easily change the resources assigned to virtual machines or to clone virtual machines as needed. Since a virtual machine’s hard disk is just a file, it can easily be resized, copied, and backed up.

Motivation

One of the many advantages of virtualization is the ability to easily create snapshots of the virtual machine disk images.

Proxmox VE offers this ability for the qcow2 disk image format, but not for the raw format on NFS-based storage. When creating a VM disk snapshot on the local filesystem, Proxmox VE uses the features of the underlying filesystem to create snapshots. When the VM disk is placed on an NFS storage, Proxmox VE cannot use any filesystem features to create a snapshot and therefore has to fall back to file-based snapshots.

Since NetApp ONTAP offers snapshot features on both file level and volume level using its own filesystem features, it can be an advantage to use these features instead of the Proxmox VE file-based snapshots.

The common way to connect Proxmox VE with NetApp ONTAP is NFS. NetApp ONTAP also supports iSCSI, but that is out of scope here, as an iSCSI-connected storage gives Proxmox VE full access to the filesystem, so it could use the filesystem features anyway.

ONTAP snapshots

A FileClone is a copy of a file that points to the same blocks as the original file; only changes are written to new blocks. Creating a FileClone therefore happens instantly, and writing to a FileClone does not create the overhead that a file-based snapshot in Proxmox VE does. Deleting the snapshot, or rather merging it back into the virtual disk image, is also not necessary, since a FileClone is a full copy of the original virtual disk: just delete the original virtual disk and continue using the FileClone, or, if a roll-back is necessary, switch back to the original virtual disk image and delete the clone.

A VolumeClone works the same way as a FileClone, but on the volume level: it creates an instant copy of a complete volume, a Snapshot. The Snapshot references the same blocks as the original volume, and changes on the original volume are written to new blocks. It is also possible to access the Snapshot by creating a new volume from it; this is called a FlexClone volume.

This can be used to create a snapshot of a Proxmox VE storage with all its virtual machine disk images instead of creating snapshots of single virtual machine disk images. With the FlexClone volume functionality it is easy to access the data in the clone.

Since Proxmox VE does not support ONTAP features directly, there is a small Python script that uses the ONTAP REST API and the Proxmox VE API to make these features easily accessible. The script is able to create a FileClone of a virtual machine disk image; it can create the clone from a running VM, but it is also able to suspend or stop a VM before creating the clone and then start the VM again.
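As a rough illustration of the underlying API call (a hedged sketch, not the actual script), creating a FileClone of a single disk image essentially boils down to one request against ONTAP’s file clone endpoint. The hostname, credentials, volume name, and paths below are assumptions:

import requests

ONTAP_HOST = "https://ontap.example.com"  # assumption: your ONTAP management interface
AUTH = ("admin", "password")              # assumption: use real credentials in practice

# Create a FileClone of a VM disk image; only metadata is written, so this is instant
response = requests.post(
    f"{ONTAP_HOST}/api/storage/file/clone",
    auth=AUTH,
    verify=False,  # acceptable only in a lab; verify certificates in production
    json={
        "volume": {"name": "proxmox_vms"},                        # volume holding the images
        "source_path": "images/100/vm-100-disk-0.raw",            # original disk image
        "destination_path": "images/100/vm-100-disk-0-clone.raw", # the FileClone
    },
)
response.raise_for_status()

The script described above additionally talks to the Proxmox VE API to optionally suspend or stop the VM before such a call and start it again afterwards.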

The script also gives easy access to the VolumeClone and FlexClone features; it is able to create, manage, mount, unmount, and delete clones of a volume. When a clone of a volume is mounted, it is also automatically attached as an additional storage to Proxmox VE.

NFS features

In Linux kernel 5.3, the NFS mount option `nconnect` was introduced. By default, an NFS client uses one TCP connection to the server, which can become a bottleneck under high NFS workloads. With the `nconnect` option, the number of TCP connections per server can be increased, up to 16 connections.

`nconnect` is supported by Proxmox VE since version 6.2 and by ONTAP 9, and it can be used simply by adding `nconnect=<value>` to the mount options for the NFS share on the client side.

The other interesting NFS mount option is `max_connect`. It sets the maximum number of connections to different server IPs belonging to the same NFSv4.1+ server. This is called trunking and is a multipath functionality. The difference from `nconnect` is that `nconnect` sets the number of TCP connections to one server IP, while `max_connect` sets the number of TCP connections to the same server over multiple IPs. This is supported by ONTAP since version 9.14.1.
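For illustration, a manual NFS mount using `nconnect` could look like the following, where the server address, export, and mount point are placeholders; `max_connect` can be appended to the option list in the same way on an NFSv4.1+ trunked setup:

mount -t nfs -o vers=4.1,nconnect=4 192.0.2.10:/vol1 /mnt/vol1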

Combining the options is possible, but might not lead to the desired result.

More

To further optimize resource usage, data deduplication on storage systems is an important feature, since virtual machine disk images in particular share the same (operating system) data with each other. Deduplication reduces the used space very effectively in this use case. ONTAP supports deduplication.

The issue of table and index bloat due to failed inserts on unique constraints is well known and has been discussed in various articles across the internet. However, these discussions sometimes lack a clear, practical example with measurements to illustrate the impact. And despite the familiarity of this issue, we still frequently see this design pattern—or rather, anti-pattern—in real-world applications. Developers often rely on unique constraints to prevent duplicate values from being inserted into tables. While this approach is straightforward, versatile, and generally considered effective, in PostgreSQL, inserts that fail due to unique constraint violations unfortunately always lead to table and index bloat. And on high-traffic systems, this unnecessary bloat can significantly increase disk I/O and the frequency of autovacuum runs. In this article, we aim to highlight this problem once again and provide a straightforward example with measurements to illustrate it. We suggest a simple improvement that can help mitigate this issue and reduce autovacuum workload and disk I/O.

Two Approaches to Duplicate Prevention

In PostgreSQL, there are two main ways to prevent duplicate values using unique constraints:

1. Standard Insert Command (INSERT INTO table)

The usual INSERT INTO table command attempts to insert data directly into the table. If the insert would result in a duplicate value, it fails with a “duplicate key value violates unique constraint” error. Since the command does not specify any duplicate checks, PostgreSQL internally inserts the new row immediately and only then begins updating the indexes. When it encounters a unique index violation, it raises the error and deletes the newly added row. The order of index updates is determined by their relation IDs, so the extent of index bloat depends on the order in which the indexes were created. With repeated “unique constraint violation” errors, both the table and some of the indexes accumulate dead records, leading to bloat, and the resulting write operations increase disk I/O without achieving any useful outcome.

2. Conflict-Aware Insert (INSERT INTO table … ON CONFLICT DO NOTHING)

The INSERT INTO table ON CONFLICT DO NOTHING command behaves differently. Since it specifies that a conflict might occur, PostgreSQL first checks for potential duplicates before attempting to insert data. If a duplicate is found, PostgreSQL performs the specified action—in this case, “DO NOTHING”—and no error occurs. This clause was introduced in PostgreSQL 9.5, but some applications either still run on older PostgreSQL versions or retain legacy code when the database is upgraded. As a result, this conflict-handling option is often underutilized.
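To make the difference concrete, here is a minimal psql session; the table demo is our own illustration, not part of the later test setup:

CREATE TABLE demo (k integer UNIQUE);
INSERT INTO demo VALUES (1);  -- INSERT 0 1
INSERT INTO demo VALUES (1);  -- ERROR: duplicate key value violates unique constraint "demo_k_key"
INSERT INTO demo VALUES (1) ON CONFLICT DO NOTHING;  -- INSERT 0 0: no error, no dead tuple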

Testing Example

To be able to do the testing, we must start PostgreSQL with “autovacuum=off”. Otherwise, with the instance mostly idle, autovacuum would immediately process the bloated objects and we would be unable to capture the statistics. We create a simple testing example with multiple indexes:

CREATE TABLE IF NOT EXISTS test_unique_constraints(
  id serial primary key,
  unique_text_key text,
  unique_integer_key integer,
  some_other_bigint_column bigint,
  some_other_text_column text);

CREATE INDEX test_unique_constraints_some_other_bigint_column_idx ON test_unique_constraints (some_other_bigint_column );
CREATE INDEX test_unique_constraints_some_other_text_column_idx ON test_unique_constraints (some_other_text_column );
CREATE INDEX test_unique_constraints_unique_text_key_unique_integer_key__idx ON test_unique_constraints (unique_text_key, unique_integer_key, some_other_bigint_column );
CREATE UNIQUE INDEX test_unique_constraints_unique_text_key_idx ON test_unique_constraints (unique_text_key );
CREATE UNIQUE INDEX test_unique_constraints_unique_integer_key_idx ON test_unique_constraints (unique_integer_key );

And now we populate this table with unique data:

DO $$
BEGIN
  FOR i IN 1..1000 LOOP
    INSERT INTO test_unique_constraints
    (unique_text_key, unique_integer_key, some_other_bigint_column, some_other_text_column)
    VALUES (i::text, i, i, i::text);
  END LOOP;
END;
$$;

In the second step, we use a simple Python script to connect to the database, attempt to insert conflicting data, and close the session after an error. First, it sends 10,000 INSERT statements that conflict with the “test_unique_constraints_unique_integer_key_idx” index, then another 10,000 INSERTs conflicting with “test_unique_constraints_unique_text_key_idx”. The entire test is done in a few dozen seconds, after which we inspect all objects using the “pgstattuple” extension.
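The driver script is not reproduced here verbatim, but the following minimal sketch shows the idea; it assumes the psycopg2 driver and placeholder connection parameters:

import psycopg2
from psycopg2 import errors

DSN = "dbname=test user=postgres"  # assumption: adjust to your environment

def attempt_conflicting_inserts(make_row, attempts=10_000):
    """Open a new session per attempt and close it after the expected error."""
    for i in range(attempts):
        conn = psycopg2.connect(DSN)
        conn.autocommit = True  # each failed INSERT runs in its own transaction
        try:
            with conn.cursor() as cur:
                cur.execute(
                    "INSERT INTO test_unique_constraints"
                    " (unique_text_key, unique_integer_key,"
                    "  some_other_bigint_column, some_other_text_column)"
                    " VALUES (%s, %s, %s, %s)",
                    make_row(i),
                )
        except errors.UniqueViolation:
            pass  # expected: duplicate key value violates unique constraint
        finally:
            conn.close()

# 10,000 inserts conflicting on the unique integer key (values 1..1000 already exist)
attempt_conflicting_inserts(lambda i: (f"new-{i}", i % 1000 + 1, i, str(i)))
# 10,000 inserts conflicting on the unique text key
attempt_conflicting_inserts(lambda i: (str(i % 1000 + 1), 100_000 + i, i, str(i)))

For the second test described below, the only modification is appending ON CONFLICT DO NOTHING to the INSERT statement. The following query lists all objects in a single output: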

WITH maintable AS (SELECT oid, relname FROM pg_class WHERE relname = 'test_unique_constraints')
SELECT m.oid as relid, m.relname as relation, s.*
FROM maintable m
JOIN LATERAL (SELECT * FROM pgstattuple(m.oid)) s ON true
UNION ALL
SELECT i.indexrelid as relid, indexrelid::regclass::text as relation, s.*
FROM pg_index i
JOIN LATERAL (SELECT * FROM pgstattuple(i.indexrelid)) s ON true
WHERE i.indrelid::regclass::text = 'test_unique_constraints'
ORDER BY relid;

Observed Results

After running the whole test several times, we observe the following:

Here is one example output from the query shown above after the test run which used unique values for all columns. As we can see, the bloat of non-unique indexes due to failed inserts can be significant.

 relid |                       relation                                  | table_len | tuple_count | tuple_len | tuple_percent | dead_tuple_count | dead_tuple_len | dead_tuple_percent | free_space | free_percent 
-------+-----------------------------------------------------------------+-----------+-------------+-----------+---------------+------------------+----------------+--------------------+------------+--------------
 16418 | test_unique_constraints                                         |   1269760 |        1000 |     51893 |          4.09 |            20000 |        1080000 |              85.06 |       5420 |         0.43
 16424 | test_unique_constraints_pkey                                    |    491520 |       21000 |    336000 |         68.36 |                0 |              0 |                  0 |      51444 |        10.47
 16426 | test_unique_constraints_some_other_bigint_column_idx            |    581632 |       16396 |    326536 |         56.14 |                0 |              0 |                  0 |     168732 |        29.01
 16427 | test_unique_constraints_some_other_text_column_idx              |    516096 |       16815 |    327176 |         63.39 |                0 |              0 |                  0 |     101392 |        19.65
 16428 | test_unique_constraints_unique_text_key_unique_integer_key__idx |   1015808 |       21000 |    584088 |          57.5 |                0 |              0 |                  0 |     323548 |        31.85
 16429 | test_unique_constraints_unique_text_key_idx                     |     57344 |        1263 |     20208 |         35.24 |                2 |             32 |               0.06 |      15360 |        26.79
 16430 | test_unique_constraints_unique_integer_key_idx                  |     40960 |        1000 |     16000 |         39.06 |                0 |              0 |                  0 |       4404 |        10.75
(7 rows)

In a second test, we modify the script to include the ON CONFLICT DO NOTHING clause in the INSERT command and repeat both tests. This time, the inserts do not result in errors; instead, they simply return “INSERT 0 0”, indicating that no records were inserted. Inspection of the Transaction ID after this test shows only a minimal increase, caused by background processes. Attempts to insert conflicting data did not increase the Transaction ID (XID), as PostgreSQL first starts only a virtual transaction to check for conflicts; because a conflict was found, it aborted the transaction without ever assigning a new XID. The “pgstattuple” output confirms that all objects contain only live data, with no dead tuples this time.
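If you want to verify the XID consumption yourself, compare the current transaction ID before and after each test run (on PostgreSQL 12 and older, txid_current() provides the same information):

SELECT pg_current_xact_id();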

Summary

As demonstrated, each failed insert bloats the underlying table and some of the indexes, and it also increases the Transaction ID, because each failed insert occurs in a separate transaction. Consequently, autovacuum is forced to run more frequently, consuming valuable system resources. Applications still relying solely on plain INSERT commands without an ON CONFLICT clause should therefore consider reviewing this implementation. But as always, the final decision should be based on the specific conditions of each application.


Artificial Intelligence (AI) is often regarded as a groundbreaking innovation of the modern era, yet its roots extend much further back than many realize. In 1943, neuroscientist Warren McCulloch and logician Walter Pitts proposed the first computational model of a neuron. The term “Artificial Intelligence” was coined in 1956. The subsequent creation of the Perceptron in 1957, the first model of a neural network, and the expert system Dendral designed for chemical analysis demonstrated the potential of computers to process data and apply expert knowledge in specific domains. From the 1970s to the 1990s, expert systems proliferated. A pivotal moment for AI in the public eye came in 1997 when IBM’s chess-playing computer Deep Blue defeated chess world champion Garry Kasparov.

The new millennium brought a new era for AI, with the integration of rudimentary AI systems into everyday technology. Spam filters, recommendation systems, and search engines subtly shaped online user experiences. In 2006, deep learning emerged, marking the renaissance of neural networks. The landmark development came in 2017 with the introduction of Transformers, a neural network architecture that became the most important ingredient for the creation of Large Language Models (LLMs). Its key component, the attention mechanism, enables the model to discern relationships between words over long distances within a text. This mechanism assigns varying weights to words depending on their contextual importance, acknowledging that the same word can hold different meanings in different situations. However, modern AI, as we know it, was made possible mainly thanks to the availability of large datasets and powerful computational hardware. Without the vast resources of the internet and electronic libraries worldwide, modern AI would not have enough data to learn and evolve. And without modern performant GPUs, training AI would be a challenging task.

The LLM is a sophisticated, multilayer neural network comprising numerous interconnected nodes. These nodes are the micro-decision-makers that underpin the collective intelligence of the system. During its training phase, an LLM learns to balance myriad small, simple decisions, which, when combined, enable it to handle complex tasks. The intricacies of these internal decisions are typically opaque to us, as we are primarily interested in the model’s output. However, these complex neural networks can only process numbers, not raw text. Text must be tokenized into words or sub-words, standardized, and normalized — converted to lowercase, stripped of punctuation, etc. These tokens are then put into a dictionary and mapped to unique numerical values. Only this numerical representation of the text allows LLMs to learn the complex relationships between words, phrases, and concepts and the likelihood of certain words or phrases following one another. LLMs therefore process texts as huge numerical arrays without truly understanding the content. They lack a mental model of the world and operate solely on mathematical representations of word relationships and their probabilities. This focus on the answer with the highest probability is also the reason why LLMs can “hallucinate” plausible yet incorrect information or get stuck in response loops, regurgitating the same or similar answers repeatedly.

Based on the relationships between words learned from texts, LLMs also create vast webs of semantic associations that interconnect words. These associations form the backbone of an LLM’s ability to generate contextually appropriate and meaningful responses. When we provide a prompt to an LLM, we are not merely supplying words; we are activating a complex network of related concepts and ideas. Consider the word “apple”. This simple term can trigger a cascade of associated concepts such as “fruit,” “tree,” “food,” and even “technology” or “computer”. The activated associations depend on the context provided by the prompt and the prevalence of related concepts in the training data. The specificity of a prompt greatly affects the semantic associations an LLM considers. A vague prompt like “tell me about apples” may activate a wide array of diverse associations, ranging from horticultural information about apple trees to the nutritional value of the fruit or even cultural references like the tale of Snow White. An LLM will typically use the association with the highest occurrence in its training data when faced with such a broad prompt. For more targeted and relevant responses, it is crucial to craft focused prompts that incorporate specific technical jargon or references to particular disciplines. By doing so, the user can guide the LLM to activate a more precise subset of semantic associations, thereby narrowing the scope of the response to the desired area of expertise or inquiry.

LLMs have internal parameters that influence their creativity and determinism, such as “temperature”, “top-p”, “max length”, and various penalties. However, these are typically set to balanced defaults, and users should not modify them; otherwise, they could compromise the ability of LLMs to provide meaningful answers. Prompt engineering is therefore the primary method for guiding LLMs toward desired outputs. By crafting specific prompts, users can subtly direct the model’s responses, ensuring relevance and accuracy. The LLM derives a wealth of information from the prompt, determining not only semantic associations for the answer but also estimating its own role and the target audience’s knowledge level. By default, an LLM assumes the role of a helper and assistant, but it can adopt an expert’s voice if prompted. However, to elicit an expert-level response, one must not only set an expert role for the LLM but also specify that the inquirer is an expert as well. Otherwise, an LLM assumes an “average Joe” as the target audience by default. Therefore, even when asked to impersonate an expert role, an LLM may decide to simplify the language for the “average Joe” if the knowledge level of the target audience is not specified, which can result in a disappointing answer.

Consider two prompts for addressing a technical issue with PostgreSQL:

1. “Hi, what could cause delayed checkpoints in PostgreSQL?”

2. “Hi, we are both leading PostgreSQL experts investigating delayed checkpoints. The logs show checkpoints occasionally taking 3-5 times longer than expected. Let us analyze this step by step and identify probable causes.”

The depth of the responses will vary significantly, illustrating the importance of prompt specificity. The second prompt employs common prompting techniques, which we will explore in the following paragraphs. However, it is crucial to recognize the limitations of LLMs, particularly when dealing with expert-level knowledge, such as the issue of delayed checkpoints in our example. Depending on the AI model and the quality of its training data, users may receive either helpful or misleading answers. The quality and amount of training data representing the specific topic play a crucial role.

Highly specialized problems may be underrepresented in the training data, leading to overfitting or hallucinated responses. Overfitting occurs when an LLM focuses too closely on its training data and fails to generalize, providing answers that seem accurate but are contextually incorrect. In our PostgreSQL example, a hallucinated response might borrow facts from other databases (like MySQL or MS SQL) and adjust them to fit PostgreSQL terminology. Thus, the prompt itself is no guarantee of a high-quality answer—any AI-generated information must be carefully verified, which is a task that can be challenging for non-expert users.

With these limitations in mind, let us now delve deeper into prompting techniques. “Zero-shot prompting” is a baseline approach where the LLM operates without additional context or supplemental reference material, relying on its pre-trained knowledge and the prompt’s construction. By carefully activating the right semantic associations and setting the correct scope of attention, the output can be significantly improved. However, LLMs, much like humans, can benefit from examples. By providing reference material within the prompt, the model can learn patterns and structure its output accordingly. This technique is called “few-shot prompting”. The quality of the output is directly related to the quality and relevance of the reference material; hence, the adage “garbage in, garbage out” always applies.

For complex issues, “chain-of-thought” prompting can be particularly effective. This technique can significantly improve the quality of complicated answers because LLMs can struggle with long-distance dependencies in reasoning. Chain-of-thought prompting addresses this by instructing the model to break down the reasoning process into smaller, more manageable parts. It leads to more structured and comprehensible answers by focusing on better-defined sub-problems. In our PostgreSQL example prompt, the phrase “let’s analyze this step by step” instructs the LLM to divide the processing into a chain of smaller sub-problems. An evolution of this technique is the “tree of thoughts” technique. Here, the model not only breaks down the reasoning into parts but also creates a tree structure with parallel paths of reasoning. Each path is processed separately, allowing the model to converge on the most promising solution. This approach is particularly useful for complex problems requiring creative brainstorming. In our PostgreSQL example prompt, the phrase “let’s identify probable causes” instructs the LLM to discuss several possible pathways in the answer.

Of course, prompting techniques have their drawbacks. Few-shot prompting is limited by the number of tokens, which restricts the amount of information that can be included. Additionally, the model may ignore parts of excessively long prompts, especially the middle sections. Care must also be taken with the frequency of certain words in the reference material, as overlooked frequency can bias the model’s output. Chain-of-thought prompting can also lead to overfitted or “hallucinated” responses for some sub-problems, compromising the overall result.

Instructing the model to provide deterministic, factual responses is another prompting technique, vital for scientific and technical topics. Formulations like “answer using only reliable sources and cite those sources” or “provide an answer based on peer-reviewed scientific literature and cite the specific studies or articles you reference” can direct the model to base its responses on trustworthy sources. However, as already discussed, even with instructions to focus on factual information, the AI’s output must be verified to avoid falling into the trap of overfitted or hallucinated answers.

In conclusion, effective prompt engineering is a skill that combines creativity with strategic thinking, guiding the AI to deliver the most useful and accurate responses. Whether we are seeking simple explanations or delving into complex technical issues, the way we communicate with the AI always makes a difference in the quality of the response. However, we must always keep in mind that even the best prompt is no guarantee of a quality answer, and we must double-check received facts. The quality and amount of training data are paramount, and this means that some problems with received answers can persist even in future LLMs simply because they would have to use the same limited data for some specific topics.

When the model’s training data is sparse or ambiguous in certain highly focused areas, it can produce responses that are syntactically valid but factually incorrect. One reason AI hallucinations can be particularly problematic is their inherent plausibility. The generated text is usually grammatically correct and stylistically consistent, making it difficult for users to immediately identify inaccuracies without external verification. This highlights a key distinction between plausibility and veracity: just because something sounds right it does not mean it is true.

Whether the response is an insightful solution to a complex problem or completely fabricated nonsense is a distinction that must be made by human users, based on their expertise in the topic at hand. Our clients have repeatedly had exactly this experience with different LLMs: they tried to solve their technical problems using AI, but the answers were partially incorrect or did not work at all. This is why human expert knowledge is still the most important factor when it comes to solving difficult technical issues. The inherent limitations of LLMs are unlikely to be fully overcome, at least not with current algorithms. Therefore, expert knowledge will remain essential for delivering reliable, high-quality solutions in the future as well. As people increasingly use AI tools in the same way they rely on Google — as a resource or assistant — true expertise will still be needed to interpret, refine, and implement these tools effectively. On the other hand, AI is emerging as a key driver of innovation. Progressive companies are investing heavily in AI, facing challenges related to security and performance. And this is an area where NetApp can also help: its cloud-focused AI solutions are designed to address exactly these issues.

(Picture generated by my colleague Felix Alipaz-Dicke using ChatGPT-4.)


Mastering Cloud Infrastructure with Pulumi: Introduction

In today’s rapidly changing landscape of cloud computing, managing infrastructure as code (IaC) has become essential for developers and IT professionals. Pulumi, an open-source IaC tool, brings a fresh perspective to the table by enabling infrastructure management using popular programming languages like JavaScript, TypeScript, Python, Go, and C#. This approach offers a unique blend of flexibility and power, allowing developers to leverage their existing coding skills to build, deploy, and manage cloud infrastructure. In this post, we’ll explore the world of Pulumi and see how it pairs with Amazon FSx for NetApp ONTAP—a robust solution for scalable and efficient cloud storage.


Pulumi – The Theory

Why Pulumi?

Pulumi distinguishes itself among IaC tools for several compelling reasons:

Challenges with Pulumi

Like any tool, Pulumi comes with its own set of challenges:

State Management in Pulumi: Ensuring Consistency Across Deployments

Effective infrastructure management hinges on proper state handling. Pulumi excels in this area by tracking the state of your infrastructure, enabling it to manage resources efficiently. This capability ensures that Pulumi knows exactly what needs to be created, updated, or deleted during deployments. Pulumi offers several options for state storage: the managed Pulumi Cloud backend (the default), self-managed object storage such as AWS S3, Azure Blob Storage, or Google Cloud Storage, and a plain local filesystem backend.

Managing state effectively is crucial for maintaining consistency across deployments, especially in scenarios where multiple team members are working on the same infrastructure.
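For illustration, the backend is selected with the pulumi login command; the S3 bucket name below is a placeholder:

pulumi login                         # Pulumi Cloud (the default)
pulumi login s3://my-pulumi-state    # self-managed backend, here an S3 bucket
pulumi login file://~/.pulumi-state  # local filesystem backend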

Other IaC Tools: Comparing Pulumi to Traditional IaC Tools

When comparing Pulumi to other Infrastructure as Code (IaC) tools, several drawbacks of traditional approaches become evident:


Pulumi – In Practice

Introduction

In this section, we’ll dive into a practical example to better understand Pulumi’s capabilities. We’ll also explore how to set up a project using Pulumi with AWS and automate it using GitHub Actions for CI/CD.

Prerequisites

Before diving into using Pulumi with AWS and automating your infrastructure management through GitHub Actions, ensure you have the following prerequisites in place:

Project Structure

When working with Infrastructure as Code (IaC) using Pulumi, maintaining an organized project structure is essential. A clear and well-defined directory structure not only streamlines the development process but also improves collaboration and deployment efficiency. In this post, we’ll explore a typical directory structure for a Pulumi project and explain the significance of each component.

Overview of a Typical Pulumi Project Directory

A standard Pulumi project might be organized as follows:


/project-root
├── .github
│ └── workflows
│ └── workflow.yml # GitHub Actions workflow for CI/CD
├── __main__.py # Entry point for the Pulumi program
├── infra.py # Infrastructure code
├── Pulumi.dev.yaml # Pulumi stack configuration for the development environment
├── Pulumi.prod.yaml # Pulumi stack configuration for the production environment
├── Pulumi.yaml # Pulumi project file (common or default settings)
├── requirements.txt # Python dependencies
└── test_infra.py # Tests for infrastructure code

NetApp FSx on AWS

Introduction

Amazon FSx for NetApp ONTAP offers a fully managed, scalable storage solution built on the NetApp ONTAP file system. It provides high-performance, highly available shared storage that seamlessly integrates with your AWS environment. Leveraging the advanced data management capabilities of ONTAP, FSx for NetApp ONTAP is ideal for applications needing robust storage features and compatibility with existing NetApp systems.

Key Features

What It’s About

Setting up Pulumi for managing your cloud infrastructure can revolutionize the way you deploy and maintain resources. By leveraging familiar programming languages, Pulumi brings Infrastructure as Code (IaC) to life, making the process more intuitive and efficient. When paired with Amazon FSx for NetApp ONTAP, it unlocks advanced storage solutions within the AWS ecosystem.

Putting It All Together

Using Pulumi, you can define and deploy a comprehensive AWS infrastructure setup, seamlessly integrating the powerful FSx for NetApp ONTAP file system. This combination simplifies cloud resource management and ensures you harness the full potential of NetApp’s advanced storage capabilities, making your cloud operations more efficient and robust.

In the next sections, we’ll walk through the specifics of setting up each component using Pulumi code, illustrating how to create a VPC, configure subnets, set up a security group, and deploy an FSx for NetApp ONTAP file system, all while leveraging the robust features provided by both Pulumi and AWS.

Architecture Overview

A visual representation of the architecture we’ll deploy using Pulumi: Single AZ Deployment with FSx and EC2

The diagram above illustrates the architecture for deploying an FSx for NetApp ONTAP file system within a single Availability Zone. The setup includes a VPC with public and private subnets, an Internet Gateway for outbound traffic, and a Security Group controlling access to the FSx file system and the EC2 instance. The EC2 instance is configured to mount the FSx volume using NFS, enabling seamless access to storage.

Setting up Pulumi

Follow these steps to set up Pulumi and integrate it with AWS:

Install Pulumi: Begin by installing Pulumi using the following command:

curl -fsSL https://get.pulumi.com | sh

Install AWS CLI: If you haven’t installed it yet, install the AWS CLI to manage AWS services:

pip install awscli

Configure AWS CLI: Configure the AWS CLI with your credentials:

aws configure

Create a New Pulumi Project: Initialize a new Pulumi project with AWS and Python:

pulumi new aws-python

Configure Your Pulumi Stack: Set the AWS region for your Pulumi stack:

pulumi config set aws:region eu-central-1

Deploy Your Stack: Deploy your infrastructure using Pulumi:

pulumi preview ; pulumi up

Example: VPC, Subnets, and FSx for NetApp ONTAP

Let’s dive into an example Pulumi project that sets up a Virtual Private Cloud (VPC), subnets, a security group, an Amazon FSx for NetApp ONTAP file system, and an EC2 instance.

Pulumi Code Example: VPC, Subnets, and FSx for NetApp ONTAP

The first step is to define all the parameters required to set up the infrastructure. You can use the following example to configure these parameters, as specified in the Pulumi.dev.yaml stack configuration file.

This Pulumi.dev.yaml file contains the configuration settings for the Pulumi project. It specifies various parameters for the deployment environment, including the AWS region, availability zone, and key name. It also defines the CIDR blocks for the subnets. These settings are used to configure and deploy cloud infrastructure resources in the specified AWS region.


config:
  aws:region: eu-central-1
  demo:availabilityZone: eu-central-1a
  demo:keyName: XYZ
  demo:subnet1CIDR: 10.0.3.0/24
  demo:subnet2CIDR: 10.0.4.0/24

The following code snippet should be placed in the infra.py file. It details the setup of the VPC, subnets, security group, and FSx for NetApp ONTAP file system. Each step in the code is explained through inline comments.


import pulumi
import pulumi_aws as aws
import pulumi_command as command
import os

# Retrieve configuration values from Pulumi configuration files
aws_config = pulumi.Config("aws")
region = aws_config.require("region")  # The AWS region where resources will be deployed

demo_config = pulumi.Config("demo")
availability_zone = demo_config.require("availabilityZone")  # Availability Zone for the deployment
subnet1_cidr = demo_config.require("subnet1CIDR")  # CIDR block for the public subnet
subnet2_cidr = demo_config.require("subnet2CIDR")  # CIDR block for the private subnet
key_name = demo_config.require("keyName")  # Name of the SSH key pair for EC2 instance access

# Create a new VPC with DNS support enabled
vpc = aws.ec2.Vpc(
    "fsxVpc",
    cidr_block="10.0.0.0/16",  # VPC CIDR block
    enable_dns_support=True,    # Enable DNS support in the VPC
    enable_dns_hostnames=True   # Enable DNS hostnames in the VPC
)

# Create an Internet Gateway to allow internet access from the VPC
internet_gateway = aws.ec2.InternetGateway(
    "vpcInternetGateway",
    vpc_id=vpc.id  # Attach the Internet Gateway to the VPC
)

# Create a public route table for routing internet traffic via the Internet Gateway
public_route_table = aws.ec2.RouteTable(
    "publicRouteTable",
    vpc_id=vpc.id,
    routes=[aws.ec2.RouteTableRouteArgs(
        cidr_block="0.0.0.0/0",  # Route all traffic (0.0.0.0/0) to the Internet Gateway
        gateway_id=internet_gateway.id
    )]
)

# Create a single public subnet in the specified Availability Zone
public_subnet = aws.ec2.Subnet(
    "publicSubnet",
    vpc_id=vpc.id,
    cidr_block=subnet1_cidr,  # CIDR block for the public subnet
    availability_zone=availability_zone,  # The specified Availability Zone
    map_public_ip_on_launch=True  # Assign public IPs to instances launched in this subnet
)

# Create a single private subnet in the same Availability Zone
private_subnet = aws.ec2.Subnet(
    "privateSubnet",
    vpc_id=vpc.id,
    cidr_block=subnet2_cidr,  # CIDR block for the private subnet
    availability_zone=availability_zone  # The same Availability Zone
)

# Associate the public subnet with the public route table to enable internet access
public_route_table_association = aws.ec2.RouteTableAssociation(
    "publicRouteTableAssociation",
    subnet_id=public_subnet.id,
    route_table_id=public_route_table.id
)

# Create a security group to control inbound and outbound traffic for the FSx file system
security_group = aws.ec2.SecurityGroup(
    "fsxSecurityGroup",
    vpc_id=vpc.id,
    description="Allow NFS traffic",  # Description of the security group
    ingress=[
        aws.ec2.SecurityGroupIngressArgs(
            protocol="tcp",
            from_port=2049,  # NFS protocol port
            to_port=2049,
            cidr_blocks=["0.0.0.0/0"]  # Allow NFS traffic from anywhere
        ),
        aws.ec2.SecurityGroupIngressArgs(
            protocol="tcp",
            from_port=111,  # RPCBind port for NFS
            to_port=111,
            cidr_blocks=["0.0.0.0/0"]  # Allow RPCBind traffic from anywhere
        ),
        aws.ec2.SecurityGroupIngressArgs(
            protocol="udp",
            from_port=111,  # RPCBind port for NFS over UDP
            to_port=111,
            cidr_blocks=["0.0.0.0/0"]  # Allow RPCBind traffic over UDP from anywhere
        ),
        aws.ec2.SecurityGroupIngressArgs(
            protocol="tcp",
            from_port=22,  # SSH port for EC2 instance access
            to_port=22,
            cidr_blocks=["0.0.0.0/0"]  # Allow SSH traffic from anywhere
        )
    ],
    egress=[
        aws.ec2.SecurityGroupEgressArgs(
            protocol="-1",  # Allow all outbound traffic
            from_port=0,
            to_port=0,
            cidr_blocks=["0.0.0.0/0"]  # Allow all outbound traffic to anywhere
        )
    ]
)

# Create the FSx for NetApp ONTAP file system in the private subnet
file_system = aws.fsx.OntapFileSystem(
    "fsxFileSystem",
    subnet_ids=[private_subnet.id],  # Deploy the FSx file system in the private subnet
    preferred_subnet_id=private_subnet.id,  # Preferred subnet for the FSx file system
    security_group_ids=[security_group.id],  # Attach the security group to the FSx file system
    deployment_type="SINGLE_AZ_1",  # Single Availability Zone deployment
    throughput_capacity=128,  # Throughput capacity in MB/s
    storage_capacity=1024  # Storage capacity in GB
)

# Create a Storage Virtual Machine (SVM) within the FSx file system
storage_virtual_machine = aws.fsx.OntapStorageVirtualMachine(
    "storageVirtualMachine",
    file_system_id=file_system.id,  # Associate the SVM with the FSx file system
    name="svm1",  # Name of the SVM
    root_volume_security_style="UNIX"  # Security style for the root volume
)

# Create a volume within the Storage Virtual Machine (SVM)
volume = aws.fsx.OntapVolume(
    "fsxVolume",
    storage_virtual_machine_id=storage_virtual_machine.id,  # Associate the volume with the SVM
    name="vol1",  # Name of the volume
    junction_path="/vol1",  # Junction path for mounting
    size_in_megabytes=10240,  # Size of the volume in MB
    storage_efficiency_enabled=True,  # Enable storage efficiency features
    tiering_policy=aws.fsx.OntapVolumeTieringPolicyArgs(
        name="SNAPSHOT_ONLY"  # Tiering policy for the volume
    ),
    security_style="UNIX"  # Security style for the volume
)

# Extract the DNS name from the list of SVM endpoints
dns_name = storage_virtual_machine.endpoints.apply(lambda e: e[0]['nfs'][0]['dns_name'])

# Get the latest Amazon Linux 2 AMI for the EC2 instance
ami = aws.ec2.get_ami(
    most_recent=True,
    owners=["amazon"],
    filters=[{"name": "name", "values": ["amzn2-ami-hvm-*-x86_64-gp2"]}]  # Filter for Amazon Linux 2 AMI
)

# Create an EC2 instance in the public subnet
ec2_instance = aws.ec2.Instance(
    "fsxEc2Instance",
    instance_type="t3.micro",  # Instance type for the EC2 instance
    vpc_security_group_ids=[security_group.id],  # Attach the security group to the EC2 instance
    subnet_id=public_subnet.id,  # Deploy the EC2 instance in the public subnet
    ami=ami.id,  # Use the latest Amazon Linux 2 AMI
    key_name=key_name,  # SSH key pair for accessing the EC2 instance
    tags={"Name": "FSx EC2 Instance"}  # Tag for the EC2 instance
)

# Script to install the NFS client and mount the FSx volume; it is executed on the EC2 instance via SSH below
user_data_script = dns_name.apply(lambda dns: f"""#!/bin/bash
sudo yum update -y
sudo yum install -y nfs-utils
sudo mkdir -p /mnt/fsx
if ! mountpoint -q /mnt/fsx; then
sudo mount -t nfs {dns}:/vol1 /mnt/fsx
fi
""")

# Retrieve the private key for SSH access from the environment when running in GitHub Actions
# (do not print the key: it would leak into the CI logs)
private_key_content = os.getenv("PRIVATE_KEY")

# Ensure the FSx file system is available before executing the script on the EC2 instance
pulumi.Output.all(file_system.id, ec2_instance.public_ip).apply(lambda args: command.remote.Command(
    "mountFsxFileSystem",
    connection=command.remote.ConnectionArgs(
        host=args[1],
        user="ec2-user",
        private_key=private_key_content
    ),
    create=user_data_script,
    opts=pulumi.ResourceOptions(depends_on=[volume])
))


Pytest with Pulumi

Introduction

Pytest is a widely-used Python testing framework that allows developers to create simple and scalable test cases. When paired with Pulumi, an infrastructure as code (IaC) tool, Pytest enables thorough testing of cloud infrastructure code, akin to application code testing. This combination is crucial for ensuring that infrastructure configurations are accurate, secure, and meet the required state before deployment. By using Pytest with Pulumi, you can validate resource properties, mock cloud provider responses, and simulate various scenarios. This reduces the risk of deploying faulty infrastructure and enhances the reliability of your cloud environments. Although integrating Pytest into your CI/CD pipeline is not mandatory, it is highly beneficial as it leverages Python’s robust testing capabilities with Pulumi.

Testing Code

The following code snippet should be placed in the test_infra.py file. It is designed to test the infrastructure setup defined in infra.py, including the VPC, subnets, security group, and FSx for NetApp ONTAP file system. Each test case focuses on a different aspect of the infrastructure to ensure correctness, security, and that the desired state is achieved. Inline comments are provided to explain each test case.


# Importing necessary libraries
import pulumi
import pulumi_aws as aws
from typing import Any, Dict, List

# Setting up configuration values for AWS region and various parameters
pulumi.runtime.set_config('aws:region', 'eu-central-1')
pulumi.runtime.set_config('demo:availabilityZone', 'eu-central-1a')
pulumi.runtime.set_config('demo:subnet1CIDR', '10.0.3.0/24')
pulumi.runtime.set_config('demo:subnet2CIDR', '10.0.4.0/24')
pulumi.runtime.set_config('demo:keyName', 'XYZ')  # change this to your own key name

# Creating a class MyMocks to mock Pulumi's resources for testing
class MyMocks(pulumi.runtime.Mocks):
    def new_resource(self, args: pulumi.runtime.MockResourceArgs) -> List[Any]:
        # Initialize outputs with the resource's inputs
        outputs = args.inputs

        # Mocking specific resources based on their type
        if args.typ == "aws:ec2/instance:Instance":
            # Mocking an EC2 instance with some default values
            outputs = {
                **args.inputs,  # Start with the given inputs
                "ami": "ami-0eb1f3cdeeb8eed2a",  # Mock AMI ID
                "availability_zone": "eu-central-1a",  # Mock availability zone
                "publicIp": "203.0.113.12",  # Mock public IP
                "publicDns": "ec2-203-0-113-12.compute-1.amazonaws.com",  # Mock public DNS
                "user_data": "mock user data script",  # Mock user data
                "tags": {"Name": "test"}  # Mock tags
            }
        elif args.typ == "aws:ec2/securityGroup:SecurityGroup":
            # Mocking a Security Group with default ingress rules
            outputs = {
                **args.inputs,
                "ingress": [
                    {"from_port": 80, "cidr_blocks": ["0.0.0.0/0"]},  # Allow HTTP traffic from anywhere
                    {"from_port": 22, "cidr_blocks": ["192.168.0.0/16"]}  # Allow SSH traffic from a specific CIDR block
                ]
            }
        
        # Returning a mocked resource ID and the output values
        return [args.name + '_id', outputs]

    def call(self, args: pulumi.runtime.MockCallArgs) -> Dict[str, Any]:
        # Mocking a call to get an AMI
        if args.token == "aws:ec2/getAmi:getAmi":
            return {
                "architecture": "x86_64",  # Mock architecture
                "id": "ami-0eb1f3cdeeb8eed2a",  # Mock AMI ID
            }
        
        # Return an empty dictionary if no specific mock is needed
        return {}

# Setting the custom mocks for Pulumi
pulumi.runtime.set_mocks(MyMocks())

# Import the infrastructure to be tested
import infra

# Define a test function to validate the AMI ID of the EC2 instance
@pulumi.runtime.test
def test_instance_ami():
    def check_ami(ami_id: str) -> None:
        print(f"AMI ID received: {ami_id}")
        # Assertion to ensure the AMI ID is the expected one
        assert ami_id == "ami-0eb1f3cdeeb8eed2a", 'EC2 instance must have the correct AMI ID'

    # Running the test to check the AMI ID
    pulumi.runtime.run_in_stack(lambda: infra.ec2_instance.ami.apply(check_ami))

# Define a test function to validate the availability zone of the EC2 instance
@pulumi.runtime.test
def test_instance_az():
    def check_az(availability_zone: str) -> None:
        print(f"Availability Zone received: {availability_zone}")
        # Assertion to ensure the instance is in the correct availability zone
        assert availability_zone == "eu-central-1a", 'EC2 instance must be in the correct availability zone'
    
    # Running the test to check the availability zone
    pulumi.runtime.run_in_stack(lambda: infra.ec2_instance.availability_zone.apply(check_az))

# Define a test function to validate the tags of the EC2 instance
@pulumi.runtime.test
def test_instance_tags():
    def check_tags(tags: Dict[str, Any]) -> None:
        print(f"Tags received: {tags}")
        # Assertions to ensure the instance has tags and a 'Name' tag
        assert tags, 'EC2 instance must have tags'
        assert 'Name' in tags, 'EC2 instance must have a Name tag'
    
    # Running the test to check the tags
    pulumi.runtime.run_in_stack(lambda: infra.ec2_instance.tags.apply(check_tags))

# Define a test function to validate the user data script of the EC2 instance
@pulumi.runtime.test
def test_instance_userdata():
    def check_user_data(user_data_script: str) -> None:
        print(f"User data received: {user_data_script}")
        # Assertion to ensure the instance has user data configured
        assert user_data_script is not None, 'EC2 instance must have user_data_script configured'
    
    # Running the test to check the user data script
    pulumi.runtime.run_in_stack(lambda: infra.ec2_instance.user_data.apply(check_user_data))

GitHub Actions

Introduction

GitHub Actions is a powerful automation tool integrated within GitHub, enabling developers to automate their workflows, including testing, building, and deploying code. Pulumi, on the other hand, is an Infrastructure as Code (IaC) tool that allows you to manage cloud resources using familiar programming languages. In this post, we’ll explore why you should use GitHub Actions and its specific purpose when combined with Pulumi.

Why Use GitHub Actions and Its Importance

GitHub Actions is a powerful tool for automating workflows within your GitHub repository, offering several key benefits, especially when combined with Pulumi:

Execution

To execute the GitHub Actions workflow:

By incorporating this workflow, you ensure that your Pulumi infrastructure is continuously integrated and deployed with proper validation, significantly improving the reliability and efficiency of your infrastructure management process.

Example: Deploy infrastructure with Pulumi


name: Pulumi Deployment

on:
  push:
    branches:
      - main

env:
  # Environment variables for AWS credentials and private key.
  AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
  AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
  AWS_DEFAULT_REGION: ${{ secrets.AWS_DEFAULT_REGION }}
  PRIVATE_KEY: ${{ secrets.PRIVATE_KEY }}

jobs:
  pulumi-deploy:
    runs-on: ubuntu-latest
    environment: dev

    steps:
      - name: Checkout code
        uses: actions/checkout@v3
        # Check out the repository code to the runner.

      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v3
        with:
          aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          aws-region: eu-central-1
        # Set up AWS credentials for use in subsequent actions.

      - name: Set up SSH key
        run: |
          mkdir -p ~/.ssh
          echo "${{ secrets.SSH_PRIVATE_KEY }}" > ~/.ssh/XYZ.pem
          chmod 600 ~/.ssh/XYZ.pem
        # Create an SSH directory, add the private SSH key, and set permissions.

      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: '3.9'
        # Set up Python 3.9 environment for running Python-based tasks.
  
      - name: Set up Node.js
        uses: actions/setup-node@v3
        with:
          node-version: '14'
        # Set up Node.js 14 environment for running Node.js-based tasks.

      - name: Install project dependencies
        run: npm install
        working-directory: .
        # Install Node.js project dependencies specified in `package.json`.
      
      - name: Install Pulumi
        run: npm install -g pulumi
        # Install the Pulumi CLI globally.

      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          pip install -r requirements.txt
        working-directory: .
        # Upgrade pip and install Python dependencies from `requirements.txt`.

      - name: Login to Pulumi
        run: pulumi login
        env:
          PULUMI_ACCESS_TOKEN: ${{ secrets.PULUMI_ACCESS_TOKEN }}
        # Log in to Pulumi using the access token stored in secrets.
        
      - name: Set Pulumi configuration for tests
        run: pulumi config set aws:region eu-central-1 --stack dev
        # Set Pulumi configuration to specify AWS region for the `dev` stack.

      - name: Pulumi stack select
        run: pulumi stack select dev  
        working-directory: .
        # Select the `dev` stack for Pulumi operations.

      - name: Run tests
        run: |
          pulumi config set aws:region eu-central-1
          pytest
        working-directory: .
        # Set AWS region configuration and run tests using pytest.
    
      - name: Preview Pulumi changes
        run: pulumi preview --stack dev
        working-directory: .
        # Preview the changes that Pulumi will apply to the `dev` stack.
    
      - name: Update Pulumi stack
        run: pulumi up --yes --stack dev
        working-directory: . 
        # Apply the changes to the `dev` stack with Pulumi.

      - name: Pulumi stack output
        run: pulumi stack output
        working-directory: .
        # Retrieve and display outputs from the Pulumi stack.

      - name: Cleanup Pulumi stack
        run: pulumi destroy --yes --stack dev
        working-directory: . 
        # Destroy the `dev` stack to clean up resources.

      - name: Pulumi stack output (after destroy)
        run: pulumi stack output
        working-directory: .
        # Retrieve and display outputs from the Pulumi stack after destruction.

      - name: Logout from Pulumi
        run: pulumi logout
        # Log out from the Pulumi session.

Output:

Finally, let’s take a look at how everything appears both in GitHub and in AWS. Check out the screenshots below to see the GitHub Actions workflow in action and the resulting AWS resources.

GitHub Actions workflow

AWS FSx resource

AWS FSx storage virtual machine


Outlook: Exploring Advanced Features of Pulumi and Amazon FSx for NetApp ONTAP

As you become more comfortable with Pulumi and Amazon FSx for NetApp ONTAP, there are numerous advanced features and capabilities to explore. These can significantly enhance your infrastructure automation and storage management strategies. In a follow-up blog post, we will delve into these advanced topics, providing a deeper understanding and practical examples.

Advanced Features of Pulumi

  1. Cross-Cloud Infrastructure Management
    Pulumi supports multiple cloud providers, including AWS, Azure, Google Cloud, and Kubernetes. In more advanced scenarios, you can manage resources across different clouds in a single Pulumi project, enabling true multi-cloud and hybrid cloud architectures.
  2. Component Resources
    Pulumi allows you to create reusable components by grouping related resources into custom classes. This is particularly useful for complex deployments where you want to encapsulate and reuse configurations across different projects or environments (see the sketch after this list).
  3. Automation API
    Pulumi’s Automation API enables you to embed Pulumi within your own applications, allowing for infrastructure to be managed programmatically. This can be useful for building custom CI/CD pipelines or integrating with other systems.
  4. Policy as Code with Pulumi CrossGuard
    Pulumi CrossGuard allows you to enforce compliance and security policies across your infrastructure using familiar programming languages. Policies can be applied to ensure that resources adhere to organizational standards, improving governance and reducing risk.
  5. Stack References and Dependency Management
    Pulumi’s stack references enable you to manage dependencies between different Pulumi stacks, allowing for complex, interdependent infrastructure setups. This is crucial for large-scale environments where components must interact and be updated in a coordinated manner (also covered in the sketch below).
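
To make items 2 and 5 a bit more tangible ahead of the follow-up post, here is a minimal sketch in Python. All names in it are illustrative assumptions: the StaticWebsite component, its bucket, and the org/networking/dev stack are not part of the example project from this article.

import pulumi
from pulumi import ComponentResource, ResourceOptions
import pulumi_aws as aws

class StaticWebsite(ComponentResource):
    """Reusable component that groups related resources behind one type."""

    def __init__(self, name: str, opts: ResourceOptions = None):
        # The type token "blog:web:StaticWebsite" is freely chosen.
        super().__init__("blog:web:StaticWebsite", name, None, opts)
        # Parenting child resources to the component groups them
        # in the `pulumi preview` / `pulumi up` output.
        self.bucket = aws.s3.Bucket(
            f"{name}-bucket",
            opts=ResourceOptions(parent=self),
        )
        # Signal that all child resources have been created.
        self.register_outputs({"bucket_name": self.bucket.id})

site = StaticWebsite("demo")

# Stack reference (item 5): read an output exported by another stack.
# "org/networking/dev" is a placeholder for <organization>/<project>/<stack>.
network = pulumi.StackReference("org/networking/dev")
vpc_id = network.get_output("vpc_id")

pulumi.export("bucket", site.bucket.id)

Because the child resources are parented to the component, they also inherit resource options such as providers from it, which is what makes such classes easy to reuse across projects and environments.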

Advanced Features of Amazon FSx for NetApp ONTAP

  1. Data Protection and Snapshots
    FSx for NetApp ONTAP offers advanced data protection features, including automated snapshots, SnapMirror for disaster recovery, and integration with AWS Backup. These features help safeguard your data and ensure business continuity.
  2. Data Tiering and Cost Optimization
    FSx for ONTAP includes intelligent data tiering, which automatically moves infrequently accessed data to lower-cost storage. This feature is vital for optimizing costs, especially in environments with large amounts of data that have varying access patterns (a short Pulumi sketch follows this list).
  3. Multi-Protocol Access and CIFS/SMB Integration
    FSx for ONTAP supports multiple protocols, including NFS, SMB, and iSCSI, enabling seamless access from both Linux/Unix and Windows clients. This is particularly useful in mixed environments where applications or users need to access the same data using different protocols.
  4. Performance Tuning and Quality of Service (QoS)
    FSx for ONTAP allows you to fine-tune performance parameters and implement QoS policies, ensuring that critical workloads receive the necessary resources. This is essential for applications with stringent performance requirements.
  5. ONTAP System Manager and API Integration
    Advanced users can leverage the ONTAP System Manager or integrate with NetApp’s extensive API offerings to automate and customize the management of FSx for ONTAP. This level of control is invaluable for organizations looking to tailor their storage solutions to specific needs.
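
As a taste of item 2, data tiering can already be driven from Pulumi today. The following sketch assumes that the pulumi_aws provider exposes FSx for ONTAP volumes analogously to the underlying Terraform resource (fsx.OntapVolume with a tiering_policy argument); the SVM ID is a placeholder, and argument names should be checked against the provider documentation for your version.

import pulumi
import pulumi_aws as aws

# Placeholder: in a real program this would come from an
# aws.fsx.OntapStorageVirtualMachine resource in the same stack.
svm_id = "svm-0123456789abcdef0"

volume = aws.fsx.OntapVolume(
    "tiered-volume",
    name="tieredvol",
    junction_path="/tieredvol",
    size_in_megabytes=102400,  # 100 GiB volume
    storage_virtual_machine_id=svm_id,
    # AUTO moves blocks that have not been accessed for `cooling_period`
    # days to the cheaper capacity pool tier.
    tiering_policy=aws.fsx.OntapVolumeTieringPolicyArgs(
        name="AUTO",
        cooling_period=31,
    ),
)

pulumi.export("volume_id", volume.id)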

What’s Next?

In the next blog post, we will explore these advanced features in detail, providing practical examples and use cases. We’ll dive into multi-cloud management with Pulumi, demonstrate the creation of reusable infrastructure components, and explore how to enforce security and compliance policies with Pulumi CrossGuard. Additionally, we’ll examine advanced data management strategies with FSx for NetApp ONTAP, including snapshots, data tiering, and performance optimization.

Stay tuned as we take your infrastructure as code and cloud storage management to the next level!

Conclusion

This example demonstrates how Pulumi can be used to manage AWS infrastructure using Python. By defining resources like VPCs, subnets, security groups, and FSx file systems in code, you can version control your infrastructure and easily reproduce environments.

Amazon FSx for NetApp ONTAP offers a powerful and flexible solution for running file-based workloads in the cloud, combining the strengths of AWS and NetApp ONTAP. Pulumi’s ability to leverage existing programming languages for infrastructure management allows for more complex logic and better integration with your development workflows. However, it requires familiarity with these languages and has a smaller ecosystem compared to Terraform. Despite these differences, Pulumi is a powerful tool for managing modern cloud infrastructure.


Disclaimer

The information provided in this blog post is for educational and informational purposes only. The features and capabilities of Pulumi and Amazon FSx for NetApp ONTAP mentioned are subject to change as new versions and updates are released. While we strive to ensure that the content is accurate and up-to-date, we cannot guarantee that it reflects the latest changes or improvements. Always refer to the official documentation and consult with your cloud provider or technology partner for the most current and relevant information. The author and publisher of this blog post are not responsible for any errors or omissions, or for any actions taken based on the information provided.

Suggested Links

  1. Pulumi Official Documentation: https://www.pulumi.com/docs/
  2. Amazon FSx for NetApp ONTAP Documentation: https://docs.aws.amazon.com/fsx/latest/ONTAPGuide/what-is.html
  3. Pulumi GitHub Repository: https://github.com/pulumi/pulumi
  4. NetApp ONTAP Documentation: https://docs.netapp.com/ontap/index.jsp
  5. AWS VPC Documentation: https://docs.aws.amazon.com/vpc/
  6. AWS Security Group Documentation: https://docs.aws.amazon.com/vpc/latest/userguide/VPC_SecurityGroups.html
  7. AWS EC2 Documentation: https://docs.aws.amazon.com/ec2/index.html

DebConf 2024 from 28 July to 4 August 2024 https://debconf24.debconf.org/

Last week the annual Debian Community Conference DebConf happened in Busan, South Korea. Four NetApp employees (Michael, Andrew, Christoph and Noël) participated for the whole week at the Pukyong National University. DebCamp takes place before the conference; there, the infrastructure is set up and the first collaborations take place. The camp is described in a separate article: https://www.credativ.de/en/blog/credativ-inside/debcamp-bootstrap-for-debconf24/
There was a heat wave with high humidity in Korea at the time, but the venue and accommodation at the university are air-conditioned, so collaboration work, talks and BoFs were possible under the circumstances.

Around 400 Debian enthusiasts from all over the world were on site, and additional people attended remotely via the video streaming and the Matrix online chat #debconf:matrix.debian.social.

The content team created a schedule covering different aspects of Debian: technical, social, political, and more.
https://debconf24.debconf.org/schedule/

There were two bigger announcements during DebConf24:

  1. the new distribution eLxr https://elxr.org/ based on Debian, initiated by Windriver
    https://debconf24.debconf.org/talks/138-a-unified-approach-for-intelligent-deployments-at-the-edge/
    Two takeaway points I took from this talk are that Windriver wants to replace CentOS and that it prefers a binary distribution.
  2. The Debian package management system will get a new solver https://debconf24.debconf.org/talks/8-the-new-apt-solver/

The list of interesting talks from a full conference week is much longer. Most talks and BoFs were streamed live, and the recordings can be found in the video archive:
https://meetings-archive.debian.net/pub/debian-meetings/2024/DebConf24/

It is a tradition to have a day trip for socializing and to get a more interesting view of the city and the country. https://wiki.debian.org/DebConf/24/DayTrip/ (sorry, the details of the three day trips are only on the website for participants).

For the annual conference group photo we had to go outside into the heat and high humidity, but I hope you will not see us sweating.

The Debian Conference 2025 will be in July in Brest, France: https://wiki.debian.org/DebConf/25/ and we will be there. :) Maybe it will be a chance for you to join us.

See also Debian News: DebConf24 closes in Busan and DebConf25 dates announced

DebConf24 https://debconf24.debconf.org/ took place from 2024-07-28 to 2024-08-04 in Busan, Korea.

Four employees (three Debian developers) from NetApp had the opportunity to participate in the annual event, which is the most important conference in the Debian world: Christoph Senkel, Andrew Lee, Michael Meskes and Noël Köthe.

DebCamp

What is DebCamp? DebCamp usually takes place in the week before DebConf begins. For participants, it is a hacking session: a week dedicated to Debian contributors focusing on their Debian-related projects, tasks, or problems without interruptions.

DebCamps are largely self-organized since it’s a time for people to work. Some prefer to work individually, while others participate in or organize sprints. Both approaches are encouraged, although it’s recommended to plan your DebCamp week in advance.

During this DebCamp, there were the following public sprints:
Python Team Sprint: QA work on the Python Team’s packages
l10n-pt-br Team Sprint: pt-br translation
Security Tools Packaging Team Sprint: QA work on the pkg-security Team’s packages
Ruby Team Sprint: Work on the transition to Ruby 3.3
Go Team Sprint: Get newer versions of docker.io, containerd, and podman into unstable/testing
Ftpmaster Team Sprint: Discuss potential changes in the ftpmaster team, workflow and communication
DebConf24 Boot Camp: Guide people new to Debian, with a focus on Debian packaging
LXQt Team Sprint: Workshop for newcomers and work on the latest upstream release, based on Qt6 and Wayland support

Scheduled workshops include:

GPG Workshop for Newcomers:
Asymmetric cryptography is a daily tool in Debian operations, used to establish trust and secure communications through email encryption, package signing, and more. In this workshop, participants will learn to create a PGP key and perform essential tasks such as file encryption/decryption, content signing, and sending encrypted emails. After creation, the key will be uploaded to public keyservers, enabling attendees to participate in our Continuous Keysigning Party.

Creating Web Galleries with Geo-Tagged Photos:
Learn how to create a web gallery with integrated maps from a geo-tagged photo collection. The session will cover the use of fgallery, openlayers, and a custom Python script, all orchestrated by a Makefile. This method, used for a South Korea gallery in 2018, will be taught hands-on, empowering others to showcase their photo collections similarly.

Introduction to Creating .deb Files (Debian Packaging):
This session will delve into the basics of Debian packaging and the Debian release cycle, including stable, unstable, and testing branches. Attendees will set up a Debian unstable system, build existing packages from source, and learn to create a Debian package from scratch. Discussions will extend online at #debconf24-bootcamp on irc.oftc.net.

Our colleague Andrew was also part of the orga team this year. He helped arrange the Cheese and Wine party and proposed the idea of organizing a “Coffee Lab” where people could bring coffee equipment and beans from their countries and share them with each other during the conference. Andrew successfully set up the Coffee Lab in the social space with support from the “Local Team” and contributors Kitt, Clement, and Steven. They provided a diverse selection of beans and teas from countries such as Colombia, Ethiopia, India, Peru, Taiwan, Thailand, and Guatemala. Additionally, they shared various coffee-making tools, including the “Mr. Clever Dripper,” AeroPress, and AerSpeed grinder.

DebCamp also allows the DebConf committee to work together with the local team to prepare additional details for the conference. During DebCamp, the organization team typically handles the following tasks:

Setting up the front desk: providing conference badges (with maps and additional information) and distributing swag such as food vouchers, conference t-shirts, conference cups, USB-powered fans, and sponsor gifts.
Setting up the network: configuring the network in conference rooms, hack labs, and the video team equipment for live streaming during the event.
Accommodation arrangements: assigning rooms for participants to check in to on-site accommodations.
Food arrangements: catering to various dietary requirements, including regular, vegetarian, vegan, and special religious and allergy-related needs.
Setting up a social space: providing a relaxed environment for participants to socialize and get to know each other.
Writing daily announcements: keeping participants informed about ongoing activities.
Arranging childcare services.
Organizing day trip options.
Arranging parties.

In addition to the organizational work, our colleague Andrew arranged private sprints during DebCamp and continued through DebConf with his LXQt team BoF and a private LXQt workshop for newcomers, where the team received contributions from new contributors. The youngest was only 13 years old; he created his first GPG key during the GPG workshop and attended the LXQt workshop, where he managed to fix a few bugs in Debian during the session.

Young kids at DebCamp

At DebCamp, two young attendees, aged 13 and 10, participated in a GPG workshop for newcomers and created their own GPG keys. The older child hastily signed another new attendee’s key without proper verification, not fully grasping that Debian’s security relies on the trustworthiness of GPG keys. This prompted a lesson from his Debian Developer father, who explained the importance of trust by comparing it to entrusting someone with the keys to one’s home. Realizing his mistake, the child considered how to rectify the situation since he had already signed and uploaded the key. He concluded that he could revoke the old key and create a new one after DebConf, which he did, securing his new GPG and SSH keys with a Yubikey.

How and when to use Software-Defined Networks in Proxmox VE

Proxmox is still the go-to solution for VM workloads built on open-source software. In the past, we have already covered several topics around Proxmox, like migrating virtual machines from an ESXi environment to Proxmox, using Proxmox with NVMe-oF for breakthrough performance, or integrating the Proxmox Backup Server into a Proxmox cluster.
We can clearly see that there are still many other very interesting topics around Proxmox, and therefore we want to cover the SDN (software-defined networking) feature. From a historical point of view, this feature is not really new: it was already introduced in Proxmox’s web UI with Proxmox 6.2, but was always labeled as experimental. This finally changed with Proxmox 8.x, where SDN not only became fully integrated but, with Proxmox 8.1, also gained the essential feature of IP address management (IPAM). The SDN integration is now installed by default in Proxmox. However, you should still note that this does not mean all features are already stable: IPAM with DHCP management as well as FRRouting and its controller integration are still in a tech preview state. So far, this sounds pretty interesting!

What is Software-Defined Networking?

But what is SDN, and what does it have to do with Proxmox? Software-Defined Networking (SDN), often simply called a software-defined network, is a network architecture that centralizes network intelligence in a programmable controller, which maintains a global view of the network. This architecture allows for dynamic, scalable, and automated network configurations, in contrast to traditional networking, where control and data planes are tightly coupled within network devices. The benefits of SDN include increased flexibility and agility, centralized management, improved resource utilization, and enhanced security. These benefits enable quick deployment and adjustment of network services, simplify the management of large and complex networks, enhance the efficiency of resource allocation, and facilitate the implementation of comprehensive security policies and monitoring.
Proxmox VE also supports SDN to extend its capabilities in managing virtualized networks. With SDN, Proxmox VE offers centralized network management through a unified interface which simplifies the management of virtual networks across multiple nodes. Administrators can define and manage virtual networks at a central point for the whole cluster which reduces the complexity of network configurations. SDN in Proxmox VE also enables network virtualization, allowing the creation of virtual networks that are abstracted from the physical network infrastructure. This capability supports isolated network environments for different virtual machines (VMs) and containers.
Dynamic network provisioning is another key feature of SDN in Proxmox VE, which leverages SDN to dynamically allocate network resources based on the needs of VMs and containers, optimizing performance and resource utilization. The integration of Proxmox VE with Open vSwitch (OVS) enhances these capabilities. OVS is a production-quality, multilayer virtual switch designed to enable SDN and supports advanced network functions such as traffic shaping, QoS, and network isolation. Furthermore, Proxmox VE supports advanced networking features like VLAN tagging, network bonding, and firewall rules, providing comprehensive network management capabilities.

How to configure an SDN

Knowing the basics and possibilities of Software-Defined Networking (SDN), it is time to set up such a network within a Proxmox cluster.

Proxmox comes with support for software-defined networking (SDN), allowing users to integrate various types of network configurations to suit their specific needs. You have the flexibility to select from several SDN zone types, including Simple, which is aimed at straightforward networking setups without the need for advanced features. For environments requiring network segmentation, VLAN support is available, providing the means to isolate and manage traffic within distinct virtual LANs. More complex scenarios might benefit from QinQ support, which allows multiple VLAN tags on a single interface. Also, and very interesting for data centers, Proxmox includes VxLAN support, which extends layer 2 networking over a layer 3 infrastructure and significantly increases the number of possible virtual networks, which would otherwise be limited to 4096 VLANs. Finally, there is EVPN support, facilitating advanced layer 2 and layer 3 virtualization and providing a scalable control plane with BGP (Border Gateway Protocol) for multi-tenancy environments.

In this guide, we’ll walk through the process of setting up a streamlined Software-Defined Network (SDN) within a Proxmox cluster environment. The primary goal is to establish a new network, including its own network configuration, that is automatically propagated across all nodes within the cluster. This newly created network will have its own IP space, with virtual machines (VMs) receiving their IP addresses dynamically via DHCP. This setup eliminates the need for manual IP forwarding or Network Address Translation (NAT) on the host machines. An additional advantage of this configuration is the consistency it offers: the gateway for the VMs will always remain constant regardless of the specific host node they are operating on.

Configuration

The configuration of Software-Defined Networking (SDN) has become very easy in the latest Proxmox VE versions, where the whole process can be done in the Proxmox web UI. Therefore, we just connect to the Proxmox management web interface, which is typically reachable at https://<node-address>:8006.

The SDN options are integrated within the datacenter chapter, in the sub-chapter SDN. All further work will only be done within this chapter. Therefore, we navigate to:
Datacenter → SDN → Zones

The menu on the right side offers the option to add a new zone; we select the type Simple. A new window pops up, where we directly activate the advanced options at the bottom. Afterwards, the further required details can be provided:

ID: devnet01
MTU: Auto
Nodes: All
IPAM: pve
Automatic DHCP: Activate

The ID represents the unique identifier of this zone. It makes sense to give it a recognisable name. Usually, we do not need to adjust the MTU size for this kind of default setup, although there may always be corner cases. In the nodes section, the zone can be assigned to specific nodes or simply to all of them; there may also be scenarios where zones should only be available on specific nodes. Within the advanced options, further details like DNS servers and the forward and reverse zones can be defined. For this basic setup, these are not used, but the automatic DHCP option must be activated.

Now, the next steps are done in the VNets chapter, where the previously created zone is linked to a virtual network. In the same step, we also provide additional network information like the network range.

When creating a new VNet, an identifier or name must be given. It often makes sense to align the virtual network name with the previously created zone name; in this example, the same name will be used. Optionally, an alias can be defined. The important part is to select the desired zone that should be used (e.g., devnet01). After creating the new VNet, we can create a new subnet in the same window by clicking the Create Subnet button.

Within this dialog, some basic network information is entered. In general, we need to provide the desired subnet in CIDR notation (e.g., 10.11.12.0/24). Defining the IP address of the gateway is also possible; in this example, the gateway will be placed on the IP address 10.11.12.1. It is important to activate the SNAT option. SNAT (Source Network Address Translation) is a technique that modifies the source IP address of outgoing network traffic so it appears to originate from a different IP address, usually the IP address of the router or firewall. This method is commonly employed to allow multiple devices on a private network to access external networks.

After creating and linking the zone, VNet and subnet, the configuration can simply be applied in the web interface by clicking the apply button. The configuration will now be synced to the desired nodes (in our example, all of them).
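
The same setup can also be scripted against the Proxmox REST API instead of clicking through the web UI. The following is a rough sketch using the third-party proxmoxer Python client; the paths mirror the API endpoints under /cluster/sdn, but the host, credentials and DHCP range are placeholders, and the exact parameter names (e.g. dhcp, dhcp-range) should be verified in the API viewer of your Proxmox version.

from proxmoxer import ProxmoxAPI

# Connect to any cluster node; host and credentials are placeholders.
proxmox = ProxmoxAPI("pve-node01", user="root@pam",
                     password="secret", verify_ssl=False)

# Create the Simple zone with Proxmox-managed IPAM and automatic DHCP.
proxmox.cluster.sdn.zones.post(zone="devnet01", type="simple",
                               ipam="pve", dhcp="dnsmasq")

# Link a VNet of the same name to the zone.
proxmox.cluster.sdn.vnets.post(vnet="devnet01", zone="devnet01")

# Add the subnet with gateway, SNAT and a DHCP range.
proxmox.cluster.sdn.vnets("devnet01").subnets.post(
    subnet="10.11.12.0/24", type="subnet",
    gateway="10.11.12.1", snat=1,
    **{"dhcp-range": "start-address=10.11.12.100,end-address=10.11.12.200"},
)

# Apply the pending SDN configuration cluster-wide
# (the equivalent of the apply button in the web UI).
proxmox.cluster.sdn.put()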

Usage

After applying the configuration on the nodes within the cluster, virtual machines still need to be assigned to this network. Luckily, this can easily be done in the regular Proxmox web interface, which now also offers the newly created network devnet01 in the networking section of a VM. Existing virtual machines can also be assigned to this network.

When it comes to DevOps and automation, this is also possible via the API, where virtual machines can be assigned to the new network. Such a task could look like the following example in Ansible:

- name: Create Container in Custom Network
  community.general.proxmox:
    vmid: 100
    node: de01-dus01-node03
    api_user: root@pam
    api_password: "{{ api_password }}"
    api_host: de01-dus01-node01
    password: "{{ container_password }}"
    hostname: "{{ container_fqdn }}"
    ostemplate: 'local:vztmpl/debian-12-x86_64.tar.gz'
    netif: '{"net0":"name=eth0,ip=dhcp,ip6=dhcp,bridge=devnet01"}'

Virtual machines assigned to this network will immediately get IP addresses within our previously defined network 10.11.12.0/24 and can access the internet without any further configuration. VMs may also be moved across nodes in the cluster without any need to adjust the gateway, even when a node gets shut down or rebooted for maintenance.

Conclusion

In conclusion, the integration of Software-Defined Networking (SDN) into Proxmox VE represents a huge benefit from a technical as well as a user perspective, since the feature is fully usable from Proxmox’s web UI. This ease of configuration empowers even those with limited networking experience to set up and manage more complex network setups.

With simple SDN zones, Proxmox also makes it easier to create basic networks that let virtual machines connect to the internet, without dealing with complicated settings or gateways on the individual nodes. This makes it quicker to get virtual setups up and running and lowers the chance of mistakes that could lead to security problems.

For people just starting out, Proxmox offers a user-friendly web interface that makes it easy to set up and control networks. This is really helpful because they don’t have to learn a lot of complicated concepts to get started; instead, they can spend more time working with their virtual machines and worry less about how to connect everything.

More technically experienced users will appreciate how Proxmox lets them set up complex networks. This is good for large-scale setups because it can make the network run better, handle more traffic, and keep different parts of the network separate from each other.

Just like other useful integrations (e.g. Ceph), the SDN integration provides huge benefits to its user base and shows the ongoing integration of useful tooling in Proxmox.