09 February 2026

LanguageTool: Powerful Language Checking on Your Own Network

LanguageTool is one of the leading open-source solutions for grammatical and stylistic text checking. While most users are likely familiar with the cloud-based version, the on-premise (self-hosted) variant is gaining increasing importance – especially for businesses, educational institutions, and organizations with high data protection and control requirements.

The core of LanguageTool is licensed under the GNU Lesser General Public License (LGPL-2.1). This license permits free use, modification, and distribution of the software, even in commercial environments, provided that distributed modifications to the original code are also released under the LGPL. The license is “weak copyleft,” meaning that applications using LanguageTool as a library do not necessarily have to be open source. License information can be found in COPYING.txt in the official repository on GitHub. Third-party components such as dictionaries may be under different licenses (e.g., GPL).

There is an open-source version as well as an extended premium version with additional features such as improved style, semantics, and format checks. A detailed overview can be found on the website. Note that for self-hosted instances, premium features are available only for commercial use and only by individual quote. This is not clearly documented, however, and is mostly communicated in the forum on request. It also appears that not all premium features are available there.

Unfortunately, LanguageTool made changes to the use of browser extensions in 2026: a premium subscription is now required for cloud usage. The self-hosted version remains unaffected – here, the browser extension can still be connected to your own server to enable seamless integration into web applications such as email, CMS, or forms.

Features

LanguageTool has a modular design and combines several technologies. These go far beyond the integrated spell checking of, for example, LibreOffice or Thunderbird. However, a much-desired feature is currently not yet available: support for multiple languages within a single document.

  1. Morphological Analyzer & POS Tagger
    First, the text is broken down into sentences and words. Each word receives at least one Part-of-Speech (POS) tag (e.g., noun, verb, adjective). The analyzer also considers inflectional forms, so “gegangen” (gone) is correctly identified as a past participle.

  2. Disambiguator
    Many words have multiple meanings (e.g., “Bank” as a bench or a financial institution). The disambiguator uses contextual information to select the correct interpretation. It works either with rules or statistically, and it improves the accuracy of the rules applied afterwards.

  3. Rule Engine (XML & Java)
    Error detection is based on a combination of:

    • XML Rules: Simple patterns like “dass instead of das” or “missing comma before weil.” These are easy to write and maintain.
    • Java Rules: Complex, context-dependent rules that are programmatically implemented, e.g., for sentence structure or cross-text repetitions.
  4. N-Gram Model (optional)
    To better detect commonly confused words (e.g., “ihre” vs. “Ihre”), an n-gram model can be added. It uses statistical data from vast text corpora (e.g., Google Books) and compares the probability of word sequences. The n-gram data is not included in the standard package but can be downloaded and stored locally.

  5. User Dictionaries
    Custom technical terms can be added to avoid false positives. This is done either via the API or by editing the spelling_custom.txt file.
  6. Markup Support
    With AnnotatedText, HTML, LaTeX, or XML can be processed without distorting position information.
  7. Java API
    For direct integration into Java applications, JLanguageTool offers a powerful interface.
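
The idea behind the optional n-gram model (item 4) can be illustrated with a toy sketch. This is not LanguageTool's implementation: the corpus, the add-one smoothing, and the names `score` and `pick_best` are assumptions made purely for illustration, and English is used instead of German to keep the example short. The sketch scores each candidate from a confusion set by the trigrams that cover its position and keeps the most likely one.

```python
from collections import Counter
from math import prod

# Toy corpus standing in for the large n-gram data sets (illustrative only).
CORPUS = (
    "their house is big . there is a house . "
    "their car is red . there is a car . there is hope ."
).split()

# Count every trigram (three consecutive tokens) in the toy corpus.
TRIGRAMS = Counter(zip(CORPUS, CORPUS[1:], CORPUS[2:]))


def score(tokens, index, candidate):
    """Multiply add-one-smoothed counts of the trigrams covering `index`."""
    padded = (
        ["<s>", "<s>"]
        + tokens[:index] + [candidate] + tokens[index + 1:]
        + ["</s>", "</s>"]
    )
    pos = index + 2  # position of the candidate inside the padded list
    windows = (tuple(padded[s:s + 3]) for s in range(pos - 2, pos + 1))
    return prod(TRIGRAMS[w] + 1 for w in windows)


def pick_best(tokens, index, confusion_set):
    """Return the candidate from `confusion_set` with the highest score."""
    return max(confusion_set, key=lambda word: score(tokens, index, word))


if __name__ == "__main__":
    sentence = "there car is red".split()
    print(pick_best(sentence, 0, ["there", "their"]))
```

A real model works on far larger counts and more careful probability estimates, which is why the downloaded data occupies several gigabytes.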

Integration

Integration is versatile: in addition to the browser extension, LanguageTool supports APIs for custom applications, plugins for LibreOffice, Microsoft Word, Thunderbird, and direct connection to development tools. The self-hosted solution thus offers maximum flexibility, security, and scalability – ideal for use in sensitive or regulated environments.
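
Custom applications that send markup to the HTTP API can separate text from markup in a way similar to AnnotatedText: the API accepts a `data` parameter containing a JSON document with an `annotation` list of `text` and `markup` parts, where a markup part may carry an `interpretAs` value. The following is a minimal sketch, assuming simple, well-formed HTML; the class name `AnnotationBuilder` and the function `html_to_annotation` are hypothetical helpers, not part of LanguageTool.

```python
import html
import json
from html.parser import HTMLParser


class AnnotationBuilder(HTMLParser):
    """Split simple HTML into 'text' and 'markup' parts (hypothetical helper)."""

    def __init__(self):
        # Keep character references as separate events instead of decoding them.
        super().__init__(convert_charrefs=False)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        # Store the start tag exactly as it appeared in the input.
        self.parts.append({"markup": self.get_starttag_text()})

    def handle_endtag(self, tag):
        # Assumption: end tags are written without extra whitespace.
        self.parts.append({"markup": f"</{tag}>"})

    def handle_data(self, data):
        self.parts.append({"text": data})

    def handle_entityref(self, name):
        raw = f"&{name};"
        # Mark the entity as markup but tell the checker what it stands for.
        self.parts.append({"markup": raw, "interpretAs": html.unescape(raw)})

    def handle_charref(self, name):
        raw = f"&#{name};"
        self.parts.append({"markup": raw, "interpretAs": html.unescape(raw)})


def html_to_annotation(markup):
    """Return the JSON string intended for the `data` request parameter."""
    builder = AnnotationBuilder()
    builder.feed(markup)
    builder.close()
    return json.dumps({"annotation": builder.parts}, ensure_ascii=False)


if __name__ == "__main__":
    print(html_to_annotation("<p>Hello <b>wrold</b> &amp; friends</p>"))
```

The resulting string would be sent as `data` instead of `text`, so that only the text parts are checked while the markup is skipped.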

The project maintains a complete list of integrations. Notably absent is a dedicated plugin for the Outlook desktop client. As far as could be determined, demand probably did not justify the effort, although the only forum posts on the subject are fairly old. The browser extension does work without problems with Outlook in the browser, so the limitation should affect only the desktop client.

Deployment

On GitHub, you will find various options for installing a self-hosted service. For local installations in particular, a Docker container is probably the fastest to deploy.

Several images are linked there; this article uses meyay/languagetool as an example.

The maintainer also offers several nearly ready-to-use, copy-paste solutions for starting the service, including a Docker Compose template that runs the service as an unprivileged user and keeps the file system read-only.
To use it, write the content to a file such as docker-compose.yml, create the ‘ngrams’ and ‘fasttext’ directories, and adjust their ownership, for example to the ‘nobody’ user. All subsequent examples were run on a Debian 13 system.

$ mkdir -p ~/Programme/Languagetool
$ cd ~/Programme/Languagetool
$ mkdir ngrams fasttext
# Changing ownership to another user requires root privileges.
$ sudo chown nobody:nogroup ngrams fasttext

Below is the content of the Compose file with n-gram support for German and English. Note that the n-gram data is large: currently about 3 GB for German and 15 GB for English.

services:
  languagetool:
    image: meyay/languagetool:latest
    container_name: languagetool
    restart: unless-stopped
    user: "65534:65534"
    read_only: true
    tmpfs:
      - /tmp:exec
    cap_drop:
      - ALL
    security_opt:
      - no-new-privileges
    ports:
      - 8081:8081
    environment:
      download_ngrams_for_langs: de, en
    volumes:
      - ./ngrams:/ngrams
      - ./fasttext:/fasttext

The service can then be started with the following command:

$ docker compose up -d
# Downloading the n-grams can take a while.
$ docker ps
2af60ed08544 meyay/languagetool:latest "/sbin/tini -g -e 14…" 4 weeks ago Up 3 hours (healthy) 0.0.0.0:8081->8081/tcp, :::8081->8081/tcp languagetool

The service is now available, and the plugins should be able to access it. Note that there is no authentication or other access control: anyone who can reach the URL and port can use the service.
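
To verify the running service from a script, a small client for the HTTP API can help. The following is a minimal sketch, assuming the server is reachable at http://localhost:8081 (the port mapping from the Compose file) and exposes the `/v2/check` endpoint; the function names are hypothetical, and only a few fields of the JSON response (`matches`, `offset`, `length`, `message`, `replacements`) are read.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the port mapping from the Compose file on the local machine.
BASE_URL = "http://localhost:8081"


def build_check_request(text, language="de-DE", base_url=BASE_URL):
    """Build a form-encoded POST request for the /v2/check endpoint."""
    body = urllib.parse.urlencode({"text": text, "language": language})
    return urllib.request.Request(
        f"{base_url}/v2/check", data=body.encode("utf-8"), method="POST"
    )


def summarize_matches(response):
    """Reduce a decoded JSON response to (offset, length, message, suggestions)."""
    return [
        (
            match["offset"],
            match["length"],
            match["message"],
            # Keep at most three suggested replacements per match.
            [r["value"] for r in match.get("replacements", [])][:3],
        )
        for match in response.get("matches", [])
    ]


def check_text(text, language="de-DE", base_url=BASE_URL, timeout=10):
    """Send the text to the server and return the summarized matches."""
    request = build_check_request(text, language, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as reply:
        return summarize_matches(json.load(reply))


if __name__ == "__main__":
    # Requires the container from the Compose file to be running.
    for offset, length, message, suggestions in check_text("Das ist ein Tset."):
        print(offset, length, message, suggestions)
```

Because the service has no access control, a script like this works from any host that can reach the port, which is worth keeping in mind when deciding where to expose it.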

Conclusion

LanguageTool on-premise combines data protection-compliant text checking with flexible integration. The LGPL-2.1 license allows free use, while comprehensive interfaces enable seamless integration into office and web applications. With the correct configuration, a local server becomes a fully functional, enterprise-grade solution for linguistic checking.

 

Categories: credativ® Inside HowTos

About the author

Danilo Endesfelder

Consultant

Danilo has been a consultant at credativ GmbH since 2016. His technical focus is on container technologies such as Kubernetes, Podman, and Docker and their ecosystem. He also has experience with projects and training courses in the area of RDBMS (MySQL/MariaDB and PostgreSQL®). Since 2015 he has also been part of the organizing team of the German PostgreSQL® conference PGConf.DE.
