language_tool_python package

Submodules

language_tool_python.config_file module

Module for configuring LanguageTool’s local server.

class language_tool_python.config_file.LanguageToolConfig(config: Dict[str, Any])

Bases: object

Configuration class for LanguageTool.

Parameters:

config (Dict[str, Any]) – Dictionary containing configuration keys and values.

_create_temp_file() str

Create a temporary file to store the configuration.

Returns:

Path to the temporary file.

Return type:

str

config: Dict[str, Any]

Dictionary containing configuration keys and values.

path: str

Path to the temporary file storing the configuration.

class language_tool_python.config_file.OptionSpec(py_types: type | Tuple[type, ...], encoder: Callable[[Any], str], validator: Callable[[Any], None] | None = None)

Bases: object

Specification for a configuration option.

This class defines the structure and behavior of a configuration option, including its type constraints, encoding mechanism, and optional validation.

This class is frozen (immutable) to ensure configuration specifications remain constant throughout the application lifecycle.

encoder: Callable[[Any], str]

A callable that converts the option value to its string representation.

py_types: type | Tuple[type, ...]

The Python type(s) that this option accepts.

validator: Callable[[Any], None] | None = None

An optional validator function for the option value.

language_tool_python.config_file._bool_encoder(v: Any) str

Encode a value as a lowercase boolean string.

Converts any value to a boolean and returns its string representation in lowercase format (‘true’ or ‘false’).

Parameters:

v (Any) – The value to be converted to a boolean string.

Returns:

A lowercase string representation of the boolean value (‘true’ or ‘false’).

Return type:

str
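The conversion described above can be illustrated with a minimal sketch. The name bool_encoder_sketch is hypothetical, and the body is an assumption about how such an encoder might be written, not the package's actual code.

```python
from typing import Any


def bool_encoder_sketch(v: Any) -> str:
    """Coerce v to a bool, then render it as 'true' or 'false'."""
    # Assumption: plain Python truthiness decides the result.
    return str(bool(v)).lower()


print(bool_encoder_sketch(True))  # true
print(bool_encoder_sketch(0))     # false
```

Note that under plain truthiness any non-empty string, including "false", would encode as 'true'.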

language_tool_python.config_file._comma_list_encoder(v: str | Iterable[str]) str

Encode a value as a comma-separated list string.

Converts a value into a string representation suitable for comma-separated list configuration options. If the input is already a string, it is returned as-is. If it’s an iterable, its elements are converted to strings and joined with commas.

Parameters:

v (Union[str, Iterable[str]]) – The value to encode. Can be a string or an iterable of values.

Returns:

A comma-separated string representation of the input value.

Return type:

str
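As a rough illustration of the two input cases, the following sketch passes strings through unchanged and joins any other iterable with commas. The function name is hypothetical and the body is only one plausible way to get the documented behavior.

```python
from typing import Iterable, Union


def comma_list_encoder_sketch(v: Union[str, Iterable[str]]) -> str:
    """Return strings as-is; join any other iterable with commas."""
    if isinstance(v, str):
        return v
    # Each element is converted with str() before joining.
    return ",".join(str(item) for item in v)
```

The isinstance check comes first because a str is itself iterable; without it, "en-US" would be split into single characters.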

language_tool_python.config_file._encode_config(config: Dict[str, Any]) Dict[str, str]

Encode configuration dictionary values to their string representations. This function converts a configuration dictionary into a format suitable for serialization by encoding each value according to its corresponding schema specification.

Parameters:

config (Dict[str, Any]) – A dictionary containing configuration keys and values to be encoded.

Returns:

A dictionary with the same keys but with all values encoded as strings.

Return type:

Dict[str, str]

Raises:
  • ValueError – If a key in the config is not found in the CONFIG_SCHEMA and is not a language key.

  • TypeError – If a value’s type does not match the expected type(s) defined in the CONFIG_SCHEMA specification.
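The interplay between a frozen option specification and schema-driven encoding might look roughly like the sketch below. OptionSpecSketch, DEMO_SCHEMA, and encode_config_sketch are hypothetical names, the two schema keys are illustrative placeholders rather than the package's actual CONFIG_SCHEMA, and language keys are left out for brevity.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Optional, Tuple, Union


@dataclass(frozen=True)  # frozen: attributes cannot be reassigned
class OptionSpecSketch:
    py_types: Union[type, Tuple[type, ...]]
    encoder: Callable[[Any], str]
    validator: Optional[Callable[[Any], None]] = None


# Illustrative schema only; the real schema is not reproduced here.
DEMO_SCHEMA: Dict[str, OptionSpecSketch] = {
    "maxTextLength": OptionSpecSketch(int, str),
    "pipelineCaching": OptionSpecSketch(bool, lambda v: str(bool(v)).lower()),
}


def encode_config_sketch(config: Dict[str, Any]) -> Dict[str, str]:
    """Encode each value with the encoder of its schema entry."""
    encoded: Dict[str, str] = {}
    for key, value in config.items():
        spec = DEMO_SCHEMA.get(key)
        if spec is None:
            raise ValueError(f"unexpected config key: {key!r}")
        if not isinstance(value, spec.py_types):
            raise TypeError(f"invalid type for {key!r}: {type(value).__name__}")
        if spec.validator is not None:
            spec.validator(value)  # expected to raise on invalid values
        encoded[key] = spec.encoder(value)
    return encoded
```

Keeping the type check, validator, and encoder together in one immutable record means every option is described in a single place.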

language_tool_python.config_file._is_lang_key(key: str) bool

Check if a given key is a valid language key. A valid language key must follow one of these formats:

  • lang-<code> where code is a non-empty language code

  • lang-<code>-dictPath where code is a non-empty language code

Parameters:

key (str) – The key string to validate

Returns:

True if the key is a valid language key, False otherwise

Return type:

bool
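The two accepted formats can be checked with a small sketch like the one below. It assumes that the language code itself contains no hyphen, which the description above does not state; the function name is hypothetical.

```python
def is_lang_key_sketch(key: str) -> bool:
    """Accept 'lang-<code>' and 'lang-<code>-dictPath' with a non-empty code."""
    parts = key.split("-")
    # Must start with 'lang' and carry a non-empty code.
    if parts[0] != "lang" or len(parts) < 2 or not parts[1]:
        return False
    if len(parts) == 2:
        return True
    # Assumption: the only allowed third component is 'dictPath'.
    return len(parts) == 3 and parts[2] == "dictPath"
```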

language_tool_python.config_file._path_encoder(v: Any) str

Encode a path value to a string. Converts the input to a Path object, then to a string, and escapes all backslashes by doubling them. This is useful for Windows file paths and other contexts where backslashes must be escaped, because the encoded value is read by the LanguageTool Java binary.

Parameters:

v (Any) – The path value to encode. Can be any type that Path accepts (str, Path, etc.).

Returns:

The path as a string with escaped backslashes (e.g., “C:\\Users\\file”).

Return type:

str
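A minimal sketch of the backslash doubling, assuming nothing beyond the description above (the function name is hypothetical):

```python
from pathlib import Path
from typing import Any


def path_encoder_sketch(v: Any) -> str:
    """Convert v to a Path, then to str, and double every backslash."""
    return str(Path(v)).replace("\\", "\\\\")


print(path_encoder_sketch(r"C:\Users\file"))  # C:\\Users\\file
```

Paths without backslashes pass through unchanged.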

language_tool_python.config_file._path_validator(v: Any) None

Validate that a given path exists and is a file.

Parameters:

v (Any) – The path to validate, which will be converted to a Path object

Raises:
  • PathError – If the path does not exist

  • PathError – If the path exists but is not a file

language_tool_python.download_lt module

LanguageTool download module.

language_tool_python.download_lt.confirm_java_compatibility(language_tool_version: str = 'latest') None

Confirms if the installed Java version is compatible with language-tool-python. This function checks if Java is installed and verifies that the major version is at least 8 or 17 (depending on the LanguageTool version). It raises an error if Java is not installed or if the version is incompatible.

Parameters:

language_tool_version (str) – The version of LanguageTool to check compatibility for.

Raises:
  • ModuleNotFoundError – If no Java installation is detected.

  • SystemError – If the detected Java version is less than the required version.

language_tool_python.download_lt.download_lt(language_tool_version: str = 'latest') None

Downloads and extracts the specified version of LanguageTool. This function checks for Java compatibility, and downloads the specified version of LanguageTool if it is not already present.

Parameters:

language_tool_version (str) – The version of LanguageTool to download. If not specified, the default version defined by LTP_DOWNLOAD_VERSION is used.

Raises:
  • PathError – If the download folder is not a directory.

  • ValueError – If the specified version format is invalid.

language_tool_python.download_lt.download_zip(url: str, directory: Path) None

Downloads a ZIP file from the given URL and extracts it to the specified directory.

Parameters:
  • url (str) – The URL of the ZIP file to download.

  • directory (Path) – The directory where the ZIP file should be extracted.

language_tool_python.download_lt.get_common_prefix(z: ZipFile) str | None

Determine the common prefix of all file names in a zip archive.

Parameters:

z (zipfile.ZipFile) – A ZipFile object representing the zip archive.

Returns:

The common prefix of all file names in the zip archive, or None if there is no common prefix.

Return type:

Optional[str]
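One way to compute such a prefix is sketched below with os.path.commonprefix, which compares names character by character. This is an illustrative approach and not necessarily how the package determines the prefix; the function name and archive contents are hypothetical.

```python
import io
import os
import zipfile
from typing import Optional


def common_prefix_sketch(z: zipfile.ZipFile) -> Optional[str]:
    """Return the longest shared leading string of all member names."""
    names = z.namelist()
    if not names:
        return None
    # commonprefix works on characters, not on path components.
    return os.path.commonprefix(names) or None


# Build a small archive in memory to try the function.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("demo/a.txt", "a")
    archive.writestr("demo/lib/b.txt", "b")

with zipfile.ZipFile(buffer) as archive:
    print(common_prefix_sketch(archive))  # demo/
```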

language_tool_python.download_lt.http_get(url: str, out_file: IO[bytes], proxies: Dict[str, str] | None = None) None

Downloads a file from a given URL and writes it to the specified output file.

Parameters:
  • url (str) – The URL to download the file from.

  • out_file (IO[bytes]) – The file object to write the downloaded content to.

  • proxies (Optional[Dict[str, str]]) – Optional dictionary of proxies to use for the request.

Raises:
  • TimeoutError – If the request times out.

  • PathError – If the file could not be found at the given URL (HTTP 404).

language_tool_python.download_lt.parse_java_version(version_text: str) Tuple[int, int]

Parse the Java version from a given version text.

This function attempts to extract the major version numbers from the provided Java version string using regular expressions. It supports two different version formats defined by JAVA_VERSION_REGEX and JAVA_VERSION_REGEX_UPDATED.

Parameters:

version_text (str) – The Java version string to parse.

Returns:

A tuple containing the major version numbers.

Return type:

Tuple[int, int]

Raises:

SystemExit – If the version string cannot be parsed.
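The parsing step could be approximated as in the following sketch. DEMO_VERSION_RE is a hypothetical pattern written for this example; it is not JAVA_VERSION_REGEX or JAVA_VERSION_REGEX_UPDATED, whose definitions are not shown here. The sketch returns the first two numeric components of the version, using 0 when the second is absent.

```python
import re
from typing import Tuple

# Hypothetical pattern: matches e.g. version "1.8.0_292" or version "17.0.2".
DEMO_VERSION_RE = re.compile(r'version "(\d+)(?:\.(\d+))?')


def parse_java_version_sketch(version_text: str) -> Tuple[int, int]:
    """Extract the first two numeric version components."""
    match = DEMO_VERSION_RE.search(version_text)
    if match is None:
        raise SystemExit(f"Could not parse Java version from {version_text!r}.")
    return int(match.group(1)), int(match.group(2) or 0)


print(parse_java_version_sketch('java version "1.8.0_292"'))  # (1, 8)
```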

language_tool_python.download_lt.unzip_file(temp_file_name: str, directory_to_extract_to: Path) None

Unzips a zip file to a specified directory.

Parameters:
  • temp_file_name (str) – The name (path) of the temporary zip file to be extracted.

  • directory_to_extract_to (Path) – The directory where the contents of the zip file will be extracted.

language_tool_python.exceptions module

exception language_tool_python.exceptions.JavaError

Bases: LanguageToolError

Exception raised for errors related to the Java backend of LanguageTool. This exception is a subclass of LanguageToolError and is used to indicate issues that occur when interacting with Java, such as Java not being found.

exception language_tool_python.exceptions.LanguageToolError

Bases: Exception

Exception raised for errors in the LanguageTool library. This is a generic exception that can be used to indicate various types of errors encountered while using the LanguageTool library.

exception language_tool_python.exceptions.PathError

Bases: LanguageToolError

Exception raised for errors in the file path used in LanguageTool. This error is raised when there is an issue with a file path provided to LanguageTool, such as the LanguageTool JAR file not being found, or a download path that is not a valid, accessible file path.

exception language_tool_python.exceptions.RateLimitError

Bases: LanguageToolError

Exception raised for errors related to rate limiting in the LanguageTool server. This exception is a subclass of LanguageToolError and is used to indicate issues such as exceeding the allowed number of requests to the public API without a key.

exception language_tool_python.exceptions.ServerError

Bases: LanguageToolError

Exception raised for errors that occur when interacting with the LanguageTool server. This exception is a subclass of LanguageToolError and is used to indicate issues such as server startup failures.

language_tool_python.language_tag module

LanguageTool language tag normalization module.

class language_tool_python.language_tag.LanguageTag(tag: str, languages: Iterable[str])

Bases: object

A class to represent and normalize language tags.

Parameters:
  • tag (str) – The language tag.

  • languages (Iterable[str]) – An iterable of supported language tags.

_LANGUAGE_RE = re.compile('^([a-z]{2,3})(?:[_-]([a-z]{2}))?$', re.IGNORECASE)

A regular expression to match language tags.

_normalize(tag: str) str

Normalize a language tag to a standard format.

Parameters:

tag (str) – The language tag to normalize.

Raises:

ValueError – If the tag is empty or unsupported.

Returns:

The normalized language tag.

Return type:

str
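A simplified sketch of tag normalization, reusing the regular expression shown above. The casing convention (lowercase language, uppercase region, joined by a hyphen) and the lookup against the supported tags are assumptions made for illustration; the actual method may apply additional rules.

```python
import re
from typing import Iterable

LANGUAGE_RE = re.compile(r"^([a-z]{2,3})(?:[_-]([a-z]{2}))?$", re.IGNORECASE)


def normalize_tag_sketch(tag: str, languages: Iterable[str]) -> str:
    """Normalize e.g. 'en_us' to 'en-US' and check that it is supported."""
    if not tag:
        raise ValueError("empty language tag")
    match = LANGUAGE_RE.match(tag)
    if match is None:
        raise ValueError(f"unsupported language: {tag!r}")
    language, region = match.group(1).lower(), match.group(2)
    normalized = f"{language}-{region.upper()}" if region else language
    if normalized not in set(languages):
        raise ValueError(f"unsupported language: {tag!r}")
    return normalized


print(normalize_tag_sketch("en_us", ["en-US", "fr"]))  # en-US
```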

languages: Iterable[str]

An iterable of supported language tags.

normalized_tag: str

The normalized language tag.

tag: str

The language tag to be normalized.

language_tool_python.match module

LanguageTool API Match object representation and utility module.

class language_tool_python.match.Match(attrib: Dict[str, Any], text: str)

Bases: object

Represents a match object that contains information about a language rule violation.

Parameters:
  • attrib (Dict[str, Any]) –

    A dictionary containing various attributes for the match. The dictionary is expected to have the following keys:

    • rule (Dict[str, Any]): A dictionary with keys category (which has an id) and id, issueType.

    • context (Dict[str, Any]): A dictionary with keys offset and text.

    • replacements (List[Dict[str, str]]): A list of dictionaries, each containing a value.

    • length (int): The length of the error.

    • message (str): The message describing the error.

  • text (str) – The original text in which the error occurred (the whole text, not just the context).

Example of a match object received from the LanguageTool API:

{
    'message': 'Possible spelling mistake found.',
    'shortMessage': 'Spelling mistake',
    'replacements': [{'value': 'newt'}, {'value': 'not'}, {'value': 'new', 'shortDescription': 'having just been made'}, {'value': 'news'}, {'value': 'foot', 'shortDescription': 'singular'}, {'value': 'root', 'shortDescription': 'underground organ of a plant'}, {'value': 'boot'}, {'value': 'noon'}, {'value': 'loot', 'shortDescription': 'plunder'}, {'value': 'moot'}, {'value': 'Root'}, {'value': 'soot', 'shortDescription': 'carbon black'}, {'value': 'newts'}, {'value': 'nook'}, {'value': 'Lieut'}, {'value': 'coot'}, {'value': 'hoot'}, {'value': 'toot'}, {'value': 'snoot'}, {'value': 'neut'}, {'value': 'nowt'}, {'value': 'Noor'}, {'value': 'noob'}],
    'offset': 8,
    'length': 4,
    'context': {'text': 'This is noot okay. ', 'offset': 8, 'length': 4},
    'sentence': 'This is noot okay.',
    'type': {'typeName': 'Other'},
    'rule': {'id': 'MORFOLOGIK_RULE_EN_US', 'description': 'Possible spelling mistake', 'issueType': 'misspelling', 'category': {'id': 'TYPOS', 'name': 'Possible Typo'}},
    'ignoreForIncompleteSentence': False,
    'contextForSureMatch': 0
}

FOUR_BYTES_POSITIONS: List[int] | None = None

The positions of 4-byte encoded characters in the text, registered by the previous match object (kept for optimization purposes if the text is the same).

PREVIOUS_MATCHES_TEXT: str | None = None

The text of the previous match object.

category: str

The category of the rule that was violated.

context: str

The context in which the error occurred.

error_length: int

The length of the error.

get_line_and_column(original_text: str) Tuple[int, int]

Returns the line and column number of the error in the context.

Parameters:

original_text (str) – The original text in which the error occurred. It is needed to calculate the line and column number, because the context no longer contains newline characters.

Returns:

A tuple containing the line and column number of the error.

Return type:

Tuple[int, int]
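The underlying calculation can be sketched as a standalone function that takes the text and a character offset. The function name is hypothetical, and the choice of 1-based line and column numbers is an assumption, since the indexing base is not stated above.

```python
from typing import Tuple


def line_and_column_sketch(original_text: str, offset: int) -> Tuple[int, int]:
    """Map a character offset to a (line, column) pair, both 1-based."""
    # Number of newlines before the offset gives the 0-based line index.
    line = original_text.count("\n", 0, offset) + 1
    # rfind returns -1 on the first line, which yields offset + 1.
    column = offset - original_text.rfind("\n", 0, offset)
    return line, column


text = "First line.\nThis is noot okay."
print(line_and_column_sketch(text, text.index("noot")))  # (2, 9)
```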

property matched_text: str

Returns the substring from the context that corresponds to the matched text.

Returns:

The matched text from the context.

Return type:

str

message: str

The message describing the error.

offset: int

The offset of the error.

offset_in_context: int

The offset of the error in the context.

replacements: List[str]

A list of suggested replacements for the error.

rule_id: str

The ID of the rule that was violated.

rule_issue_type: str

The issue type of the rule that was violated.

select_replacement(index: int) None

Select a single replacement suggestion based on the given index and update the replacements list, leaving only the selected replacement.

Parameters:

index (int) – The index of the replacement to select.

Raises:
  • ValueError – If there are no replacement suggestions.

  • ValueError – If the index is out of the valid range.
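The selection logic amounts to a bounds-checked narrowing of the list. The sketch below works on a plain list instead of a Match object and returns the narrowed list rather than mutating state; both choices, and the function name, are made only for illustration.

```python
from typing import List


def select_replacement_sketch(replacements: List[str], index: int) -> List[str]:
    """Keep only the replacement at the given index."""
    if not replacements:
        raise ValueError("there are no replacement suggestions")
    if index < 0 or index >= len(replacements):
        raise ValueError(f"index {index} is out of range")
    return [replacements[index]]


print(select_replacement_sketch(["newt", "not", "new"], 1))  # ['not']
```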

language_tool_python.match.auto_type(obj: Any) Any

Attempts to automatically convert the input object to an integer or float. If the conversion to an integer fails, it tries to convert to a float. If both conversions fail, it returns the original object.

Parameters:

obj (Any) – The object to be converted.

Returns:

The converted object as an integer, float, or the original object.

Return type:

Any
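The fallback chain (integer, then float, then the original object) might be written as in this sketch. Catching TypeError in addition to ValueError is a choice made here so that inputs such as None are returned unchanged; the function name is hypothetical.

```python
from typing import Any


def auto_type_sketch(obj: Any) -> Any:
    """Try int first, then float, otherwise return obj unchanged."""
    try:
        return int(obj)
    except (TypeError, ValueError):
        try:
            return float(obj)
        except (TypeError, ValueError):
            return obj


print(auto_type_sketch("42"))   # 42
print(auto_type_sketch("3.5"))  # 3.5
print(auto_type_sketch("abc"))  # abc
```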

language_tool_python.match.four_byte_char_positions(text: str) List[int]

Identify positions of 4-byte encoded characters in a UTF-8 string. This function scans through the input text and identifies the positions of characters that are encoded with 4 bytes in UTF-8. These characters are typically non-BMP (Basic Multilingual Plane) characters, such as certain emoji and some rare Chinese, Japanese, and Korean characters.

Parameters:

text (str) – The input string to be analyzed.

Returns:

A list of positions where 4-byte encoded characters are found.

Return type:

List[int]
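A minimal sketch of the detection step follows. It reports plain Python string indices; whether the package reports these or indices adjusted for another encoding is not stated above, so treat the index convention as an assumption. The function name is hypothetical.

```python
from typing import List


def four_byte_positions_sketch(text: str) -> List[int]:
    """List the indices of characters that need 4 bytes in UTF-8."""
    return [i for i, ch in enumerate(text) if len(ch.encode("utf-8")) == 4]


# U+1F600 is an emoji outside the Basic Multilingual Plane.
print(four_byte_positions_sketch("a\U0001F600b\U0001F600"))  # [1, 3]
```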

language_tool_python.match.get_match_ordered_dict() OrderedDict[str, type]

Returns an ordered dictionary with predefined keys and their corresponding types.

Returns:

An OrderedDict where each key is a string representing a specific attribute and each value is the type of that attribute.

Return type:

OrderedDict[str, type]

The keys and their corresponding types are:

  • ‘rule_id’: str

  • ‘message’: str

  • ‘replacements’: list

  • ‘offset_in_context’: int

  • ‘context’: str

  • ‘offset’: int

  • ‘error_length’: int

  • ‘category’: str

  • ‘rule_issue_type’: str

  • ‘sentence’: str

language_tool_python.server module

LanguageTool server management module.

class language_tool_python.server.LanguageTool(language: str | None = None, mother_tongue: str | None = None, remote_server: str | None = None, new_spellings: List[str] | None = None, new_spellings_persist: bool = True, host: str | None = None, config: Dict[str, Any] | None = None, language_tool_download_version: str = 'latest', proxies: Dict[str, str] | None = None)

Bases: object

A class to interact with the LanguageTool server for text checking and correction.

Parameters:
  • language (Optional[str]) – The language to be used by the LanguageTool server. If None, it will try to detect the system language.

  • mother_tongue (Optional[str]) – The mother tongue of the user.

  • remote_server (Optional[str]) – URL of a remote LanguageTool server. If provided, the local server will not be started.

  • new_spellings (Optional[List[str]]) – Custom spellings to be added to the LanguageTool server.

  • new_spellings_persist (bool) – Whether the new spellings should persist across sessions. Defaults to True.

  • host (Optional[str]) – The host address for the LanguageTool server. Defaults to ‘localhost’.

  • config (Optional[Dict[str, Any]]) – Dictionary of configuration keys and values for the local LanguageTool server.

  • language_tool_download_version (str) – The version of LanguageTool to download if needed. Defaults to ‘latest’.

  • proxies (Optional[Dict[str, str]]) – A dictionary of proxies to use for server requests (e.g., {‘http’: ‘http://proxy:port’, ‘https’: ‘https://proxy:port’}).

_SPELL_CHECKING_CATEGORIES: Set[str] = {'TYPOS'}

Categories used for spell checking.

_TIMEOUT: Literal[300] = 300

The timeout for server requests.

_available_ports: List[int]

A list of available ports for the server, shuffled randomly.

_config: LanguageToolConfig | None

The server configuration options (used when starting the local server).

_create_params(text: str) Dict[str, str]

Create a dictionary of parameters for the language tool server request.

Parameters:

text (str) – The text to be checked.

Returns:

A dictionary containing the parameters for the request.

Return type:

Dict[str, str]

The dictionary may contain the following keys:

  • ‘language’: The language code.

  • ‘text’: The text to be checked.

  • ‘motherTongue’: The mother tongue language code, if specified.

  • ‘disabledRules’: A comma-separated list of disabled rules, if specified.

  • ‘enabledRules’: A comma-separated list of enabled rules, if specified.

  • ‘enabledOnly’: ‘true’ if only enabled rules should be used.

  • ‘disabledCategories’: A comma-separated list of disabled categories, if specified.

  • ‘enabledCategories’: A comma-separated list of enabled categories, if specified.

  • ‘preferredVariants’: A comma-separated list of preferred language variants, if specified.

  • ‘level’: ‘picky’ if picky mode is enabled.
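How optional keys might be added only when they are set can be sketched as follows, for a subset of the keys listed for this method. The function name and signature are hypothetical, and sorting the rule IDs is a choice made here to keep the output deterministic.

```python
from typing import Dict, Iterable, Optional


def create_params_sketch(
    text: str,
    language: str,
    mother_tongue: Optional[str] = None,
    disabled_rules: Iterable[str] = (),
    enabled_rules: Iterable[str] = (),
    enabled_rules_only: bool = False,
    picky: bool = False,
) -> Dict[str, str]:
    """Build request parameters, omitting options that are not set."""
    params = {"language": language, "text": text}
    if mother_tongue:
        params["motherTongue"] = mother_tongue
    disabled = sorted(disabled_rules)
    if disabled:
        params["disabledRules"] = ",".join(disabled)
    enabled = sorted(enabled_rules)
    if enabled:
        params["enabledRules"] = ",".join(enabled)
    if enabled_rules_only:
        params["enabledOnly"] = "true"
    if picky:
        params["level"] = "picky"
    return params
```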

_disabled_categories: Set[str]

A set of disabled rule categories (used in requests to the server).

_disabled_rules: Set[str]

A set of disabled grammar/style rules (used in requests to the server).

_enabled_categories: Set[str]

A set of explicitly enabled categories (used in requests to the server).

_enabled_rules: Set[str]

A set of explicitly enabled rules (used in requests to the server).

_enabled_rules_only: bool

A flag to use only explicitly enabled rules (used in requests to the server).

_get_languages() Set[str]

Retrieve the set of supported languages from the server. This method starts the server if it is not already running, constructs the URL for querying the supported languages, and sends a request to the server. It then processes the server’s response to extract the language codes and adds them to a set. The special code “auto” is also added to the set before returning it.

Returns:

A set of language codes supported by the server.

Return type:

Set[str]

static _get_valid_spelling_file_path() Path

Retrieve the valid file path for the spelling file. This function constructs the file path for the spelling file used by the language tool. It checks if the file exists at the constructed path and raises a FileNotFoundError if the file is not found.

Raises:

FileNotFoundError – If the spelling file does not exist at the constructed path.

Returns:

The valid file path for the spelling file.

Return type:

Path

_host: str

The host to use for the server.

_language: LanguageTag

The language to use for text checking (used in requests to the server).

_language_tool_download_version: str

The version of LanguageTool to download.

_mother_tongue: str | None

The user’s mother tongue for better error detection (used in requests to the server).

_new_spellings: List[str] | None

A list of new spellings to register.

_new_spellings_persist: bool

A flag to indicate if new spellings should persist.

_picky: bool

A flag to enable stricter checking mode (used in requests to the server).

_port: int

The port number to use for the server.

_preferred_variants: Set[str]

A set of preferred language variants (used in requests to the server).

_proxies: Dict[str, str] | None

A dictionary of proxies for network requests (used in requests to the server).

_query_server(url: str, params: Dict[str, str] | None = None, num_tries: int = 2) Any

Query the server with the given URL and parameters.

Parameters:
  • url (str) – The URL to query.

  • params (Optional[Dict[str, str]], optional) – The parameters to include in the query, defaults to None.

  • num_tries (int, optional) – The number of times to retry the query in case of failure, defaults to 2.

Returns:

The JSON response from the server.

Return type:

Any

Raises:

LanguageToolError – If the server returns an invalid JSON response or if the query fails after the specified number of retries.

_register_spellings() None

Registers new spellings by adding them to the spelling file. This method reads the existing spellings from the spelling file, filters out the new spellings that are already present, and appends the remaining new spellings to the file. If the DEBUG_MODE is enabled, it prints a message indicating the file where the new spellings were registered.

_remote: bool

A flag to indicate if the server is remote.

_server: Popen[str] | None

The server process.

_server_is_alive() bool

Check if the server is alive. This method checks if the server instance exists and is currently running.

Returns:

True if the server is alive (exists and running), False otherwise.

Return type:

bool

_start_local_server() None

Start the local LanguageTool server. This method starts a local instance of the LanguageTool server. If the LanguageTool is not already downloaded, it will download the specified version. It handles the server initialization, including setting up the server command and managing the server process.

Raises:
  • PathError – If the path to LanguageTool cannot be found.

  • ServerError – If the server fails to start or exits early.

_start_server_if_needed() None

Starts the server if it is not already running and is not a remote server. If the server is neither alive nor remote, this method starts it on a free port.

_start_server_on_free_port() None

Attempt to start the server on a free port within the specified range. This method continuously tries to start the local server on the current host and port. If the port is already in use, it increments the port number and tries again until a free port is found or the maximum port number is reached.

Raises:

ServerError – If the server cannot be started and the maximum port number is reached.
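The retry loop described above can be sketched with the actual start attempt injected as a callable, so the logic can be exercised without launching a server. ServerErrorSketch, the function name, and the port numbers are hypothetical stand-ins.

```python
from typing import Callable


class ServerErrorSketch(Exception):
    """Stand-in for the package's ServerError."""


def start_on_free_port_sketch(
    try_start: Callable[[int], bool], first_port: int, last_port: int
) -> int:
    """Try each port in turn and return the first one that works."""
    port = first_port
    while port <= last_port:
        if try_start(port):  # True means the server started on this port
            return port
        port += 1  # port busy: move on to the next one
    raise ServerErrorSketch("no free port available in the given range")


busy = {8081, 8082}
print(start_on_free_port_sketch(lambda p: p not in busy, 8081, 8083))  # 8083
```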

_terminate_server() None

Terminates the server process. This method performs the following steps:

  1. Attempts to terminate the server process gracefully.

  2. Closes the associated file descriptor (stdin).

_unregister_spellings() None

Unregister new spellings from the spelling file. This method reads the current spellings from the spelling file, removes any spellings that are present in the _new_spellings attribute, and writes the updated list back to the file.

_update_remote_server_config(url: str) None

Update the configuration to use a remote server.

Parameters:

url (str) – The URL of the remote server.

_url: str

The base URL of the LanguageTool server (used in all server requests).

_wait_for_server_ready(timeout: int = 15) None

Wait for the LanguageTool server to become ready and responsive. This method polls the server’s /healthcheck endpoint until it responds successfully or until the timeout is reached. It also monitors the server process to detect early exits.

Parameters:

timeout (int) – Maximum time in seconds to wait for the server to become ready. Defaults to 15 seconds.

Raises:

ServerError – If the server process is not initialized, if it exits early with a non-zero code, or if the server does not become ready within the specified timeout period.
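A polling loop of this kind might look like the sketch below. The readiness probe and the process check are injected as callables so the example runs without a server; the function name, the polling interval, and the use of RuntimeError and TimeoutError in place of ServerError are assumptions made for illustration.

```python
import time
from typing import Callable


def wait_until_ready_sketch(
    is_ready: Callable[[], bool],
    has_exited: Callable[[], bool],
    timeout: float = 15.0,
    interval: float = 0.05,
) -> None:
    """Poll is_ready until it succeeds, the process exits, or time runs out."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if has_exited():
            raise RuntimeError("server process exited before becoming ready")
        if is_ready():
            return
        time.sleep(interval)
    raise TimeoutError("server did not become ready within the timeout")
```

Using time.monotonic() rather than wall-clock time keeps the deadline unaffected by system clock adjustments.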

check(text: str) List[Match]

Checks the given text for language issues using the LanguageTool server.

Parameters:

text (str) – The text to be checked for language issues.

Returns:

A list of Match objects representing the issues found in the text.

Return type:

List[Match]

check_matching_regions(text: str, pattern: str, flags: int = 0) List[Match]

Check only the parts of the text that match a regex pattern. The returned Match objects can be applied to the original text with language_tool_python.utils.correct().

Parameters:
  • text (str) – The full text.

  • pattern (str) – Regular expression defining the regions to check.

  • flags (int) – Regex flags (re.IGNORECASE, re.MULTILINE, etc.).

Returns:

List of Match with offsets adjusted to the original text

Return type:

List[Match]
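The offset adjustment can be illustrated with a sketch that represents each finding as an (offset, length) pair and takes the checker as a callable. Both simplifications, and all names below, are hypothetical; real results are Match objects produced by the server.

```python
import re
from typing import Callable, List, Tuple

Span = Tuple[int, int]  # (offset, length) relative to the checked string


def check_regions_sketch(
    text: str, pattern: str, check: Callable[[str], List[Span]], flags: int = 0
) -> List[Span]:
    """Check only the regions matching pattern; shift offsets to the full text."""
    results: List[Span] = []
    for region in re.finditer(pattern, text, flags):
        for offset, length in check(region.group(0)):
            # Region-relative offset becomes an offset into the full text.
            results.append((region.start() + offset, length))
    return results


def find_noot(chunk: str) -> List[Span]:
    """Toy checker that flags every occurrence of 'noot'."""
    return [(m.start(), 4) for m in re.finditer("noot", chunk)]


source = "code(); // This is noot okay\nnoot = 1"
print(check_regions_sketch(source, r"//.*", find_noot))  # [(19, 4)]
```

Only the occurrence inside the comment is reported; the second "noot" lies outside every matched region.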

close() None

Closes the server and performs necessary cleanup operations.

This method performs the following actions:

  1. If the server is alive and not remote, terminates it.

  2. If new spellings exist and are not set to persist, unregisters them and clears the list of new spellings.

property config: LanguageToolConfig | None

Get the server configuration.

This property is read-only as the configuration is set during initialization and cannot be changed while the server is running.

Returns:

The configuration object if set, otherwise None.

Return type:

Optional[LanguageToolConfig]

correct(text: str) str

Corrects the given text by applying language tool suggestions. Applies only the first suggestion for each issue.

Parameters:

text (str) – The text to be corrected.

Returns:

The corrected text.

Return type:

str

disable_spellchecking() None

Disable spellchecking by updating the disabled categories with spell checking categories.

property disabled_categories: Set[str]

Get the set of disabled rule categories.

Returns:

A set of disabled category names.

Return type:

Set[str]

property disabled_rules: Set[str]

Get the set of disabled rules.

Returns:

A set of disabled rule IDs.

Return type:

Set[str]

enable_spellchecking() None

Enable spellchecking by removing spell checking categories from the disabled categories set. This method updates the disabled_categories attribute by removing any categories that are related to spell checking, which are defined in the _SPELL_CHECKING_CATEGORIES class constant.

property enabled_categories: Set[str]

Get the set of enabled rule categories.

Returns:

A set of enabled category names.

Return type:

Set[str]

property enabled_rules: Set[str]

Get the set of enabled rules.

Returns:

A set of enabled rule IDs.

Return type:

Set[str]

property enabled_rules_only: bool

Get whether only enabled rules should be used.

Returns:

True if using only enabled rules, False otherwise.

Return type:

bool

property host: str

Get the local server host address.

This property is read-only as the host address is determined during initialization and cannot be changed while the server is running.

Returns:

The host address (e.g., ‘127.0.0.1’).

Return type:

str

property is_remote: bool

Get whether using a remote LanguageTool server.

This property is read-only as the remote status is determined during initialization and cannot be changed while the server is running.

Returns:

True if using a remote server, False if using a local server.

Return type:

bool

property language: LanguageTag

Get the language tag associated with the server.

Returns:

The language tag.

Return type:

LanguageTag

property language_tool_download_version: str

Get the LanguageTool version to download.

This property is read-only as the version is determined during initialization and the server cannot be re-downloaded with a different version at runtime.

Returns:

The LanguageTool version string.

Return type:

str

property mother_tongue: LanguageTag | None

Get the mother tongue language tag.

Returns:

The mother tongue language tag if set, otherwise None.

Return type:

Optional[LanguageTag]

property picky: bool

Get whether picky mode is enabled.

Returns:

True if picky mode is enabled, False otherwise.

Return type:

bool

property port: int

Get the local server port number.

This property is read-only as the port number is determined during initialization and cannot be changed while the server is running.

Returns:

The port number (e.g., 8081).

Return type:

int

property preferred_variants: Set[str]

Get the set of preferred language variants.

Returns:

A set of preferred variant codes.

Return type:

Set[str]

property proxies: Dict[str, str] | None

Get the proxies used for server requests.

Returns:

A dictionary of proxies if set, otherwise None.

Return type:

Optional[Dict[str, str]]

property url: str

Get the LanguageTool server URL.

This property is read-only as the URL is determined during initialization and cannot be changed while the server is running.

Returns:

The server URL (e.g., ‘http://localhost:8081/v2/’).

Return type:

str

class language_tool_python.server.LanguageToolPublicAPI(*args: Any, **kwargs: Any)

Bases: LanguageTool

A class to interact with the public LanguageTool API. This class extends the LanguageTool class and initializes it with the remote server set to the public LanguageTool API endpoint.

Parameters:
  • args (Any) – Positional arguments passed to the parent class initializer.

  • kwargs (Any) – Keyword arguments passed to the parent class initializer.

language_tool_python.server._kill_processes(processes: List[Popen[str]]) None

Kill all running server processes. This function iterates over the list of running server processes and forcefully kills each process by its PID.

Parameters:

processes (List[subprocess.Popen]) – A list of subprocess.Popen objects representing the running server processes.

language_tool_python.server.terminate_server() None

Terminates all running server processes. This function iterates over the list of running server processes and forcefully kills each process by its PID.

language_tool_python.utils module

Utility functions for the LanguageTool library.

class language_tool_python.utils.TextStatus(*values)

Bases: Enum

CORRECT = 'correct'

FAULTY = 'faulty'

GARBAGE = 'garbage'

language_tool_python.utils._extract_version(path: Path) Version

Extract the version number from a LanguageTool directory path.

This function parses the directory name to extract the version information from LanguageTool installation folders that follow the naming convention ‘LanguageTool-X.Y-SNAPSHOT’.

Parameters:

path (Path) – The path to the LanguageTool directory

Returns:

The parsed version object extracted from the directory name

Return type:

version.Version

Raises:

ValueError – If the directory name doesn’t start with ‘LanguageTool-’
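Parsing the directory name could be approximated as follows. To stay within the standard library, this sketch returns a tuple of integers instead of a Version object and assumes the version consists of dot-separated numbers; the function name is hypothetical.

```python
from pathlib import Path
from typing import Tuple

PREFIX = "LanguageTool-"
SNAPSHOT_SUFFIX = "-SNAPSHOT"


def extract_version_sketch(path: Path) -> Tuple[int, ...]:
    """Turn '.../LanguageTool-X.Y-SNAPSHOT' into a tuple such as (X, Y)."""
    name = path.name
    if not name.startswith(PREFIX):
        raise ValueError(f"unexpected directory name: {name!r}")
    version_text = name[len(PREFIX):]
    if version_text.endswith(SNAPSHOT_SUFFIX):
        version_text = version_text[: -len(SNAPSHOT_SUFFIX)]
    return tuple(int(part) for part in version_text.split("."))
```

Comparing tuples of integers orders 6.10 after 6.9, which a plain string comparison would get wrong.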

language_tool_python.utils.classify_matches(matches: List[Match]) TextStatus

Classify the matches (result of a check on a text) into one of three categories: CORRECT, FAULTY, or GARBAGE. This function checks the status of the matches and returns a corresponding TextStatus value.

Parameters:

matches (List[Match]) – A list of Match objects to be classified.

Returns:

The classification of the matches as a TextStatus value.

Return type:

TextStatus

language_tool_python.utils.correct(text: str, matches: List[Match]) str

Corrects the given text based on the provided matches. Only the first replacement for each match is applied to the text.

Parameters:
  • text (str) – The original text to be corrected.

  • matches (List[Match]) – A list of Match objects that contain the positions and replacements for errors in the text.

Returns:

The corrected text.

Return type:

str
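A minimal sketch of the "first replacement only" behaviour, assuming each match exposes offset, error_length and replacements as documented for Match. MatchStub and correct_sketch are hypothetical names used only here, and the right-to-left application order is a choice of this sketch, not a confirmed detail of the library.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class MatchStub:
    """Hypothetical stand-in exposing only the fields this sketch needs."""
    offset: int
    error_length: int
    replacements: List[str] = field(default_factory=list)


def correct_sketch(text: str, matches: List[MatchStub]) -> str:
    """Apply the first replacement of each match to the text."""
    # Work from the end of the text so that earlier offsets stay valid.
    for match in sorted(matches, key=lambda m: m.offset, reverse=True):
        if not match.replacements:
            continue  # nothing to apply for this match
        end = match.offset + match.error_length
        text = text[: match.offset] + match.replacements[0] + text[end:]
    return text
```

Applying edits from the last offset backwards avoids having to track how much each replacement shifts the remaining text.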

language_tool_python.utils.find_existing_language_tool_downloads(download_folder: Path) List[Path]

Find existing LanguageTool downloads in the specified folder. This function searches for directories in the given download folder that match the pattern ‘LanguageTool*’ and returns a list of their paths.

Parameters:

download_folder (Path) – The folder where LanguageTool downloads are stored.

Returns:

A list of paths to the existing LanguageTool download directories.

Return type:

List[Path]
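The directory search described above could look roughly like the following. This is a sketch using pathlib only; find_downloads_sketch is a hypothetical name, and the sorted order is added here for reproducibility rather than taken from the library.

```python
from pathlib import Path
from typing import List


def find_downloads_sketch(download_folder: Path) -> List[Path]:
    """Return directories directly under download_folder named 'LanguageTool*'."""
    # Files that happen to match the pattern are skipped.
    return sorted(p for p in download_folder.glob("LanguageTool*") if p.is_dir())
```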

language_tool_python.utils.get_jar_info() Tuple[Path, Path]

Retrieve the path to the Java executable and the LanguageTool JAR file. This function searches for the Java executable in the system’s PATH and locates the LanguageTool JAR file either in a directory specified by an environment variable or in a default download directory.

Raises:
  • JavaError – If the Java executable cannot be found.

  • PathError – If the LanguageTool JAR file cannot be found in the specified directory.

Returns:

A tuple containing the path to the Java executable and the path to the LanguageTool JAR file.

Return type:

Tuple[Path, Path]

language_tool_python.utils.get_language_tool_directory() Path

Get the directory path of the LanguageTool installation. This function checks the download folder for LanguageTool installations, verifies that the folder exists and is a directory, and returns the path to the latest version of LanguageTool found in the directory.

Raises:
  • NotADirectoryError – If the download folder path is not a valid directory.

  • FileNotFoundError – If no LanguageTool installation is found in the download folder.

Returns:

The path to the latest version of LanguageTool found in the directory.

Return type:

Path

language_tool_python.utils.get_language_tool_download_path() Path

Get the download path for LanguageTool. This function retrieves the download path for LanguageTool from the environment variable specified by LTP_PATH_ENV_VAR. If the environment variable is not set, it defaults to a path in the user’s home directory under .cache/language_tool_python. The function ensures that the directory exists before returning it.

Returns:

The download path for LanguageTool.

Return type:

Path
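The lookup order (environment variable first, then a default under the home directory) can be sketched as follows. The variable name EXAMPLE_LT_PATH is a placeholder chosen for this example; the actual name is whatever LTP_PATH_ENV_VAR holds, which this section does not state.

```python
import os
from pathlib import Path


def download_path_sketch(env_var: str = "EXAMPLE_LT_PATH") -> Path:
    """Resolve the download folder from an environment variable or a default."""
    # "EXAMPLE_LT_PATH" is a placeholder, not the library's variable name.
    default = Path.home() / ".cache" / "language_tool_python"
    path = Path(os.environ.get(env_var, default))
    path.mkdir(parents=True, exist_ok=True)  # ensure the directory exists
    return path
```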

language_tool_python.utils.get_locale_language() str

Get the current locale language. This function retrieves the current locale language setting of the system. It first attempts to get the locale using locale.getlocale(). If that fails, it falls back to using locale.getdefaultlocale(). If both methods fail to provide a valid language code, it returns a default failsafe language code.

Returns:

The language code of the current locale.

Return type:

str

language_tool_python.utils.get_server_cmd(port: int | None = None, config: LanguageToolConfig | None = None) List[str]

Generate the command to start the LanguageTool HTTP server.

Parameters:
  • port (Optional[int]) – Optional; The port number on which the server should run. If not provided, the default port will be used.

  • config (Optional[LanguageToolConfig]) – Optional; The configuration for the LanguageTool server. If not provided, default configuration will be used.

Returns:

A list of command line arguments to start the LanguageTool HTTP server.

Return type:

List[str]

language_tool_python.utils.kill_process_force(*, pid: int | None = None, proc: Process | None = None) None

Forcefully kills a process and all its child processes. This function attempts to kill a process specified either by its PID or by a psutil.Process object. If the process has any child processes, they will be killed first.

Parameters:
  • pid (Optional[int]) – The process ID of the process to be killed. Either pid or proc must be provided.

  • proc (Optional[psutil.Process]) – A psutil.Process object representing the process to be killed. Either pid or proc must be provided.

Raises:

ValueError – If neither pid nor proc is provided.

language_tool_python.utils.parse_url(url_str: str) str

Parse the given URL string and ensure it has a scheme. If the input URL string does not contain ‘http’, ‘http://’ is prepended to it. The function then parses the URL and returns its canonical form.

Parameters:

url_str (str) – The URL string to be parsed.

Returns:

The parsed URL in its canonical form.

Return type:

str
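The behaviour described above can be approximated with urllib.parse from the standard library. This is a sketch under the assumption that "canonical form" means the result of urlparse(...).geturl(); parse_url_sketch is a hypothetical name.

```python
from urllib.parse import urlparse


def parse_url_sketch(url_str: str) -> str:
    """Prepend 'http://' when no 'http' is present, then re-serialize the URL."""
    if "http" not in url_str:
        url_str = "http://" + url_str
    return urlparse(url_str).geturl()
```

Note that the check is a substring test, so any string containing "http" anywhere is left without a prepended scheme.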

Module contents

LanguageTool API for Python.

class language_tool_python.LanguageTag(tag: str, languages: Iterable[str])

Bases: object

A class to represent and normalize language tags.

Parameters:
  • tag (str) – The language tag.

  • languages (Iterable[str]) – An iterable of supported language tags.

_LANGUAGE_RE = re.compile('^([a-z]{2,3})(?:[_-]([a-z]{2}))?$', re.IGNORECASE)

A regular expression to match language tags.

_normalize(tag: str) str

Normalize a language tag to a standard format.

Parameters:

tag (str) – The language tag to normalize.

Raises:

ValueError – If the tag is empty or unsupported.

Returns:

The normalized language tag.

Return type:

str
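To show what the regular expression above captures, here is a hedged sketch of a normalizer built on it. The output shape (lowercase language, uppercase region, joined by a hyphen) and the exact membership check are assumptions of this example, and normalize_sketch is a hypothetical name.

```python
import re
from typing import Iterable

# Same pattern as the _LANGUAGE_RE attribute documented above.
LANGUAGE_RE = re.compile(r"^([a-z]{2,3})(?:[_-]([a-z]{2}))?$", re.IGNORECASE)


def normalize_sketch(tag: str, languages: Iterable[str]) -> str:
    """Turn tags such as 'en_us' into 'en-US' and reject unsupported ones."""
    if not tag:
        raise ValueError("empty language tag")
    match = LANGUAGE_RE.match(tag)
    if match is None:
        raise ValueError(f"unsupported language: {tag!r}")
    language, region = match.group(1).lower(), match.group(2)
    # Assumed output shape: lowercase language, uppercase region, hyphen between.
    normalized = f"{language}-{region.upper()}" if region else language
    if normalized not in set(languages):
        raise ValueError(f"unsupported language: {tag!r}")
    return normalized
```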

languages: Iterable[str]

An iterable of supported language tags.

normalized_tag: str

The normalized language tag.

tag: str

The language tag to be normalized.

class language_tool_python.LanguageTool(language: str | None = None, mother_tongue: str | None = None, remote_server: str | None = None, new_spellings: List[str] | None = None, new_spellings_persist: bool = True, host: str | None = None, config: Dict[str, Any] | None = None, language_tool_download_version: str = 'latest', proxies: Dict[str, str] | None = None)

Bases: object

A class to interact with the LanguageTool server for text checking and correction.

Parameters:
  • language (Optional[str]) – The language to be used by the LanguageTool server. If None, it will try to detect the system language.

  • mother_tongue (Optional[str]) – The mother tongue of the user.

  • remote_server (Optional[str]) – URL of a remote LanguageTool server. If provided, the local server will not be started.

  • new_spellings (Optional[List[str]]) – Custom spellings to be added to the LanguageTool server.

  • new_spellings_persist (bool) – Whether the new spellings should persist across sessions. Defaults to True.

  • host (Optional[str]) – The host address for the LanguageTool server. Defaults to ‘localhost’.

  • config (Optional[Dict[str, Any]]) – Dictionary of configuration keys and values for the local LanguageTool server.

  • language_tool_download_version (str) – The version of LanguageTool to download if needed. Defaults to ‘latest’.

  • proxies (Optional[Dict[str, str]]) – A dictionary of proxies to use for server requests (e.g., {‘http’: ‘http://proxy:port’, ‘https’: ‘https://proxy:port’}).

_SPELL_CHECKING_CATEGORIES: Set[str] = {'TYPOS'}

Categories used for spell checking.

_TIMEOUT: Literal[300] = 300

The timeout for server requests.

_available_ports: List[int]

A list of available ports for the server, shuffled randomly.

_config: LanguageToolConfig | None

The server configuration options (used when starting the local server).

_create_params(text: str) Dict[str, str]

Create a dictionary of parameters for the language tool server request.

Parameters:

text (str) – The text to be checked.

Returns:

A dictionary containing the parameters for the request.

Return type:

Dict[str, str]

The dictionary may contain the following keys:
  • ‘language’: The language code.

  • ‘text’: The text to be checked.

  • ‘motherTongue’: The mother tongue language code, if specified.

  • ‘disabledRules’: A comma-separated list of disabled rules, if specified.

  • ‘enabledRules’: A comma-separated list of enabled rules, if specified.

  • ‘enabledOnly’: ‘true’ if only enabled rules should be used.

  • ‘disabledCategories’: A comma-separated list of disabled categories, if specified.

  • ‘enabledCategories’: A comma-separated list of enabled categories, if specified.

  • ‘preferredVariants’: A comma-separated list of preferred language variants, if specified.

  • ‘level’: ‘picky’ if picky mode is enabled.
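The way optional keys are included only when set can be sketched as follows. This example covers a subset of the keys listed above; create_params_sketch and its signature are hypothetical, and sorting the rule IDs is an assumption made to keep the output deterministic.

```python
from typing import Dict, FrozenSet, Optional


def create_params_sketch(
    text: str,
    language: str,
    mother_tongue: Optional[str] = None,
    disabled_rules: FrozenSet[str] = frozenset(),
    enabled_rules: FrozenSet[str] = frozenset(),
    enabled_only: bool = False,
    picky: bool = False,
) -> Dict[str, str]:
    """Build request parameters, adding optional keys only when they are set."""
    params = {"language": language, "text": text}
    if mother_tongue is not None:
        params["motherTongue"] = mother_tongue
    if disabled_rules:
        # Sorting is an assumption made here to keep the output deterministic.
        params["disabledRules"] = ",".join(sorted(disabled_rules))
    if enabled_rules:
        params["enabledRules"] = ",".join(sorted(enabled_rules))
    if enabled_only:
        params["enabledOnly"] = "true"
    if picky:
        params["level"] = "picky"
    return params
```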

_disabled_categories: Set[str]

A set of disabled rule categories (used in requests to the server).

_disabled_rules: Set[str]

A set of disabled grammar/style rules (used in requests to the server).

_enabled_categories: Set[str]

A set of explicitly enabled categories (used in requests to the server).

_enabled_rules: Set[str]

A set of explicitly enabled rules (used in requests to the server).

_enabled_rules_only: bool

A flag to use only explicitly enabled rules (used in requests to the server).

_get_languages() Set[str]

Retrieve the set of supported languages from the server. This method starts the server if it is not already running, constructs the URL for querying the supported languages, and sends a request to the server. It then processes the server’s response to extract the language codes and adds them to a set. The special code “auto” is also added to the set before returning it.

Returns:

A set of language codes supported by the server.

Return type:

Set[str]

static _get_valid_spelling_file_path() Path

Retrieve the valid file path for the spelling file. This function constructs the file path for the spelling file used by the language tool. It checks if the file exists at the constructed path and raises a FileNotFoundError if the file is not found.

Raises:

FileNotFoundError – If the spelling file does not exist at the constructed path.

Returns:

The valid file path for the spelling file.

Return type:

Path

_host: str

The host to use for the server.

_language: LanguageTag

The language to use for text checking (used in requests to the server).

_language_tool_download_version: str

The version of LanguageTool to download.

_mother_tongue: str | None

The user’s mother tongue for better error detection (used in requests to the server).

_new_spellings: List[str] | None

A list of new spellings to register.

_new_spellings_persist: bool

A flag to indicate if new spellings should persist.

_picky: bool

A flag to enable stricter checking mode (used in requests to the server).

_port: int

The port number to use for the server.

_preferred_variants: Set[str]

A set of preferred language variants (used in requests to the server).

_proxies: Dict[str, str] | None

A dictionary of proxies for network requests (used in requests to the server).

_query_server(url: str, params: Dict[str, str] | None = None, num_tries: int = 2) Any

Query the server with the given URL and parameters.

Parameters:
  • url (str) – The URL to query.

  • params (Optional[Dict[str, str]], optional) – The parameters to include in the query, defaults to None.

  • num_tries (int, optional) – The number of times to retry the query in case of failure, defaults to 2.

Returns:

The JSON response from the server.

Return type:

Any

Raises:

LanguageToolError – If the server returns an invalid JSON response or if the query fails after the specified number of retries.
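The retry-then-fail pattern described above can be sketched without any network access by passing the request in as a callable. Everything named here is hypothetical: query_with_retries_sketch, the fetch parameter, the reading of num_tries as the total number of attempts, and RuntimeError standing in for LanguageToolError.

```python
import json
from typing import Any, Callable, Optional


def query_with_retries_sketch(fetch: Callable[[], str], num_tries: int = 2) -> Any:
    """Call fetch() up to num_tries times and decode the body as JSON."""
    last_error: Optional[Exception] = None
    for _ in range(num_tries):
        try:
            body = fetch()
        except OSError as exc:  # e.g. a refused or dropped connection
            last_error = exc
            continue
        try:
            return json.loads(body)
        except json.JSONDecodeError as exc:
            # RuntimeError stands in for LanguageToolError in this sketch.
            raise RuntimeError("server returned invalid JSON") from exc
    raise RuntimeError(f"query failed after {num_tries} tries") from last_error
```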

_register_spellings() None

Registers new spellings by adding them to the spelling file. This method reads the existing spellings from the spelling file, filters out the new spellings that are already present, and appends the remaining new spellings to the file. If the DEBUG_MODE is enabled, it prints a message indicating the file where the new spellings were registered.

_remote: bool

A flag to indicate if the server is remote.

_server: Popen[str] | None

The server process.

_server_is_alive() bool

Check if the server is alive. This method checks if the server instance exists and is currently running.

Returns:

True if the server is alive (exists and running), False otherwise.

Return type:

bool

_start_local_server() None

Start the local LanguageTool server. This method starts a local instance of the LanguageTool server. If the LanguageTool is not already downloaded, it will download the specified version. It handles the server initialization, including setting up the server command and managing the server process.

Raises:
  • PathError – If the path to LanguageTool cannot be found.

  • ServerError – If the server fails to start or exits early.

_start_server_if_needed() None

Start the server if it is local and not already running. The method does nothing when a remote server is used or when the local server is already alive; otherwise it starts the server on a free port.

_start_server_on_free_port() None

Attempt to start the server on a free port within the specified range. This method continuously tries to start the local server on the current host and port. If the port is already in use, it increments the port number and tries again until a free port is found or the maximum port number is reached.

Raises:

ServerError – If the server cannot be started and the maximum port number is reached.

_terminate_server() None

Terminates the server process. This method performs the following steps:

  1. Attempts to terminate the server process gracefully.

  2. Closes associated file descriptor (stdin).

_unregister_spellings() None

Unregister new spellings from the spelling file. This method reads the current spellings from the spelling file, removes any spellings that are present in the _new_spellings attribute, and writes the updated list back to the file.

_update_remote_server_config(url: str) None

Update the configuration to use a remote server.

Parameters:

url (str) – The URL of the remote server.

_url: str

The base URL of the LanguageTool server (used in all server requests).

_wait_for_server_ready(timeout: int = 15) None

Wait for the LanguageTool server to become ready and responsive. This method polls the server’s /healthcheck endpoint until it responds successfully or until the timeout is reached. It also monitors the server process to detect early exits.

Parameters:

timeout (int) – Maximum time in seconds to wait for the server to become ready. Defaults to 15 seconds.

Raises:

ServerError – If the server process is not initialized, if it exits early with a non-zero code, or if the server does not become ready within the specified timeout period.
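The polling loop described above can be sketched with two callables in place of the real health check and process handle. The names wait_until_ready_sketch, is_ready, has_exited and interval are hypothetical, and RuntimeError stands in for ServerError.

```python
import time
from typing import Callable


def wait_until_ready_sketch(
    is_ready: Callable[[], bool],
    has_exited: Callable[[], bool],
    timeout: float = 15.0,
    interval: float = 0.2,
) -> None:
    """Poll is_ready() until it succeeds, the process exits, or time runs out."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if has_exited():
            # RuntimeError stands in for ServerError in this sketch.
            raise RuntimeError("server process exited before becoming ready")
        if is_ready():
            return
        time.sleep(interval)
    raise RuntimeError(f"server not ready after {timeout} seconds")
```

Using time.monotonic() rather than time.time() keeps the deadline unaffected by system clock adjustments.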

check(text: str) List[Match]

Checks the given text for language issues using the LanguageTool server.

Parameters:

text (str) – The text to be checked for language issues.

Returns:

A list of Match objects representing the issues found in the text.

Return type:

List[Match]

check_matching_regions(text: str, pattern: str, flags: int = 0) List[Match]

Check only the parts of the text that match a regex pattern. The returned Match objects can be applied to the original text with language_tool_python.utils.correct().

Parameters:
  • text (str) – The full text.

  • pattern (str) – Regular expression defining the regions to check.

  • flags (int) – Regex flags (re.IGNORECASE, re.MULTILINE, etc.). Defaults to 0.

Returns:

A list of Match objects with offsets adjusted to the original text.

Return type:

List[Match]
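The offset adjustment can be illustrated with plain (offset, length) pairs instead of Match objects. In this sketch the region checker is passed in as a callable; check_regions_sketch, Span and check_region are hypothetical names, not part of the library.

```python
import re
from typing import Callable, List, Tuple

Span = Tuple[int, int]  # (offset, length)


def check_regions_sketch(
    text: str,
    pattern: str,
    check_region: Callable[[str], List[Span]],
    flags: int = 0,
) -> List[Span]:
    """Check only the regex-matched regions and shift offsets to the full text."""
    adjusted: List[Span] = []
    for region in re.finditer(pattern, text, flags):
        for offset, length in check_region(region.group(0)):
            # Offsets from the checker are relative to the region start.
            adjusted.append((region.start() + offset, length))
    return adjusted
```

Because the returned offsets refer to the full text, they can be used to slice or correct the original string directly.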

close() None

Closes the server and performs necessary cleanup operations.

This method performs the following actions:

  1. If the server is local and alive, terminates it.

  2. If there are new spellings and they are not set to persist, unregisters them and clears the list of new spellings.

property config: LanguageToolConfig | None

Get the server configuration.

This property is read-only as the configuration is set during initialization and cannot be changed while the server is running.

Returns:

The configuration object if set, otherwise None.

Return type:

Optional[LanguageToolConfig]

correct(text: str) str

Corrects the given text by applying language tool suggestions. Applies only the first suggestion for each issue.

Parameters:

text (str) – The text to be corrected.

Returns:

The corrected text.

Return type:

str

disable_spellchecking() None

Disable spellchecking by updating the disabled categories with spell checking categories.

property disabled_categories: Set[str]

Get the set of disabled rule categories.

Returns:

A set of disabled category names.

Return type:

Set[str]

property disabled_rules: Set[str]

Get the set of disabled rules.

Returns:

A set of disabled rule IDs.

Return type:

Set[str]

enable_spellchecking() None

Enable spellchecking by removing spell checking categories from the disabled categories set. This method updates the disabled_categories attribute by removing any categories that are related to spell checking, which are defined in the _SPELL_CHECKING_CATEGORIES class constant.

property enabled_categories: Set[str]

Get the set of enabled rule categories.

Returns:

A set of enabled category names.

Return type:

Set[str]

property enabled_rules: Set[str]

Get the set of enabled rules.

Returns:

A set of enabled rule IDs.

Return type:

Set[str]

property enabled_rules_only: bool

Get whether only enabled rules should be used.

Returns:

True if using only enabled rules, False otherwise.

Return type:

bool

property host: str

Get the local server host address.

This property is read-only as the host address is determined during initialization and cannot be changed while the server is running.

Returns:

The host address (e.g., ‘127.0.0.1’).

Return type:

str

property is_remote: bool

Get whether using a remote LanguageTool server.

This property is read-only as the remote status is determined during initialization and cannot be changed while the server is running.

Returns:

True if using a remote server, False if using a local server.

Return type:

bool

property language: LanguageTag

Get the language tag associated with the server.

Returns:

The language tag.

Return type:

LanguageTag

property language_tool_download_version: str

Get the LanguageTool version to download.

This property is read-only as the version is determined during initialization and the server cannot be re-downloaded with a different version at runtime.

Returns:

The LanguageTool version string.

Return type:

str

property mother_tongue: LanguageTag | None

Get the mother tongue language tag.

Returns:

The mother tongue language tag if set, otherwise None.

Return type:

Optional[LanguageTag]

property picky: bool

Get whether picky mode is enabled.

Returns:

True if picky mode is enabled, False otherwise.

Return type:

bool

property port: int

Get the local server port number.

This property is read-only as the port number is determined during initialization and cannot be changed while the server is running.

Returns:

The port number (e.g., 8081).

Return type:

int

property preferred_variants: Set[str]

Get the set of preferred language variants.

Returns:

A set of preferred variant codes.

Return type:

Set[str]

property proxies: Dict[str, str] | None

Get the proxies used for server requests.

Returns:

A dictionary of proxies if set, otherwise None.

Return type:

Optional[Dict[str, str]]

property url: str

Get the LanguageTool server URL.

This property is read-only as the URL is determined during initialization and cannot be changed while the server is running.

Returns:

The server URL (e.g., ‘http://localhost:8081/v2/’).

Return type:

str

class language_tool_python.LanguageToolPublicAPI(*args: Any, **kwargs: Any)

Bases: LanguageTool

A class to interact with the public LanguageTool API. This class extends the LanguageTool class and initializes it with the remote server set to the public LanguageTool API endpoint.

Parameters:
  • args (Any) – Positional arguments passed to the parent class initializer.

  • kwargs (Any) – Keyword arguments passed to the parent class initializer.

class language_tool_python.Match(attrib: Dict[str, Any], text: str)

Bases: object

Represents a match object that contains information about a language rule violation.

Parameters:
  • attrib (Dict[str, Any]) –

    A dictionary containing various attributes for the match. The dictionary is expected to have the following keys:

    • rule (Dict[str, Any]): A dictionary with the keys category (itself a dictionary with an id key), id, and issueType.

    • context (Dict[str, Any]): A dictionary with keys offset and text.

    • replacements (List[Dict[str, str]]): A list of dictionaries, each containing a value.

    • length (int): The length of the error.

    • message (str): The message describing the error.

  • text (str) – The original text in which the error occurred (the whole text, not just the context).

Example of a match object received from the LanguageTool API:

{
    'message': 'Possible spelling mistake found.',
    'shortMessage': 'Spelling mistake',
    'replacements': [{'value': 'newt'}, {'value': 'not'}, {'value': 'new', 'shortDescription': 'having just been made'}, {'value': 'news'}, {'value': 'foot', 'shortDescription': 'singular'}, {'value': 'root', 'shortDescription': 'underground organ of a plant'}, {'value': 'boot'}, {'value': 'noon'}, {'value': 'loot', 'shortDescription': 'plunder'}, {'value': 'moot'}, {'value': 'Root'}, {'value': 'soot', 'shortDescription': 'carbon black'}, {'value': 'newts'}, {'value': 'nook'}, {'value': 'Lieut'}, {'value': 'coot'}, {'value': 'hoot'}, {'value': 'toot'}, {'value': 'snoot'}, {'value': 'neut'}, {'value': 'nowt'}, {'value': 'Noor'}, {'value': 'noob'}],
    'offset': 8,
    'length': 4,
    'context': {'text': 'This is noot okay. ', 'offset': 8, 'length': 4},
    'sentence': 'This is noot okay.',
    'type': {'typeName': 'Other'},
    'rule': {'id': 'MORFOLOGIK_RULE_EN_US', 'description': 'Possible spelling mistake', 'issueType': 'misspelling', 'category': {'id': 'TYPOS', 'name': 'Possible Typo'}},
    'ignoreForIncompleteSentence': False,
    'contextForSureMatch': 0
}

FOUR_BYTES_POSITIONS: List[int] | None = None

The positions of 4-byte encoded characters in the text, registered by the previous match object (kept for optimization purposes if the text is the same).

PREVIOUS_MATCHES_TEXT: str | None = None

The text of the previous match object.

category: str

The category of the rule that was violated.

context: str

The context in which the error occurred.

error_length: int

The length of the error.

get_line_and_column(original_text: str) Tuple[int, int]

Returns the line and column number of the error in the context.

Parameters:

original_text (str) – The original text in which the error occurred. It is needed to calculate the line and column number because the context no longer contains newline characters.

Returns:

A tuple containing the line and column number of the error.

Return type:

Tuple[int, int]
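One way to derive a line and column from a character offset is sketched below. It assumes zero-based line and column numbers, which this section does not confirm; line_and_column_sketch is a hypothetical name and takes the offset explicitly instead of reading it from a Match.

```python
from typing import Tuple


def line_and_column_sketch(original_text: str, offset: int) -> Tuple[int, int]:
    """Return the zero-based (line, column) of a character offset."""
    before = original_text[:offset]
    line = before.count("\n")
    # rfind returns -1 when there is no newline, so the column equals the offset.
    column = offset - (before.rfind("\n") + 1)
    return line, column
```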

property matched_text: str

Returns the substring from the context that corresponds to the matched text.

Returns:

The matched text from the context.

Return type:

str
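Using the example match shown earlier, the matched text can be recovered by slicing the context. This is a sketch of the relationship between context, offset_in_context and error_length, written as a hypothetical free function rather than the property itself.

```python
def matched_text_sketch(context: str, offset_in_context: int, error_length: int) -> str:
    """Slice the erroneous span out of the context string."""
    return context[offset_in_context : offset_in_context + error_length]


# Values taken from the example match object above.
print(matched_text_sketch("This is noot okay. ", 8, 4))  # prints "noot"
```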

message: str

The message describing the error.

offset: int

The offset of the error.

offset_in_context: int

The offset of the error in the context.

replacements: List[str]

A list of suggested replacements for the error.

rule_id: str

The ID of the rule that was violated.

rule_issue_type: str

The issue type of the rule that was violated.

select_replacement(index: int) None

Select a single replacement suggestion based on the given index and update the replacements list, leaving only the selected replacement.

Parameters:

index (int) – The index of the replacement to select.

Raises:
  • ValueError – If there are no replacement suggestions.

  • ValueError – If the index is out of the valid range.
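The validation and selection logic can be sketched on a plain list. Unlike the documented method, which updates the match in place and returns None, this hypothetical helper returns a new single-element list; treating negative indices as out of range is also an assumption of the sketch.

```python
from typing import List


def select_replacement_sketch(replacements: List[str], index: int) -> List[str]:
    """Return a list holding only the replacement at the given index."""
    if not replacements:
        raise ValueError("This match has no replacement suggestions.")
    if index < 0 or index >= len(replacements):
        raise ValueError(
            f"Index {index} is outside the valid range 0..{len(replacements) - 1}."
        )
    return [replacements[index]]
```

Narrowing the list this way matters because correction applies only the first replacement of each match.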