machinelearningZH/simply-simplify-language_api

Simply Simplify Language API

Simply Simplify German Language -- API Version

Features

This is a simplified API version of our Language Simplification Tool.

The API is built with FastAPI and simplifies German text through an LLM provider. It is intended for programmatic integration with other services.

Installation

Requirements:

  • Python 3.12+
  • uv for package and environment management

Setup Project

  1. Clone this repository and change into the project directory
  2. Create a .env file for secrets:
OPENROUTER_API_KEY=sk-or-v1-...
API_AUTH_TOKEN=replace-with-a-long-random-token
  3. Adjust operator-tunable settings in config.yaml:
model:
  name: google/gemini-3-flash-preview
  provider_base_url: https://openrouter.ai/api/v1
  allowed_models:
    - google/gemini-3-flash-preview
  max_tokens: 8096
  max_chars_input: 100000
  timeout_seconds: 60
  max_retries: 2

cors:
  allowed_origins:
    - https://your-client.example
  allowed_methods:
    - POST
  allowed_headers:
    - Authorization
    - Content-Type

site:
  url: https://your-site.com
  name: Your App Name

Environment variables with matching names, such as MODEL_NAME, MAX_TOKENS, and CORS_ALLOWED_ORIGINS, can override config.yaml values for deployments. Set CONFIG_PATH to load a different YAML file.

  4. Install dependencies using uv:
uv sync

uv run automatically uses the project environment; manual activation is not required.
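
The override rule for config.yaml described above can be pictured with a small sketch. This is not the project's loader: the function name apply_env_overrides, the ENV_TO_PATH mapping, and the comma-separated format for CORS_ALLOWED_ORIGINS are assumptions made for illustration. Only the three variable names come from this README.

```python
import copy

# Hypothetical mapping from environment variable names to config.yaml paths.
ENV_TO_PATH = {
    "MODEL_NAME": ("model", "name"),
    "MAX_TOKENS": ("model", "max_tokens"),
    "CORS_ALLOWED_ORIGINS": ("cors", "allowed_origins"),
}


def apply_env_overrides(config: dict, env: dict) -> dict:
    """Return a copy of config in which values present in env take precedence."""
    merged = copy.deepcopy(config)
    for var, (section, key) in ENV_TO_PATH.items():
        raw = env.get(var)
        if raw is None:
            continue  # variable not set: keep the YAML value
        current = merged.setdefault(section, {}).get(key)
        if isinstance(current, list):
            # Assumption: list values arrive as comma-separated strings.
            value = [item.strip() for item in raw.split(",") if item.strip()]
        elif isinstance(current, int):
            value = int(raw)
        else:
            value = raw
        merged[section][key] = value
    return merged


# A dict standing in for the parsed config.yaml.
config = {
    "model": {"name": "google/gemini-3-flash-preview", "max_tokens": 8096},
    "cors": {"allowed_origins": ["https://your-client.example"]},
}
env = {
    "MAX_TOKENS": "4096",
    "CORS_ALLOWED_ORIGINS": "https://a.example, https://b.example",
}
merged = apply_env_overrides(config, env)
print(merged["model"]["max_tokens"])
print(merged["cors"]["allowed_origins"])
```

In a real deployment the second argument would be os.environ rather than a hand-written dict.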

Start the FastAPI server

uv run uvicorn fastapi_app:app --reload

Running Tests

Run the automated test suite locally:

uv run pytest -v

Testing the API Manually

Send requests with the bearer token configured in API_AUTH_TOKEN:

curl -X POST http://127.0.0.1:8000/ \
  -H "Authorization: Bearer $API_AUTH_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"data":[{"text":"Als Vernehmlassungsverfahren wird diejenige Phase bezeichnet."}]}'
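
The same request can be built from Python with the standard library. This is a minimal sketch: the helper name build_request is hypothetical, and the base URL assumes the local uvicorn server started above.

```python
import json
import os
import urllib.request


def build_request(base_url, token, texts, leichte_sprache=False, model=None):
    """Build a POST request matching the curl call above."""
    payload = {
        "data": [{"text": text} for text in texts],
        "leichte_sprache": leichte_sprache,
    }
    if model is not None:
        # Only sent when set; the server checks it against model.allowed_models.
        payload["model"] = model
    return urllib.request.Request(
        base_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_request(
    "http://127.0.0.1:8000/",
    os.environ.get("API_AUTH_TOKEN", "replace-with-a-long-random-token"),
    ["Als Vernehmlassungsverfahren wird diejenige Phase bezeichnet."],
)

# To send it against a running server:
# with urllib.request.urlopen(request, timeout=60) as response:
#     print(json.load(response))
```

The network call is left commented out so the snippet runs without a server.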

API Reference

Endpoint

POST /

Simplifies German text based on the provided payload.

Request Body

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| data | array[object] | Yes | Array of text objects to simplify. Each object must have a text field. |
| leichte_sprache | boolean | No | If true, simplifies the text into Leichte Sprache (easy-to-read German). Default: false |
| model | string | No | LLM model to use. The model must be listed in config.yaml under model.allowed_models. |

Example Request

{
  "data": [
    {
      "text": "Als Vernehmlassungsverfahren wird diejenige Phase innerhalb des Vorverfahrens der Gesetzgebung bezeichnet, in der Vorhaben des Bundes von erheblicher politischer, finanzieller, wirtschaftlicher, ökologischer, sozialer oder kultureller Tragweite auf ihre sachliche Richtigkeit, Vollzugstauglichkeit und Akzeptanz hin geprüft werden. "
    },
    {
      "text": "<p>Die Vorlage wird zu diesem <strong>Zweck</strong> den Kantonen, den in der Bundesversammlung vertretenen Parteien, den Dachverbänden der Gemeinden, Städte und der Berggebiete, den Dachverbänden der Wirtschaft sowie weiteren, im Einzelfall interessierten Kreisen unterbreitet.</p>"
    }
  ]
}

Example Response

{
  "simplifications": [
    {
      "text": "Das Vernehmlassungsverfahren ist ein Teil der Gesetzgebung. In diesem Teil prüft der Bund wichtige Vorhaben. Der Bund prüft, ob die Vorhaben richtig, durchführbar und akzeptiert sind."
    },
    {
      "text": "Der Bund legt den Vorschlag den Kantonen vor. Auch Parteien im Parlament sehen den Vorschlag. Verbände der Gemeinden, Städte und Berggebiete bekommen den Vorschlag. Wirtschaftsverbände und andere interessierte Gruppen sehen den Vorschlag auch."
    }
  ]
}
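
A client will usually want each simplified text next to its original. The sketch below assumes the response keeps the order and length of the request's data array, as the example above suggests. The README does not state this guarantee, so the helper (a hypothetical name) checks the length explicitly. The sample strings are shortened stand-ins for the texts above.

```python
def pair_simplifications(request_body: dict, response_body: dict) -> list[tuple[str, str]]:
    """Pair each input text with the simplification at the same position."""
    originals = [item["text"] for item in request_body["data"]]
    simplified = [item["text"] for item in response_body["simplifications"]]
    if len(originals) != len(simplified):
        # Assumption violated: do not guess which output belongs to which input.
        raise ValueError("response length does not match request length")
    return list(zip(originals, simplified))


request_body = {"data": [{"text": "Die Vorlage wird den Kantonen unterbreitet."}]}
response_body = {
    "simplifications": [{"text": "Der Bund legt den Vorschlag den Kantonen vor."}]
}
pairs = pair_simplifications(request_body, response_body)
for original, simple in pairs:
    print(original, "->", simple)
```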

Response Codes

  • 200 OK: Successfully simplified the input data
  • 400 Bad Request: Requested model is not allowed
  • 401 Unauthorized: Bearer token is missing or invalid
  • 413 Payload Too Large: Input text exceeds model.max_chars_input from config.yaml
  • 422 Unprocessable Content: Required fields are missing or the payload does not match the request schema
  • 502 Bad Gateway: The model provider request failed or returned an invalid response
  • 500 Internal Server Error: An internal error occurred during processing
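
On the client side, the codes above can be turned into readable messages and a retry decision. This is a sketch: the names STATUS_MEANINGS, describe_status, and should_retry are hypothetical, and retrying only on 502 is a client-side assumption, not documented behaviour of the API.

```python
# Meanings taken from the response code list above.
STATUS_MEANINGS = {
    200: "Successfully simplified the input data",
    400: "Requested model is not allowed",
    401: "Bearer token is missing or invalid",
    413: "Input text exceeds model.max_chars_input",
    422: "Payload does not match the request schema",
    500: "Internal error during processing",
    502: "Model provider request failed or returned an invalid response",
}

# Assumption: only upstream provider failures are worth retrying.
RETRYABLE = {502}


def describe_status(code: int) -> str:
    """Return a short explanation for a status code returned by the API."""
    return STATUS_MEANINGS.get(code, "Unexpected status code")


def should_retry(code: int) -> bool:
    """Return True if the same request may succeed when sent again."""
    return code in RETRYABLE


for code in (200, 401, 502):
    print(code, describe_status(code), "retry" if should_retry(code) else "no retry")
```

Client errors such as 400, 401, 413, and 422 need a changed request, so repeating them unchanged would not help.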

Notes

  • The endpoint requires an Authorization: Bearer ... header that matches API_AUTH_TOKEN
  • The data field must be a non-empty array of objects, each with a non-empty text field
  • HTML tags in the input text are preserved in the output
  • The leichte_sprache option uses specific prompts to generate text that follows Leichte Sprache guidelines for easier comprehension
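
The payload rules in these notes can be checked before a request is sent, which avoids a round trip that would fail validation. A minimal sketch, assuming whitespace-only text counts as empty (the README does not say). The function name validate_payload is hypothetical and the check is not the server's own validation.

```python
def validate_payload(payload: dict) -> list[str]:
    """Return a list of problems; an empty list means the payload looks valid."""
    data = payload.get("data")
    if not isinstance(data, list) or not data:
        return ["data must be a non-empty array"]
    problems = []
    for index, item in enumerate(data):
        text = item.get("text") if isinstance(item, dict) else None
        # Assumption: whitespace-only text is treated as empty.
        if not isinstance(text, str) or not text.strip():
            problems.append(f"data[{index}].text must be a non-empty string")
    return problems


print(validate_payload({"data": [{"text": "<p>Die Vorlage wird unterbreitet.</p>"}]}))
print(validate_payload({"data": [{"text": " "}, {"text": "ok"}]}))
print(validate_payload({"data": []}))
```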

Project Team

Chantal Amrhein, Patrick Arnecke (Statistisches Amt Zürich: Team Data)

Feedback and Contributing

Feedback and contributions are welcome. Email us or open an issue or pull request.

Use ruff for linting and formatting:

uv run ruff format .
uv run ruff check .

License

This project is licensed under the MIT License. See the LICENSE file for details.

Disclaimer

This software (the Software) uses a configurable external language model provider and has been developed according to and with the intent to be used under Swiss law. Please be aware that the EU Artificial Intelligence Act (EU AI Act) may, under certain circumstances, be applicable to your use of the Software. You are solely responsible for ensuring that your use of the Software and the configured language model complies with all applicable local, national, and international laws and regulations. By using this Software, you acknowledge and agree that it is your responsibility to assess which laws and regulations apply to your intended use and to comply with them. You also agree to hold us harmless from any action, claim, liability, or loss related to your use of the Software.
