===== Introduction =====

==== Project context ====

Work Package 7 – Regexta aims to provide a digital tool for the automatic extraction of regesta from historical documents using Large Language Models (LLMs). The application allows users to upload or manually enter texts, predominantly in Latin, from documentary sources in the historical-ecclesiastical domain (for example papal bulls or official documents), and to obtain as output a structured representation of the main information contained in the document.

The core operation of the application is therefore Extract Regestum, which analyzes the provided text and summarizes it according to a predefined information schema.

The system supports:

  * manual text entry;
  * upload of document files;
  * selection of the document type;
  * automatic generation of the regestum;
  * export of results;
  * saving within the platform (for authenticated users).

The application is accessible both in public mode and through authentication via the federated D4Science system (Keycloak), with different functionality depending on the access level.

----

==== Definition of Regestum and Documentary Scope ====

The regestum is a structured summary of a historical document that highlights its essential information. In Regexta, the regestum is generated automatically from the text of the document and includes fields such as:

  * Sender
  * Receiver
  * Place of emanation
  * Date of emanation
  * additional elements extracted according to the selected document type

The documentary scope mainly covers historical-ecclesiastical documentation, with particular attention to Latin sources such as papal bulls, official acts, and other institutional documents.

Selecting the Document Type is a mandatory step in the process and determines how the underlying language model generates the structured output.
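
To make the idea of a structured regestum more concrete, the fields listed above can be pictured as a small record keyed by a mandatory document type. The following is only an illustrative sketch and does not describe the actual Regexta implementation: the class name ''Regestum'', the field names, the ''DOCUMENT_TYPES'' values, and the JSON export are all assumptions made for this example.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Dict, Optional

# Hypothetical document types, for illustration only; the real list
# offered by Regexta is not specified in this section.
DOCUMENT_TYPES = {"papal_bull", "official_act"}


@dataclass
class Regestum:
    """Illustrative container for the fields named in this section."""

    document_type: str                        # mandatory selection
    sender: Optional[str] = None
    receiver: Optional[str] = None
    place_of_emanation: Optional[str] = None
    date_of_emanation: Optional[str] = None
    # Additional elements that depend on the selected document type.
    extra: Dict[str, str] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # Reject records whose document type is not one of the known values.
        if self.document_type not in DOCUMENT_TYPES:
            raise ValueError(f"unknown document type: {self.document_type!r}")


def regestum_to_json(regestum: Regestum) -> str:
    """Serialize a regestum to JSON, keeping non-ASCII characters readable."""
    return json.dumps(asdict(regestum), ensure_ascii=False, indent=2)


if __name__ == "__main__":
    # Placeholder values, not taken from any real document.
    example = Regestum(
        document_type="papal_bull",
        sender="(sender)",
        receiver="(receiver)",
    )
    print(regestum_to_json(example))
```

In this sketch, fields that could not be extracted stay empty (''None'') rather than being guessed, and an unrecognized document type is rejected up front, mirroring the mandatory selection step described above.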