Monant: Universal and Extensible Platform for Monitoring Online Environment
To monitor online content, and in particular to tackle false information, KInIT develops an in-house data collection platform called Monant.
The primary aim of the Monant platform is to be a universal and easily extensible software infrastructure, which provides:
- highly reusable data providers – a diverse set of crawlers, parsers and API adapters to gather data from online sources,
- support for continuous data collection through scalable orchestration and scheduling of data providers,
- data access to stakeholders via the REST API.
The Monant platform can monitor various online sources, such as news sites and blogs, fact-checking portals, or social media. The underlying data providers are implemented as a Python library and are orchestrated by Apache Airflow, an open-source workflow management platform. All data are stored in a central storage that provides a unified schema for online content regardless of the original source. Finally, the Monant platform provides a REST API for accessing the collected data and for storing the results of automated analyses (such as machine-learning-based predictions).
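To make the idea of reusable data providers feeding a unified schema more concrete, here is a minimal Python sketch. It is not Monant's actual code or API: the names `ContentItem`, `DataProvider`, `NewsProvider`, `SocialProvider` and `collect`, as well as all field names and raw record formats, are hypothetical and chosen only for illustration. Canned in-memory records stand in for real crawling, and a plain function stands in for the scheduler (in Monant, that role is played by Apache Airflow).

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Iterable, Iterator, List


@dataclass(frozen=True)
class ContentItem:
    """Hypothetical unified record; the field names are illustrative only."""
    source_type: str        # e.g. "news" or "social"
    url: str
    title: str
    body: str
    published_at: datetime


class DataProvider(ABC):
    """Hypothetical base class for a crawler, parser, or API adapter."""

    @abstractmethod
    def fetch(self) -> Iterable[dict]:
        """Return raw records in the source's own format."""

    @abstractmethod
    def normalize(self, raw: dict) -> ContentItem:
        """Map one raw record onto the unified schema."""

    def run(self) -> Iterator[ContentItem]:
        # Each provider yields records that already follow the shared schema.
        for raw in self.fetch():
            yield self.normalize(raw)


class NewsProvider(DataProvider):
    def __init__(self, raw_articles: List[dict]):
        self._raw = raw_articles  # canned data stands in for a real crawl

    def fetch(self) -> Iterable[dict]:
        return self._raw

    def normalize(self, raw: dict) -> ContentItem:
        return ContentItem(
            source_type="news",
            url=raw["link"],
            title=raw["headline"],
            body=raw["text"],
            published_at=datetime.fromisoformat(raw["date"]),
        )


class SocialProvider(DataProvider):
    def __init__(self, raw_posts: List[dict]):
        self._raw = raw_posts  # canned data stands in for a real API call

    def fetch(self) -> Iterable[dict]:
        return self._raw

    def normalize(self, raw: dict) -> ContentItem:
        return ContentItem(
            source_type="social",
            url=raw["permalink"],
            title="",  # assumed: social posts carry no title
            body=raw["message"],
            published_at=datetime.fromtimestamp(raw["created"], tz=timezone.utc),
        )


def collect(providers: Iterable[DataProvider]) -> List[ContentItem]:
    """Run every provider once; a scheduler would trigger this periodically."""
    items: List[ContentItem] = []
    for provider in providers:
        items.extend(provider.run())
    return items


if __name__ == "__main__":
    news = NewsProvider([{
        "link": "https://example.org/a1",
        "headline": "Example headline",
        "text": "Example article text.",
        "date": "2023-05-01T12:00:00+00:00",
    }])
    social = SocialProvider([{
        "permalink": "https://example.org/p/1",
        "message": "Example post.",
        "created": 0,
    }])
    for item in collect([news, social]):
        print(item.source_type, item.url, item.published_at.isoformat())
```

The design choice illustrated here is that source-specific details stay inside each provider's `normalize` method, so anything downstream (storage, an API layer, analyses) only ever sees one record shape. Adding a new source then means adding one provider class rather than touching the rest of the pipeline.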
The main purpose of the platform is to give researchers a means to collect and annotate, in real time, as much data as possible for the characterization, detection and mitigation of false information, and thus to support scenarios for combating false information (e.g. fact-checking).