Hello
I am pleased to invite you to the virtual NLP Hackathon of the
University of Bern this Wednesday and Thursday, 24 and 25 March 2021.
The four exciting challenges submitted so far are listed on this
website and below in this email:
https://www.cnd.philnat.unibe.ch/ueber_uns/aktivitaeten/nlp_hackathon/
The hackathon schedule is as follows:
*Kickoff on Wednesday, 24 March 2021, 9:00 - 10:00*
- Welcome and introduction
- Presentation of the challenges
- Team building
*Presentation of the results on Thursday, 25 March 2021, 15:00 - 16:00*
- Presentation of the results
- Virtual closing beer from 16:00
Meeting on BigBlueButton:
https://bbb.ch-open.ch/b/mat-f4n-qtn
Communication via Slack:
https://nlphackathon.slack.com
Around 25 people have registered so far. If you would also like to
take part, you can register by email to dh(a)wbkolleg.unibe.ch.
Thank you as well for forwarding this message to other interested
people!
We are looking forward to two exciting days of NLP hacking!
Best regards,
Matthias Stürmer
Challenges
The following four challenges have been submitted so far:
1. Forschungsstelle Digitale Nachhaltigkeit Uni Bern: Competitive
Challenge "Classification of Swiss Court Rulings"
<https://www.kaggle.com/c/swiss-german-court-rulings/overview>
Legal language is special in many regards compared to regular
natural language. It is highly structured, rather complicated,
contains its own special terms, and uses certain words differently
than they are used in regular text. Text classification is simple to
define, but it has a myriad of possible applications, and good
systems can provide immense value. Common general applications of
text classification include spam filtering, email priority rating,
and topic classification. In the legal domain, text classification
includes legal judgement prediction (predicting the outcome of a case
from a description of its facts) and legal area prediction. In this
challenge, you will predict the chamber based on the text of a court
decision. The chamber is structured in the form of
{federal level}_{court}_{chamber number} (e.g. SG_KG_002
=> St. Gallen, Kantonsgericht, 002). A minimal baseline sketch
follows after the challenge list.
2. Statistical Office of the Canton of Zurich: Creative Challenge "STATBOT.CH"
<https://www.cnd.philnat.unibe.ch/ueber_uns/aktivitaeten/nlp_hackathon/statbotch> (English
documentation on GitHub
<https://github.com/statistikZH/statbot/tree/main/documentation>)
If you are searching for some form of statistical information, it is
not always easy to find it quickly. Particularly in Switzerland, the
data and information are not only spread vertically over different
federal levels; they are also spread horizontally within these
federal levels over different offices, and sometimes even over
different sites and channels with different formats. Looking for a
needle in a haystack seems comparably easy next to that. Furthermore,
even search engines are only of limited help, as they follow an
indexing logic that excludes information stored in databases or
files. This difficulty in finding facts also poses a risk to
democratic processes: the harder it is for the average citizen to
find truthful information, the easier it is to spread fake news.
Therefore, the Statistical Office of the Canton of Zurich, together
with other organizations, would like to develop a Swiss statistical
bot (STATBOT) that provides data and statistical information directly
and quickly across all organizations.
3. Digital Humanities Uni Bern: Creative Challenge "NER for Historical
Documents" <https://www.kaggle.com/c/ner-turmbucher>
NER solutions have already made significant progress in the past few
years. Nevertheless, applications on sparse language data are still a
challenge, especially when dealing with data from pre-modern times.
In this challenge, we focus on language data from the 16th to the
18th century from the Bernese Turmbücher (legal records kept in the
Tower of Bern, Switzerland). These documents are currently held in
the State Archives of Bern. Language models are not provided. A
simple baseline sketch follows after the challenge list.
4. Digital Humanities Uni Bern: Visualization of Language Models
Language models (e.g. character embeddings) are essential for
success in NLP tasks. Especially for part-of-speech tagging and
named entity recognition, models become more precise when supported
by adequate language models. Since the advent of word2vec and large
transformer-based language models (such as BERT or GPT-3), a variety
of specialized and fine-tuned language models has become available.
Despite their widespread use and their necessity for specific model
training (e.g. for language entities with only sparse data), our
understanding of the models themselves is limited at best. In order
to strengthen our understanding of language models and to start
reflecting on them, this challenge asks for creative ways of
visualizing language models. We envision 3D visualizations based on
dimension reduction to identify the positioning of, e.g., synonyms
and homonyms in vector spaces, or listings of semantic fields
(neighbouring vector values). For context-insensitive approaches
(e.g. word2vec or GloVe), we imagine using the fixed vectors and
representing calculations in grids. A small visualization sketch
follows after the challenge list.
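As a rough idea of a possible starting point for challenge 1 (not part
of the official challenge material), here is a minimal Python sketch of
a TF-IDF plus logistic regression baseline. It assumes the training
data can be loaded as a CSV with a "text" and a "chamber" column; the
file name and column names are assumptions and would need to be
adapted to the actual Kaggle files.

    # Minimal baseline sketch for chamber classification (assumed data layout).
    import pandas as pd
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import f1_score
    from sklearn.model_selection import train_test_split
    from sklearn.pipeline import make_pipeline

    df = pd.read_csv("train.csv")  # assumed file name and columns
    X_train, X_val, y_train, y_val = train_test_split(
        df["text"], df["chamber"], test_size=0.2, random_state=42
    )

    # Character n-grams are fairly robust for German legal text with its
    # long compounds and domain-specific vocabulary.
    model = make_pipeline(
        TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 5), max_features=100000),
        LogisticRegression(max_iter=1000),
    )
    model.fit(X_train, y_train)
    print("macro F1:", f1_score(y_val, model.predict(X_val), average="macro"))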
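For challenge 3, a feature-based CRF is one possible low-resource
baseline when no pretrained language model is available. The sketch
below uses sklearn-crfsuite on BIO-tagged token sequences; the two
training sentences are invented toy data standing in for the real
Turmbücher annotations.

    # CRF baseline sketch for NER on sparse historical data (toy data only).
    import sklearn_crfsuite

    def token_features(sent, i):
        # Surface features: early modern German spelling varies a lot, so
        # lowercased forms, affixes and shape cues are often more useful
        # than exact word identity.
        word = sent[i]
        feats = {
            "lower": word.lower(),
            "prefix3": word[:3],
            "suffix3": word[-3:],
            "is_title": word.istitle(),
            "is_digit": word.isdigit(),
        }
        if i > 0:
            feats["prev_lower"] = sent[i - 1].lower()
        else:
            feats["BOS"] = True
        if i < len(sent) - 1:
            feats["next_lower"] = sent[i + 1].lower()
        else:
            feats["EOS"] = True
        return feats

    # Invented toy sentences with BIO labels (placeholders for real annotations).
    train_sents = [
        (["Hans", "von", "Erlach", "wurde", "verhoert"],
         ["B-PER", "I-PER", "I-PER", "O", "O"]),
        (["Der", "Rat", "zu", "Bern", "urteilte"],
         ["O", "O", "O", "B-LOC", "O"]),
    ]
    X_train = [[token_features(toks, i) for i in range(len(toks))]
               for toks, _ in train_sents]
    y_train = [labels for _, labels in train_sents]

    crf = sklearn_crfsuite.CRF(algorithm="lbfgs", c1=0.1, c2=0.1, max_iterations=100)
    crf.fit(X_train, y_train)
    print(crf.predict(X_train)[0])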
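For challenge 4, the following sketch illustrates one possible take on
the described idea: project fixed (context-insensitive) word vectors
into 3D via dimension reduction and list the nearest neighbours of a
probe word as its "semantic field". The small pretrained GloVe model
loaded via gensim's downloader is only an assumed stand-in for
whatever model a team chooses.

    # Sketch: 3D projection of fixed word vectors plus nearest neighbours.
    import gensim.downloader as api
    import matplotlib.pyplot as plt
    from sklearn.decomposition import PCA

    kv = api.load("glove-wiki-gigaword-50")   # small pretrained GloVe vectors
    words = ["court", "judge", "law", "river", "bank", "money"]
    coords = PCA(n_components=3).fit_transform([kv[w] for w in words])

    fig = plt.figure()
    ax = fig.add_subplot(projection="3d")
    ax.scatter(coords[:, 0], coords[:, 1], coords[:, 2])
    for (x, y, z), word in zip(coords, words):
        ax.text(x, y, z, word)                # label each projected vector

    # "Semantic field" of a probe word: its nearest neighbours in vector space.
    print(kv.most_similar("court", topn=5))
    plt.show()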
__________________________________
Universität Bern
Institut für Informatik
Forschungsstelle Digitale Nachhaltigkeit
PD Dr. Matthias Stürmer
Head of the Forschungsstelle Digitale Nachhaltigkeit,
Lecturer in Digital Transformation at INF and
Lecturer in Digital Sustainability at IWI
Office 204 (2nd floor)
Schützenmattstrasse 14
CH-3012 Bern
Phone +41 31 631 38 09 (direct)
Phone +41 31 631 47 71 (secretariat)
Mobile +41 76 368 81 65
matthias.stuermer(a)inf.unibe.ch
www.digitale-nachhaltigkeit.unibe.ch