Institute of Computational Perception

Natural Language Processing

Student Projects in Natural Language Processing

Contacts (if not stated otherwise): Markus Schedl, Oleg Lesota, Deepak Kumar, Shahed Masoudian

Here is a list of possible topics applicable to various courses and projects, such as seminars, practical works, and theses. Feel free to contact us with concrete questions or regarding any other possible topic!

Topics:

  • Fundamentals of Large Language Models (LLMs)
    • Novel architectures, training, and advancements of LLMs
    • Interpretation and analysis of LLMs: encoded linguistic knowledge, world models, limitations
  • Fair NLP
    • Bias mitigation methods in NLP models
    • Representation disentanglement for fairness
    • Measuring societal biases in LLMs
    • Measuring societal biases in downstream tasks such as IR and job prediction 
  • Parameter-efficient training
    • Subnetworks in large deep learning models
    • Adapter-based training
  • Low-resource training
    • Few-shot learning in NLP
    • Multi-lingual or multi-domain knowledge transfer  
  • Applications and domains
    • Classification of LLM- vs. human-generated text
    • Using emotional and contextual encoders for depression detection from textual data
    • Analysis of gender and country bias in research papers
    • Information Retrieval models
    • Hate speech detection
    • Health and medical NLP
    • NLP for humanitarian response
    • NLP for climate change
    • Factual Question Answering models
    • Information extraction 
    • Summarization
  • Generative Language Models and Retrieval Augmented Generation (RAG)
    • Multi-modal RAG: intelligent modality switching/selection in the retrieval stage of the RAG pipeline for multi-modal question answering
    • Interpretability/explainability in RAG
    • Ante-hoc controllable text summarization with multiple controllable attributes
    • LLM sensitization towards factual/faithful text generation
    • Multi-modal fact verification in the healthcare/medical domain
    • Multi-document scientific summarization
    • Context-driven dynamic decoding in open-ended text generation
    • RL policy-shaping for adequate beam size/sampling
    • Mixture-of-experts-based feature importance/explainability in dense retrieval
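To give a flavor of the retrieval-augmented generation topics above, the retrieve-then-prompt idea can be sketched in a few lines of Python. This is a toy illustration, not a project template: the bag-of-words `embed` function is a stand-in for a trained dense encoder, and the corpus and prompt template are invented for the example.

```python
import math
import re
from collections import Counter

def embed(text):
    # Toy bag-of-words "embedding"; a real RAG system would use a
    # trained dense encoder here (this stand-in is purely illustrative).
    return Counter(re.findall(r"\w+", text.lower()))

def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, corpus, k=2):
    # Rank documents by similarity to the query and keep the top k.
    q = embed(query)
    return sorted(corpus, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

def build_prompt(query, corpus, k=2):
    # Augment the generator's prompt with the retrieved context;
    # an LLM would then answer conditioned on this grounded prompt.
    context = "\n".join(retrieve(query, corpus, k))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

corpus = [
    "Adapters insert small trainable layers into a frozen model.",
    "Retrieval augmented generation grounds answers in retrieved documents.",
    "Beam search keeps the k most probable partial sequences.",
]
print(build_prompt("What is retrieval augmented generation?", corpus, k=1))
```

Projects in this area typically replace each of these pieces: the encoder (dense retrieval), the ranking (multi-modal selection), or the prompting/decoding stage (controllable generation).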