Mechanistic understanding and mitigation of language model non-factual hallucinations

Jan 1, 2024 · Lei Yu, Meng Cao, Jackie Chi Kit Cheung, Yue Dong

Type: Journal article
Publication: arXiv preprint arXiv:2403.18167