Zorik Gekhman

I am a computer science PhD student at the Technion, working with Roi Reichart. I am also in my second year-long research internship at Google Research, where I work with Jonathan Herzig and Roee Aharoni.

I work on the robustness and interpretability of large language models, with a particular focus on factuality and hallucinations. My research develops rigorous methods to evaluate robustness, explores the underlying causes of robustness issues, and proposes strategies to mitigate them.

Past

I worked as a full-time research engineer at Google Health Research for four years, focusing on medical translation and transcription solutions.

I worked as a full-time backend engineer at Microsoft Cyber Defense for three years, building high-scale cloud infrastructure in production.

I hold a BSc in computer science from the Technion (summa cum laude).

zorikgekhman@gmail.com


Selected Publications

Does Fine-Tuning LLMs on New Knowledge Encourage Hallucinations?

Zorik Gekhman, Gal Yona, Roee Aharoni, Matan Eyal, Amir Feder, Roi Reichart, Jonathan Herzig

EMNLP 2024

We study the impact of introducing new factual knowledge during fine-tuning on an LLM's ability to utilize the pre-existing knowledge it acquired during pre-training. We show that LLMs struggle to learn new knowledge through fine-tuning, but that once they eventually do, their tendency to hallucinate with respect to their pre-existing knowledge increases linearly. Our findings highlight the potential for unintended consequences when introducing new knowledge through fine-tuning, and suggest that fine-tuning may be more useful as a mechanism to enhance the utilization of pre-existing knowledge.

Paper Tweet Cite
2024

TrueTeacher: Learning Factual Consistency Evaluation with Large Language Models

Zorik Gekhman, Jonathan Herzig, Roee Aharoni, Chen Elkind, Idan Szpektor

EMNLP 2023

Training models to evaluate factual consistency in summarization requires labeled data, i.e., document-summary pairs annotated for consistency, which is unavailable in the summarization domain. As a result, existing models heavily rely on synthetic data, where negative (inconsistent) examples are created by perturbing human-written summaries to inject errors. However, this approach results in (1) limited coverage of possible errors, as the perturbation logic is usually simple, and (2) an unnatural data distribution, as perturbed summaries differ from model-generated ones. To address these limitations, we introduce TrueTeacher, a method for generating synthetic data by annotating diverse model-generated summaries using a large language model (LLM). Unlike prior work, TrueTeacher does not rely on human-written summaries and is inherently multilingual, and the resulting summaries contain more realistic errors and follow a natural text distribution. We release a large-scale synthetic dataset (1.4M examples) generated using TrueTeacher, along with a model checkpoint trained on this data.

Paper Tweet Model Data Cite
2023

On the Robustness of Dialogue History Representation in Conversational Question Answering: A Comprehensive Study and a New Prompt-based Method

Zorik Gekhman*, Nadav Oved*, Orgad Keller, Idan Szpektor, Roi Reichart

TACL 2023

Most works on modeling the conversation history in Conversational Question Answering (CQA) report a single main result on a common CQA benchmark. While existing models show impressive results on CQA leaderboards, it remains unclear whether they are robust to shifts in setting (sometimes to more realistic ones), training data size (e.g., from large to small sets), and domain. In this work, we design and conduct the first large-scale robustness study of history modeling approaches for CQA. We find that high benchmark scores do not necessarily translate to strong robustness, and that various methods can perform very differently across settings. Equipped with the insights from our study, we design a novel prompt-based history modeling approach and demonstrate its strong robustness across various settings.

Paper Tweet Code Cite
2023

All Publications

Does Fine-Tuning LLMs on New Knowledge Encourage Hallucinations?

Zorik Gekhman, Gal Yona, Roee Aharoni, Matan Eyal, Amir Feder, Roi Reichart, Jonathan Herzig

EMNLP 2024
Paper Tweet Cite
2024

LLMs Know More Than They Show: On the Intrinsic Representation of LLM Hallucinations

Hadas Orgad, Michael Toker, Zorik Gekhman, Roi Reichart, Idan Szpektor, Hadas Kotek, Yonatan Belinkov

Preprint
Paper Cite
2024

NL-Eye: Abductive NLI for Images

Mor Ventura, Michael Toker, Nitay Calderon, Zorik Gekhman, Yonatan Bitton, Roi Reichart

Preprint
Paper Cite
2024

Measuring the Robustness of NLP Models to Domain Shifts

Nitay Calderon*, Naveh Porat*, Eyal Ben-David, Alexander Chapanin, Zorik Gekhman, Nadav Oved, Vitaly Shalumov, Roi Reichart

Findings of EMNLP 2024
Paper Cite
2024

Can LLMs Learn Macroeconomic Narratives from Social Media?

Almog Gueta, Amir Feder, Zorik Gekhman, Ariel Goldstein, Roi Reichart

Preprint
Paper Cite
2024

TrueTeacher: Learning Factual Consistency Evaluation with Large Language Models

Zorik Gekhman, Jonathan Herzig, Roee Aharoni, Chen Elkind, Idan Szpektor

EMNLP 2023
Paper Tweet Model Data Cite
2023

On the Robustness of Dialogue History Representation in Conversational Question Answering: A Comprehensive Study and a New Prompt-based Method

Zorik Gekhman*, Nadav Oved*, Orgad Keller, Idan Szpektor, Roi Reichart

TACL 2023
Paper Tweet Code Cite
2023

Using Text Injection to Improve Recognition of Personal Identifiers in Speech

Yochai Blau, Rohan Agrawal, Lior Madmony, Gary Wang, Andrew Rosenberg, Zhehuai Chen, Zorik Gekhman, Genady Beryozkin, Parisa Haghani, Bhuvana Ramabhadran

INTERSPEECH 2023
Paper Cite
2023

RED-ACE: Robust Error Detection for ASR using Confidence Embeddings

Zorik Gekhman*, Dina Zverinski*, Jonathan Mallinson, Genady Beryozkin

EMNLP 2022
Paper Tweet Code Cite
2022

KoBE: Knowledge-Based Machine Translation Evaluation

Zorik Gekhman, Roee Aharoni, Genady Beryozkin, Markus Freitag, Wolfgang Macherey

Findings of EMNLP 2020
Paper Code and Data Cite
2020