Update gkd_loss.py teacher_prompt supports two formats: by yhy19 · Pull Request #9562 · modelscope/ms-swift

yhy19 · 2026-06-15T07:48:04Z

teacher_prompt supports two formats:
1. str: Used only as the user content for the teacher, replacing the last user message.
2. list[dict]: A complete list of messages (including system/user/assistant),
which directly replaces the teacher's messages (retaining the assistant from the original data as the response).

PR type

Bug Fix
New Feature
Document Updates
More Models or Datasets Support

PR information

Write the detail information belongs to this PR.

Experiment results

Paste your experiment result here(if needed).

teacher_prompt supports two formats: 1. str: Used only as the user content for the teacher, replacing the last user message. 2. list[dict]: A complete list of messages (including system/user/assistant), which directly replaces the teacher's messages (retaining the assistant from the original data as the response).

gemini-code-assist

Code Review

This pull request updates build_opsd_teacher_data in swift/rlhf_trainers/gkd_loss.py to support teacher_prompt as either a string or a list of message dictionaries. A review comment points out an inefficiency where an assistant message is fetched from the original messages and then immediately popped when strip_assistant is True. A code suggestion was provided to optimize this logic by only fetching the fallback assistant message when strip_assistant is False.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

gemini-code-assist · 2026-06-15T07:48:57Z

+            # Fallback: If teacher_prompt does not end with an assistant message, fetch it from the original messages
+            if (not teacher_messages) or teacher_messages[-1].get('role') != 'assistant':
+                orig = data.get('messages', [])
+                for m in reversed(orig):
+                    if m.get('role') == 'assistant':
+                        teacher_messages.append(dict(m))
+                        break
+            if strip_assistant and teacher_messages and teacher_messages[-1].get('role') == 'assistant':
+                teacher_messages.pop()


When strip_assistant is True and teacher_prompt (as a list) does not end with an assistant message, the code currently fetches the assistant message from the original messages, appends it, and then immediately pops it. This is redundant and inefficient. We can optimize this by only running the fallback logic when strip_assistant is False.

Suggested change

# Fallback: If teacher_prompt does not end with an assistant message, fetch it from the original messages

if (not teacher_messages) or teacher_messages[-1].get('role') != 'assistant':

orig = data.get('messages', [])

for m in reversed(orig):

if m.get('role') == 'assistant':

teacher_messages.append(dict(m))

break

if strip_assistant and teacher_messages and teacher_messages[-1].get('role') == 'assistant':

teacher_messages.pop()

if strip_assistant:

if teacher_messages and teacher_messages[-1].get('role') == 'assistant':

teacher_messages.pop()

else:

# Fallback: If teacher_prompt does not end with an assistant message, fetch it from the original messages

if (not teacher_messages) or teacher_messages[-1].get('role') != 'assistant':

orig = data.get('messages', [])

for m in reversed(orig):

if m.get('role') == 'assistant':

teacher_messages.append(dict(m))

break

hjh0119 · 2026-06-15T07:51:59Z

If replacing the full messages, would using the teacher_messages key be more appropriate?

yhy19 · 2026-06-15T08:42:11Z

If replacing the full messages, would using the teacher_messages key be more appropriate?

That makes perfect sense, I'll update it right away.

gemini-code-assist Bot reviewed Jun 15, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Update gkd_loss.py teacher_prompt supports two formats:#9562

Update gkd_loss.py teacher_prompt supports two formats:#9562
yhy19 wants to merge 1 commit into
modelscope:mainfrom
yhy19:patch-1

yhy19 commented Jun 15, 2026

Uh oh!

gemini-code-assist Bot left a comment

Uh oh!

gemini-code-assist Bot Jun 15, 2026

Uh oh!

hjh0119 commented Jun 15, 2026

Uh oh!

yhy19 commented Jun 15, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

yhy19 commented Jun 15, 2026

PR type

PR information

Experiment results

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist Bot Jun 15, 2026

Choose a reason for hiding this comment

Uh oh!

hjh0119 commented Jun 15, 2026

Uh oh!

yhy19 commented Jun 15, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants