LlmSpeechSummarization with AllAudioTracksJob handler#421
Conversation
…g BECAUSE it tries to talk to a vllm container TODO: parameterize the URL
…ither needed at build OR plumbing
… I have tested their functionality
…into feat/ff-all-audio-tracks
…enizers during docker build
…into feat/ff-all-audio-tracks
jrobble
left a comment
There was a problem hiding this comment.
@jrobble reviewed 22 files and all commit messages, and made 11 comments.
Reviewable status: all files reviewed, 10 unresolved discussions (waiting on eric-mccann-pro).
python/LlmSpeechSummarization/README.md line 3 at r2 (raw file):
# Overview The OpenMPF LLM Speech Summarization Component summarizes FeedForward video and audio tracks for speech detections.
Change to "feed-forward".
python/LlmSpeechSummarization/llm_speech_summarization_component/llm_speech_summarization_component.py line 197 at r2 (raw file):
@staticmethod def _get_track_for_classifier(t, job: mpf.VideoJob|mpf.AudioJob, classifier_name, classifier, arg_factory):
t is not descriptive enough. Please change to track_cls. Same for the parameter in _get_classifier_track().
python/LlmSpeechSummarization/llm_speech_summarization_component/llm_speech_summarization_component.py line 281 at r2 (raw file):
return results def _process_feed_forward_job(self, job, config, make_track, make_main_track):
Call make_track make_classifier_track for clarity.
Call make_main_track make_summary_track for clarity.
python/LlmSpeechSummarization/llm_speech_summarization_component/llm_speech_summarization_component.py line 316 at r2 (raw file):
) main_detection_properties = {'TEXT': final_summary.summary}
Call this summary_detection_properties.
python/LlmSpeechSummarization/llm_speech_summarization_component/llm_speech_summarization_component.py line 333 at r2 (raw file):
make_track, filter( lambda c: c[1].confidence >= classifier_confidence_minimum,
classifier is more descriptive than c.
python/LlmSpeechSummarization/llm_speech_summarization_component/llm_speech_summarization_component.py line 341 at r2 (raw file):
return results def run_component_test(clientFactory = None,
In general, we try to keep the test code separate from the component code. Could this function be moved to test_llm_speech_summarization_component.py?
python/LlmSpeechSummarization/llm_speech_summarization_component/llm_speech_summarization_component.py line 367 at r2 (raw file):
if __name__ == '__main__':
Our other Python components don't have a main. Is this needed?
python/LlmSpeechSummarization/llm_speech_summarization_component/tests/test_llm_speech_summarization_component.py line 33 at r2 (raw file):
from llm_speech_summarization_component.llm_speech_summarization_component import run_component_test, _log_exception, logger logger.setLevel(logging.DEBUG)
logger isn't used in this file. Please remove it.
python/LlmSpeechSummarization/llm_speech_summarization_component/tests/test_llm_speech_summarization_component.py line 535 at r2 (raw file):
main_detection = result[0] classifier_detection = result[1] assert main_detection.detection_properties['TEXT'] == "The conversation is a multifaceted discussion centered on Major League Baseball, primarily revolving around the publication and content of a memoir titled 'Reminiscences of an Old Timer' by former player John (Dasher) Troy. The memoir serves as both a historical reflection on early professional baseball and a practical guide for aspiring players, emphasizing foundational skills, strategic decision-making, and the mental and physical demands of the game. Key themes include player positioning, batting and pitching techniques, base running, fielding mechanics, and the importance of experience, observation, and self-awareness. The discussion also highlights the legacy of early baseball players and teams, the evolution of the sport, and the enduring significance of traditional principles such as proper footwork and timing. While several fragments reference real estate, business operations, and promotional content in New York City—including venues in Harlem, Chelsea, and Manhattan—these appear to be incidental or transcribed artifacts and do not form a coherent narrative. The overwhelming focus remains on professional baseball gameplay, rules, player health, team discipline, and historical context, with consistent references to specific teams, players, stadiums, and equipment. The conversation reflects a deep engagement with the sport’s traditions, strategies, and cultural significance."
This same TEXT appears three times in this file. Refactor it into a common variable like this: https://github.com/openmpf/openmpf-components/blob/master/python/AzureTranslation/tests/test_acs_translation.py#L55
python/LlmSpeechSummarization/llm_speech_summarization_component/tests/test_llm_speech_summarization_component.py line 542 at r2 (raw file):
try: raise _log_exception(mpf.DetectionError.OTHER_DETECTION_ERROR_TYPE, 'It worked') assert False
It's not possible to assert False here. This is dead code. Please remove it.
python/TransformerTagging/transformer_tagging_component/transformer_tagging_component.py line 258 at r2 (raw file):
self.json = self._load_json(corpus_file_name) start = time.time() self.embed = model.encode(self.json["text"].tolist(), convert_to_tensor=True, show_progress_bar=False)
I will be landing a hotfix to develop in the next day or two. You will need to merge that into this before landing.
jrobble
left a comment
There was a problem hiding this comment.
@jrobble reviewed 3 files and all commit messages, and resolved 10 discussions.
Reviewable status:complete! all files reviewed, all discussions resolved (waiting on eric-mccann-pro and tstrass).
jrobble
left a comment
There was a problem hiding this comment.
@jrobble reviewed all commit messages.
Reviewable status:complete! all files reviewed, all discussions resolved (waiting on eric-mccann-pro and tstrass).
jrobble
left a comment
There was a problem hiding this comment.
@jrobble reviewed 3 files and all commit messages.
Reviewable status:complete! all files reviewed, all discussions resolved (waiting on eric-mccann-pro and tstrass).
jrobble
left a comment
There was a problem hiding this comment.
@jrobble partially reviewed 4 files.
Reviewable status: 22 of 24 files reviewed, all discussions resolved (waiting on eric-mccann-pro and tstrass).
jrobble
left a comment
There was a problem hiding this comment.
@jrobble partially reviewed 3 files.
Reviewable status: all files reviewed (commit messages unreviewed), all discussions resolved (waiting on eric-mccann-pro and tstrass).
jrobble
left a comment
There was a problem hiding this comment.
@jrobble reviewed all commit messages.
Reviewable status:complete! all files reviewed, all discussions resolved (waiting on eric-mccann-pro and tstrass).
Issues:
Related PRs:
Please review our Contributor Guide.
This change is