Add arena hard v2.0 #29

Open
ErlisLushtaku wants to merge 7 commits into main from erlislushtaku/feat/add-arena-hard-v2.0
@ErlisLushtaku
Collaborator

No description provided.

Collaborator

@geoalgo geoalgo left a comment

LGTM, I just have a few small comments.

assert dataset in ["alpaca-eval", "arena-hard"]
assert dataset in [
"alpaca-eval",
"arena-hard",

Maybe we should remove this one?

def download_all():
print(f"Downloading all dataset in {data_root}")
for dataset in ["alpaca-eval", "arena-hard", "m-arena-hard"]:
for dataset in ["alpaca-eval", "m-arena-hard"]:

Why not put all of them in the loop?
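
One possible reading of this suggestion, as a minimal sketch: keep every dataset name in a single sequence and iterate over it once. The `download_one` callable and the `ALL_DATASETS` constant are hypothetical names introduced for illustration, and "arena-hard-v2.0" is assumed from the PR title rather than taken from the diff.

```python
from typing import Callable, List, Sequence

# Illustrative dataset names; "arena-hard-v2.0" is an assumption based on the PR title.
ALL_DATASETS = ("alpaca-eval", "arena-hard-v0.1", "arena-hard-v2.0", "m-arena-hard")


def download_all(
    download_one: Callable[[str], None],
    datasets: Sequence[str] = ALL_DATASETS,
) -> List[str]:
    """Download every dataset through one loop and return the names processed."""
    done = []
    for dataset in datasets:
        # download_one is a hypothetical per-dataset download function.
        download_one(dataset)
        done.append(dataset)
    return done
```

With this shape, adding a new Arena-Hard version would only mean extending the sequence, with no extra call outside the loop.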

@@ -0,0 +1,69 @@
from pathlib import Path

I would rather keep those in a branch and not in main, as they are user-specific.

Maybe we could add "and github contributors"?

Comment on lines +21 to +25
"arena-hard-v0.1": ArenaHardSpec(
public_name="arena-hard-v0.1",
canonical_name="arena-hard-v0.1",
hf_repo_id=ARENA_HARD_HF_REPO_ID,
hf_variant="arena-hard-v0.1",

I am confused: public_name, canonical_name, and hf_variant are all identical, so why do we need all three?
Also, I think removing "arena-hard" in favor of the fully specified dataset name (like arena-hard-v0.1) makes sense.
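
If the three names only differ in rare cases, one option is to let the spec default the others to `public_name`. The following is a sketch under that assumption, not the PR's actual `ArenaHardSpec`: the field names come from the snippet above, while the defaulting logic and the placeholder value of `ARENA_HARD_HF_REPO_ID` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

# Placeholder value for illustration; the real repo id is defined in the PR.
ARENA_HARD_HF_REPO_ID = "example-org/arena-hard"


@dataclass(frozen=True)
class ArenaHardSpec:
    public_name: str
    hf_repo_id: str
    # Optional overrides; both fall back to public_name when omitted.
    canonical_name: Optional[str] = None
    hf_variant: Optional[str] = None

    def __post_init__(self) -> None:
        # A frozen dataclass blocks normal assignment, so use object.__setattr__.
        if self.canonical_name is None:
            object.__setattr__(self, "canonical_name", self.public_name)
        if self.hf_variant is None:
            object.__setattr__(self, "hf_variant", self.public_name)


spec = ArenaHardSpec(public_name="arena-hard-v0.1", hf_repo_id=ARENA_HARD_HF_REPO_ID)
```

Entries such as "arena-hard-v0.1" would then need only `public_name` and `hf_repo_id`, and an entry whose Hugging Face variant differs could still pass `hf_variant` explicitly.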
