Migrate sheets reading to use serde

We have a bunch of structs which we fill from Google Sheets, one per row, e.g.

https://github.com/CodeYourFuture/trainee-tracker/blob/aa0d5787598ab8191f6429b6858f3b62dd2a6f9f/src/prs.rs#L217-L224

We currently do this in an annoyingly manual and inflexible way.

For this particular struct, we populate it here:

https://github.com/CodeYourFuture/trainee-tracker/blob/aa0d5787598ab8191f6429b6858f3b62dd2a6f9f/src/reviewer_staff_info.rs#L48-L79

We ignore the headings completely, so if someone changed the sheet, we would start reading incorrect values. Then we hard-code the field order and types of each cell.

In other places, we manually check headings, and just error if they're wrong: https://github.com/CodeYourFuture/trainee-tracker/blob/aa0d5787598ab8191f6429b6858f3b62dd2a6f9f/src/register.rs#L105-L160

We also handle partial rows poorly.

I would love instead for us to just use `serde`. We could derive `Deserialize` for each struct we populate, and use a `serde` adaptor to read the sheet into a `Vec<T>`.

There is an existing [`serde_sheets` crate](https://crates.io/crates/serde_sheets) for doing this, but it has some issues. If it runs into issues with a row, it just prints an error to stdout and continues (https://docs.rs/serde_sheets/latest/src/serde_sheets/lib.rs.html#200).

Ideally we would either improve this crate, or write our own one. I'm imagining supporting a function along the lines of:

```rust
pub async fn read_all<T: DeserializeOwned>(
    sheets: &mut Sheets,
    document_id: &str,
    tab_name: &str,
) -> SheetReadResult<T>;

enum SheetReadResult<T> {
    AllOk(Vec<T>),
    SomeOk {
        ok: Vec<T>,
        partial_records: Vec<Record>,
        deserialization_error_records: Vec<(Record, serde::Error)>,
    },
    Err(SheetsError),
}

struct Record {
    row_number: usize,
    cells: Vec<CellData>,
}
```

	#[derive(Debug, PartialEq, Eq, Serialize)]
	pub(crate) struct ReviewerStaffOnlyDetails {
	pub(crate) name: String,
	pub(crate) attended_training: bool,
	pub(crate) checked: CheckStatus,
	pub(crate) quality: String,
	pub(crate) notes: String,
	}

	for (row_index, cells) in sheet.rows.iter().enumerate() {
	if row_index == 0 {
	continue;
	}
	if cells.len() < 6 {
	continue;
	}

	let github_login = GithubLogin::from(cell_string(&cells[0]));

	let notes = match cells.get(6) {
	Some(cell) => cell_string(cell),
	None => String::new(),
	};

	let checked = match (cell_bool(&cells[3]), cell_bool(&cells[4])) {
	(true, false) => CheckStatus::CheckedAndOk,
	(true, true) => CheckStatus::CheckedAndCheckAgain,
	(false, _) => CheckStatus::Unchecked,
	};

	reviewers.insert(
	github_login.clone(),
	ReviewerStaffOnlyDetails {
	name: cell_string(&cells[1]),
	attended_training: cell_bool(&cells[2]),
	checked,
	quality: cell_string(&cells[5]),
	notes,
	},
	);
	}

	for (row_number, cells) in sheet_data.into_iter().enumerate() {
	// Some sheets have documentation or pivot table
	if row_number == 0 && !cells.is_empty() && cell_string(&cells[0]) != "Name" {
	continue;
	}
	if cells.len() < 7 {
	return Err(anyhow::anyhow!(
	"Not enough columns for row {} - expected at least 7, got {} containing: {}",
	row_number,
	cells.len(),
	format!("{:#?}", cells),
	));
	}
	if row_number == 0 {
	let headings = cells.iter().take(7).map(cell_string).collect::<Vec<_>>();
	if headings
	!= [
	"Name",
	"Email",
	"Timestamp",
	"Course",
	"Module",
	"Day",
	"Location",
	]
	{
	return Err(anyhow::anyhow!(
	"Register sheet contained wrong headings: {}",
	headings.join(", ")
	));
	}
	} else {
	if cells[0].effective_value.is_none() {
	break;
	}
	let (sprint_number, attendance) = read_row(&cells, register_url.clone())
	.with_context(\|\| format!("Failed to read attendance from row {}", row_number))?;
	if attendance.timestamp.date_naive() <= start_date
	\|\| attendance.timestamp.date_naive() >= end_date
	{
	continue;
	}
	let sprint_index = sprint_number - 1;
	while sprints.len() < sprint_number {
	sprints.push(IndexMap::new());
	}
	if sprints[sprint_index].contains_key(&attendance.email) {
	warn!(
	"Register sheet contained duplicate entry for sprint {} trainee {}",
	sprint_number, attendance.email
	);
	} else {
	sprints[sprint_index].insert(attendance.email.clone(), attendance);
	}
	}
	}

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Migrate sheets reading to use serde #39

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Uh oh!

Migrate sheets reading to use serde #39

Description

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions