Skip to content

feature/: Exit Withdraw Date code to handle inclusivity/exclusivity#201

Merged
rlittle08 merged 4 commits intoedanalytics:mainfrom
tnedu:tdoe_exit_withdraw_date_enhancement
Apr 1, 2026
Merged

feature/: Exit Withdraw Date code to handle inclusivity/exclusivity#201
rlittle08 merged 4 commits intoedanalytics:mainfrom
tnedu:tdoe_exit_withdraw_date_enhancement

Conversation

@smckee-tnedu
Copy link
Copy Markdown
Contributor

@smckee-tnedu smckee-tnedu commented Nov 13, 2025

Description & motivation

If you ask Edfi if the enrollment exit_withdraw_date "should considered inclusive (this is the last day of the enrollment and the student attended on that day) or exclusive (the last day the student attended for their enrollment was the day prior)" then they will tell you "treat it however it makes sense for your implementation". Dandy. But EDU assumes that this date is inclusive. Here at TDOE, this is the ONLY date that we consider exclusive. So lots of the models, and what not, are not accurate. In particular, the "is_active_enrollment" in fct_student_school_association. But also, many of the attendance models would also be inaccurate for us. The changes presented incorporate a new variable that can be set, edu:enroll:exit_withdraw_date_inclusive, and any models that use exit_withdraw_date will now utilize a few new macros to evaluate exit_withdraw_date from an inclusive or exclusive manner, however configured. The default for this variable is that exit_withdraw_date is true, so the output of EDU shouldn't change for people who came before TDOE.

Breaking changes introduced by this PR:

There should be no breaking changes. However, two caveats:

  • There may be a bug in the original tests/integrity_tests/all_student_school_associations_have_calendars.sql. There is a comparison between the first day of school and the enrollment exit_withdraw_date. But everywhere in EDU, it treats exit_withdraw_date as inclusive and uses <= instead of < when comparing dates. Unless I'm missing something, and, admittedly, I might be, this code should have fct_stu_school_assoc.exit_withdraw_date >= first_school_day.first_school_day instead of what is currently there. If the comparator was >= then this would include students who only attended the first day of school and then exited. (I think).
  • Please look closely at the Snowflake macro snowflake__day_count_in_range. There is a model fct_student_school_attendance_event that would calculate the number of days for an enrollment. In Databricks, for an inclusive end date, this calculated int would be wrong. Take a 2 day enrollment for example, Nov 1 to Nov 2. Supposing Nov 2 is inclusive in this example, the expected value is 2, but the code was producing a 1. I am assuming Snowflake works the same, so I am adding a +1 to the Snowflake calculation in order to account for this. I don't have Snowflake, so I could be wrong and this will need fixing. I am not sure. However, in any event, the value calculated isn't even used in the output of the model, so it may not matter that it might have been off by 1 day. In any event, there may be dragons here.

All other changes should be completely transparent since the default behavior of the enhancement favors the current EDU implementation.

PR Merge Priority:

  • Low
  • Medium
  • High

I would prefer to set this to Medium because we want to get off of our own forks of edu_wh and we won't be able to until certain other things are merged into edu_wh main. However, there is another change we have in the queue that deals with adding CDS (and an enhanced impl of CDS at that) to basically every dim and fact. So I left this as Low.

Changes to existing files:

  • models/core_warehouse/fct_student_daily_attendance.sql : Changed a case statement to use the new macro for determing if a date counts within the enrollment end date.
  • models/core_warehouse/fct_student_school_association.sql : Modified all the comparisons to exit_withdraw_date to use the new macros.
  • models/core_warehouse/fct_student_school_attendance_event.sql : Modified all the comparisons to exit_withdraw_date to use the new macros. This includes a join on dim_calendar_date.
  • models/qc/sections_per_enrollment.sql : Modified all the comparisons to exit_withdraw_date for the overlap logic with section dates.
  • tests/integrity_tests/all_student_school_associations_have_calendars.sql : Modified the comparison to exit_withdraw_date for the students who exited before the first day of school. The comment here is confusing. But also, I think there's a bug in the original code. See above section on stuff that might be broken.
  • tests/value_tests/invalid_enrollments.sql : Modified the comparisons to exit_withdraw_date for the columns of if they exited before their first day or before entry.
  • tests/value_tests/overlapping_enrollments.sql : Modified the join clause to not use between and use the macro instead of the second date normally in the between.

New files created:

  • macros/end_date_utils.sql : Contains the macros necessary to correctly evaluate dates inclusivity/exclusivity.

Tests and QC done:

I tested only the databricks side of this since I do not have Snowflake. Things appear to be working as expected.

edu_wh PR Review Checklist:

Make sure the following have been completed before approving this PR:

  • Description of changes has been added to Unreleased section of CHANGELOG.md. Add under ## New Features for features, etc.
  • Code has been tested/checked for Databricks and Snowflake compatibility - EA engineers see Databricks checklist here
  • Reviewer confirms the grain of all tables are unchanged, OR any changes are expected, communicated, and this PR is flagged as a breaking change (not for patch release)
  • If a new configuration xwalk was added:
    • The code is written such that the xwalk is optional (preferred), and this behavior was tested, OR
    • The code is written such that the xwalk is required, and the required xwalk is added to edu_project_template, and this PR is flagged as breaking change (not for patch release)
    • A description for the new xwalk has been added to EDU documentation site here
  • If a new configuration variable was added:
    • The code is written such that the variable is optional (preferred), and this behavior was tested, OR
    • The code is written such that the variable is required, and a default value was added to edu_project_template, and this PR is flagged as breaking change (not for patch release)
    • A description for the new variable has been added to EDU documentation site here

@smckee-tnedu
Copy link
Copy Markdown
Contributor Author

I meant to say, but forgot, that I don't care if you use exactly what I wrote or rename everything or design it differently or whatever. The bottom line for us is that exit_withdraw_date is exclusive in our world and so we need a version of edu_wh where that is a configurable option. Thanks!

I struggle with understanding all the implementation reasons and due to a requirement at TDOE, I have changed this back to what it was, but updated the comment so it made more sense.
@rlittle08 rlittle08 self-requested a review February 25, 2026 21:33
Copy link
Copy Markdown
Collaborator

@rlittle08 rlittle08 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thank you for the thoroughness, including updating tests.

The design, naming, etc. are all approved. Good catch on the all_student_school_associations_have_calendars bug, too. I think that'd only make a difference in very small edge cases, but it's nice to have the logic consistent.

Two things I want to think through before releasing

  1. Can/should we go ahead and add this logic in other places within edu_wh? The ones I'm aware of are stu-section and staff-section. This could easily go in a later PR, though, given it's not relevant to TDOE
  2. I'm actually worried that edu_wh dbt models are the least common place that this inclusivity logic is found. More often, custom models, ad-hoc queries, BI tools, or soon AI query builders are doing this logic. What is the best way for EDU to properly document when a certain end date should be considered inclusive vs. exclusive? Is this an argument for making a larger change to the data model, either adding some kind of metadata column or (more drastically) forcing standardization of end date inclusivity? I imagine the latter would not make you happy at TDOE, given queryers expect based on history that the dates will be exclusive...

@smckee-tnedu
Copy link
Copy Markdown
Contributor Author

  1. That would be fine with me. At TN the only exclusive date is exit_withdraw_date. ALL OTHERS are inclusive. This must be some sort of 20+ year old legacy decision because it has annoyed me the entirety of the 5 years I've been here at TDOE.
  2. I don't think you can force a standardization of end date inclusivity. I tried to argue that we do this before we started using Edfi, long before we knew about EDU. But the pushback I got centered around the fact that we'd have to make every SIS vendor change the way they send us data. They all see this date as exclusive and this is the date they record in their SIS packages. PROBABLY 20 years ago we told them to send it this way. Then also we'd have to change the way we think about our data, and not just us in IT, but literally everyone who uses the data on a day to day basis. Now, maybe you could sort of force the standardization a different way by defining a variable that then controls code that does a date_add(end_date, -1), turning the exclusive date into an inclusive date, but then we at TDOE would be in the unfortunate world where some of our enrollments would have an end date before the begin date. For example, one type of no-show enrollment for us is when a student's entry and exit date are equal. And our Districts and Cohort folks want to see these no-show enrollments. The unfortunate consequence is now the entry date is 8/4 and the exit date is 8/3, which fct_student_school_association has code to kick these out altogether. i mean, anything is possible. Maybe we have a new fct which is just no-shows of any type. I don't know. Maybe it's easier to enforce a standard where end dates are exclusive (adding +1 day instead), but that makes easy to understand sql clauses that use between to now be unusable. Again, I don't know.

@smckee-tnedu smckee-tnedu marked this pull request as draft February 26, 2026 18:29
@smckee-tnedu smckee-tnedu marked this pull request as ready for review February 26, 2026 18:31
@smckee-tnedu
Copy link
Copy Markdown
Contributor Author

Also, this PR is missing a commit that I made on Feb 19. I undid a change to put back the way it was. I thought it would show up automatically, but it's not for some reason.

@rlittle08
Copy link
Copy Markdown
Collaborator

@smckee-tnedu do you want to open a fresh pr to get the final change in? I am doing final review and hoping to release this week

@rlittle08
Copy link
Copy Markdown
Collaborator

@smckee-tnedu

Is this the change?

{% if var('edu:enroll:exclude_exit_before_first_day', True) -%}
      -- exclude students who exited before the first school day
      and ({{ date_within_end_date('bld_school_calendar_windows.first_school_day', 'exit_withdraw_date', var('edu:enroll:exit_withdraw_date_inclusive', True)) }}
          or bld_school_calendar_windows.first_school_day is null
          )
    {% endif %}

@smckee-tnedu
Copy link
Copy Markdown
Contributor Author

The entire where clause should read like so:

    where true
   {% if var('edu:enroll:exclude_exit_before_first_day', True) -%}
      -- exclude students who exited before the first school day
      and ({{ date_within_end_date('bld_school_calendar_windows.first_school_day', 'exit_withdraw_date', var('edu:enroll:exit_withdraw_date_inclusive', True)) }}
          or bld_school_calendar_windows.first_school_day is null
          )
    {% endif %}
    -- make sure exit is after entry
    and (exit_withdraw_date is null or exit_withdraw_date >= entry_date)
    -- exclude students who never actually enrolled
    {% set excl_withdraw_codes =  var('edu:enroll:exclude_withdraw_codes')  %}
    {% if excl_withdraw_codes | length -%}
      {% if excl_withdraw_codes is string -%}
        {% set excl_withdraw_codes = [excl_withdraw_codes] %}
      {%- endif -%}
      -- drop invalid enrollments
      and (stg_stu_school.exit_withdraw_type not in (
      '{{ excl_withdraw_codes | join("', '") }}'
      )
      -- keep rows with no exit_withdraw
      or stg_stu_school.exit_withdraw_type is null) 
    {% endif %}
    {% if var('edu:enroll:exclude_cross_year_enrollments', False)%}
    and dim_student.school_year = stg_stu_school.school_year
    {% endif %}

i had originally changed this line
and (exit_withdraw_date is null or exit_withdraw_date >= entry_date)
to something else but then realized that that was making a bad assumption so i changed it back to the way you guys had it.

@rlittle08
Copy link
Copy Markdown
Collaborator

rlittle08 commented Mar 30, 2026

@smckee-tnedu got it, thanks!

How should the first day of school be treated? With current logic, if edu:enroll:exclude_exit_before_first_day = True, and edu:enroll:exit_withdraw_date_inclusive = False, and exit_withdraw_date = first_day_of_school, the record will be dropped. Is that correct in TN?

@smckee-tnedu
Copy link
Copy Markdown
Contributor Author

smckee-tnedu commented Mar 31, 2026

There is a lot going on here and the entire thing is wrapped up in how we process no shows.
First, in reference to your question, but not to answer it because it assumes a scenario we don't have, edu:enroll:exclude_exit_before_first_day = False for us. The reason is because our Districts want to see no shows in the enrollment fact. Why? I don't know. I think there is some legacy behavior in here that they use for some reason that should probably go away.

In our world, we have two types of no shows (our districts don't use the no-show exit withdraw type).

  1. Enrollments where entry_date = exit_withdraw_date. Since exit_withdraw_date is exclusive, these two dates have no duration between them and so they are a no show.
  2. Enrollments where entry_date != exit_withdraw_date and there are no School Days within that enrollment range.

So Districts want to see BOTH of these types of no shows. So, in my testing, this chunk of code allows us to keep our No Show type 2:

   {% if var('edu:enroll:exclude_exit_before_first_day', True) -%}
      -- exclude students who exited before the first school day
      and ({{ date_within_end_date('bld_school_calendar_windows.first_school_day', 'exit_withdraw_date', var('edu:enroll:exit_withdraw_date_inclusive', True)) }}
          or bld_school_calendar_windows.first_school_day is null
          )
    {% endif %}

and this chunk of code allows us to keep our No Show type 1 (merely because your original code assumes exit_withdraw_date inclusivity:

and (exit_withdraw_date is null or exit_withdraw_date >= entry_date)

Should the above line change because I think your original intent is to EXCLUDE no shows from this fact table? Maybe, but it was working for our purposes, so I left it alone (after an initial attempt to fiddle with it that didn't work).

(For additional and unnecessary information, all our No Show types can be defined as "any Enrollment where there exists no School Days within the Enrollment date range". So, you could imagine a 3rd type of Enrollment that doesn't fit in with the 2 defined above where a Student's entire set of enrollment days occurs during the Christmas holiday. We still consider this a no show, even though it's not one of the two main versions.)

So, to answer your question, from my perspective, your scenario and the behavior seem to align to me:

Assumptions

  1. edu:enroll:exit_withdraw_date_inclusive = False
    a. The exit_withdraw_date is not included in the set of counted enrollment days.
  2. edu:enroll:exclude_exit_before_first_day = True
    a. Any enrollment whose entire set of included enrollment days occurs before the first day of school must be excluded.
  3. exit_withdraw_date = first_day_of_school

Reasoning

  1. Because exit_withdraw_date_inclusive = False, the enrollment does not include first_day_of_school as an active enrollment day.
  2. Therefore, all included enrollment days (if any) occur strictly before the first day of school.
  3. Because exclude_exit_before_first_day = True and the enrollment’s included days occur entirely before the first day of school, the enrollment meets the exclusion criteria.

Conclusion

The enrollment record should be dropped.

@rlittle08 rlittle08 changed the base branch from main to feature/incremental_wh April 1, 2026 15:21
@rlittle08 rlittle08 changed the base branch from feature/incremental_wh to main April 1, 2026 15:21
@rlittle08
Copy link
Copy Markdown
Collaborator

@smckee-tnedu much appreciated. We are going forward with your conclusion, that these records should be dropped. We may later revisit a configuration for this particular rule, but we don't need that at the moment.

I'm ready to approve and merge. I was able to get your latest commit into the PR by changing the base of the PR off main, then back on main, from your fork repo. Don't ask me why that works lol

@rlittle08 rlittle08 added 0.6.2 and removed 0.7.0 labels Apr 1, 2026
@rlittle08 rlittle08 merged commit 8f34489 into edanalytics:main Apr 1, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants