Skip to content

Mapcat integration for make_atomic_filterbin_map and make_coadd_atomic_map#1534

Open
davidvng wants to merge 18 commits intomasterfrom
atomic-mapcat
Open

Mapcat integration for make_atomic_filterbin_map and make_coadd_atomic_map#1534
davidvng wants to merge 18 commits intomasterfrom
atomic-mapcat

Conversation

@davidvng
Copy link
Copy Markdown
Contributor

Modifies existing database querying and writing from original sqlite structure to using mapcat for both make_atomic_filterbin_map and make_coadd_atomic_map

Related changes will be needed in the site-pipeline-configs, especially to set the following environment variables used by mapcat:

MAPCAT_DATABASE_NAME
MAPCAT_DATABASE_TYPE
MAPCAT_ATOMIC_PARENT
MAPCAT_ATOMIC_COADD_PARENT

I've tested the coadd script and confirmed everything works as intended.

However, for the atomic mapmaking, I have not tested due to permissions issues. I'd appreciate @chervias or @JBorrow to double check the changes in make_atomic_filterbin_map.py make sense.

@davidvng davidvng requested review from JBorrow and chervias January 22, 2026 21:32
@JBorrow
Copy link
Copy Markdown
Member

JBorrow commented Jan 22, 2026

Not sure why the tests are failing due to the missing dependency.

Comment thread sotodlib/mapmaking/coadd_mapmaker.py Outdated
@iparask
Copy link
Copy Markdown
Member

iparask commented Jan 23, 2026

Hello @davidvng, take a look here to see how mapcat (#1485) is integrated with the depth1 mapmaker. There we pass the config as an input. In addition you do not really need select as the session provides a query method which is as easy to invoke as select.

@JBorrow
Copy link
Copy Markdown
Member

JBorrow commented Jan 23, 2026

@iparask using session.query() is deprecated. Forming a select statement is preferred behaviour and is the standard in SQLAlchemy 2.0+. See: https://docs.sqlalchemy.org/en/21/changelog/migration_20.html#migration-orm-usage

Comment thread sotodlib/mapmaking/coadd_mapmaker.py Outdated
Comment on lines +126 to +142
with settings.session() as session:
data = AtomicMapCoaddTable(
coadd_name=map_name,
prefix_path=prefix_path,
platform=platform,
interval=interval,
start_time=start_time,
stop_time=stop_time,
freq_channel=band,
geom_file_path=self.geom_file_path,
split_label=split_label,
atomic_maps=self.maps if self.coadd_atomic else [],
parent_coadds=self.maps if not self.coadd_atomic else []
)

session.add(data)
session.commit()
Copy link
Copy Markdown
Member

@iparask iparask Jan 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would place this in a function similar to the depth 1 map in the PR I referenced

Comment thread sotodlib/mapmaking/coadd_mapmaker.py Outdated
Comment on lines +187 to +193
with settings.session() as session:
if '+' in band:
statement = select(AtomicMapTable).where((AtomicMapTable.telescope == platform) & (AtomicMapTable.split_label == split_label) & (AtomicMapTable.ctime >= start_time) & (AtomicMapTable.ctime <= stop_time))
else:
statement = select(AtomicMapTable).where((AtomicMapTable.telescope == platform) & (AtomicMapTable.freq_channel == band) & (AtomicMapTable.split_label == split_label) & (AtomicMapTable.ctime >= start_time) & (AtomicMapTable.ctime <= stop_time))
maps = session.execute(statement).scalars().all()
map_ids = [m.atomic_map_id for m in maps]
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The same as above

Comment thread sotodlib/mapmaking/coadd_mapmaker.py Outdated
Comment on lines +196 to +205
with settings.session() as session:
statement = select(AtomicMapCoaddTable).where((AtomicMapCoaddTable.interval == interval) & (AtomicMapCoaddTable.platform == platform) & (AtomicMapCoaddTable.freq_channel == band) & (AtomicMapCoaddTable.split_label == split_label) & (AtomicMapCoaddTable.start_time == start_time) & (AtomicMapCoaddTable.stop_time == stop_time)).order_by(AtomicMapCoaddTable.coadd_id.desc())
coadd = session.execute(statement).scalars().first()
if coadd is not None:
atomic_parent_ids = [a.atomic_map_id for a in coadd.atomic_maps]
coadd_parent_ids = [c.coadd_id for c in coadd.parent_coadds]

if (map_ids == atomic_parent_ids) or (map_ids == coadd_parent_ids):
return False

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can also be a function under mapcat.py

@davidvng
Copy link
Copy Markdown
Contributor Author

davidvng commented Feb 2, 2026

Not sure why the tests are failing due to the missing dependency.

@JBorrow The tests are still failing and I'm assuming it has to do with mapcat being in the optional dependencies? Thoughts on moving it to the core list of dependencies?

@iparask
Copy link
Copy Markdown
Member

iparask commented Feb 2, 2026

Not sure why the tests are failing due to the missing dependency.

@JBorrow The tests are still failing and I'm assuming it has to do with mapcat being in the optional dependencies? Thoughts on moving it to the core list of dependencies?

I am not sure that is the solution. #759 passes with mapcat as an optional dependency. It might be some other reason. There is similar discussion on slack.

@davidvng
Copy link
Copy Markdown
Contributor Author

davidvng commented Feb 4, 2026

@JBorrow @iparask Looks like the python-3.13 tests now pass, but the python-3.10 tests will always fail since mapcat doesn't support 3.10.

@JBorrow
Copy link
Copy Markdown
Member

JBorrow commented Feb 4, 2026

0.1.1 includes 'support' for 3.10.

@davidvng davidvng requested review from JBorrow and iparask February 9, 2026 22:49
Comment thread sotodlib/mapmaking/coadd_mapmaker.py Outdated
Comment on lines +126 to +142
with Settings(**mapcat_settings).session() as session:
data = AtomicMapCoaddTable(
coadd_name=map_name,
prefix_path=prefix_path,
platform=platform,
interval=interval,
start_time=start_time,
stop_time=stop_time,
freq_channel=band,
geom_file_path=self.geom_file_path,
split_label=split_label,
atomic_maps=self.maps if self.coadd_atomic else [],
parent_coadds=self.maps if not self.coadd_atomic else []
)

session.add(data)
session.commit()
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be outside of the mapmaking module. Think what will happen if we decide to change mapcat with something else. We will have to change the code here as well.

Please move any database interactions in the site_pipeline scripts, and create a file for these interactions under the utils of site_pipeline to keep them.

Comment on lines +176 to 186
with Settings(**mapcat_settings).session() as session:
statement = select(AtomicMapCoaddTable)
statement = statement.where(AtomicMapCoaddTable.interval == lower_interval)
statement = statement.where(AtomicMapCoaddTable.platform == platform)
statement = statement.where(AtomicMapCoaddTable.freq_channel == band)
statement = statement.where(AtomicMapCoaddTable.split_label == split_label)
statement = statement.where(AtomicMapCoaddTable.start_time >= start_time)
statement = statement.where(AtomicMapCoaddTable.stop_time <= stop_time)
maps = session.execute(statement).scalars().all()
map_ids = [m.coadd_id for m in maps]

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I said before this script needs to be database agnostic. You can get the map_ids before and pass them as an argument.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apologies, could you clarify what you mean by this? Do you mean to create map names based on the provided info to query and get the map_ids? I don't understand why this is needed when this has been tested and works.

As for the atomic maps, I understand you can use mapmaking.build_obslists to create the map names as you do in your depth1 script. However, in this case, the make_atomic_filterbin_map.py script, which runs separately, already does this and is the script that commits to the database. I don't understand the need to repeat that in this coadd script when the entries for atomic maps already exist.

Comment on lines +193 to +202
with Settings(**mapcat_settings).session() as session:
statement = select(AtomicMapTable)
statement = statement.where(AtomicMapTable.telescope == platform)
statement = statement.where(AtomicMapTable.split_label == split_label)
statement = statement.where(AtomicMapTable.ctime >= start_time)
statement = statement.where(AtomicMapTable.ctime <= stop_time)
if '+' not in band:
statement = statement.where(AtomicMapTable.freq_channel == band)
maps = session.execute(statement).scalars().all()
map_ids = [m.atomic_map_id for m in maps]
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I'm not sure what this means either, can you clarify how this is not datbaase agnostic?

Copy link
Copy Markdown
Member

@iparask iparask Mar 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The mapmaker itself knows about the database, right? It is not so much the type of database as it is the fact that it needs access to one. I do understand with the site_pipeline script needs to have access to the database, but not the mapmaker itself.

I would expect the mapmaker to take the information it needs to create the coadd, how they arrived or selected is the responsibility of the script that calls the mapmaker.

Comment on lines +205 to +218
with Settings(**mapcat_settings).session() as session:
statement = select(AtomicMapCoaddTable)
statement = statement.where(AtomicMapCoaddTable.interval == interval)
statement = statement.where(AtomicMapCoaddTable.platform == platform)
statement = statement.where(AtomicMapCoaddTable.freq_channel == band)
statement = statement.where(AtomicMapCoaddTable.split_label == split_label)
statement = statement.where(AtomicMapCoaddTable.start_time == start_time)
statement = statement.where(AtomicMapCoaddTable.stop_time == stop_time)
statement = statement.order_by(AtomicMapCoaddTable.coadd_id.desc())
coadd = session.execute(statement).scalars().first()
if coadd is not None:
atomic_parent_ids = [a.atomic_map_id for a in coadd.atomic_maps]
coadd_parent_ids = [c.coadd_id for c in coadd.parent_coadds]

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line is left there from the direct integration with sqlite.

Comment on lines 185 to 201
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that this is not used anymore. I think you can remove it.

iparask and others added 3 commits March 9, 2026 14:50
* fix: mapcat integration with atomic maps

* fix: no need to return everything
@iparask iparask self-requested a review April 22, 2026 15:44
@iparask
Copy link
Copy Markdown
Member

iparask commented Apr 22, 2026

@JBorrow, @chervias can you also take a look here?

Comment on lines +155 to +157
mapcat_database_name: str = os.environ.get('MAPCAT_DATABASE_NAME', ''),
mapcat_database_type: str = os.environ.get('MAPCAT_DATABASE_TYPE', ''),
mapcat_atomic_parent: str = os.environ.get('MAPCAT_ATOMIC_PARENT', ''),
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You already have mapcat settings here so just use that, it reads the environment for you.

@JBorrow
Copy link
Copy Markdown
Member

JBorrow commented Apr 23, 2026

Other than the duplicated reading of environment variables (you should use Settings from mapcat as the variables may be specified in another manner) this looks good.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants