diff --git a/README.md b/README.md
index cde5066a8..6c68eaa62 100644
--- a/README.md
+++ b/README.md
@@ -2,16 +2,10 @@
 
 # HEDTools - Python
 
-```{index} HEDTools, Python tools, validation, analysis
-```
-
 > Python tools for validation, analysis, and transformation of HED (Hierarchical Event Descriptors) tagged datasets.
 
 ## Overview
 
-```{index} HED, Hierarchical Event Descriptors, BIDS, NWB
-```
-
 HED (Hierarchical Event Descriptors) is a framework for systematically describing both laboratory and real-world events as well as other experimental metadata. HED tags are comma-separated path strings that provide a standardized vocabulary for annotating events and experimental conditions.
 
 **Key Features:**
@@ -27,14 +21,8 @@ HED (Hierarchical Event Descriptors) is a framework for systematically describin
 
 ## Quick start
 
-```{index} quick start, getting started, installation
-```
-
 ### Online tools (no installation required)
 
-```{index} online tools, web tools, hedtools.org
-```
-
 For simple validation or transformation tasks, use the online tools at [https://hedtools.org/hed](https://hedtools.org/hed) - no installation needed!

 Browser-based validation (no data upload) is available at [https://www.hedtags.org/hed-javascript](https://www.hedtags.org/hed-javascript)
@@ -43,9 +31,6 @@ A development version of the online tools is available at: [https://hedtools.org
 
 ### Python installation
 
-```{index} installation; Python, pip install, PyPI
-```
-
 **Requirements:** Python 3.10 or higher
 
 Install from PyPI:
@@ -94,9 +79,6 @@ pip install -e ".[dev,docs,test,examples]"
 
 ### Basic usage
 
-```{index} usage examples, HedString, load_schema_version, validation example
-```
-
 ```python
 from hed import HedString, load_schema_version
 
@@ -116,9 +98,6 @@ else:
 
 ### Command-line tools
 
-```{index} command-line tools, CLI, hedpy, validate-bids, extract-sidecar
-```
-
 HEDTools provides a unified command-line interface with git-like subcommands:
 
 ```bash
@@ -148,13 +127,12 @@ hed_validate_schemas /path/to/schema.xml
 
 **Note:** The `run_remodel` command has been removed. Table remodeling functionality is now available in the separate [table-remodeler](https://pypi.org/project/table-remodeler/) package.
 
+**Note:** The visualization tools such as the word cloud visualization have been moved to a separate [hed-vis](https://pypi.org/project/hedvis/) project.
+
 For more examples, see the [user guide](https://www.hedtags.org/hed-python/user_guide.html).
 
 ### Jupyter notebook examples
 
-```{index} Jupyter notebooks, examples, interactive workflows
-```
-
 **Note:** Example notebooks are available in the [GitHub repository](https://github.com/hed-standard/hed-python/tree/main/examples) only, not in the PyPI package.
 
 The [`examples/`](examples/) directory contains Jupyter notebooks demonstrating common HED workflows with BIDS datasets:
@@ -185,9 +163,6 @@ See [`examples/README.md`](examples/README.md) for more details.

 ## Documentation
 
-```{index} documentation, user guide, API reference, Sphinx
-```
-
 📖 **Full Documentation:** [https://www.hedtags.org/hed-python](https://www.hedtags.org/hed-python)
 
 - [User guide](https://www.hedtags.org/hed-python/user_guide.html) - Usage instructions
@@ -196,9 +171,6 @@ See [`examples/README.md`](examples/README.md) for more details.
 
 ### Building docs locally
 
-```{index} building documentation, sphinx-build
-```
-
 ```bash
 # Install documentation dependencies
 pip install -e .[docs]
@@ -212,9 +184,6 @@ To iew the built documentation open `docs/_build/html/index.html` in your browse
 
 ### Formatting with Black
 
-```{index} Black, code formatting, style guide
-```
-
 This project uses [Black](https://black.readthedocs.io/) for consistent code formatting.
 
 ```bash
@@ -247,9 +216,6 @@ black --workers 1 .
 
 ## Related repositories
 
-```{index} HED ecosystem, repositories, hed-schemas, hed-specification
-```
-
 The HED ecosystem consists of several interconnected repositories:
 
 | Repository | Description |
@@ -265,9 +231,6 @@ The HED ecosystem consists of several interconnected repositories:
 
 ## Contributing
 
-```{index} contributing, development setup, pull requests
-```
-
 We welcome contributions! Here's how you can help:
 
 1. **Report issues:** Use [GitHub Issues](https://github.com/hed-standard/hed-python/issues) for bug reports and feature requests
@@ -304,14 +267,8 @@ For detailed contribution guidelines, please see [CONTRIBUTING.md](CONTRIBUTING.
 
 ## Configuration
 
-```{index} configuration, schema caching, cache directory
-```
-
 ### Schema caching
 
-~~~{index} schema; caching, ~/.hedtools
-~~~
-
 By default, HED schemas are cached in `~/.hedtools/` (location varies by OS).
 
 ```python
@@ -343,7 +300,8 @@ HEDTools is licensed under the MIT License. See [LICENSE](LICENSE) for details.

 ## Support
 
-- [Documentation](https://www.hedtags.org/hed-python)
-- [GitHub issues](https://github.com/hed-standard/hed-python/issues)
-- [HED Homepage](https://www.hedtags.org)
+- HED documentation: [www.hedtags.org/hed-resources](https://www.hedtags.org/hed-resources)
+- HED homepage: [www.hedtags.org](https://www.hedtags.org)
+- GitHub issues: [https://github.com/hed-standard/hed-python/issues](https://github.com/hed-standard/hed-python/issues)
+- Questions or ideas: [HED discussions](https://github.com/orgs/hed-standard/discussions)
 - Contact: [hed-maintainers@gmail.com](mailto:hed-maintainers@gmail.com)
diff --git a/hed/tools/visualization/__init__.py b/hed/tools/visualization/__init__.py
deleted file mode 100644
index 95c639b36..000000000
--- a/hed/tools/visualization/__init__.py
+++ /dev/null
@@ -1,3 +0,0 @@
-"""Visualization tools for HED."""
-
-from .tag_word_cloud import create_wordcloud, word_cloud_to_svg
diff --git a/hed/tools/visualization/tag_word_cloud.py b/hed/tools/visualization/tag_word_cloud.py
deleted file mode 100644
index 6263f04ad..000000000
--- a/hed/tools/visualization/tag_word_cloud.py
+++ /dev/null
@@ -1,120 +0,0 @@
-"""Utilities for creating a word cloud."""
-
-import numpy as np
-from PIL import Image
-from hed.errors.exceptions import HedFileError
-from hed.tools.visualization import word_cloud_util
-from wordcloud import WordCloud
-
-MIN_WORD_CLOUD_SIZE = 100
-
-
-def create_wordcloud(word_dict, mask_path=None, background_color=None, width=400, height=300, **kwargs):
-    """Takes a word dict and returns a generated word cloud object.
-
-    Parameters:
-        word_dict (dict): words and their frequencies
-        mask_path (str or None): The path of the mask file
-        background_color (str or None): If None, transparent background.
-        width (int): width in pixels.
-        height (int): height in pixels.
-        kwargs (kwargs): Any other parameters WordCloud accepts, overrides default values where relevant.
-
-    Returns:
-        WordCloud: The generated cloud. (Use .to_file to save it out as an image.)
-
-    :raises ValueError:
-        An empty dictionary was passed
-    """
-    mask_image = None
-    if mask_path:
-        mask_image = load_and_resize_mask(mask_path, width, height)
-        width = round(mask_image.shape[1])
-        height = round(mask_image.shape[0])
-    if height is None and width is None:
-        width = 400
-        height = 300
-    elif height is None:
-        height = round(width / 1.5)
-    elif width is None:
-        width = round(height * 1.5)
-    width = max(width, MIN_WORD_CLOUD_SIZE)
-    height = max(height, MIN_WORD_CLOUD_SIZE)
-    kwargs.setdefault("contour_width", 3)
-    kwargs.setdefault("contour_color", "black")
-    kwargs.setdefault("prefer_horizontal", 0.75)
-    kwargs.setdefault("color_func", word_cloud_util.default_color_func)
-    kwargs.setdefault("relative_scaling", 1)
-    kwargs.setdefault("max_font_size", max(round(height / 20), 12))
-    kwargs.setdefault("min_font_size", 8)
-    if "font_path" not in kwargs:
-        kwargs["font_path"] = None
-    elif kwargs["font_path"] and not kwargs["font_path"].lower().endswith((".ttf", ".otf", ".ttc")):
-        raise HedFileError("InvalidFontPath", f"Font {kwargs['font_path']} not valid on this system", "")
-
-    wc = WordCloud(background_color=background_color, mask=mask_image, width=width, height=height, mode="RGBA", **kwargs)
-
-    wc.generate_from_frequencies(word_dict)
-
-    return wc
-
-
-def word_cloud_to_svg(wc):
-    """Return a WordCould as an SVG string.
-
-    Parameters:
-        wc (WordCloud): the word cloud object.
-
-    Returns:
-        str: The svg for the word cloud.
-
-    """
-    svg_string = wc.to_svg()
-    svg_string = svg_string.replace("fill:", "fill:rgb")
-    svg_string = svg_string.replace("", word_cloud_util.generate_contour_svg(wc, wc.width, wc.height) + "")
-    return svg_string
-
-
-def load_and_resize_mask(mask_path, width=None, height=None):
-    """Load a mask image and resize it according to given dimensions.
-
-    The image is resized maintaining aspect ratio if only width or height is provided.
-
-    Returns None if no mask_path.
-
-    Parameters:
-        mask_path (str): The path to the mask image file.
-        width (int, optional): The desired width of the resized image. If only width is provided,
-            the image is scaled to maintain its original aspect ratio. Defaults to None.
-        height (int, optional): The desired height of the resized image. If only height is provided,
-            the image is scaled to maintain its original aspect ratio. Defaults to None.
-
-    Returns:
-        numpy.ndarray: The loaded and processed mask image as a numpy array with binary values (0 or 255).
-    """
-    if mask_path:
-        mask_image = Image.open(mask_path).convert("RGBA")
-
-        if width or height:
-            original_size = np.array((mask_image.width, mask_image.height))
-            output_size = np.array((width, height))
-            # Handle one missing param
-            if not height:
-                scale = original_size[0] / width
-                output_size = original_size / scale
-            elif not width:
-                scale = original_size[1] / height
-                output_size = original_size / scale
-
-            mask_image = mask_image.resize(tuple(output_size.astype(int)), Image.LANCZOS)
-
-        mask_image_array = np.array(mask_image)
-        # Treat transparency (alpha < 128) or white (R>127, G>127, B>127) as white, else black
-        mask_image_array = np.where(
-            (mask_image_array[:, :, 3] < 128)
-            | ((mask_image_array[:, :, 0] > 127) & (mask_image_array[:, :, 1] > 127) & (mask_image_array[:, :, 2] > 127)),
-            255,
-            0,
-        )
-
-        return mask_image_array.astype(np.uint8)
diff --git a/hed/tools/visualization/word_cloud_util.py b/hed/tools/visualization/word_cloud_util.py
deleted file mode 100644
index 105848df8..000000000
--- a/hed/tools/visualization/word_cloud_util.py
+++ /dev/null
@@ -1,164 +0,0 @@
-"""Support utilities for word cloud generation."""
-
-import random
-from random import Random
-
-import numpy as np
-from PIL import Image, ImageFilter
-import matplotlib as mp1
-import wordcloud as wcloud
-
-
-def generate_contour_svg(wc, width, height):
-    """Generate an SVG contour mask based on a word cloud object and dimensions.
-
-    Parameters:
-        wc (WordCloud): The word cloud object.
-        width (int): SVG image width in pixels.
-        height (int): SVG image height in pixels.
-
-    Returns:
-        str: SVG point list for the contour mask, or empty string if not generated.
-    """
-    contour = _get_contour_mask(wc, width, height)
-    if contour is None:
-        return ""
-    return _numpy_to_svg(contour, radius=wc.contour_width, color=wc.contour_color)
-
-
-def _get_contour_mask(wc, width, height):
-    """Slightly tweaked copy of internal WorldCloud function to allow transparency for mask.
-
-    Parameters:
-        wc (WordCloud): Representation of the word cloud.
-        width (int): Width of the generated mask.
-        height (int): Height of generated mask.
-
-    Returns:
-        Image: Image of mask.
-
-
-    """
-    if wc.mask is None or wc.contour_width == 0 or wc.contour_color is None:
-        return None
-
-    mask = wc._get_bolean_mask(wc.mask) * 255
-    contour = Image.fromarray(mask.astype(np.uint8))
-    contour = contour.resize((width, height))
-    contour = contour.filter(ImageFilter.FIND_EDGES)
-    contour = np.array(contour)
-
-    # make sure borders are not drawn before changing width
-    contour[[0, -1], :] = 0
-    contour[:, [0, -1]] = 0
-
-    return contour
-
-
-def _draw_contour(wc, img: Image):
-    """Slightly tweaked copy of internal WorldCloud function to allow transparency.
-
-    Parameters:
-        wc (WordCloud): Wordcloud object.
-        img (Image): Image to work with.
-
-    Returns:
-        Image: Modified image.
-
-    """
-    contour = _get_contour_mask(wc, img.width, img.height)
-    if contour is None:
-        return img
-
-    # use gaussian to change width, divide by 10 to give more resolution
-    radius = wc.contour_width / 10
-    contour = Image.fromarray(contour)
-    contour = contour.filter(ImageFilter.GaussianBlur(radius=radius))
-    contour = np.array(contour) > 0
-    if img.mode == "RGBA":
-        contour = np.dstack((contour, contour, contour, contour))
-    else:
-        contour = np.dstack((contour, contour, contour))
-
-    # color the contour
-    ret = np.array(img) * np.invert(contour)
-    color = np.array(Image.new(img.mode, img.size, wc.contour_color))
-    ret += color * contour
-
-    return Image.fromarray(ret)
-
-
-# Replace WordCloud function with one that can handle transparency
-wcloud.WordCloud._draw_contour = _draw_contour
-
-
-def _numpy_to_svg(contour, radius=1, color="black"):
-    """Convert a numpy array to SVG.
-
-    Parameters:
-        contour (np.Array): Image to be converted.
-        radius (float): The radius of the contour to draw.
-        color(string): the color to draw it as, e.g. "red".
-
-    Returns:
-        str: The SVG representation.
-    """
-    svg_elements = []
-    points = np.array(contour.nonzero()).T
-    for y, x in points:
-        svg_elements.append(f'')
-
-    return "\n".join(svg_elements)
-
-
-def random_color_darker(random_state=None):
-    """Random color generation function.
-
-    Parameters:
-        random_state (Random or None): Previous state of random generation for next color generation.
-
-    Returns:
-        str: Represents a hue, saturation, and lightness.
-
-    """
-    if random_state is None:
-        random_state = Random()
-    return f"hsl({random_state.randint(0, 255)}, {random_state.randint(50, 100)}%, {random_state.randint(0, 50)}%)"
-
-
-class ColormapColorFunc:
-    """Represents a colormap."""
-
-    def __init__(self, colormap="nipy_spectral", color_range=(0.0, 0.5), color_step_range=(0.15, 0.25)):
-        """Initialize a word cloud color generator.
-
-        Parameters:
-            colormap (str, optional): The name of the matplotlib colormap to use for generating colors.
-                Defaults to 'nipy_spectral'.
-            color_range (tuple of float, optional): A tuple containing the minimum and maximum values to use
-                from the colormap. Defaults to (0.0, 0.5).
-            color_step_range (tuple of float, optional): A tuple containing the minimum and maximum values to step
-                through the colormap. Defaults to (0.15, 0.25).
-                This is the speed at which it goes through the range chosen.
-                .25 means it will go through 1/4 of the range each pick.
-        """
-        self.colormap = mp1.colormaps[colormap]
-        self.color_range = color_range
-        self.color_step_range = color_step_range
-        self.current_fraction = random.uniform(0, 1)  # Start at a random point
-
-    def color_func(self, word, font_size, position, orientation, random_state=None, **kwargs):
-        """Update the current color fraction and wrap around if necessary."""
-        color_step = random.uniform(*self.color_step_range)
-        self.current_fraction = (self.current_fraction + color_step) % 1.0
-
-        # Scale the fraction to the desired range
-        scaled_fraction = self.color_range[0] + (self.current_fraction * (self.color_range[1] - self.color_range[0]))
-
-        # Get the color from the colormap
-        color = self.colormap(scaled_fraction)
-
-        return tuple(int(c * 255) for c in color[:3])  # Convert to RGB format
-
-
-default_color_func = ColormapColorFunc().color_func
diff --git a/pyproject.toml b/pyproject.toml
index 373327e48..8ce487a95 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -46,16 +46,14 @@ dependencies = [
     "et-xmlfile",
     "inflect",
     "jsonschema",
-    "matplotlib>=3.9.0",
     "numpy",
     "openpyxl",
-    "pandas",
+    "pandas>=2.2.3,<3.0.0",
     "portalocker",
     "python-dateutil",
     "pytz",
     "semantic-version",
-    "six",
-    "wordcloud==1.9.5"
+    "six"
 ]
 
 [project.urls]
diff --git a/requirements.txt b/requirements.txt
index dfd0a7f02..77a02e249 100644
--- a/requirements.txt
+++ b/requirements.txt
@@ -3,11 +3,8 @@ click-option-group>=0.5.0
 defusedxml>=0.7.1
 inflect>=7.5.0
 jsonschema>=4.23.0
-matplotlib>=3.9.0
 numpy>=2.0.2
 openpyxl>=3.1.5
-pandas>=2.2.3
-pillow>=11.2.1
+pandas>=2.2.3,<3.0.0
 portalocker>=3.1.1
 semantic-version>=2.10.0
-wordcloud>=1.9.4
diff --git a/tests/data/visualization/word_mask.png b/tests/data/visualization/word_mask.png
deleted file mode 100644
index e235d063e..000000000
Binary files a/tests/data/visualization/word_mask.png and /dev/null differ
diff --git a/tests/tools/visualization/__init__.py b/tests/tools/visualization/__init__.py
deleted file mode 100644
index e69de29bb..000000000
diff --git a/tests/tools/visualization/test_tag_word_cloud.py b/tests/tools/visualization/test_tag_word_cloud.py
deleted file mode 100644
index 791d22946..000000000
--- a/tests/tools/visualization/test_tag_word_cloud.py
+++ /dev/null
@@ -1,202 +0,0 @@
-import unittest
-import wordcloud
-from hed.tools.visualization import tag_word_cloud
-from hed.tools.visualization.tag_word_cloud import load_and_resize_mask
-import matplotlib.font_manager as fm
-
-import numpy as np
-from PIL import Image, ImageDraw
-import os
-
-
-class TestWordCloudFunctions(unittest.TestCase):
-    @classmethod
-    def setUpClass(cls):
-        cls.mask_path = os.path.realpath(os.path.join(os.path.dirname(__file__), "../../data/visualization/word_mask.png"))
-
-    def test_create_wordcloud(self):
-        word_dict = {"tag1": 5, "tag2": 3, "tag3": 7}
-        width = 400
-        height = 200
-        wc = tag_word_cloud.create_wordcloud(word_dict, width=width, height=height)
-
-        self.assertIsInstance(wc, wordcloud.WordCloud)
-        self.assertEqual(wc.width, width)
-        self.assertEqual(wc.height, height)
-
-    def test_create_wordcloud_font_direct(self):
-        word_dict = {"tag1": 5, "tag2": 3, "tag3": 7}
-        width = 400
-        height = 200
-
-        fonts = fm.findSystemFonts()
-        if not fonts:
-            self.skipTest("No system fonts found")
-
-        # Try to find a valid TrueType/OpenType font
-        font_path = None
-        for font_candidate in fonts:
-            if font_candidate.lower().endswith((".ttf", ".otf", ".ttc")):
-                font_path = os.path.realpath(font_candidate)
-                try:
-                    # Test if the font can actually be loaded
-                    tag_word_cloud.create_wordcloud(word_dict, width=width, height=height, font_path=font_path)
-                    # If successful, use this font for the actual test
-                    break
-                except (OSError, Exception):
-                    # This font doesn't work, try the next one
-                    font_path = None
-                    continue
-
-        if font_path is None:
-            self.skipTest("No valid TrueType/OpenType fonts found on system")
-
-        wc = tag_word_cloud.create_wordcloud(word_dict, width=width, height=height, font_path=font_path)
-
-        self.assertIsInstance(wc, wordcloud.WordCloud)
-        self.assertEqual(wc.width, width)
-        self.assertEqual(wc.height, height)
-        self.assertIn(font_path, wc.font_path)
-
-    def test_create_wordcloud_default_params(self):
-        word_dict = {"tag1": 5, "tag2": 3, "tag3": 7}
-        wc = tag_word_cloud.create_wordcloud(word_dict)
-
-        self.assertIsInstance(wc, wordcloud.WordCloud)
-        self.assertEqual(wc.width, 400)
-        self.assertEqual(wc.height, 300)
-
-    def test_mask_scaling(self):
-        word_dict = {"tag1": 5, "tag2": 3, "tag3": 7}
-        wc = tag_word_cloud.create_wordcloud(word_dict, self.mask_path, width=300, height=300)
-
-        self.assertIsInstance(wc, wordcloud.WordCloud)
-        self.assertEqual(wc.width, 300)
-        self.assertEqual(wc.height, 300)
-
-    def test_mask_scaling2(self):
-        word_dict = {"tag1": 5, "tag2": 3, "tag3": 7}
-        wc = tag_word_cloud.create_wordcloud(word_dict, self.mask_path, width=300, height=None)
-
-        self.assertIsInstance(wc, wordcloud.WordCloud)
-        self.assertEqual(wc.width, 300)
-        self.assertLess(wc.height, 300)
-
-    def test_create_wordcloud_with_empty_dict(self):
-        # Test creation of word cloud with an empty dictionary
-        word_dict = {}
-        with self.assertRaises(ValueError):
-            tag_word_cloud.create_wordcloud(word_dict)
-
-    def test_create_wordcloud_with_single_word(self):
-        # Test creation of word cloud with a single word
-        word_dict = {"single_word": 1}
-        wc = tag_word_cloud.create_wordcloud(word_dict)
-        self.assertIsInstance(wc, wordcloud.WordCloud)
-        # Check that the single word is in the word cloud
-        self.assertIn("single_word", wc.words_)
-
-    def test_valid_word_cloud(self):
-        word_dict = {"tag1": 5, "tag2": 3, "tag3": 7}
-        wc = tag_word_cloud.create_wordcloud(word_dict, mask_path=self.mask_path, width=400, height=None)
-        svg_output = tag_word_cloud.word_cloud_to_svg(wc)
-        self.assertTrue(svg_output.startswith(""))
-        self.assertIn("fill:rgb", svg_output)
-
-
-class TestLoadAndResizeMask(unittest.TestCase):
-    @classmethod
-    def setUpClass(cls):
-        # Create a simple black and white image
-        cls.original_size = (300, 200)
-        cls.img = Image.new("L", cls.original_size, 255)  # Start with a white image
-
-        # Draw a black circle in the middle of the image
-        d = ImageDraw.Draw(cls.img)
-        circle_radius = min(cls.original_size) // 4
-        circle_center = (cls.original_size[0] // 2, cls.original_size[1] // 2)
-        d.ellipse(
-            (
-                circle_center[0] - circle_radius,
-                circle_center[1] - circle_radius,
-                circle_center[0] + circle_radius,
-                circle_center[1] + circle_radius,
-            ),
-            fill=0,
-        )
-        cls.img_path = "temp_img.png"
-        cls.img.save(cls.img_path)
-
-        # Start with a black fully transparent image
-        cls.img_trans = Image.new("RGBA", cls.original_size, (0, 0, 0, 0))
-
-        # Draw a black opaque circle in the middle
-        d = ImageDraw.Draw(cls.img_trans)
-        circle_radius = min(cls.original_size) // 4
-        circle_center = (cls.original_size[0] // 2, cls.original_size[1] // 2)
-        d.ellipse(
-            (
-                circle_center[0] - circle_radius,
-                circle_center[1] - circle_radius,
-                circle_center[0] + circle_radius,
-                circle_center[1] + circle_radius,
-            ),
-            fill=(0, 0, 0, 255),
-        )
-        cls.img_path_trans = "temp_img_trans.png"
-        cls.img_trans.save(cls.img_path_trans)
-
-    @classmethod
-    def tearDownClass(cls):
-        # Clean up the temp image
-        os.remove(cls.img_path)
-        os.remove(cls.img_path_trans)
-
-    def test_no_resizing(self):
-        mask = load_and_resize_mask(self.img_path)
-        mask_img = Image.fromarray(mask)
-        self.assertEqual((mask_img.width, mask_img.height), self.original_size)
-
-    def test_width_resizing(self):
-        width = 150
-        mask = load_and_resize_mask(self.img_path, width=width)
-        mask_img = Image.fromarray(mask)
-        expected_width, expected_height = width, int(self.original_size[1] * width / self.original_size[0])
-        self.assertEqual((mask_img.width, mask_img.height), (expected_width, expected_height))
-
-    def test_height_resizing(self):
-        height = 100
-        mask = load_and_resize_mask(self.img_path, height=height)
-        mask_img = Image.fromarray(mask)
-        expected_shape = (int(self.original_size[0] * height / self.original_size[1]), height)
-        self.assertEqual((mask_img.width, mask_img.height), expected_shape)
-
-    def test_both_dimensions_resizing(self):
-        width, height = 100, 75
-        mask = load_and_resize_mask(self.img_path, width=width, height=height)
-        self.assertEqual(mask.shape, (height, width))
-
-    def test_mask_color(self):
-        mask = load_and_resize_mask(self.img_path)
-        # The mask should have 0 and 1, and no other values
-        unique_values = np.unique(mask)
-        self.assertCountEqual(unique_values, [0, 255])
-
-    def test_transparent_mask(self):
-        mask = load_and_resize_mask(self.img_path_trans)
-        # The mask should have 0 and 1, and no other values
-        unique_values = np.unique(mask)
-        self.assertCountEqual(unique_values, [0, 255])
-
-        mask = load_and_resize_mask(self.img_path_trans, width=500)
-        # The mask should have 0 and 1, and no other values
-        unique_values = np.unique(mask)
-        self.assertCountEqual(unique_values, [0, 255])
-        # Verify sizes
-        self.assertEqual(mask.shape, (333, 500))
-
-        mask_img = Image.fromarray(mask)
-        expected_width, expected_height = 500, int(self.original_size[1] * 500 / self.original_size[0])
-        self.assertEqual((mask_img.width, mask_img.height), (expected_width, expected_height))
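
Note on the removed mask handling: the sizing and thresholding rules in the deleted `load_and_resize_mask` can be restated without NumPy or Pillow, which may help anyone checking downstream code that relied on them. The following is a minimal sketch derived only from the deleted lines above. It is not taken from hed-vis, and the names `scaled_size` and `mask_value` are hypothetical helpers that exist in neither package.

```python
def scaled_size(original, width=None, height=None):
    """Return the (width, height) the removed resize rule would produce.

    `original` is the (width, height) of the mask image. When only one target
    dimension is given, the other is scaled to keep the aspect ratio and then
    truncated, mirroring the `astype(int)` in the deleted code.
    """
    orig_w, orig_h = original
    if not width and not height:
        return orig_w, orig_h  # no resizing requested
    if not height:
        scale = orig_w / width
        return int(orig_w / scale), int(orig_h / scale)
    if not width:
        scale = orig_h / height
        return int(orig_w / scale), int(orig_h / scale)
    return width, height  # both given: aspect ratio is not preserved


def mask_value(rgba):
    """Binarize one RGBA pixel: transparent or light pixels map to 255, others to 0."""
    r, g, b, a = rgba
    is_light = r > 127 and g > 127 and b > 127
    return 255 if a < 128 or is_light else 0


if __name__ == "__main__":
    print(scaled_size((300, 200), width=150))
    print(mask_value((0, 0, 0, 255)))
```

The sizes used in the deleted `TestLoadAndResizeMask` cases (a 300x200 source image) are convenient inputs for exercising `scaled_size`.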