3 changes: 2 additions & 1 deletion src/components/fundable/MenuSideBar.tsx
@@ -5,7 +5,8 @@ import styles from "./styles.module.css";
const sections = [
{ id: 'jupyter-ecosystem', label: 'Jupyter ecosystem' },
{ id: 'package-management', label: 'Package management' },
{ id: 'scientific-computing', label: 'Scientific computing' }
{ id: 'scientific-computing', label: 'Scientific computing' },
{ id: 'apache-arrow', label: 'Apache Arrow and Parquet' }
];

export default function MenuSideBar() {
43 changes: 43 additions & 0 deletions src/components/fundable/descriptions/BinaryViewInArrowCpp.md
@@ -0,0 +1,43 @@
#### Overview

Apache Arrow is the universal columnar format and multi-language toolbox for fast data interchange and in-memory analytics.

Representation of string and binary data in Arrow traditionally uses the Binary layout, where the entire string data resides in a separate buffer that is accessed using indirect indexing from a buffer of offsets.

Recently, the Arrow project added the Binary View layout, a more efficient layout inspired by modern execution engines, where the beginning of each string is packed directly within the offsets buffer. This allows short strings to be read and processed directly without going through an additional indirection.

However, while basic support is present, Binary View is not yet supported by all Arrow components.

We propose to finish implementing support for Binary View and String View types in all components of Arrow C++:

* scalar compute kernels:
- `equal`, `less_equal`, etc.
- `is_in`, `index_in`
- `ascii_*`, `binary_*`, `utf8_*`
- `string_is_ascii`
- `count_substring`
- `extract_regex`, `extract_regex_span`
- `split_pattern`, `split_pattern_regex`
- `coalesce`

* vector compute kernels:
- `take`, `filter`, `scatter`
- `run_end_encode`, `run_end_decode`
- `sort_indices`, `rank`, `rank_normal`, `rank_quantile`
- `partition_nth_indices`
- `select_k_unstable`
- `replace_with_mask`
- `fill_null_forward`, `fill_null_backward`, `drop_null`

* aggregate compute kernels:
- `count_distinct`
- `first`, `last`, `min`, `max`
- `index`

* CSV reader and writer

* ORC reader and writer

Funders can decide to fund the entire package, or choose the components they are interested in.

##### Are you interested in funding this project, entirely or partially? Contact us for more information on how to help.
6 changes: 6 additions & 0 deletions src/components/fundable/index.tsx
@@ -37,6 +37,12 @@ export function MainAreaFundableProjects() {
projectCategory={fundableProjectsDetails.scientificComputing}
/>
</section>
<section id="apache-arrow">
<ProjectCategory
projectCategoryName={"Apache Arrow and Parquet"}
projectCategory={fundableProjectsDetails.apacheArrow}
/>
</section>
<section id="propose-and-fund-a-project">
<h2 className={styles.project_category_header} style={{ margin: "0px" }}>Can't find a project?</h2>
<p style={{ marginTop: "var(--ifm-spacing-lg)" }}>If you have a project in mind that you think would be relevant to our expertise, please contact us to discuss it.</p>
17 changes: 17 additions & 0 deletions src/components/fundable/projectsDetails.ts
@@ -4,6 +4,7 @@ import JupyterGISToolsForPythonAPIMD from "@site/src/components/fundable/descrip
import EmscriptenForgePackageRequestsMD from "@site/src/components/fundable/descriptions/EmscriptenForgePackageRequests.md"
import SVE2SupportInXsimdMD from "@site/src/components/fundable/descriptions/SVE2SupportInXsimd.md"
import MatrixOperationsInXtensor from "@site/src/components/fundable/descriptions/MatrixOperationsInXtensor.md"
import BinaryViewInArrowCpp from "@site/src/components/fundable/descriptions/BinaryViewInArrowCpp.md"

export const fundableProjectsDetails = {
jupyterEcosystem: [
@@ -84,6 +85,22 @@ export const fundableProjectsDetails = {
currentFundingPercentage: 0,
repoLink: "https://github.com/xtensor-stack/xtensor"
}
],

apacheArrow: [
{
category: "Apache Arrow and Parquet",
Member Author
(I'm not sure why this is duplicated from projectCategoryName above)

title: "Complete BinaryView / StringView support in Arrow C++",
pageName: "BinaryViewInApacheArrow",
shortDescription: "BinaryView is a more recent and more efficient alternative to Arrow's standard Binary type. It allows for inlined storage of short strings and fast prefix comparison.",
description: BinaryViewInArrowCpp,
price: "TBD",
maxNbOfFunders: 4,
currentNbOfFunders: 0,
currentFundingPercentage: 0,
repoLink: "https://github.com/apache/arrow"
}
]

}

9 changes: 9 additions & 0 deletions src/pages/fundable/BinaryViewInApacheArrow/GetAQuote.tsx
@@ -0,0 +1,9 @@
import useDocusaurusContext from '@docusaurus/useDocusaurusContext';
import GetAQuotePage from '@site/src/components/fundable/GetAQuotePage';

export default function FundablePage() {
const { siteConfig } = useDocusaurusContext();
return (
<GetAQuotePage/>
);
}
9 changes: 9 additions & 0 deletions src/pages/fundable/BinaryViewInApacheArrow/index.tsx
@@ -0,0 +1,9 @@
import useDocusaurusContext from '@docusaurus/useDocusaurusContext';
import LargeProjectCardPage from '@site/src/components/fundable/LargeProjectCardPage';

export default function FundablePage() {
const { siteConfig } = useDocusaurusContext();
return (
<LargeProjectCardPage/>
);
}
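
For readers unfamiliar with the two layouts described in `BinaryViewInArrowCpp.md`, the difference can be sketched roughly as follows. This is an illustration only, not Arrow's implementation or API: every name here (`buildBinary`, `buildBinaryView`, `readBinaryView`, `prefixOf`, and so on) is hypothetical. It assumes ASCII input, a single data buffer, and the 16-byte view described in the Arrow columnar format (a 4-byte length followed either by up to 12 inline bytes, or by a 4-byte prefix, a buffer index, and an offset).

```typescript
// Illustrative sketch only; this is not Arrow's API. All names are hypothetical.
const VIEW_SIZE = 16;    // bytes per view in the Binary View layout
const INLINE_LIMIT = 12; // values up to this length are stored inside the view

// ASCII-only helpers keep the sketch dependency-free.
function toBytes(s: string): Uint8Array {
  const out = new Uint8Array(s.length);
  for (let i = 0; i < s.length; i++) out[i] = s.charCodeAt(i);
  return out;
}
function fromBytes(b: Uint8Array): string {
  let s = "";
  for (let i = 0; i < b.length; i++) s += String.fromCharCode(b[i]);
  return s;
}

// Binary layout: n + 1 offsets into one contiguous data buffer.
interface BinaryArray { offsets: Int32Array; data: Uint8Array; }

function buildBinary(values: string[]): BinaryArray {
  const offsets = new Int32Array(values.length + 1);
  for (let i = 0; i < values.length; i++) offsets[i + 1] = offsets[i] + values[i].length;
  const data = new Uint8Array(offsets[values.length]);
  for (let i = 0; i < values.length; i++) data.set(toBytes(values[i]), offsets[i]);
  return { offsets, data };
}

function readBinary(a: BinaryArray, i: number): string {
  // Every read is indirect: offsets first, then the data buffer.
  return fromBytes(a.data.subarray(a.offsets[i], a.offsets[i + 1]));
}

// Binary View layout: one 16-byte view per value.
//   bytes 0-3   : length (int32, little-endian here)
//   short value : bytes 4-15 hold the data inline, zero-padded
//   long value  : bytes 4-7 prefix, bytes 8-11 buffer index, bytes 12-15 offset
interface BinaryViewArray { views: Uint8Array; buffers: Uint8Array[]; }

function buildBinaryView(values: string[]): BinaryViewArray {
  const views = new Uint8Array(values.length * VIEW_SIZE);
  const dv = new DataView(views.buffer);
  const longBytes: number[] = [];
  values.forEach((v, i) => {
    const bytes = toBytes(v);
    const base = i * VIEW_SIZE;
    dv.setInt32(base, bytes.length, true);
    if (bytes.length <= INLINE_LIMIT) {
      views.set(bytes, base + 4);                     // whole value lives in the view
    } else {
      views.set(bytes.subarray(0, 4), base + 4);      // 4-byte prefix
      dv.setInt32(base + 8, 0, true);                 // single data buffer in this sketch
      dv.setInt32(base + 12, longBytes.length, true); // offset within that buffer
      for (let k = 0; k < bytes.length; k++) longBytes.push(bytes[k]);
    }
  });
  return { views, buffers: [new Uint8Array(longBytes)] };
}

function readBinaryView(a: BinaryViewArray, i: number): string {
  const dv = new DataView(a.views.buffer);
  const base = i * VIEW_SIZE;
  const length = dv.getInt32(base, true);
  if (length <= INLINE_LIMIT) {
    // Short value: no second buffer is touched.
    return fromBytes(a.views.subarray(base + 4, base + 4 + length));
  }
  const buffer = a.buffers[dv.getInt32(base + 8, true)];
  const offset = dv.getInt32(base + 12, true);
  return fromBytes(buffer.subarray(offset, offset + length));
}

// The first (up to) 4 bytes of any value can be read from the view alone.
function prefixOf(a: BinaryViewArray, i: number): string {
  const base = i * VIEW_SIZE;
  const length = new DataView(a.views.buffer).getInt32(base, true);
  return fromBytes(a.views.subarray(base + 4, base + 4 + Math.min(length, 4)));
}
```

In this sketch, `readBinaryView` returns short values straight from the views buffer, and `prefixOf` never touches a data buffer, which is the property the short description refers to as inlined storage and fast prefix comparison.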