bigdatainfrastructure

Here are 9 public repositories matching this topic...

Ahmedbakr78 / engineer-a-production-grade-end-to-end-Big-Data-pipeline-that-demonstrates-advanced-proficiency

An end-to-end Big Data pipeline that demonstrates advanced proficiency in distributed computing, data engineering, and automated validation. Rather than relying on a single script or a simplified analytics task, the project is designed to simulate a real-world enterprise data architecture where raw data is generated at scale

bigdatacloud bigdataanalytics bigdatainfrastructure

Updated May 6, 2026
Shell

divithraju / pyspark-examples

PySpark RDD, DataFrame, and Dataset examples in Python

Updated Sep 8, 2024
Python

dablro12 / Potential-Hospital-Location-Variable-in-Yongin-si

2023-2 Big Data Project: Exploring areas underserved by emergency medical care and validating latent variables

machine-learning-algorithms bigdatainfrastructure

Updated Dec 12, 2023
HTML

divithraju / awesome-spark

A curated list of awesome Apache Spark packages and resources.

programming-language open-source package apache-spark bigdata python3 pyspark software-engineering contribution mit-license forked-repo dataengineering bigdatainfrastructure

Updated Sep 9, 2024
Python

eversonfilipe / bigdata-wyden-atendimento-clinica-veterinaria

This is an academic project for the "BigData" course in the Computer Science and ADS programs at Centro Universitário do Vale do Ipojuca. It consists of service-management software for a regional veterinary clinic.

development bigdata academic-project academic-website project-repository bigdatainfrastructure

Updated Sep 6, 2025

JuanParias29 / BigDataProcessing

Repository with data processing projects and labs using Databricks, Apache Spark, and Python. Covers key concepts of Big Data, storage, processing, analysis, and machine learning.

python data-science sql database apache-spark bigdata nosql-database bigdataanalytics bigdatainfrastructure

Updated Jul 31, 2025
Jupyter Notebook

kevinndungu-source / Amazon_EMR_Serverless_Demonstration

Explore the capabilities of Amazon EMR Serverless by processing semi-structured review data with Apache Spark, showcasing efficient big data analysis without managing clusters.

python apache-spark sql-query dataprocessing bigdatacloud emrserverless bigdatainfrastructure

Updated Jun 19, 2024
Python

kevinndungu-source / Amazon_EMR_Project_Resources

Explore and replicate Amazon EMR (Elastic MapReduce) setup and utilization for big data processing and analytics tasks, featuring comprehensive demonstrations from VPC creation to Spark job execution.

python bigdata pyspark aws-ec2 dataprocessing datamanagement emr-cluster juypter-notebook bigdatainfrastructure

Updated Jun 19, 2024
Jupyter Notebook

atlas555 / atlas555.github.io

a personal blog

blog technology bigdata reading bigdatainfrastructure

Updated May 18, 2026
HTML
