apache-hive

Here are 14 public repositories matching this topic...

Ren294 / SmartTraffic_Lakehouse_for_HCMC

A Smart Traffic Management System for Ho Chi Minh City, Vietnam, leveraging batch and real-time data processing, intuitive dashboards, and monitoring tools to optimize traffic flow, enhance safety, and support sustainable urban mobility through advanced analytics and user-friendly applications.

Updated Jan 17, 2025
Python

alikemalocalan / airflow-hdinsight-operators

Azure HDInsight Operators For Apache Airflow

apache-spark azure azure-hdinsight apache-airflow apache-hive airflow-hdinsight-operators

Updated Nov 11, 2024
Python

ChahiriAbderrahmane / Sales-analytics-Data-Lakehouse

This project simulates a real-world enterprise data migration and modernization strategy. It extracts transactional data from a simulated "On-Premise" environment (hosted on AWS EC2), performs heavy distributed processing using a Hadoop/Spark cluster, and ultimately serves the data via a cloud-native, serverless architecture to optimize costs.

sql-server apache-spark dashboard postgresql ec2-instance amazon-s3 amazon-athena apache-hive apache-sqoop apache-hadoop snowflake-schema scd-type-2 amazon-quicksight medallion-architecture

Updated Mar 19, 2026
Python

sawallesalfo / Big-Data-Technologies

Big Data technologies can be defined as software tools for analyzing, processing, and extracting data from extremely large and complex data sets that traditional data management tools cannot handle.

apache-spark apache-kafka apache-hive apache-hadoop apache-hbase pysark

Updated Apr 30, 2022
Python

tspannhw / nifi-smartplug

Apache NiFi - Apache MiNiFi - TP-Link SmartPlug

python smart-meter apache-nifi apache-hive hs110

Updated Oct 25, 2018
Python

shre1000 / Big-Data

Big Data Analysis

sql big-data hadoop sentiment-analysis machine-learning-algorithms python3 nltk hdfs matplotlib hadoop-cluster hadoop-mapreduce apache-flume apache-hive live-graph

Updated May 19, 2018
Python

mikeacosta / san-francisco-crime

SF crime data analysis with Apache Spark

apache-spark hadoop hdfs hortonworks apache-hive

Updated May 21, 2020
Python

lifeislearningforever / wikipedia-crawler-hive

Production-ready Wikipedia crawler with PySpark and Apache Hive integration. Extracts article data and stores it in Hive with Parquet format and date partitioning.

python wikipedia pyspark data-engineering web-scraping parquet data-pipeline apache-hive

Updated Dec 22, 2025
Python

ranjithmathi1992-ECE / medicine-supply-pipeline

End-to-end Big Data ETL pipeline that tracks medicine stock across hospital branches. Automatically detects expired medicines and low-stock situations daily. Built with Python, MySQL, Hadoop HDFS, Apache Hive, and Apache Airflow DAG automation.

mysql python airflow medicine big-data hive hadoop supply-chain data-engineering healthcare hdfs apache-hive etl-pipeline apache-hadoop

Updated Apr 21, 2026
Python

mesmacosta / hive-table-metadata-generator

This script generates random metadata for the Hive metastore.

metadata bigdata datawarehouse apache-hive

Updated Nov 15, 2019
Python

Sebislaw / Crypto-Options-vs-Rates

Big Data project integrating Polymarket prediction data and Binance cryptocurrency rates to analyse relationships between market expectations and real prices.

data-science big-data apache-spark hadoop cryptocurrency apache-kafka apache-nifi lambda-architecture apache-hive binance apache-hbase polymarket

Updated Jan 19, 2026
Python

Emad-AlKhorasani / BigData-Pipeline-Architect

An automated engine that bridges Data Engineering and System Architecture. It fetches real-world Kaggle datasets and dynamically generates professional pipeline diagrams using Architecture-as-Code (AaC).

python automation kafka big-data apache-spark pipeline data-engineering hdfs apache-hive

Updated Dec 30, 2025
Python

fabiogiulitti / DBCatcher

Database client designed to be light and accessible

playground database mongodb accessibility a11y postgresql pytest apache-hive apache-kyuubi

Updated Apr 30, 2026
Python

Kaiha0 / WQD7007-BDM-Big-Data-Implementation-in-Tourism-The-Walt-Disney-Company-Case-Study

Comparative analysis of Apache Hive vs. Apache Spark on GCP Dataproc for analyzing 3.5M+ Disneyland attraction wait times and predictive modeling.

machine-learning big-data apache-spark hadoop pyspark apache-hive gcp-dataproc tourism-analytics

Updated Feb 13, 2026
Python
