Architecture For Data Ingestion, Clean Processing and Visualization (younesse)
Third-Party Data:
To supplement our internally generated data, I'll use AWS Glue, a managed ETL service that runs Apache
Spark jobs, so it works much like our familiar Hadoop-based tools. Glue fetches data from various
sources, performs transformations, and stores the processed data in our S3 storage.
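
As a rough illustration of how such a Glue job might be registered, the sketch below builds the keyword arguments for boto3's `create_job` call and keeps the AWS call itself in a separate function. The job name, IAM role, bucket paths, the `--target_path` argument, and the worker sizing are all hypothetical placeholders, not values from our actual setup.

```python
def build_glue_job_definition(job_name, role_arn, script_s3_uri, target_s3_uri):
    """Return keyword arguments suitable for boto3's glue.create_job().

    All values passed in are placeholders chosen for illustration.
    """
    return {
        "Name": job_name,
        "Role": role_arn,
        "Command": {
            "Name": "glueetl",  # Spark-based ETL job type
            "ScriptLocation": script_s3_uri,
            "PythonVersion": "3",
        },
        "DefaultArguments": {
            # Hypothetical custom argument read by the ETL script itself.
            "--target_path": target_s3_uri,
        },
        "GlueVersion": "4.0",   # assumed version
        "WorkerType": "G.1X",   # assumed sizing
        "NumberOfWorkers": 2,
    }


def create_and_start(job_definition):
    """Register the job and start one run (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the builder works without boto3

    glue = boto3.client("glue")
    glue.create_job(**job_definition)
    run = glue.start_job_run(JobName=job_definition["Name"])
    return run["JobRunId"]


job = build_glue_job_definition(
    job_name="third-party-ingest",
    role_arn="arn:aws:iam::123456789012:role/ExampleGlueRole",
    script_s3_uri="s3://example-bucket/scripts/ingest.py",
    target_s3_uri="s3://example-bucket/processed/",
)
```

Calling `create_and_start(job)` would submit the job. Keeping the definition as a plain dictionary makes it easy to review or version before anything is sent to AWS.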
Data Analysis:
For complex analysis, I've opted for Amazon EMR (Elastic MapReduce). It fits my current approach because
it provides a managed Hadoop framework: I can run Apache Spark and Hive jobs on it, just as I do with
our existing setup.
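
To make the Spark and Hive workflow on EMR a little more concrete, here is a minimal sketch of how steps could be described for boto3's `add_job_flow_steps` call, using EMR's `command-runner.jar`. The step names, script locations, and cluster ID are hypothetical, and the failure policy is just one reasonable choice.

```python
def spark_step(name, script_s3_uri):
    """Describe an EMR step that runs a PySpark script via spark-submit."""
    return {
        "Name": name,
        "ActionOnFailure": "CONTINUE",  # assumed policy: keep the cluster running
        "HadoopJarStep": {
            "Jar": "command-runner.jar",
            "Args": ["spark-submit", "--deploy-mode", "cluster", script_s3_uri],
        },
    }


def hive_step(name, script_s3_uri):
    """Describe an EMR step that runs a Hive script stored in S3."""
    return {
        "Name": name,
        "ActionOnFailure": "CONTINUE",
        "HadoopJarStep": {
            "Jar": "command-runner.jar",
            "Args": ["hive-script", "--run-hive-script", "--args", "-f", script_s3_uri],
        },
    }


def submit_steps(cluster_id, steps):
    """Send the steps to a running cluster (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the builders work without boto3

    emr = boto3.client("emr")
    response = emr.add_job_flow_steps(JobFlowId=cluster_id, Steps=steps)
    return response["StepIds"]


# Hypothetical script locations, for illustration only.
steps = [
    spark_step("analysis-spark", "s3://example-bucket/jobs/analysis.py"),
    hive_step("analysis-hive", "s3://example-bucket/jobs/report.q"),
]
```

With a running cluster, `submit_steps("j-EXAMPLECLUSTERID", steps)` would queue both jobs. The cluster ID shown is a placeholder.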