
Informatica PowerCenter – This component processes huge volumes of data and can convert a
local repository into a global one. It supports ERP sources as well as diverse local and global
repositories.

Informatica PowerConnect – This component mines bulk raw data and extracts meaningful
insights and metadata from ERPs and third-party applications.

Informatica PowerMart – This component processes comparatively smaller volumes of data and
supports local repositories only. Unlike PowerCenter, PowerMart does not support ERP sources
or global repositories.

Informatica PowerExchange – This component supports batch, real-time and change data
capture options in various set-ups. It allows companies to leverage their data without manually
coding data extraction programs.

Informatica PowerAnalyzer – This component provides various reporting facilities that give
companies a clear view of their business processes. It offers versatile benefits, ranging from
accessing and examining enterprise data to sharing it in a lucid way.

Informatica Power Quality – This component scales services so they can be shared across multiple
machines. It consists of a set of applications and components that improve enterprise-wide
data quality.

Informatica Architecture Overview


Informatica has a Service-Oriented Architecture (SOA) which consists of the following components:

Informatica Domain – It is an administrative unit consisting of nodes and services. These nodes
and services can be further categorized into folders and subfolders. There are basically two types
of services in the Informatica Domain – the Service Manager and the Application Services. While the
former is responsible for authenticating and authorizing logins and for running the application
services, the latter comprises the integration, repository and reporting services.

Repository Service – This service maintains the connection between the clients and the PowerCenter
repository. It is a multi-threaded process that fetches, inserts and updates the metadata, and it
maintains consistency of the repository metadata.

Nodes – Nodes are the computing platforms where the aforementioned services are executed.

Reporting Service – Reporting services are responsible for handling the metadata and allowing
other services to access it.
Integration Service – This service is the engine that executes the tasks created in the
Informatica tool. It is essentially a process inside the server waiting for tasks to be assigned.
As soon as a workflow is started, the Integration Service picks up its details and executes it.

PowerCenter Designer – It is a developer tool used for creating ETL mappings between source
and target.

Workflow Manager – Responsible for creating workflows/tasks and executing them.

Workflow Monitor – Responsible for monitoring the execution of workflows.

Repository Manager – It manages the objects in the repository.

Data Marts
A data mart can be defined as a subset of an organization's data warehouse that is limited to a
specific business unit or group of users. It is a subject-oriented database and is also known as a
High Performance Query Structure (HPQS).

Data marts are of two types – Dependent and Independent.

Dependent Data Mart – This data mart depends on the enterprise data warehouse and is built in a
top-down manner.

Independent Data Mart – This data mart does not depend on the enterprise data warehouse and is
built in a bottom-up manner.

Benefits of Data Marts


 Allows data to be accessed in less time

 Cost-efficient alternative to a bulky data warehouse

 Easy to use, as it is designed around the needs of a specific user group

 Speeds up business processes.


Debt Manager – Collections and Recoveries for Loans and Credit Cards.

The Credit Risk Report server will compile the required data from the source systems (Debt Manager
and Transact SM).

 Strategic Reporting
 Facts and Dimensions
 In-memory processing with a huge cache
 Landing Zone
 Staging Zone

The pipeline is built upon the following AWS services and open-source software:

· AWS Step Functions to manage and orchestrate the event-driven process from file arrival to dataset publishing.

· AWS Lambda to execute short-running synchronous tasks (a minimal handler sketch follows this list).

· AWS Batch to execute long-running tasks in serverless containers.

· AWS Glue for writing the columnar files.

· Amazon Athena for querying the published files with SQL.
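
As an illustration of the short-running synchronous tasks mentioned above, here is a minimal, hypothetical Java Lambda handler that a Step Functions state machine could invoke. The event fields (an S3 object key) and the validation logic are illustrative assumptions, not part of the original pipeline.

import com.amazonaws.services.lambda.runtime.Context;
import com.amazonaws.services.lambda.runtime.RequestHandler;

import java.util.HashMap;
import java.util.Map;

// Hypothetical short-running task invoked by the Step Functions state machine.
// It receives the current state as a Map and returns an updated state.
public class FileArrivalHandler implements RequestHandler<Map<String, Object>, Map<String, Object>> {

    @Override
    public Map<String, Object> handleRequest(Map<String, Object> input, Context context) {
        // Assumed input field: the S3 key of the file that just arrived.
        String key = (String) input.getOrDefault("objectKey", "");

        Map<String, Object> output = new HashMap<>(input);
        // Placeholder validation: this sketch only accepts CSV files.
        output.put("valid", key.endsWith(".csv"));

        context.getLogger().log("Validated object: " + key);
        return output;
    }
}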

http://versent.com.au/insights/aws-re-invent-2017-recap-the-security-version

The results from GuardDuty can be pushed to Amazon CloudWatch Events to trigger AWS Lambda functions
that perform specific actions based on the type of issue discovered by GuardDuty.

https://aws.amazon.com/guardduty/
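
A sketch of the pattern described above: a Java Lambda function subscribed to a CloudWatch Events rule that matches GuardDuty findings. The event is received as a raw map for simplicity; the `detail`, `type` and `severity` fields follow the GuardDuty finding event format, while the remediation actions are placeholders and an assumption of this sketch.

import com.amazonaws.services.lambda.runtime.Context;
import com.amazonaws.services.lambda.runtime.RequestHandler;

import java.util.Map;

// Triggered by a CloudWatch Events rule that matches GuardDuty findings.
public class GuardDutyFindingHandler implements RequestHandler<Map<String, Object>, String> {

    @Override
    @SuppressWarnings("unchecked")
    public String handleRequest(Map<String, Object> event, Context context) {
        Map<String, Object> detail = (Map<String, Object>) event.get("detail");
        String findingType = (String) detail.get("type");
        double severity = ((Number) detail.get("severity")).doubleValue();

        context.getLogger().log("GuardDuty finding: " + findingType + " severity=" + severity);

        // Branch on severity; the actions below are placeholders for real remediation.
        if (severity >= 7.0) {
            // e.g. isolate the instance, revoke credentials, notify the on-call engineer
            return "HIGH_SEVERITY_ACTION_TRIGGERED";
        }
        return "LOGGED_ONLY";
    }
}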

=========

ZooKeeper – Distributed Coordination System

A Storm cluster has 3 sets of nodes:

1. Nimbus node

2. ZooKeeper nodes

3. Supervisor nodes

Nimbus is like the JobTracker in Hadoop.
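
To make the node roles concrete, here is a minimal Storm topology sketch in Java. `SentenceSpout` and `SplitWordsBolt` are hypothetical placeholder components; the topology is submitted to Nimbus, which (like the JobTracker) assigns the work to the Supervisor nodes.

import org.apache.storm.Config;
import org.apache.storm.StormSubmitter;
import org.apache.storm.topology.TopologyBuilder;

// Minimal topology wiring: one spout feeding one bolt.
// SentenceSpout and SplitWordsBolt are hypothetical placeholders, not real classes from the notes.
public class WordTopology {
    public static void main(String[] args) throws Exception {
        TopologyBuilder builder = new TopologyBuilder();
        builder.setSpout("sentences", new SentenceSpout(), 1);
        builder.setBolt("split-words", new SplitWordsBolt(), 2)
               .shuffleGrouping("sentences");

        Config conf = new Config();
        conf.setNumWorkers(2);

        // Nimbus receives the topology and schedules its tasks onto the Supervisor nodes.
        StormSubmitter.submitTopology("word-topology", conf, builder.createTopology());
    }
}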

Amazon Redshift

===============

• Installed and configured Hadoop MapReduce and HDFS; developed multiple MapReduce jobs in Java
for data cleaning and preprocessing (a minimal mapper sketch follows this list).

• Imported and exported data into HDFS and Hive using Sqoop.

• Experienced in defining job flows.

• Experienced in managing and reviewing Hadoop log files.

• Experienced in running Hadoop streaming jobs to process terabytes of XML-format data.

• Loaded and transformed large sets of structured, semi-structured and unstructured data.

• Responsible for managing data coming from different sources.

• Supported MapReduce programs running on the cluster.
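
The first bullet above mentions Java MapReduce jobs for data cleaning; below is a minimal mapper sketch under assumed conditions. The tab delimiter and the expected field count are illustrative assumptions, not details from the original notes.

import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Map-only cleaning step: keep well-formed, tab-delimited records and drop the rest.
// The expected field count (5) is an illustrative assumption.
public class CleaningMapper extends Mapper<LongWritable, Text, Text, NullWritable> {

    private static final int EXPECTED_FIELDS = 5;

    @Override
    protected void map(LongWritable offset, Text line, Context context)
            throws IOException, InterruptedException {
        String[] fields = line.toString().split("\t", -1);

        // Drop malformed records.
        if (fields.length != EXPECTED_FIELDS) {
            return;
        }
        // Drop records with empty fields.
        for (String f : fields) {
            if (f.trim().isEmpty()) {
                return;
            }
        }
        context.write(new Text(line.toString().trim()), NullWritable.get());
    }
}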

• Participate in scoping, planning, cost estimating, pricing of consulting projects, statement of
work development and risk assessment, including but not limited to defining goals and objectives,
identifying and documenting client requirements, resource requirements, project budget and project
risks, and translating the client business requirements into specific deliverables.

• Ensure projects are completed on time, to quality standards and within budget, in line with client expectations.

• Manage the day-to-day relationship with the client and internal stakeholders, and the resolution of all issues.

• Java 8 stream libraries, lambdas and generics.

• RDFS and the SPARQL graph query language.

• Spring Boot framework.

Monolithic to Microservices

Roles & Responsibilities:

• Installed and configured Cloudera Hadoop; developed multiple MapReduce jobs in Java for data
cleaning and preprocessing.

• Involved in loading data from the UNIX file system to HDFS.

• Installed and configured Hive, and wrote Hive UDFs (a minimal UDF sketch follows this list).

• Involved in creating Hive tables, loading them with data and writing Hive queries that run
internally as MapReduce jobs.

• Gained very good business knowledge of health insurance, claim processing, fraud suspect
identification, the appeals process, etc.

• Data Cleansing and Legacy Migration Approach

• Legacy Data Mapping

• Data selection criteria for Migration

• Reconciliation Concept

• Conversion Design (Extract) → Functional/Technical Specifications

• Mock Conversion Plan

• Delivery of controls and audits
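
The Hive bullet above mentions writing custom UDFs in Java; a minimal sketch follows, using the classic `org.apache.hadoop.hive.ql.exec.UDF` base class. The function name and the trim/lower-case behavior are illustrative assumptions.

import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

// Simple Hive UDF that normalizes a string column: trims whitespace and lower-cases it.
// Registered in Hive with e.g.: CREATE TEMPORARY FUNCTION normalize AS 'NormalizeUDF';
public class NormalizeUDF extends UDF {

    public Text evaluate(Text input) {
        if (input == null) {
            return null;
        }
        return new Text(input.toString().trim().toLowerCase());
    }
}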

===================

Data Loader

Data Sync
Data Replication

Contact Validation

Master Data Management

Data Quality Assessment

Custom Apps

Data Replication

Master Data Management

Amazon EC2

Informatica iPaaS – Self-Service Integration supports:

1. Cloud Data Integration

2. Cloud Application Integration

3. Cloud Test Data Management

4. Cloud Data Quality

5. Cloud Master Data Management

6. Informatica PowerCenter

===========Informatica Cloud for Amazon Web Services===

Amazon Redshift

Amazon EMR

Amazon RDS

Amazon DynamoDB

Amazon Aurora

Amazon S3

 Deliver deep level technical workshops, executing on the defined learning path and
developing technical aspects of key scenarios defined in the technology roadmap. Enable and
teach the partner in performing and delivering Architecture Design Session (ADS). Provide
guidance for completing competency/qualifications technical requirements.
 Enable the partner in identifying technology opportunities that enable and/or support the
creation of differentiated offers in the market. Enable the partner in creating/adapting SOW for
the offers, defining the delivery/operational models and including adoption services and
activities.
 Enable the partner to be successful during pre-sales activities, which includes delivery of
Proofs of Concept (POC), pilots and prototypes, and removal of technical blockers and objections.
 Enable the partner to discover ways to automate solutions to reduce costs and create
repeatability, while documenting processes for knowledge retention and IP.

Qualifications
Experiences Required: Education, Key Experiences, Skills and Knowledge:

 Hands-on experience with Big Data, HDInsight, IoT and Machine Learning.

 Deep understanding of cloud computing, business drivers, and emerging computing
trends and their impact on customer opportunities.

 Ability to create deep technical relationships and assess the level of the partner's technical
roles, designing a time-effective learning path and evaluating progress toward the
defined milestones.

 Deep technical skill and significant experience in the relevant practice area and
technology focus.

 Understand how to leverage technology solutions to support business goals, providing
guidance on supported and unsupported technical scenarios. Capability to design and
implement practice-based architectures, or equivalent competitive experience.

 10+ years of related experience in technology solutions/practice development and Cloud/
Infrastructure technologies. Knowledge of the Microsoft platform preferable, along with project
management, technical sales and technical account management experience.

 Extensive experience of managing virtual teams across functions and geographies:

  Help build RFPs/RFIs for relevant parties


  Attend various trainings to keep skills and knowledge up to date
  Drive technology architecture adoption among customers
  Public speaking in various technology forums.

 Work with solutions architect(s) to provide a consensus based enterprise solution that is
scalable, adaptable and in synchronization with ever-changing business needs.

 · Risk management of information and IT assets through appropriate standards and
security policies.
 · Direct or indirect involvement in the development of policies, standards and
guidelines that direct the selection, development, implementation and use of Information
Technology within the enterprise.

 · Build employee knowledge and skills in specific areas of expertise.

 · Knowledge of IT governance and operations

 · Knowledge of financial modeling as it pertains to IT investment

 Designing, prototyping, and delivering applications and solutions on emerging technologies
like Cloud, Big Data, Natural Language Processing (NLP), Machine Learning (ML),
Robotic Process Automation (RPA), Artificial Intelligence (AI), etc., to respond to
business needs for cost efficiency, improved quality, and agility.
  Understanding of industry best practices and experience in cloud services, automation
and integration would be preferred.
  Working together with development and delivery teams developing solution
architecture, implementation plans and estimates
  Designing and deploying dynamically scalable, highly available, fault tolerant, secure
and reliable applications on emerging tech stack
  Address specific problems faced by the development and deployment teams by providing
technical solutions and suggestions, and by tapping into vendor solutions/contacts to
derive both short-term and long-term answers to deep technological problems.
 Proven ability to design, optimize and integrate business processes across disparate systems.

  Experience overseeing team members.


  Have thorough understanding of OOP, design patterns, and enterprise application
integration.
  Excellent analysis skills and the ability to develop processes and methodologies
  Detail-oriented individual with the ability to rapidly learn and take advantage of new
concepts, business models, and technologies.

The role will cover the following areas of responsibilities:

1. Data Analytics - Translate business objectives into analytic approaches and identify data
sources to support analysis. Design, develop and implement analytical techniques on
large, complex, structured and unstructured data sets to help make better decisions.

2. Data Integration - Connect offline and online data to continuously improve overall
understanding of customer behavior and journeys for personalization. Data pre-processing,
including collecting, parsing, managing, analyzing and visualizing large sets of data.
3. Data Quality Management - Cleanse the data and improve data quality and readiness
for analysis. Drive standards, define and implement/improve data governance strategies
and enforce best practices to scale data analysis across platforms.

4. Data Mining - Implement statistical and data mining techniques e.g. hypothesis testing,
joins, aggregations, regressions, associations, correlations, inferences, clustering, graph
analysis and retrieval processes on a large amount of data to identify trends, figures and
KPIs across business units, segments, regions etc.

5. Reporting and Dashboarding – Create data visualizations, reports and dashboards to
provide insights and recommendations for key business challenges.

6. Research - Research on advanced and better ways of solving data specific problems and
establish best practices.

7. Collaborate - Collaborate with other data scientists, subject matter experts, and business
team/s around the globe to deliver strategic advanced data analytics projects from
design to execution.

Tools/Platforms Expertise Required:

 Mandatory

 Excellent SQL / SAS / Teradata / Oracle expertise

 Advanced Excel / Power-Pivot / PowerBI

 Reporting & Dashboarding – Excel

 Desirable

 Reporting & Dashboarding - Tableau / Spotfire / Qlikview

 Advanced Statistics, Statistical Modeling

 NoSQL / MongoDB / Cassandra

Skillset Required:

 Excellent knowledge of and 6+ years of practical experience in data analysis methodologies


 Ability to think creatively to solve real world business problems
 Understanding of relational databases and good knowledge of SQL
 Strong analytical and problem solving abilities
 Ability to self-motivate, manage concurrent projects and work with remote team/s
 Excellent attitude and ability to contribute to a team
 Excellent listening, written and verbal communication skills
 Research focused mindset with strong attention to detail
 High motivation, good work ethic and maturity
 Ability to work in a global collaborative team environment

Primary skills:
 10+ Yrs: Datastage Technical Manager (strong ETL Background); Technical Project
Manager - DWH Projects
 Oversee/design the information architecture for the data warehouse, including all information
structures, i.e. staging area, data warehouse, data marts, and operational data stores.
 Oversee standardization of data definitions and development of physical/logical modeling;
develop strategies for warehouse and database implementation/management.
 Support both development and production support activities.
 Provides technical consulting and leadership in identifying and implementing new uses
of information technologies, which assist the functional business units in meeting their
strategic objectives.
 Acts as technical resource to lead Delivery staff in all phases of the development and
implementation process.
 Prepare estimates for project work.
 Oversee the technical direction of design and development to ensure
alignment with architecture, business requirements, and industry best practices.

1. Sqoop
2. Flume
3. Storm
Document database
data management steps—for example, capturing, cleaning and integrating data. Second, natural-
language processing (NLP) has expanded my thinking on how to measure attitudes. This new knowledge
inspired me t

SAS No. 70 is generally applicable when an independent auditor ("


Scalability Vs Decoupling

A microservice is an independent entity that executes a minimal amount of work upon each
service call; it is independent because it normally does not share persistence support with
other microservices, and communication across boundaries happens through interfaces and
message passing.

A hierarchy of microservices normally implements a moderately complex system.
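
Since the notes mention Spring Boot and the move from monolithic to microservices, here is a minimal, hypothetical microservice sketch: a single independent service exposing one small HTTP interface, with its own (in-memory) persistence and no state shared with other services. The endpoint paths and data shape are illustrative assumptions.

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RestController;

// A self-contained microservice: its own process, its own storage, one small interface.
@SpringBootApplication
@RestController
public class CustomerService {

    // Stand-in for the service's private persistence; nothing here is shared with other services.
    private final Map<String, String> customers = new ConcurrentHashMap<>();

    @PostMapping("/customers/{id}")
    public String create(@PathVariable String id, @RequestBody String name) {
        customers.put(id, name);
        return "created";
    }

    @GetMapping("/customers/{id}")
    public String get(@PathVariable String id) {
        return customers.getOrDefault(id, "not found");
    }

    public static void main(String[] args) {
        SpringApplication.run(CustomerService.class, args);
    }
}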

The data in a data warehouse is typically loaded through an extraction, transformation, and loading
(ETL) process from multiple data sources.
Modern data warehouses are moving toward an extract, load, transform (ELT) architecture in
which all or most data transformation is performed on the database that hosts the data warehouse.
Although the discussion above has focused on the term "data warehouse", there are two other
important terms that need to be mentioned: the data mart and the operational data store (ODS).

Data marts exist in two styles. Independent data marts are fed directly from source data; they
can turn into islands of inconsistent information. Dependent data marts are fed from an existing
data warehouse; they can avoid the problems of inconsistency, but they require that an
enterprise-level data warehouse already exist.
The DataStax Java Driver for Cassandra provides interfaces such as:
 com.datastax.driver.core.Cluster
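
A minimal usage sketch of the com.datastax.driver.core.Cluster interface mentioned above (DataStax Java Driver 3.x style); the contact point, keyspace, table and column names are illustrative assumptions.

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.ResultSet;
import com.datastax.driver.core.Row;
import com.datastax.driver.core.Session;

// Connects to a Cassandra cluster and runs a single query.
// Contact point, keyspace and table are illustrative assumptions.
public class CassandraExample {
    public static void main(String[] args) {
        try (Cluster cluster = Cluster.builder()
                                      .addContactPoint("127.0.0.1")
                                      .build();
             Session session = cluster.connect("demo_keyspace")) {

            ResultSet rs = session.execute("SELECT id, name FROM users LIMIT 10");
            for (Row row : rs) {
                // Assumes an id column of type uuid and a text name column.
                System.out.println(row.getUUID("id") + " " + row.getString("name"));
            }
        }
    }
}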
