Naming Convention EDH
05.08.21
Internal
Document History
Version • By • Description
26-Mar-2020 • Adrian Thong, Sahrilnizam Ismail • Created
30-Mar-2020 • Adrian Thong • Added explanations
01-Apr-2020 • Adrian Thong • Added guideline for ETL jobs
27-Apr-2020 • Adrian Thong • Revised naming conventions based on object hierarchies observed in Talend Data Catalog
09-Jun-2020 • Adrian Thong, Sahrilnizam Ismail • Added naming convention for API
25-Jun-2020 • Oo Woei Luen • Added naming convention for Data Mapping (Talend) and Other Jobs (Talend)
01-Jul-2020 • Oo Woei Luen • Added naming convention for Data Lake
24-Jul-2020 • Oo Woei Luen • Amendment on Enrich zone where CDM will only be reflected at Data Warehouse [Ref: Email from Zeffry Suzi Nasir (DE-PS/DIGITAL) dated 24 July 2020 3:37 PM]
5 July 2021 • Farah Hanum • Added staging section
5 August 2021 • Nafiz Izzat • Added data model object section and restructured the deck
10 January 2022 • Armizee Shah • Added SharePoint
14 July 2022 • Nurul Asyikin • Revised all Data Modeling object naming conventions (Data Warehouse, Data Mart, general guideline)
Content
01. Notes to user
02. Abbreviation
03. Naming convention for Data Structure - Directory Structure / Folder / File
Note to user
2) The naming convention in this document is currently based on the Talend ETL and the
capability of the Talend Data Catalog.
3) Objects from AWS and Azure appear in the Azure Talend Data Catalog in a
centralized Metadata repository, so the names of the Directory Structure and the Model
must show which Cloud / Container / Zone the object is in.
4) Objects in EDH are meant for Enterprise consumption, so the names of the Directory
Structure and Folders in Raw, Process and Enrich should not be too project-specific.
6) Contact the ED Architecture team with the project data domains to validate the
CommonDataModel grouping before requesting resources for EDH DEV.
Abbreviation
Short names of objects should be in capital letters.
Cloud AZ Azure
Zone ER Enrich
Zone PR Process
Zone RW Raw
Abbreviation
Short names of data model objects should be in capital letters.
Data Structure
Naming convention for Staging, Raw, Process and Enrich zone’s directory structure
Data Governance - Refer to the general rule on Letters, Description of File Names and Elements in File Names as
specified in “Some General Dos and Don’ts on Naming Convention” at the end of this pack.
Data Structure
Naming convention for Staging zone’s directory structure / folder
• For the staging container only: access to the ADLS container requires a Security Group or Service
Principal. Request the security group number via ICT2U; the ORF only captures the name
provided by ICT2U.
• For Staging at EDH AWS, the project only needs to key in the directory section, as the S3 credentials are
provided by EDH CORE DevOps.
• A folder in the Staging zone is only a temporary location for systems that need to push/dump data into
EDH. The directory is deleted once the data is moved into the Raw zone.
Data Structure
Naming convention for Raw, Process and Enrich zone’s directory structure / folder
* Required
Data Structure
Naming convention for Raw, Process and Enrich zone’s
directory structure / folder / sub-folder
Zone                      Azure example   AWS example
<Version>
Raw | Process | Enrich    V01             V01
* Required
Data Structure
Naming convention for Raw, Process and Enrich zone’s
directory structure / folder / sub-folder
Zone                      Azure example   AWS example
Raw | Process | Enrich    VBAK
• The subfolder for Enrich follows the data lake / data warehouse table name.
Data Structure
Naming convention for Raw, Process and Enrich zone’s
directory structure / folder / sub-folder (year)
Zone                      Azure example   AWS example
Raw | Process | Enrich    2020            2020
• Highly recommended to have time series subfolders. Creation of these subfolders can be done
automatically by the ingestion or ETL process.
Data Structure
Naming convention for Raw, Process and Enrich zone’s
directory structure / folder / sub-folder (month)
Zone                      Azure example   AWS example
Raw | Process | Enrich    01              01
• Highly recommended to have time series subfolders. Creation of these subfolders can be done
automatically by the ingestion or ETL process.
Data Structure
Naming convention for Raw, Process and Enrich zone’s
directory structure / folder / sub-folder (day)
Zone                      Azure example   AWS example
Raw | Process | Enrich    31              30
• Highly recommended to have time series subfolders. Creation of these subfolders can be done
automatically by the ingestion or ETL process.
• How deep the time series subfolders go depends on the velocity of the data. For example, if data is
ingested two or more times a day, the subfolders can go down to day level. If data is ingested once a week,
then month (MM) level is sufficient.
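The year / month / day levels described above can be sketched as a small helper. This is only an illustration: the function name, the depth labels and the "/" separator are assumptions, not part of the EDH convention.

```python
from datetime import date

# Depth labels are hypothetical; the deck only describes year, month and day levels.
LEVELS = ("year", "month", "day")


def time_series_subfolders(d: date, depth: str) -> str:
    """Return the YYYY[/MM[/DD]] sub-folder path down to the requested depth."""
    if depth not in LEVELS:
        raise ValueError(f"depth must be one of {LEVELS}")
    parts = [f"{d.year:04d}", f"{d.month:02d}", f"{d.day:02d}"]
    # Keep only as many levels as the ingestion frequency calls for.
    return "/".join(parts[: LEVELS.index(depth) + 1])


# Data ingested several times a day: go down to day level.
daily_path = time_series_subfolders(date(2020, 1, 31), "day")
# Data ingested once a week: month level is sufficient.
weekly_path = time_series_subfolders(date(2020, 4, 30), "month")
```

An ingestion or ETL job could call such a helper to create the subfolders automatically, as recommended above.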
Data Structure
Naming convention for Raw, Process and Enrich zone’s
directory structure / folder / sub-folder (example)
Zone: Raw | Process | Enrich
Azure example        AWS example
SAPECC_P35           ePTW
->V01                ->V01
-->VBAK              -->2020
--->2020             --->04
---->01              ---->30
----->31
• Directory structure only exists in the Talend Data Catalog and is not a folder in the EDH zone.
Data Structure
Naming convention for Raw, Process and Enrich zone’s
directory structure / folder / file
Zone Azure example AWS example
<UnixTimeStamp>_<Object>.format
Raw 1587102843_ZFIGLD014.json 1587102843_ZFIGLD014.json
Process 1587102843_ZFIGLD014.json 1587102843_ZFIGLD014.json
Enrich 1587102843_ZFIGLD014.json 1587102843_ZFIGLD014.json
• *<UnixTimeStamp> = Unix timestamp in Malaysia’s local time. Only needed when a timestamp is
required, for example for transactional data.
• *<Object> = Data object. If it is a database table, it should follow the table name.
* Required
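As a rough illustration of the <UnixTimeStamp>_<Object>.format pattern, the sketch below builds and parses such a file name. The function names and the regular expression are assumptions. The deck does not say how "Malaysia's local time" is applied to the timestamp, so the sketch simply takes a timezone-aware datetime and uses its standard epoch seconds.

```python
import re
from datetime import datetime, timezone

# Hypothetical pattern: digits, underscore, object name, dot, format.
FILE_NAME = re.compile(r"^(?P<ts>\d+)_(?P<obj>[^.]+)\.(?P<fmt>[^.]+)$")


def build_file_name(moment: datetime, obj: str, fmt: str) -> str:
    """Compose <UnixTimeStamp>_<Object>.format from its three elements."""
    return f"{int(moment.timestamp())}_{obj}.{fmt}"


def parse_file_name(name: str) -> dict:
    """Split a file name back into timestamp, object and format."""
    match = FILE_NAME.match(name)
    if match is None:
        raise ValueError(f"not in <UnixTimeStamp>_<Object>.format form: {name}")
    return {
        "timestamp": int(match["ts"]),
        "object": match["obj"],
        "format": match["fmt"],
    }


example_name = build_file_name(
    datetime(2020, 4, 17, 5, 54, 3, tzinfo=timezone.utc), "ZFIGLD014", "json"
)
parsed = parse_file_name(example_name)
```

Here the chosen moment reproduces the example file name used in the table above.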
Data Lake
Naming convention for Staging, Raw, Process and Enrich zone’s folder
Data Lake
Naming convention for Raw, Process and Enrich zone’s sub-folder
Zone Example
<DataSourceName>
Raw PGPS
Process PGPS
Enrich PGPS
Data Lake
Naming convention for Raw and Process zone’s sub-folder (example)
• Refer to “Data Structure - Naming convention for Directory Structure / Folder / File”.
Data Lake
Naming convention for Enrich zone’s sub-folder (example)
• Refer to “Data Structure - Naming convention for Directory Structure / Folder / File”.
Content
01. Notes to user
02. Abbreviation
03. Naming convention for Data Structure - Directory Structure / Folder / File
05. Naming convention for Enterprise Data Warehouse & Business Tenant -
Database / Schema / Table / Attribute / Keys
06.
Naming convention for Enterprise Data Mart & Business Tenant Mart -
07. Database / Schema / Table/ Attribute/ Keys
Enterprise Data Warehouse
Naming convention for the Enterprise Data Model schema name, table name, attribute
name, key
Pattern:
  Schema: <ModelStandard>
  Table Name: <Tablename>
  AttributeName: <AttributeName>
  Key: <TableName>+Id
For Synapse (or databases that can accept multi-case object names in Pascal Case):
  Schema: EDM
  Table Name: CustomerSales
  AttributeName: CustomerName
  Key: CustomerId
For Redshift (or databases that cannot accept multi-case object names; prefer upper case):
  Schema: EDM or edm
  Table Name: VENDOR_TYPE or vendor_type
  AttributeName: CUSTOMER_NAME or customer_name
  Key: CUSTOMER_ID or customer_id
* Required
For <ModelStandard>, there are four standard names: CFIHOS, EDM, OSDU, PPDM.
For naming <Tablename> and <AttributeName>, refer to the respective objects in the data model standard.
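The relationship between the Pascal Case names (Synapse) and the upper-case names (Redshift) can be sketched as follows. The helper names are hypothetical, and the conversion is a simple heuristic that assumes well-formed Pascal Case input; acronyms inside a name would need extra handling.

```python
import re


def key_name(table_name: str) -> str:
    """Apply the <TableName>+Id rule to a Pascal Case table name."""
    return f"{table_name}Id"


def to_upper_snake(pascal_name: str) -> str:
    """Convert a Pascal Case name to the upper-case, underscore-separated form."""
    # Insert "_" wherever a capital letter follows a lowercase letter or digit.
    return re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", pascal_name).upper()


synapse_key = key_name("Customer")
redshift_attribute = to_upper_snake("CustomerName")
redshift_key = to_upper_snake(synapse_key)
```

The lower-case alternative for Redshift (for example customer_name) would be the same conversion followed by `.lower()`.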
Enterprise Data Warehouse
Naming convention for custom object within Enterprise Data Model table name,
attribute name
Pattern:
  Schema: <ModelStandard>
  Table Name: <Tablename>
  AttributeName: <AttributeName>
For custom at Table level:
  Schema: EDM
  Table Name: P_CustomerCertificate
  AttributeName: CertificateName
For custom at Attribute level where the Table is not a custom table:
  Schema: EDM
  Table Name: CustomerSales
  AttributeName: P_Gender
* Required
For <ModelStandard>, there are four standard names: CFIHOS, EDM, OSDU, PPDM.
For naming custom objects, refer to the Data Model Naming and Definition Guideline.
Enterprise Data Warehouse & Business Tenant
Naming convention for the Non Enterprise Data Model (Custom) schema name, table
name, attribute name, key
* Required
For name of Schema <DataDomain> in Enterprise Data Warehouse, refer to Data Model and Design, ED.
For naming custom objects, refer to the Data Model Naming and Definition Guideline.
Talend
Naming convention for data mapping in Talend
Reference * Required
*<Cloud> AZ, AW
*<ProcessName> ADLS, S3
*<DataSourceType> SAPS4, SAPECC, ORACLE, MSSQL, MYSQL, POSTGRES, API
(Short business description of the data source)
*<Source> RW, PR, ER, DMT, DW
*<Destination> RW, PR, ER, DMT, DW
Enterprise Data Mart & Business Tenant Data Mart
Naming convention for the view schema name, table name, attribute name
* Required
Data Governance - Refer to the Data Model Naming and Definition Guideline when naming data model objects.
For name of Schema <DataDomain> in Enterprise Data Mart, refer to Data Model and Design, ED
© 2021 Petroliam Nasional Berhad (PETRONAS)
Enterprise Data Mart & Business Tenant Data Mart
Naming convention for analytical data mart (star schema name, fact/ dim table name,
attribute name)
* Required
For name of Schema <DataDomain> in Enterprise Data Warehouse, refer to Data Model and Design, ED.
For naming data model objects, refer to the Data Model Naming and Definition Guideline.
Talend
Naming convention for other jobs in Talend
Reference * Required
*<Cloud> AZ, AW
*<ProcessName> BG, M, QM, SM
*<Zone> RW, PR, ER, DMT, DW
*<DataSourceType> SAPS4, SAPECC, ORACLE, MSSQL, MYSQL, POSTGRES, API
(Short business description of the data source)
API
Naming convention for API – Azure App Function
1) Follow the naming convention owned by the Cloud Operation Center (COC).
2) ONE Azure App function for ONE Database Schema, for example App function.
3) ALL Azure App functions will be connected to the ONE Azure Service Account.
API
Naming convention for API – Azure App Function
* Required
API
Naming convention for API – Triggers in Azure App function
* Required
API
Naming convention for API – URI in the API gateway layer
Data Model Objects
Naming and definition guideline
Data model objects must be named and described at a sufficient level of detail to ensure that they are discrete
and clearly understood.
Names are accompanied by a definition, which describes in more detail the object the name refers to.
General Do’s and Don’ts on Data Model Objects Naming Convention
No.: 1
Topic: Keys
General Rule: All keys are standardized to <Tablename>+Id.
Correct examples: 1) CustomerId
Incorrect examples: 1) CustomerSK 2) Id
Explanation: 1. All keys regardless of type (surrogate, primary or foreign keys) follow the <Tablename>+Id standard.
* Currently Data Modeling & Design, ED does not produce a list of standard abbreviations except for <DataDomain>.
Modern databases have a large maximum length for data object names. If a database has a technical limitation on length, refer to the Data Modeling & Design team.
General Do’s and Don’ts on Data Model Objects Naming Convention
No.: 4
Topic: Letters case
General Rule: For databases that accept multi-case names, the default is to use Pascal case, with no hyphens or other delimiters. Upper case is preferable.
Correct examples: 1) CustomerName
Incorrect examples: 1) customer_name 2) customerName 3) CUSTOMER_NAME 4) Customer-Name
Explanation: 1. Only Pascal case is accepted, where each word begins with a capital letter. 2. Use of a delimiter (e.g. - or _) between two words is not acceptable.
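A check for the Pascal case rule could look like the sketch below. The regular expression is an assumed, deliberately strict reading of the rule (every word is a capital followed by at least one lowercase letter or digit). It does not cover custom-object prefixes or upper-case-only databases.

```python
import re

# Assumed interpretation: one or more words, each a capital followed by
# lowercase letters or digits, with no delimiters in between.
PASCAL_CASE = re.compile(r"(?:[A-Z][a-z0-9]+)+")


def is_pascal_case(name: str) -> bool:
    """Return True when the whole name is written in Pascal case."""
    return PASCAL_CASE.fullmatch(name) is not None


candidates = [
    "CustomerName",
    "customer_name",
    "customerName",
    "CUSTOMER_NAME",
    "Customer-Name",
]
results = {name: is_pascal_case(name) for name in candidates}
```

With this reading, only the first candidate passes, matching the correct and incorrect examples listed above.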
Some General Do’s and Don’ts on Naming Convention
No.: 1
Topic: Letters
General Rule: It is best to use capital letters except in the situation where the format for letters of a file name MUST not all be in capital letters.
Correct examples: 1) AZ_ADLS_RW_SAPECC 2) AZ_ADLS_PR_SAPECC 3) AZ_ADLS_ER_FIN
Incorrect examples: 1) az_adls_rw_sapecc 2) az_adls_pr_sapecc 3) az_adls_er_fin
Explanation: 1. The file name will be identifiable by using capital letters to differentiate between the words.
Some General Do’s and Don’ts on Naming Convention
No.: 3
Topic: Dates
General Rule: If a date is to be used in the file name, use one of the following formats: a) YYYY; or b) YYYYMM; or c) YYYYMMDD.
Correct examples: 1) 2020 2) 202004 3) 20200401
Incorrect examples: 1) 1April2020 2) Apr1st2020
Explanation: 1. By using the format, the sequence of the records will be maintained. 2. It also helps when the latest dated record needs to be retrieved.
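The sequencing benefit can be seen with a plain alphabetical sort, which is how most file listings are ordered. The sample names below are illustrative only.

```python
from datetime import date

days = [date(2020, 4, 1), date(2019, 12, 31), date(2020, 1, 15)]

# YYYYMMDD names: alphabetical order is also chronological order.
compact_names = [d.strftime("%Y%m%d") for d in days]
compact_sorted = sorted(compact_names)
latest = max(compact_names)

# Free-form names for the same three dates: alphabetical order is not chronological.
free_form_names = ["1April2020", "31December2019", "15January2020"]
free_form_sorted = sorted(free_form_names)
```

Sorting the free-form names puts 15 January 2020 before 1 April 2020 and 31 December 2019 last, so the latest record can no longer be found by taking the last entry.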
Some General Do’s and Don’ts on Naming Convention
No.: 5
Topic: Numbering
General Rule: When a number needs to be included in a file name, use a two-digit number instead of a one-digit number.
Correct examples: FIN_01, FIN_02, FIN_03, FIN_04, FIN_05, FIN_06, FIN_07, FIN_08, FIN_09
Incorrect examples: FIN1, FIN2, FIN3, FIN4, FIN5, FIN6, FIN7, FIN8, FIN9
Explanation: Include the leading zero for numbers 0-9, i.e. 01, 02 … 99, unless it is a year or another number with more than two digits. This maintains the numeric order and helps to retrieve the latest record number.
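A short sketch of why the leading zero matters once the numbering passes 9. The helper name is hypothetical; the FIN prefix comes from the examples above.

```python
def numbered_name(prefix: str, number: int) -> str:
    """Return the prefix followed by an underscore and a two-digit number."""
    return f"{prefix}_{number:02d}"


numbers = (1, 2, 10)
padded = sorted(numbered_name("FIN", n) for n in numbers)
unpadded = sorted(f"FIN{n}" for n in numbers)
```

The padded names sort as 01, 02, 10, while the unpadded names sort as 1, 10, 2, which breaks the numeric order.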
Some General Do’s and Don’ts on Naming Convention
No.: 7
Topic: Timestamp
General Rule: When a Unix timestamp is needed, start the file name with the Unix timestamp, followed by the data object and then the format.
Correct examples: 1) 1587102843_P35.json 2) 1587102843_FIN.json
Incorrect examples: 1) P35_1587102843.json 2) FIN_1587102843.json
Explanation: Unix time is a system for describing a point in time. Using <UnixTimeStamp>_<Object>.format makes the file name easier for a layperson to read.
Some General Do’s and Don’ts on Naming Convention
1) Unless a file name must include characters such as * : \ / < > | " ? [ ] ; =
+ & £ $ , AVOID these characters in file names, as they make it difficult to
search for the file.
2) DON'T use codes, abbreviations and initials which are not usually understood.
3) Unless it makes the record easier to retrieve, AVOID using common words such as ‘draft’ or
‘letter’ at the beginning of file names.
5) AVOID using team names as the basis for folder names, as the company structure may
change.
6) AVOID names with similar meanings or unclear names such as document.pdf. Use
descriptive information and include dates in file names if possible. It is best if
we know what’s in the file without having to open it.
7) Do not use cryptic codes that only YOU understand (example: prk22–1sple.doc).
Make it meaningful to everyone else.
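Rule 1 above lends itself to an automated check. The sketch below is a minimal example with a hypothetical function name; the character set is copied from rule 1.

```python
# Characters listed in rule 1 above.
DISCOURAGED = set('*:\\/<>|"?[];=+&£$,')


def discouraged_characters(file_name: str) -> list:
    """Return the discouraged characters found in a file name, sorted."""
    return sorted({ch for ch in file_name if ch in DISCOURAGED})


clean = discouraged_characters("1587102843_P35.json")
flagged = discouraged_characters("report*final?.doc")
```

An empty result means the name passes this rule. The other rules (meaningful, descriptive names) still need human judgment.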
File type summary