Code checklist template

ETL Code Review Checklist Template

DataStage Parallel Jobs Code Review Checklist

Author:

Date:

Version:

Contents

INTRODUCTION
FINDINGS GUIDELINES
IDENTIFICATION
SEQUENCE JOBS
SEQUENCE STAGES
ETL JOBS
DATABASE STAGES
TRANSFORMER STAGE
LOOKUP
AGGREGATOR
SEQUENTIAL FILE STAGE
FINDINGS SUMMARY

INTRODUCTION

This document establishes a common approach for reviewing the code of DataStage Extract, Transform, and Load [ETL] jobs. This code review checklist is intended to support the ETL Developers Handbook by providing a quick reference for compliance with coding and quality standards, and for documenting findings. If a conflict should occur between this checklist and the guidance provided in the ETL Developers Handbook, the Handbook will be considered the authoritative source of guidance.

FINDINGS GUIDELINES

The Severity Guidelines describe the defect severity levels and how each is to be resolved. Code review findings will be formally documented on this form and included in the development documentation. Each finding will be assigned a severity level as defined below. The highest severity level identified during the review constitutes the overall outcome of the code review.

| Severity Level | Assessment | Explanation | Action |
| --- | --- | --- | --- |
| 5 | Blocker | Code/configuration: is substantially flawed; no workaround is available; does not meet requirements as written; does not conform to required coding standards; stops testing. | No additional testing can occur until this incident is resolved. The appropriate development/configuration team will take immediate action to work towards a resolution. A follow-up code review is required. |
| 4 | Critical | Code/configuration: stops major functionality; an unacceptable workaround exists; is substantially flawed; does not/may not meet performance requirements; meets requirements as written; conforms to required coding standards. | Test efforts may proceed while severely limited. The appropriate development/configuration team will take immediate action to work towards a resolution. A follow-up code review is required. |
| 3 | Major | Code/configuration: limits functionality; an acceptable workaround exists; could be simplified or a more appropriate coding approach has been identified; may not meet performance requirements; meets requirements as written; conforms to required coding standards. | Defect is addressed after all Severity Level 5 and 4 defects are resolved. Test efforts may proceed; however, testing cannot progress to the next test phase until all Severity Level 3 defects are resolved. The appropriate development/configuration team will resolve in time for the next application build/release. A follow-up code review is required. |
| 2 | Minor | Code/configuration: slows testing; an acceptable workaround exists, if applicable; defect resolution can be delayed without impacting test efforts; testing may proceed to the next phase with outstanding Level 2 defects; code is fundamentally correct, but minor changes are recommended; meets performance requirements; meets requirements as written; conforms to required coding standards. | Defect is addressed after all Severity Level 5, 4, and 3 defects are resolved. Testing may proceed to the next phase with unresolved Severity Level 2 defects. The appropriate development/configuration team will resolve in time for the next application build/release deployment into production. No follow-up code review required. |
| 1 | Trivial | Defect addresses a requested cosmetic change or enhancement of the developed solution. No impact to testing. | Defect is addressed after all Severity Level 5, 4, 3, and 2 defects are resolved. The appropriate application development/configuration team will resolve as time allows, for the next application build/release deployment and/or O&M release into production. No follow-up code review required. |
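
The rule that the highest severity identified becomes the overall outcome of the review can be sketched in a few lines of Python. This is only an illustration: the `Finding` class, the `overall_outcome` function, and the "No findings" label for an empty review are assumptions made for the example and are not part of the template or of DataStage. The follow-up threshold reflects the guidelines, where Severity Levels 5, 4, and 3 require a follow-up code review and Levels 2 and 1 do not.

```python
from dataclasses import dataclass

# Severity levels and assessment names as given in the Findings Guidelines.
ASSESSMENTS = {5: "Blocker", 4: "Critical", 3: "Major", 2: "Minor", 1: "Trivial"}

# Per the guidelines, levels 3 and above require a follow-up code review.
FOLLOW_UP_THRESHOLD = 3


@dataclass
class Finding:
    """One recorded finding (hypothetical structure for this sketch)."""
    item_number: str  # checklist item, e.g. "1.6"
    severity: int     # 1 (Trivial) to 5 (Blocker)


def overall_outcome(findings):
    """Return (level, assessment, follow_up_required) for a whole review.

    An empty review returns (0, "No findings", False); that label is an
    assumption of this sketch, not something the template defines.
    """
    if not findings:
        return 0, "No findings", False
    level = max(f.severity for f in findings)
    return level, ASSESSMENTS[level], level >= FOLLOW_UP_THRESHOLD


findings = [Finding("1.1", 1), Finding("1.6", 3), Finding("3.6", 2)]
print(overall_outcome(findings))  # (3, 'Major', True)
```

A single Major finding is enough to make the whole review Major and to trigger a follow-up review, even when every other finding is Trivial or Minor.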

IDENTIFICATION

DataStage Project Name: <<Project Name>>

Job Name:

Location of Job:

Developer Name:

Date of Developer Validation:

Reviewer Name:

Date of Review:

SEQUENCE JOBS

| Item Number | Checklist item | Developer [Y/N] | Reviewer [Y/N] | Severity |
| --- | --- | --- | --- | --- |
| 1.1 | Naming conventions are properly applied to sequences? [see ETL Developers Handbook] | | | 1 |
| 1.2 | All Sequences are contained within a parent Sequence or scheduler event [no orphans]? | | | 5 |
| 1.3 | All jobs are contained within a Sequence [no orphans; exception: one-time-use jobs]? | | | 5 |
| 1.4 | Naming conventions have been applied to Stages and Links? [see ETL Developers Handbook] | | | 1 |
| 1.5 | All Parameter Default value expressions have been properly populated: from a data element or parameter set; with an appropriate constant, if applicable. Make sure you do not use the default values in parallel jobs. | | | 4 |
| 1.6 | Error handling is contained in each sequencer and conforms to requirements: Email; Stop [only if absolutely necessary]; Job Dependencies; Recovery Steps [especially manual steps], as needed. | | | 3 |
| 1.7 | Sequence structure provides for optimal concurrent processing and does not exceed system constraints? | | | 3 |
| 1.8 | General Tab has been properly completed with developer comments? [see ETL Developers Handbook] | | | 1 |
| 1.9 | Does each Sequence have an Annotation or Description Annotation? | | | 1 |
| 1.10 | Has job stream status reporting/Audit Logging been included? | | | 3 |
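
The "no orphans" checks in items 1.2 and 1.3 amount to a set difference: anything that is neither invoked by a Sequence, nor started by a scheduler event, nor listed as a one-time-use exception is an orphan. The following Python sketch illustrates that idea under stated assumptions: the job and sequence names are invented, and the inputs are plain lists and dictionaries that a reviewer would have to compile by hand or from an export, since no DataStage API is used here.

```python
def find_orphans(all_names, children_of, scheduled, exceptions=frozenset()):
    """Return the sorted names that nothing invokes and no scheduler event starts.

    all_names   -- every job and sequence in the project
    children_of -- mapping of sequence name to the jobs/sequences it invokes
    scheduled   -- names started directly by a scheduler event
    exceptions  -- documented one-time-use jobs that may stand alone
    """
    invoked = {child for children in children_of.values() for child in children}
    return sorted(set(all_names) - invoked - set(scheduled) - set(exceptions))


# Hypothetical project contents, for illustration only.
all_names = [
    "seq_master", "seq_load_customers",
    "job_extract_customers", "job_load_customers",
    "job_adhoc_fix", "job_forgotten",
]
children_of = {
    "seq_master": ["seq_load_customers"],
    "seq_load_customers": ["job_extract_customers", "job_load_customers"],
}
scheduled = {"seq_master"}
exceptions = {"job_adhoc_fix"}

print(find_orphans(all_names, children_of, scheduled, exceptions))  # ['job_forgotten']
```

Any name the function returns would be recorded as a Severity 5 finding under item 1.2 or 1.3.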

SEQUENCE STAGES

| Item Number | Checklist item | Developer [Y/N] | Reviewer [Y/N] | Severity |
| --- | --- | --- | --- | --- |
| 2.1 | All stages in the sequence have been named in accordance with the developer handbook and/or standard practices guidelines? | | | 1 |
| 2.2 | General Tab > Description: has been properly completed with developer comments? [see ETL Developers Handbook] | | | 1 |
| 2.3 | General Tab > Logging: has been properly completed with job name and a description of the activity performed? | | | 1 |
| 2.4 | Trigger dependency conditions have been properly assigned [e.g. "Execution finished with warnings" OR "Executed OK"]? | | | 3 |
| 2.5 | Variables have been properly populated from the parameter set name and/or are not left as pre-defined? | | | 3 |
| 2.6 | Execution Action has been set [reset if required, then run]? | | | 2 |
| 2.7 | Has the "Do not checkpoint run" checkbox been properly set based on operations manual restart procedures? | | | 2 |
| 2.8 | If needed, system execute commands are clearly explained? | | | |

PARALLEL JOBS

| Item Number | Checklist item | Developer [Y/N] | Reviewer [Y/N] | Severity |
| --- | --- | --- | --- | --- |
| 3.1 | Server Jobs were not used, unless no other acceptable parallel job method exists? | | | 3 |
| 3.2 | Has "Compile in trace mode" been disabled in Job Properties > Execution? | | | 3 |
| 3.3 | Job Properties [Edit > Job Properties > Parameters] do not contain $APT_ variables, unless justified? | | | 3 |
| 3.4 | Naming conventions are properly applied to jobs? | | | 1 |
| 3.5 | All jobs are contained within a Sequence or scheduler event [no orphans]? | | | 5 |
| 3.6 | Have unneeded columns been removed as early as possible in the data flow? | | | 2 |
| 3.8 | Has a maximum length for Varchar columns been set [performance tip]? | | | 2 |
| 3.9 | Job minimizes and combines use of Sorts where possible [performance tip]? Depending on the source, an RDBMS sort and join might be used for performance. | | | 2 |
| 3.10 | Edit > Properties > General Tab: Short Job Description | | | 1 |
| 3.11 | Edit > Properties > General Tab: Full Job Description | | | 1 |
| 3.12 | Check that the data type and length of all mapped fields are consistent throughout the job, from the source across all stages. | | | 2 |
| 3.13 | If assigning Surrogate IDs [SIDs], check that business keys are defined correctly for SID generation. | | | 4 |
| 3.14 | Coding was accomplished in accordance with the design specification and/or the design specification has been updated with appropriate changes? | | | 3 |
| 3.15 | Check that the logic in the job is in sync with the logic described in the detailed design. | | | 3 |
| 3.16 | For jobs which load a source file, have behaviors for corrupt or unreadable files been incorporated? | | | 3 |
| 3.17 | Have multiple processing run behaviors been defined to ensure data accuracy and non-duplication of data? | | | 3 |
| 3.18 | For loads which require that only current data be loaded, have behaviors been defined for records from previous loads/periods? | | | 3 |
| 3.19 | Data volume: is the job able to handle large datasets reliably within the defined processing window and/or requirement measure? | | | 3 |
| 3.20 | High-level view: check that the job is not overly complicated and that the developer can present the high-level data flows in the job. If not, break the job into multiple jobs. | | | 2 |
| 3.21 | Job input/output are re-usable, if so use dataset fo | | | |
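
Item 3.12 asks that the data type and length of each mapped field stay consistent from the source across all stages. One way to picture that check is to compare each column's definition against its first appearance in the flow, as in the Python sketch below. Everything here is hypothetical: the stage names, the column metadata, and the `(type, length)` tuples are invented for illustration, and the metadata would have to be transcribed or exported by the reviewer, because the sketch does not read anything from DataStage.

```python
def find_type_mismatches(stages):
    """Report columns whose definition changes between stages.

    stages -- ordered list of (stage_name, {column: (sql_type, length)})

    Returns a list of (column, first_stage, first_def, stage, this_def) for
    every stage where a column differs from its first appearance.
    """
    first_seen = {}
    mismatches = []
    for stage_name, columns in stages:
        for column, definition in columns.items():
            if column not in first_seen:
                first_seen[column] = (stage_name, definition)
            elif first_seen[column][1] != definition:
                origin_stage, origin_def = first_seen[column]
                mismatches.append(
                    (column, origin_stage, origin_def, stage_name, definition)
                )
    return mismatches


# Hypothetical column metadata for a three-stage flow.
stages = [
    ("src_customers", {"cust_id": ("Integer", 10), "cust_name": ("VarChar", 50)}),
    ("xfm_clean", {"cust_id": ("Integer", 10), "cust_name": ("VarChar", 30)}),
    ("tgt_customers", {"cust_id": ("Integer", 10), "cust_name": ("VarChar", 50)}),
]

for column, first_stage, first_def, stage, this_def in find_type_mismatches(stages):
    print(f"{column}: {first_def} in {first_stage} vs {this_def} in {stage}")
```

The comparison is by column name, so it assumes columns are not renamed along the way, and it will also flag deliberate conversions. Each reported difference is a prompt for the reviewer to confirm intent, not an automatic defect.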
