DataStage Vs Informatica

Comparison Between DataStage (Server Edition) and Informatica
  1. DataStage is closely integrated with its repository (a UniVerse database). Informatica is not: with the introduction of the Repository Server, Informatica isolated server calls from repository calls to reduce the load. This has both advantages and disadvantages.
  2. DataStage has the more powerful transformation engine, thanks to functions (Oconv and Iconv) and routines. We can do almost any transformation. Informatica is more visual and programmer friendly.
  3. Lookups are much faster in DataStage because of the way the hash files are built. You can tune the hash files for optimal performance.
  4. DataStage best practice calls for landing the data between transformations and keeping jobs small and simple. A compiled job generates a BASIC routine, and the bigger the routine, the slower the job runs. Even for a simple project, you might end up with 3-4 times more jobs in DataStage than in Informatica.
  5. DataStage does not perform very well with heterogeneous sources. You might end up extracting data from all the sources, putting it into hash files, and only then starting your transformation. This may not be the case with Informatica.
  6. Both DataStage and Informatica support XML. DataStage comes with XML input, transformation and output stages.
  7. Both products have an unlimited number of transformation functions since you can easily write your own using the command interface.
  8. Both products have options for integrating with ERP systems such as SAP, PeopleSoft and Siebel, but these come at a significant extra cost, so you may need to evaluate them. SAP is a reseller of DataStage for SAP BW, and PeopleSoft bundles DataStage in its EPM products.
  9. DataStage has some very good debugging facilities, including the ability to step through a job link by link or row by row and watch data values as the job executes. It also offers server-side tracing.
  10. DataStage 7.x releases have intelligent assistants (wizards) for creating the template jobs for each type of slowly changing dimension table loads.  The DataStage Best Practices course also provides training in DW loading with SCD and surrogate key techniques.
  11. Ascential and Informatica both have robust metadata management products.  Ascential MetaStage comes bundled free with DataStage Enterprise and manages metadata via a hub and spoke architecture.  It can import metadata from a wide range of databases and modeling tools and has a high degree of interaction with DataStage for operational metadata.  Informatica SuperGlue was released last year and is rated more highly by Gartner in the metadata field.  It integrates closely with PowerCenter products.  They both support multiple views (business and technical) of metadata plus the functions you would expect such as impact analysis, semantics and data lineage.
  12. DataStage can send emails. The sequence job has an email stage that is easy to configure. DataStage 7.5 also adds mobile device support for administering DataStage jobs from a Palm Pilot. There are also third-party web-based tools that let you run and review jobs in a browser. SMS admin messages can also be sent from a DataStage UNIX server.
  13. DataStage has a command line interface.  The dsjob command can be used by any scheduling tool or from the command line to run jobs and check the results and logs of jobs.
  14. Both products integrate well with Trillium for data quality; DataStage also integrates with QualityStage. This is the preferred method of address cleansing and fuzzy matching.
  15. Deployment facility: Ability to handle initial deployment, major & minor releases and patches with ease.
    • Informatica: Yes.
    • DataStage: No.
  16. Support for looping over the source rows (For/While loop).
    • Informatica: Supports comparing a row with the immediately previous record.
    • DataStage: Does not support.
  17. Slowly Changing Dimension.
    • Informatica: Supports Full History, Recent Values, Current & Previous Values.
    • DataStage: Supported only through custom scripts. Does not have a wizard to do this.
  18. Time Dimension generation.
    • Informatica: Does not support.
    • DataStage: Does not support.
  19. Rejected records.
    • Informatica: Can be captured.
    • DataStage: Not captured automatically (can be captured in a separate file).
  20. Debugging Facility.
    • Informatica: Does not support.
    • DataStage: Supports basic debugging facilities for testing.
  21. Ability to customize views of metadata for different users (DBA vs. business user).
    • Informatica: Supports.
    • DataStage: Supports.
  22. Metadata repository can be stored in RDBMS
    • Informatica: Yes.
    • DataStage: No.
  23. Support And Maintenance: Command line operation.
    • Informatica: Yes (pmcmd).
    • DataStage: Yes (dsjob).
  24. Ability to maintain versions of mappings/jobs.
    • Informatica: Yes.
    • DataStage: Yes.
  25. Job Controlling & Scheduling.
    • Informatica: Yes.
    • DataStage: Yes.
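
To make the command line points above (items 13 and 23) a little more concrete, here is a minimal sketch of how a scheduler script might assemble dsjob calls. It only builds the argument lists and does not run anything. The project name, job name and parameter name are made up for illustration, and the options shown (-run, -jobstatus, -param, -logsum) should be checked against the dsjob documentation for your DataStage release.

```python
import shlex


def dsjob_run_args(project, job, params=None):
    """Build the argument list for starting a job and waiting for its status."""
    # -jobstatus makes dsjob wait for the job to finish, so a scheduler
    # can inspect the exit code afterwards.
    args = ["dsjob", "-run", "-jobstatus"]
    for name, value in (params or {}).items():
        # Each job parameter is passed as its own -param name=value pair.
        args += ["-param", f"{name}={value}"]
    args += [project, job]
    return args


def dsjob_logsum_args(project, job):
    """Build the argument list for printing a summary of the job log."""
    return ["dsjob", "-logsum", project, job]


if __name__ == "__main__":
    # Hypothetical project, job and parameter names, for illustration only.
    run = dsjob_run_args("dw_project", "LoadCustomerDim", {"RunDate": "2005-01-31"})
    print(shlex.join(run))
    print(shlex.join(dsjob_logsum_args("dw_project", "LoadCustomerDim")))
```

On a machine where dsjob is installed, the returned list could be handed to subprocess.run, and the scheduler would then decide what to do based on the return code. An equivalent wrapper could be written around pmcmd on the Informatica side.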
