
Configure products

'Product' refers to the package that is uploaded to the GO Publisher Admin Console, and contains configuration information for your specified dataset, using the settings configured in your GO Publisher Desktop project (.gpp).

You will need to configure a GO Publisher Workflow Product directory, and then upload it as a ZIP file to your GO Publisher Workflow system. It is possible to set up multiple 'products' to allow you to publish from multiple GO Publisher project files.

This page outlines how to configure a product directory, which is referred to as {productdirectory}.

GO Publisher Workflow supports the ASCII character-encoding scheme. All characters used in GO Publisher Workflow's configuration must be ASCII characters.

Product directory configuration

The example-resources directory provided with GO Publisher Workflow contains an example called "my-product.zip" which is based on the Treasure Island training we offer.

We recommend unzipping and copying the my-product directory so that you have a reference to consult when editing your own 'product' directory. The directory can be renamed to something other than 'my-product'.

The my-product.zip file contains:

...
	my-product/product.xml 
		|
		fragment/fragment-config.xml [OPTIONAL]		
		metadata-templates/{template files} [OPTIONAL]		
		projects/my-project.gpp
		schemas/{schema files} [OPTIONAL]   
		schematron/{schematron files} [OPTIONAL]
		xslt/{xslt files} [OPTIONAL]

 

Configure the product.xml document

Each 'product' directory is built around a product.xml document, which defines all of the configuration relating to the 'product'. File locations given in the document, such as the project file, are relative to it.

A product.xml document is required in each 'product' directory. This document must be named 'product.xml' regardless of the product name, and the document must be located in the product's folder.

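The product.xml document must be configured as outlined below. The comments in the example mark each element as [Required] or [Optional]; the section headings that follow use Mandatory and Optional in the same sense.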

 

Example product.xml document
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<!-- Configuration of a GO Publisher Workflow product -->
<gpa:PublishProduct xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:gpa="http://www.snowflakesoftware.com/agent/go-publisher" xmlns:sfa="http://www.snowflakesoftware.com/agent" xmlns:xlink="http://www.w3.org/1999/xlink">
	<!-- [Required] Name of the product. This should be unique in the application server environment -->
	<gpa:name>my-product</gpa:name>

	<!-- [Optional] Default path patterns of job files, if no path pattern is specified in the job request files. -->
	<gpa:defaultPathPatterns>
		<!-- Path pattern of a chunking scheme job. -->
		<gpa:defaultPathPattern type="FILE">/{path}/{chunk}.xml</gpa:defaultPathPattern>
		<!-- Path pattern of a non chunking scheme job. -->
		<gpa:defaultPathPattern type="FILE">/{document}.xml</gpa:defaultPathPattern>
		<!-- Path pattern of a chunking scheme and file size chunked job. -->
		<gpa:defaultPathPattern type="FILE">/{path}/{chunk}_{sequence}.xml</gpa:defaultPathPattern>
		<!-- Path pattern of a non chunking scheme and file size chunked job. -->
		<gpa:defaultPathPattern type="FILE">/{document}_{sequence}.xml</gpa:defaultPathPattern>
		<!-- [Optional] Path patterns need to be defined if additional components are turned on. For example METADATA paths are needed if metadata is turned on. All defined paths should be unique -->
		<!-- <gpa:defaultPathPattern type="METADATA">/{path}/{chunk}-Metadata.xml</gpa:defaultPathPattern> -->
		<!-- <gpa:defaultPathPattern type="METADATA">/{document}-Metadata.xml</gpa:defaultPathPattern> -->
		<!-- <gpa:defaultPathPattern type="METADATA">/{path}/{chunk}_{sequence}-Metadata.xml</gpa:defaultPathPattern> -->
		<!-- <gpa:defaultPathPattern type="METADATA">/{document}_{sequence}-Metadata.xml</gpa:defaultPathPattern> -->
	</gpa:defaultPathPatterns>

	<!-- [Required] The JNDI name identifying the JDBC resource responsible for providing connections to the database server from which data will be published. This database will also contain the configuration tables. -->
	<gpa:publisherJndiDataSource>GOPublisherDS</gpa:publisherJndiDataSource>

	<!-- [Optional] The database schema where the configuration tables reside. If not configured, the 'publisherJndiDataSource' default schema will be used.
	<gpa:configTablesSchemaName>GOPUBLISHER</gpa:configTablesSchemaName> -->

	<!-- [Required] Location of the GO Publisher project file created by the GO Publisher Desktop application relative to this file. -->
	<gpa:publisherProject xlink:href="projects/my-project.gpp"/>

	<!-- [Optional] Meta-data generation step - Enables the creation of meta-data files based on the following templates --><!--
	<gpa:metadataTemplate>
		--><!-- [Required] Meta-data template describing how the locations of the meta-data files created for published file(s) belonging to a publish job are stored within the meta-data file created for the publish job --><!--
		<gpa:embeddedJobFileMetadataTemplate xlink:href="metadata-templates/embed_metadata.xml"/>
		--><!-- [Required] Meta-data template describing information stored for published file(s) created as part of a publish job --><!--
		<gpa:jobFileMetadataTemplate xlink:href="metadata-templates/jobfile_metadata.xml"/>
		--><!-- [Required] Meta-data template describing information stored for a publish job --><!--
		<gpa:jobMetadataTemplate xlink:href="metadata-templates/job_metadata.xml"/>
	</gpa:metadataTemplate> -->

	<!-- [Optional] ISO Schematron validation step - Validates features using the rules defined in the referenced resource --><!--
	<gpa:schematronRules xlink:href="schematron/my-schematron-rules.sch"/> -->

	<!-- [Optional] XML Schema validation step - Validates features using the following XML Schema resource(s) --><!--
	<gpa:schemaValidationResources>
		<gpa:schemaValidationResource xlink:href="schemas/treasure.xsd" />
		--><!-- Any additional XML Schema validation resources can be added here --><!--
	</gpa:schemaValidationResources> -->

	<!-- [Optional] XSLT post-processing step - Transforms published file(s) using the following XSLT transformation resources(s) --><!--
	<gpa:additionalProducts>
		<gpa:xsltProduct>
			<gpa:name>my-additional-product</gpa:name>
			<gpa:defaultXSLTPathPatterns>
				<gpa:chunked>/my-additional-product/{path}/{chunk}.txt</gpa:chunked>
				<gpa:nonChunked>/my-additional-product/{document}.txt</gpa:nonChunked>
			</gpa:defaultXSLTPathPatterns>
			<gpa:xsltSource xlink:href="xslt/my-xslt-transform.xsl"/>
		</gpa:xsltProduct>
		--><!-- Any additional XSLT transformation resources can be added here --><!--
	</gpa:additionalProducts> -->

	<!-- [Optional] Feature fragmentation step - Enables batching of features into fragments. --><!--
	<gpa:fragmentConfig xlink:href="fragment/fragment-config.xml"/> -->
	<!-- [Optional] Export step - Store published file(s) using the following configuration --><!--
	<gpa:externalPublishing>
	    --><!-- [Required] The JNDI name used to identify the database server to export published file(s) to --><!--
		<gpa:jndiName>ExportDS</gpa:jndiName> 
	    --><!-- [Required] The mapping to the database table --><!--
		<gpa:tableMapping name="TABLE_NAME"> 
	        --><!-- [Required] The mapping to the column within the database table where the file identifier is stored as string --><!--
			<gpa:id mapTo="FILE_ID_COLUMN"/> 
	        --><!-- [Required] The mapping to the column within the database table where the file data is stored as a Binary Large Object (BLOB) --><!--
			<gpa:data mapTo="FILE_DATA_COLUMN"/> 
	        --><!-- [Optional] The mapping to the column within the database table where the file path is stored as string --><!--
			<gpa:path mapTo="FILE_PATH_COLUMN"/>
		</gpa:tableMapping>
	</gpa:externalPublishing> -->

	<!-- [Required] Job File compression step. If turned on all output files will be compressed using GZip compression. -->
	<gpa:compressFiles>false</gpa:compressFiles>

	<!-- [Required] ZIP build compression step. If turned on then a ZIP file containing the output files for a job will be generated and available to download via the UI and API. -->
	<gpa:generateArchive>true</gpa:generateArchive>
	
	<!-- [Optional] Empty files step. If set to true, empty output files will be available for download via the UI and the API. If set to false, empty output files will not be available.
	<gpa:downloadEmptyFiles>true</gpa:downloadEmptyFiles> -->
</gpa:PublishProduct>
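
For orientation, the sketch below strips the example down to the elements it marks [Required]. It is assembled from the example above rather than taken from a separate reference, so treat it as a starting point only. All values (my-product, GOPublisherDS, projects/my-project.gpp) are the example's placeholders, and because no default path patterns are declared here, each publish job would have to supply its own.

```xml
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<!-- Minimal sketch: only the elements marked [Required] in the full example -->
<gpa:PublishProduct xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:gpa="http://www.snowflakesoftware.com/agent/go-publisher" xmlns:sfa="http://www.snowflakesoftware.com/agent" xmlns:xlink="http://www.w3.org/1999/xlink">
	<!-- Unique product name referenced by publish jobs -->
	<gpa:name>my-product</gpa:name>
	<!-- JNDI name of the datasource to publish from -->
	<gpa:publisherJndiDataSource>GOPublisherDS</gpa:publisherJndiDataSource>
	<!-- Project file location, relative to product.xml -->
	<gpa:publisherProject xlink:href="projects/my-project.gpp"/>
	<!-- GZip compression of individual output files -->
	<gpa:compressFiles>false</gpa:compressFiles>
	<!-- ZIP archive of all output files for a job -->
	<gpa:generateArchive>true</gpa:generateArchive>
</gpa:PublishProduct>
```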

Product Name (Mandatory)

	<gpa:name>my-product</gpa:name>

Each product requires a unique name, which will be referenced by publish jobs.

Default Path Pattern for XML Documents (Optional)

The product.xml file can contain entries for default path patterns for the published XML documents. These path patterns are used to create the structure of the Zip output file.

Please see Path Patterns for information on how to configure your Zip output file.

The default path pattern entries are not mandatory in the product.xml file, because they can be overridden in the publish job. However, the path patterns must be defined in at least one location. If no appropriate paths for the job are found in the job definition or product.xml, the job will not be accepted into the system.
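
As an illustration, the example product.xml declares one FILE pattern for each of the four job shapes (chunked or not, with or without file-size chunking). The block below is copied from that example; the placeholders {path}, {chunk}, {document} and {sequence} are described on the Path Patterns page.

```xml
<gpa:defaultPathPatterns>
	<!-- Chunking scheme job -->
	<gpa:defaultPathPattern type="FILE">/{path}/{chunk}.xml</gpa:defaultPathPattern>
	<!-- Non chunking scheme job -->
	<gpa:defaultPathPattern type="FILE">/{document}.xml</gpa:defaultPathPattern>
	<!-- Chunking scheme and file size chunked job -->
	<gpa:defaultPathPattern type="FILE">/{path}/{chunk}_{sequence}.xml</gpa:defaultPathPattern>
	<!-- Non chunking scheme and file size chunked job -->
	<gpa:defaultPathPattern type="FILE">/{document}_{sequence}.xml</gpa:defaultPathPattern>
</gpa:defaultPathPatterns>
```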

JNDI Datasource (Mandatory)

	<gpa:publisherJndiDataSource>GOPublisherDS</gpa:publisherJndiDataSource>

To enable connections to the database server containing the data to be published, specify the JNDI name of the datasource you wish to connect with. The JNDI connection itself should already have been set up as described in JNDI Data Source Connections; adding its name to the product.xml document tells GO Publisher Workflow to use it.

JBoss EWS 2.0 (Tomcat 7) Deployments Only

Add the 'java:comp/env/' prefix to the datasource name:

<gpa:publisherJndiDataSource>java:comp/env/GOPublisherDS</gpa:publisherJndiDataSource>
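
The prefix is needed because Tomcat binds container-managed resources under the java:comp/env/ JNDI context. As a rough sketch only, a datasource named GOPublisherDS might be declared in a Tomcat 7 context.xml as follows. The driver class, URL, credentials and pool sizes are placeholder assumptions, and the JNDI Data Source Connections page remains the reference for the actual setup.

```xml
<!-- Hypothetical Tomcat 7 context.xml entry; every connection value is a placeholder -->
<Context>
	<!-- Looked up by the application as java:comp/env/GOPublisherDS -->
	<Resource name="GOPublisherDS"
	          auth="Container"
	          type="javax.sql.DataSource"
	          driverClassName="org.postgresql.Driver"
	          url="jdbc:postgresql://localhost:5432/mydatabase"
	          username="publisher_user"
	          password="changeit"
	          maxActive="20"
	          maxIdle="10"/>
</Context>
```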


Publish Job Configuration Tables Schema (Optional)

	<gpa:configTablesSchemaName>Schema2</gpa:configTablesSchemaName>

Configuration tables are required for certain publishing jobs. For more information on which jobs require these tables and how to set them up, see Publish Job Configuration Tables.

Depending on how your database is set up, you have the option to configure an additional non-default datasource schema that is not included in the GO Publisher project. This is useful if you want to keep the publish job configuration tables in a schema separate from the data.

If you do not uncomment the <gpa:configTablesSchemaName> tag, the default schema of the 'publisherJndiDataSource' will need to contain the Publish Job Configuration Tables.

Please keep in mind that 'public' is the default schema for PostgreSQL databases.

This image illustrates a datasource containing a separate schema for the configuration tables and the configuration of GO Publisher Workflow.

The user configured in the datasource connection must have permission to access the other schema(s) in order for this to work.

Project Name (Mandatory)

The projects folder within the main product directory should contain the GO Publisher project file (.gpp) relevant to the product.

Save your GO Publisher project file (e.g. my-project.gpp) to the projects folder in your product directory:

{productdirectory}\projects\my-project.gpp

You now need to add the project name to the <gpa:publisherProject> tag in the product.xml document so that GO Publisher Workflow will recognise it.

	<gpa:publisherProject xlink:href="projects/my-project.gpp" />

Metadata templates (Optional)

Metadata can be published alongside the data at two levels:

  •  the job level
  •  the individual job file level (as a job can result in multiple published files)

The example-resources product contains default, configurable metadata templates that are ISO compliant. With these ISO templates, you can generate metadata files for each job and published file created.

	my-product/
		|
		metadata-templates/
			|
			embed_metadata.xml
			jobfile_metadata.xml
			job_metadata.xml
...
Metadata File       Definition
job_metadata        Used to publish the job level information.
jobfile_metadata    Used to publish the individual job file level information for each published file.
embed_metadata      Embeds the list of job files for a particular job into the job_metadata.xml document.

If you use the provided metadata templates, you will need to configure the job_metadata.xml and jobfile_metadata.xml documents to include information specific to your organisation and data. Values that are the same for all jobs, such as information about your organisation, can be added directly to these documents. Other information that may change from job to job, such as the validity period of the data ({START_DATE} and {END_DATE}), is set as parameters in the job_metadata.xml and jobfile_metadata.xml documents.

You will need to add the values for the metadata parameters when you Create a Publish Job.

To turn on metadata publishing, include the <gpa:metadataTemplate> element in your product.xml document.

	<gpa:metadataTemplate>
		<gpa:embeddedJobFileMetadataTemplate xlink:href="metadata-templates/embed_metadata.xml"/>
		<gpa:jobFileMetadataTemplate xlink:href="metadata-templates/jobfile_metadata.xml"/>
		<gpa:jobMetadataTemplate xlink:href="metadata-templates/job_metadata.xml"/>
	</gpa:metadataTemplate>
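
The example product.xml also notes that METADATA path patterns are needed when metadata is turned on, and that all defined paths should be unique. A sketch of the corresponding entries, taken from the lines that are commented out in the example and placed inside the existing <gpa:defaultPathPatterns> element:

```xml
<gpa:defaultPathPatterns>
	<!-- The FILE patterns from the example stay in place; they are omitted here for brevity -->
	<gpa:defaultPathPattern type="METADATA">/{path}/{chunk}-Metadata.xml</gpa:defaultPathPattern>
	<gpa:defaultPathPattern type="METADATA">/{document}-Metadata.xml</gpa:defaultPathPattern>
	<gpa:defaultPathPattern type="METADATA">/{path}/{chunk}_{sequence}-Metadata.xml</gpa:defaultPathPattern>
	<gpa:defaultPathPattern type="METADATA">/{document}_{sequence}-Metadata.xml</gpa:defaultPathPattern>
</gpa:defaultPathPatterns>
```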

Thanks, but I don't need Metadata...

 Comment out the Metadata section if you do not need it.

Schematron Rules (Optional)

Schematron is a method of validating published data against a set of business rules held in a Schematron file.

Do I need to use XML Fragmentation?

We recommend you use XML Fragmentation if you are using Schematron validation, though the performance of Schematron Validation without fragmentation will depend on your memory allocation.

The schematron folder within the {productdirectory} should contain a customer supplied Schematron file (.sch). GO Publisher Workflow will validate the published data according to the rules found in this file.

Save your Schematron file (my-schematron-rules.sch) to the schematron folder in the {productdirectory}:

{productdirectory}\schematron\my-schematron-rules.sch

Add the Schematron Rules to the product.xml document so that GO Publisher Workflow will know where to find the schematron file to use for validation.

	<gpa:schematronRules xlink:href="schematron/my-schematron-rules.sch"/>

Schema Validation (Optional)

Schema validation check box

If you wish to run schema validation in GO Publisher Workflow, make sure that the validation check box in the GO Publisher project is unchecked. The check box can be found on the Settings / XML context tab. Validation is then run using the settings below.

 

You can validate published data against the XML schemas referenced in the GO Publisher project file to check that the output is well formed and valid.

GO Publisher Workflow performs XML Schema validation if you uncomment this option in the product.xml document.

 

	<gpa:schemaValidationResources>
		<gpa:schemaValidationResource xlink:href="schemas/treasure.xsd" />
		<gpa:schemaValidationResource xlink:href="schemas/island.xsd" />
		<!-- Additional schema validation resources go here -->
	</gpa:schemaValidationResources>

 

XSLT Transformations (Optional)

You can configure GO Publisher Workflow to perform XSLT transformations at a column or table level (generated during publishing) or at a project-wide level (post-processing on the published XML data).

Pre-requisite:

For all scenarios, you need to have a valid XSLT file.

To use XSLT transformations within GO Publisher Workflow, you need to configure the Workflow product so that it can locate your XSLT file.

To do this, save your XSLT file (e.g. my-transformations.xslt) to the xslt folder in the {productdirectory}:

{productdirectory}\xslt\my-transformations.xslt

Add the XSLT file name to the <gpa:additionalProducts> configuration, as the xlink:href value of the <gpa:xsltSource> tag.

You will also need to configure the XSLT path patterns in the product.xml document. These provide a correctly formatted file name for the output file generated by the project-level application of XSLT.

For more information on configuring the default XSLT path patterns, please see Path Patterns.


	<gpa:additionalProducts>
		<gpa:xsltProduct>
			<gpa:name>my-additional-product</gpa:name>
			<gpa:defaultXSLTPathPatterns>
				<gpa:chunked>/my-additional-product/{path}/{chunk}.txt</gpa:chunked>
				<gpa:nonChunked>/my-additional-product/{document}.txt</gpa:nonChunked>
			</gpa:defaultXSLTPathPatterns>
			<gpa:xsltSource xlink:href="xslt/my-xslt-transform.xsl"/>
		</gpa:xsltProduct>
		<!-- Any additional XSLT transformation resources can be added here -->
	</gpa:additionalProducts>

Applying XSLT options

To use the:

XML Fragmentation (Optional)

The example-resources product contains a default fragmentation configuration file:

	my-product/
		|
		fragment/fragment-config.xml

XML Fragmentation fragments the published output files into smaller files which makes Schematron performance more scalable.

When should I use XML Fragmentation?

We recommend you use XML Fragmentation if you are using Schematron validation, since only those feature types listed in the fragment-config.xml document are fragmented, and are validated using Schematron rules.

A balance needs to be struck between the overhead of creating and processing each fragment, versus validating very large fragments that require large amounts of memory resource during validation.

Navigate to the fragment folder in the {productdirectory} directory:

{productdirectory}\fragment\fragment-config.xml

Open the fragment-config.xml document in an XML editor (or text editor).

Go to the <sf:FragmentConfig> tag and specify the fragmentSize attribute (default: 1). This attribute sets how many features a single fragment contains, and therefore the maximum number of features within a published file that are validated at any one time.

Specify the set of feature types that you are expecting via the <sf:FeatureEntrySet> tag.

<sf:FragmentConfig xmlns:sf="http://www.snowflakesoftware.co.uk/example/feature">
	<!-- The set of feature elements to be considered (as fragments) during the publish.-->
 	<sf:FeatureEntrySet>
 		<!-- Treasure feature element -->
 		<sf:FeatureEntry id="Treasure">
 			<sf:FeatureDetail>
 				<sf:uri>http://www.snowflakesoftware.co.uk/example/treasure</sf:uri>
 				<sf:localName>Treasure</sf:localName>
 			</sf:FeatureDetail>
 		</sf:FeatureEntry>
 	<!-- Additional features can be specified here -->
 	</sf:FeatureEntrySet>
</sf:FragmentConfig>
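
The example file does not set fragmentSize explicitly, so the default of 1 applies. The sketch below assumes the attribute sits directly on the <sf:FragmentConfig> element, as the step above describes. The value 100 and the second Island entry (including its namespace URI) are hypothetical, chosen only to show the shape of a configuration with more than one feature type.

```xml
<!-- fragmentSize="100" is an illustrative value, not a recommendation -->
<sf:FragmentConfig xmlns:sf="http://www.snowflakesoftware.co.uk/example/feature" fragmentSize="100">
	<sf:FeatureEntrySet>
		<!-- Treasure feature element, as in the shipped example -->
		<sf:FeatureEntry id="Treasure">
			<sf:FeatureDetail>
				<sf:uri>http://www.snowflakesoftware.co.uk/example/treasure</sf:uri>
				<sf:localName>Treasure</sf:localName>
			</sf:FeatureDetail>
		</sf:FeatureEntry>
		<!-- Hypothetical second feature type; replace the URI and name with your own -->
		<sf:FeatureEntry id="Island">
			<sf:FeatureDetail>
				<sf:uri>http://www.snowflakesoftware.co.uk/example/island</sf:uri>
				<sf:localName>Island</sf:localName>
			</sf:FeatureDetail>
		</sf:FeatureEntry>
	</sf:FeatureEntrySet>
</sf:FragmentConfig>
```

Larger fragment sizes mean fewer fragments to create and process but more memory per validation pass, which is the balance described earlier in this section.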

You now need to turn on XML fragmentation in the product.xml document so that GO Publisher Workflow knows to use it during Schematron validation.

	<gpa:fragmentConfig xlink:href="fragment/fragment-config.xml" />
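
Because fragmentation is recommended whenever Schematron validation is used, the two elements are likely to be enabled together. A sketch of the relevant part of product.xml, using the file names from the example (the other optional elements may sit between them, as they do in the full example):

```xml
<!-- Schematron validation step -->
<gpa:schematronRules xlink:href="schematron/my-schematron-rules.sch"/>

<!-- Feature fragmentation step, used during Schematron validation -->
<gpa:fragmentConfig xlink:href="fragment/fragment-config.xml"/>
```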

External Database Publishing (Optional)

It is possible to configure GO Publisher Workflow to publish to an external database table by specifying external publishing details in the product.xml document.

You will first need to set up the external database table to publish into.

The table can have the following columns:

Column       Requirement   Definition
File ID      Mandatory     This column will store the file identifier as string
File Data    Mandatory     This column will store the file data as Binary Large Object (BLOB)
File Path    Optional      This column will store the file path as string

You will then need to configure a JNDI connection in your application server.

After you've configured the JNDI connection, you will need to specify the JNDI name as well as the mappings to the table and columns in the product.xml document.

In the product.xml document, edit the following elements in the <gpa:externalPublishing> tag:

	<gpa:externalPublishing>
	    <!-- [Required] The JNDI name used to identify the database server to export published file(s) to -->
		<gpa:jndiName>ExportDS</gpa:jndiName> 
	    <!-- [Required] The mapping to the database table -->
		<gpa:tableMapping name="TABLE_NAME"> 
			<gpa:id mapTo="FILE_ID_COLUMN"/> 
			<gpa:data mapTo="FILE_DATA_COLUMN"/> 
			<gpa:path mapTo="FILE_PATH_COLUMN"/>
		</gpa:tableMapping>
	</gpa:externalPublishing>

File Compression (Mandatory)

    <gpa:compressFiles>true</gpa:compressFiles>

GO Publisher Workflow has the ability to compress the generated output files (excluding XSLT and metadata) before making them available to download. For systems where you wish all files generated from a product to be compressed by default, you can set the job file compression flag in the product to true. This will cause all jobs to generate compressed files. Alternatively, if you do not wish the files to be compressed by default, you can set the compression flag in the product to false.

Compression Flag   Result
true               all files generated from a product will be compressed by default
false              all files generated from a product will NOT be compressed by default

All compressed files are compressed using GZip compression.

This setting can be overridden in the job configuration.

Archive Generation (Mandatory)

    <gpa:generateArchive>true</gpa:generateArchive>

In addition to being able to compress individual files, GO Publisher Workflow can compress all of the output files for a job (including XSLT and metadata) into a ZIP archive. For systems where you would like an archive created for all jobs that reference a particular product by default, you can set the generate archive flag in the product to true. Alternatively, if you do not want an archive to be created by default, you can set the generate archive flag in the product to false.

This setting can be overridden in the job configuration.

Skip Job Validation (Optional / Performance Tune)

GO Publisher Workflow validates each job file as it is uploaded into the system, to ensure that it is well-formed and valid XML. Each job is also checked against system requirements, for example that a Product referenced by the job file is actually present in the GO Publisher Workflow system. This protects system integrity by preventing incorrect job files from entering the system. This feature is optional. Where job files are created and uploaded by an integrated, external system that performs its own validation, it is advised to turn OFF job validation.

Performance benefit:

Turning OFF job validation speeds up job creation. However, system integrity will need to be monitored if job files are not being submitted by integrated systems.

To turn OFF job validation, add the <gpa:skipAllJobValidation> tag to the gopublisher.xml file within the app server configuration.

	..
	<!-- Skip job validation on upload -->
	<gpa:skipAllJobValidation>true</gpa:skipAllJobValidation>

Start Job on Creation (Optional)

Performance benefit:

Automatically starting a job once it has been created halves the number of API calls you need to make when using the CREATE and START job API calls.

Use the <startJobOnCreate> tag within the gopublisher.xml file to start all jobs automatically after they have been created (uploaded to the system).

	..
	<!-- Start jobs on create -->
	<gpa:startJobOnCreate>true</gpa:startJobOnCreate>

Download Empty Files (Optional)

GO Publisher Workflow can limit the availability of empty output files created from a job.

Use the <downloadEmptyFiles> tag within the gopublisher.xml file to make empty output files from a job available for download after they have finished processing. This feature is optional.

 

	..
	<!-- Download Empty Files -->
	<gpa:downloadEmptyFiles>true</gpa:downloadEmptyFiles>


Create a product ZIP file

You should now have a directory structured similarly to the one below, and a product.xml document tailored to your product:

...
	my-product/product.xml 
		|
		fragment/fragment-config.xml [OPTIONAL]		
		metadata-templates/{template files} [OPTIONAL]		
		projects/my-project.gpp
		schemas/{schema files} [OPTIONAL]   
		schematron/{schematron files} [OPTIONAL]
		xslt/{xslt files} [OPTIONAL]

You will now need to create a ZIP file from your product directory.

Multiple Products

 If you wish to publish from multiple GO Publisher project files, you can configure additional products. The maximum number of products that you can deploy is stipulated in your licence agreement.

Further Reading

Now that you have configured them, you can Manage products using the Admin Console or API.
