Selection: Filtering the Input of a Publishing Job
Selection affects the input of GO Publisher Workflow.
It is possible to limit the data that is retrieved from the database using certain criteria. These criteria can be specified by the following filter methods:
Each one is explained in this section, as well as how to configure a job to use the filter.
There are three ways to filter the published data based on geographic selection:
Geographic selection using Extent ID
Geographic selection using Extent Geometry
Geographic selection using a Bounding Box
Either Extent IDs, Extent Geometries, or Bounding Boxes can be specified in the job request. Specifying combinations will result in an exception.
Geographic selection using an Extent ID requires a GP_GEOGRAPHIC_SELECTION table to exist in the source database
Geographic Selection via Extent ID filters the data retrieved for a job based on geometries mapped to a unique ID.
To filter the input data based on multiple geometries, you can group geometries by creating a geographic selection table. These 'groups' can be referenced in the selection.
The below table, 'GP_GEOGRAPHIC_SELECTION' in the publishing database, contains a list of geometries mapped to a unique ID.
Using this table, it is possible to specify a filter which will use a geometry referenced by ID. This will limit the results from the dataset based on geometry.
|SouthamptonCity||SDO_GEOMETRY(2003, 81989, NULL, SDO_ELEM_INFO_ARRAY(1, 1003, 1), SDO_ORDINATE_ARRAY (442700, 109200, 442900, 110000, 443300, 111200, 443200, 111900, 444000, 112500, 443900, 113000, 442900, 113100, 441100, 113100, 440300, 111400, 441400, 110500, 442700, 109200))|
|SouthamptonCommon||SDO_GEOMETRY(2003, 81989, NULL, SDO_ELEM_INFO_ARRAY(1, 1003, 1), SDO_ORDINATE_ARRAY(441900, 113700, 441900, 114300, 442000, 114900, 442300, 115500, 441800, 115300, 440900, 115000, 441100, 114100, 441200, 113900, 441100, 113400, 441500, 113700, 441700, 113700, 441900, 113700))|
This table and its contents could be used to create a job which would limit the selected elements to those located within Southampton City.
Alternatively, the geometry fragment can be directly specified directly in the job request. This bypasses the process of looking up the Extent ID in the database, and it does not require any configuration tables.
This fragment of geometry is then used to build the filter for geographic selection.
Publish jobs with geographic selection using extent geometry are validated against GML 3.2.1 schemas. The 'gml:id' attribute in the Polygon element must therefore be present
, and the namespace for GML 3.2 (
") must be specified in the XML declaration or in the Polygon element.
Similar to geographic selection with an extent geometry, a bounding box can be directly specified in a job using a gml:Envelope.
The GML 3.2.1 schemas no longer treat gml:Envelopes as geometries, so they are handled separately in GO Publisher Workflow. Envelopes do not require a gml:id.
Feature ID Selection
Feature selection requires a GP_FEATURE_SELECTION table to exist in the source database.
Feature ID selection filters the input data for a job based on a feature ID.
To filter the input data based on multiple feature IDs, you can group the feature IDs by creating a feature selection table. These 'groups' can be referenced in the selection.
The below table shows features in TABLE1 with an airport_id of 8535 and 8536 as part of a group called 'Airport'. When a publishing job selection filter is specified with 'Airport' as the key, 3535 and 8536 will be the only two features used in processing.
Attribute selection can be used to filter the data returned by the database by an attribute value. This will select records from any mapped table where a column with the specified name is present and where the value of the column is equal to the value(s) provided.
If you are modelling aviation data and have 4 tables: RUNWAY, APRON, BUILDING, TREE. All these tables have an AIRPORTID column which signifies the airport that the record belongs to.
In this scenario you can use attribute selection to restrict the incoming data to specified airports. The XML below shows how to configure a job to only select records which have the AIRPORTID of SOU
Temporal selection filters the data retrieved from the source database tables based on date columns.
You will need to ensure that the Date for temporal queries (see Edit Table Properties) is set in your GO Publisher project for each table from which you wish to publish data. The 'Date for temporal queries' field indicates which column in the table contains date attributes.
As shown in the below example, you can define a date range in your temporal selection job. When you process the job, the filter will be applied to the column specified in the Date for temporal queries (see Edit Table Properties) in your GO Publisher project. Only data containing dates which fall within the range indicated in your publishing job will be returned.
The below example job will publish objects with data attributes between 01 Jan 2012 and 01 Feb 2012. Features with date attributes equaling 01 Jan 2012 and 01 Feb 2012 will be included.
Selection by parameter
The schema translation mapping model of GO Publisher allows a pre-selection of data using custom SQL filters. You can apply a custom SQL filter to your project in GO Publisher Desktop using the SQL Filters Tab, and specify selection parameter names.
When a parameter name is stated within curly brackets, you can then specify a value for selection in the GO Publisher Workflow publishing job.
Snowflake's training "Treasure Island" data contains search instructions to find treasure on a fictional island.
If the search instructions had distance properties with different units of measurement (such as meters, #paces, and yards), you could choose to only publish the distance data stored in '#paces' using a selection by parameter publishing job in GO Publisher Workflow.
The first step is to specify the unit of measure parameter name using a custom SQL filter in your GO Publisher project. In this example, the WHERE clause defines the parameter name as 'UNIT' in curly brackets.
Example of Custom SQL for Oracle
You can then state the parameter name and the value you'd like to select in the publishing job. In this example, we just want to publish distance data that has '#paces' as the unit of measurement.
In the <gpa:parameter> tag:
- <gpa:name> will be UNIT (as indicated in the customer SQL in the project)
- <gpa:value> will be #paces
The resulting SQL filter applied to the source database by GO Publisher Workflow would be:
The data published by GO Publisher Workflow will only contain the search instruction distance properties with a '#paces' unit of measure.
You can use parameter filters on multiple columns at once.
Interested in splitting your published data into multiple files? See Data Chunking for more information.