Comparing Values in FME: Techniques for Feature Filtering

When working with data streams in FME, a common requirement is to compare attribute values, either within a single feature or across different features, and then filter features based on the result so that only the relevant data proceeds through your workflow. This article covers several techniques for doing so, from simple attribute comparisons within one feature to comparisons across feature streams.

One initial approach to compare values within a feature involves using the AttributeManager transformer combined with conditional logic. This method is particularly useful when you need to determine which of two attributes holds a higher value and then use that information to assign a new attribute value or make a filtering decision downstream.

For example, consider a scenario where you have two attributes, Count1 and Count2, representing counts from different sources for the same entity (e.g., fruit types). You want to keep only the highest count value for each entity. The AttributeManager’s conditional value setting allows you to directly compare the values of Count1 and Count2 and set a new attribute, say HighestCount, to the larger of the two.

Alt text: Configuration panel of the AttributeManager transformer in FME showing the conditional value setting. The condition is set to compare Count1 and Count2 attributes. If Count1 is greater than Count2, the output attribute value is set to Count1. If Count2 is greater than Count1, the output attribute value is set to Count2. If they are equal, no change is made.

In this setup, within the AttributeManager, you would configure a new attribute, perhaps named “HighestCount”. By selecting ‘Conditional Value’ in the Attribute Value dropdown, you can define a condition that compares the values. The condition checks whether Count1 is greater than Count2. If it is, HighestCount is set to Count1. Otherwise, HighestCount is set to Count2, which also covers the case where the two counts are equal. The result is that HighestCount always holds the larger of the two count attributes.
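The conditional logic can be sketched outside FME to make the branches explicit. This is a plain-Python illustration, not FME's own expression syntax: features are modeled as dictionaries, and the function name `highest_count` and the sample data are assumptions made for the example.

```python
def highest_count(feature):
    """Return a copy of the feature with HighestCount set to the larger count."""
    result = dict(feature)  # leave the input feature untouched
    if feature["Count1"] > feature["Count2"]:
        result["HighestCount"] = feature["Count1"]
    else:
        # Covers Count2 > Count1 and the tie, where either value is correct.
        result["HighestCount"] = feature["Count2"]
    return result


# Hypothetical sample features, one per fruit type.
features = [
    {"Fruit": "apple", "Count1": 12, "Count2": 7},
    {"Fruit": "banana", "Count1": 3, "Count2": 9},
    {"Fruit": "cherry", "Count1": 5, "Count2": 5},
]

with_highest = [highest_count(f) for f in features]
```

Using a single else branch rather than a second explicit comparison means no feature is left without a HighestCount value when the counts are equal.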

The Tester transformer is the usual choice for filtering on such a comparison, but it operates on one feature at a time. It compares an attribute value within a single feature against a specified value or against another attribute of the same feature. It cannot directly compare values across different features or feature streams.

If your goal is to filter features based on the highest value of an attribute across a set of features, a different approach is needed. For instance, if you want to select only those features that possess the maximum value for a specific attribute, such as ‘ATTRIBUTE_1’, you need to first determine what that maximum value is. This is where the StatisticsCalculator transformer becomes invaluable.

The StatisticsCalculator can process a stream of features and compute various statistics, including the maximum value of a chosen attribute. You would configure the StatisticsCalculator to analyze ‘ATTRIBUTE_1’ and calculate its maximum value. The StatisticsCalculator outputs a Summary feature containing the calculated statistics.

To filter the original features by this maximum, attach the Summary feature's attributes to every original feature with a FeatureMerger. Connect the original features to the Requestor port and the Summary feature to the Supplier port, then make the merge unconditional by joining on the same constant value (for example, 1) on both sides. Every original feature then matches the single Summary feature and receives its attributes, including the maximum ‘ATTRIBUTE_1’ value.

Finally, use a Tester transformer after the FeatureMerger. In the Tester, you can now compare the value of ‘ATTRIBUTE_1’ in each original feature to the ‘maximum ATTRIBUTE_1’ value obtained from the Summary feature (which is now an attribute on each merged feature). Set the test condition to check if ‘ATTRIBUTE_1’ is equal to the maximum value. Features that satisfy this condition (i.e., those with the maximum ‘ATTRIBUTE_1’ value) will pass through the ‘Passed’ port, effectively filtering for features with the highest attribute value.
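The three steps (compute the maximum, merge it onto every feature, test for equality) can be sketched in plain Python to show the data flow. This is an illustration under stated assumptions, not FME code: features are dictionaries, the function name `filter_by_maximum` is made up, and the attribute name `max_value` is a placeholder rather than the name the StatisticsCalculator actually writes.

```python
def filter_by_maximum(features, attribute):
    """Return (passed, failed): features whose attribute equals the set-wide maximum, and the rest."""
    if not features:
        return [], []

    # Step 1 (StatisticsCalculator role): one summary record for the whole set.
    summary = {"max_value": max(f[attribute] for f in features)}

    # Step 2 (FeatureMerger role): copy the summary onto every feature.
    merged = [{**f, **summary} for f in features]

    # Step 3 (Tester role): a per-feature comparison is now possible.
    passed = [f for f in merged if f[attribute] == f["max_value"]]
    failed = [f for f in merged if f[attribute] != f["max_value"]]
    return passed, failed


# Hypothetical sample features.
features = [
    {"id": 1, "ATTRIBUTE_1": 4},
    {"id": 2, "ATTRIBUTE_1": 9},
    {"id": 3, "ATTRIBUTE_1": 9},
    {"id": 4, "ATTRIBUTE_1": 2},
]

passed, failed = filter_by_maximum(features, "ATTRIBUTE_1")
```

Note that an equality test keeps every feature tied for the maximum, so more than one feature can pass.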

Alt text: Screenshot of an FME workspace in a visual dataflow interface. It depicts a data stream entering a FeatureCounter transformer, then branching into two paths. One path connects to a Tester transformer. The other path merges back into the Tester’s path via a FeatureMerger transformer. This setup is designed for comparing feature counts and filtering data based on the comparison.

Another scenario involves comparing the number of features in different data streams. If your objective is to determine which stream has a higher feature count and potentially direct processing based on this comparison, the FeatureCounter transformer is the appropriate tool. The FeatureCounter tallies the features within a stream and outputs the count as an attribute (e.g., _feature_count). You can use multiple FeatureCounter transformers on different streams, and then use techniques similar to those described above (FeatureMerger and Tester or AttributeManager with conditional logic) to compare these count values and control the subsequent flow of your FME workspace.
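The stream-count comparison follows the same pattern: reduce each stream to a single number, then compare the numbers. A minimal plain-Python sketch, with streams modeled as lists and the function name `larger_stream` and its return labels chosen purely for illustration:

```python
def larger_stream(stream_a, stream_b):
    """Return 'A', 'B', or 'equal' depending on which stream holds more features."""
    count_a = len(stream_a)  # stands in for the count attribute of stream A
    count_b = len(stream_b)  # stands in for the count attribute of stream B
    if count_a > count_b:
        return "A"
    if count_b > count_a:
        return "B"
    return "equal"


# Hypothetical sample streams.
stream_a = [{"id": 1}, {"id": 2}, {"id": 3}]
stream_b = [{"id": 10}, {"id": 11}]

decision = larger_stream(stream_a, stream_b)
```

Handling the equal case explicitly avoids silently favoring one stream when the counts match.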

In summary, FME offers a range of transformers to compare the values of attributes and feature counts, enabling sophisticated data filtering and workflow control. Whether you are comparing attributes within a single feature using AttributeManager and Tester, filtering based on maximum values across features with StatisticsCalculator, or comparing feature counts using FeatureCounter, understanding these techniques is crucial for effective data processing in FME. Choosing the right approach depends on the specific nature of your data comparison task and the desired outcome in your FME workflow.
