Tuesday, January 25, 2011

Address Locator Style: Introduction

Vered asked me for help in creating new address locator styles in ArcGIS 10.

 

Have you ever taken a Programming Languages course? I have. An Address Locator Style is just a formal representation of the language of the Address Locator. It’s like the lexical and syntactic grammar of C#, which defines how to write a comment (either with // or with /* ... */).

 

What is geocoding?

Transforming a description of a location (coordinates or address) to a feature on the map (“geographic features with attributes”).

What is an Address Locator Style?

The interface for the Address Locator: it defines the Address Locator's parameters and return value.

What is an Address Locator?

“Main tool for geocoding in ArcGIS”. We use the Address Locator as a geocode service on our ArcGIS Server (our Silverlight application uses it to search for an address).

The Address Locator uses a scoring system to compare the input text against the candidate output features:

ESRIAddress-scoring

(Not mine, ESRI’s)

The scoring is done by the Locator Styles, which are located in:

Desktop:  C:\Program Files (x86)\ArcGIS\Desktop10.0\Locators
Server:    C:\Program Files (x86)\ArcGIS\Server10.0\Locators
Engine:    C:\Program Files (x86)\ArcGIS\Engine10.0\Locators

 

Reading the Locator Style files:

I am going through this guide by ESRI.

(The XML viewed here is a copy of USAddress.lot.xml)

The Address Locator Style name and description:

  <?xml version="1.0" encoding="utf-8"?>
  <?xml-stylesheet type="text/xsl" href="LocatorStyle.xslt"?>
  <locators xsi:noNamespaceSchemaLocation="LocatorStyle.xsd" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
    <locator>
      <name>Dll Shepherd Address</name>
      <desc>Locator style for Dll Shepherd addresses</desc>
      <version>10</version>
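
As a quick sanity check after copying a style, the header fields can be read back with a few lines of Python. This is only a sketch: describe_style is my own helper name, not part of any ESRI tooling, and the embedded XML is the header shown above without the XML declaration and with the closing tags added so that it parses.

```python
import xml.etree.ElementTree as ET

# The header from above, closed so that it is well-formed XML.
STYLE_XML = """<locators xsi:noNamespaceSchemaLocation="LocatorStyle.xsd" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <locator>
    <name>Dll Shepherd Address</name>
    <desc>Locator style for Dll Shepherd addresses</desc>
    <version>10</version>
  </locator>
</locators>"""


def describe_style(xml_text):
    """Return (name, description, version) for each <locator> in a style file."""
    root = ET.fromstring(xml_text)
    return [
        (loc.findtext("name"), loc.findtext("desc"), loc.findtext("version"))
        for loc in root.iter("locator")
    ]


styles = describe_style(STYLE_XML)
for name, desc, version in styles:
    print(f"{name} (v{version}): {desc}")
```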

Inputs

Defines “how the Geocode Addresses geoprocessing tool appears and functions for the style”:

  <inputs>
    <default_input name="Single Line Input" length="100" grammar_ref="Location">
      <caption xml:lang="en">Full Address</caption>
      <std_elt standard="FGDC">CompleteAddress</std_elt>
      <recognized_name>Address</recognized_name>
      <recognized_name>Addr</recognized_name>
      <recognized_name>Address_1</recognized_name>
      <recognized_name>Customer_Address</recognized_name>
    </default_input>
    <input name="Street" length="100">
      <caption xml:lang="en">Street or Intersection</caption>
      <recognized_name>Address</recognized_name>
      <recognized_name>Addr</recognized_name>
      <recognized_name>Address_1</recognized_name>
      <recognized_name>Customer_Address</recognized_name>
    </input>
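
My reading of the recognized_name elements is that they list the column names the tool picks automatically for an input when it finds one of them in the address table. Assuming that is right, the matching could be sketched like this in Python. The pick_column helper and the sample column list are made up for illustration, and the case-insensitive comparison is my assumption.

```python
import xml.etree.ElementTree as ET

# The "Street" input from above, wrapped in a closed <inputs> element.
INPUTS_XML = """<inputs>
  <input name="Street" length="100">
    <caption xml:lang="en">Street or Intersection</caption>
    <recognized_name>Address</recognized_name>
    <recognized_name>Addr</recognized_name>
    <recognized_name>Address_1</recognized_name>
    <recognized_name>Customer_Address</recognized_name>
  </input>
</inputs>"""


def pick_column(input_elt, table_columns):
    """Return the table column matching the first recognized name found, or None."""
    by_lower = {col.lower(): col for col in table_columns}
    for rec in input_elt.findall("recognized_name"):
        match = by_lower.get(rec.text.strip().lower())
        if match is not None:
            return match
    return None


street_input = ET.fromstring(INPUTS_XML).find("input")
columns = ["OBJECTID", "CUSTOMER_ADDRESS", "CITY"]  # hypothetical table columns
print(pick_column(street_input, columns))
```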

Address Locator Style – Nice view:

Open the XML file in a web browser and it is displayed in a nicer format:

address-locator-style-nicer-xml-view

Clicking on SpatialOperator (to the right of Location) moves the view to the definition of SpatialOperator in the XML view (or at least it is supposed to, but it doesn’t work if the tree isn’t expanded at that element).

 

Grammar:

Defines “address elements known to the locator and their possible usage in an address”.

Top level elements:
  <section desc="Top level elements">

address-locator-style-top-level-elements

In XML:

  <def name="Location">
    <alt>
      <elt ref="FullAddress" separator_list=".,;" />
    </alt>
    <alt ref="Coordinates"/>
    <alt ref="SpatialOperator"/>
  </def>

The element being defined is on the left, and its component elements are on the right, starting after a colon (':'). A pipe ('|') separates the components and a semicolon (';') ends the set of options. The component elements state the options for the element.

For example: Location is the element, FullAddress is the first option and SpatialOperator is the last option. Location can be either FullAddress, Coordinates or SpatialOperator.
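
To convince myself of the mapping between the XML and the colon/pipe/semicolon notation, here is a small Python sketch that rebuilds the notation from the Location definition. The to_notation helper is my own, and the exact spacing of the nice view may differ; it only handles the two forms of alt that appear in this definition.

```python
import xml.etree.ElementTree as ET

LOCATION_XML = """<def name="Location">
  <alt>
    <elt ref="FullAddress" separator_list=".,;" />
  </alt>
  <alt ref="Coordinates"/>
  <alt ref="SpatialOperator"/>
</def>"""


def to_notation(def_elt):
    """Render a <def> as 'Name : option | option ;'."""
    options = []
    for alt in def_elt.findall("alt"):
        if alt.get("ref"):
            # Short form: <alt ref="X"/> is an option with a single component.
            options.append(alt.get("ref"))
        else:
            # Long form: the option is the sequence of its <elt> children.
            options.append(" ".join(e.get("ref") for e in alt.findall("elt")))
    return f"{def_elt.get('name')} : {' | '.join(options)} ;"


notation = to_notation(ET.fromstring(LOCATION_XML))
print(notation)
```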

Superscripts are scores for the Address Locator:

address-locator-style-scores

In XML:

  <def name="NormalAddress">
    <alt>
      <elt ref="House" weight="20"/>
      <elt ref="FullStreetName" weight="60" pre_separator="required"/>
      <elt ref="OptionalUnit" weight="0"/>
    </alt>
  </def>
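
How the engine combines these weights is not spelled out here, so the following Python sketch only reads them back and shows each component's share of the total. Treating the share as the component's relative importance is my assumption, not ESRI's documented formula, and weight_shares is a made-up helper.

```python
import xml.etree.ElementTree as ET

NORMAL_ADDRESS_XML = """<def name="NormalAddress">
  <alt>
    <elt ref="House" weight="20"/>
    <elt ref="FullStreetName" weight="60" pre_separator="required"/>
    <elt ref="OptionalUnit" weight="0"/>
  </alt>
</def>"""


def weight_shares(def_elt):
    """Map each component ref to its weight as a fraction of the total weight."""
    weights = {e.get("ref"): int(e.get("weight", "0")) for e in def_elt.iter("elt")}
    total = sum(weights.values())
    return {ref: (w / total if total else 0.0) for ref, w in weights.items()}


shares = weight_shares(ET.fromstring(NORMAL_ADDRESS_XML))
for ref, share in shares.items():
    print(f"{ref}: {share:.0%}")
```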

Objects in braces show how the engine uses a function:

Address-Locator-Style-function-engine-usage

In XML:

  <def name="Postal">
    <alt>
      <elt ref="GenZIP" search_context="ZIPSearch"/>
      <result>
        <search_value ref="ZIPSearch"/>
      </result>
    </alt>
  </def>

Commentary is the regular text after the braces:

Address-Locator-Style-commentry

For example: format: { ['Distance'] ['Units'] " bearing " ['Bearing'] " from " ['Match_addr'] }

In XML:

  <def name="DirectedOffset">
    <alt>
      <elt ref="positiveRealNumber" weight="0"/>
      <elt ref="LinearUnits" weight="0"/>
      <elt ref="Bearing" weight="0"/>
      <elt ref="From" weight="0"/>
      <elt ref="Location"/>
      <result tag="SpatialOperator">
        <method ref="directed_offset">
          <parameter>
            <component_value component="_1"/>
          </parameter>
          <parameter>
            <component_value component="_2"/>
          </parameter>
          <parameter>
            <component_value component="_3"/>
          </parameter>
          <parameter>
            <component_value component="_5"/>
          </parameter>
        </method>
        <format>
          <component_value component="Distance"/>
          <component_value component="Units" pre_separator=" "/>
          <value xml:space="preserve"> bearing </value>
          <component_value component="Bearing"/>
          <value xml:space="preserve"> from </value>
          <component_value component="Match_addr" record="1"/>
        </format>
      </result>
    </alt>
  </def>
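
The format element reads like a small string template, so here is a Python sketch of how I understand it: literal value text is copied as-is and each component_value is looked up by name. The render helper and the sample values are made up, and writing pre_separator only before a non-empty value is my assumption.

```python
import xml.etree.ElementTree as ET

FORMAT_XML = """<format>
  <component_value component="Distance"/>
  <component_value component="Units" pre_separator=" "/>
  <value xml:space="preserve"> bearing </value>
  <component_value component="Bearing"/>
  <value xml:space="preserve"> from </value>
  <component_value component="Match_addr" record="1"/>
</format>"""


def render(format_elt, components):
    """Concatenate literal <value> text and looked-up <component_value> items."""
    out = []
    for node in format_elt:
        if node.tag == "value":
            out.append(node.text or "")
        elif node.tag == "component_value":
            text = components.get(node.get("component"), "")
            if text:
                # Assumption: the separator is only written before a non-empty value.
                out.append(node.get("pre_separator", "") + text)
    return "".join(out)


# Hypothetical component values, for illustration only.
sample = {"Distance": "100", "Units": "meters", "Bearing": "45",
          "Match_addr": "380 New York St, Redlands, CA 92373"}
print(render(ET.fromstring(FORMAT_XML), sample))
```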

Shown in grey are the fallback behaviors:

Address-Locator-Style-behaviors-fallback-situation

They are relevant only for the component elements they are attached to. For example, here they are relevant only for unitAndNumber; if there were another component element, they wouldn’t be relevant for it.

In XML:

  <def name="OptionalUnit">
    <alt/>
    <alt fallback_score="75">
      <elt ref="unitAndNumber"/>
    </alt>
  </def>

Separator hints:

Address-Locator-Style-separator-hints

(Not mine, ESRI’s)

Usage:

Address-Locator-Style-FullStreetName

TODO: Still can’t understand this voodoo

Interpret the above graphic as meaning that a FullStreetName may be made up as
■ prefix + pre_type_no_sthwy + StName + suftype + suffix entirely separated, or
■ Prefix + pre_type_sthwy + OptHyphen + StName + suftype + suffix, where StName may be optionally concatenated with a preceding hyphen after pre_type_sthwy
The first form might be like "North Avenue Walnut Road East," and the second like
"North Road Number 6 West" or "I-10."

In XML:

  <def name="FullStreetName">
    <alt>
      <elt ref="prefix" weight="5" stan_weight="11" pre_separator="required" post_separator="required"/>
      <elt ref="pre_type_no_sthwy" match_as="pretype" weight="6" stan_weight="1000" />
      <elt ref="StName" weight="70" stan_weight="10" pre_separator="required" post_separator="required"/>
      <elt ref="suftype" weight="7" stan_weight="1000"/>
      <elt ref="suffix" weight="5" stan_weight="15" pre_separator="required"/>
    </alt>
    <alt fallback="true">
      <elt ref="prefix" weight="5" stan_weight="11" pre_separator="required" post_separator="required"/>
      <elt ref="pre_type_sthwy" match_as="pretype" weight="6" stan_weight="2000" />
      <elt ref="OptHyphen" weight="0"/>
      <elt ref="StName" weight="70" stan_weight="10" pre_separator="optional" post_separator="required"/>
      <elt ref="suftype" weight="7" stan_weight="1000"/>
      <elt ref="suffix" weight="5" stan_weight="15" pre_separator="required"/>
    </alt>
  </def>

Aliases:

“Aliases are commonly recognized values for elements and may be sets of alternate literal values”:

Address-Locator-Style-city-aliases

In XML:

  <alias_list name="CityAliases">
    <alias_def>
      <alt>acres</alt>
      <alt>acr</alt>
    </alias_def>
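
One way to picture an alias list is as groups of interchangeable words. The Python sketch below builds such groups from the excerpt above (closed so that it parses) and checks whether two words fall in the same group. The alias_groups and same_alias helpers are hypothetical, ignoring case is my assumption, and alt elements that use ref instead of literal text are simply skipped.

```python
import xml.etree.ElementTree as ET

ALIASES_XML = """<alias_list name="CityAliases">
  <alias_def>
    <alt>acres</alt>
    <alt>acr</alt>
  </alias_def>
</alias_list>"""


def alias_groups(alias_list_elt):
    """Map every literal alias to the full set of values in its <alias_def>."""
    groups = {}
    for alias_def in alias_list_elt.findall("alias_def"):
        values = frozenset(
            a.text.strip().lower() for a in alias_def.findall("alt") if a.text
        )
        for value in values:
            groups[value] = values
    return groups


def same_alias(groups, a, b):
    """True if both words are identical or listed in the same <alias_def>."""
    a, b = a.lower(), b.lower()
    return a == b or b in groups.get(a, frozenset())


groups = alias_groups(ET.fromstring(ALIASES_XML))
print(same_alias(groups, "Acres", "ACR"))
print(same_alias(groups, "acres", "beach"))
```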

Values starting with an underscore ('_') are defined in the prefix/suffix section.

Address-Locator-Style-prefix-suffix

In XML:

  <alias_def>
    <alt ref="_bch"/>
  </alias_def>

 

Mapping Schemas

“Defines how reference data logically relates to grammar elements”.

Address-Locator-Style-mapping-schema

Names are not used and might be removed in the next version (at least according to ESRI).

In XML (partly):

  <mapping_schema name="SingleAddress" geom_type="point">
    <desc>Single house addresses (points)</desc>
    <desc>For point address datasets.</desc>
    <fields>
      <field name="Shape" type="geometry">
        <desc>Shape field</desc>
      </field>
      <field name="ID">
        <desc>Unique ID field</desc>
      </field>
      <field name="House" grammar_ref="House">
        <desc>House number</desc>
        <preferred_name>HN</preferred_name>
        <preferred_name>ADDRESS</preferred_name>
      </field>

TODO: unknown weird field definition:

  <field name="StreetName" grammar_ref="StName">
    <desc>Street Name</desc>
    <preferred_name>STNAME</preferred_name>
    <preferred_name>STREET_NAME</preferred_name>
    <scoring_method ref="calculate_score">
      <init_properties>
        <prop name="CharacterTable">scoring</prop>
      </init_properties>
      <parameter>
        <input_value />
      </parameter>
      <parameter>
        <field_value ref="StreetName" />
      </parameter>
    </scoring_method>
  </field>

A non-required field (whether a field is required or not doesn’t show in the nice view):

  <field name="Rank" required="false">
    <desc>Specifies rank (primary or secondary) for alternate records of the same geometry</desc>
    <preferred_name>Rank</preferred_name>
  </field>

The default filter:

  <selection_clause dbms="default">
    <field_ref ref="StreetName"/> &lt;&gt; '' AND
    <field_ref ref="StreetName"/> &lt;&gt; ' ' AND
    UPPER(<field_ref ref="StreetName"/>) &lt;&gt; 'UNNAMED' AND
    UPPER(<field_ref ref="StreetName"/>) &lt;&gt; 'UNNAMED STREET'
  </selection_clause>
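
The selection clause is a SQL WHERE fragment with placeholders for field names. As a rough Python sketch of what it expands to, the code below substitutes a column name for every field_ref. The to_where_clause helper is my own, and STREET_NAME is just one of the preferred names listed for StreetName above. The whitespace collapsing is only safe here because no string literal in this clause contains more than one consecutive space.

```python
import xml.etree.ElementTree as ET

CLAUSE_XML = """<selection_clause dbms="default">
  <field_ref ref="StreetName"/> &lt;&gt; '' AND
  <field_ref ref="StreetName"/> &lt;&gt; ' ' AND
  UPPER(<field_ref ref="StreetName"/>) &lt;&gt; 'UNNAMED' AND
  UPPER(<field_ref ref="StreetName"/>) &lt;&gt; 'UNNAMED STREET'
</selection_clause>"""


def to_where_clause(clause_elt, column_for):
    """Replace each <field_ref> with its column name and flatten to one line."""
    parts = [clause_elt.text or ""]
    for ref in clause_elt.findall("field_ref"):
        parts.append(column_for[ref.get("ref")])
        # The SQL text following a <field_ref> is stored as the element's tail.
        parts.append(ref.tail or "")
    # Collapse the line breaks and indentation of the XML into single spaces.
    return " ".join("".join(parts).split())


where = to_where_clause(ET.fromstring(CLAUSE_XML), {"StreetName": "STREET_NAME"})
print(where)
```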

Defining the index structure on the data:

  <index>
    <dictionary ref="House"/>
    <dictionary ref="PreDir"/>
    <dictionary ref="PreType"/>

Defines a lookup structure between the fields specified, making search faster:

  <relationship>
    <field_ref ref="StreetName"/>
    <field_ref ref="City"/>
    <field_ref ref="State"/>
  </relationship>
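
I don't know how ESRI stores a relationship internally, but conceptually it behaves like a lookup table keyed by the combined field values. The Python sketch below is only an analogy under that assumption: build_relationship and the reference records are made up, and the upper-casing of keys is my own choice.

```python
from collections import defaultdict


def build_relationship(records, fields):
    """Index record IDs by the tuple of values in the related fields."""
    index = defaultdict(list)
    for rec in records:
        key = tuple(rec[f].upper() for f in fields)
        index[key].append(rec["ID"])
    return index


# Made-up reference records, for illustration only.
records = [
    {"ID": 1, "StreetName": "Main", "City": "Springfield", "State": "IL"},
    {"ID": 2, "StreetName": "Main", "City": "Springfield", "State": "MA"},
    {"ID": 3, "StreetName": "Oak", "City": "Springfield", "State": "IL"},
]

index = build_relationship(records, ["StreetName", "City", "State"])
# One dictionary lookup instead of scanning every record.
print(index[("MAIN", "SPRINGFIELD", "IL")])
```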

Defines a reverse relationship, so that from the ID we can get the ZIP:

  <reverse_relationship>
    <field_ref ref="ZIP"/>
    <field_ref ref="ID"/>
    <field_ref ref="User_fld"/>
  </reverse_relationship>

TODO: WHY bother…

  <properties>
    <prop name="StorageSegmentSizeKB" type="Int">128</prop>
    <prop name="supportsEmptyHouseNumber" type="Boolean">false</prop>
    <prop name="supportsIntersections">false</prop>
    <prop name="StoreStandardizedRefData">true</prop>
    <prop_list name="BatchPresortInputs">
      <value>State</value>
      <value>ZIP</value>
      <value>City</value>
    </prop_list>
  </properties>

TODO:

  <outputs>
    <output component="Shape" type="geometry"/>
    <output component="Status" candidate_mode="false" length="1"/>
    <output component="Score" type="float" decimal_digits="2"/>
    <output component="Match_addr" length="120"/>
    <output ref="House" batch_mode="false" length="12"/>
    <output ref="PreDir" batch_mode="false" length="6"/>
    <output ref="PreType" batch_mode="false" length="6"/>
    <output ref="StreetName" batch_mode="false" length="32"/>
    <output ref="SufType" batch_mode="false" length="6"/>
    <output ref="SufDir" batch_mode="false" length="6"/>
    <output ref="City" batch_mode="false" length="20"/>
    <output ref="State" batch_mode="false" length="2"/>
    <output ref="ZIP" batch_mode="false" length="5"/>
    <output name="Ref_ID" ref="ID" type="fromdata" selector="WriteReferenceIDField"/>
    <output name="X" component="X" type="float" selector="WriteXYCoordFields"/>
    <output name="Y" component="Y" type="float" selector="WriteXYCoordFields"/>
    <output ref="User_fld" type="string" length="120" selector="WriteAdditionalOutputFields" />
    <!--output name="Z" component="Z" type="float" selector="WriteXYCoordFields"/-->
    <output component="Addr_type" length="20"/>
    <output component="Match_time" type="float" selector="ShowElapsedTime"/>
  </outputs>

Reverse geocoding (WTF?):

  <reverse_geocoding>
    <reverse_geocoding_method name="Address">
      <outputs>
        <output component="Shape" type="geometry" />
        <output name="Street" length="120" >
          <format>
            <field_value ref="House" nulls="0"/>
            <field_value ref="PreDir" pre_separator=" "/>
            <field_value ref="PreType" pre_separator=" "/>
            <field_value ref="StreetName" pre_separator=" "/>
            <field_value ref="SufType" pre_separator=" "/>
            <field_value ref="SufDir" pre_separator=" "/>
          </format>
        </output>
        <output ref="City" length="20"/>
        <output ref="State" length="2"/>
        <output ref="ZIP" length="5"/>
        <output component="Match_time" type="float" selector="ShowElapsedTime"/>
      </outputs>
    </reverse_geocoding_method>
    <reverse_geocoding_method name="MGRS" use_spatial_search="false">
      <method ref="reverse_MGRS"/>
      <outputs>
        <output component="Shape" type="geometry" />
        <output component="MGRS" length="12" />
        <output component="Match_time" type="float" selector="ShowElapsedTime"/>
      </outputs>
    </reverse_geocoding_method>
  </reverse_geocoding>

standardization –> build: “The build section specifies how street names are parsed into the locator index when you create a locator.”

  <standardization>
    <build>
      <use_standard_values>false</use_standard_values>
      <format>
        <field_value ref="PreDir" />
        <field_value ref="PreType" pre_separator=" "/>
        <field_value ref="StreetName" pre_separator=" "/>
        <field_value ref="SufType" pre_separator=" "/>
        <field_value ref="SufDir" pre_separator=" "/>
      </format>
      <grammar>
        <alt>
          <elt ref="FullStreetNameForStd" />
        </alt>
      </grammar>
      <outputs>
        <output name="PreDir" component="prefix" length="12"/>
        <output name="PreType" component="pretype" length="40"/>
        <output name="StreetName" component="StName" length="60"/>
        <output name="SufType" component="suftype" length="40"/>
        <output name="SufDir" component="suffix" length="12"/>
      </outputs>
    </build>

standardization –> tool:  “The tool section controls how the Standardize Addresses tool operates.”

  <tool>
    <use_standard_values>true</use_standard_values>
    <grammar>
      <alt>
        <elt ref="_HouseNum" match_as="House" stan_weight="11"/>
        <elt ref="FullStreetNameForStd" />
        <elt ref="OptionalUnit" stan_weight="1"/>
      </alt>
      <alt>
        <elt ref="OptionalUnit" stan_weight="1"/>
        <elt ref="_HouseNum" match_as="House" stan_weight="11"/>
        <elt ref="FullStreetNameForStd" />
      </alt>
    </grammar>
    <outputs>
      <output name="ADDR_HN" alias="HouseNum" component="House" length="12"/>
      <output name="ADDR_PD" alias="PreDir" component="prefix" length="12"/>
      <output name="ADDR_PT" alias="PreType" component="pretype" length="40"/>
      <output name="ADDR_SN" alias="StreetName" component="StName" length="60"/>
      <output name="ADDR_ST" alias="SufType" component="suftype" length="40"/>
      <output name="ADDR_SD" alias="SufDir" component="suffix" length="12"/>
    </outputs>
  </tool>

Reference Data Styles

Address-Locator-Style-Reference-Data-Styles

ESRI-AltStreet-and-Data-Source-elements

(Not mine, ESRI’s)

In XML (partly):

  <ref_data_style>
    <name>Single House</name>
    <desc>US Single House Addresses</desc>
    <table_roles>
      <table_role name="Primary">
        <display_name>Primary Table</display_name>
        <desc>Address feature class</desc>
        <field_roles>
          <field_role name="Primary.Shape" is_geometry="true">
            <display_name>Geometry</display_name>
            <preferred_name>Shape</preferred_name>
            <preferred_name>Feature</preferred_name>
          </field_role>

    </field_roles>
  </table_role>
  <table_role name="AltStreet" required="false">
    <display_name>Alternate Name Table</display_name>
    <desc>Alternate Streets (Optional)</desc>
    <field_roles>
      <field_role name="AltStreet.ID" required="true">
        <display_name>JoinID</display_name>
        <preferred_name>ALTNAME_ID</preferred_name>
        <preferred_name>JOIN_ID</preferred_name>
        <preferred_name>JOINID</preferred_name>
        <preferred_name>ID</preferred_name>
        <preferred_name>OBJECTID</preferred_name>
      </field_role>

    </table_role>
  </table_roles>
  <data_source type="indexed">
    <mapping_schema ref="SingleAddressPolygonCentroid" />
    <queries>
      <query>
        <tables>
          <table role_ref="Primary" />
        </tables>
        <fields>
          <field ref="ShapePtPoly" field_role_ref="Primary.Shape" />
          <field ref="ID" field_role_ref="Primary.ID" />
          <field ref="House" field_role_ref="Primary.House" />
          <field ref="PreDir" field_role_ref="Primary.PreDir" />
          <field ref="PreType" field_role_ref="Primary.PreType" />
          <field ref="StreetName" field_role_ref="Primary.StreetName" />
          <field ref="SufType" field_role_ref="Primary.SufType" />
          <field ref="SufDir" field_role_ref="Primary.SufDir" />
          <field ref="City" field_role_ref="Primary.City" />
          <field ref="ZIP" field_role_ref="Primary.ZIP" />
          <field ref="State" field_role_ref="Primary.State" />
          <field ref="User_fld" field_role_ref="Primary.User_fld" />
          <field ref="Alt_JoinID" field_role_ref="Primary.Alt_JoinID" />
        </fields>
        <join_clause />
        <selection_clause />
      </query>

Output Formats:
  <output_formats>
    <format_definition name="format_intersections">
      <field_value ref="PreDir" record="1"/>
      <field_value ref="PreType" record="1" pre_separator=" "/>
      <field_value ref="StreetName" record="1" pre_separator=" "/>
      <field_value ref="SufType" record="1" pre_separator=" "/>
      <field_value ref="SufDir" record="1" pre_separator=" "/>
      <value xml:space="preserve"> &amp;</value>
      <field_value ref="PreDir" record="2" pre_separator=" "/>
      <field_value ref="PreType" record="2" pre_separator=" "/>
      <field_value ref="StreetName" record="2" pre_separator=" "/>
      <field_value ref="SufType" record="2" pre_separator=" "/>
      <field_value ref="SufDir" record="2" pre_separator=" "/>
      <field_value ref="City" record="1" pre_separator=", "/>
      <field_value ref="State" record="1" pre_separator=", "/>
      <field_value ref="ZIP" record="1" pre_separator=" "/>
    </format_definition>

 

 

TODO: customize spelling mistakes

TODO: customize parameters for address locator to – City, Street, house number

 

 

Resources:

What is geocoding?

Essential geocoding vocabulary

Customizing ArcGIS 10 locators (An Esri Geocoding Technical Paper)

 

Keywords: ESRI, ArcGIS Server, geocode, geocoding, address locator