Answer
Description
ESRI Shape (aka Shapefile) is one of the most commonly used geographic formats. It is well supported by almost every GIS product.
A Shape dataset physically consists of several files. The most important ones (required) are:
- .shp contains the actual geometry
- .dbf is a dBaseIII file containing the attributes
- .shx contains the index of the .shp (the offset and length of each geometry record); geometry and attribute records are matched by their order
In addition to these files, there is often a .prj file, which contains the projection information, and a .xml file, which contains the metadata.
All files that make up one Shapefile should have the same filename (e.g. myfile.shp, myfile.shx, myfile.dbf and so on).
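As a minimal sketch of the naming rule above, the following Python checks that the required companion files sit next to a given .shp file. The function name is hypothetical, and the check assumes the extensions are lower case, which will not hold for every dataset.

```python
from pathlib import Path

# The three required files named in the description above.
REQUIRED = (".shp", ".shx", ".dbf")


def missing_sidecars(shp_path):
    """Return the required extensions that have no file next to shp_path."""
    base = Path(shp_path)
    # with_suffix swaps ".shp" for each required extension, keeping the same base name.
    return [ext for ext in REQUIRED if not base.with_suffix(ext).is_file()]
```

An empty list means all three required files share the same base name and are present.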
Intricacies
Geometry Types
Shapefile only supports one type of geometry per dataset, i.e. a dataset can consist of lines or polygons, but not both. This has to be specified in the output feature's properties. Attempting to write any other type of geometry to the file will cause FME to fail.
Text
Shape does not support text features. Such features will be converted to point features with attributes on writing Shape.
Number Field Width
Q) When I create a workspace using a Shape dataset FME mis-reads the width of Number fields. For example, a number(7,2) field is read as number(10,2) - Why?
A) This is actually an intended feature, required because a DBF file does not rigorously enforce the "type" of a number field.
i.e. a number(7,2) which ought to be of the format 1234.12, for example…
3333.33
…could also hold a number of the format 1234567, for example…
3333333
This is very non-standard behaviour, and writing 3333333 to a different format (e.g. Oracle) using a number(7,2) definition could cause an error.
To accommodate such wackiness we therefore "extend" the definition when we read these columns, since most formats would consider "3333333" with two decimal places to be number(10,2).
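The arithmetic behind the number(7,2) to number(10,2) example can be sketched in Python. The rule below is an assumption inferred from that single example (all 7 characters used as integer digits, plus a decimal point, plus 2 decimals); it is not taken from FME's source, and the function name is hypothetical.

```python
def widened_number_field(width, decimals):
    """Hypothetical widening rule inferred from the number(7,2) example."""
    if decimals == 0:
        # No decimal places: the stored width is already the worst case.
        return width, decimals
    # Worst case: every one of the `width` characters is an integer digit,
    # so a decimal point and `decimals` fractional digits must be added.
    return width + 1 + decimals, decimals


# The value from the example, written with two decimal places.
as_text = "{:.2f}".format(3333333)
```

Here `as_text` is "3333333.00", which is 10 characters long and matches the widened number(10,2) definition.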
Unrecognized Coordinate Systems
See ESRI Exceptions for what to do when FME encounters an ESRI coordinate system that it does not recognize, or when ArcGIS encounters an FME-generated coordinate system that it doesn’t recognize.
FAQ
Q) How do I set the colour of my Shape features?
A) You don't. The Shape format does not support style information (colour, line width etc). If your features are appearing in colour when you view them in ArcMap, Universal Viewer or another GIS application it is because the application has arbitrarily assigned a colour to each feature type, not because the colour was stored in the Shape data.
Also, you can't preserve colour by opening a source dataset directly in ArcGIS using FME extensions.
Q) Can I create a Spatial Index for a Shape dataset with FME?
A) FME does not have an option to create Shape spatial indexes. However, this sort of thing is possible by using ESRI's Python geoprocessor in an FME shutdown script. See below for an example.
Q) Can I translate a Shape dataset more than 2GB in size?
A) If you have a source Shape dataset that is more than 2GB, it is not a valid dataset and probably wasn't created with ESRI software. Writing an oversized Shape dataset would be equally invalid. There are several reasons why:
- A limitation of the operating system and system architecture. Internal pointers between the shape file index (.shx) and shape file data (.shp) are stored as signed 32-bit integers, which caps them at 2GB.
- DBF files have a 2GB size limit. Nothing can be done about that (other than to reduce the size or number of columns, if you must keep all your records).
- The Shape header has information on the size of the file. This is held as a signed integer, meaning we can't write a Shape dataset larger than 2GB because the header would be incorrect.
Having said that, because offsets and lengths are measured in 16-bit "words" rather than bytes, we should be okay to write a 4GB file with FME. FME will also read such a dataset back. However, it may not function correctly in other applications, and on some 32-bit operating systems there isn't a way to jump to a location more than 2GB from the beginning of a file.
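The 2GB and 4GB figures follow from simple arithmetic on a signed 32-bit value, sketched below in Python. The names are illustrative only.

```python
MAX_SIGNED_32 = 2**31 - 1   # largest value a signed 32-bit integer can hold
WORD_BYTES = 2              # a 16-bit word is two bytes


def max_addressable_bytes(unit_bytes):
    """Largest byte position reachable when the counter counts units of unit_bytes."""
    return MAX_SIGNED_32 * unit_bytes


limit_counting_bytes = max_addressable_bytes(1)           # just under 2GB
limit_counting_words = max_addressable_bytes(WORD_BYTES)  # just under 4GB
```

Counting in words doubles the reach of the same 32-bit field, which is why a file of up to about 4GB can still be described, even though software that assumes byte offsets will stop at 2GB.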
Q) Can I explicitly set the name of the output shp file at run-time?
A) Yes, using an AttributeSetter and feature type fanout. The example below demonstrates how.
Q) What is meant by the warning - (filename) doesn’t follow Shape spec. Content length is wrong. FME is performing auto-compensation to correct?
A) This message means there is a mismatch of the record lengths in the Shape spatial data (.shp) file.
The .shp file is made up of a number of variable-length records, each preceded by a small record header that stores the record's content length. I'm told that this length should (according to Shape specifications) count only the record contents AFTER the header.
Apparently some applications write Shape files and incorrectly include the length of the header in that value. So FME starts to read the data, finds the records aren't in the positions they should be for a proper Shape file and reports the problem. Because we know the likely reason why, FME can auto-compensate and make a stab at adjusting the data to bring it back into line.
The other potential cause is that the record lengths themselves are not defined properly (e.g. written as 8-bit bytes instead of 16-bit words). Again FME is able to auto-compensate for such problems.
Check with your data supplier to find out how it was created.
If FME doesn't stop completely I’d take that as a sign it has worked around the problem and properly adjusted the file (but obviously that has to be your decision).
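To see what a content-length mismatch looks like, here is a rough Python sketch that walks the records of a .shp file and checks that they tile the file exactly. It relies on the published Shape layout (100-byte file header, 8-byte big-endian record headers, lengths in 16-bit words). The function names and the tiny point-file builder are hypothetical, and this is not FME's auto-compensation logic.

```python
import struct

HEADER_BYTES = 100        # fixed-size main file header
RECORD_HEADER_BYTES = 8   # record number + content length, big-endian int32 each


def check_record_lengths(data):
    """Walk the records in .shp bytes; return (records_walked, consistent)."""
    if struct.unpack(">i", data[0:4])[0] != 9994:
        raise ValueError("not a .shp file")
    # File length is stored at byte 24, in 16-bit words.
    declared_bytes = struct.unpack(">i", data[24:28])[0] * 2
    pos, count = HEADER_BYTES, 0
    while pos + RECORD_HEADER_BYTES <= len(data):
        _number, content_words = struct.unpack(">2i", data[pos:pos + 8])
        if content_words < 0:
            break
        # Content length excludes the record header itself.
        pos += RECORD_HEADER_BYTES + content_words * 2
        count += 1
    return count, pos == declared_bytes == len(data)


def build_point_shp(points, length_includes_header=False):
    """Build a tiny point .shp in memory (illustration only, bounding box left at zero)."""
    records = b""
    for number, (x, y) in enumerate(points, start=1):
        content = struct.pack("<i2d", 1, x, y)            # shape type 1 = point
        words = len(content) // 2
        if length_includes_header:
            words += RECORD_HEADER_BYTES // 2             # the mistake described above
        records += struct.pack(">2i", number, words) + content
    file_words = (HEADER_BYTES + len(records)) // 2
    header = struct.pack(">7i", 9994, 0, 0, 0, 0, 0, file_words)
    header += struct.pack("<2i", 1000, 1)                 # version, shape type
    header += struct.pack("<8d", *([0.0] * 8))            # bounding box placeholder
    return header + records


good = check_record_lengths(build_point_shp([(1.0, 2.0), (1.0, 2.0)]))
bad = check_record_lengths(build_point_shp([(1.0, 2.0), (1.0, 2.0)], True))
```

The first file passes the check; the second, whose content lengths wrongly include the record header, does not, because the walk lands in the middle of the next record.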
Q) I have a Shape file whose name matches the Feature Type but the Unexpected Input Remover still throws away my data. Why?
A) Check the case of the filename. roads.shp is not the same as Roads.shp or ROADS.shp.
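A quick way to spot this kind of mismatch is to compare names with and without case. The Python below is only an illustrative helper with a hypothetical name; it is not part of FME.

```python
def case_mismatches(feature_type, filenames):
    """Return filenames whose base name equals feature_type only when case is ignored."""
    hits = []
    for name in filenames:
        stem = name.rsplit(".", 1)[0]   # drop the extension
        if stem != feature_type and stem.lower() == feature_type.lower():
            hits.append(name)
    return hits


suspects = case_mismatches("roads", ["roads.shp", "Roads.shp", "ROADS.shp", "rivers.shp"])
```

Any name returned here matches the feature type in spelling but not in case, so it is a likely candidate for being discarded as unexpected input.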
Examples
Creating a Spatial Index
The workspace in the attached zip file (CreatingShapeIndex.zip) illustrates this. The steps to implement a spatial index using ESRI's Python geoprocessor are:
Force FME to use the Python expected by Esri: Set an OS environment variable “FME_PYTHON_VERSION 2.7” to specify the version of Python to use. This forces FME to use Python 2.7, regardless of what your computer's default Python version is.
Modify the shutdown script to meet your needs: Open shutdown.py in a text editor. If you don't need spatial indexing, you can comment out line 57 (add a # to the beginning of the line), or delete the spatialIndex function. This function extracts the paths to every shapefile in the workspace and adds a spatial index to all of them. If you only want to index certain shapefiles, comment out line 57, uncomment line 60 (remove the # from the beginning of the line) and replace "path_to_shape.shp" with the path to the shapefile you want to index (remember to escape your "\"s: C:\Temp should be C:\\Temp).
If you need attribute indexing, you can modify line 67. You will need to uncomment the line by removing the #. The first set of quotes should hold the path to your shapefile in place of path_to_shape.shp (remember to escape your "\"s: C:\Temp should be C:\\Temp), and the second set of quotes should hold the name of the attribute you wish to index.
Add the script to your workspace: Now you can either copy and paste the script into the Shutdown Python Script, or copy the shutdown.py file to the same directory as your workspace and set the Shutdown Python Script to “import shutdown”.
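For orientation, a spatial-indexing helper along these lines might look like the Python sketch below. This is not the contents of shutdown.py from the attached zip. The function name is hypothetical, and it assumes ArcGIS is installed so that arcpy and its Add Spatial Index tool (arcpy.AddSpatialIndex_management) are available to the Python interpreter FME is using. The indexing call can be swapped out, which makes the helper testable without ArcGIS.

```python
def index_shapefiles(shapefile_paths, add_spatial_index=None):
    """Add a spatial index to each .shp path; return the paths that were processed."""
    if add_spatial_index is None:
        # Assumption: ArcGIS is installed and arcpy can be imported here.
        import arcpy
        add_spatial_index = arcpy.AddSpatialIndex_management
    done = []
    for path in shapefile_paths:
        if path.lower().endswith(".shp"):   # skip anything that is not a shapefile
            add_spatial_index(path)
            done.append(path)
    return done


# Two equivalent ways to write a Windows path: escaped backslashes or a raw string.
escaped = "C:\\Temp\\roads.shp"
raw = r"C:\Temp\roads.shp"
```

A raw string (the r prefix) is an alternative to doubling every backslash, and both spellings above produce the same path.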
Setting the Output Filename
In Shape format the name of an output file is the name of the feature type, so (rather than being user-defined) it takes the name of the source feature type by default.
However, some users may wish to explicitly set the name of the feature type being written to on a case-by-case basis. The workspace in the attached file ShapeNamePublishing.zip shows how this can be done.
An attribute is created and its value published so that the user can set it at run-time. The destination simply has a feature type fanout applied on the user-defined attribute: for example, setting the attribute to myfilename gives an output of myfilename.shp.
Use caution: if the data is not all of the same geometry type, there will be problems writing to a single .shp file.
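The effect of the fanout, and the geometry caution that goes with it, can be sketched outside FME in Python. Features are modelled here as plain dictionaries with a hypothetical "geometry_type" key; none of this is FME's API.

```python
from collections import defaultdict


def plan_fanout(features, name_attr):
    """Map each output filename to the sorted geometry types that would be written to it."""
    geometry_by_file = defaultdict(set)
    for feature in features:
        # The fanout attribute's value becomes the output file's base name.
        filename = "{}.shp".format(feature[name_attr])
        geometry_by_file[filename].add(feature["geometry_type"])
    return {name: sorted(kinds) for name, kinds in geometry_by_file.items()}


features = [
    {"out_name": "myfilename", "geometry_type": "line"},
    {"out_name": "myfilename", "geometry_type": "polygon"},
    {"out_name": "other", "geometry_type": "point"},
]
plan = plan_fanout(features, "out_name")
# Files that would receive more than one geometry type cannot be valid Shape output.
conflicts = sorted(name for name, kinds in plan.items() if len(kinds) > 1)
```

In this sample, myfilename.shp would receive both lines and polygons, so it is flagged as a conflict before anything is written.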