Generalization and GIS - Holism and Purpose, Conceptualization and Representation

 

Francis Harvey
Institute for Geomatics - SIRS-EPFL
CH-1015 Lausanne
Switzerland

1. Some Insight through the forgotten Roots

Generalization is often considered the sole domain of cartographers. Until recently the word generalization referred solely to cartographic production. On this view, generalization has effects on the data sets used in GIS, but remains distinct and disjoint from other aspects of observing, measuring, and analysing geographic phenomena. This paper disagrees with this position and argues that generalization touches on core issues for the triad of cartography, geography, and GIS.

Improving the production of maps definitely retains its importance. In this sense, cartographic generalization remains an important and substantial area of research. With the boom of GIS, however, cartographers and geographers also need to extend their attention and assist with the development of improved solutions for the myriad of new uses GIS is being put to. With the increased use of digital data sets, it is clear that generalization operations have important effects on data quality that reach beyond traditional map production concerns (Goodchild, 1996; Morehouse, 1995; Müller, Lagrange, & Weibel, 1995). These effects make a fundamental rethinking of the roles and relationships between generalization, geography, and GIS necessary.

GIS turns the traditional cartographic production process on its head. Cartographic production no longer requires large state institutions: equipment and software available at modest prices provide broad capabilities for producing cartographic products. The extensive range of needs and the flexibility of GIS permit a staggering number of geographic visualizations. Even if still called maps, these cartographic products actually go beyond maps. Given this potential, it comes as no surprise that cartographic production for most GIS users is radically different from that of national mapping agencies (NMAs). For the majority of GIS users (even for those in large institutions), we instead need to talk about flexible production. Cartography and generalization mean something very different when using GIS than in the Fordist national-scale production schemes of NMAs. Suddenly, on one hand, generalization seems to be just a minute phase in GIS processing, but, on the other hand, it seems to mean so much more.

Although most perspectives on generalization from the GIS community still understand generalization as cartographic generalization, recent cartographic literature on generalization has captured the changing form of production and use of generalization (Lagrange, 1997; McMaster, 1995; Müller, et al., 1995). In this regard, generalization literature is often characterized by a holistic awareness and perspective. Following holistic principles, I take an extensive view here, asserting that the consideration of generalization effects demands consideration of the underlying geographic phenomena. The goal of this paper is to connect generalization to GIS by examining the common conceptual roots of GIS and generalization and afterwards linking them to practical uses. In this paper, geographic integration through vector overlay narrows this focus. Overlay is the quintessential GIS operation, and it must address the same conceptual roots.

Like overlay, generalization is ubiquitous. Digital generalization is a crucial element in the collection and use of digital geographic information. Unless the data is collected and prepared for rigid surveying applications, some type of generalization operation has altered not only the data but also its latent information potential. The literature on generalization refers to this, and GIS research on error and accuracy also elaborates this issue. Although there is evidently acute awareness of the connection between geographic interpretation and generalization, this issue remains underdeveloped.

In a return to the roots, examining the writings of classical geographers of the twentieth century, like Hartshorne and Hettner, shows how they developed the complementary roles of generalization and geographic understanding through an analytical separation of conceptualization and representation. Although the linkages between generalization and geography are certainly as old as the disciplines themselves, their work is the starting point for this research.

Given the goal of this paper, to be effective their theoretical work must be connected to GIS. Traditional maps are becoming less and less essential for geographic understanding, while digital cartographic databases and GIS are becoming ever more crucial in ways geographers and cartographers 50 years ago could scarcely imagine. Because of the ubiquity of GIS, it is essential to work on methods that provide support for the wide range of applications. Connecting the theoretical and practical work of cartographers on generalization with the practical demands of GIS applications opens up new and extended ways of dealing with GIS data and generalization. This paper hopes to contribute to this development.

 

2. Return to the Roots

Covered by the hubris of the wide-ranging debates in geography and cartography, the GIS community has often failed to turn over the soil upon which GIS has so successfully grown. In particular, the linkage between geographic understanding and cartographic generalization is no longer apparent. Returning to Alfred Hettner's writings, I find a connection between generalization and geography in his work on regions. Here, he differentiates between conceptual abstraction, on one hand, and, on the other hand, representation together with a process of generalization that alters information about geographic phenomena to produce a more consistent representation. In terms of GIS, there is a connection between Hettner's approach and the role of generalization in the flexible production of cartographic materials using GIS.

To understand this connection we should review Hettner's work on geographic conceptualization and cartographic representation. This is a fundamental theorization of the process of abstracting geographic observations to cartographic representations.

Hettner referred to conceptualization as the process of abstracting phenomena from a collection of observations, review, discussions, and oral reports (Hettner, 1927, p. 220ff). It involves the intellectual comprehension of "the conceptual division or analysis of nature and consists of the mental synthesis of all known facts" (Hettner, 1927, p. 220). Abstraction, the selection and simplification of the unimaginable number of elements in nature, concerns both textual and cartographic representations of geographic phenomena. The essential question in this process is the determination of a point of view that defines the abstraction. Hettner refers to three aspects. The first is the manifestation (Sinnfälligkeit), in terms of size and complexity, of the phenomena and area in question. The second is teleological value in relation to social values. Finally, scientific aspects, in terms of the relationships of phenomena to one another, are considered (Hettner, 1927, pp. 224-227).

Cartographic representation involves the abstraction of geographic concepts to graphic presentations of geographic phenomena in time and space. There are three aspects to consider in the wider context of work on generalization and GIS:

  1. primitives - points, lines, and areas are the basic graphical building blocks for structuring and representing geographic conceptualizations
  2. transformations - geographic conceptualizations can be transformed from one cartographic representation to many others
  3. arrangements - cartographic representations include time, space, and attributes. These aspects must be organized to visualize geographic phenomena in time and space clearly.

Briefly, cartographers construct spatial representations using symbols based on points, lines, and areas. These are referred to as symbol types (Dent, 1993) or spatial dimensionality categories (MacEachren, 1994). Qualitative or quantitative attributes of geographic phenomena are represented through manipulation of the visual variables. Following Bertin (1983), they are called symbol dimensions, including shape, size, color hue, color value, color intensity, pattern orientation, pattern arrangement, and pattern texture (Morrison, 1974). Depending on the level of measurement (nominal, ordinal, interval, or ratio) (Stevens, 1946), different symbols are employed as graphic variables.

 

 

 

 

 

Figure 1 Idealized process of abstracting geographic information for cartographic presentation

In theory, geographic conceptualization precedes cartographic representation, but in practice the creation of geographic representation intertwines aspects of both as an individual's or group's work. A person preparing a map thinks of both aspects simultaneously. Separating geographic conceptualization and cartographic representation is an analytical technique that enables the examination of GIS to understand the detailed social construction of geographic information technology.

In cartographic representation, the selection of symbol elements involves knowledge of the measurements used to describe the geographic phenomena. More importantly, representation requires a broad knowledge and understanding of the geographic phenomena and their interrelationships in order to construct clear representations. This knowledge is often irreducible. In Fordist cartographic production, the disciplinary knowledge of cartographers regulates and wields its capabilities and makes the necessary transformation from geographic conceptualizations to representations for production. Standardized production processes are sought after. In flexible cartographic production the linear production line is gone. The borders between conceptualization and representation are washed away by many factors. Still, the fundamental principle of producing cartographic products remains: cartographic representation is transformation.

Waldo Tobler formulated this insightful description of cartography as transformations between points, lines, and areas (locative) and interpolation, filtering, and generalization (substantive) (Tobler, 1979). Cartography as transformations goes beyond the traditional understanding of transformations as projections of coordinate locations on the spherical earth to planar coordinates in a projection system (Raisz, 1962). Transformations also involve conversions from one type of conceptual and representational element to another. For example, changing the representation of a town from a dot, its size indicating the population, to an area representing its actual shape is a transformation from point to area. Represented as an area, the information about the size of its population is lost, unless another graphical variable, such as shading, is used. This is not only a transformation of a graphical variable. It is just as well a transformation of the geographic concepts presented. Transformations can also involve converting a particular geographic conceptualization to a cartographic representation. Interpolating the demographic characteristics of census blocks to census tracts involves aggregation.
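The block-to-tract example can be made concrete with a minimal sketch. The record layout, identifiers, and population figures below are hypothetical and serve only to illustrate that aggregation replaces several detailed values with one coarser value; they are not taken from any census product.

```python
from collections import defaultdict

# Hypothetical block records: (block_id, tract_id, population).
blocks = [
    ("b1", "t1", 120),
    ("b2", "t1", 80),
    ("b3", "t2", 200),
]

def aggregate_to_tracts(block_records):
    """Sum block populations per tract; block-level detail is not retained."""
    totals = defaultdict(int)
    for _block_id, tract_id, population in block_records:
        totals[tract_id] += population
    return dict(totals)

tract_population = aggregate_to_tracts(blocks)
```

After aggregation the two tracts in this sketch carry the same total, although their internal block structure differs. The distinction exists in the conceptualization but is no longer recoverable from the representation.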

We must bear in mind that Hettner's work considered generalization as an essential part of geographic inquiry, but he also utterly relied on maps. The next section of this paper will go to some length to connect the pertinent theoretical thinking to more recent developments in generalization. Hettner's fundamental theorization of the relationships between cartography and geography is still valid today. For instance, some of his general principles of cartographic generalization from 1927 will still be recognized today. His terms, translated from German, are: limiting and selecting the material, simplification, and selective removal (Hettner, 1927). He connects these terms to the semantics and purposes behind a particular representation of geographic phenomena. This applies to any cartographic activity, topographic maps as well as global overviews of shipping. Generalization, connected to the purpose-driven representation of geographic phenomena, lies at the heart of cartography and geography. Hartshorne reflects this approach, quintessentially arguing that "if (the geographer's) problem cannot be studied fundamentally by maps . . . then it is questionable whether or not it is within the field of geography" (Hartshorne, 1939/1956, p. 249).

 

3. Conceptualization, Representation, Generalization, and Semantics

These days, Hartshorne's aphorism will still find widespread agreement, although some seriously doubt whether geography can only be carried out using maps. This has also been a debate in geography, but here the focus is on the relationship between geography and cartography. Therefore, we must consider conceptualization and representation in the broadest meaning of generalization. If we remove our focus from the medium (the map) and instead consider the process of abstracting geographic meaning to cartographic representation utilizing many different kinds of media, then semantics looms as a key issue. The key to connecting the triad of cartography, geography, and GIS, and to articulating the importance of broadly considered generalization, is the representation of meaning, or semantics.

In the generalization literature it is commonly accepted that a holistic perspective is necessary to carry out generalization. Ruas and Lagrange, for instance, write, "The authors think that in order to cope with holistic nature of the generalization process . . . . a lot of attention has to be paid to information/knowledge entailed in initial data and which influences generalizations decisions" (Ruas & Lagrange, 1995, p. 87). W. Mackaness has also drawn attention to the importance of holistically considering the wide realm of geographic and cartographic issues involved in generalization (Mackaness, 1994). Following on from the work of Hettner, and broadly considering the changes GIS introduces, I will argue that these holistic concerns and perspectives must also consider geographical aspects. The information presented in a cartographic product is inseparable from the geographic setting and conceptualization.

Certainly holism awakes concerns about the breadth of knowledge required. There is broad agreement that the complexity of the world exceeds any meaningful, totalizing conceptualization or representation. As with Lewis Carroll's famous example of the problem of creating a map at a uniform scale of 1 to 1, we accept that many issues in geography are irreducible. Often we can develop a great deal of knowledge, but simultaneously we become aware of aspects we have yet to integrate with our understanding. Our knowledge and understanding are also culturally embedded and differentiated by the multitude of perspectives in society. Holism refers to an awareness of these complexities and an acknowledgement of the effects purpose has on the abstraction and representation of geographic phenomena.

Holism is as much an attitude as an approach. It aids the understanding of irreducible phenomena and simultaneously reminds us of the limits that purposes and perspectives lead to. Helping to understand geographic meaning, holism is part of the common roots of geography and generalization. It is an essential part of a conceptual framework for current GI science and practice.

From the previous section, it is evident that geographic conceptualization is the systematic analysis of phenomena from a distinct perspective following a purpose. It leads to the creation of a system for describing the phenomena in question, their relationships, and contextual limitations. The resulting system may even explicitly reference and highlight cultural meaning and values in its conceptualization of reality. Less distinct, but more important are the implicit cultural values engendered in the system. This 'tacit knowledge' is frequently how social values and meanings are manifest and likewise mediate the creation of a system that responds to a plethora of social needs and values. Instead of maps as representations, we first have GIS databases as the technological artefacts resulting from the abstraction of meaning (Nyerges, 1991a; Nyerges, 1991b). These databases often consider conceptualization as well as representation in their construction.

The importance of purpose and perspective in creating a GIS database can be shown with an example: a highway database for vehicular pollution modeling. The example also shows the linkage of geographic conceptualization and cartographic representation. Beginning with the highway itself, the aspects pertinent to pollution (width, traffic type, average speed at particular times, vehicle mix, etc.) are conceptualized and the highways are described following these characteristics. At the logical level, these characteristics are modeled as attributes and relationships in the database, which is created physically through the storage of the pertinent data. The cartographic representation of the highway database may guide conceptualization through constraints based on cartographic and graphic limitations in presenting the data of linear elements for a large area in a meaningful manner. It may also be an issue considered only after completion of the database.

In GIS flexible production, there is a symbiotic relationship between geographic conceptualization and cartographic representation in the technology employed. If the interval between data points is 5 miles, then representing 1 mile sections of highway is senseless. Likewise, if the project requires a map at a particular representation or scale, conceptualization will bear this out. What is crucial is the retention of meaning throughout the process of conceptualization and abstraction. Above all, we must reconsider the relationships between cartographic elements and meaning.
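The highway example can be sketched as a small record type together with a check on the sampling interval. The field names, units, and the rule that a section must be at least as long as the interval between data points are assumptions made for illustration; they are not part of any particular pollution model or database schema.

```python
from dataclasses import dataclass, field

@dataclass
class HighwaySection:
    """Hypothetical logical-level record for one highway section."""
    section_id: str
    length_miles: float
    width_m: float
    traffic_type: str
    avg_speed_by_hour: dict = field(default_factory=dict)  # hour -> speed
    vehicle_mix: dict = field(default_factory=dict)        # vehicle class -> share

def representable(section, sampling_interval_miles):
    """A section shorter than the interval between data points carries no
    observation of its own, so representing it separately adds no meaning."""
    return section.length_miles >= sampling_interval_miles

short_section = HighwaySection("h1", 1.0, 12.0, "commuter")
long_section = HighwaySection("h2", 5.0, 12.0, "commuter")
```

With a 5 mile interval between data points, `representable(short_section, 5.0)` is false while `representable(long_section, 5.0)` is true, which mirrors the constraint that conceptualization places on representation.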

Following Tobler's insight that cartography consists of transformations, it is clear that they embrace the changes between types of symbols and the manipulation of attributes to create different representations. However, the question of how a geographic conceptualization is transformed to a cartographic representation is still open. Starting with cartographic elements (point, line, area) and transformations, we can describe the underlying framework in terms of time, space and attribute.

David Sinton (1978) proposed a framework that links the three aspects of geographic observation (space, time, and attribute) to a representation. Sinton's framework accounts also for traditionally non-cartographic representations of geographic phenomena such as strip charts. Sinton's work clearly ties into the conceptualization/representation duality.

This framework is elegant in its simplicity. In creating a representation one of the aspects is fixed, one is controlled and one is measured (Sinton, 1978). In a map of soil types for instance, time is fixed. In other words, time has one homogeneous value for the whole map. The attribute, soil type, only varies to the degree defined by a given set. It has been classified before the data was collected. The collection of data involves the measurement of the location of soil types. Space is measured. A strip chart that records stream heights has its space fixed, time controlled to the interval between observations, and the attribute, height, is measured.
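A minimal sketch of this framework records which aspect takes which role and rejects assignments that use an aspect twice. The class name `SintonFrame` and its validation rule are illustrative assumptions; Sinton (1978) describes the roles, not a data structure.

```python
from dataclasses import dataclass

ASPECTS = {"space", "time", "attribute"}

@dataclass(frozen=True)
class SintonFrame:
    """One aspect is fixed, one is controlled, and one is measured."""
    fixed: str
    controlled: str
    measured: str

    def __post_init__(self):
        # Each of the three aspects must appear in exactly one role.
        if {self.fixed, self.controlled, self.measured} != ASPECTS:
            raise ValueError("space, time, and attribute must each take one role")

# The two examples from the text.
soil_map = SintonFrame(fixed="time", controlled="attribute", measured="space")
strip_chart = SintonFrame(fixed="space", controlled="time", measured="attribute")
```

Comparing two frames in this form makes explicit when two representations of the same phenomenon rest on different measurement frameworks.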

Sinton's framework can be applied to cartographic representations to comprehend how the underlying geographic conceptualization was transformed into a cartographic representation. In looking at the average map, for instance a US highway map, the juxtaposition of multiple features makes this analysis very convoluted, as a map can include several measurement frameworks. The actual cartographic product may hide or distort the procedures involved.

Throughout this process, the semantics of the conceptualization and representation are crucial. Theoretically, at any stage we should be able to trace the semantic history of any phenomenon back to the original entity. The final product especially must provide enough explicit and implicit meaning to facilitate communication of content. Generalization is crucial in amalgamating this communicative requirement with the expressive purpose(s) behind the cartographic product. Going through the series of transformations from entity to object, and from object to representation, it should be possible to delimit semantics.

 

4. Connecting Conceptual Issues to Generalization Practice

Alone, without practical application, this framework is interesting. With practical applications, it becomes useful. Through a consideration of semantics, scale, and purpose we move towards this goal. Specifically keeping the issues of multi-purpose and multi-participant GIS in mind, this section will illustrate the substantial role that generalization research can play for GIS. Additionally, I will argue that recognizing the differences between generalized data sets as a function of purpose is crucial to connecting generalization issues to geographic integration.

In a practical sense, semantics and purpose are the crucial issues. Because of its fundamental role in generalization practice, scale is also crucial. The previous sections have shown their relationship to fundamental geographic and cartographic thought. As the semantics of a data set are contingent on scale and purpose, these three issues are connected. All the same, considering the wide-ranging uses, generalization methods, purposes, and scales in question, an emphasis on semantics is best suited for connecting generalization operations to the geographic phenomena represented in a data set. A semantically based framework will best connect generalization issues to geographic integration aspects.

The literature on research in generalization offers many outstanding examples of how measurements can be applied to elucidate or complement semantics. Such measurements are described by numerous authors (Buttenfield, 1991; Lagrange & Ruas, 1994; Mokhtarian & Mackworth, 1992; Plazanet, 1995; Plazanet, 1997; Ruas & Plazanet, 1996). Using these measurements, it is possible to distinguish different types of features and feature constellations for which specific generalization functions can be carried out. These measurements can also be applied to assess the quality and the semantics of different parts of data sets. Therefore, they can be used for determining the effects of generalization operations, and they open up semantic aspects of conceptualization and representation to quantification and formalization.

This work is continued by additional work on knowledge acquisition (McMaster, 1995) that illustrates the roles generalization measurements have in informing judgements about semantics and data quality.

The framework of conceptualization and representation permits the identification of implicit and explicit semantic choices made during different transformations. After identification of the underlying purpose(s), it is possible in this framework to use linear measurements to quantify changes made to the data and to characterize the quality of this data. The ability to describe quality in terms of geographic information processing enables a dynamic and viable means to evaluate the semantic suitability of data and its accuracy.
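As a minimal sketch of how linear measurements can quantify such changes, the following compares a line before and after a hypothetical simplification. Length and sinuosity (path length divided by the straight-line distance between the end points) are chosen here only as simple, well-known measures; they stand in for, and are not, the specific measures of the cited literature. The coordinates are invented.

```python
import math

def length(line):
    """Total length of a polyline given as a list of (x, y) vertices."""
    return sum(math.dist(p, q) for p, q in zip(line, line[1:]))

def sinuosity(line):
    """Path length divided by the distance between the end points."""
    straight = math.dist(line[0], line[-1])
    return length(line) / straight if straight else float("inf")

original = [(0, 0), (1, 1), (2, 0), (3, 1), (4, 0)]
generalized = [(0, 0), (4, 0)]  # hypothetical result of simplification

change = {
    "length_lost": length(original) - length(generalized),
    "sinuosity_before": sinuosity(original),
    "sinuosity_after": sinuosity(generalized),
}
```

Stored alongside the data, a record such as `change` would let a later user judge whether the generalized line still suits a purpose that depends on, say, route length.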

I now turn briefly to examine how such a framework can be used to extend the geometric matching approach (Harvey, 1994; Harvey & Vauglin, 1996). Geometric matching extends fuzzy vector overlay by utilizing individual tolerances for each data set and prioritizing processing by accuracy. Basically, these tolerances are assigned based on the positional accuracy of each data set. The relationship of positional accuracy to semantics is often observable. Consider a narrow lake shore where a railroad and a trail pass within 8 meters of each other below a very dominant building: at smaller scales, feature displacement may be used to represent the varying importance of these features. In this case, semantic information is retained, but the positional accuracy is reduced. What can help in this situation is providing the original locations of the displaced features. Ideally, this information would be embedded in the data set's data structure, but other forms of access are possible. Clearly, this paper has only begun to scratch the surface. But even as a rough framework, it may provide benefits.
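The idea of individual tolerances and accuracy-ordered processing can be illustrated with a deliberately simplified sketch for points. It is not the published geometric matching algorithm: the function `match_points`, the rule that two points match when their distance does not exceed the sum of the two tolerances, and all coordinates are assumptions made for this example. The original location of every point is kept next to its matched location.

```python
import math

def match_points(datasets):
    """datasets: list of (name, tolerance, points).

    Data sets are processed from most to least accurate (smallest
    tolerance first). A point is snapped to the nearest point of a more
    accurate data set when the distance does not exceed the sum of both
    tolerances (assumed rule). Returns {name: [(original, matched), ...]}.
    """
    accepted = []  # (point, tolerance) pairs already fixed in place
    result = {}
    for name, tol, points in sorted(datasets, key=lambda d: d[1]):
        pairs, new_points = [], []
        for p in points:
            near = [(math.dist(p, q), q) for q, q_tol in accepted
                    if math.dist(p, q) <= tol + q_tol]
            if near:
                target = min(near)[1]  # nearest acceptable point
            else:
                target = p
                new_points.append((p, tol))
            pairs.append((p, target))
        # Points of one data set are never matched against each other.
        accepted.extend(new_points)
        result[name] = pairs
    return result

matched = match_points([
    ("trail", 5.0, [(3, 1), (30, 0)]),    # less accurate, e.g. displaced
    ("survey", 0.5, [(0, 0), (10, 0)]),   # more accurate
])
```

In this example the trail point at (3, 1) is matched to the survey point at (0, 0), while its original location stays available in the returned pair, in the spirit of providing the original locations of displaced features.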

Additional information about data semantics, together with the shape and linear measurements described in the generalization literature, can certainly provide valuable information for automatic processing or for enhancing the GIS tool box. As McMaster writes, recent foci of generalization research on "(1) more robust data structures for supporting the process, (2) continued work in algorithmic development, (3) semantic support for the generalization process, and (4) the modeling of geometric features" (1996) would definitely also benefit the understanding of information integration in GIS, as well as GIS in general.

 

5. Changing Demands - Changing Roles

Generalization and GIS are fundamentally connected. Generalization is part of an entire process of turning geographic observations into cartographic representations. To be effective in communication, in the sense of the underlying purpose, this process must provide the means for the user to associate cartographic representation with the real-world phenomenon. Basically, from a representation and knowing the purpose, the phenomena must be accessible to the user.

Going over Hettner's and more recent works about generalization and the abstraction of geographic phenomena, it seems clear that purposes distinguish the myriad roles generalization plays in GIS flexible production. This paper reconsiders the role of generalization in terms of purpose and the semantics necessitated by aiming to fulfill particular requirements and communicate cartographically with a distinct audience. The importance of semantics is clear to geographers and cartographers alike. Connecting the concepts from these fields and transferring methods will be an important aid for addressing generalization-related issues in GIS.

This is complicated because the myriad needs placed on generalization by people using GIS elude the established approaches to solution finding that characterized large cartographic production houses. The problems users face with GIS and generalization lie in the complexity of the real world. Frequently these problems are irreducible to algorithmic approaches. Complex heuristics that engage human as well as technical capabilities provide the only viable means to resolve the complex issues we face as we keep trying to understand our natural and physical worlds.

Although much has changed in the instruments and technology used in generalization, we still recognize Hettner's principles in generalization literature. It is the connection to purpose that we no longer clearly distinguish. This is also what distinguishes the role of generalization in the future of GIS. The flexible production process of using GIS requires rethinking the role generalization plays. In a holistic sense, generalization is essentially the process of abstraction and representation. Thinking of this process and understanding the results, the fundamental importance of purpose is evident. It is just as crucial to provide information about the purpose as it is about scale. Both are crucial for interpreting semantics.

Clearly, for the GIS community to profit from digital generalization techniques, several strategies will be required: improving tools, embedding accuracy and positional data, tighter linking of metadata, and improving algorithms. These are just a few from the gamut of opportunities. This non-exhaustive list is only intended to show that the link connecting generalization and GIS, namely semantics and purpose, although recognized for some time, still presents much potential for geographers and cartographers alike.

 

References

Bertin, J. (1983). Semiology of Graphics: Diagrams, networks, maps. Madison, WI: University of Wisconsin Press. 

Buttenfield, B. (1991). A rule for describing line feature geometry. In B. P. Buttenfield & R. B. McMaster (Eds.), Map Generalization: Making Rules for Knowledge Representation. Essex: Longman Scientific and Technical.

Dent, B. D. (1993). Cartography. Thematic Map Design (3rd ed.). Dubuque, IA: Wm. C. Brown.

Goodchild, M. F. (1996). Generalization, uncertainty, and error modeling. In GIS/LIS '96, 1 (pp. 765-774). Denver, CO: ASPRS/AAG/URISA/AM-FM.

Hettner, A. (1927). Die Geographie. Ihre Geschichte, Ihr Wesen und Ihre Methoden. Breslau: Ferdinand Hirt. 

Lagrange, J.-P. (1997). Generalization: Where are we? Where should we go? In M. Craglia & H. Couclelis (Eds.), Geographic Information Research. Bridging the Atlantic (pp. 187-204). London: Taylor & Francis.

Lagrange, J.-P., & Ruas, A. (1994). Geographic information modelling: GIS and generalisation. In T. C. Waugh & R. G. Healey (Eds.), Sixth International Symposium on Spatial Data Handling, 2 (pp. 1099-1117). Edinburgh, UK: IGU.

MacEachren, A. M. (1994). Some Truth with Maps: A Primer on Symbolization and Design. Washington, D.C.: Association of American Geographers.

Mackaness, W. A. (1994). Issues in resolving visual spatial conflicts in automated map design. In T. C. Waugh & R. G. Healey (Eds.), Sixth International Symposium on Spatial Data Handling, 1 (pp. 325-340). Edinburgh, UK: IGU.

McMaster, R. B. (1995). Knowledge acquisition for cartographic generalization: experimental methods. In J.-C. Müller, J.-P. Lagrange, & R. Weibel (Eds.), GIS and Generalization. Methodology and Practice (pp. 161-179). London: Taylor & Francis.

Mokhtarian, F., & Mackworth, A. K. (1992). A theory of multiscale, curvature-based shape representation for planar curves. IEEE Transactions on Pattern Analysis and Machine Intelligence, 14(8), 789-805.

Morehouse, S. (1995). GIS-based map base compilation and generalization. In J.-C. Müller, J.-P. Lagrange, & R. Weibel (Eds.), GIS and Generalization. Methodology and Practice (pp. 21-30). London: Taylor & Francis.

Morrison, J. (1974). A theoretical framework for cartographic symbolisation. International Yearbook of Cartography, 14, 115-127.

Müller, J.-C., Lagrange, J.-P., & Weibel, R. (Eds.). (1995). GIS and Generalization. Methodology and Practice. London: Taylor & Francis.

Nyerges, T. (1991a). Geographic information abstractions: conceptual clarity for geographic modeling. Environment and Planning A, 23, 1483-1499.

Nyerges, T. (1991b). Representing geographical meaning. In B. P. Buttenfield & R. B. McMaster (Eds.), Map Generalization: Making Rules for Knowledge Representation (pp. 59-85). Essex: Longman Scientific and Technical.

Plazanet, C. (1995). Measurement, Characterization and classification for automated line feature generalization. In AutoCarto 12, 4 (pp. 59-68). Charlotte, NC: ACSM.

Plazanet, C. (1997). Modelling geometry for linear feature generalization. In M. Craglia & H. Couclelis (Eds.), Geographic Information Research. Bridging the Atlantic (pp. 264-279). London: Taylor & Francis.

Raisz, E. (1962). Principles of Cartography. New York: McGraw-Hill. 

Ruas, A., & Lagrange, J.-P. (1995). Data and knowledge modelling for generalization. In J.-C. Müller, J.-P. Lagrange, & R. Weibel (Eds.), GIS and Generalization. Methodology and Practice (pp. 73-90). London: Taylor & Francis.

Ruas, A., & Plazanet, C. (1996). Strategies for automated generalization. In M. J. Kraak & M. Molenaar (Eds.), The Seventh International Symposium on Spatial Data Handling (SDH'96), 1 (pp. 6.1 - 6.18). Delft, Holland: International Geographical Union (IGU).

Sinton, D. F. (1978). The Inherent Structure of Information as a Constraint to Analysis: Mapped Thematic Data as a Case Study. In G. Dutton (Ed.), First International Advanced Study Symposium on Topological Data Structures for Geographic Information Systems, 7 (pp. 1-17). Harvard: Laboratory for Computer Graphics and Spatial Analysis.

Stevens, S. S. (1946). On the theory of scales of measurement. Science, 103, 677-680.

Tobler, W. R. (1979). A transformational view of cartography. The American Cartographer, 6(2), 101-106.

