Create Presentation
Download Presentation

Download Presentation
## Temple University – CIS Dept. CIS616– Principles of Data Management

- - - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - - -

**Temple University – CIS Dept.CIS616– Principles of Data**Management V. Megalooikonomou Spatial Access Methods (SAMs) (based on notes by Silberchatz,Korth, and Sudarshan and notes by C. Faloutsos at CMU)**General Overview**• Multimedia Indexing • Spatial Access Methods (SAMs) • k-d trees • Point Quadtrees • MX-Quadtree • z-ordering • R-trees**SAMs - Detailed outline**• spatial access methods • problem dfn • k-d trees • point quadtrees • MX-quadtrees • z-ordering • R-trees**Spatial Access Methods - problem**• Given a collection of geometric objects (points, lines, polygons, ...) • organize them on disk, to answer spatial queries (like??)**Spatial Access Methods - problem**• Given a collection of geometric objects (points, lines, polygons, ...) • organize them on disk, to answer • point queries • range queries • k-nn queries • spatial joins (‘all pairs’ queries)**Spatial Access Methods - problem**• Given a collection of geometric objects (points, lines, polygons, ...) • organize them on disk, to answer • point queries • range queries • k-nn queries • spatial joins (‘all pairs’ queries)**Spatial Access Methods - problem**• Given a collection of geometric objects (points, lines, polygons, ...) • organize them on disk, to answer • point queries • range queries • k-nn queries • spatial joins (‘all pairs’ queries)**Spatial Access Methods - problem**• Given a collection of geometric objects (points, lines, polygons, ...) • organize them on disk, to answer • point queries • range queries • k-nn queries • spatial joins (‘all pairs’ queries)**Spatial Access Methods - problem**• Given a collection of geometric objects (points, lines, polygons, ...) • organize them on disk, to answer • point queries • range queries • k-nn queries • spatial joins (‘all pairs’ within ε)**SAMs - motivation**• Q: applications?**SAMs - motivation**traditional DB GIS age salary**SAMs - motivation**traditional DB GIS age salary**SAMs - motivation**CAD/CAM find elements too close to each other**SAMs - motivation**CAD/CAM**SAMs - motivation**eg,. std S1 F(S1) 1 365 day F(Sn) Sn eg, avg 1 365 day**SAMs: solutions**• K-d trees • point quadtrees • MX-quadtrees • z-ordering • R-trees • (grid files) Q: how would you organize, e.g., n-dim points, on disk? (C points per disk page)**SAMs - Detailed outline**• spatial access methods • problem dfn • k-d trees • point quadtrees • MX-quadtrees • z-ordering • R-trees**k-d trees**• Used to store k dimensional point data • It is not used to store region data • A 2-d tree (i.e., for k=2) stores 2-dimensional point data while a 3-d tree stores 3-dimensional point data, etc.**2-d trees – node structure**• Binary trees • Info: information field • Xval,Yval: coordinates of a point associated with the node • Llink, Rlink: pointers to children • Properties (N: node): • If level N even -> • for all nodes M in the subtree rooted at N.Llink: M.Xval < N.Xval • for all nodes P in the subtree rooted at N.Rlink: P.Xval >= N.Xval • If level N odd -> • Similarly use Yvals**2-d trees: Insertion/Search**• To insert a node N into the tree pointed by T • If N and T agree on Xval, Yval then overwrite T • Else, branch left if N.Xval < T.xval, right otherwise (even levels) • Similarly for odd levels (branching on Yvals)**2-d trees – Example of Insertion**Splitting of region by Banja Luka Splitting of region by Derventa Splitting of region by Toslic Splitting of region by Sinj**2-d trees: Deletion**• Deletion of point (x,y) from T • If N is a leaf node easy • Otherwise either Tl (left subtree) or Tr (right subtree) is non-empty • Find a “candidate replacement” node R in Tl or Tr • Replace all of N’s non-link fields by those of R • Recursively delete R from Ti • Recursion guaranteed to terminate - Why?**2-d trees: Deletion**• Finding candidate replacement nodes for deletion • Replacement node R must bear same spatial relation to all nodes in Tl and Tr as node N**2-d trees: Range Queries**• Q: Given a point (xc, yc) and a distance r find all points in the 2-d tree that lie within the circle • A: Each node N in a 2-d tree implicitly represents a region RN – If the circle (specified by the query) has no intersection with RN then there is no point in searching the subtree rooted at node N**SAMs - Detailed outline**• spatial access methods • problem dfn • k-d trees • point quadtrees • z-ordering • R-trees**Point Quadtrees**• Represent point data • Always split regions into 4 parts • 2-d tree: a node N splits a region into two by drawing one line through the point (N.xval, N.yval) • Point quadtree: a node N splits a region by drawing a horizontal and a vertical line through the point (N.xval, N.yval) • Four parts: NW, SW, NE, and SE quadrants • Q: Quadtree nodes have 4 children?**Point Quadtrees**• Nodes in point quadtrees represent regions**Point quadtrees - Insertion**Splitting of region by Banja Luka Splitting of region by Derventa Splitting of region by Toslic Splitting of region by Tuzla Splitting of region by Sinj**Point quadtrees: Deletion**• Deletion of point (x,y) from T • If N is a leaf node easy • Otherwise a subtree (N.NW, N.SW, N.NE. N.SE) is non-empty • Find a “candidate replacement” node R in one of the subtrees such that: • Every other node R1 in N.NW is to the NW of R • Every other node R2 in N.SW is to the SW of R • etc… • Replace all of N’s non-link fields by those of R • Recursively delete R from Ti • In general, it may not always be possible to find such as replacement node • Q: What happens in the worst case?**Point quadtrees: Deletion**• Deletion of point (x,y) from T • If N is a leaf node easy • Otherwise a subtree (N.NW, N.SW, N.NE. N.SE) is non-empty • Find a “candidate replacement” node R in one of the subtrees such that: • Every other node R1 in N.NW is to the NW of R • Every other node R2 in N.SW is to the SW of R • etc… • Replace all of N’s non-link fields by those of R • Recursively delete R from Ti • In general, it may not always be possible to find such as replacement node • Q: What happens in the worst case? May require all nodes to be reinserted**Point quadtrees: Range Searches**• Each node in a point quadtree represents a region • Do not search regions that do not intersect the circle defined by the query**SAMs - Detailed outline**• spatial access methods • problem dfn • k-d trees • point quadtrees • MX-quadtrees • z-ordering • R-trees**MX-Quadtrees**• Drawbacks of 2-d trees, point quadtrees: • shape of tree depends upon the order in which objects are inserted into the tree • splits may be uneven depending upon where the point (N.xval, N.yval) is located inside the region (represented by N) • MX-quadtrees: shape (and height) of tree independent of number of nodes and order of insertion**MX-Quadtrees**• Assumption: the map is represented as a grid of size (2k x 2k) for some k • When a region gets “split” it splits down the middle**MX-Quadtrees - Insertion**After insertion of A, B, C, and D respectively**MX-Quadtrees - Insertion**After insertion of A, B, C, and D respectively**MX-Quadtrees - Deletion**• Fairly easy – why? • All point are represented at the leaf level • Total time for deletion: O(k)**MX-Quadtrees –Range Queries**• Same as in point quadtrees • One difference: • Checking to see if a point is in the circle defined by the range query needs to be performed at the leaf level (points are stored at the leaf level)**SAMs - Detailed outline**• spatial access methods • problem dfn • k-d trees • point quadtrees • MX-quadtrees • z-ordering • R-trees**z-ordering**Q: how would you organize, e.g., n-dim points, on disk? (C points per disk page) Hint: reduce the problem to 1-d points(!!) Q1: why? A: Q2: how?**z-ordering**Q: how would you organize, e.g., n-dim points, on disk? (C points per disk page) Hint: reduce the problem to 1-d points (!!) Q1: why? A: B-trees! Q2: how?**z-ordering**Q2: how? A: assume finite granularity; z-ordering = bit-shuffling = N-trees = Morton keys = geo-coding = ...**z-ordering**Q2: how? A: assume finite granularity (e.g., 232x232 ; 4x4 here) Q2.1: how to map n-d cells to 1-d cells?**z-ordering**Q2.1: how to map n-d cells to 1-d cells?**z-ordering**Q2.1: how to map n-d cells to 1-d cells? A: row-wise Q: is it good?**z-ordering**Q: is it good? A: great for ‘x’ axis; bad for ‘y’ axis**z-ordering**Q: How about the ‘snake’ curve?**z-ordering**Q: How about the ‘snake’ curve? A: still problems: 2^32 2^32