;+
; NAME:
; MATCH_2D
;
; PURPOSE:
;
; Perform a match between two sets of 2D coordinates, finding
; the closest coordinate match (in the Euclidean sense) to the
; search set, within some search radius.
;
; CALLING SEQUENCE:
; match=MATCH_2D(x1,y1,x2,y2,search_radius,MATCH_DISTANCE=)
;
; INPUTS:
;
; x1,y1: The target list to search for matches, of length n1.
;
; x2,y2: The search list of length n2, to be searched for
; matches to [x1,y1].
;
; search_radius: The search radius within which matches will be
; found. Only if the closest matching coordinate in [x2,y2]
; is within the search radius will it be returned. This is a
; critical variable in tuning the resources required by
; MATCH_2D. It can be a scalar or two element vector (for
; asymmetric radii). See NOTES and WARNING.
;
; KEYWORD PARAMETERS:
;
; MATCH_DISTANCE: On output, the distances between the matches
; and the returned coordinate from the set [x1,y1]
; (<=search_radius), is returned.
;
; OUTPUTS:
;
; match: A 1D vector of length n1 containing the indices of x2
; and y2 for the closest match to each [x1,y1], within the
; search_radius. If no match was found within the search
; radius, -1 is returned at that location.
;
; EXAMPLE:
;
; n=100000
; x1=randomu(sd,n) & y1=randomu(sd,n)
; x2=randomu(sd,n) & y2=randomu(sd,n)
; match=match_2d(x1,y1,x2,y2,.002,MATCH_DISTANCE=md)
;
; SEE ALSO:
;
; HISTOGRAM, HIST_ND
;
; NOTES:
;
; This match program uses HIST_ND to pre-bin the 2D search
; coordinates based on position, within some canonical
; search_radius. See:
;
; http://www.dfanning.com/code_tips/matchlists.html
;
; for a discussion of its methods. Of principle importance in
; the efficiency and behavior of the match is the SEARCH_RADIUS
; parameter. This is the maximum radial distance within which a
; match point must lie to be returned. Here's a diagram
; illustrating the problem and method. For each target point,
; four separate bins each of width 2*search_radius are searched
; among the binned search list, depending on its location within
; the bin.
;
;
; +----------+----------+
; | t1 | |
; | | |
; | | |
; | | |
; | | |
; +----------+----------+ -
; | | | |
; | | o | |
; | | | | 2*search_radius
; | | | |
; | | | |
; +----------+----------+ -
; t2
;
;
; The point `t2' is the closest to the search point `o', but it
; is not within the search_radius, therefore it is not
; considered. Instead, point `t1' is found (and discarded),
; despite the fact that `t2' is closer.
;
; Ideally, the search radius should be set to something useful
; in terms of the match (e.g. positional uncertainty, etc.).
; However, if the input target coordinates ([x2,y2]) span a
; large range, it may be necessary to use a larger search
; radius to avoid an excessively large number of bins.
; Typically there will be an optimal search radius which is
; fastest. The tradoff is as follows: the larger the search
; radius, the smaller the number of bins to search, but the
; more search points must be considered per target point. The
; smaller the search radius, the smaller the number of search
; points per bin, but the greater the number of bins.
; Something like the median inter-point separation is probably
; close to optimal.
;
; Since lines of longitude converge towards the poles, a simple
; trick to obtain roughly the same matching radius in both
; longitude and latitude is to give an asymmetric
; search_radius: [eps/cos(mean(latitude)),eps] for [longitude,
; latitude]. This works only when the range of latitude is
; small.
;
;
; WARNING:
;
; Distance is evaluated in a strict Euclidean sense. For
; points on a sphere, the distance between two given
; coordinates is *not* the Euclidean distance. As an extreme
; example, consider two points very near the N. pole, but on
; opposite sides (one due E, one due W). For small patches
; away from the poles, this Euclidean assumption is
; approximately valid, and the method works. See NOTES above
; for a tip regarding obtaining a (more) uniform match
; criterion on the sphere. For regions very near the pole, the
; match could be made after suitably projecting all coordinates
; using a map projection which alleviates this behavior.
;;
;
; MODIFICATION HISTORY:
;
; Wed Apr 22 17:56:26 2009 (J.D. Smith): Correctly account for
; list ranges and "one bin off" matches.
;
; Mon Jul 30 10:56:31 2007, J.D. Smith
;
; Written.
;
;-
;#############################################################################
;
; LICENSE
;
; Copyright (C) 2007, 2009 J.D. Smith
;
; This file is free software; you can redistribute it and/or modify
; it under the terms of the GNU General Public License as published
; by the Free Software Foundation; either version 2, or (at your
; option) any later version.
;
; This file is distributed in the hope that it will be useful, but
; WITHOUT ANY WARRANTY; without even the implied warranty of
; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
; General Public License for more details.
;
; You should have received a copy of the GNU General Public License
; along with this file; see the file COPYING. If not, write to the
; Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor,
; Boston, MA 02110-1301, USA.
;
;##############################################################################
function match_2d,x1,y1,x2,y2,search_radius,MATCH_DISTANCE=min_dist
bs = 2*search_radius ;this is the smallest binsize allowed
mx=[max(x2,MIN=mnx2),max(y2,MIN=mny2)]
mn=[mnx2,mny2]
mn-=1.5*bs & mx+=1.5*bs ;expand the range by (more than) the bin size
h = hist_nd([1#x2,1#y2],bs,REVERSE_INDICES=ri,MIN=mn,MAX=mx)
d = size(h,/DIMENSIONS)
;; Bin location of X1,Y1 in the X2,Y2 grid
xoff = 0.> (x1-mn[0])/bs[0] < (d[0]-1.)
yoff = 0.> (y1-mn[1])/(n_elements(bs) gt 1?bs[1]:bs) < (d[1]-1.)
xbin = floor(xoff) & ybin = floor(yoff)
bin = xbin + d[0]*ybin ;The 1D index of the bin it's in
;; We must search 4 bins worth for closest match, depending on
;; location within bin (i.e. towards any of 4 quadrant directions
;; ul, ur, ll, lr).
xoff = 1-2*((xoff-xbin) lt 0.5) ;add bin left or right
yoff = 1-2*((yoff-ybin) lt 0.5) ;add bin down or up
n1=n_elements(x1)
min_pos = make_array(n1,VALUE=-1L)
min_dist = fltarr(n1,/NOZERO)
rad2=search_radius^2
for i=0,1 do begin ;; Loop over 4 bins in the correct quadrant direction
for j=0,1 do begin
;; One of 4 search bins for all the target points
b = 0L>(bin+i*xoff+j*yoff*d[0])<(d[0]*d[1]-1)
;; Dual HISTOGRAM method, loop by count in the search bins
h2 = histogram(h[b],OMIN=om,REVERSE_INDICES=ri2)
;; Process all bins with the same repeats (>= 1) at a time
for k=long(om eq 0),n_elements(h2)-1 do begin
if h2[k] eq 0 then continue
these_bins = ri2[ri2[k]:ri2[k+1]-1] ;bins with k+om search points
if k+om eq 1 then begin ; single point
these_points = ri[ri[b[these_bins]]]
endif else begin ; range over k+om points, (n x k+om)
targ=[h2[k],k+om]
these_points = ri[ri[rebin(b[these_bins],targ,/SAMPLE)]+ $
rebin(lindgen(1,k+om),targ,/SAMPLE)]
these_bins = rebin(temporary(these_bins),targ,/SAMPLE)
endelse
;; Closest distance squared within this quadrant's bin
dist = (x2[these_points]-x1[these_bins])^2 + $
(y2[these_points]-y1[these_bins])^2
if k+om gt 1 then begin ;multiple points in bin: find closest
dist = min(dist,DIMENSION=2,p)
these_points = these_points[p] ;index of closest point in bin
these_bins = ri2[ri2[k]:ri2[k+1]-1] ;original bin list
endif
;; See if a minimum is already set there
set = where(min_pos[these_bins] ge 0, nset, $
COMPLEMENT=unset,NCOMPLEMENT=nunset)
if nset gt 0 then begin
;; Only update those where the new point is closer
closer = where(dist[set] lt min_dist[these_bins[set]], cnt)
if cnt gt 0 then begin
set = set[closer]
min_pos[these_bins[set]] = these_points[set]
min_dist[these_bins[set]] = dist[set]
endif
endif
if nunset gt 0 then begin ;; Nothing set, it's closest by default
wh=where(dist[unset] lt rad2,cnt) ;demand it's within radius
if cnt gt 0 then begin
unset=unset[wh]
min_pos[these_bins[unset]] = these_points[unset]
min_dist[these_bins[unset]] = dist[unset]
endif
endif
endfor
endfor
endfor
if arg_present(min_dist) then min_dist=sqrt(min_dist)
return,min_pos
end