GarysTurn wrote:
The only records that I know of that PAF can merge automatically are identical records with the same Ancestral File Number (AFN), so if you use the merge by AFN option, PAF can auto-merge those records. When you download someone else's entire database, you can compare that file with yours side by side, record by record, and select just the information you want to move into your record using "PAF Insight". This program is available to use at most FHCs and can also be purchased online. Do a Google search for "PAF Insight" for more info.
This has probably already been discussed, but I've always wondered how hard it would be to create a more intelligent merging program. Say that Mary Jones, born in 1901 in Geneva, NY, married John Smith, born in 1895 in Topeka, Kansas, and had several children, one of whom was Stephen Smith, born in 1920 in Buffalo, NY. If multiple people upload this family, why couldn't a merging program figure out that these are the same people? The argument I've heard is 'there are hundreds of thousands of Mary Jones, etc.' Sure, but how many were born in 1901 in Geneva, NY, married John Smith, born in 1895 in Topeka, Kansas, and had a son named Stephen, born in 1920 in Buffalo, NY?
I would put the odds of two different Mary Jones matching all that at 1 in a million or less. I think you could have an accurate match, even if you didn't have all the dates and places, especially if you had more children's names.
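To make the idea concrete, here is a rough Python sketch of how a program might count the evidence for and against two uploaded families being the same. Everything in it (the Person and Family classes, the family_evidence function, the rule of pairing children by name) is made up for illustration and is not how PAF or any existing program works. Missing fields are simply skipped, so a record without all the dates and places can still match on what it does have.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Person:  # hypothetical record layout, for illustration only
    name: str
    birth_year: Optional[int] = None
    birth_place: Optional[str] = None


@dataclass
class Family:  # hypothetical: one couple and their known children
    wife: Person
    husband: Person
    children: List[Person] = field(default_factory=list)


def _norm(value):
    """Lower-case and trim strings so 'Geneva, NY' equals 'geneva, ny'."""
    return value.strip().lower() if isinstance(value, str) else value


def person_evidence(a: Person, b: Person) -> Tuple[int, int]:
    """Return (agreements, conflicts) over the fields both records have."""
    agreements = conflicts = 0
    for x, y in ((a.name, b.name),
                 (a.birth_year, b.birth_year),
                 (a.birth_place, b.birth_place)):
        if x is None or y is None:
            continue  # missing data neither helps nor hurts
        if _norm(x) == _norm(y):
            agreements += 1
        else:
            conflicts += 1
    return agreements, conflicts


def family_evidence(f1: Family, f2: Family) -> Tuple[int, int]:
    """Add up agreements and conflicts for wife, husband, and same-named children."""
    pairs = [(f1.wife, f2.wife), (f1.husband, f2.husband)]
    by_name = {_norm(c.name): c for c in f2.children}
    for child in f1.children:
        other = by_name.get(_norm(child.name))
        if other is not None:  # children known to only one tree are ignored
            pairs.append((child, other))
    agreements = conflicts = 0
    for a, b in pairs:
        agree, conflict = person_evidence(a, b)
        agreements += agree
        conflicts += conflict
    return agreements, conflicts
```

A program could then treat two families as a likely match only when there are many agreements and no conflicts. Where to draw that line is a judgment call that this sketch does not try to settle.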
A second element of an intelligent match, even without the above, is that the program should figure out that if I merged child A1 in Family A with child B1 in Family B, then it's a good guess that I can merge everyone in Family A with their matches in Family B, assuming they have the same first names and birth dates. In other words, once I identify a match, it could go on its merry way. Multiple generations could be merged very quickly with this approach, simply by making one match, as the parents would be children in the next family, and so on.
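Here is a minimal sketch of that "one match leads to the next" idea, again in Python and again with an invented data layout: each tree is a dict of person id to a record with "first", "born", "father", "mother", and "children" keys. The function name propagate_matches and the same-first-name-and-birth-year test are my assumptions, and the code only compares birth years rather than full dates.

```python
from collections import deque


def same_person(a, b):
    """Assumed rule: same first name and the same known birth year."""
    return (a["first"].strip().lower() == b["first"].strip().lower()
            and a.get("born") is not None
            and a.get("born") == b.get("born"))


def propagate_matches(tree_a, tree_b, seed):
    """Start from one confirmed pair (id_in_a, id_in_b) and spread outward.

    Parents are compared role by role, and children are compared against
    each other, so siblings are reached through a matched parent.
    Returns a dict mapping ids in tree_a to ids in tree_b.
    """
    matches = {seed[0]: seed[1]}
    used_b = {seed[1]}
    queue = deque([seed])
    while queue:
        a_id, b_id = queue.popleft()
        a, b = tree_a[a_id], tree_b[b_id]
        candidates = []
        for role in ("father", "mother"):
            pa, pb = a.get(role), b.get(role)
            if pa is not None and pb is not None:
                candidates.append((pa, pb))
        for ca in a.get("children", []):
            for cb in b.get("children", []):
                candidates.append((ca, cb))
        for x, y in candidates:
            if x in matches or y in used_b:
                continue  # each person is merged at most once
            if same_person(tree_a[x], tree_b[y]):
                matches[x] = y
                used_b.add(y)
                queue.append((x, y))  # the new pair becomes the next starting point
    return matches
```

Because every new pair goes back on the queue, a single confirmed match can climb through the parents into the grandparents' family and beyond, and it stops wherever the names or birth years stop agreeing.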
Again, this sounds easy on paper, but how hard is it really? I would imagine it would take a fair amount of CPU time, and it probably couldn't be done in real time. The first suggestion could work like a web crawler, constantly looking for possible merges, and the second one would be a batch process that runs asynchronously.
Even better: the program that looks for possible matches could simply flag them to the two tree owners as possible matches, and let them decide.
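As a sketch of what that flagging pass might look like, the Python below groups records by a coarse key (name plus birth year) so the program never has to compare every record against every other one, and then reports pairs that belong to different owners. The record fields ("owner", "id", "name", "born") and the function name flag_possible_matches are assumptions of mine, and a real crawler would want a more forgiving key than an exact name match.

```python
from collections import defaultdict
from itertools import combinations


def flag_possible_matches(records):
    """Return (owner1, id1, owner2, id2) tuples for the owners to review.

    Records are bucketed by (normalized name, birth year), and only records
    that land in the same bucket are ever compared with each other.
    """
    buckets = defaultdict(list)
    for rec in records:
        if rec.get("born") is None:
            continue  # too little information to bucket safely
        key = (rec["name"].strip().lower(), rec["born"])
        buckets[key].append(rec)

    flags = []
    for group in buckets.values():
        for r1, r2 in combinations(group, 2):
            if r1["owner"] != r2["owner"]:  # duplicates inside one tree are not flagged here
                flags.append((r1["owner"], r1["id"], r2["owner"], r2["id"]))
    return flags
```

Nothing is merged here. The output is just a list of suggestions, so the two tree owners still make the final call.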