Image location software??



bobt
18-12-2016, 6:41am
I know there are lots of programs that find duplicate images on your PC, but does anyone know of a program that finds specific duplicate images? I have 300,000 images of one sort or another (or so my current software tells me, and that includes backups and yes - I do need to cull them). Running a program to find all duplicates finds a zillion matches. However, if I have several versions of one specific image I have no way of finding just those specific duplicates without finding *all* duplicate images!

I'd just like to give the software one image and tell it to go find all the other versions of that one image rather than all duplicates. Is there such a program?? I can't find anything via Google so far.

Warbler
18-12-2016, 7:06am
ACDSee will do it Bob. I thought you already had that? Put one copy of what you believe is one of the duplicates you mentioned in a separate directory and then search for duplicates from that directory in the other locations (drives or directories) and it will just search for duplicates of what is in that single directory. It's pretty easy really. It will also preview the images if you click on the duplicates so you can double-check. You can either select which versions to delete, or batch delete from the duplicate location. It doesn't just match by filename either. Got a job on at 7:30, but I'll check when I get back, and can give you some more tips on ACDSee. Download the trial.
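
For anyone who would rather script the "one file against other locations" search described above, here is a minimal Python sketch. It says nothing about how ACDSee works internally; it simply assumes that "exact duplicate" means byte-identical and compares SHA-256 digests. The function names `file_digest` and `find_exact_copies` are made up for illustration.

```python
import hashlib
import os


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_exact_copies(reference, search_roots):
    """List files under search_roots whose bytes match the reference file."""
    ref_real = os.path.realpath(reference)
    ref_size = os.path.getsize(reference)
    ref_digest = file_digest(reference)
    matches = []
    for root in search_roots:
        for folder, _subdirs, names in os.walk(root):
            for name in names:
                candidate = os.path.join(folder, name)
                if os.path.realpath(candidate) == ref_real:
                    continue  # skip the reference file itself
                try:
                    # Cheap size check first; hash only on a size match.
                    if os.path.getsize(candidate) != ref_size:
                        continue
                    if file_digest(candidate) == ref_digest:
                        matches.append(candidate)
                except OSError:
                    continue  # unreadable file: ignored in this sketch
    return sorted(matches)
```

Because it compares content rather than names, a renamed copy is still found, but any edited version (different bytes) is not.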

arthurking83
18-12-2016, 8:17am
XnView will also do it for 'ya too Bob.

XnView has two different modes of installation.
One you can install properly (the .exe) as you normally do with other programs, the other is a portable install(the zip file).

XnView is freeware (http://www.xnview.com/en/xnviewmp/) for personal use(pretty good too).

I first used it to do other stuff(which it can't do) but found it was a handy program to have sitting on my PC.
I personally use the zip version as it doesn't install itself completely onto the PC.
(ps. reason for the zip version is that you can run the program from a USB thumbdrive if needed .. hence the 'portable' part of the name! ;))

(if you try it) Once you have it installed, navigate to your file and highlight it. Then, go to Tools->Search similar files. A new dialogue box will open up with that file listed. On the RHS you have a selection box area where it allows you to add other locations.
Use that to point it to any other sources where the file could reside.

Program speed is inversely proportional to the size of your image store and directly proportional to the speed of that storage.

One last note: if you look carefully at the bottom edge of that search dialogue box you will see three other options you can choose from. This is the Method for searching.
You can try the search by image content method, but it's slow .. glacially slow!

You can use the simple search method of 'same file name' which doesn't search for embedded data, but I always use the same file data search method, which obviously searches for metadata as well as file name.
Slower, but more thorough.

bobt
18-12-2016, 12:45pm
ACDSee will do it Bob. I thought you already had that?

I do indeed ... and I hadn't realised it had that search facility. Thus far I haven't been able to get it to do what I want, which is actually more than finding duplicates, but more finding the various different versions I have of the same image. As they will differ slightly due to different processing methods, an exact file search isn't going to find them. However, it's a useful tip because it's much quicker than others due to the fact that it already has a database of all the images, whereas other programs need to start from scratch. I'll keep fiddling with it and see whether it will do more than I've managed thus far - thanks!


XnView will also do it for 'ya too Bob.

Thanks Arthur, that's an interesting little program - not heard of that before! Theoretically it should do what I want, but so far it hasn't. I've tried the various combinations, and given it searches with varying levels of specificity, but it keeps finding waaaay more images than just the ones I want it to find. I'll keep playing with this too seeing that it does have a bit of flexibility in its search methods. Might be useful for other stuff too!

Warbler
18-12-2016, 1:43pm
Now I've read your reply, I can say that ACDSee won't do what you want. It will only detect exact matches. What you're after is something that will search the IPTC or EXIF data and match the camera settings. Sorry. Edits will alter file size, so ACDSee doesn't see it as a match anymore.

ameerat42
18-12-2016, 1:43pm
The one I use is FastStone Viewer (the viewer only) from here:
http://www.faststone.org/

Needless to say, it's free. (Sorry about that!)
The others are trials... and/or don't work.

Warbler
18-12-2016, 1:50pm
Faststone won't do it either, free or not.

bobt
18-12-2016, 2:34pm
ACDSee will do it Bob. I thought you already had that? Put one copy of what you believe is one of the duplicates you mentioned in a separate directory and then search for duplicates from that directory in the other locations (drives or directories) and it will just search for duplicates of what is in that single directory. It's pretty easy really. It will also preview the images if you click on the duplicates so you can double-check. You can either select which versions to delete, or batch delete from the duplicate location. It doesn't just match by filename either. Got a job on at 7:30, but I'll check when I get back, and can give you some more tips on ACDSee. Download the trial.


Now I've read your reply, I can say that ACDSee won't do what you want. It will only detect exact matches. What you're after is something that will search the IPTC or EXIF data and match the camera settings. Sorry. Edits will alter file size, so ACDSee doesn't see it as a match anymore.

Yeah ... it's a real problem when you work with various versions of the image. Handy that you put me on to what ACDSee *can* do, but I think I'm struggling to find one that can find similar but nearly identical images. Arthur's idea nearly hits the spot, but as far as I can tell it still brings up every image on the planet. I need something that just looks at one image and searches for matches or near matches on that one image.
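
To make "give it one image and find the near matches" concrete, here is a rough Python sketch of one common approach, an average hash compared by Hamming distance. This is not what ACDSee, XnView or any other program mentioned in this thread is known to do. It assumes each image has already been decoded to grayscale and shrunk to the same small grid by some image library (not shown), and the names `average_hash`, `hamming_distance`, `find_versions` and the `max_distance` threshold are all invented for the example.

```python
def average_hash(pixels):
    """Build a bit tuple: 1 where a pixel is brighter than the grid mean."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    return tuple(1 if value > mean else 0 for value in flat)


def hamming_distance(hash_a, hash_b):
    """Count the positions at which two equal-length hashes differ."""
    if len(hash_a) != len(hash_b):
        raise ValueError("hashes must have the same length")
    return sum(a != b for a, b in zip(hash_a, hash_b))


def find_versions(reference, candidates, max_distance=2):
    """Return names of candidates whose hash is near the reference hash."""
    ref_hash = average_hash(reference)
    return [
        name
        for name, pixels in candidates.items()
        if hamming_distance(ref_hash, average_hash(pixels)) <= max_distance
    ]


# Tiny made-up 4x4 grids: an original, a brightened edit, an unrelated image.
original = [[10, 20, 200, 210], [15, 25, 205, 215]] * 2
brighter = [[value + 10 for value in row] for row in original]
unrelated = [[200, 210, 10, 20], [205, 215, 15, 25]] * 2

print(find_versions(original, {"brighter": brighter, "unrelated": unrelated}))
```

The point of the design is that only the one reference hash is compared against everything else, so the result is the versions of that single image rather than every duplicate pair on the drive.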

ameerat42
18-12-2016, 3:50pm
Faststone won't do it either, free or not.

I see. Ta for the info:(


The one I use is FastStone Viewer (the viewer only) from here:
http://www.faststone.org/

Needless to say, it's free. (Sorry about that!)
The others are trials... and/or don't work.

farmmax
18-12-2016, 11:04pm
A quick Google check came up with this page http://www.makeuseof.com/tag/5-ways-to-find-duplicate-image-files-on-your-computer-windows/

See if one of those pieces of software will help

arthurking83
19-12-2016, 12:29am
..... What you're after is something that will search the IPTC or EXIF data and match the camera settings. ....

That's what XnView can do.

This is why I prefer my tag data embedded in the raw file, rather than the way many software do it via their own proprietary database/side car file versions.

@ Bob, if you try searching with XnView and choose the "same file data" the search can be(more likely, will be!!) slow.

The only other suggestion I can offer is to search your images by date shot. Not date modified.
You may have processed your images differently on different dates, so date created could be different, as well as date modified.
But date shot is set in stone .. as long as you haven't stripped the exif out of the images.

So if you shot similar images on a specific date(as one tends to do when on a shoot) at least they are grouped all together and you can visually see their similarities and differences.

Of course this is of no help if you shot similar/same subject matter at two different times and try to compare them.
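
As a rough illustration of the "group by date shot" idea, here is a small Python sketch. It assumes the date-shot value has already been pulled out of each file by some EXIF reader (not shown) and is in the usual EXIF text form `YYYY:MM:DD HH:MM:SS`. The function name and the file names are invented for the example.

```python
from collections import defaultdict
from datetime import datetime

# EXIF stores the date shot as text in this layout.
EXIF_DATE_FORMAT = "%Y:%m:%d %H:%M:%S"


def group_by_date_shot(date_taken_by_file):
    """Group file names by the calendar day on which they were shot."""
    groups = defaultdict(list)
    for name, stamp in date_taken_by_file.items():
        shot_on = datetime.strptime(stamp, EXIF_DATE_FORMAT).date()
        groups[shot_on.isoformat()].append(name)
    # Sort days and names so the listing is stable and easy to scan.
    return {day: sorted(names) for day, names in sorted(groups.items())}


# Hypothetical records: a raw file, an edit of it, and a shot from the next day.
records = {
    "DSC_0001.NEF": "2016:12:18 06:41:00",
    "DSC_0001_E.jpg": "2016:12:18 06:41:00",
    "DSC_0002.NEF": "2016:12:19 10:00:00",
}
for day, names in group_by_date_shot(records).items():
    print(day, names)
```

Edits saved with their EXIF intact land in the same group as the original, whatever their file dates say.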

XnView could (big ? tho!) try to find similar images by content if that's something you're after.

Another quite annoying, but useful software, still in the realm of free to use, is Microsoft's Photo Gallery.
On the whole, it's quite annoying to use, but it does have some handy features.
One: is the above mentioned sort by date shot method of sifting through your images.
Two: is the ability to write keyword metadata directly into a raw file. The only caveat here is that you have the codec for that raw file type installed(and of course that you're using Windows).
Once the raw codec is installed Windows will natively allow you to see your raw images as previews in Windows Explorer.(this is good)

So what Bob could do:
Use Photo Gallery to view all images on his entire PC(you can select only specific paths if required)
After many days of waiting .. :p .. for Gallery to finally build its database, it finally allows you to do stuff.
If you sort by date, and see a whole bunch of images that you want to group together(irrespective of their file type), you'd highlight them all and then add a tag, or selection of tags to them via Gallery.
Once you've started building a collection of tags built up in the system, it helps you autocomplete the tagging process. Very helpful when there are still 40+K of images that require tagging.
If you have any tag/keyword metadata already entered into any images, Gallery will build on that too. Photo Gallery and Windows Explorer interoperate really well together, so ...

Handy part #3 is that as you build up the tag/keyword collection into as many images as you can .. you subsequently don't need any other software to search your images other than Windows Explorer.
You can use other software to search/edit/or whatever your tagged images, but the point of handy part #3 is that if all you want is to do a quick search for any keyword data you just do it directly in Windows Explorer.

I do this all the time, and it makes life quite easy sometimes.

Something to watch for with duplicate names is: (if you weren't regimented in your organisation ability from day one!) .. you could have a few files with the same name, that are actually different.
I wasn't prepared for this when I first started shooting, and I use camera generated names as my file names. (I have since changed when I first noted the problem).
So, I have DSC_0001 that is one image, and DSC_0001 that is also another non related image.
Then I got a new camera, and it wanted me to name the images DSC_0001 too, but I held my ground and changed it on the camera! :D .. not all cameras(eg. Nikon D70) allow this, but thankfully the D300 did.
I also then changed the way I transfer to the PC too, in that I subsequently learned that a camera model name would be handy to (auto) add as a prefix too during upload.
So D300 became D300_DSC_0001 but D800E became D800E_DSC_0001. This way there is no confusion as to who the real DSC_0001 really was :p
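
The prefix-on-upload idea is easy to sketch in Python. This is only an illustration of the naming rule, not the transfer software actually used here, and `add_camera_prefix` is a made-up name. It returns the new name only; doing the actual rename on disk is left out.

```python
def add_camera_prefix(filename, camera_model):
    """Prefix a camera-generated name with the camera model."""
    prefix = camera_model + "_"
    if filename.startswith(prefix):
        return filename  # already prefixed, so leave it alone
    return prefix + filename


print(add_camera_prefix("DSC_0001.NEF", "D300"))
print(add_camera_prefix("DSC_0001.NEF", "D800E"))
```

The already-prefixed check means running it twice over the same folder does not produce names like D300_D300_DSC_0001.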

Just some gotchas to be aware of when searching for images if you weren't pre-prepared.

Note too, if you do want to give Windows Photo Gallery a shot too(and I also recommend that one too) just be aware of what you install when you try to install Windows Live crap.
You have a choice of what you want to install. Just untick everything other than Gallery .. and you then only install Gallery only .. not Messenger and Writer and whatever other crapware they bundle all together in the one package.
Last note is that after finally finding some software that tags (raw) images the way I prefer, M$ have recently announced that Gallery (and Live itself) is not going to be supported any longer(Jan 10 2017!).
They want to migrate folks to their new Photos app on Win10! :angry0:

That doesn't mean that Gallery won't work any more tho, just probably won't install on Win11. So for me personally, I'm kind'a in a mad rush to finally try to tag/keyword those last 40+K images before I'm forced to update to Win11 :rolleyes:

Oh! ... and I highly recommend that you tag/keyword your images .. quickly! .. before the next time comes around that you have to search for any of them ;)

bobt
19-12-2016, 10:35am
A quick Google check came up with this page http://www.makeuseof.com/tag/5-ways-to-find-duplicate-image-files-on-your-computer-windows/

See if one of those pieces of software will help

Thanks for that - some of those look quite promising! I'll start working my way through them and see if any of them do the trick. Someone in the world must have solved this problem effectively! :rolleyes:

- - - Updated - - -


This is why I prefer my tag data embedded in the raw file, rather than the way many software do it via their own proprietary database/side car file versions.

Cataloging images is a bit like jumping out of a plane without a parachute ....... you find yourself saying "Why the heck didn't I do that *before* I jumped!". Now I have so many images without any sort of manual tags, and the task to go back and do it is now insurmountable! I should have thought of that years ago when I only had a few ..... :eek:



The only other suggestion I can offer is to search your images by date shot. Not date modified.

I think this makes a lot of sense. If some of the software that's been suggested doesn't do it for me, that's the way to go!



Something to watch for with duplicate names is: (if you weren't regimented in your organisation ability from day one!) .. you could have a few files with the same name, that are actually different.

I discovered that when I was searching - same problem. Once you change cameras or run out of numbers you go back and start again - I have numerous images of the same name - it's a real trap. The other trap is to let any software do *anything* automatically without getting your permission first! Some programs like to go ahead and delete all sorts of stuff on the basis that you don't want duplicates - but sometimes you do want duplicates, such as in your backup directories. Software has to know its place and ask before doing stuff. Just like I do with my wife ....... :D


Thanks for all your ideas and suggestions - very much appreciated! Now to the road testing phase ........ :eek:

Warbler
19-12-2016, 10:56am
I'd agree with Arthur about the need to plan how you store, catalogue, and keyword your files, and I'd add another suggestion. I rename all my files when I load them onto my computer. When I edit them, I add "_E" to the filename, or "_E2" for the second edit. I'll have the same original filename in all edits regardless of whether they are PSD, PSB, JPG, or any other format. Makes it easy to use search engines to find them. Requires a bit of discipline though. I have had all three of those software apps discussed above installed on my computer for a couple of years now. I use them all for different tasks from time-to-time. I've never found one app that does everything. Good luck with it Bob.
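
With a naming scheme like the "_E" / "_E2" one above, finding every version of one image becomes a simple name match. Here is a small Python sketch under that assumption; the function names and the example file names are invented, and a file whose original name already ends in "_E" would be misread by this pattern.

```python
import re
from pathlib import PurePath

# Matches a trailing edit marker: _E, _E2, _E3, ...
EDIT_SUFFIX = re.compile(r"_E(\d*)$")


def original_stem(filename):
    """Strip the extension and any trailing _E / _E2 edit marker."""
    return EDIT_SUFFIX.sub("", PurePath(filename).stem)


def versions_of(filename, all_files):
    """Return every file that shares the given file's original name."""
    target = original_stem(filename)
    return sorted(name for name in all_files if original_stem(name) == target)


files = [
    "Beach_01.NEF",
    "Beach_01_E.psd",
    "Beach_01_E2.jpg",
    "Beach_02.NEF",
    "Beach_010_E.jpg",
]
print(versions_of("Beach_01_E.psd", files))
```

Any one of the versions can be used as the starting point, since they all reduce to the same original name regardless of format.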

ameerat42
19-12-2016, 11:04am
I do the same, Warbs, using "a", "a2" for "adjust", "c"rop. "lr" for low-res. So files can end up with
all of "aclr" in the names. Just saying this because I agree with you that it's a useful guide.

bobt
19-12-2016, 11:12am
Requires a bit of discipline though.

Ahhhh ..... now *THAT's* what I need an app for!! I need not only more memory (that's for me, not the computer) but a lot more discipline (not that sort Ameerat) :D

I should be getting rid of so much on my PC, but I always take the easy way out and just buy another hard drive. Yup ... discipline ...... definitely gotta find that app ......

landyvlad
19-12-2016, 2:47pm
Like anything if you want something that does the job properly, you have to pay for it.
I've used a trial of this and it worked very well. Actually works on visual similarity comparison rather than just filenames like cheaper/stupider programs. That way it identifies copies where you have an original, and maybe several lower-res versions. You can choose to keep the original and ditch the rest etc.

I purchased their duplicate file finder software (for a purpose other than photos) and it is really good, so I see no reason this wouldn't do the job for you, and more !

http://www.mindgems.com/products/VS-Duplicate-Image-Finder/VSDIF-About.htm

arthurking83
20-12-2016, 12:05am
..... Now I have so many images without any sort of manual tags, and the task to go back and do it is now insurmountable! I should have thought of that years ago when I only had a few ..... :eek:
....

You and me both.

I started keywording my images in about 2010-ish .. maybe later(can't remember exactly, but well into the depths of having many thousands of images NOT keyworded).

I tried a few programs, and one of them was IDImager version (4 I think). This one was my fave. Could be slow, but that issue was nothing compared to when the developer stopped supporting it and turned to a new program(Photo Supreme).
I didn't like it as much, so had to kind'a switch.

ps. (and note to self) .. must try it again to see how well it's progressed since my last try.

... anyhow, much maddening searching and fumbling with Nikon's software, and one day I stumbled onto that annoying Windows Photo Gallery.
The main reason I recommended it is that it's probably the most helpful program to start the redress action of not having keyworded your previous images.
The way it works, in that it groups images as you tag them, so the more you tag them, the more refined the grouping, and the easier it subsequently becomes to tag them again with more detail.

Of my image store, I think I started with about 75K images NOT tagged, and now I'm down to about 40K left to go .. about a year or so, of half hearted efforts to tag as many of them as I could. It does get tedious, but you do go back over images you may have forgotten about for a while too .. so good and bad points about it all.

Also note that many images may not be worth tagging either, so that above value of 40K images may not be as accurate as it seems.
Duplicate file names aren't as big a problem if your directory structure and hierarchy is solid and sound.
If your software allows renaming on import, then I'd suggest a date suffix would be a handy way to show them listed in an easy to follow orderly process.
Date formatted by year-month-day .. so that DSC_0001 would become DSC_0001_20161219.
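
A quick Python sketch of that date-suffix rule, for anyone scripting their own import. The function name is made up, and the date shot is assumed to be known already (for example, read from the EXIF by some other tool).

```python
from datetime import date
from pathlib import PurePath


def add_date_suffix(filename, shot_on):
    """Append the date shot as _YYYYMMDD, keeping the extension in place."""
    path = PurePath(filename)
    return f"{path.stem}_{shot_on:%Y%m%d}{path.suffix}"


print(add_date_suffix("DSC_0001.NEF", date(2016, 12, 19)))
```

Year first, then month, then day means a plain alphabetical sort of same-numbered files is also a chronological one.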

Warbler
20-12-2016, 8:50pm
You know I've always thought that putting the date in a filename was a waste of time. Even windows will sort by date taken in file manager. But, each to his own I suppose. Bob, try right clicking on the header bar in file manager and you can add EXIF fields, and IPTC fields to that, and then sort by them. Hell, you can even sort by whether the flash fired or not.

arthurking83
21-12-2016, 12:33am
You know I've always thought that putting the date in a filename was a waste of time. .....

Sometimes it's not just about sorting, it's useful for other reasons.

eg. I used to use my D70, I then updated to the D300 and then to the D800E.

Each camera made an image named DSC_0001.
Problem then is, I have three DSC_0001 images on my PC/storage somewhere.
I'm fairly good with keeping a track on what goes where and have well defined structured directories so that there is no cause for mistaken identity.
But, the fact remains that I have 3 images all called DSC_0001 which are all different.

I have prefixes added to images on download, which reflect the camera they came from(ie. D70s_DSC_0001) now, but up until such time that it wasn't a problem, I didn't do that.
So I have a few replicated names of images(eg. all from the D70s wayy back when) .. so in this situation, using the date that those images were taken I'd find that a date suffix on those files is probably the most efficient way to 'arrange them' and not confuse them with each other if the time comes to clean up the storage space.


So if I see two(or more) images both named DSC_0001.NEF I could be inclined to delete one(ie. space saving), without double checking the files.
Of course a methodical person would double and triple check them, and there's the possibility that an automated program that cleans up duplicate files may not be necessarily so smart to double check either.
Or that a backup program may update the latest date stamped file over the older date stamped file.

The idea is that it's eliminating the possibility for mistaken identity between DSC_0001_20161231 and DSC_0001_20170101 .. not so much for the recognition, or organisation, of the date itself.
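
Before renaming or cleaning anything up, it can help to list where the mistaken-identity risk actually exists. The Python sketch below (a made-up helper, not a feature of any program in this thread) walks a folder tree and reports file names that occur more than once with different content. It reads each file fully into memory, which is fine for a sketch but not ideal for very large files.

```python
import hashlib
import os
from collections import defaultdict


def name_collisions(root):
    """Map a file name to its paths when same-named files differ in content."""
    digests_by_name = defaultdict(dict)
    for folder, _subdirs, names in os.walk(root):
        for name in names:
            path = os.path.join(folder, name)
            with open(path, "rb") as handle:
                digest = hashlib.sha256(handle.read()).hexdigest()
            # Group paths by content digest under each file name.
            digests_by_name[name].setdefault(digest, []).append(path)
    return {
        name: sorted(p for paths in by_digest.values() for p in paths)
        for name, by_digest in digests_by_name.items()
        if len(by_digest) > 1  # more than one distinct content for this name
    }
```

Same-named files with identical content (such as backups) are deliberately left out of the report, since those are true duplicates rather than different images sharing a name.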

The other thing I do(on camera) is to manually set a new naming regime for the captured images.
The cameras all seem to start at DSC_0001(so there's a high chance that we all have a file named DSC_0001 somewhere on our storages! :p)
Once I reach (approx) DSC_9999, I then reset the naming on the camera to the next level up for the alphabetic component.
ie. from DSC_ to DSD_. The numerical sequence is automatically increased by the camera, but it's not smart enough to update the alphabetical component.

Some cameras allow for in camera file name setting, others don't.



..... Bob, try right clicking on the header bar in file manager and you can add EXIF fields, and IPTC fields to that, and then sort by them .....

I think I've mentioned it previously, but a neat trick that you can do to add tags to images quickly efficiently and painlessly.
Using Windows Explorer:
On a jpg file, you need nothing special .. if you have a known store of images of a particular session/genre/style/subject matter/whatever, you highlight all the relevant images, rightclick them select properties and choose the details tab.
In the details tab, if you hover the mouse icon over the tag field box, it turns into an editable field. click it and add a tag.
That tag, will be embedded into the IPTC field in the image.

note that IPTC field is a specific field. Think of it as part of the EXIF field. But to be technical IPTC isn't EXIF, it's a separate area .. it's a specific part of the image known as IPTC! This can be important depending on the software used to view/add/edit the embedded data.

So while this is a native part of Windows system which works on jpgs, you can't do this to a raw image by default. Install the camera manufacturer's codec relevant to the raw file type you work with, and Windows can then add tag data in the same way.
Because it is added into a standardised field within the image, not only can you search for it as Warbler describes in Windows (Explorer) but all other software can also use that added tag data too.

farmmax
21-12-2016, 12:34am
You know I've always thought that putting the date in a filename was a waste of time. Even windows will sort by date taken in file manager.

I'm with you on this one :nod:

I batch rename the files as I move them from the camera with Faststone. The names will contain a few keywords to describe the files. I have folders set up with all the different subjects I've found I needed over the years and I drag and drop the files into the relevant folders. I like the file names to reflect the the file content, then any filing system can sort and find a topic quickly. If that information is only in the exif information, not all software can access it and sort on it.

bobt
21-12-2016, 5:41pm
A quick Google check came up with this page http://www.makeuseof.com/tag/5-ways-to-find-duplicate-image-files-on-your-computer-windows/

See if one of those pieces of software will help

Thanks for those tips - currently working my way through them to see whether they can be useful - much obliged. :th3:

bobt
21-12-2016, 7:28pm
You know I've always thought that putting the date in a filename was a waste of time. Even windows will sort by date taken in file manager. But, each to his own I suppose. Bob, try right clicking on the header bar in file manager and you can add EXIF fields, and IPTC fields to that, and then sort by them. Hell, you can even sort by whether the flash fired or not.

I must say that EXIF data is one of the most useful inventions known to photographers! The information that is there by default is such a bonus. I use EXIF data all the time (although I am sometimes suspicious of "date created" which I am never sure of). EXIF data has prevailed where my poor housekeeping has failed!

Warbler
21-12-2016, 7:33pm
... (although I am sometimes suspicious of "date created" which I am never sure of)...

Try "Date Taken". Date created refers more to when you copied the file to your computer from memory.

bobt
21-12-2016, 7:37pm
Arthur .... thanks for all your tips and information. I think the best solution might be to employ you to organise all my zillion images, thus cutting out the middle software! 8*)

One solution I have which certainly helps is the idea of a database - and fortunately, ACDSee has already produced that for me - and it is very useful if a tad slow. It's certainly great to have an existing database because any new software has to start from scratch and build one.

The longer term solution is going to be a thorough clean out of my numerous hard drives and backup - but that will require some considerable courage and determination!!! In the meantime I'll be trying out the various options suggested in this thread - yet another demonstration of why many heads are better than one!!

- - - Updated - - -


Try "Date Taken". Date created refers more to when you copied the file to your computer from memory.

Those terms have always puzzled me I must admit. I must research what the default action is in relation to EXIF changes, and which dates remain unchanged. Poor terminology really when "date taken" and "date created" are to me essentially the same thing (or should be in a logical sense.)

Warbler
21-12-2016, 7:39pm
Date created means when the file was created on the media you're looking at. Copy it to another location and that will change.

bobt
21-12-2016, 7:47pm
Date created means when the file was created on the media you're looking at. Copy it to another location and that will change.

Thanks! That explains a lot ...... seems illogical to me, but at least now I know what illogic I'm dealing with! :confused013

Warbler
21-12-2016, 8:33pm
A quick and dirty way to do some matching Bob. Use windows explorer to do a search on the root of a single hdd. Search the drive for all image types you're looking for and include the sub-directories in your search. When the results appear, go up to the headings of the columns, right-click and select date taken. Then click on that column to sort by that field. At least it will put all the images into chronological order of date taken.

bobt
21-12-2016, 9:29pm
A quick and dirty way to do some matching Bob. Use windows explorer to do a search on the root of a single hdd. Search the drive for all image types you're looking for and include the sub-directories in your search. When the results appear, go up to the headings of the columns, right-click and select date taken. Then click on that column to sort by that field. At least it will put all the images into chronological order of date taken.

The ACDSee database does pretty much all that and with thumbnails - so with that and a new understanding of the exif data I'm well on the way to at least visualising the duplicates if not finding a specific one without all the rest as well. The Explorer option also sounds useful.