
14th April 2008, 10:36 PM
Registered User
Join Date: Apr 2005
Location: Littleton, CO
Age: 28
Posts: 2,855
List duplicate filenames and relative path.
I'm trying to write a script that will give me a list of locations for files with identical names. Let me explain.
I want to run this script on, or in, folder A. Folder A contains folders 1 and 2, with a file named test.txt in both folders. Folder 1 also contains folder 1a which also has a file test.txt in it. Folder 2 has a file named test2.txt in it. Folders 1a and 2 also contain a file named test3.txt.
When I run my script I want it to output something like this.
Code:
./A/1/1a/test.txt
./A/1/test.txt
./A/2/test.txt
./A/1/1a/test3.txt
./A/2/test3.txt
That way I know where all the files named test.txt and test3.txt are. Since there is only one file named test2.txt, it is not listed.
I'm also sure that I want filenames, not actual file contents, to be the basis for this output. The reason for the filename being important is a little convoluted.
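For anyone who wants to try suggestions against the layout described above, here is a minimal sketch that recreates it. The folder and file names come from the description; the function name make_example_tree is made up for illustration.

```shell
#!/bin/sh
# Recreate the example layout under a base directory (default: current directory).
# make_example_tree is a hypothetical helper name, not an existing tool.
make_example_tree() {
    base="${1:-.}"
    mkdir -p "$base/A/1/1a" "$base/A/2"
    touch "$base/A/1/test.txt" \
          "$base/A/1/1a/test.txt" "$base/A/1/1a/test3.txt" \
          "$base/A/2/test.txt" "$base/A/2/test2.txt" "$base/A/2/test3.txt"
}
```

Calling make_example_tree with no argument creates folder A in the current directory, so a script under test can be pointed at ./A.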

14th April 2008, 11:19 PM
Administrator
Join Date: Sep 2006
Location: Connellsville, PA, USA
Posts: 11,289
Hi leadgolem:
Do you need this to be a script, or can you use a package like fslint or kleansweep? They (or at least kleansweep) will run a recursive scan for duplicate files on selected folders, and you can save the report. Lots of other things those two apps can do, but you can do just a scan for dupes, nothing else, on selected folders.
V

14th April 2008, 11:47 PM
Registered User
Join Date: Apr 2006
Location: Ohio, USA
Posts: 8,346
Mwwwahahahahahaha - Since before your star burned bright in space, I have awaited a question.
I love a good quiz!
I assume you don't know the file names but you want to see all duplicates and only duplicates.
How does this feel? (Feel free to diddle the "-type f" find option for your needs.)
Code:
find ./A -type f -printf "%p %f\n" | sort -k 2 | uniq -f 1 -D | cut -d' ' -f1
If you have to deal with a lot of files with embedded spaces ' ' in their names, you'll have to change the delimiter throughout.
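One possible way to cope with embedded spaces, sketched under the assumption that no file name contains a tab or a newline. Since uniq -f always splits fields on blanks and has no delimiter option, this version swaps uniq and cut for a small awk filter and puts the basename first, separated from the path by a tab. The function name dupes_with_spaces is hypothetical, and GNU find is assumed for -printf.

```shell
#!/bin/sh
# Hypothetical helper: list paths whose basename occurs more than once under $1.
# Assumes GNU find (-printf) and no tabs or newlines in any file name.
dupes_with_spaces() {
    tab=$(printf '\t')
    find "${1:-.}" -type f -printf '%f\t%p\n' |   # emit "basename<TAB>path"
        LC_ALL=C sort -t "$tab" -k 1,1 |           # bring identical basenames together
        awk -F '\t' '
            ($1 "") == (prev "") {       # same basename as previous line (string compare)
                if (held != "") { print held; held = "" }   # emit first path of the group once
                print $2
                next
            }
            { prev = $1; held = $2 }     # new basename: hold its path until a twin appears
        '
}
```

It is called the same way as the one-liner, for example dupes_with_spaces ./A.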
Last edited by stevea; 14th April 2008 at 11:58 PM.

15th April 2008, 01:36 AM
Registered User
Join Date: Apr 2005
Location: Littleton, CO
Age: 28
Posts: 2,855
@Hlinger, not specifically. It just seemed to me that a shell script should be suited to this type of search. Hmm, I might look at kleansweep for some other applications.
@stevea, with the slight modification of changing the find path to "./" and running it from inside the folder, those commands work perfectly. I can also guarantee that in my application none of the files have spaces or any other troublesome special characters.
Here's the very slightly modified line, so you can see what I mean.
Code:
find ./ -type f -printf "%p %f\n" | sort -k 2 | uniq -f 1 -D | cut -d' ' -f1
EDIT: This is not something I was able to put together myself, as I have never used sort, uniq, or printf.
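For readers in the same position, here is the same pipeline written one stage per line with a comment on what each stage does. This is only an annotated sketch: the wrapper name same_names is made up for illustration, and GNU find and GNU uniq are assumed (for -printf and -D).

```shell
#!/bin/sh
# The pipeline from this thread, one stage per line.
# same_names is a hypothetical wrapper name used only for this illustration.
same_names() {
    find "${1:-./}" -type f -printf "%p %f\n" |  # print "path basename" for every regular file
        sort -k 2 |                              # sort on the basename (field 2 to end of line)
        uniq -f 1 -D |                           # ignore field 1 (the path); print all lines whose remainder repeats
        cut -d' ' -f1                            # keep only the path
}
```

Run it from inside the folder as same_names ./ to get the same result as the one-liner. The order in which the groups appear depends on the locale's collation rules.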

15th April 2008, 07:21 AM
Administrator
Join Date: Sep 2006
Location: Connellsville, PA, USA
Posts: 11,289
Code:
find ./ -type f -printf "%p %f\n" | sort -k 2 | uniq -f 1 -D | cut -d' ' -f1
Kwik Kwiz
The above line is:
A. Malware code
B. A diabolical magical formula
C. Linux command-line syntax
D. All of the above
Correct Answer: D

15th April 2008, 08:30 AM
Registered User
Join Date: Apr 2006
Location: Ohio, USA
Posts: 8,346
Long ago, when Unix was young, the shell pipe was considered to be a framework for connecting small, powerful utilities into specific tools. The Swiss Army knife of software; an elegant tool for a more civilized age. It's no surprise that all of the constituent binaries are from FSF coreutils and findutils.
They've expanded the possibilities with bash, but it's rarely if ever used.

15th April 2008, 10:16 AM
Registered User
Join Date: Apr 2005
Location: Littleton, CO
Age: 28
Posts: 2,855
That's too bad; the pipe is extremely useful.