Using Fedora General support for current versions. Ask questions about Fedora and its software that do not belong in any other forum.

9th November 2012, 09:56 PM
Registered User
Join Date: May 2012
Location: New York
Posts: 5
find, grep, and white spaces
Hello,
My objective is to make a tar file (a backup) of all files smaller than 5M.
The archive should exclude files in the backups/YY subdirectories and all hidden
files. Some of the file names contain whitespace.
I attempted this with the command below, which resulted in an
empty archive. Taking it piecemeal, grep isn't working correctly. So, what
would be the easiest way to fix this?
Thanks for the help,
H.
Code:
find . -type f -not -size +5M -print0 | grep -v "backups/[0-9][0-9]*" | grep -v "[.]/[.]" | xargs -0 -n1 tar -cf foo.tar
|

9th November 2012, 10:02 PM
"Shells" (of a sub world)
Join Date: May 2011
Location: Helvetic Federation (Swissh)
Age: 33
Posts: 2,607
Re: find, grep, and white spaces
You start looking for files in your current directory.
Let's say you're in $HOME/Desktop: you'll search for files starting with $HOME/Desktop/...
I don't know much about regex, but to me it looks like you're removing entries containing a "/", which would be everything other than the current directory.
__________________
Fedora Manual: http://docs.fedoraproject.org
Script-Tools: https://sourceforge.net/projects/script-tools/
sudo st tweak repo toggle fedora-rawhide ; st iso dl-fed -respin && st iso usb
|

10th November 2012, 08:51 AM
Registered User
Join Date: Dec 2006
Posts: 1,718
Re: find, grep, and white spaces
What did you expect?
-print0 writes everything on one line (the names are separated by null characters, not newlines), and grep -v then removes that whole line ;-)
Code:
-print0
True; print the full file name on the standard output, followed
by a null character (instead of the newline character that
-print uses). This allows file names that contain newlines or
other types of white space to be correctly interpreted by programs
that process the find output. This option corresponds to
the -0 option of xargs.
Your file names include newline characters?
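If the -print0 output does need to go through grep, GNU grep's -z option makes it treat NUL as the record separator on both input and output. A rough sketch, assuming GNU grep; the scratch tree and the file names in it are made up for illustration:

```shell
#!/bin/sh
# Hypothetical scratch tree, for illustration only.
demo=$(mktemp -d)
mkdir -p "$demo/backups/12" "$demo/docs"
: > "$demo/docs/my report.txt"
: > "$demo/backups/12/old.txt"
: > "$demo/.hidden"
cd "$demo" || exit 1

# -z: read and write NUL-terminated records, matching find -print0.
# The first grep drops anything under backups/NN, the second drops any
# path with a component that starts with a dot.
kept=$(find . -type f -print0 \
  | grep -zv 'backups/[0-9][0-9]*' \
  | grep -zv '/\.' \
  | tr '\0' '\n')
printf '%s\n' "$kept"
```

The tr at the end is only there to make the result readable; in a real pipeline the NUL-separated stream would go straight to xargs -0.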
|

10th November 2012, 11:05 AM
Registered User
Join Date: Jul 2009
Location: England, UK
Posts: 821
Re: find, grep, and white spaces
Personally, I should be tempted to use some extra options to find rather than piping the output through grep. Something like
Code:
find . -path ./backup -prune -or -type f -not -iname '.*' -not -size +5M -print
should dump the list of filenames that you want, separated by newlines.
Then, assuming you don't have filenames that include newline characters (and who would?), perhaps the following script will do what you want. Pass the base directory as the first argument and the name of the tar file as the second. (Warning! I've not tested this too hard...)
Code:
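#!/bin/sh
# Split only on newlines. $(echo -e '\n') would give an empty string,
# because command substitution strips trailing newlines.
IFS='
'
set -f  # no globbing, so names containing * or ? are passed through as-is
to_backup=$(find "$1" -path "$1"/backup -prune -or -type f -not -iname '.*' -not -size +5M -print)
tar -cf "$2" $to_backup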
|

10th November 2012, 12:45 PM
Registered User
Join Date: Aug 2009
Location: Waldorf, Maryland
Posts: 6,105
Re: find, grep, and white spaces
The problem is that the "$to_backup" could be several thousand files... and thus overflow the command line parameter buffer.
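To get a feel for the size of that buffer on a given machine, getconf can report it. A minimal sketch; the variable name is just for illustration:

```shell
#!/bin/sh
# ARG_MAX is the maximum combined size, in bytes, of the arguments and
# environment that can be passed to a new process.
limit=$(getconf ARG_MAX)
printf 'ARG_MAX is %s bytes\n' "$limit"
```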
|

10th November 2012, 01:06 PM
Registered User
Join Date: Dec 2006
Posts: 1,718
Re: find, grep, and white spaces
Quote:
|
find . -path ./backup -prune -or -type f -not -iname '.*' -not -size +5M -print
|
This still includes files inside hidden directories such as .cache
|

10th November 2012, 09:57 PM
Registered User
Join Date: May 2012
Location: New York
Posts: 5
Re: find, grep, and white spaces
1.
Quote:
|
find . -not -path '*/.*/*' -not -name '.*' > foo
|
excludes the hidden files and directories
2.
Quote:
|
find . -path './backups/[0-9][0-9]' -prune -or -type f -not -iname '.*' -not -size +5M > foo
|
excludes the backups/YY directories.
3. What I need is for both of these to work together (if this is possible).
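One possible way to combine the two, as a sketch rather than a tested recipe: prune both the backups/NN directories and anything whose name starts with a dot in a single parenthesised group, then print what is left. The pattern '.?*' is used instead of '.*' so that the starting point "." itself is not pruned. The scratch tree below is invented for the demonstration, and -o and ! are the POSIX spellings of -or and -not.

```shell
#!/bin/sh
# Hypothetical scratch tree, for illustration only.
demo=$(mktemp -d)
mkdir -p "$demo/backups/12" "$demo/.cache" "$demo/docs"
: > "$demo/backups/12/old.txt"
: > "$demo/.cache/junk"
: > "$demo/.hidden"
: > "$demo/docs/my file.txt"
: > "$demo/notes.txt"
# A 6 MiB file that the size test should reject.
dd if=/dev/zero of="$demo/big.bin" bs=1048576 count=6 2>/dev/null
cd "$demo" || exit 1

# Prune backups/NN and every hidden file or directory, then list the
# remaining regular files that are not larger than 5M.
files=$(find . \( -path './backups/[0-9][0-9]' -o -name '.?*' \) -prune \
             -o -type f ! -size +5M -print | LC_ALL=C sort)
printf '%s\n' "$files"
```

Because -prune also evaluates to true for plain files, hidden files are skipped by the same branch that skips hidden directories.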
|

11th November 2012, 09:39 AM
Registered User
Join Date: Jul 2009
Location: England, UK
Posts: 821
Re: find, grep, and white spaces
Quote:
Originally Posted by jpollard
The problem is that the "$to_backup" could be several thousand files... and thus overflow the command line parameter buffer.
|
A very good point.
So, as an alternative, perhaps... ?
Code:
find . -path './backups/[0-9][0-9]' -prune -or -type f -not -size +5M -print0 | grep -zv '/\.' | xargs -0 tar -rf foo.tar
|

12th November 2012, 06:37 PM
Registered User
Join Date: May 2012
Location: New York
Posts: 5
Re: find, grep, and white spaces
Code:
find . -path './backups/[0-9][0-9]' -prune -or -not -path '*/.*/*' -not -name '.*' -not -size +5M
does obtain all the file names (some of which contain a space).
But,
Code:
find . -path './backups/[0-9][0-9]' -prune -or -not -path '*/.*/*' -not -name '.*' -not -size +5M -print0 | xargs -0 tar -cf foo.tar
leaves foo.tar with some of the files missing.
|

12th November 2012, 07:00 PM
Registered User
Join Date: Jul 2009
Location: England, UK
Posts: 821
Re: find, grep, and white spaces
If I've understood things correctly, xargs parcels up the filenames you pass to it and calls tar multiple times; so (if I'm correct) the archive you have created will only contain the files whose names were in the last batch.
I think you might need to use the '-r' option for tar (append to an archive), not '-c' (create the archive). Then you should get everything.
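The batching can be made visible by putting echo in front of tar, so nothing is actually archived. This is only a sketch: -n 2 is there to force a split on a tiny input (normally xargs splits only when the command line would get too long), and the three names are made up.

```shell
#!/bin/sh
# Three NUL-separated names, at most two per invocation: xargs runs the
# command twice. With a real "tar -cf", the second run would overwrite
# the archive written by the first.
batches=$(printf '%s\0' 'a file' b c | xargs -0 -n 2 echo tar -cf foo.tar)
printf '%s\n' "$batches"
```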
|

13th November 2012, 01:14 PM
Registered User
Join Date: May 2012
Location: New York
Posts: 5
Re: find, grep, and white spaces
Code:
find . -path './backups/[0-9][0-9]' -prune -or -not -path '*/.*/*' -not -name '.*' -not -size +5M > foo
Does exclude the backups/YY and hidden directories.
But,
Code:
find . -path './backups/[0-9][0-9]' -prune -or -not -path '*/.*/*' -not -name '.*' -not -size +5M -print0 | xargs -0 tar -rf foo.tar
includes the backups/YY directories (which I do not want). How can this be?
|

13th November 2012, 09:43 PM
Registered User
Join Date: Apr 2006
Location: Ohio, USA
Posts: 8,302
Re: find, grep, and white spaces
Quote:
Originally Posted by marriedto51
If I've understood things correctly, xargs parcels up the filenames you pass to it and calls tar multiple times; so (if I'm correct) the archive you have created will only contain the files whose names were in the last batch.
I think you might need to use the '-r' option for tar (append to an archive), not '-c' (create the archive). Then you should get everything.
|
Won't work! I haven't run into the limit on xargs in a long time (it's about 2 MB on F17), but the last time I did, xargs was not selecting a nice point in the input stream to iterate the command. It will (I believe) cut the file list in the middle of a file name and leave you with a mess, so 'xargs tar -r ...' is not a good approach.
xargs is great for command line doodling, but not for a script that could generate large output.
So AVOID XARGS !
You want to use the tar -T option ...
Code:
find . -type f -some -gawdawful -options -print | tar -cf foo.tar -T -
-print0 will not work, as it separates the names with a null and not a newline.
There is no need to specially process the space characters in this case. The spaces are written to stdout via -print and read from stdin by the "tar ... -T -"
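A small end-to-end sketch of the -T approach, on a made-up scratch tree. The second pipeline is an aside and an assumption about the tar in use: GNU tar (and bsdtar) accept --null before -T to read NUL-terminated names, which would allow -print0 if file names could ever contain newlines. The archives are written outside the tree so that find does not pick them up.

```shell
#!/bin/sh
# Hypothetical scratch tree and output directory, for illustration only.
tree=$(mktemp -d)
out=$(mktemp -d)
mkdir -p "$tree/docs"
: > "$tree/docs/my report.txt"
: > "$tree/notes.txt"
cd "$tree" || exit 1

# Newline-separated names: each whole line is one name, spaces included.
find . -type f -print | tar -cf "$out/foo.tar" -T -

# NUL-separated names: --null has to come before -T to take effect.
find . -type f -print0 | tar -cf "$out/foo0.tar" --null -T -

listing=$(tar -tf "$out/foo.tar" | LC_ALL=C sort)
listing0=$(tar -tf "$out/foo0.tar" | LC_ALL=C sort)
printf '%s\n' "$listing"
```

Either way tar is started exactly once, so the argument-length limit and the batching behaviour of xargs never come into play.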
---------- Post added at 05:43 PM ---------- Previous post was at 05:27 PM ----------
====
Quote:
Originally Posted by henry7849
Code:
find . -path './backups/[0-9][0-9]' -prune -or -not -path '*/.*/*' -not -name '.*' -not -size +5M > foo
Does exclude the backups/YY and hidden directories.
But,
Code:
find . -path './backups/[0-9][0-9]' -prune -or -not -path '*/.*/*' -not -name '.*' -not -size +5M -print0 | xargs -0 tar -rf foo.tar
includes the backups/YY directories (which I do not want). How can this be?
|
Oh, it be!
The problem is that your find command was printing (and PLEASE lose the -print0) the names of files AND directories.
When tar gets a directory name, it adds the entire directory and all its sub-directories to the tarball.
a/ use "find . -type f ..." to avoid listing directories.
b/ stop using -print0, it's useless in this case. -print is the ticket!
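The effect is easy to reproduce on a throwaway tree (the paths below are invented for the demonstration). Without -type f, find prints "." and "./backups" as well; tar recurses into them and the pruned directory comes back in through the side door.

```shell
#!/bin/sh
# Hypothetical scratch tree and output directory, for illustration only.
tree=$(mktemp -d)
out=$(mktemp -d)
mkdir -p "$tree/backups/12" "$tree/docs"
: > "$tree/backups/12/old.txt"
: > "$tree/docs/my report.txt"
cd "$tree" || exit 1

# No -type f: directory names reach tar, and tar archives them recursively.
find . -path './backups/[0-9][0-9]' -prune -o -print \
  | tar -cf "$out/dirs.tar" -T -
with_dirs=$(tar -tf "$out/dirs.tar")

# With -type f: only regular files outside the pruned directory are listed.
find . -path './backups/[0-9][0-9]' -prune -o -type f -print \
  | tar -cf "$out/files.tar" -T -
files_only=$(tar -tf "$out/files.tar" | LC_ALL=C sort)

printf 'with directories:\n%s\n\nfiles only:\n%s\n' "$with_dirs" "$files_only"
```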
__________________
None are more hopelessly enslaved than those who falsely believe they are free.
Johann Wolfgang von Goethe
Last edited by stevea; 13th November 2012 at 09:33 PM.
|

18th November 2012, 04:10 PM
Registered User
Join Date: May 2012
Location: New York
Posts: 5
Re: find, grep, and white spaces
tar's -T option solved it. Thanks much.
H.