Fedora Linux Support Community & Resources Center
  #1  
Old 9th November 2012, 09:56 PM
henry7849 Offline
Registered User
 
Join Date: May 2012
Location: New York
Posts: 9
linuxfirefox
find, grep, and white spaces

Hello,

My objective is to make a tar file (a backup) of all files smaller than 5M.
This archive should exclude files in the backups/YY subdirectories and all hidden
files. Some of the file names contain whitespace.

I attempted this with the command below, which resulted in an
empty archive. Taking it piecemeal, grep isn't working correctly. So, what
would be the easiest way to fix this?

Thanks for the help,
H.

Code:
find . -type f -not -size +5M -print0 | grep -v "backups/[0-9][0-9]*" | grep -v "[.]/[.]" | xargs -0 -n1 tar -cf foo.tar
  #2  
Old 9th November 2012, 10:02 PM
sea Offline
"Shells" (of a sub world)
 
Join Date: May 2011
Location: Confoederatio Helvetica (Swissh)
Age: 34
Posts: 3,369
linuxfedorachrome
Re: find, grep, and white spaces

You start looking for files in your current directory.
Let's say you're in $HOME/Desktop: you'll search for files starting with $HOME/Desktop/...

I don't know much about regex, but to me it looks like you're removing entries containing a "/", which would be everything other than the current directory.
__________________
Laptop: Toshiba satellite p50-a-11 CPU: Intel i7 8*2400 MHz GPU: GeForce GT 745M RAM: 8192 MB Distro: Fedora (Rawhide) DE: Awesome
Text User Interface (TUI) // Windows 8+ & Fedora 20+ Dualboot
  #3  
Old 10th November 2012, 08:51 AM
george_toolan Offline
Registered User
 
Join Date: Dec 2006
Posts: 2,077
linuxfirefox
Re: find, grep, and white spaces

What did you expect?

-print0 writes everything into one line and grep removes this line ;-)

Code:
       -print0
              True;  print the full file name on the standard output, followed
              by a null character  (instead  of  the  newline  character  that
              -print  uses).   This allows file names that contain newlines or
              other types of white space to be correctly interpreted  by  pro‐
              grams  that process the find output.  This option corresponds to
              the -0 option of xargs.
Your file names include newline characters?
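To make the point above concrete, here is a minimal sketch, assuming GNU grep (its -z option treats each NUL-terminated name as a separate record). The two file names are made up for illustration and are not from the thread. Without -z, the whole -print0 stream is a single "line", so one match makes grep -v drop everything.

```shell
#!/bin/sh
# Hypothetical sample names, NUL-terminated the way find -print0 would emit them.
tmp=$(mktemp -d)
printf 'keep me.txt\0backups/12/skip.txt\0' > "$tmp/list0"

# With -z, grep reads NUL-terminated records, so -v removes only the
# record that matches and keeps the name containing a space intact.
kept=$(grep -z -v 'backups/[0-9][0-9]' "$tmp/list0" | tr '\0' '\n')
echo "$kept"

rm -rf "$tmp"
```

The tr at the end is only there to make the result printable; in a real pipeline the NUL-terminated output would go straight to a consumer that understands it.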
  #4  
Old 10th November 2012, 11:05 AM
marriedto51 Offline
Registered User
 
Join Date: Jul 2009
Location: England, UK
Posts: 910
linuxfirefox
Re: find, grep, and white spaces

Personally, I should be tempted to use some extra options to find rather than piping the output through grep. Something like
Code:
find . -path ./backup -prune -or -type f -not -iname '.*' -not -size +5M -print
should dump the list of filenames that you want, separated by newlines.

Then, assuming you don't have filenames that include newline characters (and who would?), perhaps the following script will do what you want. Pass the base directory as the first argument and the name of the tar file as the second. (Warning! I've not tested this too hard...)
Code:
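#!/bin/sh
# Split the find output on newlines only. A literal newline is needed here:
# $(echo -e '\n') yields an empty string because $(...) strips trailing newlines.
IFS='
'
set -f  # no globbing, so names containing * or ? are passed through unchanged
to_backup=$(find "$1" -path "$1"/backup -prune -or -type f -not -iname '.*' -not -size +5M -print)
tar -cf "$2" $to_backup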
  #5  
Old 10th November 2012, 12:45 PM
jpollard Online
Registered User
 
Join Date: Aug 2009
Location: Waldorf, Maryland
Posts: 6,847
linuxfirefox
Re: find, grep, and white spaces

The problem is that the "$to_backup" could be several thousand files... and thus overflow the command line parameter buffer.
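For reference, a small sketch of how that limit can be inspected. It assumes a system where getconf reports ARG_MAX as a number (Linux does); the value differs between systems and kernels.

```shell
#!/bin/sh
# ARG_MAX is the limit, in bytes, on the combined size of the arguments
# and environment that can be passed to a newly executed program.
arg_max=$(getconf ARG_MAX)
echo "ARG_MAX on this system: $arg_max bytes"
```

A file list expanded on the tar command line has to fit under this figure, which is why a list of several thousand long paths can fail.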
  #6  
Old 10th November 2012, 01:06 PM
george_toolan Offline
Registered User
 
Join Date: Dec 2006
Posts: 2,077
linuxfirefox
Re: find, grep, and white spaces

Quote:
find . -path ./backup -prune -or -type f -not -iname '.*' -not -size +5M -print
This still includes hidden directories like .cache
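One possible way around this, sketched under the assumption that pruning by name is acceptable: prune every entry whose name starts with a dot, so hidden directories are never descended into. The sandbox tree below is hypothetical. The pattern '.?*' is used instead of '.*' so that the starting point "." itself is not pruned.

```shell
#!/bin/sh
# Hypothetical sandbox tree, used only for illustration.
root=$(mktemp -d)
mkdir -p "$root/.cache" "$root/docs"
: > "$root/.cache/blob"
: > "$root/docs/my notes.txt"
: > "$root/.hidden file"

# Prune anything whose name starts with a dot (hidden files and whole
# hidden directories), then print the remaining regular files.
visible=$(cd "$root" && find . -name '.?*' -prune -o -type f -print)
echo "$visible"

rm -rf "$root"
```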
  #7  
Old 10th November 2012, 09:57 PM
henry7849 Offline
Registered User
 
Join Date: May 2012
Location: New York
Posts: 9
linuxfirefox
Re: find, grep, and white spaces

1.
Quote:
find . -not -path '*/.*/*' -not -name '.*' > foo
excludes the hidden files and directories

2.
Quote:
find . -path './backups/[0-9][0-9]' -prune -or -type f -not -iname '.*' -not -size +5M > foo
excludes the backups/YY directories.

3. What I need is for both of these to work together (if this is possible).
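A sketch of how the two exclusions might be combined into one expression, assuming GNU find (for -not and the M size suffix). The directory layout and file names are invented for the example; only the backups/YY pattern, the hidden-file rule and the 5M limit come from the posts above.

```shell
#!/bin/sh
# Hypothetical sandbox: one backups/YY directory, hidden entries,
# a name with a space, and one file larger than 5M.
root=$(mktemp -d)
mkdir -p "$root/backups/12" "$root/.cache"
: > "$root/a b.txt"
: > "$root/backups/index.txt"
: > "$root/backups/12/old.txt"
: > "$root/.cache/blob"
: > "$root/.profile"
dd if=/dev/zero of="$root/big.bin" bs=1024 count=6144 2>/dev/null

# Group both exclusions in \( ... \) and prune them together, then
# select regular files that are not larger than 5M.
found=$(cd "$root" && find . \( -path './backups/[0-9][0-9]' -o -name '.?*' \) -prune \
    -o -type f -not -size +5M -print | LC_ALL=C sort)
echo "$found"

rm -rf "$root"
```

The sort is only there to make the output order predictable; find itself makes no ordering promise.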
  #8  
Old 11th November 2012, 09:39 AM
marriedto51 Offline
Registered User
 
Join Date: Jul 2009
Location: England, UK
Posts: 910
linuxfirefox
Re: find, grep, and white spaces

Quote:
Originally Posted by jpollard View Post
The problem is that the "$to_backup" could be several thousand files... and thus overflow the command line parameter buffer.
A very good point.

So, as an alternative, perhaps... ?
Code:
find . -path './backups/[0-9][0-9]' -prune -or -type f -not -size +5M -print0 | grep -zv '/\.' | xargs -0 tar -rf foo.tar
  #9  
Old 12th November 2012, 06:37 PM
henry7849 Offline
Registered User
 
Join Date: May 2012
Location: New York
Posts: 9
linuxfirefox
Re: find, grep, and white spaces

Code:
find . -path './backups/[0-9][0-9]' -prune -or -not -path '*/.*/*' -not -name '.*' -not -size +5M
does obtain all filenames (some of which have a space).

But,

Code:
find . -path './backups/[0-9][0-9]' -prune -or -not -path '*/.*/*' -not -name '.*' -not -size +5M -print0 | xargs -0 tar -cf foo.tar
foo.tar has missing filenames.
  #10  
Old 12th November 2012, 07:00 PM
marriedto51 Offline
Registered User
 
Join Date: Jul 2009
Location: England, UK
Posts: 910
linuxfirefox
Re: find, grep, and white spaces

If I've understood things correctly, xargs parcels up the filenames you pass to it and calls tar multiple times; so (if I'm correct) the archive you have created will only contain the files whose names were in the last batch.

I think you might need to use the '-r' option for tar (append to an archive), not '-c' (create the archive). Then you should get everything.
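The batching effect can be reproduced on a small scale. This is only a sketch: the three file names are made up, and -n 2 is used purely to force xargs into more than one batch, which in real use would only happen once the list gets long.

```shell
#!/bin/sh
# Hypothetical sandbox with three empty files.
root=$(mktemp -d)
(cd "$root" && : > one && : > two && : > three)

# With -n 2, xargs runs the command once per batch of two names:
# here that is two invocations for three names.
batches=$(printf 'one\0two\0three\0' | xargs -0 -n 2 echo | wc -l | tr -d ' ')
echo "invocations: $batches"

# Each tar -c invocation recreates the archive, so only the names
# from the final batch are left in it.
(cd "$root" && printf 'one\0two\0three\0' | xargs -0 -n 2 tar -cf created.tar)
last_only=$(tar -tf "$root/created.tar")
echo "archive contains: $last_only"

rm -rf "$root"
```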
  #11  
Old 13th November 2012, 01:14 PM
henry7849 Offline
Registered User
 
Join Date: May 2012
Location: New York
Posts: 9
linuxfirefox
Re: find, grep, and white spaces

Code:
find . -path './backups/[0-9][0-9]' -prune -or -not -path '*/.*/*' -not -name '.*' -not -size +5M > foo
Does exclude the backups/YY and hidden directories.

But,

Code:
find . -path './backups/[0-9][0-9]' -prune -or -not -path '*/.*/*' -not -name '.*' -not -size +5M -print0 | xargs -0 tar -rf foo.tar
includes the backups/YY directories (which I do not want). How can this be?
  #12  
Old 13th November 2012, 09:43 PM
stevea Offline
Registered User
 
Join Date: Apr 2006
Location: Ohio, USA
Posts: 8,832
linuxfirefox
Re: find, grep, and white spaces

Quote:
Originally Posted by marriedto51 View Post
If I've understood things correctly, xargs parcels up the filenames you pass to it and calls tar multiple times; so (if I'm correct) the archive you have created will only contain the files whose names were in the last batch.

I think you might need to use the '-r' option for tar (append to an archive), not '-c' (create the archive). Then you should get everything.
Won't work! I haven't run into the limit on xargs in a long time (it's about 2MB on F17), but the last time I did, xargs was not selecting a nice point in the input stream to iterate the command. It will (I believe) cut the file list in the middle of a file name and leave you with a mess, so 'xargs tar -r ...' is not a good approach.

xargs is great for command line doodling, but not for a script that could generate large output.
So AVOID XARGS !



You want to use the tar -T option ...
Code:
find .  -type f -some -gawdawful -options -print | tar -cf foo.tar -T -
-print0 will not work, as it separates the names with a null and not a newline.

There is no need to specially process the space characters in this case. The spaces are written to stdout via -print and read from stdin by "tar ... -T -".
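As a side note for anyone who does have newlines in file names: a sketch assuming GNU tar, whose --null option makes -T read NUL-terminated names, so it can be paired with -print0. This is not what was suggested above; the plain -print form is enough when names only contain spaces. The tree below is hypothetical.

```shell
#!/bin/sh
# Hypothetical sandbox with one name containing a space.
root=$(mktemp -d)
mkdir -p "$root/src/docs"
: > "$root/src/docs/my notes.txt"
: > "$root/src/plain.txt"

# -print0 emits NUL-terminated names; --null (given before -T) tells tar
# that the list read from stdin is NUL-terminated instead of newline-terminated.
(cd "$root/src" && find . -type f -print0 | tar -cf "$root/out.tar" --null -T -)

# List the archive to check that both names arrived intact.
members=$(tar -tf "$root/out.tar" | LC_ALL=C sort)
echo "$members"

rm -rf "$root"
```

The archive is written outside the directory being searched so that find does not pick up the tar file itself.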

---------- Post added at 05:43 PM ---------- Previous post was at 05:27 PM ----------

====

Quote:
Originally Posted by henry7849 View Post
Code:
find . -path './backups/[0-9][0-9]' -prune -or -not -path '*/.*/*' -not -name '.*' -not -size +5M > foo
Does exclude the backups/YY and hidden directories.

But,

Code:
find . -path './backups/[0-9][0-9]' -prune -or -not -path '*/.*/*' -not -name '.*' -not -size +5M -print0 | xargs -0 tar -rf foo.tar
includes the backups/YY directories (which I do not want). How can this be?
Oh it be !

The problem is that your find command was printing (and PLEASE lose the -print0) the names of files AND directories.

When tar gets a directory name, it adds the entire directory and all its sub-directories to the tarball.

a/ use "find . -type f ..." to avoid listing directories.
b/ stop using -print0 , it's useless in this case. -print is the ticket !
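The difference that -type f makes can be checked in a throwaway directory. This is a sketch with invented file names; only the backups/YY prune pattern is taken from the thread.

```shell
#!/bin/sh
# Hypothetical sandbox: one file to keep and one inside backups/12.
root=$(mktemp -d)
mkdir -p "$root/src/backups/12"
: > "$root/src/backups/12/old.txt"
: > "$root/src/keep.txt"

# Without -type f, find also prints directories such as "." and ./backups;
# tar recurses into them and picks up the pruned backups/12 anyway.
(cd "$root/src" && find . -path './backups/[0-9][0-9]' -prune -o -print \
    | tar -cf "$root/all.tar" -T -)

# With -type f, only regular files are listed, so nothing is pulled in by recursion.
(cd "$root/src" && find . -path './backups/[0-9][0-9]' -prune -o -type f -print \
    | tar -cf "$root/files.tar" -T -)

# Count archive members that mention old.txt in each archive.
leaked=$(tar -tf "$root/all.tar" | grep -c 'old.txt' || true)
clean=$(tar -tf "$root/files.tar" | grep -c 'old.txt' || true)
echo "without -type f: $leaked, with -type f: $clean"

rm -rf "$root"
```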
__________________
None are more hopelessly enslaved than those who falsely believe they are free.
Johann Wolfgang von Goethe

Last edited by stevea; 13th November 2012 at 09:33 PM.
  #13  
Old 18th November 2012, 04:10 PM
henry7849 Offline
Registered User
 
Join Date: May 2012
Location: New York
Posts: 9
linuxfirefox
Re: find, grep, and white spaces

tar's -T option solved it. Thanks much.

H.