regexp

Old 01 August 2005, 10:19 AM
  #1  
NotoriousREV
Scooby Regular
Thread Starter
 
Join Date: Jan 2002
Posts: 11,581

Can anybody help? I just can't seem to get my head around regular expressions. I have about 20GB of logfiles and I want to create a single logfile with only specific information in it. All the logfiles are from one day, so I don't need to search on date, but I want to search between 2 times and for a particular website.

The time format is hh:mm:ss and assume the web server is 10.0.0.1

The 2 times are 01:00:00 and 02:00:00

I assume I'd simply do something along the lines of grep [regexp] > file.log

Any help appreciated. My poor brain hurts trying to understand escape characters and modifiers

Last edited by NotoriousREV; 01 August 2005 at 10:32 AM.
Old 01 August 2005, 10:40 AM
  #2  
stevencotton
Scooby Regular
 
Join Date: Jan 2001
Location: behind twin turbos
Posts: 2,710

Originally Posted by NotoriousREV
The time format is hh:mm:ss and assume the web address is www.foobar.com (which would include sub-directories). The 2 times are 01:00:00 and 02:00:00

I assume I'd simply do something along the lines of grep [regexp] > file.log
Not quite as simple as that since you need to do some range checking on the time. If you remove the colons from the time that'll help.

Assuming the time would be at index 0 in the logfile if I split on whitespace, something like this would do it, on a unix-like system, with perl installed:

$ perl -nle '@s = split(/\s+/, $_); $s[0] =~ tr/://d; $s[0] >= 10000 && $s[0] <= 20000 && print' yourlogfile > newlogfile
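
Along the same lines as the one-liner above, here is a rough sketch that also filters on the website, which the original question asked for. It is only a sketch under assumptions the thread doesn't confirm: the time is the first whitespace-separated field, it is zero-padded as hh:mm:ss, and the server address appears literally somewhere on the line. The names filter_log and sample.log and the sample lines are made up for illustration.

```shell
# Hypothetical helper: keep lines whose first field is a time between
# 01:00:00 and 02:00:00 (inclusive) and which mention the given host.
# Assumes zero-padded hh:mm:ss in field 1, so a plain string comparison
# orders the times correctly without stripping the colons.
filter_log() {
    host="$1"
    shift
    awk -v host="$host" '
        $1 >= "01:00:00" && $1 <= "02:00:00" && index($0, host) > 0
    ' "$@"
}

# Made-up sample data to try it on
printf '%s\n' \
    '00:59:59 10.0.0.1 GET /a' \
    '01:00:00 10.0.0.1 GET /b' \
    '01:30:15 10.0.0.2 GET /c' \
    '02:00:00 10.0.0.1 GET /d' \
    '02:34:12 10.0.0.1 GET /e' > sample.log

filter_log 10.0.0.1 sample.log > newlogfile
```

On the sample data this should keep only the 01:00:00 and 02:00:00 lines. For the real logs it would be something like filter_log 10.0.0.1 *.log > file.log. Note that index() is a plain substring match, so 10.0.0.1 would also match 10.0.0.11; if that matters, compare against the specific field that holds the address instead.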
Old 01 August 2005, 07:42 PM
  #3  
NotoriousREV
Scooby Regular
Thread Starter
 
Join Date: Jan 2002
Posts: 11,581

Thanks steven. I couldn't get it to work so I ended up using a far less elegant method which worked well enough! I think I'll be doing some serious regexp swotting 'cos it would be very handy if I could get this to work.
Old 02 August 2005, 10:16 AM
  #4  
stevencotton
Scooby Regular
 
Join Date: Jan 2001
Location: behind twin turbos
Posts: 2,710

You won't be able to do it (with any kind of efficiency) with a regular expression because you need to bounds-check the time. Something like 0[12]:\d{2}:\d{2} will allow 02:34:12, for example, so you'd end up with logic embedded within the regex checking that the minutes and seconds ($2 and $3, if you captured them) don't go over 00 if the hour ($1) is 02, etc.
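
To make that concrete, here is a quick sketch you could run with grep -E. The sample times and file names are made up, and the second pattern is just one way of spelling out the bounds logic inside the regex for this particular pair of times. It is not something that generalises nicely to arbitrary ranges.

```shell
# Made-up sample times, one per line
printf '%s\n' '00:59:59' '01:00:00' '01:59:59' '02:00:00' '02:34:12' > times.txt

# Naive pattern: also lets 02:34:12 through.
# [0-9] is used instead of \d because grep -E does not portably support \d.
grep -E '^0[12]:[0-9]{2}:[0-9]{2}' times.txt > naive.txt

# Bounds logic written into the pattern: any time in hour 01, or exactly 02:00:00
grep -E '^(01:[0-5][0-9]:[0-5][0-9]|02:00:00)' times.txt > bounded.txt
```

Here naive.txt ends up containing 02:34:12 while bounded.txt does not. The pattern had to be hand-built around the two endpoints, and a numeric comparison avoids that.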




All times are GMT +1.