Pentesters Prerequisites

Regular Expressions

Regular expressions can be difficult to understand and use at first. A regular expression is a sequence of characters that describes a search pattern. You can use this pattern in many ways; for example, you can test whether it occurs in a string or a text (pattern matching).

Pentesters typically use regular expressions to filter and extract information from documents, client-server communications, tool output and much more.

For instance, we could use them to extract all the email addresses from a web page or to filter nmap results. From a "defensive" point of view, regular expressions are also commonly used to verify and sanitize input, for example to reject input that contains bad characters or invalid text.
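As a rough sketch of both uses, the snippet below pulls email-like strings out of a piece of text and checks a username against a whitelist pattern. The pattern, the method names and the sample text are assumptions made for illustration; the email pattern is deliberately simple and not RFC-complete.

```ruby
# Deliberately simple email pattern (an assumption for illustration).
EMAIL_PATTERN = /[\w.+-]+@[\w-]+(?:\.[\w-]+)+/

# Returns every substring of text that looks like an email address.
def extract_emails(text)
  text.scan(EMAIL_PATTERN)
end

# Accepts only 3 to 16 letters, digits or underscores. \A and \z pin the
# pattern to the whole string, so nothing extra can be appended to the input.
def valid_username?(input)
  input.match?(/\A\w{3,16}\z/)
end

# Hypothetical page content; in practice this could be an HTTP response body.
page = "Contact admin@example.com or sales@example.org for details."
puts extract_emails(page).inspect

puts valid_username?("root")             # true
puts valid_username?("root; rm -rf /")   # false
```

Whitelisting what is allowed, as valid_username? does, is usually easier to get right than trying to list every bad character.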

>> "Hello World!!!" =~ /World/
=> 6

You can create a regexp object with:

literal notation (as shown)
%r notation
OO notation

The %r notation works like the % notation for strings. The r tells the interpreter to treat the text inside the delimiters as a regular expression. As with the string notation, you can choose the delimiters:

/hello/
%r{hello}
%r!hello! # using ! delimiter

OO notation is simple. Just call new on the Regexp class to create the corresponding Regexp object. You can also use Regexp.compile as a synonym for Regexp.new:

%r[hello]
%r&hello&
Regexp.new("hello")
Regexp.compile("hello")
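One practical reason to pick a notation is escaping. The sketch below matches the same path with all three notations; the request line is a made-up example. With the literal notation every / in the pattern has to be escaped, while %r{} and Regexp.new do not need that.

```ruby
# Hypothetical request line used only for illustration.
line = "GET /etc/passwd HTTP/1.1"

literal = /\/etc\/passwd/            # every / must be escaped
percent = %r{/etc/passwd}            # no escaping needed
object  = Regexp.new("/etc/passwd")  # built from a plain string

# All three patterns find the path at the same position.
[literal, percent, object].each do |re|
  puts(line =~ re)   # prints 4 each time
end
```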

If you use the literal notation you can add a modifier character after the last / of the Regexp. The most commonly used modifier is i, which enables case-insensitive matching. If you use OO notation, you should pass the corresponding option when you create the Regexp:

"Hello World!!!" =~ /hello/i         # => 0
"Hello World!!!" =~ /world/i         # => 6

reg = Regexp.new("hello", Regexp::IGNORECASE) # => /hello/i
"Hello World!!!" =~ reg              # => 0

Match Method

The Regexp class provides some very useful methods. One of these is match, which returns a MatchData object (or nil if nothing matches). With a MatchData object you can get information about the match, such as the position of the matched substring, the matched text and much more. You can treat MatchData as an array: index 0 holds the whole matched substring and the following indexes hold the substrings captured by each group.

>> matching = /world/.match("Hello World!")     # => nil
>> matching = /world/i.match("Hello World!")    
>> matching[0]    # => "World"
>> matching[1]    # => nil
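A slightly larger sketch of what a MatchData object exposes, using a pattern with two groups. The sample string is an assumption made for illustration.

```ruby
m = /(\w+)@(\w+)\.com/.match("mail: bob@example.com now")

if m
  puts m[0]                  # whole match: bob@example.com
  puts m[1]                  # first group: bob
  puts m[2]                  # second group: example
  puts m.begin(0)            # index where the match starts: 6
  puts m.pre_match.inspect   # text before the match: "mail: "
  puts m.post_match.inspect  # text after the match: " now"
  puts m.captures.inspect    # all groups: ["bob", "example"]
end

# match returns nil when nothing matches, so check before indexing.
no_match = /xyz/.match("Hello World!")
puts no_match.nil?           # true
```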

Special characters

There are some characters with special meanings:

() [] {} . ? + * | ^ $

If you want to match them literally, you have to escape them with a backslash \. As in:

/\|/
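When the text to search for comes from somewhere else (a tool's output, user input), escaping by hand is error-prone. Regexp.escape adds the backslashes for you. The sketch below assumes a hypothetical needle string that contains dots.

```ruby
needle = "10.0.0.1"   # hypothetical literal text we want to find

unescaped = Regexp.new(needle)                  # . matches any character
escaped   = Regexp.new(Regexp.escape(needle))   # . matches only a dot

puts unescaped.match?("10x0y0z1")             # true, probably not intended
puts escaped.match?("10x0y0z1")               # false
puts escaped.match?("host 10.0.0.1 is up")    # true
```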

Regular Expression Syntax

Rule    Matching
.       A single character (it does not match newline)
[]      Any one of the characters in square brackets
[^]     Any one character not in square brackets
\d      A digit. Same as [0-9]
\D      A non-digit character. Same as [^0-9]
\s      A whitespace character
\S      A non-whitespace character
\w      A word character. Same as [A-Za-z0-9_]
\W      A non-word character

The following are some examples that will explain these special characters:

# the string does not contain /auh/
"Hello World" =~ /auh/             # => nil

# the string does not contain 'a', or 'u' or 'h' chars:
"Hello World" =~ /[auh]/           # => nil

# NB: with the case-insensitive modifier, H matches at index 0
"Hello World" =~ /[auh]/i          # => 0

# no 0,1,2,...9
"Hello World" =~ /[0-9]/           # => nil
"Hello World" =~ /[\d]/            # => nil

# 5 at index 4
"I'm 50." =~ /[0-9]/               # => 4
"I'm 50." =~ /[\d]/                # => 4

# whitespace at index 3
"I'm 50." =~ /[\s]/                # => 3

# only digit
"123456789" =~ /[\D]/              # => nil
"123456789" =~ /[^0-9]/            # => nil

# no word characters
"!=()/&" =~ /[\w]/                 # => nil

# ! at index 10
"HelloWorld!!" =~ /[\W]/           # => 10

# A digit followed by upper-case letter at index 6
"Code: 4B" =~ /\d[A-Z]/            # => 6

# 3 digit followed by a whitespace at index 4
"abc 123 abc " =~ /\d\d\d\s/       # => 4

# Matching of Ruby or Rubber
"I'm Ruby" =~ /ruby|rubber/i       # => 4
"I'm Rubber" =~ /ruby|rubber/i     # => 4

# a whitespace or a point
"Hello World" =~ /\s|\./           # => 5
"Hello.World" =~ /\s|\./           # => 5

# Sequences are easier and shorter to use with groups:
"I'm Ruby" =~ /Rub(y|ber)/         # => 4
"I'm Rubber" =~ /Rub(y|ber)/       # => 4
"I'm Ruber" =~ /Rub(y|ber)/        # => nil

reg = /(Ruby).(Perl)/
matching = reg.match("I like Ruby&Perl")
matching[0]                        # => "Ruby&Perl"
matching[1]                        # => "Ruby"
matching[2]                        # => "Perl"

Repetitions

The most commonly used repetition rules:

Rule        Matching
exp*        Zero or more occurrences of exp
exp+        One or more occurrences of exp
exp?        Zero or one occurrence of exp
exp{n}      Exactly n occurrences of exp
exp{n,}     n or more occurrences of exp
exp{n,m}    At least n and at most m occurrences of exp

# the i modifier is needed because the string contains "Ruby", not "ruby"
"RubyRubyRuby" =~ /(ruby){3}/i   # => 0
"RubyRubyRuby" =~ /(ruby){4}/i   # => nil

# String contains at least one digit
"I'm 50" =~ /\d+/                # => 4

# No digit in the string
"I'm Steve" =~ /\d+/             # => nil

Anchors

Anchors specify the position at which the pattern must match. The most commonly used are:

Rule     Matching
^exp     exp must be at the beginning of a line
exp$     exp must be at the end of a line
\Aexp    exp must be at the beginning of the whole string
exp\Z    exp must be at the end of the whole string, or just before a trailing newline
exp\z    exp must be at the very end of the whole string (a trailing newline is not skipped)

# string starts with "Hello"
"Hello World" =~ /^Hello/   # => 0
"Hello World" =~ /\AHello/  # => 0

# string does not start with 'A'
"Hello World" =~ /^A/       # => nil
"Hello World" =~ /\AA/      # => nil

# string ends with "World"
"Hello World" =~ /World$/       # => 6
"Hello World" =~ /World\z/      # => 6
"Hello World" =~ /World\Z/      # => 6

# string does not end with 'l'
"Hello World" =~ /l$/           # => nil

# Check whether a string contains an IP address (the dots must be escaped)
/(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})/
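A pattern like this only checks the shape of the text: 999.1.1.1 has the right shape but is not a valid address. A minimal sketch of a stricter check, assuming we want the whole string to be one IPv4 address, anchors the pattern with \A and \z and then checks each captured octet numerically. The constant and method names are hypothetical.

```ruby
# Four groups of 1 to 3 digits separated by literal dots, anchored to the
# whole string.
IPV4_SHAPE = /\A(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})\z/

# Returns true only if text is exactly one dotted quad with octets 0..255.
def ipv4?(text)
  m = IPV4_SHAPE.match(text)
  return false unless m

  m.captures.all? { |octet| octet.to_i <= 255 }
end

puts ipv4?("192.168.1.2")   # true
puts ipv4?("999.1.1.1")     # false, octet out of range
puts ipv4?("192x168x1x2")   # false, separators are not dots
puts ipv4?("1.2.3.4\n")     # false, \z rejects the trailing newline
```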

Global variables

Variable        Description
$~              The MatchData object of the last match. The rest are derived from this one.
$1              The substring that matches the first group pattern.
$2              The substring that matches the second group pattern.
$3, $4, etc.    And so on...
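A short sketch of how these variables are filled in after a successful =~. The method name and the log line format are assumptions made for illustration. Note that the variables are only meaningful right after a match, so the values are copied out immediately.

```ruby
# Extracts the user name and uid from a hypothetical log line.
# Returns nil when the line does not match.
def parse_login(line)
  if line =~ /user=(\w+) uid=(\d+)/
    # $~ is the MatchData of the match above; $1 and $2 are its groups.
    { whole: $~[0], user: $1, uid: $2 }
  end
end

puts parse_login("login ok user=root uid=0").inspect
puts parse_login("nothing here").inspect
```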

Working with strings

If you take a look at the String class methods, you will notice that many of them can take a regular expression as an argument. You can use a regexp with gsub, sub, split and more.

scan lets you iterate over every occurrence of the text that matches the pattern.

text = "abcd 192.168.1.2 some text 192.168.4.20"
pattern = /(?:\d{1,3}\.){3}(?:\d){1,3}/
text.scan(pattern) { |x| puts x }
# => 192.168.1.2
# => 192.168.4.20
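The other String methods mentioned above accept a regexp in the same way. A brief sketch with made-up input strings:

```ruby
# sub replaces only the first match, gsub replaces all of them.
first_only = "a1b22c333".sub(/\d+/, "#")     # "a#b22c333"
all        = "a1b22c333".gsub(/\d+/, "#")    # "a#b#c#"

# split accepts a pattern, so mixed separators are handled in one call.
users = "admin, root;guest  nobody".split(/[,;\s]+/)

# A group can be reused in the replacement with \1 (note the single quotes).
masked = "pass=s3cret&user=bob".gsub(/(pass=)[^&]+/, '\1****')

puts first_only
puts all
puts users.inspect    # ["admin", "root", "guest", "nobody"]
puts masked           # pass=****&user=bob
```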

Dates and Time

There are different classes to treat them in Ruby:

Time
Date
DateTime

Time class provides methods to work with your operating system date and time functionality.

# current system time
Time.new # synonym for Time.now
Time.now

# current time converted to UTC
Time.new.utc

Time.local(2014)
Time.local(2014,2)

# Time.local needs at least a year, so use the current time here
t = Time.now

t.year
t.month
t.day
t.hour
t.yday

Predicates and Conversions

t = Time.now
t.tuesday?
t.monday?

t = Time.new.utc
t.zone
t.localtime

t.to_i
t.to_a

# Adds 20 seconds to t
t + 20

# adds an hour to t
t + 60*60

# add 6 days to t
t + 6*(60*60*24)

Comparisons

now = Time.now
before = now - 50
after = now + 50
now > before         # true
after < before       # false
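Because the current time changes on every run, a fixed time makes the arithmetic easier to follow. The sketch below uses Time.utc with an arbitrary date chosen for illustration. Subtracting two Time objects gives the difference in seconds as a Float.

```ruby
start = Time.utc(2014, 2, 1, 12, 0, 0)   # arbitrary fixed point in time

later     = start + 60 * 60              # one hour later
next_week = start + 7 * 24 * 60 * 60     # seven days later
elapsed   = later - start                # difference in seconds (Float)

puts later.hour        # 13
puts next_week.day     # 8
puts elapsed           # 3600.0
puts later > start     # true
```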

From time to string

There are many other methods that can be used on Time objects. For example, you can obtain a string with the to_s or ctime method, depending on the format you want.

t = Time.new
t.to_s
t.getutc.to_s
t.ctime
t.getutc.ctime

t.strftime("%Y/%m/%d %P %Z")
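strftime builds the string from format directives. A small sketch with a fixed UTC time (an arbitrary date chosen for illustration) shows what a few common directives produce:

```ruby
t = Time.utc(2014, 2, 1, 15, 4, 5)   # arbitrary fixed time

timestamp = t.strftime("%Y-%m-%d %H:%M:%S")   # year-month-day, 24h clock
slashes   = t.strftime("%Y/%m/%d %P %Z")      # %P is am/pm, %Z the zone
short     = t.strftime("%d %b %Y")            # %b is the abbreviated month

puts timestamp   # 2014-02-01 15:04:05
puts slashes     # 2014/02/01 pm UTC
puts short       # 01 Feb 2014
```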

Other classes

Ruby provides other classes to manage dates and time data:

Date is used to manage dates
DateTime is a subclass of Date that also lets you manage time

Both Date and DateTime can be used like Time. The main difference between Time and the other two is the internal implementation.

Usually Date and DateTime are slower than Time, but they provide different methods that may be useful for your script.

A very useful method is parse, which allows you to create a date or time object from a string.
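A minimal sketch of parsing, assuming the standard library files date (for Date and DateTime) and time (which adds Time.parse) are loaded. The input strings are made up for illustration.

```ruby
require "date"   # Date and DateTime
require "time"   # adds Time.parse

d = Date.parse("2014-02-01")
puts d.year               # 2014
puts d.saturday?          # true
puts((d + 30).to_s)       # adding an integer adds days: 2014-03-03

dt = DateTime.parse("2014-02-01T15:04:05+00:00")
puts dt.hour              # 15

t = Time.parse("2014-02-01 15:04:05 UTC")
puts t.utc.hour           # 15
```

Note the different units: adding a number to a Date adds days, while adding a number to a Time adds seconds.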

Files and Directories

Ruby provides two classes:

Dir for directories. It defines class methods that allow you to work with directories. It provides a variety of ways to list directories as well as their content. It can also be used to find out where the Ruby script is being executed or to navigate between file system directories.
File for files. It lets you open a file, get information about it, change its name, change its permissions and much more.
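A short sketch of the Dir operations described above. To avoid touching real directories it works inside a temporary directory created with Dir.mktmpdir from the tmpdir standard library, which is an assumption of this example; the file and directory names are made up.

```ruby
require "tmpdir"

listing   = nil
txt_files = nil

# The temporary directory is removed automatically when the block ends.
Dir.mktmpdir do |dir|
  File.write(File.join(dir, "notes.txt"), "hello")
  Dir.mkdir(File.join(dir, "nested_dir"))

  # List the directory content (without "." and "..").
  listing = Dir.children(dir).sort

  # List only the entries matching a shell-style pattern.
  txt_files = Dir.glob(File.join(dir, "*.txt")).map { |p| File.basename(p) }

  # Temporarily move into the directory; the previous one is restored after.
  Dir.chdir(dir) { puts Dir.pwd }
end

puts Dir.pwd             # where the script is being executed
puts listing.inspect     # ["nested_dir", "notes.txt"]
puts txt_files.inspect   # ["notes.txt"]
```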

File

File.exist? "file_example.txt"
File.size "lorem.txt" 
File.zero? "empty_file.txt" # true
File.file? "lorem.txt"
File.directory? "nested_dir"
File.symlink? "lorem-link"

# ftype tests if input is a file, directory or a link
File.ftype "lorem.txt"     # => "file"

File.readable? "lorem.txt"
File.writable? "lorem.txt"
File.executable? "lorem.txt"

# last access time and last modification time as Time objects
File.atime "lorem.txt"
File.mtime "lorem.txt"

# basename & dirname
path = "~/ruby/file_example/lorem.txt"
File.basename path                 # => "lorem.txt"
File.basename(path, ".txt")        # => "lorem"
File.dirname path                  # => "~/ruby/file_example"
File.extname path                  # => ".txt"
File.split path                    # => ["~/ruby/file_example", "lorem.txt"]

# Working with names
File.join("~", "ruby", "file_example")    # => "~/ruby/file_example"

# expand_path converts relative paths to absolute
File.expand_path("nested_dir")            # => "/root/ruby/nested_dir"

# fnmatch: tests if a filename string matches a specified pattern
File.fnmatch("*.txt", "lorem.txt")

# creation/deletion/renaming
File.open("a_file.txt", "w")
File.new("a_file.txt", "w")
File.rename("a_file.txt", "newname.txt")
File.delete "newname.txt"

# Change permissions
File.chmod(0666, "lorem.txt")
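File.open and File.new without a block return an open file object that you have to close yourself. A common alternative is the block form of File.open, which closes the file when the block ends. The sketch below puts the calls together inside a temporary directory (created with Dir.mktmpdir, an assumption of this example) so it does not touch real files.

```ruby
require "tmpdir"

lines         = nil
still_exists  = nil

Dir.mktmpdir do |dir|
  path = File.join(dir, "a_file.txt")

  # Block form: the file is closed automatically at the end of the block.
  File.open(path, "w") do |f|
    f.puts "line one"
    f.puts "line two"
  end

  # Read the content back, one array element per line.
  lines = File.readlines(path, chomp: true)

  new_path = File.join(dir, "newname.txt")
  File.rename(path, new_path)
  File.chmod(0600, new_path)    # owner read/write only (Unix-style modes)
  File.delete(new_path)

  still_exists = File.exist?(new_path)
end

puts lines.inspect    # ["line one", "line two"]
puts still_exists     # false
```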