A friend gave me this idea to see how I would check any number of URLs for their HTTP status codes. The requirements were pretty straightforward:
1. Must send an email with nicely formatted results
2. Must print out how many URLs it will be checking
3. Must be able to comment out specific URLs you don't want to check
4. Must be able to handle extra spaces in the URL file
5. Must print out the number of URLs that return an HTTP 404 status
6. Must have basic error handling
I am going to break this down so you can follow my thought process. I try to get the most basic thing working first, and in this case that was getting the email to send. The Net::SMTP documentation explains how to send internet mail using SMTP, and here is my test code:
require 'net/smtp'

# Headers and body must be separated by a blank line.
email_message = <<MESSAGE_END
From: Your Name <your@mail.address>
To: Destination Address <someone@example.com>
Subject: TEST

Body of email goes here -- Hello World!
MESSAGE_END

smtp = Net::SMTP.new('your.smtp.server', 25)
smtp.enable_starttls
# Arguments: HELO domain, account, password, authentication type
smtp.start('your.smtp.server', 'your@mail.address', 'Your Password', :login)
smtp.send_message email_message, 'your@mail.address', 'someone@example.com'
smtp.finish
You will need to fill in the correct SMTP server, password, and email addresses. Now you can run the Ruby file in the terminal and get an automated test email. That was pretty easy thanks to the wonderful Ruby documentation.
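Since the message is just a string of headers, a blank line, and a body, it can also be assembled by a small method instead of a hard-coded heredoc. The following is only a sketch: build_message and its keyword arguments are hypothetical names that are not part of the original script.

```ruby
# Hypothetical helper (not in the original script): assembles a plain-text
# message string suitable for passing to Net::SMTP#send_message.
def build_message(from_name:, from_addr:, to_name:, to_addr:, subject:, body:)
  headers = [
    "From: #{from_name} <#{from_addr}>",
    "To: #{to_name} <#{to_addr}>",
    "Subject: #{subject}"
  ]
  # A blank line separates the headers from the body.
  headers.join("\n") + "\n\n" + body + "\n"
end
```

The returned string could then take the place of email_message in the send_message call.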
For the next part you will need to create a text file with some test URLs. For now, all you need are a few URLs that work and a few that are commented out. An example would be:
www.google.com
http://www.yahoo.com
#www.espn.com
#http://nfl.com
As you can see, we have two valid URLs and two commented-out ones. Notice that both valid URLs work, but only one of them includes the "http://" prefix. There is a reason for this, and I will explain it a little later.
Now that we have our text file, we can count how many of its URLs will be tested. To get an accurate number, the code also has to handle the commented-out URLs and the blank lines. I open the file, read through every line, and sort each URL into either "found" or "ignored". The ignored ones are the ones commented out. Here is that code, with a printout showing how many URLs will be tested:
urls_found = []
urls_ignored = []
file = File.open('/Path/to/txt/file', 'r')
file.each do |line|
  line = line.strip
  next if line.empty? # skip blank lines
  # Prepend http:// to uncommented URLs that lack it
  line.insert(0, 'http://') unless line.match(/^http:\/\//) || line.match(/^#/)
  if line.match(/^#/)
    urls_ignored.push(line)
  else
    urls_found.push(line)
  end
end
file.close
puts "URLs to be tested: #{urls_found.length}"
This code is clean and easy to read. It opens the file, loops over every line, strips the surrounding whitespace, and skips blank lines. It then adds http:// to any uncommented URL that doesn't already have it, which is why our text file has some URLs with http:// and some without. Finally, it adds commented lines to the urls_ignored array and all others to the urls_found array. Then we simply print the length of urls_found to see how many URLs will be tested.
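The steps above can also be sketched as a method that works on any list of lines, which makes the sorting logic easy to try without a file on disk. This is an illustrative variant, not the original code: classify_urls is a hypothetical name, and unlike the original it also leaves https:// URLs untouched, which is my own assumption.

```ruby
# Hypothetical helper (not in the original script): splits raw lines into
# URLs to test and commented-out URLs, skipping blank lines.
def classify_urls(lines)
  found = []
  ignored = []
  lines.each do |raw|
    line = raw.strip
    next if line.empty?

    if line.start_with?('#')
      ignored << line
    else
      # Assumption: https:// counts as an existing scheme too.
      line = "http://#{line}" unless line.match?(%r{\Ahttps?://}i)
      found << line
    end
  end
  [found, ignored]
end
```

With a real file, the lines could come from File.readlines, for example: found, ignored = classify_urls(File.readlines('/Path/to/txt/file')).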
The last bit of code separates out the 404 status codes and the invalid URLs and records the results for all the others. You will need to add require 'net/http' at the top of the file to test the URLs; the Net::HTTP documentation covers how to use it. Let's take a look at the code:
require 'net/http'

status_code_404 = []
result = []
invalid_urls = []
urls_found.each do |url|
  begin
    res = Net::HTTP.get_response(URI(url))
    if res.code == '404'
      status_code_404.push(res)
    end
    result.push("#{url} returns: #{res.code}, #{res.message}")
  rescue StandardError
    # Any failure (malformed URL, unreachable host) lands here
    result.push("#{url} returns: Error occurred - please check your URL.")
    invalid_urls.push(url)
  end
  print '* ' # progress marker, one per URL
end
puts "\nTotal # of 404's: #{status_code_404.length}"
puts "Total # of Ignored URLs: #{urls_ignored.length}"
puts "Total # of Invalid URLs: #{invalid_urls.length}"
Let's break down this block of code. First we loop over the urls_found array from earlier in the code and request each URL. If the status code returned is "404", the response is added to the status_code_404 array. Every response, 404 or not, is also recorded in the result array with its status code and message. If the request raises an exception, the rescue clause of the begin-end block records an error message in result and adds the URL to the invalid_urls array, so invalid URLs are handled without stopping the run. Then we simply print the totals for 404s, ignored URLs, and invalid URLs.
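One way to try this logic without touching the network is to pass the HTTP call in as a parameter. The sketch below is a variant under that assumption, not the original script: check_urls, DEFAULT_FETCHER, and the fetcher keyword are hypothetical names, and it stores the URL rather than the response object for each 404.

```ruby
require 'net/http'
require 'uri'

# Hypothetical default: performs a real GET request for the given URL string.
DEFAULT_FETCHER = ->(url) { Net::HTTP.get_response(URI(url)) }

# Hypothetical helper (not in the original script). `fetcher` is any callable
# that takes a URL string and returns an object responding to #code and #message.
def check_urls(urls, fetcher: DEFAULT_FETCHER)
  report = { results: [], not_found: [], invalid: [] }
  urls.each do |url|
    begin
      res = fetcher.call(url)
      report[:not_found] << url if res.code == '404'
      report[:results] << "#{url} returns: #{res.code}, #{res.message}"
    rescue StandardError
      # Same handling as above: record the failure and keep going.
      report[:invalid] << url
      report[:results] << "#{url} returns: Error occurred - please check your URL."
    end
  end
  report
end
```

Calling check_urls(urls_found) would use the real Net::HTTP request, while a stub lambda can stand in for it when experimenting.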
The code is working properly and will handle any number of URLs in the text file. The last step is to format the email with all the results.
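As a rough idea of what that formatting step might look like, here is a minimal sketch that turns the collected results into a plain-text email body. It is not the finished version: format_report, its keyword arguments, and the layout are all my own assumptions.

```ruby
# Hypothetical helper (not in the original script): builds a plain-text email
# body with the summary counts first, then one numbered line per result.
def format_report(results, not_found_count:, ignored_count:, invalid_count:)
  lines = []
  lines << "URLs tested: #{results.length}"
  lines << "404s: #{not_found_count}"
  lines << "Ignored: #{ignored_count}"
  lines << "Invalid: #{invalid_count}"
  lines << ''
  results.each_with_index do |result, i|
    lines << format('%2d. %s', i + 1, result)
  end
  lines.join("\n")
end
```

The returned string could be used as the body of the test email from the first step, in place of the Hello World text.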
If you would like to see the finished product, you can find it on my GitHub account.