Category Archives: PHP

Experiments with Getting Web Content Using ColdFusion and PHP


So you want to get content from another web site to use on your own site. You could screen scrape one of its pages… but a better approach is to get the info from some sort of web API.

Case in point: Calling a URI from the U.S. National Weather Service to get current weather information.

http://w1.weather.gov/xml/current_obs/KMDW.xml

… This URI will return weather data from Chicago’s Midway Airport.

Using ColdFusion

Here is a ColdFusion file for getting info on the current weather in Chicago:

It works fine. I make an Ajax call to “weather.cfm” from my home page, parse out the current temperature and weather description, and display them on my home page.  Nice!

If I call “weather.cfm” directly, I get the current weather data for Chicago back:

Below you can roughly see how it displays this page in Safari:

If you were to view the source, this is approximately what you would see, XML output:

Again, I have a JavaScript routine that calls weather.cfm via AJAX and pulls data from the <weather></weather> and the <temp_f></temp_f> tags.

This was running in ColdFusion Developer on my iMac, and I can access it over my personal Wi-Fi network.

Using PHP

I also wanted to do the same sort of thing using PHP. I decided I would use cURL. Here is the code I used in a file called “weather.php”:

Notice that the URL is the same one as I’m using in the ColdFusion example.
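A minimal sketch of what a file like “weather.php” could look like follows. It is not the exact file: the names WEATHER_FEED_URL, weatherCurlOptions and fetchWeatherXml are made up for illustration, the 10 second timeout is an arbitrary choice, and it assumes the cURL extension is enabled in PHP. Note that it sends no custom HTTP headers at all.

```php
<?php
// weather.php (sketch): fetch the observation XML and relay it to the caller.

// Same URI as in the ColdFusion example.
const WEATHER_FEED_URL = 'http://w1.weather.gov/xml/current_obs/KMDW.xml';

/**
 * Options for a plain GET request with no custom HTTP headers.
 * (Hypothetical helper; the name is not from the original file.)
 */
function weatherCurlOptions(string $url): array
{
    return [
        CURLOPT_URL            => $url,
        // Return the body as a string instead of printing it directly.
        CURLOPT_RETURNTRANSFER => true,
        // Give up after 10 seconds rather than hanging the page.
        CURLOPT_TIMEOUT        => 10,
    ];
}

/** Returns the response body, or null if the request itself failed. */
function fetchWeatherXml(string $url): ?string
{
    $ch = curl_init();
    curl_setopt_array($ch, weatherCurlOptions($url));
    $body = curl_exec($ch);
    return $body === false ? null : $body;
}

// Only make the request when served by a web server, not from the command line.
if (PHP_SAPI !== 'cli') {
    $xml = fetchWeatherXml(WEATHER_FEED_URL);
    if ($xml === null) {
        http_response_code(502);
        echo 'Could not reach the weather server.';
    } else {
        header('Content-Type: text/xml');
        echo $xml;
    }
}
```

Keeping the options in their own function is just a convenience here: it makes it easy to see (and later change) exactly what is being sent.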

 

So what happens if I directly call this “weather.php” page I created in my Safari web browser?

Well, we get a page like the one displayed above. Bummer! This is running on a PHP MAMP server on the same iMac as the ColdFusion server is. And trust me, the ColdFusion server is not calling this URL with any special permissions!

This led me to suspect that the HTTP headers being sent by the ColdFusion server were different from the ones being sent by the PHP server using cURL.

But how to figure out what ColdFusion is doing differently from PHP?  Create a new PHP page and call that instead of calling the XML file…

HTTP Header Test Page

I decided to create a PHP page that would look at its own HTTP header values and output them to the page so that I could see them!

Here it is:
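One way such a header test page could be written is sketched below. It is an assumption about the approach, not the original page: the function name requestHeaders is made up, and it reads the headers from PHP’s $_SERVER array, where each incoming header shows up as an HTTP_* entry.

```php
<?php
// headers.php (sketch): print the HTTP request headers this page received.

/**
 * Turn the HTTP_* entries of a $_SERVER-style array back into header names,
 * e.g. HTTP_USER_AGENT becomes User-Agent. (Hypothetical helper.)
 */
function requestHeaders(array $server): array
{
    $headers = [];
    foreach ($server as $key => $value) {
        $key = (string) $key;
        if (strpos($key, 'HTTP_') === 0) {
            // Drop the "HTTP_" prefix, then rebuild the usual capitalization.
            $words = strtolower(str_replace('_', ' ', substr($key, 5)));
            $name = str_replace(' ', '-', ucwords($words));
            $headers[$name] = $value;
        }
    }
    return $headers;
}

// Only print when served by a web server, not from the command line.
if (PHP_SAPI !== 'cli') {
    header('Content-Type: text/plain; charset=utf-8');
    foreach (requestHeaders($_SERVER) as $name => $value) {
        echo $name . ': ' . $value . "\n";
    }
}
```

Under Apache, PHP’s built-in getallheaders() function should give much the same list with less code; the $_SERVER version is used here because it does not depend on how PHP is hooked into the web server.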

 

Now, if we changed weather.php to point to this URL, what do we get?

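Note that the ‘1’ at the bottom is an artifact of cURL: unless you set the CURLOPT_RETURNTRANSFER option to true, curl_exec() prints the response itself and returns true, and echoing that return value outputs ‘1’.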

What about doing the same thing with weather.cfm ?

There is definitely a difference between the two. Both have the same value for “Host”. Not much else is the same! It could be that the PHP request has HTTP headers that the server does not like… But I’m going with the assumption that the PHP request is MISSING one or more HTTP headers that the web server (w1.weather.gov) is expecting. So let’s modify our weather.php file:

 

Notice above how I added a new block of code (lines 8 through 12). This is adding three headers to our HTTP Header: ‘User-Agent’, ‘Connection’, and ‘Accept-Encoding’. I saved my changes and refreshed this page.

BINGO! IT WORKED!

But is the server looking for all three of these headers?

I remove ‘Accept-Encoding’.  I refresh the browser.  It still works.

I remove ‘Connection’.  I refresh the browser.  It still works.

And (of course) I remove ‘User-Agent’.  I refresh the browser.  And of course it fails.

So, ‘User-Agent’ is the key. Currently, in our example, it is set to the value of ‘ColdFusion’, because that is the value I got when running the ColdFusion page. But actually (of course) it is a PHP page that is requesting the info.

I change the value of ‘User-Agent’ from ‘ColdFusion’ to ‘PHP’. I refresh the browser and it works.

I wonder: is w1.weather.gov looking for specific values for this header, or just checking that the ‘User-Agent’ header is present in the request?

So, I change the value of ‘User-Agent’ to: ‘SugarBoogers’.  I refresh the page and it works! This means that the server (at least in this case) is just checking to make sure that the ‘User-Agent’ HTTP header is present and has a value… but doesn’t care WHAT the value is (I’m sure that ‘SugarBoogers’ is not a common user agent to check for!).
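The end result of these experiments can be sketched roughly like this. It is not the exact weather.php: the function names weatherRequestHeaders and fetchWithUserAgent are made up for illustration, and it sends only the ‘User-Agent’ header, since that turned out to be the one that matters.

```php
<?php
// weather.php (sketch): the same request, now with a User-Agent header.

/**
 * Extra request headers in the "Name: value" form that cURL expects.
 * (Hypothetical helper; only User-Agent is included.)
 */
function weatherRequestHeaders(string $userAgent): array
{
    return ['User-Agent: ' . $userAgent];
}

/** Returns the response body, or null if the request itself failed. */
function fetchWithUserAgent(string $url, string $userAgent): ?string
{
    $ch = curl_init($url);
    // Return the body as a string instead of printing it directly.
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    // Attach the custom headers to the outgoing request.
    curl_setopt($ch, CURLOPT_HTTPHEADER, weatherRequestHeaders($userAgent));
    $body = curl_exec($ch);
    return $body === false ? null : $body;
}

// Only make the request when served by a web server, not from the command line.
if (PHP_SAPI !== 'cli') {
    $xml = fetchWithUserAgent(
        'http://w1.weather.gov/xml/current_obs/KMDW.xml',
        'PHP' // any non-empty value seemed to work in the tests above
    );
    if ($xml === null) {
        http_response_code(502);
        echo 'Could not reach the weather server.';
    } else {
        header('Content-Type: text/xml');
        echo $xml;
    }
}
```

cURL also has a dedicated CURLOPT_USERAGENT option that should do the same job without building the header string by hand. The header array is used here because it matches the way the other headers were added and removed during testing.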

Wrapping It Up

You might be able to “screen scrape” a web page without setting any custom HTTP headers. But I suspect that if you’re calling some sort of XML feed, JSON resource, or web service URI, there’s a good chance that you will need to set the ‘User-Agent’ HTTP header in order to get it to work.

Any comments? Thoughts? Let me know.

Happy Coding!

 


Time-Out, 3D Printing Stuff, Web Stuff and More


I’ve been off work since just before Thanksgiving and for the month of December. After a really hard year of work I finally have time off. It is wonderfully strange!

  • Extruder on my Makerbot 2 3D printer got totally messed up (I may do a post on this going into more detail later).
  • Started watching more videos on Lynda.com on modeling in Sketchup. Also on YouTube.
  • My 3D Printer was out of commission for a while.
  • I joined the 3D Printing Group over on Facebook. You have to submit a request for membership in order to join since it is a closed group.
    • This is an awesome group!
    • There are a lot of members here who know a lot and are passionate about 3D Printing!
  • I subscribed to the 3D Printing Nerd YouTube channel run by Joel Telling.
    • One thing I saw on his channel was his review of the pre-release version of the Lulzbot TAZ 6 3D Printer. I was intrigued. I could buy it on Amazon if I wanted to.  $2,500.00  … a lot of money… hmmm…
  • I started learning more about how to code a PHP web site. I set up several different internal sites to do useful stuff for me and my family.
  • I started looking into writing my own blogging software again. We’ll see where THAT goes!
  • Saw on the 3D Printing Group a contest to design a printable Nerf dart gun. Got in on the contest rather late… I did submit an entry… basically all I had time to do was an air piston, powered by two rubber bands, that shoots a single Nerf dart out of its barrel.
  • Also saw on the 3D Printing Group someone asked about what anyone thought about the Lulzbot TAZ 6 3D Printer.
    • There were a bunch of responses from users who had the printer, loved it, loved the service, and were showing pictures of things that they printed on their new printer.
    • This pushed me over the edge. My Makerbot still was not working at that time, and I just did it!
  • I really like my new TAZ 6 printer!  I like that it auto-levels!  I like how many different kinds of filament I can use with it.  More on the TAZ in another post.
  • I bought some parts to fix my Makerbot from Fargo 3D Printing.
  • Eventually I fixed my Makerbot and got it working again. Yay!
  • Upgraded my version of WordPress here to the latest version today (it’s about time)!

Time Off and Some Progress on my Own Blog Software


Yay! I am off until the end of the year! When it comes to vacation where I work, it’s “use it or lose it!”

Blog Software
It’s on Apache, using PHP, MySQL, and a .htaccess file. After futzing about I got the .htaccess file doing what I want for my new blog software (I am no expert on this file by any means, though).

The development environment is my iMac using MAMP. I’ve got it reading a site table in my MySQL database. This software will support multiple sites on one server!

Next I’ve got to look up how to use substring functions in PHP. I need to take the rootPath field value from my sites table and compare it with the beginning of the URI that the user typed into their browser. If it matches, I will set the site id; if not, I want to have some sort of fall-out page.
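The comparison described above could be sketched along these lines. It is only an illustration under some assumptions: the function names uriStartsWithRootPath and findSiteId are made up, and each site row is assumed to be an array with ‘id’ and ‘rootPath’ keys, as it might come back from the sites table.

```php
<?php
/**
 * True when $uri begins with $rootPath.
 * strncmp() compares only the first strlen($rootPath) bytes and
 * returns 0 when they are identical.
 */
function uriStartsWithRootPath(string $uri, string $rootPath): bool
{
    if ($rootPath === '') {
        return false; // an empty root path would otherwise match everything
    }
    return strncmp($uri, $rootPath, strlen($rootPath)) === 0;
}

/**
 * Return the id of the first site whose rootPath matches the start of the
 * URI, or null so the caller can show the fall-out page.
 * $sites is assumed to be a list of rows like ['id' => 1, 'rootPath' => '/blog/'].
 */
function findSiteId(string $uri, array $sites): ?int
{
    foreach ($sites as $site) {
        if (uriStartsWithRootPath($uri, $site['rootPath'])) {
            return (int) $site['id'];
        }
    }
    return null;
}
```

One thing to watch: a rootPath of ‘/blog’ would also match ‘/blogger’, so storing root paths with a trailing slash is probably safer. On PHP 8 and later, the built-in str_starts_with() could replace the strncmp() call.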

I Began working on new Blogging Software


For all the benefits of using WordPress for my blog, there are many more drawbacks. Sometimes I wonder if I would have been better off hand-coding my blog! For writing a post like this that only contains basic text, it’s fine. But once you start wanting to embed pictures or perhaps videos, it behaves very badly. I lost a whole long post recently 🙁 .

So, over vacation I started filling out a notebook with designs for my own blogging software. I’ve been doing a lot of content management software at work so doing content management type stuff is fresh in my mind. My goal is to get off of WordPress and get off soon. I want to be able to put together posts efficiently. I want to spend my time writing and refining my posts, not struggling with the technical problems of the platform I am using.

The database I am using is MySQL. I was going to write the blog software in ColdFusion (since that is what I am most comfortable with). But I have decided to use PHP. All nice and open source! 🙂 I’m developing it in a web instance running on my iMac using MAMP.

Goal: That I have a local instance working with basic functionality by this Saturday.