Why isn't this returning anything...

Status
Not open for further replies.

Rascagua

New member
Jun 18, 2007
This should return the technical details for any Amazon page I enter.

PHP:
$ch = curl_init($page);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 10);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 5);
curl_exec($ch);
$data = curl_exec($ch);

//scrape technical details
$regex = '/id="technical_details"><\/a>(.+?)<div id="technicalProductFeatures/';
preg_match($regex, $data, $match);
$techdetails = $match[1];

echo "$techdetails";
This is the HTML from the site. Why isn't it returning anything?

Code:
<a name="technical_details" id="technical_details"></a><h2>Technical Details</h2>
    
    <div class="content">
      
        <ul style="list-style: disc; padding-left: 25px;">
          
          <li>Now the world's most popular music player lets you enjoy up to 5 hours of TV shows, movies, video podcasts, and more</li>
          <li>An enhanced interface offers a whole new way to browse and view your music and video</li>
          <li>iPod nano sports a larger, 320-by-240-resolution display that's 65 percent brighter than before</li>
          <li>In anodized aluminum and polished stainless steel, iPod nano is now 6.5 mm thin and even more beautiful</li>
          <li>Measures 2.75 x 2.06 x 0.26 inches (H x W x D), weighs 1.74 ounces</li>
          
        </ul>
       
      <div id="technicalProductFeatures"></div>
      
      <span class="caretnext">›</span> 




<a href="http://www.amazon.com/Apple-iPod-nano-GB-Silver/dp/tech-data/B000JO7PIM/ref=de_a_smtd">See more technical details</a>

   </div>
 


Add some debugging; that should tell you a lot: whether you're actually getting the page, and whether the regex is matching at all.

PHP:
$ch = curl_init($page);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 10);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 5);
curl_exec($ch);
$data = curl_exec($ch);
echo $data;

//scrape technical details
$regex = '/id="technical_details"><\/a>(.+?)<div id="technicalProductFeatures/';
preg_match($regex, $data, $match);
var_dump($match);
$techdetails = $match[1];

echo "$techdetails";
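
To make those two checks explicit, here is a minimal sketch that separates "did the page arrive?" from "did the regex match?". The helper name `debugScrape` and the sample HTML are made up for illustration. `preg_match` returns 1 on a match, 0 on no match, and false on an error, so the return value is worth testing before touching `$match[1]`.

```php
<?php
// Hypothetical helper: reports which stage failed instead of echoing an empty string.
function debugScrape($data, string $regex): string
{
    // curl_exec() returns false on failure; an empty body is just as useless here.
    if ($data === false || $data === '') {
        return 'no page data';
    }

    $result = preg_match($regex, $data, $match);
    if ($result === false) {
        // preg_last_error() gives the PCRE error code for the failed call.
        return 'regex error: ' . preg_last_error();
    }
    if ($result === 0) {
        return 'page fetched, but regex did not match';
    }
    return 'matched: ' . $match[1];
}

// Made-up sample standing in for the fetched page; note the newline inside it.
$sample = "<a id=\"technical_details\"></a><h2>Technical Details</h2>\n"
        . "<div id=\"technicalProductFeatures\"></div>";
$regex  = '/id="technical_details"><\/a>(.+?)<div id="technicalProductFeatures/';

echo debugScrape(false, $regex), "\n";
echo debugScrape($sample, $regex), "\n";
```

With this sample, the second call reports a fetched page and a failed match, which points at the pattern rather than at cURL.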
 

Well, the thing is, this is my full script, where I am pulling several things out of the page:

PHP:
<?php
if ($_POST["page"] != "") {
    $page = $_POST["page"];
    $referal = $_POST["referal"];

    $ch = curl_init($page);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 10);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 5);
    curl_exec($ch);
    $data = curl_exec($ch);
    echo $data;

    //scrape productname (.+?)
    $regex = '/<span id="btAsinTitle">(.+?)</';
    preg_match($regex, $data, $match);
    $productname = $match[1];
    var_dump($match);
    echo "<br>";

    //scrape price
    $regex = '/<span class="price">(.+?)<\/span>/';
    preg_match($regex, $data, $match);
    $price = $match[1];
    var_dump($match);
    echo "<br>";

    //scrape $productdescription
    $regex = '/<br\/><b>Product Description<\/b><br\/>(.+?)<\/div>/';
    preg_match($regex, $data, $match);
    $productdescription = $match[1];
    var_dump($match);
    echo "<br>";

    //scrape technical details
    $regex = '/id="technical_details"><\/a>(.+?)<div id="technical/';
    preg_match($regex, $data, $match);
    $techdetails = $match[1];
    var_dump($match);
    echo "<br>";
}

This is what it gives me back:

Code:
    array(2) {   [0]=>   string(56) "Apple iPod nano 4 GB Silver (3G)<"   [1]=>   string(32) "Apple iPod nano 4 GB Silver (3G)" } 
array(2) {   [0]=>   string(34) "$139.99"   [1]=>   string(7) "$139.99" } 
array(0) { } 
array(0) { }

Basically, no technical details and no product description.
 
I'm not a regex expert, but I did notice you are calling curl_exec twice, which is not needed...

PHP:
curl_exec($ch);
$data = curl_exec($ch);

all you need is

PHP:
$data = curl_exec($ch);

Perhaps that is affecting your output.
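
For reference, a minimal sketch of a single-call fetch. The function name `fetchPage` is made up for illustration. Note that `CURLOPT_FOLLOWLOCATION` and `CURLOPT_RETURNTRANSFER` are on/off switches, so the 10 and 5 in the original simply enable them; if a redirect limit was intended, that belongs in `CURLOPT_MAXREDIRS`. With `CURLOPT_RETURNTRANSFER` set, the extra `curl_exec` call just performs the request a second time and throws the first response away.

```php
<?php
// Hypothetical helper: fetch a URL once and return the body, or null on failure.
function fetchPage(string $url): ?string
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // follow redirects
    curl_setopt($ch, CURLOPT_MAXREDIRS, 10);        // cap on the number of redirects
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body instead of printing it

    // One call is enough: it performs the request and returns the body.
    $data = curl_exec($ch);

    if ($data === false) {
        // curl_error() describes what went wrong with the transfer.
        error_log('cURL error: ' . curl_error($ch));
        return null;
    }

    return $data;
}
```

Usage would be something like `$data = fetchPage($page);` followed by a null check before running any `preg_match` calls.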
 
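Here is a new pattern for you. The s modifier lets the dot match newlines, which the original pattern could not do, and the HTML between the two markers spans several lines: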

Code:
$pattern = '/id="technical_details"><\/a>(.*)<div id="technicalP/ms';
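
A small sketch of why the modifier matters, using a trimmed-down stand-in for the markup quoted earlier in the thread (the variable names are made up for illustration). Only `s` is needed for this case; `m` changes how `^` and `$` behave, and neither anchor appears in the pattern. The sketch keeps the lazy `(.+?)` from the original so the capture stops at the first closing marker.

```php
<?php
// Trimmed-down stand-in for the Amazon markup; the capture has to cross line breaks.
$data = <<<'HTML'
<a name="technical_details" id="technical_details"></a><h2>Technical Details</h2>
<div class="content">
  <ul>
    <li>Measures 2.75 x 2.06 x 0.26 inches (H x W x D), weighs 1.74 ounces</li>
  </ul>
  <div id="technicalProductFeatures"></div>
</div>
HTML;

$withoutS = '/id="technical_details"><\/a>(.+?)<div id="technicalProductFeatures/';
$withS    = '/id="technical_details"><\/a>(.+?)<div id="technicalProductFeatures/s';

// Without "s", "." cannot cross a newline, so the pattern never reaches the closing marker.
$plainCount = preg_match($withoutS, $data, $plainMatch);

// With "s", "." also matches newlines and the capture can span several lines.
$dotAllCount = preg_match($withS, $data, $dotAllMatch);

var_dump($plainCount, $dotAllCount);

// Strip the tags to show only the text of the captured block.
echo trim(strip_tags($dotAllMatch[1])), "\n";
```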

::emp::
 
Gumby, no no no... I was just having fun with my comment; no offense taken.

You were absolutely right.
"Save code wherever possible" is one of the key mantras. Less code is always better.

::emp::
 