UPDATE: NEW BUY.COM SPIDERS (website change)

Postby jlr2000 on Sat Nov 12, 2005 11:48 pm

I have used the spiders, and I have also just logged on to Buy.com to grab their images, which are some of the best quality IMO. The website has changed: if you click on the smaller image, a larger one is presented, but you cannot right-click and "Save As".

You can get around this by viewing the page source, grabbing the large image URL, and opening it separately, but this is a bit more complicated. Is it possible to change the spider to do this automagically?
Last edited by jlr2000 on Mon Nov 14, 2005 2:47 am, edited 1 time in total.
jlr2000
 
Posts: 46
Joined: Thu Sep 01, 2005 3:55 pm

Postby dgemily on Sun Nov 13, 2005 1:32 am

http://www.france.xlobby.com/spiders/dv ... uy.com.txt

Let me know if it works correctly ;)
dgemily
 
Posts: 793
Joined: Thu May 13, 2004 6:24 am
Location: Paris, France

Postby jlr2000 on Sun Nov 13, 2005 1:54 am

I tried it but it did not work. No covers were listed or displayed. Thanks for the attempt!
jlr2000
 
Posts: 46
Joined: Thu Sep 01, 2005 3:55 pm

Postby hjackson on Sun Nov 13, 2005 7:56 am

dgemily, the spider would have to change one word in the coverart URL to get the large image. The small image URL has the word "prod" in it (e.g. http://ak.buy.com/db_assets/prod_images ... 725040.jpg). Changing "prod" to "large" will bring up the large image (e.g. http://ak.buy.com/db_assets/large_image ... 725040.jpg).
I hope this helps.

hjackson

PS. Actually, saving the large image is not complicated at all if you use this shortcut: right-click the small image and choose "View Image". That will bring up the URL of the small image; then change "prod" to "large" as I mentioned above.
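
For anyone who wants to script that substitution outside the spider, here is a minimal Python sketch. The function name `to_large_image_url` and the example URL are assumptions made up for illustration (the URLs quoted above are truncated, so they are not reused here); it also assumes the word to swap always appears as the `/prod_images/` path segment.

```python
def to_large_image_url(small_url: str) -> str:
    """Swap the 'prod_images' path segment for 'large_images'."""
    marker = "/prod_images/"
    if marker not in small_url:
        raise ValueError(f"no {marker!r} segment in {small_url!r}")
    # Replace only the first occurrence so the rest of the path is untouched.
    return small_url.replace(marker, "/large_images/", 1)


# Hypothetical small-image URL, used only to show the substitution.
example_small = "http://ak.buy.com/db_assets/prod_images/303/40722303.jpg"
print(to_large_image_url(example_small))
# prints http://ak.buy.com/db_assets/large_images/303/40722303.jpg
```

Whether a large image actually exists at the rewritten address is not checked here; that would need a request to the site.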
hjackson
 
Posts: 371
Joined: Sat Nov 29, 2003 7:12 am
Location: Tampa, Florida

Postby dgemily on Sun Nov 13, 2005 4:21 pm

I did it quickly yesterday before going to sleep ;) but it was working in my tests... maybe saving the txt file produced a wrong structure.
hjackson, the spider was looking for this cover link, e.g.:
Code:
<a href="javascript:largeIM('http://ak.buy.com/db_assets/large_images/303/40722303.jpg');hide

This is the line to find, so in the spider we can write:
Code:
largeIM_hide\('(?<coverart>.*?)'\)">

or
Code:
'(?<coverart>http://ak.buy.com/db_assets/large_images/.*?)'


etc...
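
To try the pattern outside the spider, here is a rough Python sketch run against the sample line quoted above. Two things differ from the spider syntax and are assumptions of this sketch: Python spells named groups `(?P<name>...)` rather than `(?<name>...)`, and the dots in the host name are escaped. The helper name `find_coverart` is made up for illustration. Only the second pattern is exercised, because the quoted line is cut off and does not show the `largeIM_hide(` text that the first pattern looks for.

```python
import re

# The line quoted above from the Buy.com product page (as posted, cut off after "hide").
SAMPLE = (
    "<a href=\"javascript:largeIM("
    "'http://ak.buy.com/db_assets/large_images/303/40722303.jpg');hide"
)

# Python spelling of the second pattern: (?P<coverart>...) instead of (?<coverart>...).
COVERART_RE = re.compile(
    r"'(?P<coverart>http://ak\.buy\.com/db_assets/large_images/.*?)'"
)


def find_coverart(html: str):
    """Return the first large-image URL found in the HTML, or None."""
    match = COVERART_RE.search(html)
    return match.group("coverart") if match else None


print(find_coverart(SAMPLE))
# prints http://ak.buy.com/db_assets/large_images/303/40722303.jpg
```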

so the full spider will be:


Code:
url=http://www.buy.com/retail/searchresults.asp?search_store=4&querytype=video%5Fdvd&qu=%searchstring%
results=<a tabindex="." href="(?<url>.*?)" class="medBlueText"><b>(?<display>.*?)</b></a>

//find coverart <url> tag will open that url and use next regex
largeIM_hide\('(?<coverart>.*?)'\)">


or

Code:
url=http://www.buy.com/retail/searchresults.asp?search_store=4&querytype=video%5Fdvd&qu=%searchstring%
results=<a tabindex="." href="(?<url>.*?)" class="medBlueText"><b>(?<display>.*?)</b></a>

//find coverart <url> tag will open that url and use next regex
'(?<coverart>http://ak.buy.com/db_assets/large_images/.*?)'
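
For readers who want to see how the two regexes fit together, here is a small Python sketch of the two-step flow: the results pattern pulls a product URL and title out of the search page, and the coverart pattern is then applied to that product page. This is only an illustration, not how XLobby itself runs spiders. The HTML snippets, the product path, and the helper names `parse_results` and `parse_coverart` are all made up, and nothing is fetched from the network.

```python
import re
from urllib.parse import urljoin

BASE = "http://www.buy.com"

# Python spellings of the spider patterns ((?P<name>...) instead of (?<name>...)).
RESULTS_RE = re.compile(
    r'<a tabindex="." href="(?P<url>.*?)" class="medBlueText">'
    r"<b>(?P<display>.*?)</b></a>"
)
COVERART_RE = re.compile(
    r"'(?P<coverart>http://ak\.buy\.com/db_assets/large_images/.*?)'"
)


def parse_results(search_html: str):
    """Return (absolute product URL, display title) pairs from a results page."""
    return [
        (urljoin(BASE, m.group("url")), m.group("display"))
        for m in RESULTS_RE.finditer(search_html)
    ]


def parse_coverart(product_html: str):
    """Return the large cover URL from a product page, or None if there is none."""
    m = COVERART_RE.search(product_html)
    return m.group("coverart") if m else None


# Made-up HTML standing in for the search page and the product page.
search_html = (
    '<a tabindex="1" href="/prod/example-title/12345.html" '
    'class="medBlueText"><b>Example Title</b></a>'
)
product_html = (
    "<a href=\"javascript:largeIM("
    "'http://ak.buy.com/db_assets/large_images/303/40722303.jpg');hide"
)

for url, title in parse_results(search_html):
    print(title, url, parse_coverart(product_html))
```

A page with no large image gives `None` from `parse_coverart`, since the pattern only accepts URLs under `large_images`.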


later,
dgemily
 
Posts: 793
Joined: Thu May 13, 2004 6:24 am
Location: Paris, France

Postby jlr2000 on Sun Nov 13, 2005 6:42 pm

Fantastic! Works great......

Is it simple enough to change the Music spider for Buy.com as well?


Thanks!
jlr2000
 
Posts: 46
Joined: Thu Sep 01, 2005 3:55 pm

Postby dgemily on Mon Nov 14, 2005 12:06 am

for: music - buy.com.txt

Code:
url=http://www.buy.com/retail/searchresults.asp?search_store=6&querytype=music&qu=%searchstring%
results=<a tabindex="." href="(?<url>/prod/.*?)" class="medBlueText"><b>(?<display>.*?)</b></a>

//find coverart <url> tag will open that url and use next regex
largeIM_hide\('(?<coverart>.*?)'\)">


or

Code:
url=http://www.buy.com/retail/searchresults.asp?search_store=6&querytype=music&qu=%searchstring%
results=<a tabindex="." href="(?<url>/prod/.*?)" class="medBlueText"><b>(?<display>.*?)</b></a>

//find coverart <url> tag will open that url and use next regex
'(?<coverart>http://ak.buy.com/db_assets/large_images/.*?)'



For both (DVD and music), these spiders will find a cover only if a large cover exists (and nothing otherwise, even if a small one exists).
dgemily
 
Posts: 793
Joined: Thu May 13, 2004 6:24 am
Location: Paris, France

Postby jlr2000 on Mon Nov 14, 2005 2:48 am

Great! Thanks so much for those.....

I've changed the subject for anyone else searching for these....


Thanks dgemily!
jlr2000
 
Posts: 46
Joined: Thu Sep 01, 2005 3:55 pm

Postby dgemily on Mon Nov 14, 2005 8:15 am

I started a topic about spiders: http://www.xlobby.com/forum/viewtopic.php?t=3640

I will upload those spiders then update the post.... ;)
dgemily
 
Posts: 793
Joined: Thu May 13, 2004 6:24 am
Location: Paris, France